path: root/src/core/SkCpu.cpp
Commit message | Author | Age
* Does everyone support __has_include() now? | Mike Klein | 2018-06-13
  Let's find out.

  Change-Id: I8ff2103c389d6627f3963a2f067baa0a211647c9
  Reviewed-on: https://skia-review.googlesource.com/134510
  Commit-Queue: Ben Wagner <bungeman@google.com>
  Auto-Submit: Mike Klein <mtklein@chromium.org>
  Reviewed-by: Ben Wagner <bungeman@google.com>
* detect ASIMDHP on ARM64 | Mike Klein | 2018-03-26
  (ASIMDHP == "advanced SIMD half-precision" == NEON half-float compute.)

  Testing: printed features after detection
    Pixel 1:   0x08
    Galaxy S9: 0x18
  (All as expected.)

  Change-Id: I3c6987d9ad50b0eb244c2be4354c1c13fdd24815
  Reviewed-on: https://skia-review.googlesource.com/116480
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
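The two printed values are consistent with a feature mask in which CRC32 and ASIMDHP sit in adjacent bits. The sketch below shows how kernel hwcap bits could be translated into such a mask. The `Feature` layout (CRC32 = 1<<3, ASIMDHP = 1<<4) is an assumption chosen only because it reproduces 0x08 and 0x18; the real SkCpu enum does not appear in this log. The kernel-side positions (CRC32 = bit 7, ASIMDHP = bit 10) follow the Linux arm64 hwcap ABI, and all names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Linux arm64 hwcap bits, defined locally under names that cannot clash
// with the HWCAP_... macros from the system headers.
constexpr unsigned long kHWCAP_CRC32   = 1ul << 7;
constexpr unsigned long kHWCAP_ASIMDHP = 1ul << 10;

// Hypothetical feature-mask layout (an assumption, see the lead-in).
enum Feature : uint32_t {
    CRC32   = 1u << 3,
    ASIMDHP = 1u << 4,
};

// Translate the raw AT_HWCAP word into the library's own feature mask.
inline uint32_t features_from_hwcap(unsigned long hwcap) {
    uint32_t features = 0;
    if (hwcap & kHWCAP_CRC32)   { features |= CRC32; }
    if (hwcap & kHWCAP_ASIMDHP) { features |= ASIMDHP; }
    return features;
}

// A device reporting only CRC32 maps to 0x08 under this layout.
inline void check_crc32_only() {
    assert(features_from_hwcap(kHWCAP_CRC32) == 0x08u);
}

}  // namespace sketch
```

Under this layout a device with CRC32 only yields 0x08 and one with both CRC32 and ASIMDHP yields 0x18, matching the Pixel 1 and Galaxy S9 values quoted in the message.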
* eliminate SK_BUILD_FOR_WIN32 | Mike Klein | 2018-01-26
  SK_BUILD_FOR_WIN and SK_BUILD_FOR_WIN32 have long meant the same thing.

  Chrome fix is https://chromium-review.googlesource.com/c/chromium/src/+/884007

  Change-Id: I0e907b1bcd2a358eabf776f414fd3aeb3c689561
  Reviewed-on: https://skia-review.googlesource.com/99340
  Reviewed-by: Mike Reed <reed@google.com>
* Make Skia compatible with Android NDK r16 | bsheedy | 2017-11-28
  Changes to Skia that are necessary to make Chromium compile with Android NDK r16, which switches to unified headers.

  Sister CLs:
  src/third_party/android_tools/ndk: https://chromium-review.googlesource.com/c/android_ndk/+/784230
  src/: https://chromium-review.googlesource.com/c/chromium/src/+/777822

  Bug: chromium:771171
  Change-Id: I3d35df5b99d8eb7d7d938d21b5aecdf4c2d5da0f
  Reviewed-on: https://skia-review.googlesource.com/75422
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* fine-grained ARMv7 CPU feature detection | Mike Klein | 2017-10-25
  VFPv4 does not imply NEON, so check that bit separately.

  Bug: b/63553517
  Change-Id: Ibc218871804204d5a91d0b7fc8d5c91fe2e95f01
  Reviewed-on: https://skia-review.googlesource.com/63640
  Reviewed-by: Bailey Forrest <bcf@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* Tweak HWCAP_... names to avoid clash with hwcap.h | Mike Klein | 2017-07-19
  These HWCAP_... values are defined in hwcap.h, but we don't get them from there because some platforms have an older hwcap.h that doesn't name these bits yet.

  Even though we don't directly include hwcap.h, it seems it can get itself included somehow on some platforms. That leads to a name clash with the HWCAP_... #defines in there. To avoid it, rename them.

  Change-Id: I70788b5e4072c307c6eee55d6f197c3b9a49f5dc
  Reviewed-on: https://skia-review.googlesource.com/24408
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* define HWCAP_* ourselves in SkCpu.cpp | Mike Klein | 2017-06-06
  For compatibility with older system headers, instead of looking for HWCAP_ values in asm/hwcap.h, just define the bits we want to test ourselves.

  This lets us compile this code on systems before those bits were defined. At runtime the bits will harmlessly test as zero.

  Change-Id: I44b6aba7d6f0fc2c5df08ad262c2b0537d900209
  Reviewed-on: https://skia-review.googlesource.com/18844
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* Add AVX-512 detection to SkCpu, try 2. | Mike Klein | 2017-03-01
  This time, don't call xgetbv() before checking we can.

  This reverts commit b26373cfd8151b2fa56bdf532ddcde4919cce09f.

  CQ_INCLUDE_TRYBOTS=skia.primary:Test-Mac-Clang-MacMini4.1-GPU-GeForce320M-x86_64-Debug

  Change-Id: I148302cb36446891b1d79b2e60cde0b43420c1a8
  Reviewed-on: https://skia-review.googlesource.com/9089
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
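One reading of "checking we can" is the usual x86 rule that XGETBV may only be executed once CPUID reports OSXSAVE; on a CPU or OS without it, the instruction faults, which would fit the Mac crash that prompted the revert. The sketch below captures that ordering as pure logic so it runs on any host. The function name and the callback parameter are hypothetical; the bit positions (OSXSAVE = CPUID.1:ECX bit 27, AVX512F = CPUID.7.0:EBX bit 16, XCR0 mask 0xE6) come from the Intel architecture manuals, not from this log.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

namespace sketch {

constexpr uint32_t kOSXSAVE = 1u << 27;  // CPUID leaf 1, ECX
constexpr uint32_t kAVX512F = 1u << 16;  // CPUID leaf 7 subleaf 0, EBX

// XCR0 state the OS must enable for AVX-512:
// XMM (bit 1), YMM (bit 2), opmask (bit 5), ZMM_Hi256 (bit 6), Hi16_ZMM (bit 7).
constexpr uint64_t kXcr0Avx512 = 0xE6;

// read_xcr0 stands in for the xgetbv instruction. It is invoked only after
// OSXSAVE has been confirmed, which is the ordering the commit is about.
inline bool avx512f_usable(uint32_t leaf1_ecx,
                           uint32_t leaf7_ebx,
                           const std::function<uint64_t()>& read_xcr0) {
    assert(read_xcr0 && "a reader for XCR0 is required");
    if (!(leaf1_ecx & kOSXSAVE)) { return false; }  // xgetbv would fault here
    if (!(leaf7_ebx & kAVX512F)) { return false; }  // CPU lacks AVX-512F
    return (read_xcr0() & kXcr0Avx512) == kXcr0Avx512;
}

}  // namespace sketch
```

In real detection code the callback would wrap the actual instruction; keeping it injectable makes the "never call it early" property easy to test.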
* Revert "Add AVX-512 detection to SkCpu" | Cary Clark | 2017-02-28
  This reverts commit 3c322e23a013e78fcbe0edd7adccd580af8466bc.

  Reason for revert: crash in SkCpu on Mac

  Original change's description:
  > Add AVX-512 detection to SkCpu
  >
  > I've added a SKY alias for the five new bits detected on a Skylake Xeon.
  >
  > Change-Id: I9f7dd48f4dc866608d81befd061434ca325ef451
  > Reviewed-on: https://skia-review.googlesource.com/9043
  > Reviewed-by: Herb Derby <herb@google.com>
  > Commit-Queue: Mike Klein <mtklein@chromium.org>
  >

  TBR=mtklein@chromium.org,herb@google.com,reviews@skia.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true

  Change-Id: I3cc06c7e32391e68d6cfe084786b18270cdab631
  Reviewed-on: https://skia-review.googlesource.com/9074
  Reviewed-by: Cary Clark <caryclark@google.com>
  Commit-Queue: Cary Clark <caryclark@google.com>
* Add AVX-512 detection to SkCpu | Mike Klein | 2017-02-28
  I've added a SKY alias for the five new bits detected on a Skylake Xeon.

  Change-Id: I9f7dd48f4dc866608d81befd061434ca325ef451
  Reviewed-on: https://skia-review.googlesource.com/9043
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* Simplify SkCpu.cpp preprocessor guards. | Mike Klein | 2017-02-08
  We have a couple ways to detect CPU features on ARM:
    - on ARMv8, getauxval(AT_HWCAP)
    - on ARMv7, getauxval(AT_HWCAP) and cpu-features.h

  This guards each of these methods with preprocessor guards to match exactly when we can use them. Today they're sort of a mix of that and higher level expectations about particular build and operating systems.

  I'm looking into doing this directly by reading CPU registers, much like we do for x86 further up the file.

  None of this is super important right now, so as long as we don't decide that we have these features when we don't, things will be fine. It's no big deal for now if we fail to detect them.

  Change-Id: I3b7768483086d0f3f4f6516b754c3ea5ec2d03e5
  Reviewed-on: https://skia-review.googlesource.com/8182
  Reviewed-by: Chinmay Garde <chinmaygarde@google.com>
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* some armv7 hacking | Mike Klein | 2017-01-13
  We can splice these stages if we drop them down to 2 at a time.

  Turns out this is significantly (2-3x) faster than the status quo.

  SkRasterPipeline_…
    …f16_compile   1x
    …srgb_compile  2.06x
    …f16_run       3.08x
    …srgb_run      4.61x

  Added a couple ways to detect (likely) the required VFPv4 support:
    - use hwcap when available (NDK ≥21, Android framework)
    - use cpu-features when not (NDK <21)

  The code in SkSplicer_generated.h is ARM, not Thumb2. SkSplicer seems to be blx'ing into it, so that's great, and we bx lr out. There's no point in attempting to use Thumb2 in vector heavy code... it'll all be 4 byte anyway.

  Follow ups:
    - vpush {d8-d9} before the loop, vpop {d8-d9} afterwards, skip these instructions when splicing;
    - (probably) drop jumping stages down to 2-at-a-time also.

  Change-Id: If151394ec10e8cbd6a05e2d81808488d743bfe15
  Reviewed-on: https://skia-review.googlesource.com/6940
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* Remove dependency on NDK cpufeatures. | Mike Klein | 2016-12-12
  Instead of relying on cpu-features.c, just do what it does.

  Good reading: http://man7.org/linux/man-pages/man3/getauxval.3.html

  While it's nice to use the headers when possible, should either of these headers not be available, we can fall back to doing it all manually:

      extern "C" uint32_t getauxval(uint32_t)
      static const int AT_HWCAP = 16;
      static const int HWCAP_CRC32 = (1<<7);

  To keep things simple I've slimmed cpu feature detection down to just the features we actually make use of. This removes all runtime feature detection for ARMv7... we expect NEON to be globally available, and so far we haven't used the other FMA/FP16 bits on ARMv7.

  ARMv8 feature detection remains the same, CRC32 before, CRC32 after.

  x86 (cpuid-based detection) and MIPS (nothing) are untouched.

  We need to keep //third_party/cpu-features for //third_party/libwebp.

  Change-Id: I6c96df9a09ae68c8c0e54c1152aa177ba9bafc83
  Reviewed-on: https://skia-review.googlesource.com/5800
  Reviewed-by: Derek Sollenberger <djsollen@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
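The approach in this message (read AT_HWCAP through getauxval() and test a locally defined bit) might look roughly like the sketch below. It is not the SkCpu.cpp code: the `kSketch...` names, `read_hwcap()` and `has_crc32()` are hypothetical, and the `__has_include` guard is an assumption about how one could cope with a missing `<sys/auxv.h>`. The numeric values 16 and 1<<7 are the ones quoted in the message; bit 7 means CRC32 only on arm64.

```cpp
#include <cassert>
#include <cstdint>

// Pull in the real declaration of getauxval() only where it is known to exist.
#if defined(__linux__) && defined(__has_include)
  #if __has_include(<sys/auxv.h>)
    #include <sys/auxv.h>
    #define SKETCH_HAVE_GETAUXVAL 1
  #endif
#endif

namespace sketch {

// Defined locally, under names that cannot collide with system macros,
// so the code compiles against headers that predate these bits.
constexpr unsigned long kSketchAtHwcap    = 16;        // AT_HWCAP
constexpr unsigned long kSketchHwcapCrc32 = 1ul << 7;  // arm64 CRC32 bit

// Returns the raw hwcap word, or 0 when getauxval() is unavailable.
// An unknown entry also reads as 0, so missing features test as absent.
inline unsigned long read_hwcap() {
#if defined(SKETCH_HAVE_GETAUXVAL)
    return getauxval(kSketchAtHwcap);
#else
    return 0;
#endif
}

// Pure bit test, kept separate from the system call so it is easy to check.
// Only meaningful for a hwcap word read on an arm64 system.
inline bool has_crc32(unsigned long hwcap) {
    return (hwcap & kSketchHwcapCrc32) != 0;
}

// The bit test accepts the CRC32 bit and rejects an empty word.
inline void check_bit_test() {
    assert(has_crc32(kSketchHwcapCrc32));
    assert(!has_crc32(0));
}

}  // namespace sketch
```

On an arm64 Linux or Android device, `sketch::has_crc32(sketch::read_hwcap())` would report whether the kernel advertises the CRC32 instructions; on other architectures bit 7 has a different meaning and the result should be ignored.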
* Add an SkOpts target for Haswell+ Intel chips. | Mike Klein | 2016-09-30
  Haswell brought a whole slew of handy new instructions for us (AVX2, FMA, BMI1+BMI2) and also feature F16C, which came one generation earlier on Ivybridge.

  We work with integers often enough that we really want to target AVX2 instead of AVX, and this means it's pretty practical to ask for all those other goodies along with it.

  Chrome's GN files and Google3's BUILD file will need an update, before or after this CL.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2840
  CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Change-Id: I826daf77b5104664c5d31ddaabee347e287b87a2
  Reviewed-on: https://skia-review.googlesource.com/2840
  Commit-Queue: Mike Klein <mtklein@chromium.org>
  Reviewed-by: Herb Derby <herb@google.com>
* GN: Android | mtklein | 2016-08-25
  Once you have downloaded an android NDK, you can set the ndk GN arg to use it. E.g. my gn.args looks like:

      is_debug = false
      ndk = "/opt/android-ndk"

  This should be enough to get you going for an arm64 build. You ought to be able to tweak that to other architectures by changing target_cpu to "arm", "x86", "x86-64", etc. That won't quite work until I follow this up a bit, but the skeleton is there.

  This is enough to get me compiled, linked, and running to completion on my N5x.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2275983004

  Review-Url: https://codereview.chromium.org/2275983004
* Detect CRC32 instructions on ARMv8. | mtklein | 2016-08-18
  I have successfully detected CRC32 instruction support on my Nexus 5x.

  Use of these instructions to follow... I am not yet sure which compilers if any will give me intrinsics or let me write them in asm.

  defined(__ARM_FEATURE_CRC32) should cover users like Android Framework who build with the best settings possible. cpu-features.h covers use cases like Clank and our bots.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2259133002

  Review-Url: https://codereview.chromium.org/2259133002
* Clean up hyper-local SkCpu feature test experiment. | mtklein | 2016-07-11
  This removes the code paths where we make SkCpu::Supports() calls from within a tight loop. It keeps code paths using SkCpu::Supports() to choose entire routines from src/opts/.

  We can't rely on these hyper-local checks to be hoisted up reliably enough. It worked pretty well with the first couple platforms we tried (e.g. Clang on Linux/Mac) but we can't guarantee it works everywhere.

  Further, I'm not able to actually do anything fancy with those tests outside of x86... I've not found a way to get, say, NEON+F16 conversion code embedded into ordinary NEON code outside writing the entire function in external assembly.

  This whole idea becomes less important now that we've got a way to chain separate function calls together efficiently. We can now, e.g., use an AVX+F16C method to load some pixels, then chain that into an ordinary AVX method to color filter them.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2138073002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Review-Url: https://codereview.chromium.org/2138073002
* SkCpu w/o static initializer | mtklein | 2016-04-21
  I think I cracked it.

  Though, this may not technically be legal C++... I've only got one definition of SkCpu::gCachedFeatures, but two different declarations: non-const in SkCpu.cpp, const elsewhere. Is this...
    - legal C++?
    - not C++ but probably works as I think?
    - not C++ and will probably blow up?
    - who knows, let's see?

  I have tested that the features are cached properly, read properly, and that the generated code treats SkCpu::gCachedFeatures as a global constant outside SkCpu.cpp. So it all observably works optimally. Expanding testing to more bots.

  TBR=reed@google.com
  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1905683003

  Review URL: https://codereview.chromium.org/1905683003
* Move CPU feature detection to its own file. | mtklein | 2016-04-19
  - Moves CPU feature detection to its own file.
  - Cleans up some redundant feature detection scattered around core/ and opts/.
  - Can now detect a few new CPU features:
    * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
    * FMA -> Intel FMA instructions, added at the same time as AVX2
    * VFP_FP16 -> ARM f16<->f32 instructions, quite common
    * NEON_FMA -> ARM FMA instructions, also quite common
    * SSE and SSE3... why not?

  This new internal API makes it very cheap to do fine-grained runtime CPU feature detection. Redundant calls to SkCpu::Supports() should be eliminated and it's hoistable out of loops. It compiles away entirely when we have the appropriate instructions available at compile time.

  This means we can call it to guard even a little snippet of 1 or 2 instructions right where needed and let inlining hoist the check (if any at all) up to somewhere that doesn't hurt performance. I've explained how I made this work in the private section of the new header.

  Once this lands and bakes a bit, I'll start following up with CLs to use it more and to add a bunch of those little 1-2 instruction snippets we've been wanting, e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

  Review URL: https://codereview.chromium.org/1890483002
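The claim that the check "compiles away entirely" when the instructions are available at compile time can be illustrated with a small sketch. The log does not show the header, so everything below is an assumption about one way to get that effect: the feature bits, `kBaseline`, and this `Supports()` are hypothetical stand-ins, and `__SSE2__`, `__SSE4_1__` and `__AVX__` are the GCC/Clang predefined macros.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical feature bits; the real SkCpu enum is not shown in this log.
enum Feature : uint32_t {
    SSE2  = 1u << 0,
    SSE41 = 1u << 1,
    AVX   = 1u << 2,
};

// Features the compiler was already told to assume (-msse4.1, -mavx, ...).
constexpr uint32_t kBaseline = 0
#if defined(__SSE2__)
    | SSE2
#endif
#if defined(__SSE4_1__)
    | SSE41
#endif
#if defined(__AVX__)
    | AVX
#endif
    ;

// Filled in once by runtime detection (cpuid, getauxval, ...); zero until then.
uint32_t gCachedFeatures = 0;

// True when every bit in mask is available. If mask is a compile-time
// constant fully contained in kBaseline, an optimizer can fold the whole
// expression to true, because those bits are constant ones in features.
inline bool Supports(uint32_t mask) {
    const uint32_t features = kBaseline | gCachedFeatures;
    return (features & mask) == mask;
}

// Baseline features are reported as supported before any runtime detection.
inline void check_baseline() {
    assert(Supports(kBaseline));
}

}  // namespace sketch
```

A caller would write something like `if (sketch::Supports(sketch::AVX)) { /* AVX path */ }`; in a build compiled with AVX enabled the branch has a constant condition, and otherwise it costs one load and a mask test.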
* Revert of Move CPU feature detection to its own file. (patchset #7 id:120001 of https://codereview.chromium.org/1890483002/ ) | mtklein | 2016-04-15
  Reason for revert:
  many unexpected GM diffs across GPU+CPU configs on Windows (hopefully just text masks on GPU?). seems like we pick a different srcover variant in some places.

  Original issue's description:
  > Move CPU feature detection to its own file.
  >
  > - Moves CPU feature detection to its own file.
  > - Cleans up some redundant feature detection scattered around core/ and opts/.
  > - Can now detect a few new CPU features:
  >   * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
  >   * FMA -> Intel FMA instructions, added at the same time as AVX2
  >   * VFP_FP16 -> ARM f16<->f32 instructions, quite common
  >   * NEON_FMA -> ARM FMA instructions, also quite common
  >   * SSE and SSE3... why not?
  >
  > This new internal API makes it very cheap to do fine-grained runtime CPU
  > feature detection. Redundant calls to SkCpu::Supports() should be eliminated
  > and it's hoistable out of loops. It compiles away entirely when we have the
  > appropriate instructions available at compile time.
  >
  > This means we can call it to guard even a little snippet of 1 or 2 instructions
  > right where needed and let inlining hoist the check (if any at all) up to
  > somewhere that doesn't hurt performance. I've explained how I made this work
  > in the private section of the new header.
  >
  > Once this lands and bakes a bit, I'll start following up with CLs to use it more
  > and to add a bunch of those little 1-2 instruction snippets we've been wanting,
  > e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
  > (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
  >
  > BUG=skia:
  > GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  >
  > Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

  TBR=fmalita@chromium.org,herb@google.com,reed@google.com,mtklein@chromium.org
  # Skipping CQ checks because original CL landed less than 1 days ago.
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:

  Review URL: https://codereview.chromium.org/1892643003
* Move CPU feature detection to its own file. | mtklein | 2016-04-14
  - Moves CPU feature detection to its own file.
  - Cleans up some redundant feature detection scattered around core/ and opts/.
  - Can now detect a few new CPU features:
    * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
    * FMA -> Intel FMA instructions, added at the same time as AVX2
    * VFP_FP16 -> ARM f16<->f32 instructions, quite common
    * NEON_FMA -> ARM FMA instructions, also quite common
    * SSE and SSE3... why not?

  This new internal API makes it very cheap to do fine-grained runtime CPU feature detection. Redundant calls to SkCpu::Supports() should be eliminated and it's hoistable out of loops. It compiles away entirely when we have the appropriate instructions available at compile time.

  This means we can call it to guard even a little snippet of 1 or 2 instructions right where needed and let inlining hoist the check (if any at all) up to somewhere that doesn't hurt performance. I've explained how I made this work in the private section of the new header.

  Once this lands and bakes a bit, I'll start following up with CLs to use it more and to add a bunch of those little 1-2 instruction snippets we've been wanting, e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Review URL: https://codereview.chromium.org/1890483002