aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/core/SkUtilsArm.h
Commit message (Collapse)AuthorAge
* Remove NEON runtime detection support.Gravatar mtklein2016-05-05
| | | | | | | | BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1952953004 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review-Url: https://codereview.chromium.org/1952953004
* Move CPU feature detection to its own file.Gravatar mtklein2016-04-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Moves CPU feature detection to its own file. - Cleans up some redundant feature detection scattered around core/ and opts/. - Can now detect a few new CPU features: * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2 * FMA -> Intel FMA instructions, added at the same time as AVX2 * VFP_FP16 -> ARM f16<->f32 instructions, quite common * NEON_FMA -> ARM FMA instructions, also quite common * SSE and SSE3... why not? This new internal API makes it very cheap to do fine-grained runtime CPU feature detection. Redundant calls to SkCpu::Supports() should be eliminated and it's hoistable out of loops. It compiles away entirely when we have the appropriate instructions available at compile time. This means we can call it to guard even a little snippet of 1 or 2 instructions right where needed and let inlining hoist the check (if any at all) up to somewhere that doesn't hurt performance. I've explained how I made this work in the private section of the new header. Once this lands and bakes a bit, I'll start following up with CLs to use it more and to add a bunch of those little 1-2 instruction snippets we've been wanting, e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607 Review URL: https://codereview.chromium.org/1890483002
* Revert of Move CPU feature detection to its own file. (patchset #7 id:120001 ↵Gravatar mtklein2016-04-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of https://codereview.chromium.org/1890483002/ ) Reason for revert: many unexpected GM diffs across GPU+CPU configs on Windows (hopefully just text masks on GPU?). seems like we pick a different srcover variant in some places. Original issue's description: > Move CPU feature detection to its own file. > > - Moves CPU feature detection to its own file. > - Cleans up some redundant feature detection scattered around core/ and opts/. > - Can now detect a few new CPU features: > * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2 > * FMA -> Intel FMA instructions, added at the same time as AVX2 > * VFP_FP16 -> ARM f16<->f32 instructions, quite common > * NEON_FMA -> ARM FMA instructions, also quite common > * SSE and SSE3... why not? > > This new internal API makes it very cheap to do fine-grained runtime CPU > feature detection. Redundant calls to SkCpu::Supports() should be eliminated > and it's hoistable out of loops. It compiles away entirely when we have the > appropriate instructions available at compile time. > > This means we can call it to guard even a little snippet of 1 or 2 instructions > right where needed and let inlining hoist the check (if any at all) up to > somewhere that doesn't hurt performance. I've explained how I made this work > in the private section of the new header. > > Once this lands and bakes a bit, I'll start following up with CLs to use it more > and to add a bunch of those little 1-2 instruction snippets we've been wanting, > e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps > (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM. > > BUG=skia: > GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002 > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot > > Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607 TBR=fmalita@chromium.org,herb@google.com,reed@google.com,mtklein@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1892643003
* Move CPU feature detection to its own file.Gravatar mtklein2016-04-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Moves CPU feature detection to its own file. - Cleans up some redundant feature detection scattered around core/ and opts/. - Can now detect a few new CPU features: * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2 * FMA -> Intel FMA instructions, added at the same time as AVX2 * VFP_FP16 -> ARM f16<->f32 instructions, quite common * NEON_FMA -> ARM FMA instructions, also quite common * SSE and SSE3... why not? This new internal API makes it very cheap to do fine-grained runtime CPU feature detection. Redundant calls to SkCpu::Supports() should be eliminated and it's hoistable out of loops. It compiles away entirely when we have the appropriate instructions available at compile time. This means we can call it to guard even a little snippet of 1 or 2 instructions right where needed and let inlining hoist the check (if any at all) up to somewhere that doesn't hurt performance. I've explained how I made this work in the private section of the new header. Once this lands and bakes a bit, I'll start following up with CLs to use it more and to add a bunch of those little 1-2 instruction snippets we've been wanting, e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1890483002
* Style bikeshed - remove extraneous whitespaceGravatar halcanary2016-03-29
| | | | | | GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1842753002 Review URL: https://codereview.chromium.org/1842753002
* Specialize Sk2d for ARM64Gravatar mtklein2015-03-20
| | | | | | | | | | | | | | | | | | | The implementation is nearly identical to Sk2f, with these changes: - float32x2_t -> float64x2_t - vfoo -> vfooq - one extra Newton's method step in sqrt(). Also, generally fix NEON detection to be defined(SK_ARM_HAS_NEON). SK_ARM_HAS_NEON is not being set on ARM64 bots right now (nor does the compiler seem to set __ARM_NEON__), so this CL fixes everything up. BUG=skia: Committed: https://skia.googlesource.com/skia/+/e57b5cab261a243dcbefa74c91c896c28959bf09 CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Mac10.7-Clang-Arm7-Debug-iOS-Trybot,Build-Ubuntu-GCC-Arm64-Release-Android-Trybot Review URL: https://codereview.chromium.org/1020963002
* Revert of Specialize Sk2d for ARM64 (patchset #3 id:40001 of ↵Gravatar mtklein2015-03-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | https://codereview.chromium.org/1020963002/) Reason for revert: https://uberchromegw.corp.google.com/i/client.skia.compile/builders/Build-Mac10.7-Clang-Arm7-Debug-iOS/builds/2441/steps/build%20most/logs/stdio https://uberchromegw.corp.google.com/i/client.skia.compile/builders/Build-Mac10.7-Clang-Arm7-Release-iOS/builds/2424/steps/build%20most/logs/stdio https://uberchromegw.corp.google.com/i/client.skia.compile/builders/Build-Ubuntu-GCC-Arm64-Release-Android/builds/8/steps/build%20most/logs/stdio Original issue's description: > Specialize Sk2d for ARM64 > > The implementation is nearly identical to Sk2f, with these changes: > - float32x2_t -> float64x2_t > - vfoo -> vfooq > - one extra Newton's method step in sqrt(). > > Also, generally fix NEON detection to be defined(SK_ARM_HAS_NEON). > SK_ARM_HAS_NEON is not being set on ARM64 bots right now (nor does the compiler > seem to set __ARM_NEON__), so this CL fixes everything up. > > BUG=skia: > > Committed: https://skia.googlesource.com/skia/+/e57b5cab261a243dcbefa74c91c896c28959bf09 TBR=msarett@google.com,reed@google.com,mtklein@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1028523003
* Specialize Sk2d for ARM64Gravatar mtklein2015-03-20
| | | | | | | | | | | | | | | The implementation is nearly identical to Sk2f, with these changes: - float32x2_t -> float64x2_t - vfoo -> vfooq - one extra Newton's method step in sqrt(). Also, generally fix NEON detection to be defined(SK_ARM_HAS_NEON). SK_ARM_HAS_NEON is not being set on ARM64 bots right now (nor does the compiler seem to set __ARM_NEON__), so this CL fixes everything up. BUG=skia: Review URL: https://codereview.chromium.org/1020963002
* Update NEON compiler defines to use SK_ prefixGravatar djsollen2014-08-01
| | | | | | | | | BUG=skia:2785 R=mtklein@google.com Author: djsollen@google.com Review URL: https://codereview.chromium.org/433513004
* SK_CPU_ARM --> SK_CPU_ARM32Gravatar mtklein2014-06-03
| | | | | | | | | | | That's what it means. It keeps confusing us as named today. BUG=skia: R=djsollen@google.com, mtklein@google.com, reed@google.com Author: mtklein@chromium.org Review URL: https://codereview.chromium.org/314643004
* Revert "Temporarily disable NEON on Android framework builds."Gravatar commit-bot@chromium.org2014-05-22
| | | | | | | | | | R=scroggo@google.com Author: djsollen@google.com Review URL: https://codereview.chromium.org/294183002 git-svn-id: http://skia.googlecode.com/svn/trunk@14844 2bbb7eff-a529-9590-31e7-b0007b416f81
* Temporarily disable NEON on Android framework builds.Gravatar djsollen@google.com2014-05-06
| | | | | | | | | | | The GCC 4.8 compiler has an AARCH64 bug that generated non-PIC output that fails to link. R=scroggo@google.com Review URL: https://codereview.chromium.org/266883011 git-svn-id: http://skia.googlecode.com/svn/trunk@14597 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 35 - First AArch64 supportGravatar commit-bot@chromium.org2014-04-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aarch64 support This change contains the necessary modifications to have Skia build and run properly on an ARMv8 processor in aarch64 execution state. Here's a list of the changes: - add an arm64 target to the build system + SK_CPU_ARM64 flag - MatrixTest was failing when built in Release mode. Fused MAC instructions were generated which made some intermediate results more accurate. As the test relies on result comparison, the more precise results when compared to others led to a gap bigger than what was tolerated. As I don't know if some actual skia code relies on results being comparable, I've disabled fused MAC instruction with -ffp-contract=off for arm64. - Modify include/core/SkOnce.h to have barriers work. - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS. - use existing Xfermode optimisations with modifications that can be removed in the future when toolchains are ready. Also save a few instructions is two Xfermodes (will apply to ARM too). - use existing SkBoxBlur and SkMorphology optimisations. - use existing SkBlitMask optimisations - use existing BitmapProcState and Convolution optimisations. Future changes will include: - Blitters (only partialy merged upstream) - SkUtils (there's little value in sending asm optimisations without having them benchmarked on real hardware). Signed-off-by: Kevin PETIT <kevin.petit@arm.com> BUG=skia: Committed: http://code.google.com/p/skia/source/detail?r=13980 R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com Author: kevin.petit@arm.com Review URL: https://codereview.chromium.org/143423004 git-svn-id: http://skia.googlecode.com/svn/trunk@14025 2bbb7eff-a529-9590-31e7-b0007b416f81
* Revert of ARM Skia NEON patches - 35 - First AArch64 support ↵Gravatar commit-bot@chromium.org2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (https://codereview.chromium.org/143423004/) Reason for revert: GYP's failing on most (all?) bots. Original issue's description: > ARM Skia NEON patches - 35 - First AArch64 support > > Aarch64 support > > This change contains the necessary modifications to have Skia build and > run properly on an ARMv8 processor in aarch64 execution state. > > Here's a list of the changes: > > - add an arm64 target to the build system + SK_CPU_ARM64 flag > > - MatrixTest was failing when built in Release mode. Fused MAC > instructions were generated which made some intermediate results > more accurate. As the test relies on result comparison, the more > precise results when compared to others led to a gap bigger than > what was tolerated. As I don't know if some actual skia code relies > on results being comparable, I've disabled fused MAC instruction > with -ffp-contract=off for arm64. > > - Modify include/core/SkOnce.h to have barriers work. > > - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS. > > - use existing Xfermode optimisations with modifications that can be > removed in the future when toolchains are ready. Also save a few > instructions is two Xfermodes (will apply to ARM too). > > - use existing SkBoxBlur and SkMorphology optimisations. > > - use existing SkBlitMask optimisations > > - use existing BitmapProcState and Convolution optimisations. > > Future changes will include: > > - Blitters (only partialy merged upstream) > > - SkUtils (there's little value in sending asm optimisations without > having them benchmarked on real hardware). > > Signed-off-by: Kevin PETIT <kevin.petit@arm.com> > > BUG=skia: > > Committed: http://code.google.com/p/skia/source/detail?r=13980 R=djsollen@google.com, reed@google.com, halcanary@google.com, kevin.petit@arm.com TBR=djsollen@google.com, halcanary@google.com, kevin.petit@arm.com, reed@google.com NOTREECHECKS=true NOTRY=true BUG=skia: Author: mtklein@google.com Review URL: https://codereview.chromium.org/216113005 git-svn-id: http://skia.googlecode.com/svn/trunk@13983 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 35 - First AArch64 supportGravatar commit-bot@chromium.org2014-03-28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Aarch64 support This change contains the necessary modifications to have Skia build and run properly on an ARMv8 processor in aarch64 execution state. Here's a list of the changes: - add an arm64 target to the build system + SK_CPU_ARM64 flag - MatrixTest was failing when built in Release mode. Fused MAC instructions were generated which made some intermediate results more accurate. As the test relies on result comparison, the more precise results when compared to others led to a gap bigger than what was tolerated. As I don't know if some actual skia code relies on results being comparable, I've disabled fused MAC instruction with -ffp-contract=off for arm64. - Modify include/core/SkOnce.h to have barriers work. - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS. - use existing Xfermode optimisations with modifications that can be removed in the future when toolchains are ready. Also save a few instructions is two Xfermodes (will apply to ARM too). - use existing SkBoxBlur and SkMorphology optimisations. - use existing SkBlitMask optimisations - use existing BitmapProcState and Convolution optimisations. Future changes will include: - Blitters (only partialy merged upstream) - SkUtils (there's little value in sending asm optimisations without having them benchmarked on real hardware). Signed-off-by: Kevin PETIT <kevin.petit@arm.com> BUG=skia: R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com Author: kevin.petit@arm.com Review URL: https://codereview.chromium.org/143423004 git-svn-id: http://skia.googlecode.com/svn/trunk@13980 2bbb7eff-a529-9590-31e7-b0007b416f81
* eliminate all warnings in non-thirdparty code on macGravatar humper@google.com2013-01-07
| | | | | | | | | | | | | | | | Most of these issues were due to functions whose definitions appear in header files; I changed those functions to be 'static inline' instead of just 'static' or 'inline', which kills the warning for such functions. Other functions that were static or anonymous-namespaced but were unused in cpp files were probably called at some point but are no longer; someone who knows more than I do should probably scrub all the functions I either deleted or #if 0'ed out and make sure that the right thing is happening here. Lots of unused variables removed, and one nasty const issue handled. There remains a single warning in thirdparty/externals/cityhash/src/city.cc on line 146 related to a signed/unsigned mismatch. I don't know if we have control over this library so I didn't fix this one, but perhaps someone could do something about that one. BUG= Review URL: https://codereview.appspot.com/7067044 git-svn-id: http://skia.googlecode.com/svn/trunk@7051 2bbb7eff-a529-9590-31e7-b0007b416f81
* first cut at making iOS workGravatar caryclark@google.com2012-09-20
| | | | | | | | | | | | | Replace __arm__ with SK_CPU_ARM add support for iOS simulator and device fix const warning in iOSSampleApp update gyp files https://code.google.com/p/skia/issues/detail?id=900 tracks fixing missing arm assembly Review URL: https://codereview.appspot.com/6552045 git-svn-id: http://skia.googlecode.com/svn/trunk@5606 2bbb7eff-a529-9590-31e7-b0007b416f81
* rm: Introduce SK_ARM_NEON_WRAP handy wrapper macro.Gravatar digit@google.com2012-08-06
| | | | | | | It is used to simplify arm/neon dispatch logic code. Review URL: https://codereview.appspot.com/6458060 git-svn-id: http://skia.googlecode.com/svn/trunk@4958 2bbb7eff-a529-9590-31e7-b0007b416f81
* arm: Move SkUtilsArm.h from include/core to src/coreGravatar digit@google.com2012-08-01
There is no reason to make this visible to client code. Review URL: https://codereview.appspot.com/6441082 git-svn-id: http://skia.googlecode.com/svn/trunk@4892 2bbb7eff-a529-9590-31e7-b0007b416f81