path: root/src/core/SkUtilsArm.cpp
Commit message (author, date)
* Move CPU feature detection to its own file. (mtklein, 2016-04-19)

  - Moves CPU feature detection to its own file.
  - Cleans up some redundant feature detection scattered around core/ and opts/.
  - Can now detect a few new CPU features:
    * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
    * FMA -> Intel FMA instructions, added at the same time as AVX2
    * VFP_FP16 -> ARM f16<->f32 instructions, quite common
    * NEON_FMA -> ARM FMA instructions, also quite common
    * SSE and SSE3... why not?

  This new internal API makes it very cheap to do fine-grained runtime CPU
  feature detection. Redundant calls to SkCpu::Supports() should be eliminated
  and it's hoistable out of loops. It compiles away entirely when we have the
  appropriate instructions available at compile time.

  This means we can call it to guard even a little snippet of 1 or 2 instructions
  right where needed and let inlining hoist the check (if any at all) up to
  somewhere that doesn't hurt performance. I've explained how I made this work
  in the private section of the new header.

  Once this lands and bakes a bit, I'll start following up with CLs to use it more
  and to add a bunch of those little 1-2 instruction snippets we've been wanting,
  e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
  (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

  Review URL: https://codereview.chromium.org/1890483002
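The "compiles away entirely" behaviour described in this commit message can be illustrated with a minimal sketch. This is not the contents of the SkCpu header: the namespace, enum values, and function names below are hypothetical, and the runtime probe is a stub that reports no features, where a real one would query the CPU. The idea shown is that features guaranteed by the compiler flags are folded into a constant, so a check for them reduces to `true` and the runtime path is only reached for features that are not guaranteed.

```cpp
#include <cstdint>

// Hypothetical names throughout; an illustration, not Skia's SkCpu API.
namespace cpu_sketch {

enum Feature : uint32_t {
    kSSE3 = 1u << 0,
    kAVX2 = 1u << 1,
    kF16C = 1u << 2,
};

// Features the compiler flags already guarantee for this translation unit.
constexpr uint32_t compile_time_features() {
    uint32_t f = 0;
#if defined(__SSE3__)
    f |= kSSE3;
#endif
#if defined(__AVX2__)
    f |= kAVX2;
#endif
#if defined(__F16C__)
    f |= kF16C;
#endif
    return f;
}

// Stub: a real implementation would probe the CPU here.
// Assumption for this sketch: nothing is detected at runtime.
inline uint32_t detect_runtime_features() { return 0; }

// The probe runs once; later calls read the cached value.
inline uint32_t runtime_features() {
    static const uint32_t cached = detect_runtime_features();
    return cached;
}

inline bool supports(uint32_t mask) {
    constexpr uint32_t known = compile_time_features();
    // When every requested bit is known at compile time and `mask` is a
    // constant, the optimizer can fold this branch to `return true`.
    if ((known & mask) == mask) {
        return true;
    }
    return ((known | runtime_features()) & mask) == mask;
}

}  // namespace cpu_sketch
```

A caller would guard a small snippet with something like `if (cpu_sketch::supports(cpu_sketch::kF16C)) { ... }`. When building with F16C enabled, the guard is a compile-time constant; otherwise it costs one cached load and a mask test.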
* Revert of Move CPU feature detection to its own file. (patchset #7 id:120001 of https://codereview.chromium.org/1890483002/ ) (mtklein, 2016-04-15)

  Reason for revert:
  many unexpected GM diffs across GPU+CPU configs on Windows (hopefully just
  text masks on GPU?). seems like we pick a different srcover variant in some
  places.

  Original issue's description:
  > Move CPU feature detection to its own file.
  >
  > - Moves CPU feature detection to its own file.
  > - Cleans up some redundant feature detection scattered around core/ and opts/.
  > - Can now detect a few new CPU features:
  >   * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
  >   * FMA -> Intel FMA instructions, added at the same time as AVX2
  >   * VFP_FP16 -> ARM f16<->f32 instructions, quite common
  >   * NEON_FMA -> ARM FMA instructions, also quite common
  >   * SSE and SSE3... why not?
  >
  > This new internal API makes it very cheap to do fine-grained runtime CPU
  > feature detection. Redundant calls to SkCpu::Supports() should be eliminated
  > and it's hoistable out of loops. It compiles away entirely when we have the
  > appropriate instructions available at compile time.
  >
  > This means we can call it to guard even a little snippet of 1 or 2 instructions
  > right where needed and let inlining hoist the check (if any at all) up to
  > somewhere that doesn't hurt performance. I've explained how I made this work
  > in the private section of the new header.
  >
  > Once this lands and bakes a bit, I'll start following up with CLs to use it more
  > and to add a bunch of those little 1-2 instruction snippets we've been wanting,
  > e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
  > (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
  >
  > BUG=skia:
  > GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  >
  > Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607

  TBR=fmalita@chromium.org,herb@google.com,reed@google.com,mtklein@chromium.org
  # Skipping CQ checks because original CL landed less than 1 days ago.
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:

  Review URL: https://codereview.chromium.org/1892643003
* Move CPU feature detection to its own file. (mtklein, 2016-04-14)

  - Moves CPU feature detection to its own file.
  - Cleans up some redundant feature detection scattered around core/ and opts/.
  - Can now detect a few new CPU features:
    * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
    * FMA -> Intel FMA instructions, added at the same time as AVX2
    * VFP_FP16 -> ARM f16<->f32 instructions, quite common
    * NEON_FMA -> ARM FMA instructions, also quite common
    * SSE and SSE3... why not?

  This new internal API makes it very cheap to do fine-grained runtime CPU
  feature detection. Redundant calls to SkCpu::Supports() should be eliminated
  and it's hoistable out of loops. It compiles away entirely when we have the
  appropriate instructions available at compile time.

  This means we can call it to guard even a little snippet of 1 or 2 instructions
  right where needed and let inlining hoist the check (if any at all) up to
  somewhere that doesn't hurt performance. I've explained how I made this work
  in the private section of the new header.

  Once this lands and bakes a bit, I'll start following up with CLs to use it more
  and to add a bunch of those little 1-2 instruction snippets we've been wanting,
  e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
  (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Review URL: https://codereview.chromium.org/1890483002
* Style bikeshed - remove extraneous whitespace (halcanary, 2016-03-29)

  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1842753002

  Review URL: https://codereview.chromium.org/1842753002
* Style Change: NULL->nullptr (halcanary, 2015-08-27)

  DOCS_PREVIEW= https://skia.org/?cl=1316233002

  Review URL: https://codereview.chromium.org/1316233002
* Fix usage of SK_BUILD_* defines. (tfarina, 2014-10-06)

  Since we only 'define' them without assigning a value such as '1', cpp
  expands them to nothing, and that breaks the "#if" clauses.

  To fix that, use "#if defined(...)", which correctly checks whether the
  macro name was defined.

  BUG=skia:2850
  TEST=make most
  R=robertphillips@google.com

  Review URL: https://codereview.chromium.org/628763005
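The preprocessor behaviour this commit describes can be reproduced in a few lines. The macro names below are hypothetical stand-ins for the SK_BUILD_* defines, chosen only for illustration: a macro defined with no value expands to nothing, so `#if NAME` has no expression to evaluate, while `#if defined(NAME)` only asks whether the name exists.

```cpp
// Hypothetical macro, defined with no value as the commit message describes.
#define DEMO_BUILD_FOR_ANDROID

// Writing `#if DEMO_BUILD_FOR_ANDROID` here would expand to a bare `#if`
// with no expression, which the preprocessor rejects.
#if defined(DEMO_BUILD_FOR_ANDROID)
constexpr bool kDemoIsAndroid = true;
#else
constexpr bool kDemoIsAndroid = false;
#endif

// A name that was never defined simply tests false under defined().
#if defined(DEMO_BUILD_FOR_IOS)
constexpr bool kDemoIsIos = true;
#else
constexpr bool kDemoIsIos = false;
#endif

static_assert(kDemoIsAndroid, "valueless macro is still detected by defined()");
static_assert(!kDemoIsIos, "undefined macro tests false");
```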
* Always use cpu-features library on android. (djsollen, 2014-08-26)

  This CL also removes the debug capability of runtime switching in/out of
  NEON mode as it uses deprecated APIs.

  BUG=skia:1061
  R=tomhudson@google.com
  Author: djsollen@google.com

  Review URL: https://codereview.chromium.org/506033003
* Only set USE_ANDROID_NDK_CPU_FEATURES if it's not already been explicitly set (commit-bot@chromium.org, 2014-03-11)

  R=djsollen@google.com, reed@google.com
  BUG=skia:
  Author: george@mozilla.com

  Review URL: https://codereview.chromium.org/189263015

  git-svn-id: http://skia.googlecode.com/svn/trunk@13750 2bbb7eff-a529-9590-31e7-b0007b416f81
* Enable SkUtilsArm on all ARM platforms and always use NDK compliant NEON detection on Android. (djsollen@google.com, 2013-08-05)

  R=scroggo@google.com

  Review URL: https://codereview.chromium.org/22193002

  git-svn-id: http://skia.googlecode.com/svn/trunk@10530 2bbb7eff-a529-9590-31e7-b0007b416f81
* Use the NDK's cpu-features library when building skia for Chromium/Android. (digit@google.com, 2013-01-14)

  This patch ensures that when Skia is built for Chromium, it will always use
  the Android NDK's cpu-features helper library to detect NEON at runtime.

  This is needed because sandboxed Chromium renderer processes cannot access
  /proc, and the probing performed in SkUtilsArm.cpp will never work. As such,
  the NEON code paths will never be used even when the device supports them.

  Chromium has special code that ensures that the browser process passes the
  CPU features flags to every renderer process, but Skia needs to use
  android_getCpuFeatures() to get them.

  See http://crbug.com/164154 for full details.

  Review URL: https://codereview.appspot.com/7102045

  git-svn-id: http://skia.googlecode.com/svn/trunk@7149 2bbb7eff-a529-9590-31e7-b0007b416f81
* Result of running tools/sanitize_source_files.py (which was added in https://codereview.appspot.com/6465078/) (rmistry@google.com, 2012-08-23)

  This CL is part I of IV (I broke down the 1280 files into 4 CLs).

  Review URL: https://codereview.appspot.com/6485054

  git-svn-id: http://skia.googlecode.com/svn/trunk@5262 2bbb7eff-a529-9590-31e7-b0007b416f81
* arm: dynamic NEON support for SkBitmapProcState matrix operations. (digit@google.com, 2012-08-01)

  This patch implements dynamic ARM NEON support for the functions implemented
  by src/core/SkBitmapProcState_matrixProcs.cpp.

  - Because the SkBitmapProcState_matrix_{clamp,repeat}.h headers are
    NEON-specific, they are renamed with a _neon.h suffix, and moved to
    src/opts/ (from src/core/)

  - Add a new file src/opts/SkBitmapProcState_matrixProcs_neon.cpp which
    implements the NEON code paths for all builds, and add it to the
    'opts_neon' static library.

  - Modify SkBitmapProcState_matrixProcs.cpp to select the right code-path
    depending on our build configuration.

  Note that in the case where 'arm_neon == 1', we do not embed regular ARM code
  paths in the final binary. Only 'arm_neon_optional == 1' builds will contain
  both regular and NEON code paths at the same time.

  Note that there doesn't seem to be a simple way to put the NEON-specific
  selection from that currently is in SkBitmapProcState_matrixProcs.cpp into
  src/opts/. Doing so would require much more drastic restructuring. This is
  also true of the other SkBitmapProcState source files that will be touched in
  a future patch.

  Review URL: https://codereview.appspot.com/6453065

  git-svn-id: http://skia.googlecode.com/svn/trunk@4888 2bbb7eff-a529-9590-31e7-b0007b416f81
* arm: First step towards dynamic NEON support. (digit@google.com, 2012-05-30)

  This patch adds minimal support for dynamic ARM NEON support, i.e. the
  ability to probe the CPU at runtime for NEON and provide alternate code paths
  when it is available.

  - Add include/core/SkUtilsArm.h, which declares a few helper macros (e.g.
    SK_NEON_ARM_IS_DYNAMIC), plus the handy function 'sk_cpu_arm_has_neon()'
    which returns true if the target CPU supports the ARM NEON instruction set.

    Note that the header is in include/core/ because it will have to be
    included from NEON-specific code under src/code/

    It would probably be more logical to put it under include/opts/ instead,
    but this would require moving all the NEON-specific stuff under src/code/
    into src/opts/, which is not trivial due to the way the code is currently
    architected.

  - Add src/core/SkUtilsArm.cpp which implements 'sk_cpu_arm_has_neon' for
    ARM-based Linux systems, only when SK_NEON_ARM_IS_DYNAMIC is true. (For
    other cases, 'sk_cpu_arm_has_neon' is an inline function that returns a
    constant 'true' or 'false' value).

    There is no user-level accessible CPUID instruction on ARM, so do all CPU
    feature probing by parsing /proc/cpuinfo. This is Linux-specific.

    For Debug build types, the CPU probing result is printed to the Android
    log (or Linux command-line) for easier debugging.

  - Create a new 'opts_neon' target (static library) which shall contain all
    the NEON-specific code paths for the library. This is necessary because
    -mfpu=neon impacts also non-scalar code. Just like with -mssse3 on x86, we
    can't build the rest of the library with this flag.

    Note that for now, we only include memset16_neon and memset32_neon in this
    library.

  - Modify opts_check_arm.cpp to implement SK_ARM_NEON_IS_DYNAMIC properly.

  Compared to a 'xoom' build, the only difference is the use of NEON-optimized
  memset16/32 functions. Later patches will move more NEON-specific code paths
  to 'opts_neon'.

  Review URL: https://codereview.appspot.com/6247058

  git-svn-id: http://skia.googlecode.com/svn/trunk@4069 2bbb7eff-a529-9590-31e7-b0007b416f81
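The /proc/cpuinfo probing that this first commit describes can be sketched roughly as follows. This is not the code from SkUtilsArm.cpp: the function name is hypothetical, and it assumes the common ARM Linux layout in which a line beginning with "Features" lists space-separated flags after a colon. It reads from any input stream so the parsing can be exercised without a real /proc.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper for illustration; not Skia's implementation.
// Returns true if a "Features" line in cpuinfo-style text lists `wanted`
// as a whole, space-separated token.
inline bool cpuinfo_has_feature(std::istream& in, const std::string& wanted) {
    std::string line;
    while (std::getline(in, line)) {
        // Only lines that start with "Features" are of interest.
        if (line.compare(0, 8, "Features") != 0) {
            continue;
        }
        const std::size_t colon = line.find(':');
        if (colon == std::string::npos) {
            continue;
        }
        // Split the text after the colon on whitespace and compare tokens,
        // so a flag such as "neon" does not match inside a longer word.
        std::istringstream tokens(line.substr(colon + 1));
        std::string token;
        while (tokens >> token) {
            if (token == wanted) {
                return true;
            }
        }
    }
    return false;
}
```

On a device, the stream would be a `std::ifstream` opened on /proc/cpuinfo. As a later commit in this log notes, that file is not readable from sandboxed Chromium renderer processes, which is why the NDK's cpu-features library replaced this approach there.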