path: root/src/core/SkUtilsArm.h
author: mtklein <mtklein@chromium.org> 2016-04-19 14:00:13 -0700
committer: Commit bot <commit-bot@chromium.org> 2016-04-19 14:00:13 -0700
commit: 4311f016612a814282029daa4bd102053a853d82
tree: d610830e4fc571bf80e122db696837441812a404 /src/core/SkUtilsArm.h
parent: 312aa6a81e508f80a46419a14ec842b129ffe563
Move CPU feature detection to its own file.

 - Moves CPU feature detection to its own file.
 - Cleans up some redundant feature detection scattered around core/ and opts/.
 - Can now detect a few new CPU features:
    * F16C     -> Intel f16<->f32 instructions, added between AVX and AVX2
    * FMA      -> Intel FMA instructions, added at the same time as AVX2
    * VFP_FP16 -> ARM f16<->f32 instructions, quite common
    * NEON_FMA -> ARM FMA instructions, also quite common
    * SSE and SSE3... why not?

This new internal API makes it very cheap to do fine-grained runtime CPU feature detection. Redundant calls to SkCpu::Supports() should be eliminated and it's hoistable out of loops. It compiles away entirely when we have the appropriate instructions available at compile time.

This means we can call it to guard even a little snippet of 1 or 2 instructions right where needed and let inlining hoist the check (if any at all) up to somewhere that doesn't hurt performance. I've explained how I made this work in the private section of the new header.

Once this lands and bakes a bit, I'll start following up with CLs to use it more and to add a bunch of those little 1-2 instruction snippets we've been wanting, e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.

BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607
Review URL: https://codereview.chromium.org/1890483002
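
The "compiles away entirely" behaviour described above can be illustrated with a rough sketch. This is not the code in SkCpu.h (the commit message points to the private section of that header for the real explanation); the namespace `cpu_sketch`, the `Feature` bits, and the stubbed `detect_runtime_features()` are all hypothetical names chosen for illustration. The idea assumed here is that features guaranteed by compiler flags form a compile-time constant, so a check against a constant mask can be folded by the optimizer, and only the remaining features need a cached runtime probe.

```cpp
#include <cstdint>

namespace cpu_sketch {  // hypothetical; not Skia's SkCpu

enum Feature : uint32_t {
    SSE2 = 1u << 0,
    AVX2 = 1u << 1,
    NEON = 1u << 2,
};

// Features the compiler flags already guarantee. These use the
// predefined macros that GCC and Clang set for -msse2, -mavx2, and NEON.
constexpr uint32_t kCompileTimeFeatures = 0u
#if defined(__SSE2__)
    | SSE2
#endif
#if defined(__AVX2__)
    | AVX2
#endif
#if defined(__ARM_NEON)
    | NEON
#endif
    ;

// Placeholder for a real probe (cpuid on x86, OS queries on ARM).
// It reports nothing extra in this sketch.
inline uint32_t detect_runtime_features() {
    return 0u;
}

// The probe runs once; later calls read the cached value.
inline uint32_t runtime_features() {
    static const uint32_t cached = detect_runtime_features();
    return cached;
}

inline bool supports(uint32_t mask) {
    // With a constant mask, this branch is decided at compile time
    // whenever the build flags already guarantee every requested feature.
    if ((kCompileTimeFeatures & mask) == mask) {
        return true;
    }
    return ((kCompileTimeFeatures | runtime_features()) & mask) == mask;
}

}  // namespace cpu_sketch
```

Because `supports()` is inline and has no side effects beyond the one-time probe, an optimizer is free to merge repeated calls and hoist them out of loops, which is the property the commit message relies on.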
Diffstat (limited to 'src/core/SkUtilsArm.h')
-rw-r--r-- src/core/SkUtilsArm.h | 14
1 files changed, 5 insertions, 9 deletions
diff --git a/src/core/SkUtilsArm.h b/src/core/SkUtilsArm.h
index 317677115c..dde933bafa 100644
--- a/src/core/SkUtilsArm.h
+++ b/src/core/SkUtilsArm.h
@@ -8,6 +8,7 @@
 #ifndef SkUtilsArm_DEFINED
 #define SkUtilsArm_DEFINED
 
+#include "SkCpu.h"
 #include "SkUtils.h"
 
 // Define SK_ARM_NEON_MODE to one of the following values
@@ -37,18 +38,13 @@
 // is ARMv7-A and supports Neon instructions. In DYNAMIC mode, this actually
 // probes the CPU at runtime (and caches the result).
 
-#if SK_ARM_NEON_IS_NONE
 static inline bool sk_cpu_arm_has_neon(void) {
+#if SK_ARM_NEON_IS_NONE
     return false;
-}
-#elif SK_ARM_NEON_IS_ALWAYS
-static inline bool sk_cpu_arm_has_neon(void) {
-    return true;
-}
-#else // SK_ARM_NEON_IS_DYNAMIC
-
-extern bool sk_cpu_arm_has_neon(void) SK_PURE_FUNC;
+#else
+    return SkCpu::Supports(SkCpu::NEON);
 #endif
+}
 
 // Use SK_ARM_NEON_WRAP(symbol) to map 'symbol' to a NEON-specific symbol
 // when applicable. This will transform 'symbol' differently depending on
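
For readers who want to try the shape of this change outside the Skia tree, here is a minimal standalone sketch of the pattern the hunk above moves to: one inline function whose body is selected by the preprocessor, with the "always" and "dynamic" cases both delegated to a single feature query. Everything in it is an assumption made for illustration: `SKETCH_NEON_IS_NONE` stands in for `SK_ARM_NEON_IS_NONE`, `cpu_supports_neon()` stands in for `SkCpu::Supports(SkCpu::NEON)`, and `add_floats()` is a made-up caller.

```cpp
#include <cstddef>

// Hypothetical build switch standing in for SK_ARM_NEON_IS_NONE.
#ifndef SKETCH_NEON_IS_NONE
#define SKETCH_NEON_IS_NONE 0
#endif

namespace sketch {

// Hypothetical stand-in for SkCpu::Supports(SkCpu::NEON). A real version
// would also consult a cached runtime probe when NEON is not guaranteed.
inline bool cpu_supports_neon() {
#if defined(__ARM_NEON)
    return true;   // guaranteed by the build flags
#else
    return false;  // this sketch performs no runtime probing
#endif
}

// One definition, with the preprocessor choosing only the body.
static inline bool has_neon() {
#if SKETCH_NEON_IS_NONE
    return false;
#else
    return cpu_supports_neon();
#endif
}

inline void add_portable(const float* a, const float* b, float* out,
                         std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}

// Made-up caller: guards a fast path with the check and reports which
// path it took.
inline const char* add_floats(const float* a, const float* b, float* out,
                              std::size_t n) {
    if (has_neon()) {
        // A real build would call a NEON kernel here; the sketch reuses
        // the portable loop so that it runs on any target.
        add_portable(a, b, out, n);
        return "neon";
    }
    add_portable(a, b, out, n);
    return "portable";
}

}  // namespace sketch
```

Keeping a single function signature means callers never see the build mode, and the separate `extern` declaration that the old dynamic branch needed is no longer required.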