author | Mike Klein <mtklein@chromium.org> | 2017-01-12 11:36:46 -0500 |
---|---|---|
committer | Skia Commit-Bot <skia-commit-bot@chromium.org> | 2017-01-13 17:25:15 +0000 |
commit | 4ef8cb3527b7e3f453dccd39eea76e31eb2c33c7 (patch) | |
tree | bbb69e2f6aa3113192508451a8f99f96eefc8e07 /src/core | |
parent | 70b49fd063171a78d3c664ca8af3988f5426319b (diff) |
some armv7 hacking
We can splice these stages if we drop them down to 2 at a time.
Turns out this is significantly (2-3x) faster than the status quo.
SkRasterPipeline_…
    …f16_compile  1x
    …srgb_compile 2.06x
    …f16_run      3.08x
    …srgb_run     4.61x
Added a couple ways to detect (likely) the required VFPv4 support:
- use hwcap when available (NDK ≥21, Android framework)
- use cpu-features when not (NDK <21)
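The hwcap path treats a single bit, VFPv4, as a proxy for all three features this change cares about. A minimal sketch of that mapping as a pure function, so it can be exercised on any host, might look like the following. The `Feature` enum values and the names `kHwcapVFPv4` and `features_from_hwcap` are hypothetical stand-ins (the real flags live in SkCpu.h), and the bit position used for VFPv4 is assumed from the 32-bit ARM Linux `<asm/hwcap.h>`.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the SkCpu feature flags; the real values are in SkCpu.h.
enum Feature : uint32_t {
    kNEON     = 1u << 0,
    kNEON_FMA = 1u << 1,
    kVFP_FP16 = 1u << 2,
};

// Assumed value of HWCAP_VFPv4 on 32-bit ARM Linux; on a device, use the
// constant from <asm/hwcap.h> rather than this copy.
const uint32_t kHwcapVFPv4 = 1u << 16;

// Map a raw hwcap word to feature flags. VFPv4 is taken to imply NEON,
// fused multiply-add, and half-float conversion, mirroring the "(likely)"
// hedge in the commit message.
inline uint32_t features_from_hwcap(uint32_t hwcaps) {
    uint32_t features = 0;
    if (hwcaps & kHwcapVFPv4) {
        features |= kNEON | kNEON_FMA | kVFP_FP16;
    }
    return features;
}
```

On a device, the argument would come from `getauxval(AT_HWCAP)`; passing it in as a parameter keeps the mapping testable off-target.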
The code in SkSplicer_generated.h is ARM, not Thumb2. SkSplicer seems
to be blx'ing into it, so that's great, and we bx lr out. There's no
point in attempting to use Thumb2 in vector heavy code... it'll all be
4 byte anyway.
Follow ups:
- vpush {d8-d9} before the loop, vpop {d8-d9} afterwards,
skip these instructions when splicing;
- (probably) drop jumping stages down to 2-at-a-time also.
Change-Id: If151394ec10e8cbd6a05e2d81808488d743bfe15
Reviewed-on: https://skia-review.googlesource.com/6940
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Diffstat (limited to 'src/core')
-rw-r--r-- | src/core/SkCpu.cpp | 30 |
1 files changed, 30 insertions, 0 deletions
diff --git a/src/core/SkCpu.cpp b/src/core/SkCpu.cpp
index 28bdf6936d..1ae6723983 100644
--- a/src/core/SkCpu.cpp
+++ b/src/core/SkCpu.cpp
@@ -8,6 +8,10 @@
 #include "SkCpu.h"
 #include "SkOnce.h"
 
+#if !defined(__has_include)
+    #define __has_include(x) 0
+#endif
+
 #if defined(SK_CPU_X86)
     #if defined(SK_BUILD_FOR_WIN32)
         #include <intrin.h>
@@ -69,6 +73,32 @@
         return features;
     }
 
+#elif defined(SK_CPU_ARM32) && defined(SK_BUILD_FOR_ANDROID) && \
+    __has_include(<asm/hwcap.h>) && __has_include(<sys/auxv.h>)
+    // asm/hwcap.h and sys/auxv.h won't be present on builds targeting NDK APIs before 21.
+    #include <asm/hwcap.h>
+    #include <sys/auxv.h>
+
+    static uint32_t read_cpu_features() {
+        uint32_t features = 0;
+        uint32_t hwcaps = getauxval(AT_HWCAP);
+        if (hwcaps & HWCAP_VFPv4) { features |= SkCpu::NEON|SkCpu::NEON_FMA|SkCpu::VFP_FP16; }
+        return features;
+    }
+
+#elif defined(SK_CPU_ARM32) && defined(SK_BUILD_FOR_ANDROID) && \
+    !defined(SK_BUILD_FOR_ANDROID_FRAMEWORK)
+    #include <cpu-features.h>
+
+    static uint32_t read_cpu_features() {
+        uint32_t features = 0;
+        uint64_t cpu_features = android_getCpuFeatures();
+        if (cpu_features & ANDROID_CPU_ARM_FEATURE_NEON) { features |= SkCpu::NEON; }
+        if (cpu_features & ANDROID_CPU_ARM_FEATURE_NEON_FMA) { features |= SkCpu::NEON_FMA; }
+        if (cpu_features & ANDROID_CPU_ARM_FEATURE_VFP_FP16) { features |= SkCpu::VFP_FP16; }
+        return features;
+    }
+
 #else
     static uint32_t read_cpu_features() {
         return 0;
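The `__has_include` fallback at the top of SkCpu.cpp is what lets the header probes compile on toolchains that predate the operator: there, every probe evaluates to 0 and the build falls through to a later branch. A stripped-down sketch of the same pattern, outside Skia, could look like this. The helper names `have_auxv` and `read_hwcap` are hypothetical, and the extra `__linux__` guard is an assumption added here so the sketch only calls `getauxval` on platforms known to provide it.

```cpp
// Compilers without __has_include get a stub that makes every probe fail,
// so the #else branch below is selected instead of a hard error.
#if !defined(__has_include)
    #define __has_include(x) 0
#endif

#if defined(__linux__) && __has_include(<sys/auxv.h>)
    #include <sys/auxv.h>
    // Header is available: read the kernel-provided hardware capability word.
    inline bool have_auxv() { return true; }
    inline unsigned long read_hwcap() { return getauxval(AT_HWCAP); }
#else
    // Header is missing (or the compiler cannot probe for it): report nothing.
    inline bool have_auxv() { return false; }
    inline unsigned long read_hwcap() { return 0; }
#endif
```

The trade-off is that a compiler lacking `__has_include` always takes the conservative branch, even if the header happens to exist.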