diff options
author | Mike Klein <mtklein@chromium.org> | 2016-11-03 14:43:48 -0400 |
---|---|---|
committer | Skia Commit-Bot <skia-commit-bot@chromium.org> | 2016-11-03 19:09:17 +0000 |
commit | f0348c2413c5c72820a42749879d41c6dd4ab16c (patch) | |
tree | 477ef93c6b2c07213afb2ab1bcaf00b66e6dcf83 /src/opts | |
parent | 145dbcd165d9d27298eb8888bc240e2d06a95464 (diff) |
Implement SkNx_fma() for Sk4f on ARMv8.
I was looking at the disassembly of matrix_4x5() and noticed it didn't have any FMAs. This makes things that call SkNx_fma() actually use the FMA instruction.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=4400
CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Change-Id: Ia353a77b0ca14385a43b564997b05586f9472996
Reviewed-on: https://skia-review.googlesource.com/4400
Reviewed-by: Matt Sarett <msarett@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Diffstat (limited to 'src/opts')
-rw-r--r-- | src/opts/SkNx_neon.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/src/opts/SkNx_neon.h b/src/opts/SkNx_neon.h index b5d89891d1..c85d583ea2 100644 --- a/src/opts/SkNx_neon.h +++ b/src/opts/SkNx_neon.h @@ -218,6 +218,12 @@ public: float32x4_t fVec; }; +#if defined(SK_CPU_ARM64) + AI static Sk4f SkNx_fma(const Sk4f& f, const Sk4f& m, const Sk4f& a) { + return vfmaq_f32(a.fVec, f.fVec, m.fVec); + } +#endif + // It's possible that for our current use cases, representing this as // half a uint16x8_t might be better than representing it as a uint16x4_t. // It'd make conversion to Sk4b one step simpler. |