author | mtklein <mtklein@chromium.org> | 2015-11-17 14:19:52 -0800 |
---|---|---|
committer | Commit bot <commit-bot@chromium.org> | 2015-11-17 14:19:52 -0800 |
commit | cbf4fba43933302a846872e4c5ce8f1adb8b325e (patch) | |
tree | 96dad6cc0a2241544a0cf52cccdc7a0fbe89f9b1 /src/core/Sk4px.h | |
parent | 56847a65648af4d06da9c26c55242949a1bf31ab (diff) |
div255(x) as ((x+128)*257)>>16 with SSE
_mm_mulhi_epu16 makes the (...*257)>>16 part simple.
This seems to speed up every transfermode that uses div255(),
in the 7-25% range.
It even appears to obviate the need for approxMulDiv255() on SSE.
I'm not sure about NEON yet, so I'll keep approxMulDiv255() for now.
There should be no pixel changes:
https://gold.skia.org/search2?issue=1452903004&unt=true&query=source_type%3Dgm&master=false
BUG=skia:
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1452903004
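
The arithmetic in the commit message can be checked with a small scalar sketch. It compares the form removed by this patch, the form named in the subject line, and plain rounded division. The function names are hypothetical and not part of Skia, and the input range 0..255*255 (the product of two 8-bit values) is an assumption about how div255() is used.

```cpp
#include <cstdint>

// Reference: x / 255 rounded to nearest. 255 is odd, so there are no ties.
inline uint32_t div255_reference(uint32_t x) {
    return (x + 127) / 255;
}

// Form removed by this patch: ((x+128) + ((x+128)>>8)) >> 8.
inline uint32_t div255_old(uint32_t x) {
    uint32_t v = x + 128;
    return (v + (v >> 8)) >> 8;
}

// Form named in the commit subject: ((x+128)*257) >> 16.
// With 16-bit lanes, the multiply-and-shift is the high half of a
// 16x16-bit unsigned product, which is what _mm_mulhi_epu16 returns.
inline uint32_t div255_new(uint32_t x) {
    return ((x + 128) * 257) >> 16;
}

// Counts inputs in [0, 255*255] (assumed range) where the forms disagree.
inline int count_div255_mismatches() {
    int mismatches = 0;
    for (uint32_t x = 0; x <= 255u * 255u; ++x) {
        uint32_t want = div255_reference(x);
        if (div255_old(x) != want || div255_new(x) != want) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Over that assumed range x+128 stays below 65536, so the intermediate value fits in an unsigned 16-bit lane before the multiply.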
Diffstat (limited to 'src/core/Sk4px.h')
-rw-r--r-- | src/core/Sk4px.h | 6 |
1 files changed, 1 insertions, 5 deletions
diff --git a/src/core/Sk4px.h b/src/core/Sk4px.h
index a7f5c9f4c6..3755488a4a 100644
--- a/src/core/Sk4px.h
+++ b/src/core/Sk4px.h
@@ -66,11 +66,7 @@ public:
     Sk4px addNarrowHi(const Sk16h&) const;
 
     // Rounds, i.e. (x+127) / 255.
-    Sk4px div255() const {
-        // Calculated as ((x+128) + ((x+128)>>8)) >> 8.
-        auto v = *this + Sk16h(128);
-        return v.addNarrowHi(v >> 8);
-    }
+    Sk4px div255() const;
 
     // These just keep the types as Wide so the user doesn't have to keep casting.
     Wide operator * (const Wide& o) const { return INHERITED::operator*(o); }
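
The hunk above only shows div255() becoming a declaration. The SSE definition lives in a file outside this diffstat and is not shown here. As a rough illustration of what the commit message describes, here is a minimal sketch over eight 16-bit lanes. The helper name `div255_lanes` and its array interface are hypothetical and this is not the Skia implementation. It uses SSE2 intrinsics when `__SSE2__` is defined and falls back to the same arithmetic in scalar code otherwise.

```cpp
#include <cstdint>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical helper: divides eight 16-bit values by 255 with rounding,
// assuming each input is at most 255*255 so that in[i] + 128 fits in 16 bits.
inline void div255_lanes(const uint16_t in[8], uint16_t out[8]) {
#if defined(__SSE2__)
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    v = _mm_add_epi16(v, _mm_set1_epi16(128));    // x + 128
    v = _mm_mulhi_epu16(v, _mm_set1_epi16(257));  // high 16 bits of (x+128)*257
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
#else
    // Scalar fallback computing the same value per lane.
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<uint16_t>(((in[i] + 128u) * 257u) >> 16);
    }
#endif
}
```

Because `_mm_mulhi_epu16` keeps only the upper half of each product, the add and the multiply are the whole computation and no separate shift or narrowing step is needed.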