author mtklein <mtklein@chromium.org> 2015-11-17 14:19:52 -0800
committer Commit bot <commit-bot@chromium.org> 2015-11-17 14:19:52 -0800
commit cbf4fba43933302a846872e4c5ce8f1adb8b325e
tree 96dad6cc0a2241544a0cf52cccdc7a0fbe89f9b1 /src/core/Sk4px.h
parent 56847a65648af4d06da9c26c55242949a1bf31ab
div255(x) as ((x+128)*257)>>16 with SSE
_mm_mulhi_epu16 makes the (...*257)>>16 part simple.

This seems to speed up every transfermode that uses div255(), in the 7-25% range.

It even appears to obviate the need for approxMulDiv255() on SSE. I'm not sure about NEON yet, so I'll keep approxMulDiv255() for now.

Should be no pixels change:
https://gold.skia.org/search2?issue=1452903004&unt=true&query=source_type%3Dgm&master=false

BUG=skia:
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

Review URL: https://codereview.chromium.org/1452903004
Diffstat (limited to 'src/core/Sk4px.h')
-rw-r--r-- src/core/Sk4px.h 6
1 files changed, 1 insertions, 5 deletions
diff --git a/src/core/Sk4px.h b/src/core/Sk4px.h
index a7f5c9f4c6..3755488a4a 100644
--- a/src/core/Sk4px.h
+++ b/src/core/Sk4px.h
@@ -66,11 +66,7 @@ public:
     Sk4px addNarrowHi(const Sk16h&) const;
 
     // Rounds, i.e. (x+127) / 255.
-    Sk4px div255() const {
-        // Calculated as ((x+128) + ((x+128)>>8)) >> 8.
-        auto v = *this + Sk16h(128);
-        return v.addNarrowHi(v >> 8);
-    }
+    Sk4px div255() const;
 
     // These just keep the types as Wide so the user doesn't have to keep casting.
     Wide operator * (const Wide& o) const { return INHERITED::operator*(o); }
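The commit message claims that the new ((x+128)*257)>>16 form produces the same pixels as the removed ((x+128) + ((x+128)>>8)) >> 8 form. The following is a minimal scalar sketch of that claim, not the Skia implementation: the function names are hypothetical, `mulhi_u16` only models what `_mm_mulhi_epu16` does for a single 16-bit lane, and the input range [0, 255*255] is an assumption based on div255() being applied to products of two 8-bit values.

```cpp
#include <cstdint>

// Reference: rounded division by 255, as stated in the header comment.
inline uint32_t div255_reference(uint32_t x) { return (x + 127) / 255; }

// The formulation removed by this diff: ((x+128) + ((x+128)>>8)) >> 8.
inline uint32_t div255_shift_add(uint32_t x) {
    uint32_t v = x + 128;
    return (v + (v >> 8)) >> 8;
}

// Scalar model of one lane of _mm_mulhi_epu16: the high 16 bits of an
// unsigned 16x16-bit multiply.
inline uint16_t mulhi_u16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// The formulation from the commit subject: ((x+128)*257) >> 16.
// x + 128 must fit in 16 bits, which holds for x <= 255*255.
inline uint32_t div255_mulhi(uint32_t x) {
    return mulhi_u16(static_cast<uint16_t>(x + 128), 257);
}

// Counts inputs in [0, 255*255] where the three formulations disagree.
inline int count_div255_mismatches() {
    int mismatches = 0;
    for (uint32_t x = 0; x <= 255u * 255u; ++x) {
        uint32_t expected = div255_reference(x);
        if (div255_shift_add(x) != expected || div255_mulhi(x) != expected) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Calling `count_div255_mismatches()` checks every input in the assumed range exhaustively; a result of zero is consistent with the "no pixels change" expectation in the commit message.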