From 2aab22a58a366df4752c1cf0f004092c6e7be335 Mon Sep 17 00:00:00 2001
From: mtklein
Date: Fri, 26 Jun 2015 10:46:31 -0700
Subject: Color dodge and burn with SkPMFloat.

Both 25-35% faster with SSE.

With NEON, Burn measures as a ~10% regression, Dodge a huge 2.9x improvement.

The Burn regression is somewhat artificial: we're drawing random colored rects
onto an opaque white dst, so we're heavily biased toward the (d==da) fast path
in the serial code. In the vector code there's no short-circuiting and we
always pay a fixed cost for ColorBurn regardless of src or dst content.

Dodge's fast paths, in contrast, only trigger when (s==sa) or (d==0), neither
of which happens any more than randomly in our benchmark. I don't think (d==0)
should happen at all. Similarly, the (s==0) Burn fast path is really only going
to happen as often as SkRandom allows.

In practice, the existing Burn benchmark is hitting its fast path 100% of the
time. So I actually feel really great that this only dings the benchmark by
10%.

Chrome's still guarded by SK_SUPPORT_LEGACY_XFERMODES, which I'll lift after
finishing the last xfermode, SoftLight.

BUG=skia:

Review URL: https://codereview.chromium.org/1214443002
---
 src/opts/SkPMFloat_sse.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/opts/SkPMFloat_sse.h b/src/opts/SkPMFloat_sse.h
index 802b17ba0c..28aa90bf29 100644
--- a/src/opts/SkPMFloat_sse.h
+++ b/src/opts/SkPMFloat_sse.h
@@ -33,4 +33,9 @@ inline SkPMColor SkPMFloat::round() const {
     return c;
 }
 
+inline Sk4f SkPMFloat::alphas() const {
+    static_assert(SK_A32_SHIFT == 24, "");
+    return _mm_shuffle_ps(fVec, fVec, 0xff);  // Read as 11 11 11 11, copying lane 3 to all lanes.
+}
+
 }  // namespace
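
Note on the shuffle immediate: the comment in alphas() reads 0xff as four 2-bit
lane selectors. The snippet below is a portable scalar model of that decoding,
offered only as an illustrative sketch. It is not Skia code: Vec4,
shuffle_model, and alphas_model are hypothetical names introduced here, and the
model assumes the documented _mm_shuffle_ps behavior, where the two low fields
select lanes from the first operand and the two high fields select lanes from
the second.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 4-lane float register (not a Skia type).
using Vec4 = std::array<float, 4>;

// Scalar model of _mm_shuffle_ps(a, b, imm): each 2-bit field of imm
// selects a source lane. Result lanes 0 and 1 are read from a,
// result lanes 2 and 3 are read from b.
inline Vec4 shuffle_model(const Vec4& a, const Vec4& b, std::uint8_t imm) {
    return {
        a[(imm >> 0) & 3],
        a[(imm >> 2) & 3],
        b[(imm >> 4) & 3],
        b[(imm >> 6) & 3],
    };
}

// Mirrors the shape of alphas() in the patch: both operands are the same
// vector and imm is 0xff (binary 11 11 11 11), so every result lane reads
// source lane 3.
inline Vec4 alphas_model(const Vec4& v) {
    return shuffle_model(v, v, 0xff);
}
```

Under this model, a pixel held as {10, 20, 30, 255} comes back as 255 in all
four lanes. The static_assert in the patch appears to guard the matching
assumption on the packed side: with SK_A32_SHIFT == 24, alpha occupies the top
byte of the 32-bit pixel, which is presumably what ends up in lane 3 of fVec.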