path: root/src/opts
Commit message | Author | Age
* Revert of Tidy up SkNx_neon. (patchset #3 id:40001 of https://codereview.chromium.org/2196773002/ ) | mtklein | 2016-07-30
|
| Reason for revert:
| https://luci-milo.appspot.com/swarming/task/3055149a25621b10
|
| Not Nexus 5 specific. Reproduces on Pixel C with --gcc -t Debug -d arm_v7_neon. Not sure about other configs yet.
|
| Original issue's description:
| > Tidy up SkNx_neon.
| >
| > This takes advantage of the fact that all the compilers we use that
| > support NEON implement it with their own vector extensions. This means
| > normal things like c = a + b work on the underlying vector types already.
| > Odd instructions like min or saturated add need to stay intrinsics.
| >
| > Also, rearrange functions to a more consistent order.
| >
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2196773002
| > CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/6ad22315eb6eacfcd35497cd118440a619d05b18
|
| TBR=msarett@google.com,mtklein@chromium.org
| # Not skipping CQ checks because original CL landed more than 1 days ago.
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2196953002
* simplify neon shifts | mtklein | 2016-07-29
| These still generate vshr/vshl with immediates with both GCC and Clang.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2194953002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Based on https://codereview.chromium.org/2196773002
|
| Review-Url: https://codereview.chromium.org/2194953002
* Tidy up SkNx_neon. | mtklein | 2016-07-29
| This takes advantage of the fact that all the compilers we use that
| support NEON implement it with their own vector extensions. This means
| normal things like c = a + b work on the underlying vector types already.
| Odd instructions like min or saturated add need to stay intrinsics.
|
| Also, rearrange functions to a more consistent order.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2196773002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2196773002
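The "c = a + b works on the underlying vector types" point can be illustrated with the generic GCC/Clang vector extension. This is only a sketch: `float4`, `add4` and `mul_add4` are made-up names, and it uses `vector_size` rather than the NEON types (`float32x4_t` and friends) that the commit actually touches.

```cpp
#include <cassert>

// GCC/Clang vector extension: four floats packed in one 16-byte vector.
// `float4` is a hypothetical name for illustration, not a Skia or NEON type.
typedef float float4 __attribute__((vector_size(16)));

// Ordinary operators apply lane by lane; no intrinsic call is needed.
inline float4 add4(float4 a, float4 b) { return a + b; }

// A lane-wise multiply-add written with plain operators.
inline float4 mul_add4(float4 a, float4 b, float4 c) { return a * b + c; }
```

Operations with no operator spelling, such as lane-wise min or saturating add, would still need an intrinsic, as the commit message notes.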
* SkNx: add Sk4u | mtklein | 2016-07-29
| This lets us get at logical >> in a nicely principled way.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2197683002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2197683002
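The distinction an unsigned lane type buys can be shown in scalar form. The helper names below are hypothetical; the sketch assumes only standard C++ shift semantics.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift: on mainstream compilers the sign bit is replicated into
// the vacated high bits (implementation-defined before C++20 for negatives).
inline int32_t shr_arithmetic(int32_t x, int bits) { return x >> bits; }

// Logical shift: zeros are shifted in. Using an unsigned element type is the
// principled way to ask for this behaviour.
inline uint32_t shr_logical(uint32_t x, int bits) {
    assert(bits >= 0 && bits < 32);  // shifting by >= width is undefined
    return x >> bits;
}
```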
* Add color space xform support to SkJpegCodec (includes F16!) | msarett | 2016-07-29
| Also changes SkColorXform to support:
| RGBA->RGBA
| RGBA->BGRA
|
| Instead of:
| RGBA->SkPMColor
|
| TBR=reed@google.com
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b
| Review-Url: https://codereview.chromium.org/2174493002
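The difference between the RGBA and BGRA destinations is a swap of the red and blue bytes. A minimal scalar sketch, assuming packed 8888 pixels with alpha in the top byte; `swap_rb` and `rgba_to_bgra` are illustrative names rather than the SkColorXform API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Swap the bytes at bit positions 0-7 and 16-23 (R and B), leaving the bytes
// at 8-15 and 24-31 (G and A) untouched.
inline uint32_t swap_rb(uint32_t px) {
    return (px & 0xFF00FF00u)
         | ((px & 0x00FF0000u) >> 16)
         | ((px & 0x000000FFu) << 16);
}

// Convert a row of n pixels; dst and src may be the same buffer.
inline void rgba_to_bgra(uint32_t* dst, const uint32_t* src, size_t n) {
    assert(dst != nullptr && src != nullptr);
    for (size_t i = 0; i < n; i++) {
        dst[i] = swap_rb(src[i]);
    }
}
```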
* Revert of Add color space xform support to SkJpegCodec (includes F16!) (patchset #9 id:260001 of https://codereview.chromium.org/2174493002/ ) | msarett | 2016-07-28
|
| Reason for revert:
| Breaking MSAN
|
| Original issue's description:
| > Add color space xform support to SkJpegCodec (includes F16!)
| >
| > Also changes SkColorXform to support:
| > RGBA->RGBA
| > RGBA->BGRA
| >
| > Instead of:
| > RGBA->SkPMColor
| >
| > TBR=reed@google.com
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
| > CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/73d55332e2846dd05e9efdaa2f017bcc3872884b
|
| TBR=mtklein@google.com,reed@google.com,herb@google.com,brianosman@google.com
| # Skipping CQ checks because original CL landed less than 1 days ago.
| NOPRESUBMIT=true
| NOTREECHECKS=true
| NOTRY=true
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2195523002
* Add color space xform support to SkJpegCodec (includes F16!) | msarett | 2016-07-28
| Also changes SkColorXform to support:
| RGBA->RGBA
| RGBA->BGRA
|
| Instead of:
| RGBA->SkPMColor
|
| TBR=reed@google.com
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2174493002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2174493002
* Fix alignment problems in NEON Sk4b. | mtklein | 2016-07-26
| As written at head, the compiler can assume these loads and stores are
| 4 byte aligned [1]. We want Sk4b to load from any 1-byte aligned
| address, to prevent crashes like [2].
|
| [1] https://llvm.org/bugs/show_bug.cgi?id=24421
| [2] https://luci-milo.appspot.com/swarming/task/304079e125b1b910/steps/nanobench/0/stdout
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2183133002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2183133002
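The portable way to move four bytes to or from an address with no alignment guarantee is `memcpy`, which compilers typically lower to a single load or store. This is a generic sketch with hypothetical helper names, not the NEON code from the commit: dereferencing a `uint32_t*` would let the compiler assume 4-byte alignment, while `memcpy` makes no such promise.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read 4 bytes from any address, however it is aligned.
inline uint32_t load_u32_unaligned(const void* ptr) {
    uint32_t v;
    std::memcpy(&v, ptr, sizeof(v));
    return v;
}

// Write 4 bytes to any address, however it is aligned.
inline void store_u32_unaligned(void* ptr, uint32_t v) {
    std::memcpy(ptr, &v, sizeof(v));
}
```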
* Add Sk4h_load4 for loading F16. | mtklein | 2016-07-26
| Should feel very similar to Sk4h_store4:
| NEON uses its native instruction, SSE unpacks manually.
|
| Since we'll have our F16s in 4 Sk4h by the time we're done here,
| this also extracts an Sk4h->Sk4f routine from the old uint64_t->Sk4f one.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2184753002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2184753002
* Use sk_srgb_to_linear_trunc in SkColorXform_opts | msarett | 2016-07-25
| This gives us a little more control over instruction order,
| allowing us to pipeline the muls and get better performance.
|
| Technically, clang should be able to do this for us anyway...
|
| Performance on HP z620 (201295.jpg):
| toSRGB: 371us -> 356us
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2175413002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2175413002
* Delete SkDefaultXform, handle edge cases in SkColorSpaceXform_Base | msarett | 2016-07-25
| "Edge" cases include:
| (1) Matrices with translation
| (2) colorLUTs
|
| Performance on HP z620:
|
| 201295.jpg
| to2Dot2: 386us -> 414us
| toSRGB:  346us -> 371us
| toHalf:  282us -> 302us
|
| strange0-translate.jpg
| toSRGB:  1060us -> 244us
|
| strange1-colorLUT.jpg
| toSRGB:  2.74ms -> 2.00ms
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2177173003
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2177173003
* Add clamp to sk_linear_to_srgb, reorder instructions | msarett | 2016-07-22
| Improves performance for xforms toSRGB and to2Dot2.
|
| Seems more optimal to save clamping until the end.
| That way we don't stall the mul pipeline with a min/max.
|
| toSRGB:  371us -> 346us
| to2Dot2: 404us -> 387us
|
| FWIW, it probably makes sense to clamp inside
| sk_linear_to_srgb anyway. If not, we should
| potentially provide two versions (one that clamps
| and one that doesn't).
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2173803002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2173803002
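"Clamp at the end" can be sketched in scalar form: run the whole transfer curve first, then apply a single min/max just before converting to a byte. The sketch assumes the textbook piecewise sRGB formula with `std::pow`, not the faster approximation that sk_linear_to_srgb uses, and `linear_to_srgb_byte` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert one linear channel value to an 8-bit sRGB-encoded value.
inline uint8_t linear_to_srgb_byte(float linear) {
    assert(!std::isnan(linear));  // NaN handling is out of scope for this sketch

    // All of the math happens first, on the unclamped input...
    float s = (linear <= 0.0031308f)
                  ? 12.92f * linear
                  : 1.055f * std::pow(linear, 1.0f / 2.4f) - 0.055f;

    // ...and the single clamp to [0, 1] comes last.
    s = std::fmin(std::fmax(s, 0.0f), 1.0f);
    return static_cast<uint8_t>(s * 255.0f + 0.5f);
}
```

Out-of-range inputs (below 0 or above 1) still land on 0 or 255 because the clamp sees the final encoded value.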
* Miscellaneous color space refactors | msarett | 2016-07-21
| (1) Use float matrix[16] everywhere (enables future code sharing).
| (2) SkColorLookUpTable refactors
|     *** Store in a single allocation (like SkGammas)
|     *** Eliminate fOutputChannels (we always require 3, and probably
|         always will)
| (3) Change names of read_big_endian_* helpers
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2166093003
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2166093003
* Correct sRGB <-> linear everywhere. | mtklein | 2016-07-20
| This trims the SkPM4fPriv methods down to just foolproof methods.
| (Anything trying to build these itself is probably wrong.)
|
| Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore,
| at least not efficiently, so this refactor is somewhat more invasive
| than you might think. Generally this means things using to_4f() are
| also making a misstep... that's gone too.
|
| It also does not make sense to try to play games with linear floats
| with 255 bias any more. That hack can't work with real sRGB coding.
|
| Rather than update them, I've removed a couple of L32 xfermode fast
| paths. I'd even rather drop it entirely...
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2163683002
* Tune linear->sRGB constants to round-trip all bytes. | mtklein | 2016-07-20
| I basically just ran a big 5-deep for-loop over the five constants here.
| This is the first set of coefficients I found that round trips all bytes.
| I suspect there are many such sets.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2162063003
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2162063003
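The property being tuned for, that every byte survives sRGB -> linear -> sRGB, is easy to state as a check. The sketch below runs that check against the exact piecewise sRGB formulas in double precision. It is not the approximation from the commit, and none of the five tuned constants appear here. A search like the one described would wrap a check of this shape in nested loops over candidate coefficients.

```cpp
#include <cassert>
#include <cmath>

// Exact sRGB decoding (encoded value in [0,1] to linear light).
inline double srgb_to_linear(double s) {
    return (s <= 0.04045) ? s / 12.92 : std::pow((s + 0.055) / 1.055, 2.4);
}

// Exact sRGB encoding (linear light to encoded value in [0,1]).
inline double linear_to_srgb(double l) {
    return (l <= 0.0031308) ? 12.92 * l : 1.055 * std::pow(l, 1.0 / 2.4) - 0.055;
}

// Returns the first byte that fails to round-trip, or -1 if all 256 survive.
inline int first_round_trip_failure() {
    for (int b = 0; b < 256; b++) {
        double linear = srgb_to_linear(b / 255.0);
        int back = static_cast<int>(std::lround(linear_to_srgb(linear) * 255.0));
        if (back != b) {
            return b;
        }
    }
    return -1;
}
```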
* Improve naive SkColorXform to half floats | msarett | 2016-07-19
| This should give us a good baseline to explore using
| SkRasterPipeline.
|
| A particular colorxform to half float drops from 425us
| to 282us on my desktop.
|
| Color Xform to Half Float (HP z620)
| Original                                 425us
| Trans16 (not 32)                         355us
| Vector Trans16                           378us
| Trans16 + Keep Halfs in Vector           335us
| Vector Trans16 + Keep Halfs in Vector    282us
| Final                                    282us
|
| Color Xform to Half Float (Nexus 5X)
| Original                                 556us
| Final                                    472us
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2159993003
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2159993003
* Add capability for SkColorXform to output half floats | msarett | 2016-07-15
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147763002
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2147763002
* Add a bench to measure the best way to pack from int to uint16_t with SSE. | mtklein | 2016-07-15
| I measured relative runtimes on my laptop:
|
| pack_int_uint16_t_ss…  1036
|     …e41    1x
|     …se3    1.01x
|     …e2_b   3.01x
|     …e2_a   3.02x
|
| I've run into Clang problems with the actual _mm_packus_epi32
| instruction, I think, so I'm going to exercise a little cowardice and
| leave that option disabled for now.
|
| The ssse3 version probably looks a little faster than it will be in
| practice. We'll usually need to load its mask, which here is hoisted
| out of the bench loop.
|
| The two sse2 variants are close enough in speed that I'm tie breaking
| them on other concerns: the <<16, >>16 version doesn't need any scratch
| registers or to load any constants, so it wins.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2150343002
| CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
|
| Review-Url: https://codereview.chromium.org/2150343002
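A scalar model may help explain why a "<<16, >>16" pair lets SSE2 pack to uint16_t. This is my reading of that variant, not code taken from the commit: SSE2 has only a signed-saturating 32-to-16 pack (`_mm_packs_epi32`), so sign-extending the low 16 bits first guarantees the value already fits in int16_t and the saturation never triggers. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Signed saturation to int16_t, i.e. what a signed-saturating pack does per lane.
inline int16_t saturate_i16(int32_t v) {
    if (v < INT16_MIN) return INT16_MIN;
    if (v > INT16_MAX) return INT16_MAX;
    return static_cast<int16_t>(v);
}

// Keep the low 16 bits of x, as if each lane were shifted left by 16 and then
// arithmetically right by 16 before the saturating pack.
inline uint16_t pack_low16(int32_t x) {
    // Shift left in unsigned arithmetic to avoid signed overflow, then rely on
    // the arithmetic right shift (implementation-defined before C++20, but
    // arithmetic on mainstream compilers) to sign-extend bit 15.
    int32_t sign_extended =
        static_cast<int32_t>(static_cast<uint32_t>(x) << 16) >> 16;
    assert(sign_extended >= INT16_MIN && sign_extended <= INT16_MAX);
    return static_cast<uint16_t>(saturate_i16(sign_extended));
}
```

Without the shifts, an input such as 0xFFFF would saturate to 0x7FFF instead of passing through unchanged.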
* Expand _01 half<->float limitation to _finite. Simplify. | mtklein | 2016-07-15
| It's become clear we need to sometimes deal with values <0 or >1.
| I'm not yet convinced we care about NaN or +-inf.
|
| We had some fairly clever tricks and optimizations here for NEON
| and SSE. I've thrown them out in favor of a single implementation.
| If we find the specializations mattered, we can certainly figure out
| how to extend them to this new range/domain.
|
| This happens to add a vectorized float -> half for ARMv7, which was
| missing from the _01 version. (The SSE strategy was not portable to
| platforms that flush denorm floats to zero.)
|
| I've tested the full float range for FloatToHalf on my desktop and a 5x.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
|
| Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
| Review-Url: https://codereview.chromium.org/2145663003
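The "_finite" domain covers every half value except infinity and NaN, including negatives, values above 1, and denormals. A scalar reference for the half -> float direction can be sketched as below, assuming the IEEE 754 binary16 layout (1 sign bit, 5 exponent bits, 10 mantissa bits). `half_to_float_finite` is a hypothetical name, and this is not the vectorized implementation from the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Decode a finite IEEE 754 half-precision value to float.
inline float half_to_float_finite(uint16_t h) {
    int sign = h >> 15;
    int exp  = (h >> 10) & 0x1F;
    int mant = h & 0x3FF;
    assert(exp != 0x1F);  // inf and NaN are outside the _finite domain

    // Denormal: mant * 2^-24.  Normal: (1024 + mant) * 2^(exp - 25),
    // which equals (1 + mant/1024) * 2^(exp - 15).
    float mag = (exp == 0)
                    ? std::ldexp(static_cast<float>(mant), -24)
                    : std::ldexp(static_cast<float>(1024 + mant), exp - 25);
    return sign ? -mag : mag;
}
```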
* Revert of Expand _01 half<->float limitation to _finite. Simplify. (patchset #7 id:120001 of https://codereview.chromium.org/2145663003/ ) | mtklein | 2016-07-14
|
| Reason for revert:
| Unit tests fail on Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast
|
| Original issue's description:
| > Expand _01 half<->float limitation to _finite. Simplify.
| >
| > It's become clear we need to sometimes deal with values <0 or >1.
| > I'm not yet convinced we care about NaN or +-inf.
| >
| > We had some fairly clever tricks and optimizations here for NEON
| > and SSE. I've thrown them out in favor of a single implementation.
| > If we find the specializations mattered, we can certainly figure out
| > how to extend them to this new range/domain.
| >
| > This happens to add a vectorized float -> half for ARMv7, which was
| > missing from the _01 version. (The SSE strategy was not portable to
| > platforms that flush denorm floats to zero.)
| >
| > I've tested the full float range for FloatToHalf on my desktop and a 5x.
| >
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
| > CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90
|
| TBR=msarett@google.com,mtklein@chromium.org
| # Skipping CQ checks because original CL landed less than 1 days ago.
| NOPRESUBMIT=true
| NOTREECHECKS=true
| NOTRY=true
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2151023003
* Expand _01 half<->float limitation to _finite. Simplify. | mtklein | 2016-07-14
| It's become clear we need to sometimes deal with values <0 or >1.
| I'm not yet convinced we care about NaN or +-inf.
|
| We had some fairly clever tricks and optimizations here for NEON
| and SSE. I've thrown them out in favor of a single implementation.
| If we find the specializations mattered, we can certainly figure out
| how to extend them to this new range/domain.
|
| This happens to add a vectorized float -> half for ARMv7, which was
| missing from the _01 version. (The SSE strategy was not portable to
| platforms that flush denorm floats to zero.)
|
| I've tested the full float range for FloatToHalf on my desktop and a 5x.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2145663003
* Update SkOpts namespaces. | mtklein | 2016-07-13
| If we make sure all SkOpts functions are static, we can give the namespaces any
| name we like. This lets us drop the sk_ prefix and give a real indication of
| the default SIMD instruction set rather than just saying sk_default.
|
| Both of these changes help debugger, profiler, and crash report readability.
|
| Perhaps more importantly, keeping these functions static helps prevent
| accidentally linking in unused versions of functions, as you see here with
| sk_avx::srcover_srgb_srgb().
|
| This requires we update SkBlend_opts tests and benches to call SkOpts functions
| through SkOpts rather than declaring the methods externally. In practice this
| drops testing of the SSE2 version on machines with SSE4. If we still really
| need to test/bench the compile time best SIMD level version of this method
| against the runtime detected best, we can include SkBlend_opts.h into the tests
| or benches directly, similar to what we do for the trivial, brute-force, or
| best non-SIMD versions.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145833002
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2145833002
* SkRasterPipeline preliminaries | mtklein | 2016-07-12
| Re-uploading to see if I can get a CL number < 2^31.
|
| patch from issue 2147533002 at patchset 240001 (http://crrev.com/2147533002#ps240001)
|
| Already reviewed at the other crrev link.
| TBR=
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2147533002
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2144573004
* Remove bloat from SkBlend_opts. | herb | 2016-07-12
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2130183003
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2130183003
* Add Sk4f_RoundToInt | msarett | 2016-07-12
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2134753006
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2134753006
* Revert of try to speed-up maprect + round2i + contains (patchset #8 id:140001 of https://codereview.chromium.org/2133413002/ ) | msarett | 2016-07-11
|
| Reason for revert:
| Breaking the roll...
|
| https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/253294/steps/compile%20%28with%20patch%29/logs/stdio
|
| Original issue's description:
| > try to speed-up maprect + round2i + contains
| >
| > We call roundOut in a few places. If we can get SkNx::Ceil we could efficiently implement that as well.
| >
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2133413002
| > CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/b42b785d1cbc98bd34aceae338060831b974f9c5
|
| TBR=mtklein@google.com,reed@google.com
| # Skipping CQ checks because original CL landed less than 1 days ago.
| NOPRESUBMIT=true
| NOTREECHECKS=true
| NOTRY=true
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2136343002
* try to speed-up maprect + round2i + contains | reed | 2016-07-11
| We call roundOut in a few places. If we can get SkNx::Ceil we could efficiently implement that as well.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2133413002
| CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2133413002
* Clean up hyper-local SkCpu feature test experiment. | mtklein | 2016-07-11
| This removes the code paths where we make SkCpu::Supports() calls from within a
| tight loop. It keeps code paths using SkCpu::Supports() to choose entire
| routines from src/opts/.
|
| We can't rely on these hyper-local checks to be hoisted up reliably enough. It
| worked pretty well with the first couple platforms we tried (e.g. Clang on
| Linux/Mac) but we can't guarantee it works everywhere.
|
| Further, I'm not able to actually do anything fancy with those tests outside of
| x86... I've not found a way to get, say, NEON+F16 conversion code embedded into
| ordinary NEON code outside writing the entire function in external assembly.
|
| This whole idea becomes less important now that we've got a way to chain
| separate function calls together efficiently. We can now, e.g., use an AVX+F16C
| method to load some pixels, then chain that into an ordinary AVX method to color
| filter them.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2138073002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2138073002
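"Choose entire routines" rather than testing CPU features inside the loop can be sketched as a function pointer picked once, outside any hot path. Everything named here is hypothetical: `cpu_has_fancy_feature()` stands in for a runtime query such as SkCpu::Supports(), and the two `fill` routines stand in for per-instruction-set versions of one routine. The namespaces hold only static functions, so an unused version cannot be linked in by accident.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace portable {
    // Baseline version: one pixel per iteration.
    static void fill(uint32_t* dst, uint32_t color, size_t n) {
        for (size_t i = 0; i < n; i++) dst[i] = color;
    }
}

namespace fancy {
    // Stand-in for a version built for a newer instruction set: 4 per iteration.
    static void fill(uint32_t* dst, uint32_t color, size_t n) {
        size_t i = 0;
        for (; i + 4 <= n; i += 4) {
            dst[i] = color; dst[i + 1] = color; dst[i + 2] = color; dst[i + 3] = color;
        }
        for (; i < n; i++) dst[i] = color;
    }
}

// Hypothetical runtime feature query; a real one would inspect the CPU.
static bool cpu_has_fancy_feature() { return false; }

using FillFn = void (*)(uint32_t*, uint32_t, size_t);

// The feature test runs once here; the loops above contain no checks at all.
static FillFn choose_fill() {
    return cpu_has_fancy_feature() ? fancy::fill : portable::fill;
}
```

Callers would store the result of `choose_fill()` once and call through it, so nothing depends on the compiler hoisting a check out of a loop.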
* Make all color xforms 'fast' (step 1) | msarett | 2016-07-11
| This refactors opt code to handle arbitrary src and dst
| gammas that are specified by tables.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2130013002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2130013002
* Move sRGB <-> linear conversion components to their own files. | mtklein | 2016-07-08
| This makes them a little easier to use outside SkColorXform code.
|
| I've added some notes about how best to use them and their eccentricities, and added a test.
|
| Ultimately any software sRGB <-> linear conversion should funnel somehow through here.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128893002
| CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Committed: https://skia.googlesource.com/skia/+/45e58c8807179638980aae8503573b950b844e4c
| Review-Url: https://codereview.chromium.org/2128893002
* Revert of Move sRGB <-> linear conversion components to their own files. (patchset #5 id:80001 of https://codereview.chromium.org/2128893002/ ) | mtklein | 2016-07-07
|
| Reason for revert:
| Monotonicity assert is failing on ARM. (Different rsqrt() and invert() precision?)
|
| Will investigate a bit tomorrow... might reland with the test TODO.
|
| Original issue's description:
| > Move sRGB <-> linear conversion components to their own files.
| >
| > This makes them a little easier to use outside SkColorXform code.
| >
| > I've added some notes about how best to use them and their eccentricities, and added a test.
| >
| > Ultimately any software sRGB <-> linear conversion should funnel somehow through here.
| >
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128893002
| > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/45e58c8807179638980aae8503573b950b844e4c
|
| TBR=reed@google.com,msarett@google.com,mtklein@chromium.org
| # Skipping CQ checks because original CL landed less than 1 days ago.
| NOPRESUBMIT=true
| NOTREECHECKS=true
| NOTRY=true
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2131793002
* Move sRGB <-> linear conversion components to their own files. | mtklein | 2016-07-07
| This makes them a little easier to use outside SkColorXform code.
|
| I've added some notes about how best to use them and their eccentricities, and added a test.
|
| Ultimately any software sRGB <-> linear conversion should funnel somehow through here.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2128893002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2128893002
* Add stub for avx. | herb | 2016-06-23
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2087343002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2087343002
* Do loads and math in parallel in SkColorXform_opts | msarett | 2016-06-22
| Note that baselines have changed a little since I recently
| started using clang.
|
| 201295.jpg on HP z620 (300x280)
|
| Skia Xform sRGB Dst Before   0.378 ms
| Skia Xform sRGB Dst After    0.322 ms   1.17x
|
| Skia Xform 2.2 Dst Before    0.428 ms
| Skia Xform 2.2 Dst After     0.395 ms   1.08x
|
| QCMS Xform                   0.418 ms
|
| sRGB Dst vs QCMS             1.30x
| 2.2 Dst vs QCMS              1.06x
|
| --------------------------------------------
|
| Nexus 6P:
| Skia Xform sRGB Dst Before   1.58 ms
| Skia Xform sRGB Dst After    1.43 ms
| Skia Xform 2.2 Dst Before    2.69 ms
| Skia Xform 2.2 Dst After     2.62 ms
|
| Dell Venue 8:
| Skia Xform sRGB Dst Before   2.78 ms
| Skia Xform sRGB Dst After    2.74 ms
| Skia Xform 2.2 Dst Before    3.73 ms
| Skia Xform 2.2 Dst After     3.64 ms
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2081933005
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2081933005
* Use a table-based implementation of SkDefaultXform | msarett | 2016-06-22
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2084673002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2084673002
* Support sRGB dsts in opt code | msarett | 2016-06-20
| 201295.jpg on HP z620 (300x280)
|
| QCMS Xform        0.418 ms
| Skia NEW Xform    0.378 ms
| Vs QCMS           1.11x
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2078623002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2078623002
* Quick bandaid for chromium:611002. | mtklein | 2016-06-17
| We're somehow receiving non-premultiplied src inputs like 0x00ffffff to
| this SrcOver blend. That's a bug I intend to follow up on.
|
| But for a quick compatibility fix, go back to treating values like
| 0x00ffffff as transparent, like we used to before crrev.com/1820313002.
|
| This will not affect the correctness of code paths using properly
| premultiplied colors.
|
| This should not change performance in any meaningful way. The SIMD code
| paths (handling strides of 16 pixels at a time) happen to treat invalid
| colors like 0x00ffffff as transparent already.
|
| BUG=chromium:611002
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2075173002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2075173002
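The compatibility behaviour described above can be sketched as an alpha check in front of a scalar SrcOver. This is an illustration under assumptions, not the SkBlend_opts code: `srcover_one` is a hypothetical name, pixels are packed 8888 with alpha in the top byte, and the general case uses a simple rounded divide by 255.

```cpp
#include <cassert>
#include <cstdint>

// Blend one premultiplied src pixel over dst.
inline uint32_t srcover_one(uint32_t dst, uint32_t src) {
    uint32_t a = src >> 24;
    if (a == 0x00) return dst;  // also catches invalid inputs like 0x00ffffff
    if (a == 0xFF) return src;  // opaque src replaces dst outright

    uint32_t inv = 255 - a;
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        uint32_t c = s + (d * inv + 127) / 255;  // src + dst * (1 - alpha)
        if (c > 255) c = 255;  // only reachable for non-premultiplied src
        out |= c << shift;
    }
    return out;
}
```

With properly premultiplied input, alpha 0 implies every channel is 0, so the early return changes nothing. It matters only for malformed pixels such as 0x00ffffff.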
* port SkColorXform_opts to Sk4f | mtklein | 2016-06-17
| I have tested that this compiles.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2078913003
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2078913003
* Clean up two unlaunched SSE 4.1 8888 blits. | mtklein | 2016-06-16
| This code was running on our bots but never in Chrome.
| That's a bad state to be in.
|
| My plan here used to be to redesign how our 8888 blits worked in SSE 4.1, mainly
| for perfect correctness but also for speed, then to spread what I learned there
| to SSE2, AVX+, and NEON.
|
| I have since lost interest in changing any aspect of how our legacy 8888 blits
| work. There's not much point in making them a bit or two more correct when the
| math is fundamentally wrong.
|
| This will cause many diffs in Gold, none perceptible.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2062853002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Committed: https://skia.googlesource.com/skia/+/6e472093009bf2fc4a8e53010b51040efcb71213
| Review-Url: https://codereview.chromium.org/2062853002
* Implement fast, correct gamma conversion for color xforms | msarett | 2016-06-16
| 201295.jpg on HP z620
| (300x280, most common form of sRGB profile)
|
| QCMS Xform        0.495 ms
| Skia Old Xform    0.235 ms
| Skia NEW Xform    0.423 ms
|
| Vs Old Code       0.56x
| Vs QCMS           1.17x
|
| So to summarize, we are now much slower than before, but
| still a bit faster than QCMS. And now we are also far
| more accurate than QCMS :).
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2060823003
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2060823003
* Revert of Clean up two unlaunched SSE 4.1 8888 blits. (patchset #1 id:1 of https://codereview.chromium.org/2062853002/ ) | mtklein | 2016-06-13
|
| Reason for revert:
| Breaks a couple Google3 goldens. I need to rebaseline google3 with -DSK_SUPPORT_LEGACY_X86_BLITS first, then reland this.
|
| Original issue's description:
| > Clean up two unlaunched SSE 4.1 8888 blits.
| >
| > This code was running on our bots but never in Chrome.
| > That's a bad state to be in.
| >
| > My plan here used to be to redesign how our 8888 blits worked in SSE 4.1, mainly
| > for perfect correctness but also for speed, then to spread what I learned there
| > to SSE2, AVX+, and NEON.
| >
| > I have since lost interest in changing any aspect of how our legacy 8888 blits
| > work. There's not much point in making them a bit or two more correct when the
| > math is fundamentally wrong.
| >
| > This will cause many diffs in Gold, none perceptible.
| >
| > BUG=skia:
| > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2062853002
| > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
| >
| > Committed: https://skia.googlesource.com/skia/+/6e472093009bf2fc4a8e53010b51040efcb71213
|
| TBR=reed@google.com
| # Skipping CQ checks because original CL landed less than 1 days ago.
| NOPRESUBMIT=true
| NOTREECHECKS=true
| NOTRY=true
| BUG=skia:
|
| Review-Url: https://codereview.chromium.org/2066453003
* Clean up two unlaunched SSE 4.1 8888 blits. | mtklein | 2016-06-13
| This code was running on our bots but never in Chrome.
| That's a bad state to be in.
|
| My plan here used to be to redesign how our 8888 blits worked in SSE 4.1, mainly
| for perfect correctness but also for speed, then to spread what I learned there
| to SSE2, AVX+, and NEON.
|
| I have since lost interest in changing any aspect of how our legacy 8888 blits
| work. There's not much point in making them a bit or two more correct when the
| math is fundamentally wrong.
|
| This will cause many diffs in Gold, none perceptible.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2062853002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| Review-Url: https://codereview.chromium.org/2062853002
* Move immintrin/arm_neon includes to where they are used. | mtklein | 2016-06-09
| On my Mac (so, immintrin), this improves compile time, both wall and cpu,
| by about 16%. To test I ran this on an SSD with files hot in their caches:
|
|   $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
|     ninja -C out/Release -t clean && \
|     time ninja -C out/Release
|
| Before: 159 wall / 3367 cpu
|         159 wall / 3368 cpu
|
| After:  137 wall / 2860 cpu
|         136 wall / 2863 cpu
|
| I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
| That made no significant difference, so I've kept immintrin for its simplicity.
|
| BUG=skia:
| GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
| CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
|
| TBR=reed@google.com
| No public API changes.
|
| Committed: https://skia.googlesource.com/skia/+/12dfaaa53c23f3d03050bde8f64136ac1f44164a
| Review-Url: https://codereview.chromium.org/2045633002
* Optimize color xforms with 2.2 gammas for SSE2  msarett  2016-06-08
  Because we recognize commonly used gamma tables and parameters as 2.2f,
  about 98% of jpegs with color profiles will pass through this xform
  (assuming the dst is also 2.2f). Sample size is 10,322 jpegs.

  I won't go crazy with performance numbers because this is a work in
  progress, particularly in terms of correctness.

  201295.jpg on HP z620 (300x280, most common form of sRGB profile)

    Decode Time + QCMS Xform        1.28 ms
    QCMS Xform Only                 0.495 ms
    Decode Time + Skia Opt Xform    1.01 ms
    Skia Opt Xform Only             0.235 ms

    Decode Time + Xform Speed-up    1.27x
    Xform Only Speed-up             2.11x

  FWIW, Skia xform time before these optimizations was 41.1 ms. But we
  expected that code to be slow.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2046013002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Review-Url: https://codereview.chromium.org/2046013002
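  The two speed-up figures in the message above follow from dividing the QCMS timing by the Skia timing. A minimal sketch of that arithmetic, using only the four timings quoted in the message; the function and constant names are hypothetical and not part of Skia:

```cpp
#include <cassert>
#include <cstdio>

// Speed-up of a new timing relative to an old one: old / new.
// Both arguments must use the same unit (milliseconds here).
inline double speedup(double old_ms, double new_ms) {
    assert(new_ms > 0.0);
    return old_ms / new_ms;
}

// Timings quoted in the commit message for 201295.jpg (hypothetical names).
const double kDecodePlusQcmsMs = 1.28;
const double kQcmsOnlyMs       = 0.495;
const double kDecodePlusSkiaMs = 1.01;
const double kSkiaOnlyMs       = 0.235;

// Prints both ratios rounded to two decimals.
inline void print_speedups() {
    std::printf("decode + xform: %.2fx\n", speedup(kDecodePlusQcmsMs, kDecodePlusSkiaMs));
    std::printf("xform only:     %.2fx\n", speedup(kQcmsOnlyMs, kSkiaOnlyMs));
}
```

  Calling print_speedups() from a main function reproduces the two ratios; the decode-inclusive figure is smaller because JPEG decode time is common to both pipelines and dilutes the gain.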
* Revert of Move immintrin/arm_neon includes to where they are used. (patchset ↵  mtklein  2016-06-07
  #2 id:20001 of https://codereview.chromium.org/2045633002/ )

  Reason for revert:
  Appears to have broken the ARMv7 aspect of the Google3 roll in bizarre seemingly-unrelated ways.

  Original issue's description:
  > Move immintrin/arm_neon includes to where they are used.
  >
  > On my Mac (so, immintrin), this improves compile time, both wall and cpu,
  > by about 16%. To test I ran this on an SSD with files hot in their caches:
  >
  >   $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
  >     ninja -C out/Release -t clean && \
  >     time ninja -C out/Release
  >
  > Before: 159 wall / 3367 cpu
  >         159 wall / 3368 cpu
  >
  > After:  137 wall / 2860 cpu
  >         136 wall / 2863 cpu
  >
  > I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
  > That made no significant difference, so I've kept immintrin for its simplicity.
  >
  > BUG=skia:
  > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
  > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  >
  > TBR=reed@google.com
  > No public API changes.
  >
  > Committed: https://skia.googlesource.com/skia/+/12dfaaa53c23f3d03050bde8f64136ac1f44164a

  TBR=herb@google.com,mtklein@chromium.org
  # Skipping CQ checks because original CL landed less than 1 days ago.
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:

  Review-Url: https://codereview.chromium.org/2046213002
* Move immintrin/arm_neon includes to where they are used.  mtklein  2016-06-07
  On my Mac (so, immintrin), this improves compile time, both wall and cpu,
  by about 16%. To test I ran this on an SSD with files hot in their caches:

    $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
      ninja -C out/Release -t clean && \
      time ninja -C out/Release

  Before: 159 wall / 3367 cpu
          159 wall / 3368 cpu

  After:  137 wall / 2860 cpu
          136 wall / 2863 cpu

  I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
  That made no significant difference, so I've kept immintrin for its simplicity.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  TBR=reed@google.com
  No public API changes.

  Review-Url: https://codereview.chromium.org/2045633002
* I have found a more efficient way of detecting 1 and 0 alpha in SSE2. In ↵  herb  2016-05-23
  addition, I found a stall on an execution unit for the lea instruction and
  rearranged the code to avoid that.

  Before
   1,362.01  LinearSrcOvericonstrip.pngVSkOptsSSE41
   2,132.54  LinearSrcOvericonstrip.pngVSkOptsDefault
   1,717.77  LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
   3,525.14  LinearSrcOvericonstrip.pngVSkOptsTrivial
  11,181.78  LinearSrcOvericonstrip.pngVSkOptsBruteForce

     644.77  LinearSrcOvermandrill_512.pngVSkOptsSSE41
     682.51  LinearSrcOvermandrill_512.pngVSkOptsDefault
   1,169.65  LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
   2,486.45  LinearSrcOvermandrill_512.pngVSkOptsTrivial
  11,635.94  LinearSrcOvermandrill_512.pngVSkOptsBruteForce

     217.76  LinearSrcOverplane.pngVSkOptsSSE41
     437.09  LinearSrcOverplane.pngVSkOptsDefault
     275.91  LinearSrcOverplane.pngVSkOptsNonSimdCore
     481.70  LinearSrcOverplane.pngVSkOptsTrivial
   1,504.66  LinearSrcOverplane.pngVSkOptsBruteForce

     323.90  LinearSrcOverbaby_tux.pngVSkOptsSSE41
     497.49  LinearSrcOverbaby_tux.pngVSkOptsDefault
     456.08  LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
     786.46  LinearSrcOverbaby_tux.pngVSkOptsTrivial
   2,554.65  LinearSrcOverbaby_tux.pngVSkOptsBruteForce

     484.83  LinearSrcOveryellow_rose.pngVSkOptsSSE41
     821.86  LinearSrcOveryellow_rose.pngVSkOptsDefault
     655.37  LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
   1,323.80  LinearSrcOveryellow_rose.pngVSkOptsTrivial
   5,802.61  LinearSrcOveryellow_rose.pngVSkOptsBruteForce

  After changes to sse2 and sse4.1
   1,343.12  LinearSrcOvericonstrip.pngVSkOptsSSE41
   1,441.17  LinearSrcOvericonstrip.pngVSkOptsDefault
   1,679.97  LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
   3,481.05  LinearSrcOvericonstrip.pngVSkOptsTrivial
  10,979.99  LinearSrcOvericonstrip.pngVSkOptsBruteForce

     574.17  LinearSrcOvermandrill_512.pngVSkOptsSSE41
     641.40  LinearSrcOvermandrill_512.pngVSkOptsDefault
   1,169.44  LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
   2,359.84  LinearSrcOvermandrill_512.pngVSkOptsTrivial
  12,106.02  LinearSrcOvermandrill_512.pngVSkOptsBruteForce

     209.95  LinearSrcOverplane.pngVSkOptsSSE41
     249.12  LinearSrcOverplane.pngVSkOptsDefault
     270.36  LinearSrcOverplane.pngVSkOptsNonSimdCore
     466.30  LinearSrcOverplane.pngVSkOptsTrivial
   1,431.14  LinearSrcOverplane.pngVSkOptsBruteForce

     309.70  LinearSrcOverbaby_tux.pngVSkOptsSSE41
     354.86  LinearSrcOverbaby_tux.pngVSkOptsDefault
     442.69  LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
     764.12  LinearSrcOverbaby_tux.pngVSkOptsTrivial
   2,756.16  LinearSrcOverbaby_tux.pngVSkOptsBruteForce

     457.70  LinearSrcOveryellow_rose.pngVSkOptsSSE41
     500.50  LinearSrcOveryellow_rose.pngVSkOptsDefault
     677.84  LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
   1,301.50  LinearSrcOveryellow_rose.pngVSkOptsTrivial
   5,786.40  LinearSrcOveryellow_rose.pngVSkOptsBruteForce

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1998373002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Review-Url: https://codereview.chromium.org/1998373002
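  The message above does not show how the 1 and 0 alpha cases are detected, so the following is only a sketch of one common SSE2 approach and not necessarily what this commit does. It assumes four 32-bit pixels with alpha in the top byte of each pixel; the names AlphaClass, classify4, classify4_scalar and check_agrees are hypothetical. It compares every byte against 0x00 and 0xFF, collects the results with _mm_movemask_epi8, and inspects only the four alpha bytes. A scalar fallback keeps it buildable where SSE2 is unavailable.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
    #include <emmintrin.h>
    #define SKETCH_HAS_SSE2 1
#else
    #define SKETCH_HAS_SSE2 0
#endif

enum class AlphaClass { kAllTransparent, kAllOpaque, kMixed };

// Scalar reference: assumes alpha lives in bits 24..31 of each pixel.
inline AlphaClass classify4_scalar(const uint32_t px[4]) {
    bool all0 = true, all255 = true;
    for (int i = 0; i < 4; i++) {
        const uint32_t a = px[i] >> 24;
        all0   = all0   && (a == 0x00);
        all255 = all255 && (a == 0xFF);
    }
    if (all0)   return AlphaClass::kAllTransparent;
    if (all255) return AlphaClass::kAllOpaque;
    return AlphaClass::kMixed;
}

inline AlphaClass classify4(const uint32_t px[4]) {
#if SKETCH_HAS_SSE2
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(px));
    // On little-endian x86 the alpha bytes are bytes 3, 7, 11 and 15,
    // which map to bits 3, 7, 11 and 15 of the movemask result.
    const int kAlphaBytes = 0x8888;
    const int zeros = _mm_movemask_epi8(_mm_cmpeq_epi8(v, _mm_setzero_si128()));
    const int ones  = _mm_movemask_epi8(
            _mm_cmpeq_epi8(v, _mm_set1_epi8(static_cast<char>(0xFF))));
    if ((zeros & kAlphaBytes) == kAlphaBytes) return AlphaClass::kAllTransparent;
    if ((ones  & kAlphaBytes) == kAlphaBytes) return AlphaClass::kAllOpaque;
    return AlphaClass::kMixed;
#else
    return classify4_scalar(px);
#endif
}

// Debug helper: the SIMD path must agree with the scalar reference.
inline void check_agrees(const uint32_t px[4]) {
    assert(classify4(px) == classify4_scalar(px));
}
```

  In a source-over blit, a caller could skip the destination write when the result is kAllTransparent and do a plain copy when it is kAllOpaque, falling back to the full blend only for kMixed.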
* Add tests and benches to support the sRGB blitter for SkOpts  herb  2016-05-17
   1,370.85  LinearSrcOvericonstrip.pngVSkOptsSSE41
   2,359.69  LinearSrcOvericonstrip.pngVSkOptsDefault
   1,828.72  LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
   3,277.40  LinearSrcOvericonstrip.pngVSkOptsTrivial
   9,862.34  LinearSrcOvericonstrip.pngVSkOptsBruteForce

     633.55  LinearSrcOvermandrill_512.pngVSkOptsSSE41
     684.29  LinearSrcOvermandrill_512.pngVSkOptsDefault
   1,201.88  LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
   2,382.63  LinearSrcOvermandrill_512.pngVSkOptsTrivial
  10,888.74  LinearSrcOvermandrill_512.pngVSkOptsBruteForce

     209.14  LinearSrcOverplane.pngVSkOptsSSE41
     562.24  LinearSrcOverplane.pngVSkOptsDefault
     272.64  LinearSrcOverplane.pngVSkOptsNonSimdCore
     436.46  LinearSrcOverplane.pngVSkOptsTrivial
   1,327.23  LinearSrcOverplane.pngVSkOptsBruteForce

     318.01  LinearSrcOverbaby_tux.pngVSkOptsSSE41
     529.05  LinearSrcOverbaby_tux.pngVSkOptsDefault
     441.33  LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
     720.50  LinearSrcOverbaby_tux.pngVSkOptsTrivial
   2,191.10  LinearSrcOverbaby_tux.pngVSkOptsBruteForce

     479.68  LinearSrcOveryellow_rose.pngVSkOptsSSE41
   1,095.03  LinearSrcOveryellow_rose.pngVSkOptsDefault
     668.60  LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
   1,257.19  LinearSrcOveryellow_rose.pngVSkOptsTrivial
   4,970.25  LinearSrcOveryellow_rose.pngVSkOptsBruteForce

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1939513002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Committed: https://skia.googlesource.com/skia/+/554784cd85029c05d9ed04b1aeb71520d196153a
  Committed: https://skia.googlesource.com/skia/+/bc927548db17accec2195af6e15053f7918bb3f5
  Review-Url: https://codereview.chromium.org/1939513002
* Revert of Add specialized sRGB blitter for SkOpts (patchset #21 id:400001 of ↵  reed  2016-05-17
  https://codereview.chromium.org/1939513002/ )

  Reason for revert:
  broke some debug bots:

  Running LinearSrcOvericonstrip.pngVSkOptsSSE41 nonrendering
  ../../../bench/SkBlend_optsBench.cpp:118: fatal error: "fPixmap.colorType() == kRGBA_8888_SkColorType"

  Original issue's description:
  > Add tests and benches to support the sRGB blitter for SkOpts
  >
  >  1,370.85  LinearSrcOvericonstrip.pngVSkOptsSSE41
  >  2,359.69  LinearSrcOvericonstrip.pngVSkOptsDefault
  >  1,828.72  LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
  >  3,277.40  LinearSrcOvericonstrip.pngVSkOptsTrivial
  >  9,862.34  LinearSrcOvericonstrip.pngVSkOptsBruteForce
  >
  >    633.55  LinearSrcOvermandrill_512.pngVSkOptsSSE41
  >    684.29  LinearSrcOvermandrill_512.pngVSkOptsDefault
  >  1,201.88  LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
  >  2,382.63  LinearSrcOvermandrill_512.pngVSkOptsTrivial
  > 10,888.74  LinearSrcOvermandrill_512.pngVSkOptsBruteForce
  >
  >    209.14  LinearSrcOverplane.pngVSkOptsSSE41
  >    562.24  LinearSrcOverplane.pngVSkOptsDefault
  >    272.64  LinearSrcOverplane.pngVSkOptsNonSimdCore
  >    436.46  LinearSrcOverplane.pngVSkOptsTrivial
  >  1,327.23  LinearSrcOverplane.pngVSkOptsBruteForce
  >
  >    318.01  LinearSrcOverbaby_tux.pngVSkOptsSSE41
  >    529.05  LinearSrcOverbaby_tux.pngVSkOptsDefault
  >    441.33  LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
  >    720.50  LinearSrcOverbaby_tux.pngVSkOptsTrivial
  >  2,191.10  LinearSrcOverbaby_tux.pngVSkOptsBruteForce
  >
  >    479.68  LinearSrcOveryellow_rose.pngVSkOptsSSE41
  >  1,095.03  LinearSrcOveryellow_rose.pngVSkOptsDefault
  >    668.60  LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
  >  1,257.19  LinearSrcOveryellow_rose.pngVSkOptsTrivial
  >  4,970.25  LinearSrcOveryellow_rose.pngVSkOptsBruteForce
  >
  > BUG=skia:
  > GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1939513002
  > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  >
  > Committed: https://skia.googlesource.com/skia/+/554784cd85029c05d9ed04b1aeb71520d196153a
  >
  > Committed: https://skia.googlesource.com/skia/+/bc927548db17accec2195af6e15053f7918bb3f5

  TBR=mtklein@google.com,fmalita@chromium.org,herb@google.com
  # Skipping CQ checks because original CL landed less than 1 days ago.
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:

  Review-Url: https://codereview.chromium.org/1986763002
* Add tests and benches to support the sRGB blitter for SkOpts  herb  2016-05-16
   1,370.85  LinearSrcOvericonstrip.pngVSkOptsSSE41
   2,359.69  LinearSrcOvericonstrip.pngVSkOptsDefault
   1,828.72  LinearSrcOvericonstrip.pngVSkOptsNonSimdCore
   3,277.40  LinearSrcOvericonstrip.pngVSkOptsTrivial
   9,862.34  LinearSrcOvericonstrip.pngVSkOptsBruteForce

     633.55  LinearSrcOvermandrill_512.pngVSkOptsSSE41
     684.29  LinearSrcOvermandrill_512.pngVSkOptsDefault
   1,201.88  LinearSrcOvermandrill_512.pngVSkOptsNonSimdCore
   2,382.63  LinearSrcOvermandrill_512.pngVSkOptsTrivial
  10,888.74  LinearSrcOvermandrill_512.pngVSkOptsBruteForce

     209.14  LinearSrcOverplane.pngVSkOptsSSE41
     562.24  LinearSrcOverplane.pngVSkOptsDefault
     272.64  LinearSrcOverplane.pngVSkOptsNonSimdCore
     436.46  LinearSrcOverplane.pngVSkOptsTrivial
   1,327.23  LinearSrcOverplane.pngVSkOptsBruteForce

     318.01  LinearSrcOverbaby_tux.pngVSkOptsSSE41
     529.05  LinearSrcOverbaby_tux.pngVSkOptsDefault
     441.33  LinearSrcOverbaby_tux.pngVSkOptsNonSimdCore
     720.50  LinearSrcOverbaby_tux.pngVSkOptsTrivial
   2,191.10  LinearSrcOverbaby_tux.pngVSkOptsBruteForce

     479.68  LinearSrcOveryellow_rose.pngVSkOptsSSE41
   1,095.03  LinearSrcOveryellow_rose.pngVSkOptsDefault
     668.60  LinearSrcOveryellow_rose.pngVSkOptsNonSimdCore
   1,257.19  LinearSrcOveryellow_rose.pngVSkOptsTrivial
   4,970.25  LinearSrcOveryellow_rose.pngVSkOptsBruteForce

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1939513002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

  Committed: https://skia.googlesource.com/skia/+/554784cd85029c05d9ed04b1aeb71520d196153a
  Review-Url: https://codereview.chromium.org/1939513002