path: root/src/core/SkXfermode4f.cpp
Commit message (Author, Age)
* Fix SW sRGB dst + LCD coverage bug. (mtklein, 2016-07-22)

  We're using the linear procs for sRGB destinations and the sRGB procs for linear destinations. Fix that.

  Cf. State32::getLCDProc(), which does flags |= kDstIsSRGB_LCDFlag. kDstIsSRGB_LCDFlag is (1<<2) == 4, so the sRGB procs must be 4-7, not 0-3.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2177493002
  Review-Url: https://codereview.chromium.org/2177493002
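The index arithmetic in this message can be sketched as below. Only kDstIsSRGB_LCDFlag and its value (1<<2) come from the commit message; the two low flag names, the DstKind enum, the table, and lookup() are hypothetical stand-ins for illustration, not Skia's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Only kDstIsSRGB_LCDFlag == (1 << 2) is stated in the commit message.
// The two low flags are hypothetical placeholders.
enum LCDFlags : uint32_t {
    kHypotheticalFlagA = 1 << 0,  // assumption, not from the source
    kHypotheticalFlagB = 1 << 1,  // assumption, not from the source
    kDstIsSRGB_LCDFlag = 1 << 2,  // from the commit message
};

enum class DstKind { kLinear, kSRGB };

// Stand-in for a table of 8 procs: entry i records which kind of
// destination the proc stored at index i is written for.
constexpr DstKind kProcTable[8] = {
    DstKind::kLinear, DstKind::kLinear, DstKind::kLinear, DstKind::kLinear,  // 0-3
    DstKind::kSRGB,   DstKind::kSRGB,   DstKind::kSRGB,   DstKind::kSRGB,    // 4-7
};

// The flags are used directly as the table index, so any combination that
// includes bit 2 lands in the upper half of the table.
inline DstKind lookup(uint32_t flags) { return kProcTable[flags & 7]; }
```

With this layout, every flag combination that has bit 2 set selects an entry in 4-7, which is why the sRGB procs have to live in the upper half of the table.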
* Correct sRGB <-> linear everywhere. (mtklein, 2016-07-20)

  This trims the SkPM4fPriv methods down to just foolproof methods. (Anything trying to build these itself is probably wrong.)

  Things like Sk4f srgb_to_linear(Sk4f) can't really exist anymore, at least not efficiently, so this refactor is somewhat more invasive than you might think. Generally this means things using to_4f() are also making a misstep... that's gone too.

  It also does not make sense to try to play games with linear floats with 255 bias any more. That hack can't work with real sRGB coding.

  Rather than update them, I've removed a couple of L32 xfermode fast paths. I'd even rather drop it entirely...

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2163683002
  CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  Review-Url: https://codereview.chromium.org/2163683002
* Fix color order on LCD text when using sRGB software backend. (mtklein, 2016-07-19)

  BUG=skia:5182
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2166533002
  Review-Url: https://codereview.chromium.org/2166533002
* linear -> sRGB: use fast approximate sqrt() (mtklein, 2016-06-07)

  Since we're already approximating the sRGB gamma curve with a sqrt(), we might as well approximate it with a faster approximate sqrt().

  On Intel, this .rsqrt().invert() version is 2-3x faster than .sqrt() (~3x faster on older machines, ~2x faster on newer machines).

  This should provide ~11 bits of precision, suspiciously exactly enough.

  Running dm --config srgb, there are diffs, but none perceptible.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2046063002
  Review-Url: https://codereview.chromium.org/2046063002
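As a rough scalar illustration of the idea in this message, the sketch below compares the exact sRGB encoding curve with a square-root (gamma 2) approximation, and writes the square root as the reciprocal of a reciprocal square root. All function names here are hypothetical. The scalar code uses exact library operations, so it shows only the shape of the computation, not the speed or precision of the SIMD .rsqrt().invert() path the commit describes.

```cpp
#include <cmath>

// Exact sRGB encoding (IEC 61966-2-1) of a linear value in [0, 1].
inline float linear_to_srgb_exact(float l) {
    return l <= 0.0031308f ? 12.92f * l
                           : 1.055f * std::pow(l, 1.0f / 2.4f) - 0.055f;
}

// Gamma-2 approximation of the same curve: encode with a square root.
inline float linear_to_srgb_sqrt(float l) { return std::sqrt(l); }

// The same approximation expressed as 1 / (1 / sqrt(l)). A SIMD version
// would replace both divisions with approximate hardware instructions;
// here they are exact, so this only mirrors the structure.
inline float linear_to_srgb_rsqrt_invert(float l) {
    if (l <= 0.0f) return 0.0f;  // avoid dividing by zero at l == 0
    float rsqrt = 1.0f / std::sqrt(l);
    return 1.0f / rsqrt;
}
```

The two curves agree at 0 and 1 and differ by a few percent in between, which is the approximation the message says was already being accepted before the faster square root was introduced.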
* Use special case for 0x00 and 0xFF alpha to go faster. (herb, 2016-05-31)

      Base       Exp     Ratio  Name
   3916010   3077544  0.785888  top25desk_amazon_com.skp_1
   3462512   2776580  0.801898  top25desk_google_com_calendar_.skp_1
   3446330   3134187  0.909427  top25desk_ebay_com.skp_1
    590474    546375  0.925316  top25desk_techcrunch_com.skp_1
   3804991   3544162  0.931451  top25desk_google_com__hl_en_q_b.skp_1
    996037    939960  0.9437    top25desk_blogger.skp_1
    973264    922677  0.948023  top25desk_wikipedia__1_tab_.skp_1
   4514050   4295660  0.95162   top25desk_docs___1_open_documen.skp_1
   4255383   4110057  0.965849  top25desk_linkedin.skp_1
   8408717   8191843  0.974208  top25desk_booking_com.skp_1
   2878529   2806501  0.974977  top25desk_plus_google_com_11003.skp_1
  11509894  11254486  0.97781   top25desk_pinterest.skp_1
   9132514   9010635  0.986654  top25desk_weather_com.skp_1
   6504720   6419592  0.986913  top25desk_sports_yahoo_com_.skp_1
   9136774   9033870  0.988737  top25desk_answers_yahoo_com.skp_1
   4836199   4799784  0.99247   top25desk_news_yahoo_com.skp_1
   1393650   1384065  0.993122  top25desk_games_yahoo_com.skp_1
   6779678   6735278  0.993451  top25desk_youtube_com.skp_1
  10926943  10882308  0.995915  top25desk_espn.skp_1
   4259514   4245489  0.996707  top25desk_facebook.skp_1
  10955293  10947657  0.999303  top25desk_google_com_search_q_c.skp_1
   9153575   9207386  1.00588   top25desk_twitter.skp_1
   3865942   3906345  1.01045   top25desk_wordpress.skp_1
   4180009   4305530  1.03003   top25desk_mail_google_com_mail_.skp_1

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2020463003
  Review-Url: https://codereview.chromium.org/2020463003
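The commit title names the technique but not the blend or pixel format it applies to. A minimal sketch, assuming a src-over blend on premultiplied 8-bit pixels packed as four bytes with alpha in the top byte, might look like this. The helper names and the pixel layout are assumptions, not taken from the source.

```cpp
#include <cstdint>

// Hypothetical helper: rounded (c * s) / 255 for 8-bit values.
inline uint32_t mul_div255(uint32_t c, uint32_t s) {
    uint32_t p = c * s + 128;
    return (p + (p >> 8)) >> 8;
}

// Src-over for one premultiplied pixel, alpha assumed in bits 24-31.
// Valid premultiplied input (each channel <= alpha) is assumed.
inline uint32_t srcover_pixel(uint32_t src, uint32_t dst) {
    uint32_t a = src >> 24;
    if (a == 0x00) return dst;  // fully transparent src leaves dst unchanged
    if (a == 0xFF) return src;  // fully opaque src replaces dst
    uint32_t inv = 255 - a;
    uint32_t out = 0;
    // General path: per channel, result = src + dst * (255 - alpha) / 255.
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        out |= (s + mul_div255(d, inv)) << shift;
    }
    return out;
}
```

The two early returns skip all per-channel arithmetic, which pays off on content where most source pixels are either fully transparent or fully opaque.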
* Use Sk4x4f in srcover_srgb_dst_1. (mtklein, 2016-03-22)

  I've also pulled out the common parts shared with sRGB srcover_n, and rearranged to make the similarities a bit more clear.

  This speeds up about 25%.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1829513002
  Review URL: https://codereview.chromium.org/1829513002
* Sk4x4f (mtklein, 2016-03-22)

  An API for loading and storing 4 Sk4f with transpose.

  This has SSSE3+ and portable versions. SSE2 and NEON versions to follow.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1825663002
  Review URL: https://codereview.chromium.org/1825663002
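A portable sketch of what "loading and storing four 4-float vectors with transpose" can mean is shown below. The type and function names are hypothetical and are not the Sk4x4f interface; the input is assumed to be four interleaved RGBA float pixels.

```cpp
#include <array>

// Four float lanes; a stand-in for a 4-wide SIMD vector type.
using Float4 = std::array<float, 4>;

// Hypothetical planar group: one Float4 per channel, covering 4 pixels.
struct Planar4 {
    Float4 r, g, b, a;
};

// Load 4 interleaved RGBA pixels (16 floats) and transpose them so that
// each channel's four values sit together in one vector.
inline Planar4 load_transposed(const float px[16]) {
    Planar4 p;
    for (int i = 0; i < 4; i++) {
        p.r[i] = px[4 * i + 0];
        p.g[i] = px[4 * i + 1];
        p.b[i] = px[4 * i + 2];
        p.a[i] = px[4 * i + 3];
    }
    return p;
}

// Transpose back and store as 4 interleaved RGBA pixels.
inline void store_transposed(const Planar4& p, float px[16]) {
    for (int i = 0; i < 4; i++) {
        px[4 * i + 0] = p.r[i];
        px[4 * i + 1] = p.g[i];
        px[4 * i + 2] = p.b[i];
        px[4 * i + 3] = p.a[i];
    }
}
```

Once the data is in this rrrr gggg bbbb aaaa arrangement, a per-channel operation can be applied to four pixels at once with one vector operation per channel.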
* custom ssse3 srcover_n_srgb_bw, about 1.8x speedup (mtklein, 2016-03-18)

  This is a little demo of the sorts of speedups we can get from working in planar format, or even just a mini-planar of 4 pixels at a time like I'm doing here.

  I chose this blit by running
    $ out/Release/nanobench --config srgb --match skp
  and looking for the hottest sRGB-related method. After this CL, src_1 and src_n become hotter than srcover_n. They can probably get a similar treatment.

  We transpose three times in this function:
    - dst after reading, as part of the zero-extension and conversion to float
    - src after reading, _MM_TRANSPOSE4_PS (which expands to 8 cheap instructions)
    - result before writing, the last _mm_shuffle_epi8

  If we changed our buffer format to a mini-planar format like rrrr gggg bbbb aaaa, we could eliminate the src transpose and get another small speedup, to right around 2x.

  This code leans pretty heavily on SSSE3, so if we want it to speed up Windows+Linux Chrome, it'll eventually want to go behind a function pointer.

  This also appears to fix what looks like overflow in a few GMs, most noticeably in hairmodes. This is something we'd better look into...

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1813263002
  Review URL: https://codereview.chromium.org/1813263002
* make pm4f be RGBA always, not pmcolor order (reed, 2016-03-08)

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1774523002
  Review URL: https://codereview.chromium.org/1774523002
* simplify/unify xferproc api (reed, 2016-02-24)

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1721223002
  Review URL: https://codereview.chromium.org/1721223002
* lots of sRGB and F16 blits (reed, 2016-02-22)

  - generalize F16 xfermode procs
  - spriteblits for F16 and sRGB
  - saveLayer now respects colortype and profiletype

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1686013002
  Review URL: https://codereview.chromium.org/1685203002
* SkNx: kth<...>() -> [...] (mtklein, 2016-02-21)

  Just some syntax cleanup.

  No real change: kth<...>() was calling [...] already.

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1714363002
  CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
  Review URL: https://codereview.chromium.org/1714363002
* check pm swizzle when extracting lcd coverage (reed, 2016-02-19)

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1706283005
  Review URL: https://codereview.chromium.org/1706283005
* lcd blits for sRGB (reed, 2016-02-18)

  Unimplemented for F16 for the moment, but not needed yet.

  Surprisingly, not much slower than the current impl (which is not sRGB-correct).

  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1707883002
  Review URL: https://codereview.chromium.org/1707883002
* optimize src mode (opaque src in srcover) singleton with aa (e.g. a8 text mask) (reed, 2016-02-03)

  Before:
    8/8 MB  1  9.09ms  10.8ms  10.3ms  11.5ms   9%  █▆▆▁▁▁▂▅█▆  nonrendering  xfer4f_srcover_aa_1_opaque_linear
    8/8 MB  1  10.2ms  12.1ms  11.7ms  13.2ms   9%  ▅▇▁▂▁▄█▆▅▆  nonrendering  xfer4f_srcover_aa_1_opaque_srgb

  After:
    8/8 MB  1   1.6ms  1.68ms  1.73ms  2.17ms  10%  ▄▄█▁▃▁▁▂▁▁  nonrendering  xfer4f_srcover_aa_1_opaque_linear
    8/8 MB  1  3.13ms  3.62ms  3.97ms  5.81ms  21%  █▃▁▂▆▂▂▂▃▂  nonrendering  xfer4f_srcover_aa_1_opaque_srgb

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1664713003
  Review URL: https://codereview.chromium.org/1664713003
* extend gm to test aa[] parameter on xfer4f procs (reed, 2016-02-03)

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1663643002
  Review URL: https://codereview.chromium.org/1663643002
* unroll srcover_1 for blending a single color (reed, 2016-02-02)

  Before:
    curr/maxrss  loops  min     median  mean    max     stddev  samples     config        bench
    8/8 MB       1      1.59ms  1.82ms  1.89ms  2.59ms  14%     ▁█▃▃▃▃▃▃▃▃  nonrendering  xfer4f_srcover_1_alpha_linear
    8/8 MB       1      3.25ms  4.25ms  4.16ms  5.87ms  21%     ▁▅▂▁▁▄█▄▅▂  nonrendering  xfer4f_srcover_1_alpha_srgb

  After:
    curr/maxrss  loops  min     median  mean    max     stddev  samples     config        bench
    8/8 MB       1      915µs   915µs   946µs   1.02ms  4%      █▄▇▁▁▁▆▁▁▁  nonrendering  xfer4f_srcover_1_alpha_linear
    8/8 MB       1      2.69ms  3.08ms  3.03ms  3.63ms  10%     ▁▃▂▁▁█▄▄▄▆  nonrendering  xfer4f_srcover_1_alpha_srgb

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1653943002
  Review URL: https://codereview.chromium.org/1653943002
* float components in xfermodes (reed, 2016-01-30)

  BUG=skia:
  GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1623483002
  TBR=mtklein
  Review URL: https://codereview.chromium.org/1634273002