aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/opts/SkBlitMask_opts_arm_neon.h
Commit message (Collapse)AuthorAge
* Sk4px blit mask.Gravatar mtklein2015-08-10
| | | | | | | | | | | | | | | | Local SKP nanobenching ranges SSE between 1.05x and 0.87x, much more heavily weighted toward <1.0x ratios (speedups). I profiled the top five regressions (1.05x-1.01x) and they look like noise. Will follow up after broad bot results. NEON looks similar but less extreme than SSE changes, ranging between 1.02x and 0.95x, again mostly speedups in 0.99x-0.97x range. The old code trifurcated into black, opaque-but-not-black, and general versions as a function of the constant src color. I did not see a significant difference between general and opaque-but-not-black, and I don't think a black version would be faster using SIMD. So we have here just one version of the code, the general version. Somewhat fantastically, I see no pixel diffs on GMs or SKPs. I will be following up with more CLs for the other procs called by SkBlitMask. BUG=skia: Review URL: https://codereview.chromium.org/1278253003
* add/fix copyrightsGravatar reed2015-06-26
| | | | | | | BUG=skia: TBR= Review URL: https://codereview.chromium.org/1212393002
* ARM Skia NEON patches - 16/17 - BlitmaskGravatar commit-bot@chromium.org2013-11-27
Blitmask: NEON optimised version of the D32_A8 functions Here are the microbenchmark results I got for the D32_A8 functions: Cortex-A9: ========== +-------+--------+--------+--------+ | count | Black | Opaque | Color | +-------+--------+--------+--------+ | 1 | -14% | -39,5% | -37,5% | +-------+--------+--------+--------+ | 2 | -3% | -29,9% | -25% | +-------+--------+--------+--------+ | 4 | -11,3% | -22% | -14,5% | +-------+--------+--------+--------+ | 8 | +128% | +66,6% | +105% | +-------+--------+--------+--------+ | 16 | +159% | +102% | +149% | +-------+--------+--------+--------+ | 64 | +189% | +136% | +189% | +-------+--------+--------+--------+ | 256 | +126% | +102% | +149% | +-------+--------+--------+--------+ | 1024 | +67,5% | +81,4% | +123% | +-------+--------+--------+--------+ Cortex-A15: =========== +-------+--------+--------+--------+ | count | Black | Opaque | Color | +-------+--------+--------+--------+ | 1 | -24% | -46,5% | -37,5% | +-------+--------+--------+--------+ | 2 | -18,5% | -35,5% | -28% | +-------+--------+--------+--------+ | 4 | -5,2% | -17,5% | -15,5% | +-------+--------+--------+--------+ | 8 | +72% | +65,8% | +84,7% | +-------+--------+--------+--------+ | 16 | +168% | +117% | +149% | +-------+--------+--------+--------+ | 64 | +165% | +110% | +145% | +-------+--------+--------+--------+ | 256 | +106% | +99,6% | +141% | +-------+--------+--------+--------+ | 1024 | +93,7% | +94,7% | +130% | +-------+--------+--------+--------+ Blitmask: add NEON optimised PlatformBlitRowProcs16 Here are the microbenchmark results (speedup vs. C code): +-------+-----------------+-----------------+ | | Cortex-A9 | Cortex-A15 | | count +--------+--------+--------+--------+ | | Blend | Opaque | Blend | Opaque | +-------+--------+--------+--------+--------+ | 1 | -19,2% | -36,7% | -33,6% | -44,7% | +-------+--------+--------+--------+--------+ | 2 | -12,6% | -27,8% | -39% | -48% | +-------+--------+--------+--------+--------+ | 4 | -11,5% | -21,6% | -37,7% | -44,3% | +-------+--------+--------+--------+--------+ | 8 | +141% | +59,7% | +123% | +48,7% | +-------+--------+--------+--------+--------+ | 16 | +213% | +119% | +214% | +121% | +-------+--------+--------+--------+--------+ | 64 | +212% | +105% | +242% | +167% | +-------+--------+--------+--------+--------+ | 256 | +289% | +167% | +249% | +207% | +-------+--------+--------+--------+--------+ | 1024 | +273% | +169% | +146% | +220% | +-------+--------+--------+--------+--------+ Signed-off-by: Kévin PETIT <kevin.petit@arm.com> BUG= R=djsollen@google.com, mtklein@google.com, reed@google.com Author: kevin.petit.arm@gmail.com Review URL: https://codereview.chromium.org/23719002 git-svn-id: http://skia.googlecode.com/svn/trunk@12420 2bbb7eff-a529-9590-31e7-b0007b416f81