path: root/src/opts
Commit message | Author | Age
* Revert the revert of (ARM Skia NEON patches - 34 - Blur Filter - https://codereview.chromium.org/109403004/) | robertphillips@google.com | 2013-12-09
    git-svn-id: http://skia.googlecode.com/svn/trunk@12581 2bbb7eff-a529-9590-31e7-b0007b416f81
* Reverting r12568 (ARM Skia NEON patches - 34 - Blur Filter - https://codereview.chromium.org/109403004) due to image quality regressions on the N4. | robertphillips@google.com | 2013-12-09
    git-svn-id: http://skia.googlecode.com/svn/trunk@12578 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 34 - Blur Filter | commit-bot@chromium.org | 2013-12-09
    Improve a little on Blur
    Grouping operations gives a 5-15% speed improvement on a Cortex-A15 based Chromebook.
    before:
    running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30887.69
    running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 30751.35
    running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30757.92
    running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 30673.88
    running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19602.17
    running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 20613.81
    running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17855.46
    running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17957.79
    after:
    running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 27015.75
    running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 27148.02
    running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 27241.60
    running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 27077.44
    running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 18458.10
    running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19643.42
    running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 16176.73
    running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 16450.50
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=senorblanco@chromium.org, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/109403004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12568 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 32 - Xfermode: 1-pixel NEON modeprocs | commit-bot@chromium.org | 2013-12-06
    In some cases, it's easy to provide a NEON version of the 1-pixel modeprocs. Combined with https://codereview.chromium.org/23724013/ (merged) it allows up to 35% speed improvement on Xfermodes when aa is non-NULL.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, reed@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/104883004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12525 2bbb7eff-a529-9590-31e7-b0007b416f81
* Nit to self: NULL is not false. | senorblanco@chromium.org | 2013-12-04
    R=mtklein@google.com, mtklein
    BUG=
    Review URL: https://codereview.chromium.org/105423002
    git-svn-id: http://skia.googlecode.com/svn/trunk@12493 2bbb7eff-a529-9590-31e7-b0007b416f81
* Do proper NEON checking for SkBoxBlur procs. | senorblanco@chromium.org | 2013-12-04
    TBR=mtklein
    BUG=
    Review URL: https://codereview.chromium.org/98373003
    git-svn-id: http://skia.googlecode.com/svn/trunk@12490 2bbb7eff-a529-9590-31e7-b0007b416f81
* Implement a NEON version of the RGBA gaussian blur. This shows a 9-15% speedup on Nexus-10. | senorblanco@chromium.org | 2013-12-04
    R=mtklein@google.com, mtklein
    before:
    running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 33063.23
    running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 32800.25
    running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 33017.88
    running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 32743.35
    running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 21024.04
    running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 22904.15
    running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 18738.08
    running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 18798.98
    after:
    running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30180.96
    running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 29861.90
    running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30178.98
    running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 29911.25
    running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19344.35
    running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19957.07
    running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17158.84
    running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17330.73
    Review URL: https://codereview.chromium.org/99933004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12486 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 29 - Xfermode: SkFourByteInterp | commit-bot@chromium.org | 2013-12-02
    Xfermode: add a NEON version of SkFourByteInterp
    Brings a modest performance improvement on its own in ProcXfermodes when aa is neither zero nor FF. Combined with 1-pixel NEON modeprocs, it brings up to 35% speed improvement on the aa case.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/23724013
    git-svn-id: http://skia.googlecode.com/svn/trunk@12448 2bbb7eff-a529-9590-31e7-b0007b416f81
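To make the aa case in the entry above concrete, here is a minimal scalar sketch of a four-byte interpolation between two packed 32-bit pixels. It is not the NEON code from this patch and may not match Skia's exact routine: the names `blend_channel` and `four_byte_interp` are hypothetical, and the rounding is an assumption chosen so that aa = 0xFF returns the source unchanged.

```cpp
#include <cstdint>

// Hypothetical helper: blend one 8-bit channel.
// scale256 is the source weight in the range [1, 256].
static inline uint32_t blend_channel(uint32_t src, uint32_t dst, uint32_t scale256) {
    return (src * scale256 + dst * (256 - scale256)) >> 8;
}

// Interpolate all four bytes of two packed 32-bit pixels.
// aa is an 8-bit coverage value; 255 yields src exactly.
uint32_t four_byte_interp(uint32_t src, uint32_t dst, uint8_t aa) {
    const uint32_t scale256 = static_cast<uint32_t>(aa) + 1;  // map 0..255 to 1..256
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t s = (src >> shift) & 0xFF;
        const uint32_t d = (dst >> shift) & 0xFF;
        out |= blend_channel(s, d, scale256) << shift;
    }
    return out;
}
```

A vector version would apply the same per-channel arithmetic to several channels in one instruction, which is presumably where the gain for intermediate aa values comes from.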
* Reverting r12427 | rmistry@google.com | 2013-12-02
    git-svn-id: http://skia.googlecode.com/svn/trunk@12428 2bbb7eff-a529-9590-31e7-b0007b416f81
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-12-02
    git-svn-id: http://skia.googlecode.com/svn/trunk@12427 2bbb7eff-a529-9590-31e7-b0007b416f81
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-11-28
    git-svn-id: http://skia.googlecode.com/svn/trunk@12425 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 16/17 - Blitmask | commit-bot@chromium.org | 2013-11-27
    Blitmask: NEON optimised version of the D32_A8 functions
    Here are the microbenchmark results I got for the D32_A8 functions:

    Cortex-A9:
    ==========
    +-------+--------+--------+--------+
    | count | Black  | Opaque | Color  |
    +-------+--------+--------+--------+
    | 1     | -14%   | -39,5% | -37,5% |
    | 2     | -3%    | -29,9% | -25%   |
    | 4     | -11,3% | -22%   | -14,5% |
    | 8     | +128%  | +66,6% | +105%  |
    | 16    | +159%  | +102%  | +149%  |
    | 64    | +189%  | +136%  | +189%  |
    | 256   | +126%  | +102%  | +149%  |
    | 1024  | +67,5% | +81,4% | +123%  |
    +-------+--------+--------+--------+

    Cortex-A15:
    ===========
    +-------+--------+--------+--------+
    | count | Black  | Opaque | Color  |
    +-------+--------+--------+--------+
    | 1     | -24%   | -46,5% | -37,5% |
    | 2     | -18,5% | -35,5% | -28%   |
    | 4     | -5,2%  | -17,5% | -15,5% |
    | 8     | +72%   | +65,8% | +84,7% |
    | 16    | +168%  | +117%  | +149%  |
    | 64    | +165%  | +110%  | +145%  |
    | 256   | +106%  | +99,6% | +141%  |
    | 1024  | +93,7% | +94,7% | +130%  |
    +-------+--------+--------+--------+

    Blitmask: add NEON optimised PlatformBlitRowProcs16
    Here are the microbenchmark results (speedup vs. C code):

    +-------+-----------------+-----------------+
    |       |    Cortex-A9    |   Cortex-A15    |
    | count +--------+--------+--------+--------+
    |       | Blend  | Opaque | Blend  | Opaque |
    +-------+--------+--------+--------+--------+
    | 1     | -19,2% | -36,7% | -33,6% | -44,7% |
    | 2     | -12,6% | -27,8% | -39%   | -48%   |
    | 4     | -11,5% | -21,6% | -37,7% | -44,3% |
    | 8     | +141%  | +59,7% | +123%  | +48,7% |
    | 16    | +213%  | +119%  | +214%  | +121%  |
    | 64    | +212%  | +105%  | +242%  | +167%  |
    | 256   | +289%  | +167%  | +249%  | +207%  |
    | 1024  | +273%  | +169%  | +146%  | +220%  |
    +-------+--------+--------+--------+--------+

    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/23719002
    git-svn-id: http://skia.googlecode.com/svn/trunk@12420 2bbb7eff-a529-9590-31e7-b0007b416f81
* Implement a speedup for Y-only blurs by transposing. | senorblanco@chromium.org | 2013-11-11
    R=mtklein@google.com, mtklein, reed@google.com
    BUG=
    Review URL: https://codereview.chromium.org/66413007
    git-svn-id: http://skia.googlecode.com/svn/trunk@12227 2bbb7eff-a529-9590-31e7-b0007b416f81
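The idea of blurring in Y by transposing can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the `Image` type, the function names, the single-channel pixels and the zero-padded box filter are all hypothetical. The point is only that a vertical blur can reuse a row-oriented kernel when the image is transposed before and after.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical single-channel image stored row-major.
struct Image {
    size_t width;
    size_t height;
    std::vector<int> px;
};

// Swap rows and columns.
Image transpose(const Image& in) {
    Image out{in.height, in.width, std::vector<int>(in.px.size())};
    for (size_t y = 0; y < in.height; ++y)
        for (size_t x = 0; x < in.width; ++x)
            out.px[x * out.width + y] = in.px[y * in.width + x];
    return out;
}

// Box blur along X; pixels outside the row count as zero.
Image box_blur_x(const Image& in, int radius) {
    Image out{in.width, in.height, std::vector<int>(in.px.size())};
    const int window = 2 * radius + 1;
    for (size_t y = 0; y < in.height; ++y) {
        for (size_t x = 0; x < in.width; ++x) {
            int sum = 0;
            for (int k = -radius; k <= radius; ++k) {
                const long i = static_cast<long>(x) + k;
                if (i >= 0 && i < static_cast<long>(in.width))
                    sum += in.px[y * in.width + static_cast<size_t>(i)];
            }
            out.px[y * in.width + x] = sum / window;
        }
    }
    return out;
}

// Y-only blur expressed through the X kernel.
Image box_blur_y(const Image& in, int radius) {
    return transpose(box_blur_x(transpose(in), radius));
}
```

The likely benefit is that the inner loop walks contiguous memory and an already optimised X kernel is reused; the commit message itself does not state the reason.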
* Implement a NEON version of morphology. This is good for ~2.2X speedup on Tegra3. | senorblanco@chromium.org | 2013-11-11
    R=mtklein@google.com, mtklein, reed@google.com
    Review URL: https://codereview.chromium.org/68123003
    git-svn-id: http://skia.googlecode.com/svn/trunk@12219 2bbb7eff-a529-9590-31e7-b0007b416f81
* SSE2 implementation of RGBA box blurs. This yields ~2X perf improvement on Xeon ES-2690. | senorblanco@chromium.org | 2013-11-08
    R=mtklein@google.com
    Review URL: https://codereview.chromium.org/61643011
    git-svn-id: http://skia.googlecode.com/svn/trunk@12204 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 31 - Xfermode: xfer16 | commit-bot@chromium.org | 2013-11-08
    Xfermode: xfer16
    This adds support for 16bit Xfermodes. It also tunes the gcc test macros in xfer32() to add compatibility for gcc > 4.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/33063002
    git-svn-id: http://skia.googlecode.com/svn/trunk@12192 2bbb7eff-a529-9590-31e7-b0007b416f81
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-11-08
    git-svn-id: http://skia.googlecode.com/svn/trunk@12186 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 33 - Convolution filter | commit-bot@chromium.org | 2013-11-07
    NEON version of the convolutionProcs
    The bitmap_scale benchmark is now twice as fast on ARM.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    Committed: http://code.google.com/p/skia/source/detail?r=12154
    R=djsollen@google.com, mtklein@google.com, humper@google.com, epoger@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/27533004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12166 2bbb7eff-a529-9590-31e7-b0007b416f81
* Revert r12154 | epoger@google.com | 2013-11-06
    BUG=skia:1807
    git-svn-id: http://skia.googlecode.com/svn/trunk@12156 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 33 - Convolution filter | commit-bot@chromium.org | 2013-11-06
    NEON version of the convolutionProcs
    The bitmap_scale benchmark is now twice as fast on ARM.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, humper@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/27533004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12154 2bbb7eff-a529-9590-31e7-b0007b416f81
* Mac build fix. | senorblanco@chromium.org | 2013-10-30
    TBR=robertphillips
    Review URL: https://codereview.chromium.org/45963007
    git-svn-id: http://skia.googlecode.com/svn/trunk@12039 2bbb7eff-a529-9590-31e7-b0007b416f81
* Implement SSE2-based implementations of the morphology filters (dilate & erode). | senorblanco@chromium.org | 2013-10-30
    This gives a 3-5X speedup over the naive implementation, and also mitigates a timing-based security attack in Chrome (https://code.google.com/p/chromium/issues/detail?id=251711).
    NOTE: this will require a corresponding GYP change on the Skia roll into Chrome: https://codereview.chromium.org/52453004/
    R=mtklein@google.com, reed@google.com
    Review URL: https://codereview.chromium.org/52603004
    git-svn-id: http://skia.googlecode.com/svn/trunk@12038 2bbb7eff-a529-9590-31e7-b0007b416f81
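For readers unfamiliar with the morphology filters named above, here is a minimal scalar sketch of a one-dimensional dilate over packed pixels with four 8-bit channels. It corresponds to the kind of naive loop such SIMD code replaces, not to the SSE2 implementation itself; the function name `dilate_row` and the clamped window are assumptions. Erode is the same loop with `std::min` and an initial value of 255.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dilate one row: each output channel is the maximum of that channel
// over the window [x - radius, x + radius], clamped to the row bounds.
std::vector<uint32_t> dilate_row(const std::vector<uint32_t>& src, int radius) {
    const long n = static_cast<long>(src.size());
    std::vector<uint32_t> dst(src.size());
    for (long x = 0; x < n; ++x) {
        const long lo = std::max(0L, x - radius);
        const long hi = std::min(n - 1, x + radius);
        uint32_t best[4] = {0, 0, 0, 0};
        for (long i = lo; i <= hi; ++i) {
            for (int c = 0; c < 4; ++c) {
                const uint32_t v =
                    static_cast<uint32_t>((src[static_cast<size_t>(i)] >> (8 * c)) & 0xFFu);
                best[c] = std::max(best[c], v);
            }
        }
        dst[static_cast<size_t>(x)] =
            best[0] | (best[1] << 8) | (best[2] << 16) | (best[3] << 24);
    }
    return dst;
}
```

A SIMD version can take the per-byte maximum of whole pixels in one instruction instead of looping over channels.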
* Get rid of two unused variables from the .data section. | commit-bot@chromium.org | 2013-10-29
    Before:
    $ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
      1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
    0000000000000000 l d .data 0000000000000000 .data
      2 .data 00000008 0000000000000000 0000000000000000 00001774 2**2
    0000000000000000 l d .data 0000000000000000 .data
    0000000000000000 g O .data 0000000000000004 debug_x
    0000000000000004 g O .data 0000000000000004 debug_y
      1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
    0000000000000000 l d .data 0000000000000000 .data
    After:
    $ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
      1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
    0000000000000000 l d .data 0000000000000000 .data
      2 .data 00000000 0000000000000000 0000000000000000 00001774 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
    0000000000000000 l d .data 0000000000000000 .data
      1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
    0000000000000000 l d .data 0000000000000000 .data
    Not sure why clang didn't catch them.
    R=mtklein@google.com
    BUG=
    Author: tfarina@chromium.org
    Review URL: https://codereview.chromium.org/50013002
    git-svn-id: http://skia.googlecode.com/svn/trunk@11999 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs | commit-bot@chromium.org | 2013-10-17
    Xfermode: NEON implementation of SIMD procs
    This patch contains a NEON implementation for a number of Xfermodes. It provides a big speedup on Xfermode benchmarks (currently up to 3x with gcc4.7 but up to 10x when gcc produces optimal code for it).
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    Committed: http://code.google.com/p/skia/source/detail?r=11777
    Committed: http://code.google.com/p/skia/source/detail?r=11813
    R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/26627004
    git-svn-id: http://skia.googlecode.com/svn/trunk@11843 2bbb7eff-a529-9590-31e7-b0007b416f81
* Reverting r11813 (ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs - https://codereview.chromium.org/26627004) due to Chromium compilation failures. | robertphillips@google.com | 2013-10-17
    git-svn-id: http://skia.googlecode.com/svn/trunk@11833 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs | commit-bot@chromium.org | 2013-10-16
    Xfermode: NEON implementation of SIMD procs
    This patch contains a NEON implementation for a number of Xfermodes. It provides a big speedup on Xfermode benchmarks (currently up to 3x with gcc4.7 but up to 10x when gcc produces optimal code for it).
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    Committed: http://code.google.com/p/skia/source/detail?r=11777
    R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/26627004
    git-svn-id: http://skia.googlecode.com/svn/trunk@11813 2bbb7eff-a529-9590-31e7-b0007b416f81
* Reverting r11777 (ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs) due to Chromium compilation failure | robertphillips@google.com | 2013-10-16
    git-svn-id: http://skia.googlecode.com/svn/trunk@11799 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 30 - Xfermode: NEON modeprocs | commit-bot@chromium.org | 2013-10-15
    Xfermode: NEON implementation of SIMD procs
    This patch contains a NEON implementation for a number of Xfermodes. It provides a big speedup on Xfermode benchmarks (currently up to 3x with gcc4.7 but up to 10x when gcc produces optimal code for it).
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/26627004
    git-svn-id: http://skia.googlecode.com/svn/trunk@11777 2bbb7eff-a529-9590-31e7-b0007b416f81
* Revert "Revert "change SkColorTable to be immutable"" | reed@google.com | 2013-10-10
    This reverts commit b8162cb840f4cb6002ef68d5ac775c6a122c52a9.
    The fix was to the call-sites in benches that used the (now gone) setIsOpaque api.
    R=scroggo@google.com
    Review URL: https://codereview.chromium.org/26572006
    git-svn-id: http://skia.googlecode.com/svn/trunk@11695 2bbb7eff-a529-9590-31e7-b0007b416f81
* Revert "change SkColorTable to be immutable" | reed@google.com | 2013-10-09
    This reverts commit 1c0ff422868b3badf5ffe0790a5d051d1896e2f7.
    BUG=
    Review URL: https://codereview.chromium.org/26709002
    git-svn-id: http://skia.googlecode.com/svn/trunk@11677 2bbb7eff-a529-9590-31e7-b0007b416f81
* change SkColorTable to be immutable | reed@google.com | 2013-10-09
    BUG=
    R=scroggo@google.com
    Review URL: https://codereview.chromium.org/25353002
    git-svn-id: http://skia.googlecode.com/svn/trunk@11676 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 28 - Xfermode: SIMD modeprocs | commit-bot@chromium.org | 2013-10-09
    Xfermode: allow for SIMD modeprocs
    This patch introduces the ability to have SIMD Xfermode modeprocs. In the NEON implementation, SIMD modeprocs will process 8 pixels at a time.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    Committed: http://code.google.com/p/skia/source/detail?r=11654
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/23644006
    git-svn-id: http://skia.googlecode.com/svn/trunk@11669 2bbb7eff-a529-9590-31e7-b0007b416f81
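The "8 pixels at a time" structure described above can be sketched as a row loop that hands full groups of eight to a wide proc and leaves the remainder to a 1-pixel proc. Everything below is hypothetical illustration: the names `plus_1`, `plus_8` and `xfer_row` are not Skia's, the saturating-add mode is just a convenient example, and `plus_8` is a plain loop standing in for a real NEON routine.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical 1-pixel mode proc: per-channel saturating add.
static uint32_t plus_1(uint32_t src, uint32_t dst) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t sum = ((src >> shift) & 0xFF) + ((dst >> shift) & 0xFF);
        if (sum > 255) sum = 255;  // clamp to 8 bits
        out |= sum << shift;
    }
    return out;
}

// Hypothetical 8-pixel proc. A NEON version would do this work in
// vector registers; here it is a scalar stand-in with the same contract.
static void plus_8(const uint32_t src[8], uint32_t dst[8]) {
    for (int i = 0; i < 8; ++i) dst[i] = plus_1(src[i], dst[i]);
}

// Process a row: full groups of 8 go to the wide proc,
// the remaining 0..7 pixels go to the 1-pixel proc.
void xfer_row(uint32_t* dst, const uint32_t* src, size_t count) {
    size_t i = 0;
    for (; i + 8 <= count; i += 8) plus_8(src + i, dst + i);
    for (; i < count; ++i) dst[i] = plus_1(src[i], dst[i]);
}
```

With this split, the wide proc never has to handle partial groups, which keeps the vector code free of tail handling.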
* Revert "ARM Skia NEON patches - 28 - Xfermode: SIMD modeprocs" | djsollen@google.com | 2013-10-08
    This reverts http://code.google.com/p/skia/source/detail?r=11654
    Review URL: https://codereview.chromium.org/26340010
    git-svn-id: http://skia.googlecode.com/svn/trunk@11655 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 28 - Xfermode: SIMD modeprocs | commit-bot@chromium.org | 2013-10-08
    Xfermode: allow for SIMD modeprocs
    This patch introduces the ability to have SIMD Xfermode modeprocs. In the NEON implementation, SIMD modeprocs will process 8 pixels at a time.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com, reed@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://codereview.chromium.org/23644006
    git-svn-id: http://skia.googlecode.com/svn/trunk@11654 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 24 - S32_D565_Blend_Dither slight speedup/bugfix | commit-bot@chromium.org | 2013-09-26
    BlitRow565: S32_D565_Blend_Dither, slight speedup + bugfix
    This patch adds a rewrite of S32_D565_Blend_Dither in intrinsics. The newer version is faster (10-20% depending on the value of count) and also supports ARGB as well as ABGR. It also adds the missing assert at the beginning of the function.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/22566002
    git-svn-id: http://skia.googlecode.com/svn/trunk@11473 2bbb7eff-a529-9590-31e7-b0007b416f81
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-09-21
    git-svn-id: http://skia.googlecode.com/svn/trunk@11426 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 21 - new NEON S32_D565_Opaque | commit-bot@chromium.org | 2013-09-20
    BlitRow565: NEON version of S32_D565_Opaque
    Here's a new implementation of S32_D565_Opaque in NEON. It dramatically improves the speed compared to S32A_D565_Opaque.
    Here are the benchmark results (speedup vs. existing NEON):

    +-------+-----------+------------+
    | count | Cortex-A9 | Cortex-A15 |
    +-------+-----------+------------+
    | 1     | +130%     | +139%      |
    | 2     | +65,2%    | +51%       |
    | 4     | -25,5%    | +10,2%     |
    | 8     | +63,8%    | +32,1%     |
    | 16    | +110%     | +49,2%     |
    | 64    | +153%     | +123,5%    |
    | 256   | +151%     | +144,7%    |
    | 1024  | +272%     | +157,2%    |
    +-------+-----------+------------+

    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/22351006
    git-svn-id: http://skia.googlecode.com/svn/trunk@11415 2bbb7eff-a529-9590-31e7-b0007b416f81
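As background for the entry above, an opaque 32-bit to 565 row blit reduces to packing each source pixel into 5 red, 6 green and 5 blue bits. The scalar sketch below shows only that packing. It assumes a 0xAARRGGBB source layout, which is an assumption (the patch series notes that both ARGB and ABGR orders occur), and the names `pack_565` and `s32_to_d565_opaque` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Pack one 0xAARRGGBB pixel (assumed layout) into RGB565.
// Alpha is ignored because the source is treated as opaque.
static inline uint16_t pack_565(uint32_t c) {
    const uint32_t r = (c >> 16) & 0xFF;
    const uint32_t g = (c >> 8) & 0xFF;
    const uint32_t b = c & 0xFF;
    return static_cast<uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Convert a row of 32-bit pixels to 16-bit 565 pixels.
void s32_to_d565_opaque(uint16_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = pack_565(src[i]);
}
```

Because each output pixel depends only on one input pixel, the loop is a natural fit for processing several pixels per vector instruction.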
* ARM Skia NEON patches - 23 - S32_D565_Opaque_Dither cleanup/bugfix/speed | commit-bot@chromium.org | 2013-09-18
    BlitRow565: S32_D565_Opaque_Dither: cleaning / bugfix
    This patch brings a little code cleaning (spaces/comments) and a little speed improvement (by using post-incrementation in the asm) but more importantly it fixes a bug on Linux. The new code now supports ARGB as well as ABGR.
    I removed the comment as I have confirmed with benchmarks that this code brings a *massive* (3x-7x) speedup compared to the C code.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/22269003
    git-svn-id: http://skia.googlecode.com/svn/trunk@11339 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 19 - Intrinsics version of the Filter32 routines | commit-bot@chromium.org | 2013-09-13
    BitmapProcState: translate the filtering routines to intrinsics
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com, mtklein@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/21915004
    git-svn-id: http://skia.googlecode.com/svn/trunk@11246 2bbb7eff-a529-9590-31e7-b0007b416f81
* add SkConvolutionProcs* to the none platformConvolutionProcs() signature | reed@google.com | 2013-09-05
    git-svn-id: http://skia.googlecode.com/svn/trunk@11120 2bbb7eff-a529-9590-31e7-b0007b416f81
* remove fConvolutionProcs from State, and just use it locally | reed@google.com | 2013-09-05
    BUG=
    R=humper@google.com
    Review URL: https://codereview.chromium.org/23796005
    git-svn-id: http://skia.googlecode.com/svn/trunk@11118 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 18 - Preparation work for BitmapProcState | commit-bot@chromium.org | 2013-09-05
    BitmapProcState: clean a little and get rid of some asm
    replacing the apparently stupid dx+dx+dx leads to more instructions being generated.
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BitmapProcState: move code common to C and NEON to a separate header
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/21931002
    git-svn-id: http://skia.googlecode.com/svn/trunk@11109 2bbb7eff-a529-9590-31e7-b0007b416f81
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-08-29
    git-svn-id: http://skia.googlecode.com/svn/trunk@10992 2bbb7eff-a529-9590-31e7-b0007b416f81
* ARM Skia NEON patches - 15 - Preparation work for Blitmask optims | commit-bot@chromium.org | 2013-08-28
    Blitmask: copy empty factory functions to a new file
    Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
    BUG=
    R=djsollen@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/21120007
    git-svn-id: http://skia.googlecode.com/svn/trunk@10980 2bbb7eff-a529-9590-31e7-b0007b416f81
* Implement highQualityFilter16 so GM doesn't crash when you give it resources. | mtklein@google.com | 2013-08-26
    Testing consisted of:
    1) ninja -C out/Debug gm && gm -i resources --match mandrill_512 -w /tmp/gm
    2) notice that gm didn't segfault
    3) look in /tmp/gm and see a bunch of handsome monkeys
    BUG=skia:1517
    R=humper@google.com
    Review URL: https://codereview.chromium.org/22801016
    git-svn-id: http://skia.googlecode.com/svn/trunk@10917 2bbb7eff-a529-9590-31e7-b0007b416f81
* Cleanup the ARM blitrow optimizations | djsollen@google.com | 2013-08-09
    R=mtklein@google.com
    Review URL: https://codereview.chromium.org/22229002
    git-svn-id: http://skia.googlecode.com/svn/trunk@10652 2bbb7eff-a529-9590-31e7-b0007b416f81
* Minor sk_memset{16|32}_SSE2 optimization. | commit-bot@chromium.org | 2013-08-05
    Using explicitly indexed references allows some compilers to generate more efficient loops. For gcc 4.6.3:

    613c18: 83 ea 10          sub    $0x10,%edx
    613c1b: 66 0f 7f 07       movdqa %xmm0,(%rdi)
    613c1f: 66 0f 7f 47 10    movdqa %xmm0,0x10(%rdi)
    613c24: 66 0f 7f 47 20    movdqa %xmm0,0x20(%rdi)
    613c29: 66 0f 7f 47 30    movdqa %xmm0,0x30(%rdi)
    613c2e: 48 83 c7 40       add    $0x40,%rdi
    613c32: 83 fa 0f          cmp    $0xf,%edx
    613c35: 7f e1             jg     613c18 <_Z16sk_memset32_SSE2Pjji+0x38>

    vs. previous:

    613c18: 83 ea 10          sub    $0x10,%edx
    613c1b: 66 0f 7f 07       movdqa %xmm0,(%rdi)
    613c1f: 66 0f 7f 47 10    movdqa %xmm0,0x10(%rdi)
    613c24: 66 0f 7f 47 20    movdqa %xmm0,0x20(%rdi)
    613c29: 48 83 c7 40       add    $0x40,%rdi
    613c2d: 83 fa 0f          cmp    $0xf,%edx
    613c30: 66 0f 7f 47 f0    movdqa %xmm0,-0x10(%rdi)
    613c35: 7f e1             jg     613c18 <_Z16sk_memset32_SSE2Pjji+0x38>

    This yields a 0.2% - 1% improvement with the memset micro benchmarks, presumably due to avoiding a stall on the next store after the %rdi increment.
    R=reed@google.com, senorblanco@chromium.org
    Author: fmalita@chromium.org
    Review URL: https://chromiumcodereview.appspot.com/21703003
    git-svn-id: http://skia.googlecode.com/svn/trunk@10545 2bbb7eff-a529-9590-31e7-b0007b416f81
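The "explicitly indexed references" pattern can be illustrated with a small sketch: the unrolled loop stores through `d + 0` to `d + 3` and advances the pointer once per iteration, rather than incrementing it after every store. This is not the Skia source; `memset32_sketch` is a hypothetical name, and the scalar fallback for builds without SSE2 is an addition so the example compiles everywhere.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#endif

// Fill count 32-bit words at dst with value.
void memset32_sketch(uint32_t* dst, uint32_t value, size_t count) {
#ifdef SKETCH_HAS_SSE2
    // Scalar prologue until dst is 16-byte aligned (required by aligned stores).
    while (count > 0 && (reinterpret_cast<uintptr_t>(dst) & 15) != 0) {
        *dst++ = value;
        --count;
    }
    __m128i* d = reinterpret_cast<__m128i*>(dst);
    const __m128i v = _mm_set1_epi32(static_cast<int>(value));
    // 16 words per iteration: four indexed stores, then one pointer bump.
    while (count >= 16) {
        _mm_store_si128(d + 0, v);
        _mm_store_si128(d + 1, v);
        _mm_store_si128(d + 2, v);
        _mm_store_si128(d + 3, v);
        d += 4;
        count -= 16;
    }
    dst = reinterpret_cast<uint32_t*>(d);
#endif
    // Scalar tail (and the whole job when SSE2 is unavailable).
    while (count > 0) {
        *dst++ = value;
        --count;
    }
}
```

Whether a given compiler emits the tighter loop shown in the listing above depends on its version and flags, so any gain should be confirmed with a benchmark.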
* ARM Skia NEON patches - 14 - S32A_Blend | commit-bot@chromium.org | 2013-08-01
    Blitrow32: S32A_Blend new NEON version
    Adding a NEON version of S32A_Blend_BlitRow32. Here are the benchmark results:

    +-------+--------------------------+--------------------------+
    |       |      Speedup vs. C       |   Speedup vs. ARM asm    |
    | count +------------+-------------+------------+-------------+
    |       | Cortex A-9 | Cortex A-15 | Cortex A-9 | Cortex A-15 |
    +-------+------------+-------------+------------+-------------+
    | 1     | +8,5%      | +18,5%      | +0.9%      | +2,9%       |
    | 2     | +65,6%     | +94%        | +70,3%     | +80%        |
    | 4     | +42,4%     | +87,8%      | +56,8%     | +84,4%      |
    | 8     | +30%       | +90%        | +49,9%     | +82,7%      |
    | 16    | +23,1%     | +95,4%      | +46,6%     | +87,6%      |
    | 64    | +23,1%     | +95,7%      | +46,1%     | +89,4%      |
    | 256   | +35,5%     | +122%       | +53,6%     | +99,2%      |
    | 1024  | +61,8%     | +101%       | +64,2%     | +91,2%      |
    +-------+------------+-------------+------------+-------------+

    BUG=
    R=djsollen@google.com
    Author: kevin.petit.arm@gmail.com
    Review URL: https://chromiumcodereview.appspot.com/18614010
    git-svn-id: http://skia.googlecode.com/svn/trunk@10480 2bbb7eff-a529-9590-31e7-b0007b416f81
* Enable runtime checks for SSSE3 on x86 on Android. | mtklein@google.com | 2013-07-31
    $ compare-android.sh bench --match bitmap_ --repeat 30
    master -> ssse3
    N=30 p=0.001000 (corrected to 0.000033)
    sig? speedup bench
    n   -1.16%  bitmap_scale_filter_256_64
    y   -0.72%  bitmap_8888_A_scale_bicubic
    y   -0.21%  bitmap_index8_A
    n   -0.00%  bitmap_565
    n   -0.00%  bitmap_scale_filter_90_80
    n    0.03%  bitmap_8888_A_source_transparent
    y    0.06%  bitmap_index8
    y    0.30%  bitmap_8888_A_source_stripes_two
    n    0.34%  bitmap_scale_filter_80_90
    y    0.42%  bitmap_8888_A
    y    0.44%  bitmap_8888_A_source_opaque
    n    0.53%  bitmap_scale_filter_90_10
    y    0.71%  bitmap_8888_A_source_stripes_three
    y    0.91%  bitmap_8888_A_scale_rotate_bicubic
    y    1.04%  bitmap_8888_update
    n    1.19%  bitmap_scale_filter_10_90
    n    1.39%  bitmap_scale_filter_90_90
    y    1.77%  bitmap_8888_update_volatile
    y    1.89%  bitmap_8888
    y    2.37%  bitmap_scale_filter_30_90
    y    9.57%  bitmap_scale_filter_64_256
    n   17.86%  bitmap_scale_filter_90_30
    y   25.40%  bitmap_8888_A_scale_rotate_bilerp
    y   27.19%  bitmap_8888_scale_rotate_bilerp
    y   27.23%  bitmap_8888_update_scale_rotate_bilerp
    y   27.29%  bitmap_8888_update_volatile_scale_rotate_bilerp
    y   55.08%  bitmap_8888_A_scale_bilerp
    y   58.75%  bitmap_8888_update_volatile_scale_bilerp
    y   58.90%  bitmap_8888_scale_bilerp
    y   58.92%  bitmap_8888_update_scale_bilerp
    Overall speedup: 10.52%
    BUG=skia:1111
    R=djsollen@google.com
    Review URL: https://codereview.chromium.org/21203005
    git-svn-id: http://skia.googlecode.com/svn/trunk@10474 2bbb7eff-a529-9590-31e7-b0007b416f81
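A runtime check of this kind usually amounts to picking a function pointer once, based on what the CPU reports. The sketch below shows that dispatch shape only. It is not how this commit detects SSSE3 on Android, which the log does not show: `cpu_has_ssse3`, `choose_row_proc` and both row procs are hypothetical, the "SSSE3" proc is a plain C++ stand-in, and the detection relies on the GCC/Clang builtin `__builtin_cpu_supports`, falling back to `false` elsewhere.

```cpp
#include <cstddef>
#include <cstdint>

using RowProc = void (*)(uint32_t* dst, const uint32_t* src, size_t count);

// Portable fallback.
static void copy_row_portable(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = src[i];
}

// Stand-in for an SSSE3-optimised routine; same contract, plain C++ here.
static void copy_row_ssse3_stub(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = src[i];
}

// Assumption: GCC/Clang on x86 provide __builtin_cpu_supports.
static bool cpu_has_ssse3() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("ssse3") != 0;
#else
    return false;
#endif
}

// Decide once, then reuse the same proc for every call.
RowProc choose_row_proc() {
    static const RowProc proc = cpu_has_ssse3() ? copy_row_ssse3_stub : copy_row_portable;
    return proc;
}
```

Callers fetch the proc once and invoke it per row, so the detection cost is paid a single time and the same binary runs on CPUs with and without the extension.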
* Sanitizing source files in Housekeeper-Nightly | skia.committer@gmail.com | 2013-07-31
    git-svn-id: http://skia.googlecode.com/svn/trunk@10449 2bbb7eff-a529-9590-31e7-b0007b416f81