| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/157863003
git-svn-id: http://skia.googlecode.com/svn/trunk@13381 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the SIGILL chrashes seen in chromium.
BUG=2067
R=reed@google.com, kevin.petit.arm@gmail.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/132393007
git-svn-id: http://skia.googlecode.com/svn/trunk@13351 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eliminates SkFlattenable{Read,Write}Buffer, promoting SkOrdered{Read,Write}Buffer
a step each in the hierarchy.
What used to be this:
SkFlattenableWriteBuffer -> SkOrderedWriteBuffer
SkFlattenableReadBuffer -> SkOrderedReadBuffer
SkFlattenableReadBuffer -> SkValidatingReadBuffer
is now
SkWriteBuffer
SkReadBuffer -> SkValidatingReadBuffer
Benefits:
- code is simpler, names are less wordy
- the generic SkFlattenableFooBuffer code in SkPaint was incorrect; removed
- write buffers are completely devirtualized, important for record speed
This refactoring was mostly mechanical. You aren't going to find anything
interesting in files with less than 10 lines changed.
BUG=skia:
R=reed@google.com, scroggo@google.com, djsollen@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/134163010
git-svn-id: http://skia.googlecode.com/svn/trunk@13245 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13228 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BitmapProcState: new factorised code
This one basically factorises the clamp and repeat transformations with
some performance improvements. It has the benefit of being faster, much
easier to maintain (nearly three times less code for more work
done :-)), and more complete (all persp transformations weren't optimised
in the previous version).
It also introduces the use of can_truncate_to_fixed_for_decal where
useful.
The effect on benchmarks ranges from a 5% penalty to a 25% gain on a
Cortex-A9 and from a 5% penalty to a 100% gain on a Cortex-A15.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23835006
git-svn-id: http://skia.googlecode.com/svn/trunk@13218 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
r13155. Next time I'll put the "do not disturb" sign on my commit.]
Refactor SkMorphologyImageFilter, CPU and GPU paths. This required making opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com, djsollen@google.com, reed@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13154
BUG=skia:
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13168 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13155 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
| |
opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com, djsollen@google.com, reed@google.com
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13154 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13061 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Calculate 8 channels in parallel by using 16-bits to store each channel. Due to the limitation of VQRDMULH, (int16 * int16 * 2 + 0x8000) >> 16, the fast path can only support kernelSize < 128.
8 significant bits are kept at least in each stage, the final error should less-equal than 1.
Pre-fetching memory for X-direction read. In fact pre-fetching memory doesn't help much for Y direction read, since it is a waste to load a cache line for only read 8 bytes.(I left it there to keep the symmetry. pre-fetch is cheap :) )
bench data on Nexus 10
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 25081.48
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 25038.04
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 25209.04
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 24928.01
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 17160.98
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 17924.11
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 14609.19
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 14625.91
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 14848.42
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 16037.29
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 14819.55
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 14563.69
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 11905.34
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 11883.85
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 9576.51
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 9793.84
BUG=
R=senorblanco@chromium.org, mtklein@google.com, reed@google.com, kevin.petit@arm.com, kevin.petit.arm@gmail.com
Author: zheng.xu@arm.com
Review URL: https://codereview.chromium.org/105893003
git-svn-id: http://skia.googlecode.com/svn/trunk@13036 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
| |
https://codereview.chromium.org/109403004/)
git-svn-id: http://skia.googlecode.com/svn/trunk@12581 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
| |
https://codereview.chromium.org/109403004) due to image quality regressions on the N4.
git-svn-id: http://skia.googlecode.com/svn/trunk@12578 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve a little on Blur
Grouping operations gives a 5-15% speed improvement on a Cortex-A15 based Chromebook.
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30887.69
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 30751.35
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30757.92
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 30673.88
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19602.17
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 20613.81
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17855.46
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17957.79
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 27015.75
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 27148.02
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 27241.60
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 27077.44
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 18458.10
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19643.42
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 16176.73
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 16450.50
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=senorblanco@chromium.org, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/109403004
git-svn-id: http://skia.googlecode.com/svn/trunk@12568 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, it's easy to provide a NEON version of the 1-pixel modeprocs.
Combined with https://codereview.chromium.org/23724013/ (merged) it allows
up to 35% speed improvement on Xfermodes when aa is non-NULL.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, reed@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/104883004
git-svn-id: http://skia.googlecode.com/svn/trunk@12525 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
| |
R=mtklein@google.com, mtklein
BUG=
Review URL: https://codereview.chromium.org/105423002
git-svn-id: http://skia.googlecode.com/svn/trunk@12493 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
| |
TBR=mtklein
BUG=
Review URL: https://codereview.chromium.org/98373003
git-svn-id: http://skia.googlecode.com/svn/trunk@12490 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
speedup on Nexus-10.
R=mtklein@google.com, mtklein
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 33063.23
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 32800.25
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 33017.88
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 32743.35
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 21024.04
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 22904.15
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 18738.08
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 18798.98
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30180.96
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 29861.90
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30178.98
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 29911.25
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19344.35
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19957.07
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17158.84
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17330.73
Review URL: https://codereview.chromium.org/99933004
git-svn-id: http://skia.googlecode.com/svn/trunk@12486 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: add a NEON version of SkFourByteInterp
Brings a modest performance improvement on its own in
ProcXfermodes when aa is neither zero nor FF. Combined
with 1-pixel NEON modeprocs, it brings up to 35% speed
improvement on the aa case.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23724013
git-svn-id: http://skia.googlecode.com/svn/trunk@12448 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12428 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12427 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12425 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Blitmask: NEON optimised version of the D32_A8 functions
Here are the microbenchmark results I got for the D32_A8
functions:
Cortex-A9:
==========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -14% | -39,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -3% | -29,9% | -25% |
+-------+--------+--------+--------+
| 4 | -11,3% | -22% | -14,5% |
+-------+--------+--------+--------+
| 8 | +128% | +66,6% | +105% |
+-------+--------+--------+--------+
| 16 | +159% | +102% | +149% |
+-------+--------+--------+--------+
| 64 | +189% | +136% | +189% |
+-------+--------+--------+--------+
| 256 | +126% | +102% | +149% |
+-------+--------+--------+--------+
| 1024 | +67,5% | +81,4% | +123% |
+-------+--------+--------+--------+
Cortex-A15:
===========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -24% | -46,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -18,5% | -35,5% | -28% |
+-------+--------+--------+--------+
| 4 | -5,2% | -17,5% | -15,5% |
+-------+--------+--------+--------+
| 8 | +72% | +65,8% | +84,7% |
+-------+--------+--------+--------+
| 16 | +168% | +117% | +149% |
+-------+--------+--------+--------+
| 64 | +165% | +110% | +145% |
+-------+--------+--------+--------+
| 256 | +106% | +99,6% | +141% |
+-------+--------+--------+--------+
| 1024 | +93,7% | +94,7% | +130% |
+-------+--------+--------+--------+
Blitmask: add NEON optimised PlatformBlitRowProcs16
Here are the microbenchmark results (speedup vs. C code):
+-------+-----------------+-----------------+
| | Cortex-A9 | Cortex-A15 |
| count +--------+--------+--------+--------+
| | Blend | Opaque | Blend | Opaque |
+-------+--------+--------+--------+--------+
| 1 | -19,2% | -36,7% | -33,6% | -44,7% |
+-------+--------+--------+--------+--------+
| 2 | -12,6% | -27,8% | -39% | -48% |
+-------+--------+--------+--------+--------+
| 4 | -11,5% | -21,6% | -37,7% | -44,3% |
+-------+--------+--------+--------+--------+
| 8 | +141% | +59,7% | +123% | +48,7% |
+-------+--------+--------+--------+--------+
| 16 | +213% | +119% | +214% | +121% |
+-------+--------+--------+--------+--------+
| 64 | +212% | +105% | +242% | +167% |
+-------+--------+--------+--------+--------+
| 256 | +289% | +167% | +249% | +207% |
+-------+--------+--------+--------+--------+
| 1024 | +273% | +169% | +146% | +220% |
+-------+--------+--------+--------+--------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23719002
git-svn-id: http://skia.googlecode.com/svn/trunk@12420 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
| |
R=mtklein@google.com, mtklein, reed@google.com
BUG=
Review URL: https://codereview.chromium.org/66413007
git-svn-id: http://skia.googlecode.com/svn/trunk@12227 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
| |
Tegra3.
R=mtklein@google.com, mtklein, reed@google.com
Review URL: https://codereview.chromium.org/68123003
git-svn-id: http://skia.googlecode.com/svn/trunk@12219 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
| |
Xeon ES-2690.
R=mtklein@google.com
Review URL: https://codereview.chromium.org/61643011
git-svn-id: http://skia.googlecode.com/svn/trunk@12204 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: xfer16
This adds support for 16bit Xfermodes. It also tunes the gcc test
macros in xfer32() to add compatibility for gcc > 4.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/33063002
git-svn-id: http://skia.googlecode.com/svn/trunk@12192 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12186 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NEON version of the convolutionProcs
The bitmap_scale benchmark is now twice as fast on ARM.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=12154
R=djsollen@google.com, mtklein@google.com, humper@google.com, epoger@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/27533004
git-svn-id: http://skia.googlecode.com/svn/trunk@12166 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
| |
BUG=skia:1807
git-svn-id: http://skia.googlecode.com/svn/trunk@12156 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NEON version of the convolutionProcs
The bitmap_scale benchmark is now twice as fast on ARM.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, humper@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/27533004
git-svn-id: http://skia.googlecode.com/svn/trunk@12154 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
| |
TBR=robertphillips
Review URL: https://codereview.chromium.org/45963007
git-svn-id: http://skia.googlecode.com/svn/trunk@12039 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
| |
erode). This gives a 3-5X speedup over the naive implementation, and also mitigates a timing-based security attack in Chrome (https://code.google.com/p/chromium/issues/detail?id=251711).
NOTE: this will require a corresponding GYP change on the Skia roll into Chrome: https://codereview.chromium.org/52453004/
R=mtklein@google.com, reed@google.com
Review URL: https://codereview.chromium.org/52603004
git-svn-id: http://skia.googlecode.com/svn/trunk@12038 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before:
$ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
0000000000000000 l d .data 0000000000000000 .data
2 .data 00000008 0000000000000000 0000000000000000 00001774 2**2
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g O .data 0000000000000004 debug_x
0000000000000004 g O .data 0000000000000004 debug_y
1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
0000000000000000 l d .data 0000000000000000 .data
After:
$ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
0000000000000000 l d .data 0000000000000000 .data
2 .data 00000000 0000000000000000 0000000000000000 00001774 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
0000000000000000 l d .data 0000000000000000 .data
Not sure why clang didn't catch them.
R=mtklein@google.com
BUG=
Author: tfarina@chromium.org
Review URL: https://codereview.chromium.org/50013002
git-svn-id: http://skia.googlecode.com/svn/trunk@11999 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
Committed: http://code.google.com/p/skia/source/detail?r=11813
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11843 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
| |
https://codereview.chromium.org/26627004) due to Chromium compilation faliures.
git-svn-id: http://skia.googlecode.com/svn/trunk@11833 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11813 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
| |
to Chromium compilation failure
git-svn-id: http://skia.googlecode.com/svn/trunk@11799 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11777 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b8162cb840f4cb6002ef68d5ac775c6a122c52a9.
Fixed was call-sites in benches that used the (now gone) setIsOpaque api.
R=scroggo@google.com
Review URL: https://codereview.chromium.org/26572006
git-svn-id: http://skia.googlecode.com/svn/trunk@11695 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 1c0ff422868b3badf5ffe0790a5d051d1896e2f7.
BUG=
Review URL: https://codereview.chromium.org/26709002
git-svn-id: http://skia.googlecode.com/svn/trunk@11677 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
| |
BUG=
R=scroggo@google.com
Review URL: https://codereview.chromium.org/25353002
git-svn-id: http://skia.googlecode.com/svn/trunk@11676 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: allow for SIMD modeprocs
This patch introduces the ability to have SIMD Xfermode modeprocs.
In the NEON implementation, SIMD modeprocs will process 8 pixels
at a time.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11654
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23644006
git-svn-id: http://skia.googlecode.com/svn/trunk@11669 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
| |
This reverts http://code.google.com/p/skia/source/detail?r=11654
Review URL: https://codereview.chromium.org/26340010
git-svn-id: http://skia.googlecode.com/svn/trunk@11655 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode: allow for SIMD modeprocs
This patch introduces the ability to have SIMD Xfermode modeprocs.
In the NEON implementation, SIMD modeprocs will process 8 pixels
at a time.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23644006
git-svn-id: http://skia.googlecode.com/svn/trunk@11654 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BlitRow565: S32_D565_Blend_Dither, slight speedup + bugfix
This patch adds a rewrite of S32_D565_Blend_Dither in intrinsics.
The newer version is faster (10-20% depending on the value of count)
and also supports ARGB as well as ABGR. It also adds the missing
assert at the beginning of the function.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://chromiumcodereview.appspot.com/22566002
git-svn-id: http://skia.googlecode.com/svn/trunk@11473 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@11426 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BlitRow565: NEON version of S32_D565_Opaque
Here's a new implementation of S32_D565_Opaque in NEON. It
improves dramatically the speed compared to S32A_D565_Opaque.
Here are the benchmark results (speedup vs. existing NEON):
+-------+-----------+------------+
| count | Cortex-A9 | Cortex-A15 |
+-------+-----------+------------+
| 1 | +130% | +139% |
+-------+-----------+------------+
| 2 | +65,2% | +51% |
+-------+-----------+------------+
| 4 | -25,5% | +10,2% |
+-------+-----------+------------+
| 8 | +63,8% | +32,1% |
+-------+-----------+------------+
| 16 | +110% | +49,2% |
+-------+-----------+------------+
| 64 | +153% | +123,5% |
+-------+-----------+------------+
| 256 | +151% | +144,7% |
+-------+-----------+------------+
| 1024 | +272% | +157,2% |
+-------+-----------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://chromiumcodereview.appspot.com/22351006
git-svn-id: http://skia.googlecode.com/svn/trunk@11415 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BlitRow565: S32_D565_Opaque_Dither: cleaning / bugfix
This patch brings a little code cleaning (spaces/comments) and a little
speed improvement (by using post-incrementation in the asm) but more
importantly it fixes a bug on Linux. The new code now supports ARGB
as well as ABGR.
I removed the comment as I have confirmed with benchmarks that this
code bring a *massive* (3x-7x) speedup compared to the C code.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://chromiumcodereview.appspot.com/22269003
git-svn-id: http://skia.googlecode.com/svn/trunk@11339 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
BitmapProcState: translate the filtering routines to intrinsics
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://chromiumcodereview.appspot.com/21915004
git-svn-id: http://skia.googlecode.com/svn/trunk@11246 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@11120 2bbb7eff-a529-9590-31e7-b0007b416f81
|