| Commit message (Collapse) | Author | Age |
| |
Committed: http://code.google.com/p/skia/source/detail?r=13547
R=reed@google.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/169753004
git-svn-id: http://skia.googlecode.com/svn/trunk@13583 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
BlitRow565: S32A_D565_Opaque_Dither: some improvements
- Supports ARGB and ABGR
- Fewer magic numbers
- Reduced instruction count : 5-25% speedup
- Fixed indentation, removed some commented and useless code
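As a rough scalar illustration of the first two bullets (not the NEON code from this patch), channel positions and 565 bit widths can be named once so that one routine handles both ARGB and ABGR inputs. `Layout32`, `kARGB`, `kABGR` and `pack565` are hypothetical names, and the sketch only truncates an opaque pixel; it does not dither or blend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical channel layout: bit offset of each 8-bit colour channel
// inside a 32-bit pixel.
struct Layout32 {
    unsigned r_shift;
    unsigned g_shift;
    unsigned b_shift;
};

// Two orderings, named from the most significant byte down (A on top).
const Layout32 kARGB = {16, 8, 0};
const Layout32 kABGR = {0, 8, 16};

// Bit widths of the RGB565 channels, named instead of hard-coded.
const unsigned kR16Bits = 5;
const unsigned kG16Bits = 6;
const unsigned kB16Bits = 5;

// Truncate a 32-bit pixel to RGB565 (no dithering, alpha ignored).
inline uint16_t pack565(uint32_t pixel, const Layout32& layout) {
    assert(layout.r_shift <= 24 && layout.g_shift <= 24 && layout.b_shift <= 24);
    const uint32_t r = (pixel >> layout.r_shift) & 0xFF;
    const uint32_t g = (pixel >> layout.g_shift) & 0xFF;
    const uint32_t b = (pixel >> layout.b_shift) & 0xFF;
    return static_cast<uint16_t>(((r >> (8 - kR16Bits)) << (kG16Bits + kB16Bits)) |
                                 ((g >> (8 - kG16Bits)) << kB16Bits) |
                                 (b >> (8 - kB16Bits)));
}
```

With this shape, switching byte order is a matter of passing a different layout rather than editing shift constants inside the loop.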
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/177963003
git-svn-id: http://skia.googlecode.com/svn/trunk@13577 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Benchmarks hitting this path can benefit from this patch.
Here are the data:
benchmark                        before    after     improvement
gradient_radial2_mirror          10885.52  10849.48   0.33%
gradient_radial2_clamp_hicolor   11819.69  11644.83   1.48%
gradient_radial2_clamp           11816.10  11649.91   1.41%
bitmaprect_FF_filter_trans           6.27      4.88  22.17%
bitmaprect_FF_nofilter_trans         6.27      4.88  22.17%
bitmaprect_FF_filter_identity        6.31      4.86  22.98%
bitmaprect_FF_nofilter_identity      6.25      4.86  22.24%
bitmap_4444_update                   6.26      5.05  19.33%
bitmap_4444_update_volatile          6.21      5.06  18.52%
bitmap_4444                          6.22      5.06  18.65%
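The percentage column is consistent with a relative reduction of the timing, (before - after) / before. A small sketch of that arithmetic, with `improvement_percent` as a hypothetical helper name:

```cpp
#include <cassert>

// Relative improvement in percent: how much smaller `after` is than `before`.
inline double improvement_percent(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before * 100.0;
}
```

For example, the bitmaprect_FF_filter_trans row (6.27 to 4.88) works out to roughly 22.17.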
BUG=
R=mtklein@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/172083003
git-svn-id: http://skia.googlecode.com/svn/trunk@13556 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
present. (https://codereview.chromium.org/169753004/)
Reason for revert:
Breaking the build.
Non-android machines are calling sk_throw().
Original issue's description:
> Enable the SSSE3 files to be built for Android when SSSE3 is not present.
>
> Committed: http://code.google.com/p/skia/source/detail?r=13547
R=reed@google.com, mtklein@google.com, djsollen@google.com
TBR=djsollen@google.com, mtklein@google.com, reed@google.com
NOTREECHECKS=true
NOTRY=true
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/175543004
git-svn-id: http://skia.googlecode.com/svn/trunk@13549 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
R=reed@google.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/169753004
git-svn-id: http://skia.googlecode.com/svn/trunk@13547 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Blitrow32: S32_Blend fix and small speed improvement
- the results now exactly match those of the C code
- the speed has improved, especially for small values of count
+-------+-----------+------------+
| count | Cortex-A9 | Cortex-A15 |
+-------+-----------+------------+
| 1 | +30% | +18% |
+-------+-----------+------------+
| 2 | 0 | 0 |
+-------+-----------+------------+
| 4 | - <1% | +14% |
+-------+-----------+------------+
| > 4 | -0.5..+5% | -0.5..+4% |
+-------+-----------+------------+
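For context, here is a scalar sketch of the kind of C reference a NEON S32_Blend has to match bit for bit. It assumes a constant-alpha blend of two packed 32-bit pixels with weights alpha + 1 and 256 - (alpha + 1); that weighting and the helper names (`scale_pixel`, `blend_pixel32`, `blend_row32`) are assumptions for illustration, not taken from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Scale all four 8-bit channels of a packed pixel by scale/256 (scale in
// [0, 256]), handling two channels per multiply.
inline uint32_t scale_pixel(uint32_t c, unsigned scale) {
    const uint32_t mask = 0x00FF00FFu;
    const uint32_t rb = ((c & mask) * scale) >> 8;
    const uint32_t ag = ((c >> 8) & mask) * scale;
    return (rb & mask) | (ag & ~mask);
}

// Blend one source pixel over one destination pixel with a global alpha.
inline uint32_t blend_pixel32(uint32_t src, uint32_t dst, unsigned alpha) {
    assert(alpha <= 255);
    const unsigned src_scale = alpha + 1;        // assumed mapping to [1, 256]
    const unsigned dst_scale = 256 - src_scale;  // weights always sum to 256
    return scale_pixel(src, src_scale) + scale_pixel(dst, dst_scale);
}

// Row version: the loop a SIMD implementation would vectorise.
inline void blend_row32(uint32_t* dst, const uint32_t* src, int count, unsigned alpha) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        dst[i] = blend_pixel32(src[i], dst[i], alpha);
    }
}
```

Because each weighted term is truncated separately, a vector version that rounds or combines the terms differently can differ from this by one in some channels, which is the kind of mismatch the first bullet refers to.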
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13532
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/158973002
git-svn-id: http://skia.googlecode.com/svn/trunk@13543 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
(https://codereview.chromium.org/158973002/)
Reason for revert:
Breaking the build.
See http://108.170.219.164:10117/builders/Build-Ubuntu12-GCC-Arm7-Debug-Nexus4/builds/2966 (and others).
We are getting warnings that vsrc and vdst may be uninitialized. Please fix and resubmit.
Original issue's description:
> ARM Skia NEON patches - 12 - S32_Blend
>
> Blitrow32: S32_Blend fix and small speed improvement
>
> - the results now exactly match those of the C code
> - the speed has improved, especially for small values of count
>
> +-------+-----------+------------+
> | count | Cortex-A9 | Cortex-A15 |
> +-------+-----------+------------+
> | 1 | +30% | +18% |
> +-------+-----------+------------+
> | 2 | 0 | 0 |
> +-------+-----------+------------+
> | 4 | - <1% | +14% |
> +-------+-----------+------------+
> | > 4 | -0.5..+5% | -0.5..+4% |
> +-------+-----------+------------+
>
> Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
>
> BUG=skia:
>
> Committed: http://code.google.com/p/skia/source/detail?r=13532
R=djsollen@google.com, mtklein@google.com, kevin.petit@arm.com
TBR=djsollen@google.com, kevin.petit@arm.com, mtklein@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/175433002
git-svn-id: http://skia.googlecode.com/svn/trunk@13534 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Blitrow32: S32_Blend fix and small speed improvement
- the results now exactly match those of the C code
- the speed has improved, especially for small values of count
+-------+-----------+------------+
| count | Cortex-A9 | Cortex-A15 |
+-------+-----------+------------+
| 1 | +30% | +18% |
+-------+-----------+------------+
| 2 | 0 | 0 |
+-------+-----------+------------+
| 4 | - <1% | +14% |
+-------+-----------+------------+
| > 4 | -0.5..+5% | -0.5..+4% |
+-------+-----------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/158973002
git-svn-id: http://skia.googlecode.com/svn/trunk@13532 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Use SkFractionalInt for some calculations to improve accuracy.
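To illustrate why extra fractional bits help, the sketch below accumulates the same step with 16 and with 32 fractional bits and compares the drift. It assumes SkFractionalInt is a 64-bit fixed-point type with 32 fractional bits; that assumption and the helper names `walk_16_16` and `walk_32_32` are not taken from this commit.

```cpp
#include <cassert>
#include <cstdint>

// Position after `steps` additions of a step truncated to 16 fractional bits.
inline double walk_16_16(double step, int steps) {
    assert(step >= 0.0 && step < 32768.0 && steps >= 0);
    const int32_t fixed_step = static_cast<int32_t>(step * 65536.0);
    int64_t pos = 0;
    for (int i = 0; i < steps; ++i) pos += fixed_step;
    return static_cast<double>(pos) / 65536.0;
}

// Same walk with 32 fractional bits held in a 64-bit accumulator.
inline double walk_32_32(double step, int steps) {
    assert(step >= 0.0 && step < 32768.0 && steps >= 0);
    const int64_t fixed_step = static_cast<int64_t>(step * 4294967296.0);
    int64_t pos = 0;
    for (int i = 0; i < steps; ++i) pos += fixed_step;
    return static_cast<double>(pos) / 4294967296.0;
}
```

Stepping by 1/3 a thousand times, the 16-bit version falls short of 1000/3 by about 0.005, while the 32-bit version stays within 1e-6.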
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:2175
NOTRY=true
R=djsollen@google.com, mtklein@google.com, hshi@chromium.org
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/167433002
git-svn-id: http://skia.googlecode.com/svn/trunk@13518 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
A microbenchmark of S32A_D565_Opaque() shows a 3x speedup after SSE optimization across various values of count on an i7-3770.
BUG=
R=mtklein@google.com, reed@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/138163013
git-svn-id: http://skia.googlecode.com/svn/trunk@13495 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
BlitRow565: new intrinsics version of S32A_D565_Blend
This new version is basically a rewrite of the existing code with
a few speed and accuracy improvements. There is a switch to enable
pixel-perfect results at the cost of a fairly large drop in
performance (disabled in this patch).
Here are the benchmark results (speedup vs. existing code):
+-------+------------+------------+
| count | Cortex-A9  | Cortex-A15 |
+-------+------------+------------+
| 1 | +103.6% | +12% |
+-------+------------+------------+
| 2 | +3.6% | +21.6% |
+-------+------------+------------+
| 4 | +0.8% | -0.8% |
+-------+------------+------------+
| 8 | +3.9% | -1% |
+-------+------------+------------+
| 16 | +14.7% | +5.7% |
+-------+------------+------------+
| 64 | +18.1% | +13.2% |
+-------+------------+------------+
| 256 | +16.3% | +27.4% |
+-------+------------+------------+
| 1024 | +78.2% | +17.4% |
+-------+------------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/156113005
git-svn-id: http://skia.googlecode.com/svn/trunk@13438 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/157863003
git-svn-id: http://skia.googlecode.com/svn/trunk@13381 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
the SIGILL crashes seen in chromium.
BUG=2067
R=reed@google.com, kevin.petit.arm@gmail.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/132393007
git-svn-id: http://skia.googlecode.com/svn/trunk@13351 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Eliminates SkFlattenable{Read,Write}Buffer, promoting SkOrdered{Read,Write}Buffer
a step each in the hierarchy.
What used to be this:
SkFlattenableWriteBuffer -> SkOrderedWriteBuffer
SkFlattenableReadBuffer -> SkOrderedReadBuffer
SkFlattenableReadBuffer -> SkValidatingReadBuffer
is now
SkWriteBuffer
SkReadBuffer -> SkValidatingReadBuffer
Benefits:
- code is simpler, names are less wordy
- the generic SkFlattenableFooBuffer code in SkPaint was incorrect; removed
- write buffers are completely devirtualized, important for record speed
This refactoring was mostly mechanical. You aren't going to find anything
interesting in files with fewer than 10 lines changed.
BUG=skia:
R=reed@google.com, scroggo@google.com, djsollen@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/134163010
git-svn-id: http://skia.googlecode.com/svn/trunk@13245 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13228 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
BitmapProcState: new factorised code
This one basically factorises the clamp and repeat transformations with
some performance improvements. It has the benefit of being faster, much
easier to maintain (roughly a third of the code for more work
done :-)), and more complete (not all persp transformations were optimised
in the previous version).
It also introduces the use of can_truncate_to_fixed_for_decal where
useful.
The effect on benchmarks ranges from a 5% penalty to a 25% gain on a
Cortex-A9 and from a 5% penalty to a 100% gain on a Cortex-A15.
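As a reminder of what the two tile modes compute, here is a scalar sketch on 16.16 fixed-point coordinates. It assumes clamp works on pixel-space coordinates and repeat on normalised coordinates (where 0x10000 spans the whole bitmap); the names `tile_clamp` and `tile_repeat` are hypothetical, and this is not the factorised NEON code itself.

```cpp
#include <cassert>
#include <cstdint>

// Clamp tiling: coordinates outside the bitmap stick to the nearest edge.
// fx is a 16.16 pixel-space coordinate; max is the last valid index.
inline int tile_clamp(int32_t fx, int max) {
    assert(max >= 0);
    const int x = fx >> 16;  // integer part (arithmetic shift assumed for negatives)
    if (x < 0) return 0;
    if (x > max) return max;
    return x;
}

// Repeat tiling: fx is a 16.16 normalised coordinate, so only the fractional
// part matters; it is rescaled to an index in [0, max].
inline int tile_repeat(int32_t fx, int max) {
    assert(max >= 0 && max < 65535);
    const uint32_t frac = static_cast<uint32_t>(fx) & 0xFFFFu;
    return static_cast<int>((frac * static_cast<uint32_t>(max + 1)) >> 16);
}
```

Both functions share the same shape (coordinate in, index out), which is what makes it natural to factor the surrounding loops and plug in the tiling step.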
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23835006
git-svn-id: http://skia.googlecode.com/svn/trunk@13218 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
r13155. Next time I'll put the "do not disturb" sign on my commit.]
Refactor SkMorphologyImageFilter, CPU and GPU paths. This required making opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com, djsollen@google.com, reed@google.com
Committed: https://code.google.com/p/skia/source/detail?r=13154
BUG=skia:
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13168 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13155 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
opts/ dependent on effects/, so that we could use the SkMorphologyProc type in SkMorphologyImageFilter.h.
Correctness and performance covered by existing tests; no change in functionality.
R=bsalomon@google.com, djsollen@google.com, reed@google.com
Review URL: https://codereview.chromium.org/135013004
git-svn-id: http://skia.googlecode.com/svn/trunk@13154 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@13061 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Calculate 8 channels in parallel by using 16 bits to store each channel. Due to the limitation of VQRDMULH, which computes (int16 * int16 * 2 + 0x8000) >> 16, the fast path can only support kernelSize < 128.
At least 8 significant bits are kept in each stage, so the final error should be at most 1.
Memory is prefetched for X-direction reads. Prefetching doesn't help much for Y-direction reads, since it is wasteful to load a whole cache line to read only 8 bytes. (I left it there to keep the symmetry; prefetch is cheap :) )
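A scalar model of one 16-bit lane of the formula quoted above may help when reasoning about the kernelSize limit. The saturation case and the helper `divide_by_kernel_size` (which stores 1/kernelSize as a Q15 fraction) are illustrative assumptions, not code from this patch. One plausible reading of the limit is that a sum of kernelSize 8-bit values must fit a signed 16-bit lane, which holds for kernelSize < 128 (255 * 127 = 32385).

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one lane: (a * b * 2 + 0x8000) >> 16, saturated to int16.
// Right shift of a negative value is assumed to be arithmetic.
inline int16_t rounding_doubling_mulh(int16_t a, int16_t b) {
    const int64_t product = 2 * static_cast<int64_t>(a) * b + 0x8000;
    const int64_t high = product >> 16;
    if (high > INT16_MAX) return INT16_MAX;  // only when a == b == -32768
    return static_cast<int16_t>(high);
}

// Hypothetical helper: multiply an accumulated channel sum by 1/kernel_size,
// with the reciprocal held as a Q15 fraction so it fits a signed 16-bit lane.
inline int16_t divide_by_kernel_size(int16_t sum, int kernel_size) {
    assert(kernel_size >= 2 && kernel_size < 128);
    const int16_t reciprocal_q15 = static_cast<int16_t>((1 << 15) / kernel_size);
    return rounding_doubling_mulh(sum, reciprocal_q15);
}
```

For instance, a window of five fully saturated samples (sum 1275) divided by 5 comes back as 255 in this model.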
bench data on Nexus 10
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 25081.48
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 25038.04
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 25209.04
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 24928.01
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 17160.98
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 17924.11
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 14609.19
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 14625.91
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 14848.42
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 16037.29
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 14819.55
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 14563.69
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 11905.34
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 11883.85
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 9576.51
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 9793.84
BUG=
R=senorblanco@chromium.org, mtklein@google.com, reed@google.com, kevin.petit@arm.com, kevin.petit.arm@gmail.com
Author: zheng.xu@arm.com
Review URL: https://codereview.chromium.org/105893003
git-svn-id: http://skia.googlecode.com/svn/trunk@13036 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
https://codereview.chromium.org/109403004/)
git-svn-id: http://skia.googlecode.com/svn/trunk@12581 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
https://codereview.chromium.org/109403004) due to image quality regressions on the N4.
git-svn-id: http://skia.googlecode.com/svn/trunk@12578 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Improve a little on Blur
Grouping operations gives a 5-15% speed improvement on a Cortex-A15 based Chromebook.
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30887.69
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 30751.35
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30757.92
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 30673.88
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19602.17
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 20613.81
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17855.46
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17957.79
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 27015.75
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 27148.02
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 27241.60
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 27077.44
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 18458.10
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19643.42
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 16176.73
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 16450.50
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=senorblanco@chromium.org, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/109403004
git-svn-id: http://skia.googlecode.com/svn/trunk@12568 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
In some cases, it's easy to provide a NEON version of the 1-pixel modeprocs.
Combined with https://codereview.chromium.org/23724013/ (merged) it allows
up to 35% speed improvement on Xfermodes when aa is non-NULL.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, reed@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/104883004
git-svn-id: http://skia.googlecode.com/svn/trunk@12525 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
R=mtklein@google.com, mtklein
BUG=
Review URL: https://codereview.chromium.org/105423002
git-svn-id: http://skia.googlecode.com/svn/trunk@12493 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
TBR=mtklein
BUG=
Review URL: https://codereview.chromium.org/98373003
git-svn-id: http://skia.googlecode.com/svn/trunk@12490 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
speedup on Nexus-10.
R=mtklein@google.com, mtklein
before:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 33063.23
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 32800.25
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 33017.88
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 32743.35
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 21024.04
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 22904.15
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 18738.08
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 18798.98
after:
running bench [640 480] blur_image_filter_large_10.00_10.00 8888: cmsecs = 30180.96
running bench [640 480] blur_image_filter_small_10.00_10.00 8888: cmsecs = 29861.90
running bench [640 480] blur_image_filter_large_1.00_1.00 8888: cmsecs = 30178.98
running bench [640 480] blur_image_filter_small_1.00_1.00 8888: cmsecs = 29911.25
running bench [640 480] blur_image_filter_large_0.00_1.00 8888: cmsecs = 19344.35
running bench [640 480] blur_image_filter_large_0.00_10.00 8888: cmsecs = 19957.07
running bench [640 480] blur_image_filter_large_1.00_0.00 8888: cmsecs = 17158.84
running bench [640 480] blur_image_filter_large_10.00_0.00 8888: cmsecs = 17330.73
Review URL: https://codereview.chromium.org/99933004
git-svn-id: http://skia.googlecode.com/svn/trunk@12486 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: add a NEON version of SkFourByteInterp
Brings a modest performance improvement on its own in
ProcXfermodes when aa is neither zero nor FF. Combined
with 1-pixel NEON modeprocs, it brings up to 35% speed
improvement on the aa case.
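For readers unfamiliar with the operation, the sketch below shows a scalar four-channel interpolation of the kind the name suggests: each 8-bit channel of the result moves from dst towards src according to the coverage value aa. The aa + 1 scale mapping and the names `lerp_channel` and `four_byte_interp` are assumptions for illustration, not the NEON code from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Interpolate one 8-bit channel: dst + (src - dst) * scale / 256, rounded
// down. Right shift of a negative value is assumed to be arithmetic.
inline unsigned lerp_channel(unsigned src, unsigned dst, unsigned scale) {
    const int diff = static_cast<int>(src) - static_cast<int>(dst);
    const int out = static_cast<int>(dst) + ((diff * static_cast<int>(scale)) >> 8);
    return static_cast<unsigned>(out);
}

// Apply the interpolation to all four channels of a packed 32-bit pixel.
inline uint32_t four_byte_interp(uint32_t src, uint32_t dst, unsigned aa) {
    assert(aa <= 255);
    const unsigned scale = aa + 1;  // assumed mapping from [0, 255] to [1, 256]
    uint32_t out = 0;
    for (unsigned shift = 0; shift < 32; shift += 8) {
        const unsigned s = (src >> shift) & 0xFF;
        const unsigned d = (dst >> shift) & 0xFF;
        out |= static_cast<uint32_t>(lerp_channel(s, d, scale)) << shift;
    }
    return out;
}
```

The four channels are independent, which is why the operation maps naturally onto SIMD lanes.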
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23724013
git-svn-id: http://skia.googlecode.com/svn/trunk@12448 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12428 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12427 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12425 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Blitmask: NEON optimised version of the D32_A8 functions
Here are the microbenchmark results I got for the D32_A8
functions:
Cortex-A9:
==========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -14% | -39,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -3% | -29,9% | -25% |
+-------+--------+--------+--------+
| 4 | -11,3% | -22% | -14,5% |
+-------+--------+--------+--------+
| 8 | +128% | +66,6% | +105% |
+-------+--------+--------+--------+
| 16 | +159% | +102% | +149% |
+-------+--------+--------+--------+
| 64 | +189% | +136% | +189% |
+-------+--------+--------+--------+
| 256 | +126% | +102% | +149% |
+-------+--------+--------+--------+
| 1024 | +67,5% | +81,4% | +123% |
+-------+--------+--------+--------+
Cortex-A15:
===========
+-------+--------+--------+--------+
| count | Black | Opaque | Color |
+-------+--------+--------+--------+
| 1 | -24% | -46,5% | -37,5% |
+-------+--------+--------+--------+
| 2 | -18,5% | -35,5% | -28% |
+-------+--------+--------+--------+
| 4 | -5,2% | -17,5% | -15,5% |
+-------+--------+--------+--------+
| 8 | +72% | +65,8% | +84,7% |
+-------+--------+--------+--------+
| 16 | +168% | +117% | +149% |
+-------+--------+--------+--------+
| 64 | +165% | +110% | +145% |
+-------+--------+--------+--------+
| 256 | +106% | +99,6% | +141% |
+-------+--------+--------+--------+
| 1024 | +93,7% | +94,7% | +130% |
+-------+--------+--------+--------+
Blitmask: add NEON optimised PlatformBlitRowProcs16
Here are the microbenchmark results (speedup vs. C code):
+-------+-----------------+-----------------+
| | Cortex-A9 | Cortex-A15 |
| count +--------+--------+--------+--------+
| | Blend | Opaque | Blend | Opaque |
+-------+--------+--------+--------+--------+
| 1 | -19,2% | -36,7% | -33,6% | -44,7% |
+-------+--------+--------+--------+--------+
| 2 | -12,6% | -27,8% | -39% | -48% |
+-------+--------+--------+--------+--------+
| 4 | -11,5% | -21,6% | -37,7% | -44,3% |
+-------+--------+--------+--------+--------+
| 8 | +141% | +59,7% | +123% | +48,7% |
+-------+--------+--------+--------+--------+
| 16 | +213% | +119% | +214% | +121% |
+-------+--------+--------+--------+--------+
| 64 | +212% | +105% | +242% | +167% |
+-------+--------+--------+--------+--------+
| 256 | +289% | +167% | +249% | +207% |
+-------+--------+--------+--------+--------+
| 1024 | +273% | +169% | +146% | +220% |
+-------+--------+--------+--------+--------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23719002
git-svn-id: http://skia.googlecode.com/svn/trunk@12420 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
R=mtklein@google.com, mtklein, reed@google.com
BUG=
Review URL: https://codereview.chromium.org/66413007
git-svn-id: http://skia.googlecode.com/svn/trunk@12227 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Tegra3.
R=mtklein@google.com, mtklein, reed@google.com
Review URL: https://codereview.chromium.org/68123003
git-svn-id: http://skia.googlecode.com/svn/trunk@12219 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xeon E5-2690.
R=mtklein@google.com
Review URL: https://codereview.chromium.org/61643011
git-svn-id: http://skia.googlecode.com/svn/trunk@12204 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: xfer16
This adds support for 16bit Xfermodes. It also tunes the gcc test
macros in xfer32() to add compatibility for gcc > 4.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/33063002
git-svn-id: http://skia.googlecode.com/svn/trunk@12192 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
git-svn-id: http://skia.googlecode.com/svn/trunk@12186 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
NEON version of the convolutionProcs
The bitmap_scale benchmark is now twice as fast on ARM.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=12154
R=djsollen@google.com, mtklein@google.com, humper@google.com, epoger@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/27533004
git-svn-id: http://skia.googlecode.com/svn/trunk@12166 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
BUG=skia:1807
git-svn-id: http://skia.googlecode.com/svn/trunk@12156 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
NEON version of the convolutionProcs
The bitmap_scale benchmark is now twice as fast on ARM.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, humper@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/27533004
git-svn-id: http://skia.googlecode.com/svn/trunk@12154 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
TBR=robertphillips
Review URL: https://codereview.chromium.org/45963007
git-svn-id: http://skia.googlecode.com/svn/trunk@12039 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
erode). This gives a 3-5X speedup over the naive implementation, and also mitigates a timing-based security attack in Chrome (https://code.google.com/p/chromium/issues/detail?id=251711).
NOTE: this will require a corresponding GYP change on the Skia roll into Chrome: https://codereview.chromium.org/52453004/
R=mtklein@google.com, reed@google.com
Review URL: https://codereview.chromium.org/52603004
git-svn-id: http://skia.googlecode.com/svn/trunk@12038 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Before:
$ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
0000000000000000 l d .data 0000000000000000 .data
2 .data 00000008 0000000000000000 0000000000000000 00001774 2**2
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 g O .data 0000000000000004 debug_x
0000000000000004 g O .data 0000000000000004 debug_y
1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
0000000000000000 l d .data 0000000000000000 .data
After:
$ objdump -x out/Release/libskia_opts.a | grep "\.data" | c++filt
1 .data 00000000 0000000000000000 0000000000000000 000004ec 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000f58 2**2
0000000000000000 l d .data 0000000000000000 .data
2 .data 00000000 0000000000000000 0000000000000000 00001774 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00001d8c 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000054 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 000001f0 2**2
0000000000000000 l d .data 0000000000000000 .data
1 .data 00000000 0000000000000000 0000000000000000 00000044 2**2
0000000000000000 l d .data 0000000000000000 .data
Not sure why clang didn't catch them.
R=mtklein@google.com
BUG=
Author: tfarina@chromium.org
Review URL: https://codereview.chromium.org/50013002
git-svn-id: http://skia.googlecode.com/svn/trunk@11999 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
Committed: http://code.google.com/p/skia/source/detail?r=11813
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11843 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
https://codereview.chromium.org/26627004) due to Chromium compilation failures.
git-svn-id: http://skia.googlecode.com/svn/trunk@11833 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11813 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
to Chromium compilation failure
git-svn-id: http://skia.googlecode.com/svn/trunk@11799 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11777 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
This reverts commit b8162cb840f4cb6002ef68d5ac775c6a122c52a9.
The fix was to the call-sites in benches that used the (now gone) setIsOpaque API.
R=scroggo@google.com
Review URL: https://codereview.chromium.org/26572006
git-svn-id: http://skia.googlecode.com/svn/trunk@11695 2bbb7eff-a529-9590-31e7-b0007b416f81