| Commit message (Collapse) | Author | Age |
| |
With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40%
performance improvement on desktop i7-3770.
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51
CQ_EXTRA_TRYBOTS=tryserver.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
R=mtklein@google.com, humper@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/525283002
| |
https://codereview.chromium.org/525283002/)
Reason for revert:
Color order looks wrong on Macs:
Before: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/12823183142873462143.png
After: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/13683040204546320578.png
Original issue's description:
> Enable highQualityFilter_SSE2
>
> With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40%
> performance improvement on desktop i7-3770.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51
R=humper@google.com, qiankun.miao@intel.com
TBR=humper@google.com, qiankun.miao@intel.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/539523002
| |
With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40%
performance improvement on desktop i7-3770.
BUG=skia:
R=mtklein@google.com, humper@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/525283002
| |
This code sometimes generates a build warning that bothers Chrome.
BUG=399842,skia:2906
R=reed@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/538463003
| |
BUG=skia:
R=mtklein@google.com, humper@google.com
Author: qiankun.miao@intel.com
Review URL: https://codereview.chromium.org/530673002
| |
R=reed@google.com, mtklein@google.com, senorblanco@google.com
TBR=reed@google.com
BUG=skia:2845
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/527973002
| |
(patchset #1 id:1 of https://codereview.chromium.org/520963002/)
Reason for revert:
failing more GMs than expected.
Original issue's description:
> Disable NEON procs for box blur as it produces invalid results
>
> BUG=skia:2845
>
> Committed: https://skia.googlesource.com/skia/+/4a1764688c990fb926aaeab538497dad52768d99
R=senorblanco@google.com, mtklein@google.com
TBR=mtklein@google.com, senorblanco@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:2845
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/531023002
| |
BUG=skia:2845
R=senorblanco@google.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/520963002
| |
BUG=skia:2797
Committed: https://skia.googlesource.com/skia/+/84cab93186fbe3e87d931fea73cb31b70ff5017b
R=mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/497823002
| |
BUG=skia:2797
R=mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/497823002
| |
BUG=skia:2845
R=mtklein@google.com
Author: reed@google.com
Review URL: https://codereview.chromium.org/498733002
| |
factory/public-constructor for the class. We want to *not* rely on private constructors, and not rely on calling through the inheritance hierarchy for either flattening or unflattening(CreateProc).
Refactoring pattern:
1. Guard the existing constructor(readbuffer) with the legacy build flag.
2. If you are an instantiable subclass, implement CreateProc(readbuffer) to create a new instance from the buffer params (or return NULL).
If you're a shader subclass:
1. You must read/write the local matrix if your class accepts that in its factory/constructor, else ignore it.
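The pattern above can be sketched as follows. This is a minimal illustration, not Skia's code: WriteBuffer, ReadBuffer, and ExampleEffect are hypothetical stand-ins, and the legacy build-flag guard is omitted. The point is that CreateProc reads the factory parameters back and goes through the public factory, returning a null pointer when the data is missing or invalid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the write/read buffers; not Skia's real classes.
class WriteBuffer {
public:
    void writeInt(int value) { fData.push_back(value); }
    const std::vector<int>& data() const { return fData; }
private:
    std::vector<int> fData;
};

class ReadBuffer {
public:
    explicit ReadBuffer(std::vector<int> data) : fData(std::move(data)) {}
    // Returns false when no more values are available.
    bool readInt(int* out) {
        assert(out != nullptr);
        if (fPos >= fData.size()) return false;
        *out = fData[fPos++];
        return true;
    }
private:
    std::vector<int> fData;
    std::size_t fPos = 0;
};

// Hypothetical instantiable subclass.
class ExampleEffect {
public:
    // Public factory: validates its parameter and may return nullptr.
    static std::unique_ptr<ExampleEffect> Create(int radius) {
        if (radius < 0) return nullptr;
        return std::unique_ptr<ExampleEffect>(new ExampleEffect(radius));
    }
    // Unflattening: read the factory parameters, then call the factory.
    static std::unique_ptr<ExampleEffect> CreateProc(ReadBuffer& buffer) {
        int radius = 0;
        if (!buffer.readInt(&radius)) return nullptr;
        return Create(radius);
    }
    // Flattening writes exactly what CreateProc reads back.
    void flatten(WriteBuffer& buffer) const { buffer.writeInt(fRadius); }
    int radius() const { return fRadius; }
private:
    explicit ExampleEffect(int radius) : fRadius(radius) {}
    int fRadius;
};
```

A round trip is then flatten into a WriteBuffer, wrap its data in a ReadBuffer, and call ExampleEffect::CreateProc on it.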
R=robertphillips@google.com, mtklein@google.com, senorblanco@google.com, senorblanco@chromium.org, sugoi@chromium.org
Author: reed@google.com
Review URL: https://codereview.chromium.org/395603002
| |
BUG=skia:2845
R=djsollen@google.com, senorblanco@google.com, senorblanco@chromium.org
Author: halcanary@google.com
Review URL: https://codereview.chromium.org/491973002
| |
1. vuzpq is a gcc instruction. Replace it with the equivalent vuzp
(see http://llvm.org/PR20423)
2. .func / .endfunc only have an effect with -gstabs, which we don't
use. As it's unused and clang doesn't support it, remove
.func / .endfunc (also see http://llvm.org/20424)
BUG=chromium:124610
R=mtklein@google.com
Author: thakis@chromium.org
Review URL: https://codereview.chromium.org/461693004
| |
See the notes in the Chromium bug, and http://llvm.org/20427
BUG=chromium:124610,skia:900
R=djsollen@google.com, mtklein@google.com
Author: thakis@chromium.org
Review URL: https://codereview.chromium.org/455903002
| |
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:2813
R=halcanary@google.com, djsollen@google.com, mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/458453002
| |
R=halcanary@google.com, mtklein@google.com, kevin.petit@arm.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/451633006
| |
R=robertphillips@google.com
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/422023006
| |
BUG=skia:2746
R=bungeman@google.com, robertphillips@google.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/414033002
| |
(https://codereview.chromium.org/403583002/)
Reason for revert:
This is blocking the roll. Chromium Windows trybots (like win_chromium_x64_rel) are crashing in the SSSE3 code (for example SkCanvasVideoRenderTest.CroppedFrame).
Original issue's description:
> Enable the SSSE3 compile time check on all platforms (3rd attempt)
>
> BUG=skia:2746
>
> Committed: https://skia.googlesource.com/skia/+/933834851f9d48fbd85b728cc92e1f0134bfaa4e
R=halcanary@google.com, mtklein@google.com, djsollen@google.com
TBR=djsollen@google.com, halcanary@google.com, mtklein@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:2746
Author: bungeman@google.com
Review URL: https://codereview.chromium.org/418523002
| |
BUG=skia:2746
R=halcanary@google.com, mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/403583002
| |
now convert the time that we would have spent uploading the texture to
compressing it giving a net 50% memory savings for these things.
Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/390453002
| |
(https://codereview.chromium.org/391693004/)
Reason for revert:
windows fail
Original issue's description:
> Enable the SSSE3 compile time check on all platforms.
>
> BUG=skia:2746
>
> Committed: https://skia.googlesource.com/skia/+/ee349531446ae2a8336b0903e05d0b2150d2131f
R=mtklein@google.com, djsollen@google.com
TBR=djsollen@google.com, mtklein@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:2746
Author: halcanary@google.com
Review URL: https://codereview.chromium.org/390063002
| |
BUG=skia:2746
R=mtklein@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/391693004
| |
gain is ~30%
R=djsollen@google.com
Author: djordje.pesut@imgtec.com
Review URL: https://codereview.chromium.org/357693002
| |
We can (https://codereview.chromium.org/390453002/)
Reason for revert:
Breaking chrome.
Original issue's description:
> Add support for NEON intrinsics to speed up texture compression. We can
> now convert the time that we would have spent uploading the texture to
> compressing it giving a net 50% memory savings for these things.
>
> Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
TBR=kevin.petit@arm.com, mtklein@google.com, robertphillips@google.com
NOTREECHECKS=true
NOTRY=true
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/384053003
| |
now convert the time that we would have spent uploading the texture to
compressing it giving a net 50% memory savings for these things.
R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
Author: krajcevski@google.com
Review URL: https://codereview.chromium.org/390453002
| |
gain is ~30%
following functions are optimized:
SI8_D16_nofilter_DX
SI8_opaque_D32_nofilter_DX
R=djsollen@google.com, teodora.petrovic@gmail.com
Author: djordje.pesut@imgtec.com
Review URL: https://codereview.chromium.org/336533003
| |
This fixes Android build.
R=reed@google.com, mtklein@google.com
Author: scroggo@google.com
Review URL: https://codereview.chromium.org/378613002
| |
Adds an SSE4.1 version of the existing BlurImage optimizations.
Performance of blur_image_filter_* benchmarks show a 10-50%
improvement on Linux/Ubuntu Core i7.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
R=mtklein@google.com, reed@chromium.org
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/366593004
| |
(https://codereview.chromium.org/366593004/)
Reason for revert:
breaks linker on chrome
[04:36:09.966000] [503/5965] LIB obj\chrome\installer_util.lib
[04:36:10.466000] FAILED: C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True skia.dll "C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:skia.dll.lib /DLL /OUT:skia.dll @skia.dll.rsp" 2 mt.exe rc.exe "obj\skia\skia.skia.dll.intermediate.manifest" obj\skia\skia.skia.dll.generated.manifest
[04:36:10.466000] skia.opts_check_x86.obj : error LNK2019: unresolved external symbol "bool __cdecl SkBoxBlurGetPlatformProcs_SSE4(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs_SSE4@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z) referenced in function "bool __cdecl SkBoxBlurGetPlatformProcs(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z)
[04:36:10.466000]
[04:36:10.466000] skia.dll : fatal error LNK1120: 1 unresolved externals
Original issue's description:
> Add SSE4 version of BlurImage optimizations.
>
> Adds an SSE4.1 version of the existing BlurImage optimizations.
> Performance of blur_image_filter_* benchmarks show a 10-50%
> improvement on Linux/Ubuntu Core i7.
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
R=mtklein@google.com, henrik.smiding@intel.com
TBR=henrik.smiding@intel.com, mtklein@google.com
NOTREECHECKS=true
NOTRY=true
Author: reed@chromium.org
Review URL: https://codereview.chromium.org/375503003
| |
Adds an SSE4.1 version of the existing BlurImage optimizations.
Performance of blur_image_filter_* benchmarks show a 10-50%
improvement on Linux/Ubuntu Core i7.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
R=mtklein@google.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/366593004
| |
BUG=391016
R=tomhudson@chromium.org, mtklein@google.com, rnk@chromium.org, thakis@chromium.org
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/363983004
| |
MemorySanitizer is an uninitialized-memory-use detector that is used in
Chromium and does not presently support assembly code.
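How this commit switches the assembly off is not shown in this log. One common way to do it, sketched here under that assumption, is to detect MemorySanitizer at compile time with Clang's `__has_feature(memory_sanitizer)` and fall back to portable code. The macro EXAMPLE_BUILT_WITH_MSAN and all function names below are hypothetical.

```cpp
#include <cassert>

// __has_feature is a Clang extension, so guard its use for other compilers.
#if defined(__has_feature)
#  if __has_feature(memory_sanitizer)
#    define EXAMPLE_BUILT_WITH_MSAN 1
#  endif
#endif
#ifndef EXAMPLE_BUILT_WITH_MSAN
#  define EXAMPLE_BUILT_WITH_MSAN 0
#endif

// Hypothetical switch: assembly routines are allowed only without MSan.
constexpr bool exampleCanUseAssemblyProcs() {
    return EXAMPLE_BUILT_WITH_MSAN == 0;
}

using RowProc = int (*)(const int* src, int count);

// Portable routine that MSan can instrument.
int sumRowPortable(const int* src, int count) {
    assert(count >= 0);
    int total = 0;
    for (int i = 0; i < count; ++i) total += src[i];
    return total;
}

// Placeholder for a hand-written assembly routine; here it just forwards.
int sumRowAsmPlaceholder(const int* src, int count) {
    return sumRowPortable(src, count);
}

// Pick the routine once, based on how the binary was built.
RowProc chooseRowProc() {
    return exampleCanUseAssemblyProcs() ? sumRowAsmPlaceholder : sumRowPortable;
}
```

Both routines return the same result, so callers are unaffected by which one is selected.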
BUG=chromium:344505, chromium:373739
R=mtklein@google.com
Review URL: https://codereview.chromium.org/367973005
| |
Marks the symbols in the S32A_Opaque_BlitRow32_SSE4 files as hidden,
so Chromium can build.
Also enables the optimizations.
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
R=mtklein@google.com, joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/368573002
| |
Reason for revert:
Mac Chrome builders still failing.
Original issue's description:
> Re-enable SSE4.
>
> I will roll this into Chrome with https://codereview.chromium.org/332393003.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/a75b0fadbdec4214afec6dd727fd224d34ed164f
R=reed@google.com, mtklein@chromium.org
TBR=mtklein@chromium.org, reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/337093004
| |
I will roll this into Chrome with https://codereview.chromium.org/332393003.
BUG=skia:
R=reed@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/357593003
| |
Currently the NEON code for Xfermodes performs well on arm64
targets except for dstout and dstin which are significantly
slower than the C code. This patch fixes this and gives
further improvements on other modes.
Here are some perf results:
+------------+------------+------------+
| mode | Cortex-A53 | Cortex-A57 |
+------------+------------+------------+
| multiply | +24.58% | +23.71% |
+------------+------------+------------+
| exclusion | +22.72% | +22.05% |
+------------+------------+------------+
| difference | +34.67% | +36.82% |
+------------+------------+------------+
| hardlight | +17.07% | +14.74% |
+------------+------------+------------+
| lighten | +38.21% | +32.87% |
+------------+------------+------------+
| darken | +37.59% | +32.99% |
+------------+------------+------------+
| overlay | +17.36% | +16.88% |
+------------+------------+------------+
| screen | +52.56% | +54.43% |
+------------+------------+------------+
| modulate | +62.85% | +61.32% |
+------------+------------+------------+
| plus | +91.52% | +117.41% |
+------------+------------+------------+
| xor | +42.86% | +43.38% |
+------------+------------+------------+
| dstatop | +48.46% | +48.99% |
+------------+------------+------------+
| srcatop | +50.50% | +48.51% |
+------------+------------+------------+
| dstout | +67.83% | +78.09% |
+------------+------------+------------+
| srcout | +69.02% | +78.26% |
+------------+------------+------------+
| dstin | +70.92% | +79.24% |
+------------+------------+------------+
| srcin | +68.90% | +78.23% |
+------------+------------+------------+
| dstover | +73.80% | +68.10% |
+------------+------------+------------+
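For readers unfamiliar with the two modes singled out above, here is a scalar reference for dstin and dstout on premultiplied 8-bit pixels, following the usual Porter-Duff definitions (dstin keeps the destination scaled by source alpha; dstout keeps it scaled by the inverse). It is only a sketch: the Pixel struct and function names are hypothetical, the rounding may differ from Skia's, and it is not the NEON code from this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical premultiplied pixel with 8-bit channels.
struct Pixel {
    std::uint8_t a, r, g, b;
};

// Rounded (x * y) / 255 for 8-bit inputs.
static inline std::uint8_t mulDiv255(unsigned x, unsigned y) {
    assert(x <= 255 && y <= 255);
    return static_cast<std::uint8_t>((x * y + 127) / 255);
}

// Scale every channel of the destination by the given factor (0..255).
static inline Pixel scalePixel(Pixel dst, unsigned factor) {
    return Pixel{mulDiv255(dst.a, factor), mulDiv255(dst.r, factor),
                 mulDiv255(dst.g, factor), mulDiv255(dst.b, factor)};
}

// dstin: result = dst * src_alpha
Pixel dstIn(Pixel src, Pixel dst) { return scalePixel(dst, src.a); }

// dstout: result = dst * (1 - src_alpha)
Pixel dstOut(Pixel src, Pixel dst) { return scalePixel(dst, 255u - src.a); }
```

Both modes ignore the source colour channels entirely, which is why they reduce to a single multiply per channel.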
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia
R=mtklein@google.com, djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/350343002
| |
Chrome canary failing to link chrome:
http://108.170.220.120:10115/builders/Canary-Chrome-Ubuntu13.10-Ninja-x86_64-ToT/builds/1009/steps/BuildChrome/logs/stdio
BUG=skia:
NOTRY=true
R=mtklein@google.com, rmistry@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/361493002
| |
Previously, the set of platform-specific function pointers for fast convolution (e.g., NEON, SSE) was passed to the scaler in a structure.
I refactored this so that the scaler fills in these function pointers itself once it's called, so the caller doesn't have to worry about it.
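The shape of that refactor can be sketched as below. All names (ConvolveProcs, fillPlatformConvolveProcs, scaleRow) are hypothetical and the "convolution" is reduced to a single multiply; the sketch only shows that the scaler, not its caller, looks up the platform routines.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical table of platform-specific routines.
struct ConvolveProcs {
    void (*convolveRow)(const float* src, float* dst, std::size_t n, float k) = nullptr;
};

// Portable fallback: scale each sample by a single coefficient.
static void convolveRowPortable(const float* src, float* dst, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i] * k;
}

// Hypothetical platform hook. A real build would install NEON or SSE
// routines here when they are available.
void fillPlatformConvolveProcs(ConvolveProcs* procs) {
    assert(procs != nullptr);
    procs->convolveRow = convolveRowPortable;
}

// After the refactor: the scaler obtains the procs itself, so callers
// only pass data.
void scaleRow(const float* src, float* dst, std::size_t n, float k) {
    ConvolveProcs procs;
    fillPlatformConvolveProcs(&procs);
    procs.convolveRow(src, dst, n, k);
}
```

With this layout a caller writes `scaleRow(in, out, n, 0.5f)` and never sees the function-pointer structure.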
R=mtklein@google.com
TBR=mtklein
NOTRY=True
Author: humper@google.com
Review URL: https://codereview.chromium.org/354193002
| |
Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
instruction set. Special case for when alpha is zero or opaque.
Performance increase of 10%-400% compared to the existing SSE2
optimization (measured on Silvermont architecture).
Noticeable in ~25 different skia bench subtests, especially in
bitmap_8888_*, repeatTile_*, and morph_*.
bitmap_8888_A - 100% faster
bitmap_8888_A_source_transparent - 250% faster
bitmap_8888_A_source_opaque - 25% faster
bitmap_8888_A_scale_bicubic - 75% faster
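The "special case for when alpha is zero or opaque" can be illustrated with a scalar source-over loop. This is a sketch under stated assumptions, not the SSE4 routine from the patch: it assumes premultiplied pixels with alpha in the top byte, uses a simple rounded divide by 255 that may differ from Skia's rounding, and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Rounded (a * b) / 255 for 8-bit values.
static inline unsigned mulDiv255(unsigned a, unsigned b) {
    return (a * b + 127) / 255;
}

// Source-over for one premultiplied pixel (alpha assumed in bits 24..31).
static inline std::uint32_t srcOverPixel(std::uint32_t src, std::uint32_t dst) {
    const unsigned alpha = src >> 24;
    if (alpha == 0) return dst;    // fully transparent: destination unchanged
    if (alpha == 255) return src;  // fully opaque: plain copy
    const unsigned inv = 255 - alpha;
    std::uint32_t result = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const unsigned s = (src >> shift) & 0xFF;
        const unsigned d = (dst >> shift) & 0xFF;
        const unsigned c = std::min(255u, s + mulDiv255(d, inv));
        result |= static_cast<std::uint32_t>(c) << shift;
    }
    return result;
}

// Blend a row of source pixels over the destination row.
void blitRowSrcOver(std::uint32_t* dst, const std::uint32_t* src, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) dst[i] = srcOverPixel(src[i], dst[i]);
}
```

The two early returns skip all multiplies, which fits the larger gains the commit reports for the fully transparent source benchmark.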
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/289473009
| |
Here are some perf results:
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -2.54% | -5.39% |
+-------+------------+------------+
| 2 | -0.66% | -2.08% |
+-------+------------+------------+
| 4 | -11.13% | 0.00% |
+-------+------------+------------+
| 8 | -5.79% | -1.30% |
+-------+------------+------------+
| 16 | 71.60% | 93.27% |
+-------+------------+------------+
| 64 | 30.99% | 57.35% |
+-------+------------+------------+
| 256 | 25.41% | 52.59% |
+-------+------------+------------+
| 1024 | 25.56% | 53.76% |
+-------+------------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=mtklein@google.com, djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/346843003
| |
Original Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=901208
R=reed@google.com, mtklein@google.com, reed1
BUG=skia:
Author: george@mozilla.com
Review URL: https://codereview.chromium.org/337853003
| |
(https://codereview.chromium.org/289473009/)
NOTREECHECKS=true
NOTRY=true
Reason for revert:
Valgrind bot's seeing this code use uninitialized memory, and it's somehow blocking our roll into Chrome too:
> ld: warning: could not create compact unwind for
S32A_Opaque_BlitRow32_SSE4_asm:
> stack subq instruction is too different from dwarf stack size
> [10339/10982 | 3247.792] PACKAGE FRAMEWORK "Chromium Framework.framework",
> POSTBUILDS
> FAILED: ./gyp-mac-tool package-framework "Chromium Framework.framework" A &&
> (export
> BUILT_PRODUCTS_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
> export CONFIGURATION=Release; export CONTENTS_FOLDER_PATH="Chromium
> Framework.framework/Versions/A"; export
> DYLIB_INSTALL_NAME_BASE=@executable_path/../Versions/37.0.2056.0; export
> EXECUTABLE_NAME="Chromium Framework"; export EXECUTABLE_PATH="Chromium
> Framework.framework/Versions/A/Chromium Framework"; export
> FULL_PRODUCT_NAME="Chromium Framework.framework"; export
> INFOPLIST_PATH="Chromium Framework.framework/Versions/A/Resources/Info.plist";
> export
LD_DYLIB_INSTALL_NAME="@executable_path/../Versions/37.0.2056.0/Chromium
> Framework.framework/Chromium Framework"; export MACH_O_TYPE=mh_dylib; export
> PRODUCT_NAME="Chromium Framework"; export
> PRODUCT_TYPE=com.apple.product-type.framework; export
>
SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.6.sdk;
> export
>
SRCROOT=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/../../chrome;
> export SOURCE_ROOT="${SRCROOT}"; export
> TARGET_BUILD_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
> export TEMP_DIR="${TMPDIR}"; export
UNLOCALIZED_RESOURCES_FOLDER_PATH="Chromium
> Framework.framework/Versions/A/Resources"; export WRAPPER_NAME="Chromium
> Framework.framework"; (cd ../../chrome && ../build/mac/tweak_info_plist.py
> "--breakpad=1" "--breakpad_uploads=0" "--keystone=0" "--scm=1"
> "--branding=Chromium" && ln -fns Versions/Current/Libraries
> "${BUILT_PRODUCTS_DIR}/${WRAPPER_NAME}/Libraries" &&
> tools/build/mac/verify_order _ChromeMain
> "${BUILT_PRODUCTS_DIR}/${EXECUTABLE_PATH}"); G=$?; ((exit $G) || rm -rf
> 'Chromium Framework.framework') && exit $G) && touch "Chromium
> Framework.framework"
> tools/build/mac/verify_order: unordered symbols in
> /Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/Chromium
> Framework.framework/Versions/A/Chromium Framework:
> S32A_Opaque_BlitRow32_SSE4_asm
> _S32A_Opaque_BlitRow32_SSE4_asm
> ninja: build stopped: subcommand failed.
Original issue's description:
> Add SSE4 optimization of S32A_Opaque_Blitrow
>
> Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
> instruction set. Special case for when alpha is zero or opaque.
>
> Performance increase of 10%-400% compared to the existing SSE2
> optimization (measured on Silvermont architecture).
> Noticeable in ~25 different skia bench subtests, especially in
> bitmap_8888_*, repeatTile_*, and morph_*.
>
> bitmap_8888_A - 100% faster
> bitmap_8888_A_source_transparent - 250% faster
> bitmap_8888_A_source_opaque - 25% faster
> bitmap_8888_A_scale_bicubic - 75% faster
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
>
> Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
R=reed@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com, henrik.smiding@intel.com, mtklein@chromium.org
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/336413007
| |
Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
instruction set. Special case for when alpha is zero or opaque.
Performance increase of 10%-400% compared to the existing SSE2
optimization (measured on Silvermont architecture).
Noticeable in ~25 different skia bench subtests, especially in
bitmap_8888_*, repeatTile_*, and morph_*.
bitmap_8888_A - 100% faster
bitmap_8888_A_source_transparent - 250% faster
bitmap_8888_A_source_opaque - 25% faster
bitmap_8888_A_scale_bicubic - 75% faster
Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
Author: henrik.smiding@intel.com
Review URL: https://codereview.chromium.org/289473009
| |
benches and bots. (https://codereview.chromium.org/331193004/)
Reason for revert:
Experiment is over: disabling SSSE3 is a 25-50% perf regression for bitmap scaling on every machine we've got.
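One way to picture the experiment being reverted is a dispatcher that caps the usable instruction set below what the CPU reports. The following is a minimal sketch with hypothetical names (SimdLevel, ProcChoice, chooseScaleProc); Skia's actual detection and dispatch code is not shown in this log.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical ordering of x86 SIMD levels, lowest first.
enum class SimdLevel { kNone = 0, kSSE2 = 1, kSSSE3 = 2 };

// Which routine the dispatcher picked, plus a label for logging.
struct ProcChoice {
    SimdLevel level;
    const char* name;
};

// Select the best routine allowed by both the CPU and the configured cap.
ProcChoice chooseScaleProc(SimdLevel detected, SimdLevel cap) {
    const SimdLevel usable = std::min(detected, cap);
    switch (usable) {
        case SimdLevel::kSSSE3: return {SimdLevel::kSSSE3, "scale_ssse3"};
        case SimdLevel::kSSE2:  return {SimdLevel::kSSE2, "scale_sse2"};
        case SimdLevel::kNone:  break;
    }
    assert(usable == SimdLevel::kNone);
    return {SimdLevel::kNone, "scale_portable"};
}
```

With the cap set to kSSE2, an SSSE3-capable machine still gets the SSE2 routine, which is the configuration the revert reports as a 25-50% regression for bitmap scaling.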
Original issue's description:
> Temporarily limit x86 SIMD to SSE2 only, to see effect on all benches and bots.
>
> BUG=372232
>
> Committed: https://skia.googlesource.com/skia/+/f1e5a04832e4d350f9ebf5d556c6d3897345f883
R=reed@google.com, mtklein@chromium.org
TBR=mtklein@chromium.org, reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=372232
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/332213005
| |
BUG=372232
R=reed@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/331193004
| |
gain is ~40%
following functions are optimized:
S32_D565_Blend
S32A_D565_Opaque_Dither
S32_D565_Opaque_Dither
S32_D565_Blend_Dither
S32A_D565_Opaque
S32A_D565_Blend
S32_Blend_BlitRow32
R=djsollen@google.com, teodora.petrovic@gmail.com
Author: djordje.pesut@imgtec.com
Review URL: https://codereview.chromium.org/326913004
| |
This enables all 565 blitters except S32A_D565_Opaque.
Here are some performance results:
S32_D565_Opaque:
================
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -18.37% | -13.04% |
+-------+------------+------------+
| 2 | -9.90% | -13.78% |
+-------+------------+------------+
| 4 | -8.28% | -6.77% |
+-------+------------+------------+
| 8 | 157.63% | 78.15% |
+-------+------------+------------+
| 16 | 72.67% | 44.81% |
+-------+------------+------------+
| 64 | 76.78% | 40.89% |
+-------+------------+------------+
| 256 | 73.85% | 36.05% |
+-------+------------+------------+
| 1024 | 75.73% | 36.70% |
+-------+------------+------------+
S32_D565_Blend:
===============
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -9.99% | -13.79% |
+-------+------------+------------+
| 2 | -9.17% | -6.74% |
+-------+------------+------------+
| 4 | -6.73% | -4.42% |
+-------+------------+------------+
| 8 | 163.31% | 112.82% |
+-------+------------+------------+
| 16 | 55.21% | 44.68% |
+-------+------------+------------+
| 64 | 54.09% | 41.99% |
+-------+------------+------------+
| 256 | 52.63% | 40.64% |
+-------+------------+------------+
| 1024 | 52.46% | 40.45% |
+-------+------------+------------+
S32A_D565_Blend:
================
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -5.88% | -6.06% |
+-------+------------+------------+
| 2 | -4.74% | -0.01% |
+-------+------------+------------+
| 4 | -5.42% | -3.03% |
+-------+------------+------------+
| 8 | 78.78% | 77.96% |
+-------+------------+------------+
| 16 | 98.19% | 79.61% |
+-------+------------+------------+
| 64 | 111.56% | 72.60% |
+-------+------------+------------+
| 256 | 113.80% | 69.96% |
+-------+------------+------------+
| 1024 | 114.42% | 70.85% |
+-------+------------+------------+
S32_D565_Opaque_Dither:
=======================
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -4.18% | -0.93% |
+-------+------------+------------+
| 2 | -2.43% | -2.04% |
+-------+------------+------------+
| 4 | -1.09% | -1.23% |
+-------+------------+------------+
| 8 | 184.89% | 136.53% |
+-------+------------+------------+
| 16 | 128.64% | 89.11% |
+-------+------------+------------+
| 64 | 132.68% | 100.98% |
+-------+------------+------------+
| 256 | 157.02% | 100.86% |
+-------+------------+------------+
| 1024 | 163.85% | 103.62% |
+-------+------------+------------+
S32_D565_Blend_Dither:
======================
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -4.87% | 0.01% |
+-------+------------+------------+
| 2 | -2.71% | 2.97% |
+-------+------------+------------+
| 4 | -2.20% | 0.28% |
+-------+------------+------------+
| 8 | 149.76% | 146.80% |
+-------+------------+------------+
| 16 | 85.69% | 95.77% |
+-------+------------+------------+
| 64 | 88.81% | 101.39% |
+-------+------------+------------+
| 256 | 97.32% | 107.22% |
+-------+------------+------------+
| 1024 | 98.08% | 115.71% |
+-------+------------+------------+
S32A_D565_Opaque_Dither:
========================
+-------+------------+------------+
| count | Cortex-A53 | Cortex-A57 |
+-------+------------+------------+
| 1 | -1.86% | 0.02% |
+-------+------------+------------+
| 2 | -0.58% | -1.52% |
+-------+------------+------------+
| 4 | -0.75% | 1.16% |
+-------+------------+------------+
| 8 | 240.74% | 155.16% |
+-------+------------+------------+
| 16 | 181.97% | 132.15% |
+-------+------------+------------+
| 64 | 203.11% | 136.48% |
+-------+------------+------------+
| 256 | 223.45% | 133.05% |
+-------+------------+------------+
| 1024 | 225.96% | 134.05% |
+-------+------------+------------+
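In every table above the gains start at a count of 8, while smaller counts are flat or slightly slower. That is consistent with a loop that handles pixels in blocks of 8 and leaves the remainder to scalar code, though the log does not state this. A scalar sketch of that loop shape for an opaque 32-bit to 565 conversion follows; the 0xAARRGGBB layout and the function names are assumptions, and the inner block is where a NEON version would use vector instructions.

```cpp
#include <cassert>
#include <cstdint>

// Pack one 32-bit pixel (assumed 0xAARRGGBB) into RGB565 by truncation.
static inline std::uint16_t pack565(std::uint32_t c) {
    const unsigned r = (c >> 16) & 0xFF;
    const unsigned g = (c >> 8) & 0xFF;
    const unsigned b = c & 0xFF;
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Convert a row: whole blocks of 8 first, then a scalar tail.
void convertRowTo565(std::uint16_t* dst, const std::uint32_t* src, int count) {
    assert(count >= 0);
    int i = 0;
    // Main loop: a vectorised version would process each block at once.
    for (; i + 8 <= count; i += 8) {
        for (int j = 0; j < 8; ++j) dst[i + j] = pack565(src[i + j]);
    }
    // Tail: fewer than 8 pixels remain.
    for (; i < count; ++i) dst[i] = pack565(src[i]);
}
```

For counts below 8 only the tail loop runs, so any setup cost of the vector path is paid without benefit.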
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, mtklein@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/317193003
| |
(https://codereview.chromium.org/289473009/)
Reason for revert:
Buildbot failures on Mac 10.6 and Mac 10.7.
R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com, henrik.smiding@intel.com
TBR=reed@google.com
NOTRY=True
Original issue's description:
> Add SSE4 optimization of S32A_Opaque_Blitrow
>
> Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
> instruction set. Special case for when alpha is zero or opaque.
>
> Performance increase of 10%-400% compared to the existing SSE2
> optimization (measured on Silvermont architecture).
> Noticeable in ~25 different skia bench subtests, especially in
> bitmap_8888_*, repeatTile_*, and morph_*.
>
> bitmap_8888_A - 100% faster
> bitmap_8888_A_source_transparent - 250% faster
> bitmap_8888_A_source_opaque - 25% faster
> bitmap_8888_A_scale_bicubic - 75% faster
>
> Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
>
> Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
Author: jvanverth@google.com
Review URL: https://codereview.chromium.org/311053009
|