| Commit message | Author | Age |
| |
This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.
TBR=joshualitt@google.com,bsalomon@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/3e9dfdb3784c0cbfecf7589a74aa9aff7ef40abd
Review URL: https://codereview.chromium.org/896163003
| |
#1 id:1 of https://codereview.chromium.org/896163003/)
Reason for revert:
failed on my manual revert
Original issue's description:
> Revert "Move DstCopy on gpu into the GrXferProcessor."
>
> This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.
>
> TBR=joshualitt@google.com,bsalomon@google.com
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/3e9dfdb3784c0cbfecf7589a74aa9aff7ef40abd
TBR=
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/900913002
| |
This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.
TBR=joshualitt@google.com,bsalomon@google.com
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/896163003
| |
BUG=skia:
Review URL: https://codereview.chromium.org/885923002
| |
Build trybots are not triggering.
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/894773002
| |
R=reed@google.com
Review URL: https://codereview.chromium.org/895103002
| |
This is working towards making a simple example part of the buildbot compile step and removing SkExamples from the experimental directory.
This works on Mac, Windows, and Linux but isn't complete for Android, ChromeOS and iOS.
Review URL: https://codereview.chromium.org/886413004
| |
This merges and refactors SkAtomics.h and SkBarriers.h into SkAtomics.h and
some ports/ implementations. The major new feature is that we can express
memory orders explicitly rather than only through comments.
The porting layer is reduced to four template functions:
- sk_atomic_load
- sk_atomic_store
- sk_atomic_fetch_add
- sk_atomic_compare_exchange
From those four we can reconstruct all our previous sk_atomic_foo.
There are three ports:
- SkAtomics_std: uses C++11 <atomic>, used with MSVC
- SkAtomics_atomic: uses newer GCC/Clang intrinsics, used on non-MSVC compilers where possible
- SkAtomics_sync: uses older GCC/Clang intrinsics, used where SkAtomics_atomic is not supported
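As a rough illustration of how the older helpers could be rebuilt from the four primitives, here is a minimal sketch assuming a std::atomic-backed port. Every name prefixed with sketch_ is hypothetical, and the signatures are not taken from SkAtomics.h; the point is only that memory orders become explicit arguments and that inc, dec and a conditional increment fall out of fetch_add and compare_exchange.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-ins for the four porting-layer functions, built on C++11
// <atomic>. Names and signatures are illustrative, not Skia's actual API.
template <typename T>
T sketch_atomic_load(const std::atomic<T>* ptr,
                     std::memory_order mo = std::memory_order_seq_cst) {
    return ptr->load(mo);
}

template <typename T>
void sketch_atomic_store(std::atomic<T>* ptr, T val,
                         std::memory_order mo = std::memory_order_seq_cst) {
    ptr->store(val, mo);
}

template <typename T>
T sketch_atomic_fetch_add(std::atomic<T>* ptr, T val,
                          std::memory_order mo = std::memory_order_seq_cst) {
    return ptr->fetch_add(val, mo);
}

template <typename T>
bool sketch_atomic_compare_exchange(std::atomic<T>* ptr, T* expected, T desired,
                                    std::memory_order success = std::memory_order_seq_cst,
                                    std::memory_order failure = std::memory_order_seq_cst) {
    return ptr->compare_exchange_strong(*expected, desired, success, failure);
}

// Older-style helpers rebuilt from the primitives above.
// Both return the value held before the update.
inline int32_t sketch_atomic_inc(std::atomic<int32_t>* ptr) {
    return sketch_atomic_fetch_add(ptr, int32_t(1));
}

inline int32_t sketch_atomic_dec(std::atomic<int32_t>* ptr) {
    return sketch_atomic_fetch_add(ptr, int32_t(-1));
}

// Increments only if the current value is non-zero; returns the previous value.
inline int32_t sketch_atomic_conditional_inc(std::atomic<int32_t>* ptr) {
    int32_t prev = sketch_atomic_load(ptr);
    do {
        if (prev == 0) {
            return 0;
        }
        // On failure, compare_exchange refreshes prev with the current value.
    } while (!sketch_atomic_compare_exchange(ptr, &prev, int32_t(prev + 1)));
    return prev;
}
```

A caller that only needs a relaxed counter could pass std::memory_order_relaxed to sketch_atomic_fetch_add directly, which is the kind of intent that previously lived only in comments.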
No public API changes.
TBR=reed@google.com
BUG=skia:
Review URL: https://codereview.chromium.org/896553002
| |
(patchset #1 id:1 of https://codereview.chromium.org/873553003/)
Reason for revert:
Reverted the wrong CL.
Original issue's description:
> Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #16 id:300001 of https://codereview.chromium.org/874863002/)
>
> Reason for revert:
> This causes a bug on the 'hittestpath' GM on MacMini 4,1
>
> See:
>
> https://gold.skia.org/#/triage/hittestpath?head=0
>
> for details.
>
> Original issue's description:
> > SSE4 opaque blend using intrinsics instead of assembly.
> >
> > Since we had such a hard time with the assembly versions of this blit (to the
> > point that we have them completely disabled everywhere), I thought I'd take
> > a shot at writing a version of the blit using intrinsics.
> >
> > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> > to skip the blend when the 16 src pixels we consider each loop are all opaque
> > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> > all those alphas.
> >
> > It's worth looking to see if we can backport this type of logic to SSE2 using
> > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
> >
> > My local performance testing doesn't show this to be an unambiguous win
> > (there are probably microbenchmarks and SKPs where we'd be better off just
> > powering through the blend rather than looking at alphas), but the potential
> > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
> >
> > DM says it draws pixel-perfect compared to the old code.
> >
> > Microbenchmarks:
> > bitmap_RGBA_8888_A_source_stripes_two                 14us   -> 14.4us  1.03x
> > bitmap_RGBA_8888_A_source_stripes_three               14.3us -> 14.5us  1.01x
> > bitmap_RGBA_8888_scale_bilerp                         61.9us -> 62.2us  1.01x
> > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp  102us  -> 101us   0.99x
> > bitmap_RGBA_8888_scale_rotate_bilerp                  103us  -> 101us   0.99x
> > bitmap_RGBA_8888_scale                                18.4us -> 18.2us  0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bicubic               71us   -> 70us    0.99x
> > bitmap_RGBA_8888_update_scale_rotate_bilerp           103us  -> 101us   0.99x
> > bitmap_RGBA_8888_A_scale_rotate_bilerp                112us  -> 109us   0.98x
> > bitmap_RGBA_8888_update_volatile                      5.72us -> 5.58us  0.98x
> > bitmap_RGBA_8888                                      5.73us -> 5.58us  0.97x
> > bitmap_RGBA_8888_update                               5.78us -> 5.6us   0.97x
> > bitmap_RGBA_8888_A_scale_bilerp                       70.7us -> 68us    0.96x
> > bitmap_RGBA_8888_A_scale_bicubic                      23.7us -> 21.8us  0.92x
> > bitmap_RGBA_8888_A                                    13.9us -> 10.9us  0.78x
> > bitmap_RGBA_8888_A_source_opaque                      14us   -> 6.29us  0.45x
> > bitmap_RGBA_8888_A_source_transparent                 14us   -> 3.65us  0.26x
> >
> > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> > be a decent predictor of real-world impact.
> >
> > BUG=chromium:399842
> >
> > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
> >
> > CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
> >
> > Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493
>
> TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
> NOPRESUBMIT=true
> NOTREECHECKS=true
> NOTRY=true
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/4988891a1173cd405bf1c1dd3a3668c451f45e4c
TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/894083002
| |
#16 id:300001 of https://codereview.chromium.org/874863002/)
Reason for revert:
This causes a bug on the 'hittestpath' GM on MacMini 4,1
See:
https://gold.skia.org/#/triage/hittestpath?head=0
for details.
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel-perfect compared to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two                 14us   -> 14.4us  1.03x
> bitmap_RGBA_8888_A_source_stripes_three               14.3us -> 14.5us  1.01x
> bitmap_RGBA_8888_scale_bilerp                         61.9us -> 62.2us  1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp  102us  -> 101us   0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp                  103us  -> 101us   0.99x
> bitmap_RGBA_8888_scale                                18.4us -> 18.2us  0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic               71us   -> 70us    0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp           103us  -> 101us   0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp                112us  -> 109us   0.98x
> bitmap_RGBA_8888_update_volatile                      5.72us -> 5.58us  0.98x
> bitmap_RGBA_8888                                      5.73us -> 5.58us  0.97x
> bitmap_RGBA_8888_update                               5.78us -> 5.6us   0.97x
> bitmap_RGBA_8888_A_scale_bilerp                       70.7us -> 68us    0.96x
> bitmap_RGBA_8888_A_scale_bicubic                      23.7us -> 21.8us  0.92x
> bitmap_RGBA_8888_A                                    13.9us -> 10.9us  0.78x
> bitmap_RGBA_8888_A_source_opaque                      14us   -> 6.29us  0.45x
> bitmap_RGBA_8888_A_source_transparent                 14us   -> 3.65us  0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
>
> CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493
TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/873553003
| |
Not enabled by default, but this should get you SKPs, GMs etc for free to play with.
$ out/Debug/dm -w svgs --src gm skp --config svg
BUG=skia:
Review URL: https://codereview.chromium.org/892693002
| |
(patchset #2 id:20001 of https://codereview.chromium.org/860583002/)
SkAvoidXfermode has been moved into Android, so it is safe to remove.
Review URL: https://codereview.chromium.org/890893003
| |
to SkWindow.
Eventually, this will be moved to be a peer of SampleApp so it is compiled by the bots to avoid future bit rot.
Also ignore the Xcode auto-generated flag in CommandLineFlags, and remove the unused multiple-example part.
Review URL: https://codereview.chromium.org/890873003
| |
This adds SkSVGDevice and a small utility for converting SKP files to SVG (skp2svg).
R=reed@google.com,jcgregorio@google.com
BUG=skia:3368
Review URL: https://codereview.chromium.org/892533002
| |
Review URL: https://codereview.chromium.org/868243003
| |
SkProxyCanvas is redundant with SkNWayCanvas, and means another class
we have to keep in sync with the SkCanvas interface.
Remove tests which use an SkProxyCanvas.
Requires a change to chromium.
BUG=skia:3279
BUG=skia:500
Review URL: https://codereview.chromium.org/886813002
| |
BUG=skia:
TBR=
Review URL: https://codereview.chromium.org/888663002
| |
directory. Anything of value has been copied into the mainline.
The obsolete gyp files are also included, along with a pixman test that never functioned but accidentally referenced some of these deleted files.
Review URL: https://codereview.chromium.org/867213004
| |
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
Committed: https://skia.googlesource.com/skia/+/d5a7db4a867c7e6ccf8451a053d987b470099198
Review URL: https://codereview.chromium.org/845103005
| |
BUG=skia:
Review URL: https://codereview.chromium.org/885733002
| |
TBR=
BUG=skia:
Review URL: https://codereview.chromium.org/883053002
| |
https://codereview.chromium.org/845103005/)
Reason for revert:
One last try to fix mac perf regression
Original issue's description:
> GrBatchPrototype
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
>
> Committed: https://skia.googlesource.com/skia/+/d5a7db4a867c7e6ccf8451a053d987b470099198
TBR=bsalomon@google.com,kkinnunen@nvidia.com,joshualitt@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/877393002
| |
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
Review URL: https://codereview.chromium.org/845103005
| |
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
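The skip logic described above can be sketched in portable scalar C++. This is not the SSE4 code from the CL: the function names, the alpha-in-the-top-byte pixel layout, and the rounding used in the blend are all assumptions made for illustration. ANDing and ORing the 16 alphas plays the role that ptest plays in the intrinsics version.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Assumption: premultiplied 32-bit pixels with alpha in the top byte.
static inline uint32_t alpha_of(uint32_t px) { return px >> 24; }

// Scalar src-over for one premultiplied pixel:
// out = src + dst * (255 - srcAlpha) / 255, per channel, with rounding.
static inline uint32_t src_over(uint32_t src, uint32_t dst) {
    const uint32_t inv = 255 - alpha_of(src);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t s = (src >> shift) & 0xFF;
        const uint32_t d = (dst >> shift) & 0xFF;
        uint32_t c = s + (d * inv + 127) / 255;
        if (c > 255) c = 255;  // guards against inputs that are not truly premultiplied
        out |= c << shift;
    }
    return out;
}

// Hypothetical row blit that inspects 16 source pixels at a time and skips the
// per-pixel blend when they are all transparent or all opaque.
void blit_row_skipping(uint32_t* dst, const uint32_t* src, std::size_t count) {
    const std::size_t kChunk = 16;
    std::size_t i = 0;
    for (; i + kChunk <= count; i += kChunk) {
        uint32_t andAlpha = 0xFF;
        uint32_t orAlpha = 0;
        for (std::size_t j = 0; j < kChunk; ++j) {
            const uint32_t a = alpha_of(src[i + j]);
            andAlpha &= a;
            orAlpha |= a;
        }
        if (orAlpha == 0) {
            continue;  // all transparent: dst is left untouched
        }
        if (andAlpha == 0xFF) {
            std::memcpy(dst + i, src + i, kChunk * sizeof(uint32_t));  // all opaque: plain copy
            continue;
        }
        for (std::size_t j = 0; j < kChunk; ++j) {
            dst[i + j] = src_over(src[i + j], dst[i + j]);  // mixed alphas: blend each pixel
        }
    }
    for (; i < count; ++i) {
        dst[i] = src_over(src[i], dst[i]);  // tail shorter than one chunk
    }
}
```

The transparent shortcut is only equivalent to blending when the source is correctly premultiplied, so a pixel with zero alpha must also have zero color channels.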
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel-perfect compared to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two                 14us   -> 14.4us  1.03x
bitmap_RGBA_8888_A_source_stripes_three               14.3us -> 14.5us  1.01x
bitmap_RGBA_8888_scale_bilerp                         61.9us -> 62.2us  1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp  102us  -> 101us   0.99x
bitmap_RGBA_8888_scale_rotate_bilerp                  103us  -> 101us   0.99x
bitmap_RGBA_8888_scale                                18.4us -> 18.2us  0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic               71us   -> 70us    0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp           103us  -> 101us   0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp                112us  -> 109us   0.98x
bitmap_RGBA_8888_update_volatile                      5.72us -> 5.58us  0.98x
bitmap_RGBA_8888                                      5.73us -> 5.58us  0.97x
bitmap_RGBA_8888_update                               5.78us -> 5.6us   0.97x
bitmap_RGBA_8888_A_scale_bilerp                       70.7us -> 68us    0.96x
bitmap_RGBA_8888_A_scale_bicubic                      23.7us -> 21.8us  0.92x
bitmap_RGBA_8888_A                                    13.9us -> 10.9us  0.78x
bitmap_RGBA_8888_A_source_opaque                      14us   -> 6.29us  0.45x
bitmap_RGBA_8888_A_source_transparent                 14us   -> 3.65us  0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
Review URL: https://codereview.chromium.org/874863002
| |
Review URL: https://codereview.chromium.org/864043005
| |
https://codereview.chromium.org/845103005/)
Reason for revert:
creates large performance regression
Original issue's description:
> GrBatchPrototype
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
TBR=bsalomon@google.com,joshualitt@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/862823004
| |
This slide can be used to find and diagnose discrepancies between BW clipping and drawing.
BUG=skia:423834
Review URL: https://codereview.chromium.org/872363003
| |
This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.
BUG=skia:
Landing this without review as a mega-tryjob.
TBR=reed@google.com
Committed: https://skia.googlesource.com/skia/+/c98fe3aa4f8c97c462c0eb6d9106fc37e48d7f82
Review URL: https://codereview.chromium.org/870353003
| |
https://codereview.chromium.org/870353003/)
Reason for revert:
Android Makefiles broken
Original issue's description:
> Split src/opts source lists out of opts.gyp.
>
> This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.
>
> BUG=skia:
>
> Landing this without review as a mega-tryjob.
> TBR=reed@google.com
>
> Committed: https://skia.googlesource.com/skia/+/c98fe3aa4f8c97c462c0eb6d9106fc37e48d7f82
TBR=mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/880783002
| |
This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.
BUG=skia:
Landing this without review as a mega-tryjob.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/870353003
| |
#14 id:260001 of https://codereview.chromium.org/874863002/)
Reason for revert:
This kills Mac 10.6 bots.
FAILED: c++ -MMD -MF obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o.d -DSK_INTERNAL -DSK_GAMMA_SRGB -DSK_GAMMA_APPLY_TO_A8 -DSK_SCALAR_TO_FLOAT_EXCLUDED -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=1 -DSK_SUPPORT_GPU=1 -DSK_SUPPORT_OPENCL=0 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_BUILD_FOR_MAC -DSK_CRASH_HANDLER -DSK_DEVELOPER=1 -I../../src/core -I../../src/utils -I../../include/c -I../../include/config -I../../include/core -I../../include/pathops -I../../include/pipe -I../../include/utils/mac -I../../include/effects -O0 -gdwarf-2 -mmacosx-version-min=10.6 -arch x86_64 -mssse3 -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wno-unused-parameter -Wno-invalid-offsetof -msse4.1 -c ../../src/opts/SkBlitRow_opts_SSE4.cpp -o obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o
../../src/opts/SkBlitRow_opts_SSE4.cpp:15:27: warning: x86intrin.h: No such file or directory
../../src/opts/SkBlitRow_opts_SSE4.cpp: In function 'void S32A_Opaque_BlitRow32_SSE4(SkPMColor*, const SkPMColor*, int, U8CPU)':
../../src/opts/SkBlitRow_opts_SSE4.cpp:40: error: '_mm_testz_si128' was not declared in this scope
../../src/opts/SkBlitRow_opts_SSE4.cpp:45: error: '_mm_testc_si128' was not declared in this scope
Original issue's description:
> SSE4 opaque blend using intrinsics instead of assembly.
>
> Since we had such a hard time with the assembly versions of this blit (to the
> point that we have them completely disabled everywhere), I thought I'd take
> a shot at writing a version of the blit using intrinsics.
>
> The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
> to skip the blend when the 16 src pixels we consider each loop are all opaque
> or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
> all those alphas.
>
> It's worth looking to see if we can backport this type of logic to SSE2 using
> _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
>
> My local performance testing doesn't show this to be an unambiguous win
> (there are probably microbenchmarks and SKPs where we'd be better off just
> powering through the blend rather than looking at alphas), but the potential
> does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
>
> DM says it draws pixel-perfect compared to the old code.
>
> Microbenchmarks:
> bitmap_RGBA_8888_A_source_stripes_two                 14us   -> 14.4us  1.03x
> bitmap_RGBA_8888_A_source_stripes_three               14.3us -> 14.5us  1.01x
> bitmap_RGBA_8888_scale_bilerp                         61.9us -> 62.2us  1.01x
> bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp  102us  -> 101us   0.99x
> bitmap_RGBA_8888_scale_rotate_bilerp                  103us  -> 101us   0.99x
> bitmap_RGBA_8888_scale                                18.4us -> 18.2us  0.99x
> bitmap_RGBA_8888_A_scale_rotate_bicubic               71us   -> 70us    0.99x
> bitmap_RGBA_8888_update_scale_rotate_bilerp           103us  -> 101us   0.99x
> bitmap_RGBA_8888_A_scale_rotate_bilerp                112us  -> 109us   0.98x
> bitmap_RGBA_8888_update_volatile                      5.72us -> 5.58us  0.98x
> bitmap_RGBA_8888                                      5.73us -> 5.58us  0.97x
> bitmap_RGBA_8888_update                               5.78us -> 5.6us   0.97x
> bitmap_RGBA_8888_A_scale_bilerp                       70.7us -> 68us    0.96x
> bitmap_RGBA_8888_A_scale_bicubic                      23.7us -> 21.8us  0.92x
> bitmap_RGBA_8888_A                                    13.9us -> 10.9us  0.78x
> bitmap_RGBA_8888_A_source_opaque                      14us   -> 6.29us  0.45x
> bitmap_RGBA_8888_A_source_transparent                 14us   -> 3.65us  0.26x
>
> Running over our ~70 SKP web page captures, this looks like we spend 0.7x
> the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
> be a decent predictor of real-world impact.
>
> BUG=chromium:399842
>
> Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874033004
| |
Since we had such a hard time with the assembly versions of this blit (to the
point that we have them completely disabled everywhere), I thought I'd take
a shot at writing a version of the blit using intrinsics.
The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
to skip the blend when the 16 src pixels we consider each loop are all opaque
or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
all those alphas.
It's worth looking to see if we can backport this type of logic to SSE2 using
_mm_movemask_epi8, or up to 32 pixels at a time using AVX.
My local performance testing doesn't show this to be an unambiguous win
(there are probably microbenchmarks and SKPs where we'd be better off just
powering through the blend rather than looking at alphas), but the potential
does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
DM says it draws pixel-perfect compared to the old code.
Microbenchmarks:
bitmap_RGBA_8888_A_source_stripes_two                 14us   -> 14.4us  1.03x
bitmap_RGBA_8888_A_source_stripes_three               14.3us -> 14.5us  1.01x
bitmap_RGBA_8888_scale_bilerp                         61.9us -> 62.2us  1.01x
bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp  102us  -> 101us   0.99x
bitmap_RGBA_8888_scale_rotate_bilerp                  103us  -> 101us   0.99x
bitmap_RGBA_8888_scale                                18.4us -> 18.2us  0.99x
bitmap_RGBA_8888_A_scale_rotate_bicubic               71us   -> 70us    0.99x
bitmap_RGBA_8888_update_scale_rotate_bilerp           103us  -> 101us   0.99x
bitmap_RGBA_8888_A_scale_rotate_bilerp                112us  -> 109us   0.98x
bitmap_RGBA_8888_update_volatile                      5.72us -> 5.58us  0.98x
bitmap_RGBA_8888                                      5.73us -> 5.58us  0.97x
bitmap_RGBA_8888_update                               5.78us -> 5.6us   0.97x
bitmap_RGBA_8888_A_scale_bilerp                       70.7us -> 68us    0.96x
bitmap_RGBA_8888_A_scale_bicubic                      23.7us -> 21.8us  0.92x
bitmap_RGBA_8888_A                                    13.9us -> 10.9us  0.78x
bitmap_RGBA_8888_A_source_opaque                      14us   -> 6.29us  0.45x
bitmap_RGBA_8888_A_source_transparent                 14us   -> 3.65us  0.26x
Running over our ~70 SKP web page captures, this looks like we spend 0.7x
the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
be a decent predictor of real-world impact.
BUG=chromium:399842
Review URL: https://codereview.chromium.org/874863002
| |
BUG=skia:
Review URL: https://codereview.chromium.org/845103005
| |
BUG=skia:
Review URL: https://codereview.chromium.org/873333004
| |
BUG=skia:
TBR=
Review URL: https://codereview.chromium.org/873293003
| |
Review URL: https://codereview.chromium.org/834303005
| |
BUG=skia:1366
For the added bench, the collapsing makes the bench take:
- 70% of the time for CPU rendering of 3 consecutive matrix filters
- almost no change in the GPU rendering of the matrix filters
- 50% of the time for CPU and GPU rendering of 3 consecutive table filters
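The subject line of this commit is missing from the log, so the following assumes that "collapsing" means replacing a chain of consecutive color filters with a single equivalent filter. For table filters that amounts to composing lookup tables; the sketch below shows the idea for one channel, with a hypothetical Table type and compose_tables function that are not Skia API.

```cpp
#include <array>
#include <cstdint>

// One 256-entry lookup table for a single 8-bit channel (hypothetical type).
using Table = std::array<uint8_t, 256>;

// Builds a table whose single lookup equals applying `first`, then `second`.
Table compose_tables(const Table& first, const Table& second) {
    Table out{};
    for (int i = 0; i < 256; ++i) {
        out[i] = second[first[i]];
    }
    return out;
}
```

Three consecutive table filters would collapse with two calls, for example compose_tables(compose_tables(a, b), c), leaving one lookup per channel at draw time instead of three.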
Review URL: https://codereview.chromium.org/776673002
| |
Fold alpha into the inner savelayer in savelayer-savelayer-restore
patterns such as this:
  SaveLayer (non-opaque)
    Save
      ClipRect
      SaveLayer
      Restore
    Restore
  Restore
Current Blink generates these, for example, for SVG content such as this:
  <path style="opacity:0.5; filter:url(#blur_filter)"/>
The outer save layer is due to the opacity and the inner one is due to
the blur filter being implemented with a picture image filter.
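A minimal sketch of the fold, assuming a simplified command list where each SaveLayer carries only an alpha. The Op, Cmd and fold_nested_savelayer names are hypothetical, and a real optimization would also need to check that the outer layer's paint has no effect other than its alpha.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Op { kSaveLayer, kSave, kClipRect, kRestore, kNoOp };

struct Cmd {
    Op op;
    uint8_t alpha;  // only meaningful for kSaveLayer in this sketch
};

// Multiplies two 8-bit alphas, rounding to nearest.
static uint8_t mul_alpha(uint8_t a, uint8_t b) {
    return static_cast<uint8_t>((a * b + 127) / 255);
}

// If the seven commands starting at index i match the pattern
// SaveLayer, Save, ClipRect, SaveLayer, Restore, Restore, Restore,
// folds the outer layer's alpha into the inner layer and turns the outer
// SaveLayer and its matching Restore into no-ops. Returns true on a match.
bool fold_nested_savelayer(std::vector<Cmd>& cmds, std::size_t i) {
    static const Op kPattern[] = {
        Op::kSaveLayer, Op::kSave,    Op::kClipRect, Op::kSaveLayer,
        Op::kRestore,   Op::kRestore, Op::kRestore,
    };
    const std::size_t kLen = sizeof(kPattern) / sizeof(kPattern[0]);
    if (i > cmds.size() || cmds.size() - i < kLen) {
        return false;
    }
    for (std::size_t k = 0; k < kLen; ++k) {
        if (cmds[i + k].op != kPattern[k]) {
            return false;
        }
    }
    cmds[i + 3].alpha = mul_alpha(cmds[i].alpha, cmds[i + 3].alpha);
    cmds[i].op = Op::kNoOp;             // outer SaveLayer
    cmds[i + kLen - 1].op = Op::kNoOp;  // its matching Restore
    return true;
}
```

After the fold, the remaining Save, ClipRect, SaveLayer, Restore, Restore sequence is still balanced, and one fewer layer has to be allocated.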
Reduces layers in desk_carsvg.skp testcase from 115 to 78.
BUG=skia:3119
Review URL: https://codereview.chromium.org/835973005
| |
BUG=skia:
Review URL: https://codereview.chromium.org/855473002
| |
TBR=mtklein@google.com
Review URL: https://codereview.chromium.org/868333002
| |
Review URL: https://codereview.chromium.org/872543002
| |
- add -Wsign-compare, which has been catching useful issues for Kimmo;
- add -Winit-self and -Wpointer-arith to Mac builds so everyone's using
the same flags;
- try removing -Wno-uninitialized. This was only needed for the old 10.6
compiler, whose warnings are now set as non-errors.
BUG=skia:
Review URL: https://codereview.chromium.org/872793002
| |
Review URL: https://codereview.chromium.org/861323002
| |
Review URL: https://codereview.chromium.org/858123002
| |
BUG=skia:
NOTREECHECKS=True
Review URL: https://codereview.chromium.org/868783002
| |
BUG=skia:
Review URL: https://codereview.chromium.org/858343002
| |
BUG=skia:
Review URL: https://codereview.chromium.org/869463002
| |
To compile SkCondVar, we already require either pthreads or Windows. This
simplifies that code to not need SK_USE_POSIX_THREADS to be explicitly defined.
We'll just look to see if we're targeting Windows, and if not, assume pthreads.
Both before and after this CL, that code will fail to compile if we're not on
Windows and don't have pthreads.
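The selection described above might look roughly like this. The macro and function names are hypothetical, and _WIN32 stands in for whichever Windows-detection macro the real code checks.

```cpp
// Hypothetical names; _WIN32 is an assumed stand-in for the real Windows check.
#if defined(_WIN32)
    #define SKETCH_USE_WINDOWS_THREADS 1
#else
    #define SKETCH_USE_WINDOWS_THREADS 0
    // Not Windows, so assume pthreads. If the header is missing, the build
    // fails right here, with no explicit opt-in macro needed.
    #include <pthread.h>
#endif

// Reports which backend the preprocessor picked.
constexpr bool sketch_uses_pthreads() {
    return !SKETCH_USE_WINDOWS_THREADS;
}
```

The behavior matches the description: a non-Windows target without pthreads still fails to compile, only now at the include rather than at a missing define.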
BUG=skia:
Review URL: https://codereview.chromium.org/869443003
| |
This basically takes out the Windows-only hacks and promotes them to
cross-platform behavior driven by --gpu_threading.
- When --gpu_threading is false (the default), this puts GPU tasks and tests
together in the same GPU enclave. They all run serially.
- When --gpu_threading is true, both the tests and the tasks run totally
independently, just like the thread-safe CPU-bound work.
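The two modes can be summarized as a scheduling decision. The sketch below is only an illustration of that decision: the Enclave and Work types and the pick_enclave function are hypothetical and do not come from DM's source.

```cpp
#include <string>

// Where a piece of work is allowed to run (hypothetical type).
enum class Enclave {
    kAnyThread,  // may run in parallel with everything else
    kGpu,        // shared GPU enclave: members run serially, one at a time
};

struct Work {
    std::string name;
    bool usesGpu;
};

// With gpuThreading off (the default), all GPU-bound tasks and tests share the
// serial GPU enclave. With it on, they are scheduled like thread-safe CPU work.
Enclave pick_enclave(const Work& work, bool gpuThreading) {
    if (work.usesGpu && !gpuThreading) {
        return Enclave::kGpu;
    }
    return Enclave::kAnyThread;
}
```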
BUG=skia:3255
Review URL: https://codereview.chromium.org/847273005
| |
BUG=skia:3255
Review URL: https://codereview.chromium.org/859303003
|