path: root/gyp
Commit message | Author | Age
* Revert of Build in C++11 mode on Unix-like bots. (patchset #4 id:60001 of https://codereview.chromium.org/868233008/) | mtklein | 2015-02-04
  Reason for revert:
  Perf-Ubuntu12, and Test-Ubuntu12, Build-Nacl all too old. Android and Chrome OS builders look ok. Android testers look ok. Chrome OS testers haven't run yet.

  Original issue's description:
  > Build in C++11 mode on Unix-like bots.
  >
  > Mac and Windows bots are already building in C++11 mode.
  > This turns on the rest, mostly to see what work remains.
  >
  > This will probably break a few bots. It'd be nice if we could let those
  > all come in as red before reverting this so I can see the full list to fix.
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/779e49602a9c8f4d2799504822e01bcafbcaa534

  TBR=stephana@google.com,mtklein@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/879803003
* Build in C++11 mode on Unix-like bots. | mtklein | 2015-02-04
  Mac and Windows bots are already building in C++11 mode.
  This turns on the rest, mostly to see what work remains.

  This will probably break a few bots. It'd be nice if we could let those
  all come in as red before reverting this so I can see the full list to fix.

  BUG=skia:
  Review URL: https://codereview.chromium.org/868233008
* Revert "Move DstCopy on gpu into the GrXferProcessor." | egdaniel | 2015-02-04
  This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.

  TBR=joshualitt@google.com,bsalomon@google.com
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/3e9dfdb3784c0cbfecf7589a74aa9aff7ef40abd
  Review URL: https://codereview.chromium.org/896163003
* Revert of Revert "Move DstCopy on gpu into the GrXferProcessor." (patchset #1 id:1 of https://codereview.chromium.org/896163003/) | egdaniel | 2015-02-04
  Reason for revert:
  failed on my manual revert

  Original issue's description:
  > Revert "Move DstCopy on gpu into the GrXferProcessor."
  >
  > This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.
  >
  > TBR=joshualitt@google.com,bsalomon@google.com
  > NOPRESUBMIT=true
  > NOTREECHECKS=true
  > NOTRY=true
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/3e9dfdb3784c0cbfecf7589a74aa9aff7ef40abd

  TBR=
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/900913002
* Revert "Move DstCopy on gpu into the GrXferProcessor." | egdaniel | 2015-02-04
  This reverts commit 74a11753604768bf461b80cabb66060e8564d82c.

  TBR=joshualitt@google.com,bsalomon@google.com
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/896163003
* Move DstCopy on gpu into the GrXferProcessor. | egdaniel | 2015-02-03
  BUG=skia:
  Review URL: https://codereview.chromium.org/885923002
* Build in C++11 mode on Macs. | mtklein | 2015-02-03
  Build trybots are not triggering.
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/894773002
* remove remaining parts of SkExample | caryclark | 2015-02-03
  R=reed@google.com
  Review URL: https://codereview.chromium.org/895103002
* move HelloWorld to be a peer of SampleApp | caryclark | 2015-02-02
  This is working towards making a simple example part of the buildbot compile step and removing SkExamples from the experimental directory. This works on Mac, Windows, and Linux but isn't complete for Android, ChromeOS and iOS.
  Review URL: https://codereview.chromium.org/886413004
* Atomics overhaul. | mtklein | 2015-02-02
  This merges and refactors SkAtomics.h and SkBarriers.h into SkAtomics.h and some ports/ implementations. The major new feature is that we can express memory orders explicitly rather than only through comments.

  The porting layer is reduced to four template functions:
    - sk_atomic_load
    - sk_atomic_store
    - sk_atomic_fetch_add
    - sk_atomic_compare_exchange
  From those four we can reconstruct all our previous sk_atomic_foo.

  There are three ports:
    - SkAtomics_std: uses C++11 <atomic>, used with MSVC
    - SkAtomics_atomic: uses newer GCC/Clang intrinsics, used on not-MSVC where possible
    - SkAtomics_sync: uses older GCC/Clang intrinsics, used where SkAtomics_atomic not supported

  No public API changes.

  TBR=reed@google.com
  BUG=skia:
  Review URL: https://codereview.chromium.org/896553002
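For illustration only, here is a minimal sketch of what a four-function porting layer over C++11 `<atomic>` could look like. The commit message names the four functions but not their signatures, so the parameter lists, the use of `std::atomic<T>*`, and the helper `sketch_atomic_inc` are all assumptions rather than Skia's actual API.

```cpp
#include <atomic>
#include <cassert>

// Assumed signatures: the log only gives the four function names.
template <typename T>
T sk_atomic_load(const std::atomic<T>* ptr,
                 std::memory_order mo = std::memory_order_seq_cst) {
    return ptr->load(mo);
}

template <typename T>
void sk_atomic_store(std::atomic<T>* ptr, T val,
                     std::memory_order mo = std::memory_order_seq_cst) {
    ptr->store(val, mo);
}

// Returns the value held before the addition.
template <typename T>
T sk_atomic_fetch_add(std::atomic<T>* ptr, T val,
                      std::memory_order mo = std::memory_order_seq_cst) {
    return ptr->fetch_add(val, mo);
}

// On failure, *expected is updated to the value actually observed.
template <typename T>
bool sk_atomic_compare_exchange(std::atomic<T>* ptr, T* expected, T desired,
                                std::memory_order success = std::memory_order_seq_cst,
                                std::memory_order failure = std::memory_order_seq_cst) {
    assert(expected != nullptr);
    return ptr->compare_exchange_strong(*expected, desired, success, failure);
}

// Hypothetical older-style helper rebuilt from the primitives above,
// with its memory order stated in code instead of in a comment.
inline int sketch_atomic_inc(std::atomic<int>* ptr) {
    return sk_atomic_fetch_add(ptr, 1, std::memory_order_relaxed);
}
```

The point of the design is that every call site names its memory order, so a relaxed counter increment and a sequentially consistent flag update are distinguishable in the source.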
* Revert of Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #1 id:1 of https://codereview.chromium.org/873553003/) | stephana | 2015-02-02
  Reason for revert:
  Reverted the wrong CL.

  Original issue's description:
  > Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #16 id:300001 of https://codereview.chromium.org/874863002/)
  >
  > Reason for revert:
  > This causes a bug on the 'hittestpath' GM on MacMini 4,1
  >
  > See:
  >
  > https://gold.skia.org/#/triage/hittestpath?head=0
  >
  > for details.
  >
  > Original issue's description:
  > > SSE4 opaque blend using intrinsics instead of assembly.
  > >
  > > Since we had such a hard time with the assembly versions of this blit (to the
  > > point that we have them completely disabled everywhere), I thought I'd take
  > > a shot at writing a version of the blit using intrinsics.
  > >
  > > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
  > > to skip the blend when the 16 src pixels we consider each loop are all opaque
  > > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
  > > all those alphas.
  > >
  > > It's worth looking to see if we can backport this type of logic to SSE2 using
  > > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
  > >
  > > My local performance testing doesn't show this to be an unambiguous win
  > > (there are probably microbenchmarks and SKPs where we'd be better off just
  > > powering through the blend rather than looking at alphas), but the potential
  > > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
  > >
  > > DM says it draws pixel perfect compare to the old code.
  > >
  > > Microbenchmarks:
  > > bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
  > > bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
  > > bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
  > > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
  > > bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
  > > bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
  > > bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
  > > bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
  > > bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
  > > bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
  > > bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
  > > bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
  > > bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
  > > bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
  > > bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
  > > bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
  > > bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
  > >
  > > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
  > > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
  > > be a decent predictor of real-world impact.
  > >
  > > BUG=chromium:399842
  > >
  > > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
  > >
  > > CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
  > >
  > > Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493
  >
  > TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
  > NOPRESUBMIT=true
  > NOTREECHECKS=true
  > NOTRY=true
  > BUG=chromium:399842
  >
  > Committed: https://skia.googlesource.com/skia/+/4988891a1173cd405bf1c1dd3a3668c451f45e4c

  TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=chromium:399842
  Review URL: https://codereview.chromium.org/894083002
* Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #16 id:300001 of https://codereview.chromium.org/874863002/) | stephana | 2015-02-02
  Reason for revert:
  This causes a bug on the 'hittestpath' GM on MacMini 4,1

  See:

  https://gold.skia.org/#/triage/hittestpath?head=0

  for details.

  Original issue's description:
  > SSE4 opaque blend using intrinsics instead of assembly.
  >
  > Since we had such a hard time with the assembly versions of this blit (to the
  > point that we have them completely disabled everywhere), I thought I'd take
  > a shot at writing a version of the blit using intrinsics.
  >
  > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
  > to skip the blend when the 16 src pixels we consider each loop are all opaque
  > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
  > all those alphas.
  >
  > It's worth looking to see if we can backport this type of logic to SSE2 using
  > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
  >
  > My local performance testing doesn't show this to be an unambiguous win
  > (there are probably microbenchmarks and SKPs where we'd be better off just
  > powering through the blend rather than looking at alphas), but the potential
  > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
  >
  > DM says it draws pixel perfect compare to the old code.
  >
  > Microbenchmarks:
  > bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
  > bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
  > bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
  > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
  > bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
  > bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
  > bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
  > bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
  > bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
  > bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
  > bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
  > bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
  > bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
  > bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
  > bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
  > bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
  > bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
  >
  > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
  > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
  > be a decent predictor of real-world impact.
  >
  > BUG=chromium:399842
  >
  > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
  >
  > CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
  >
  > Committed: https://skia.googlesource.com/skia/+/6dbfb21a6c88af6d94e8c823c3ad559f1a41b493

  TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=chromium:399842
  Review URL: https://codereview.chromium.org/873553003
* SVG backend in DM | mtklein | 2015-01-31
  Not enabled by default, but this should get you SKPs, GMs etc for free to play with.

  $ out/Debug/dm -w svgs --src gm skp --config svg

  BUG=skia:
  Review URL: https://codereview.chromium.org/892693002
* Reland "remove unused SkAvoidXfermode" | scroggo | 2015-01-30
  (patchset #2 id:20001 of https://codereview.chromium.org/860583002/)

  SkAvoidXfermode has been moved into Android, so it is safe to remove.
  Review URL: https://codereview.chromium.org/890893003
* First cut at cleaning up Sergio's example code and moving some common code to SkWindow. | caryclark | 2015-01-30
  Eventually, this will be moved to be a peer of SampleApp so it is compiled by the bots to avoid future bit rot.

  Also ignore XCode auto-generated flag in CommandLineFlags, and remove the unused multiple-example part.
  Review URL: https://codereview.chromium.org/890873003
* Initial SVG backend stubbing | fmalita | 2015-01-30
  This adds SkSVGDevice and a small utility for converting SKP files to SVG (skp2svg).

  R=reed@google.com,jcgregorio@google.com
  BUG=skia:3368
  Review URL: https://codereview.chromium.org/892533002
* Define SK_OVERRIDE when building for Android framework. | scroggo | 2015-01-30
  Review URL: https://codereview.chromium.org/868243003
* Remove SkProxyCanvas. | scroggo | 2015-01-29
  SkProxyCanvas is redundant with SkNWayCanvas, and means another class we have to keep in sync with the SkCanvas interface.

  Remove tests which use an SkProxyCanvas.

  Requires a change to chromium.

  BUG=skia:3279
  BUG=skia:500
  Review URL: https://codereview.chromium.org/886813002
* add new gm for SkPath::addArc() | reed | 2015-01-29
  BUG=skia:
  TBR=
  Review URL: https://codereview.chromium.org/888663002
* The original instantiation of pathops was in the experimental/Intersection directory. | caryclark | 2015-01-29
  Anything of value has been copied into the mainline. The obsolete gyp files are also included, along with a pixman test that never functioned but accidentally referenced some of these deleted files.
  Review URL: https://codereview.chromium.org/867213004
* GrBatchPrototype | joshualitt | 2015-01-28
  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
  Committed: https://skia.googlesource.com/skia/+/d5a7db4a867c7e6ccf8451a053d987b470099198
  Review URL: https://codereview.chromium.org/845103005
* Fold gmtoskp into DM, as --src gm --config skp. | mtklein | 2015-01-28
  BUG=skia:
  Review URL: https://codereview.chromium.org/885733002
* dstread gm | joshualitt | 2015-01-28
  TBR=
  BUG=skia:
  Review URL: https://codereview.chromium.org/883053002
* Revert of GrBatchPrototype (patchset #32 id:630001 of https://codereview.chromium.org/845103005/) | joshualitt | 2015-01-28
  Reason for revert:
  One last try to fix mac perf regression

  Original issue's description:
  > GrBatchPrototype
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
  >
  > Committed: https://skia.googlesource.com/skia/+/d5a7db4a867c7e6ccf8451a053d987b470099198

  TBR=bsalomon@google.com,kkinnunen@nvidia.com,joshualitt@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/877393002
* GrBatchPrototype | joshualitt | 2015-01-27
  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4
  Review URL: https://codereview.chromium.org/845103005
* SSE4 opaque blend using intrinsics instead of assembly. | mtklein | 2015-01-27
  Since we had such a hard time with the assembly versions of this blit (to the
  point that we have them completely disabled everywhere), I thought I'd take
  a shot at writing a version of the blit using intrinsics.

  The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
  to skip the blend when the 16 src pixels we consider each loop are all opaque
  or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
  all those alphas.

  It's worth looking to see if we can backport this type of logic to SSE2 using
  _mm_movemask_epi8, or up to 32 pixels at a time using AVX.

  My local performance testing doesn't show this to be an unambiguous win
  (there are probably microbenchmarks and SKPs where we'd be better off just
  powering through the blend rather than looking at alphas), but the potential
  does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)

  DM says it draws pixel perfect compare to the old code.

  Microbenchmarks:
  bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
  bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
  bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
  bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
  bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
  bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
  bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
  bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
  bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
  bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
  bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
  bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
  bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
  bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
  bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
  bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
  bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x

  Running over our ~70 SKP web page captures, this looks like we spend 0.7x
  the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
  be a decent predictor of real-world impact.

  BUG=chromium:399842
  Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785
  CQ_EXTRA_TRYBOTS=client.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
  Review URL: https://codereview.chromium.org/874863002
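The skip logic this message describes can be sketched in portable scalar C++. This is not the SSE4 code from the CL: it swaps the `ptest` intrinsics for a plain AND/OR reduction over the 16 alphas so that it compiles anywhere. The function names, the premultiplied pixel layout with alpha in the top byte, and the rounding used in the blend are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed layout: premultiplied 32-bit pixels, alpha in the top byte.
static inline uint32_t alpha_of(uint32_t c) { return c >> 24; }

// Scalar src-over for one premultiplied pixel, channel by channel:
// out = src + dst * (255 - srcAlpha) / 255, rounded and clamped.
static inline uint32_t srcover_one(uint32_t src, uint32_t dst) {
    const uint32_t inv = 255 - alpha_of(src);
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t s = (src >> shift) & 0xFF;
        const uint32_t d = (dst >> shift) & 0xFF;
        const uint32_t v = s + (d * inv + 127) / 255;
        out |= (v > 255 ? 255u : v) << shift;
    }
    return out;
}

// Hypothetical row blit: inspect 16 source pixels at a time and skip the
// per-pixel blend when the block is entirely transparent or entirely opaque.
void blit_row_sketch(uint32_t* dst, const uint32_t* src, size_t count) {
    assert(dst != nullptr && src != nullptr);
    const size_t kBlock = 16;
    size_t i = 0;
    for (; i + kBlock <= count; i += kBlock) {
        uint32_t andA = 0xFF, orA = 0;
        for (size_t j = 0; j < kBlock; ++j) {
            const uint32_t a = alpha_of(src[i + j]);
            andA &= a;
            orA |= a;
        }
        if (orA == 0) {
            continue;                      // all transparent: dst is unchanged
        }
        if (andA == 0xFF) {                // all opaque: plain copy
            for (size_t j = 0; j < kBlock; ++j) dst[i + j] = src[i + j];
            continue;
        }
        for (size_t j = 0; j < kBlock; ++j) {
            dst[i + j] = srcover_one(src[i + j], dst[i + j]);
        }
    }
    for (; i < count; ++i) {               // tail: fewer than 16 pixels left
        dst[i] = srcover_one(src[i], dst[i]);
    }
}
```

The two fast paths correspond to the `source_transparent` and `source_opaque` benchmark rows, which show the largest wins; rows with mixed alphas pay for the extra alpha scan and then blend anyway, which is consistent with the near-1.0x results listed for them.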
* Setup Android framework builds to use the appropriate shared lib defines. | djsollen | 2015-01-27
  Review URL: https://codereview.chromium.org/864043005
* Revert of GrBatchPrototype (patchset #30 id:570001 of https://codereview.chromium.org/845103005/) | joshualitt | 2015-01-27
  Reason for revert:
  creates large performance regression

  Original issue's description:
  > GrBatchPrototype
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/d15e4e45374275c045572b304c229237c4a82be4

  TBR=bsalomon@google.com,joshualitt@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/862823004
* Add ClipDrawMatch SampleApp slide | robertphillips | 2015-01-27
  This slide can be used to find and diagnose discrepancies between BW clipping and drawing.

  BUG=skia:423834
  Review URL: https://codereview.chromium.org/872363003
* Split src/opts source lists out of opts.gyp. | mtklein | 2015-01-26
  This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.

  BUG=skia:

  Landing this without review as a mega-tryjob.
  TBR=reed@google.com

  Committed: https://skia.googlesource.com/skia/+/c98fe3aa4f8c97c462c0eb6d9106fc37e48d7f82
  Review URL: https://codereview.chromium.org/870353003
* Revert of Split src/opts source lists out of opts.gyp. (patchset #1 id:1 of https://codereview.chromium.org/870353003/) | mtklein | 2015-01-26
  Reason for revert:
  Android Makefiles broken

  Original issue's description:
  > Split src/opts source lists out of opts.gyp.
  >
  > This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.
  >
  > BUG=skia:
  >
  > Landing this without review as a mega-tryjob.
  > TBR=reed@google.com
  >
  > Committed: https://skia.googlesource.com/skia/+/c98fe3aa4f8c97c462c0eb6d9106fc37e48d7f82

  TBR=mtklein@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/880783002
* Split src/opts source lists out of opts.gyp. | mtklein | 2015-01-26
  This should make it easier to keep our opts.gyp in sync with Chrome's GYP and GN.

  BUG=skia:

  Landing this without review as a mega-tryjob.
  TBR=reed@google.com
  Review URL: https://codereview.chromium.org/870353003
* Revert of SSE4 opaque blend using intrinsics instead of assembly. (patchset #14 id:260001 of https://codereview.chromium.org/874863002/) | bungeman | 2015-01-26
  Reason for revert:
  This kills Mac 10.6 bots.

  FAILED: c++ -MMD -MF obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o.d -DSK_INTERNAL -DSK_GAMMA_SRGB -DSK_GAMMA_APPLY_TO_A8 -DSK_SCALAR_TO_FLOAT_EXCLUDED -DSK_ALLOW_STATIC_GLOBAL_INITIALIZERS=1 -DSK_SUPPORT_GPU=1 -DSK_SUPPORT_OPENCL=0 -DSK_FORCE_DISTANCE_FIELD_TEXT=0 -DSK_BUILD_FOR_MAC -DSK_CRASH_HANDLER -DSK_DEVELOPER=1 -I../../src/core -I../../src/utils -I../../include/c -I../../include/config -I../../include/core -I../../include/pathops -I../../include/pipe -I../../include/utils/mac -I../../include/effects -O0 -gdwarf-2 -mmacosx-version-min=10.6 -arch x86_64 -mssse3 -Wall -Wextra -Winit-self -Wpointer-arith -Wsign-compare -Wno-unused-parameter -Wno-invalid-offsetof -msse4.1 -c ../../src/opts/SkBlitRow_opts_SSE4.cpp -o obj/src/opts/opts_sse4.SkBlitRow_opts_SSE4.o
  ../../src/opts/SkBlitRow_opts_SSE4.cpp:15:27: warning: x86intrin.h: No such file or directory
  ../../src/opts/SkBlitRow_opts_SSE4.cpp: In function 'void S32A_Opaque_BlitRow32_SSE4(SkPMColor*, const SkPMColor*, int, U8CPU)':
  ../../src/opts/SkBlitRow_opts_SSE4.cpp:40: error: '_mm_testz_si128' was not declared in this scope
  ../../src/opts/SkBlitRow_opts_SSE4.cpp:45: error: '_mm_testc_si128' was not declared in this scope

  Original issue's description:
  > SSE4 opaque blend using intrinsics instead of assembly.
  >
  > Since we had such a hard time with the assembly versions of this blit (to the
  > point that we have them completely disabled everywhere), I thought I'd take
  > a shot at writing a version of the blit using intrinsics.
  >
  > The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
  > to skip the blend when the 16 src pixels we consider each loop are all opaque
  > or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
  > all those alphas.
  >
  > It's worth looking to see if we can backport this type of logic to SSE2 using
  > _mm_movemask_epi8, or up to 32 pixels at a time using AVX.
  >
  > My local performance testing doesn't show this to be an unambiguous win
  > (there are probably microbenchmarks and SKPs where we'd be better off just
  > powering through the blend rather than looking at alphas), but the potential
  > does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)
  >
  > DM says it draws pixel perfect compare to the old code.
  >
  > Microbenchmarks:
  > bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
  > bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
  > bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
  > bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
  > bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
  > bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
  > bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
  > bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
  > bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
  > bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
  > bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
  > bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
  > bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
  > bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
  > bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
  > bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
  > bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x
  >
  > Running over our ~70 SKP web page captures, this looks like we spend 0.7x
  > the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
  > be a decent predictor of real-world impact.
  >
  > BUG=chromium:399842
  >
  > Committed: https://skia.googlesource.com/skia/+/04bc91b972417038fecfa87c484771eac2b9b785

  TBR=henrik.smiding@intel.com,mtklein@google.com,herb@google.com,reed@google.com,thakis@chromium.org,mtklein@chromium.org
  NOPRESUBMIT=true
  NOTREECHECKS=true
  NOTRY=true
  BUG=chromium:399842
  Review URL: https://codereview.chromium.org/874033004
* SSE4 opaque blend using intrinsics instead of assembly. | mtklein | 2015-01-26
  Since we had such a hard time with the assembly versions of this blit (to the
  point that we have them completely disabled everywhere), I thought I'd take
  a shot at writing a version of the blit using intrinsics.

  The key feature of SSE4 we're exploiting is that we can use ptest (_mm_test*)
  to skip the blend when the 16 src pixels we consider each loop are all opaque
  or all transparent. _mm_shuffle_epi8 from SSSE3 also lends a hand to extract
  all those alphas.

  It's worth looking to see if we can backport this type of logic to SSE2 using
  _mm_movemask_epi8, or up to 32 pixels at a time using AVX.

  My local performance testing doesn't show this to be an unambiguous win
  (there are probably microbenchmarks and SKPs where we'd be better off just
  powering through the blend rather than looking at alphas), but the potential
  does seem tantalizing enough to let skiaperf vet it on the bots. (< 1.0x is a win.)

  DM says it draws pixel perfect compare to the old code.

  Microbenchmarks:
  bitmap_RGBA_8888_A_source_stripes_two 14us -> 14.4us 1.03x
  bitmap_RGBA_8888_A_source_stripes_three 14.3us -> 14.5us 1.01x
  bitmap_RGBA_8888_scale_bilerp 61.9us -> 62.2us 1.01x
  bitmap_RGBA_8888_update_volatile_scale_rotate_bilerp 102us -> 101us 0.99x
  bitmap_RGBA_8888_scale_rotate_bilerp 103us -> 101us 0.99x
  bitmap_RGBA_8888_scale 18.4us -> 18.2us 0.99x
  bitmap_RGBA_8888_A_scale_rotate_bicubic 71us -> 70us 0.99x
  bitmap_RGBA_8888_update_scale_rotate_bilerp 103us -> 101us 0.99x
  bitmap_RGBA_8888_A_scale_rotate_bilerp 112us -> 109us 0.98x
  bitmap_RGBA_8888_update_volatile 5.72us -> 5.58us 0.98x
  bitmap_RGBA_8888 5.73us -> 5.58us 0.97x
  bitmap_RGBA_8888_update 5.78us -> 5.6us 0.97x
  bitmap_RGBA_8888_A_scale_bilerp 70.7us -> 68us 0.96x
  bitmap_RGBA_8888_A_scale_bicubic 23.7us -> 21.8us 0.92x
  bitmap_RGBA_8888_A 13.9us -> 10.9us 0.78x
  bitmap_RGBA_8888_A_source_opaque 14us -> 6.29us 0.45x
  bitmap_RGBA_8888_A_source_transparent 14us -> 3.65us 0.26x

  Running over our ~70 SKP web page captures, this looks like we spend 0.7x
  the time in S32A_Opaque_BlitRow compared to the SSE2 version, which should
  be a decent predictor of real-world impact.

  BUG=chromium:399842
  Review URL: https://codereview.chromium.org/874863002
* GrBatchPrototype | joshualitt | 2015-01-26
  BUG=skia:
  Review URL: https://codereview.chromium.org/845103005
* s/sk_tools::DrawCheckerboard/sk_tool_utils::draw_checkerboard/ | halcanary | 2015-01-26
  BUG=skia:
  Review URL: https://codereview.chromium.org/873333004
* add bench for building mipmaps | reed | 2015-01-26
  BUG=skia:
  TBR=
  Review URL: https://codereview.chromium.org/873293003
* Factor out checkerboard function in gm and sampleapp into tools. | halcanary | 2015-01-26
  Review URL: https://codereview.chromium.org/834303005
* Collapse consecutive SkTableColorFilters | cwallez | 2015-01-26
  BUG=skia:1366

  For the added bench, the collapsing makes the bench take:
    - 70% of the time for CPU rendering of 3 consecutive matrix filters
    - almost no change in the GPU rendering of the matrix filters
    - 50% of the time for CPU and GPU rendering of 3 consecutive table filters
  Review URL: https://codereview.chromium.org/776673002
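As a rough illustration of why consecutive table filters can be collapsed: two 256-entry lookup tables applied one after the other are equivalent to a single table built by indexing one through the other. The sketch below shows that composition for one channel; `Table` and `compose_tables` are hypothetical names, and nothing here is taken from the SkTableColorFilter implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// One channel's lookup table: maps an 8-bit input to an 8-bit output.
using Table = std::array<uint8_t, 256>;

// Hypothetical helper: returns a table equivalent to applying `inner`
// first and then `outer`, so the pair costs one lookup per channel.
Table compose_tables(const Table& outer, const Table& inner) {
    Table out{};
    for (size_t i = 0; i < out.size(); ++i) {
        out[i] = outer[inner[i]];
    }
    assert(out[0] == outer[inner[0]]);
    return out;
}
```

Composing three tables this way leaves a single per-pixel lookup in place of three, which fits the direction of the reported speedup for table filters, although the sketch says nothing about where the measured 50% comes from.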
* Fold alpha to the inner savelayer in savelayer-savelayer-restore patterns | kkinnunen | 2015-01-26
  Fold alpha to the inner savelayer in savelayer-savelayer-restore patterns such as this:

    SaveLayer (non-opaque)
      Save
        ClipRect
        SaveLayer
        Restore
      Restore
    Restore

  Current blink generates these for example for SVG content such as this:

    <path style="opacity:0.5 filter:url(#blur_filter)"/>

  The outer save layer is due to the opacity and the inner one is due to blur filter being implemented with picture image filter.

  Reduces layers in desk_carsvg.skp testcase from 115 to 78.

  BUG=skia:3119
  Review URL: https://codereview.chromium.org/835973005
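A minimal sketch of this kind of rewrite, under stated assumptions: the command list is modelled as a flat vector, the outer SaveLayer is assumed to carry nothing but an alpha, and after folding it is assumed to become a plain Save. The types `Op` and `Cmd` and the function `fold_outer_alpha` are hypothetical and are not taken from Skia's record optimizer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Op { SaveLayer, Save, ClipRect, Restore };

// Hypothetical recorded command; alpha is only meaningful for SaveLayer.
struct Cmd {
    Op op;
    unsigned alpha;  // 0..255
};

// If the seven commands starting at index i match
//   SaveLayer, Save, ClipRect, SaveLayer, Restore, Restore, Restore
// multiply the outer layer's alpha into the inner layer and demote the
// outer SaveLayer to a Save (assumption: it had no other paint effects).
bool fold_outer_alpha(std::vector<Cmd>& cmds, size_t i) {
    static const Op kPattern[] = {Op::SaveLayer, Op::Save,    Op::ClipRect,
                                  Op::SaveLayer, Op::Restore, Op::Restore,
                                  Op::Restore};
    const size_t kLen = sizeof(kPattern) / sizeof(kPattern[0]);
    if (i > cmds.size() || cmds.size() - i < kLen) {
        return false;
    }
    for (size_t k = 0; k < kLen; ++k) {
        if (cmds[i + k].op != kPattern[k]) {
            return false;
        }
    }
    Cmd& outer = cmds[i];
    Cmd& inner = cmds[i + 3];
    assert(outer.alpha <= 255 && inner.alpha <= 255);
    inner.alpha = (inner.alpha * outer.alpha + 127) / 255;  // rounded product
    outer.op = Op::Save;
    outer.alpha = 255;
    return true;
}
```

Each successful fold removes one offscreen layer, which is the kind of saving the message reports for desk_carsvg.skp.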
* initial preroll api | reed | 2015-01-25
  BUG=skia:
  Review URL: https://codereview.chromium.org/855473002
* experimental/skp_to_pdf_md5 optionally also outputs pdf files | halcanary | 2015-01-24
  TBR=mtklein@google.com
  Review URL: https://codereview.chromium.org/868333002
* Take budgeted param when snapping new image. | bsalomon | 2015-01-23
  Review URL: https://codereview.chromium.org/872543002
* Update compiler warning flags | mtklein | 2015-01-23
    - add -Wsign-compare, which has been catching useful issues for Kimmo;
    - add -Winit-self and -Wpointer-arith to Mac builds so everyone's using the same flags;
    - try removing -Wno-uninitialized. This was only for the old 10.6 compiler that we have warnings set as non-errors now.

  BUG=skia:
  Review URL: https://codereview.chromium.org/872793002
* Remove GrBinHashKey | bsalomon | 2015-01-23
  Review URL: https://codereview.chromium.org/861323002
* Add specialized content key class for resources. | bsalomon | 2015-01-23
  Review URL: https://codereview.chromium.org/858123002
* remove unnecessary guard flags for android (for conics) | reed | 2015-01-22
  BUG=skia:
  NOTREECHECKS=True
  Review URL: https://codereview.chromium.org/868783002
* Rename GrOptDrawState to GrPipeline and GrDrawState to GrPipelineBuilder | egdaniel | 2015-01-22
  BUG=skia:
  Review URL: https://codereview.chromium.org/858343002
* remove (unused) GatherPixelRefs | reed | 2015-01-22
  BUG=skia:
  Review URL: https://codereview.chromium.org/869463002
* Don't require -DSK_USE_POSIX_THREADS. | mtklein | 2015-01-21
  To compile SkCondVar, we already require either pthreads or Windows. This simplifies that code to not need SK_USE_POSIX_THREADS to be explicitly defined. We'll just look to see if we're targeting Windows, and if not, assume pthreads.

  Both before and after this CL, that code will fail to compile if we're not on Windows and don't have pthreads.

  BUG=skia:
  Review URL: https://codereview.chromium.org/869443003
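The selection rule described here ("Windows, otherwise assume pthreads") can be sketched with a single preprocessor branch. The macro test `_WIN32` and the names `kThreadBackend` and `thread_backend` are assumptions for the example; the log does not show which macro Skia itself checks.

```cpp
#include <cassert>
#include <cstring>

// No explicit opt-in define: pick the backend from the target platform.
#if defined(_WIN32)
    // Windows target: a condition variable wrapper would use the Win32 API.
    static const char* const kThreadBackend = "windows";
#else
    // Everything else is assumed to provide pthreads. If it does not,
    // this include fails and the build stops here, as the message notes.
    #include <pthread.h>
    static const char* const kThreadBackend = "pthreads";
#endif

// Hypothetical accessor, so callers and tests can see which branch was taken.
inline const char* thread_backend() {
    assert(kThreadBackend != nullptr);
    return kThreadBackend;
}
```

Dropping the explicit define removes one way for a build configuration to be wrong: a non-Windows build can no longer forget to pass the flag.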