aboutsummaryrefslogtreecommitdiffhomepage
path: root/src/opts
Commit message (Collapse)AuthorAge
* specialize arm64 allTrue()/anyTrue()Gravatar Mike Klein2018-03-26
| | | | | | | | | | | | | | | | | aarch64 added vector-wise add/mul/min/max instructions. We can use min and max to implement allTrue() and anyTrue(), respectively. (This CL is mostly so I don't forget these intrinsics exist.) In assembly, these actually compile to two instructions, the folding operation into a vector register, then a move from the vector register to a general purpose register. Change-Id: Ia6a999ac250740de765e871094e911979a8711c7 Reviewed-on: https://skia-review.googlesource.com/116482 Reviewed-by: Chris Dalton <csmartdalton@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Revert "Implement Sk2f::Store2"Gravatar Chris Dalton2018-03-23
| | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 8a8a8e9dd5c47f3fc930064bd030790f98af27af. Reason for revert: Needs non-SIMD impl Original change's description: > Implement Sk2f::Store2 > > Bug: skia: > Change-Id: Ieedd05ced376a7604936e9d2729fc20a8669496e > Reviewed-on: https://skia-review.googlesource.com/115531 > Commit-Queue: Chris Dalton <csmartdalton@google.com> > Reviewed-by: Mike Klein <mtklein@google.com> TBR=mtklein@google.com,csmartdalton@google.com Change-Id: I8dfbd87c5871b041a4fc6ef3816f121c72083a20 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: skia: Reviewed-on: https://skia-review.googlesource.com/116240 Reviewed-by: Chris Dalton <csmartdalton@google.com> Commit-Queue: Chris Dalton <csmartdalton@google.com>
* Implement Sk2f::Store2Gravatar Chris Dalton2018-03-23
| | | | | | | | Bug: skia: Change-Id: Ieedd05ced376a7604936e9d2729fc20a8669496e Reviewed-on: https://skia-review.googlesource.com/115531 Commit-Queue: Chris Dalton <csmartdalton@google.com> Reviewed-by: Mike Klein <mtklein@google.com>
* small ABI + narrow/wide code updatesGravatar Mike Klein2018-03-21
| | | | | | | | | | | | | | | | | | | | The only tangible effect this CL should have is to use __vectorcall on all Windows builds, including scalar ones. The code generation is a little better there with __vectorcall than not, so might as well. This is a baby step towards vector stages with MSVC, but a very baby step indeed. Mostly this refactors and regroups a bunch of logic to reflect my current thoughts. The BUILD.gn changes are essentially no-ops, but they simplify things and make our flags more similar to how those targets are built in Chromium. (And I cleaned up other /arch: uses so this works.) Change-Id: I73dd39d15cdc7b3d268231a707952bbbfd91496e Reviewed-on: https://skia-review.googlesource.com/115644 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* detect nonfinite cubic pointsGravatar Mike Reed2018-03-20
| | | | | | | | | Bug: oss-fuzz:7031 Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SKNX_NO_SIMD Change-Id: I657c0652dc863256f445a84c084ccc37d287f534 Reviewed-on: https://skia-review.googlesource.com/115222 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Mike Reed <reed@google.com>
* exact divide by 255 with NEONGravatar Mike Klein2018-03-14
| | | | | | | Change-Id: Ib121eb0d5af1f22f48f517fe909112a77d92032e Reviewed-on: https://skia-review.googlesource.com/113666 Commit-Queue: Mike Klein <mtklein@chromium.org> Reviewed-by: Herb Derby <herb@google.com>
* Small cleanups suggested by ClangTidyGravatar Kevin Lubick2018-03-12
| | | | | | | | | Bug: skia: Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SKNX_NO_SIMD Change-Id: Idd95e359838fdaecbdccc3a2c5a1b36971f20b8b Reviewed-on: https://skia-review.googlesource.com/113703 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Kevin Lubick <kjlubick@google.com>
* with no more offline codegen, seed_shader can be normalGravatar Mike Klein2018-03-11
| | | | | | | | | | | | | | We passed this iota array as an argument before because it was generating awkward code for our object file parser to handle (relocations, other weird things, can't quite remember). Now that we're compiling pipeline code normally, we can make seed_shader a normal stage again, with no special iota ctx pointer needed. Change-Id: I3929d61bfb6f914248f360c2c2326ce3d1f23163 Reviewed-on: https://skia-review.googlesource.com/113667 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* follow JUMPER_NARROW_STAGES in lowp stages tooGravatar Mike Klein2018-03-10
| | | | | | | | | | | | | | Should give dramatically better codegen for all 32-bit builds and 64-bit Windows builds, bringing it in line with how we make highp float stages. May help this bug, which is mostly Windows perf regressions. Bug: chromium:820469 Change-Id: I223f7568a09dea28ec614b18555766ea7d8365fa Reviewed-on: https://skia-review.googlesource.com/113665 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Reland "Reland "make SkJumper stages normal Skia code""Gravatar Mike Klein2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a reland of 78cb579f33943421afc8423a39867fcfd69fed44 This time, lowp stages are controlled by !defined(JUMPER_IS_SCALAR), not by defined(__clang__). The two are usually the same, except when we opt Clang builds into JUMPER_IS_SCALAR artificially. Some Google3 builds use compilers old enough that they barf when compiling our NEON code. It's conceivably also possible to define JUMPER_IS_SCALAR yourself, but I don't think anyone does that. Original change's description: > Reland "make SkJumper stages normal Skia code" > > This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4 > > Now with fixed #include paths in SkRasterPipeline_opts.h, > and -ffp-contract=fast for the :hsw target to minimize > diffs on non-Windows Clang AVX2/AVX-512 bots. > > Original change's description: > > make SkJumper stages normal Skia code > > > > Enough clients are using Clang now that we can say, use Clang to build > > if you want these software pipeline stages to go fast. > > > > This lets us drop the offline build aspect of SkJumper stages, instead > > building as part of Skia using the SkOpts framework. > > > > I think everything should work, except I've (temporarily) removed > > AVX-512 support. I will put this back in a follow up. > > > > I have had to drop Windows down to __vectorcall and our narrower > > stage calling convention that keeps the d-registers on the stack. > > I tried forcing sysv_abi, but that crashed Clang. :/ > > > > Added a TODO to up the same narrower stage calling convention > > for lowp stages... we just *don't* today, for no good reason. > > > > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383 > > Reviewed-on: https://skia-review.googlesource.com/110641 > > Commit-Queue: Mike Klein <mtklein@chromium.org> > > Reviewed-by: Herb Derby <herb@google.com> > > Reviewed-by: Florin Malita <fmalita@chromium.org> > > Change-Id: I44f2c03d33958e3807747e40904b6351957dd448 > Reviewed-on: https://skia-review.googlesource.com/112742 > Reviewed-by: Mike Klein <mtklein@chromium.org> Change-Id: I3d71197d4bbb19ca4a94961a97fa2e54d5cbfb0d Reviewed-on: https://skia-review.googlesource.com/112744 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>
* Revert "Reland "make SkJumper stages normal Skia code""Gravatar Mike Klein2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 78cb579f33943421afc8423a39867fcfd69fed44. Reason for revert: lowp should be controlled by defined(JUMPER_IS_SCALAR), not defined(__clang__). So close. Original change's description: > Reland "make SkJumper stages normal Skia code" > > This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4 > > Now with fixed #include paths in SkRasterPipeline_opts.h, > and -ffp-contract=fast for the :hsw target to minimize > diffs on non-Windows Clang AVX2/AVX-512 bots. > > Original change's description: > > make SkJumper stages normal Skia code > > > > Enough clients are using Clang now that we can say, use Clang to build > > if you want these software pipeline stages to go fast. > > > > This lets us drop the offline build aspect of SkJumper stages, instead > > building as part of Skia using the SkOpts framework. > > > > I think everything should work, except I've (temporarily) removed > > AVX-512 support. I will put this back in a follow up. > > > > I have had to drop Windows down to __vectorcall and our narrower > > stage calling convention that keeps the d-registers on the stack. > > I tried forcing sysv_abi, but that crashed Clang. :/ > > > > Added a TODO to up the same narrower stage calling convention > > for lowp stages... we just *don't* today, for no good reason. > > > > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383 > > Reviewed-on: https://skia-review.googlesource.com/110641 > > Commit-Queue: Mike Klein <mtklein@chromium.org> > > Reviewed-by: Herb Derby <herb@google.com> > > Reviewed-by: Florin Malita <fmalita@chromium.org> > > Change-Id: I44f2c03d33958e3807747e40904b6351957dd448 > Reviewed-on: https://skia-review.googlesource.com/112742 > Reviewed-by: Mike Klein <mtklein@chromium.org> TBR=mtklein@chromium.org,herb@google.com,fmalita@chromium.org Change-Id: Ie64da98f5187d44e03c0ce05d7cb189d4a6e6663 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://skia-review.googlesource.com/112743 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Klein <mtklein@google.com>
* Reland "make SkJumper stages normal Skia code"Gravatar Mike Klein2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4 Now with fixed #include paths in SkRasterPipeline_opts.h, and -ffp-contract=fast for the :hsw target to minimize diffs on non-Windows Clang AVX2/AVX-512 bots. Original change's description: > make SkJumper stages normal Skia code > > Enough clients are using Clang now that we can say, use Clang to build > if you want these software pipeline stages to go fast. > > This lets us drop the offline build aspect of SkJumper stages, instead > building as part of Skia using the SkOpts framework. > > I think everything should work, except I've (temporarily) removed > AVX-512 support. I will put this back in a follow up. > > I have had to drop Windows down to __vectorcall and our narrower > stage calling convention that keeps the d-registers on the stack. > I tried forcing sysv_abi, but that crashed Clang. :/ > > Added a TODO to up the same narrower stage calling convention > for lowp stages... we just *don't* today, for no good reason. > > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383 > Reviewed-on: https://skia-review.googlesource.com/110641 > Commit-Queue: Mike Klein <mtklein@chromium.org> > Reviewed-by: Herb Derby <herb@google.com> > Reviewed-by: Florin Malita <fmalita@chromium.org> Change-Id: I44f2c03d33958e3807747e40904b6351957dd448 Reviewed-on: https://skia-review.googlesource.com/112742 Reviewed-by: Mike Klein <mtklein@chromium.org>
* Revert "make SkJumper stages normal Skia code"Gravatar Mike Klein2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverts commit 22e536e3a1a09405d1c0e6f071717a726d86e8d4. Reason for revert: wrong include path :/ Original change's description: > make SkJumper stages normal Skia code > > Enough clients are using Clang now that we can say, use Clang to build > if you want these software pipeline stages to go fast. > > This lets us drop the offline build aspect of SkJumper stages, instead > building as part of Skia using the SkOpts framework. > > I think everything should work, except I've (temporarily) removed > AVX-512 support. I will put this back in a follow up. > > I have had to drop Windows down to __vectorcall and our narrower > stage calling convention that keeps the d-registers on the stack. > I tried forcing sysv_abi, but that crashed Clang. :/ > > Added a TODO to up the same narrower stage calling convention > for lowp stages... we just *don't* today, for no good reason. > > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383 > Reviewed-on: https://skia-review.googlesource.com/110641 > Commit-Queue: Mike Klein <mtklein@chromium.org> > Reviewed-by: Herb Derby <herb@google.com> > Reviewed-by: Florin Malita <fmalita@chromium.org> TBR=mtklein@chromium.org,herb@google.com,fmalita@chromium.org Change-Id: I2bdc709c80cdfa6b13ff24e024b3721bef887f46 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://skia-review.googlesource.com/112741 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>
* make SkJumper stages normal Skia codeGravatar Mike Klein2018-03-07
| | | | | | | | | | | | | | | | | | | | | | | | Enough clients are using Clang now that we can say, use Clang to build if you want these software pipeline stages to go fast. This lets us drop the offline build aspect of SkJumper stages, instead building as part of Skia using the SkOpts framework. I think everything should work, except I've (temporarily) removed AVX-512 support. I will put this back in a follow up. I have had to drop Windows down to __vectorcall and our narrower stage calling convention that keeps the d-registers on the stack. I tried forcing sysv_abi, but that crashed Clang. :/ Added a TODO to up the same narrower stage calling convention for lowp stages... we just *don't* today, for no good reason. Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383 Reviewed-on: https://skia-review.googlesource.com/110641 Commit-Queue: Mike Klein <mtklein@chromium.org> Reviewed-by: Herb Derby <herb@google.com> Reviewed-by: Florin Malita <fmalita@chromium.org>
* Implement Sk2f::Store4Gravatar Chris Dalton2018-02-07
| | | | | | | | Bug: skia: Change-Id: I2adb983d68625d327e7c00e53b6ae4703b46252f Reviewed-on: https://skia-review.googlesource.com/104761 Commit-Queue: Chris Dalton <csmartdalton@google.com> Reviewed-by: Mike Klein <mtklein@chromium.org>
* Enable conditional-uninitialized flagGravatar Kevin Lubick2018-01-05
| | | | | | | | | Bug: skia:7462 Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-All-SKNX_NO_SIMD Change-Id: I1c0a09984bf28a5c620a89af56040f018bae6310 Reviewed-on: https://skia-review.googlesource.com/90941 Commit-Queue: Kevin Lubick <kjlubick@google.com> Reviewed-by: Mike Klein <mtklein@chromium.org>
* Remove previous blur image implementation. Try 2Gravatar Herbert Derby2017-12-06
| | | | | | | | | | | The last version had a problem with the no simd compilation. TBR=mtklein@google.com Change-Id: I139388cf3bf1b55cb4a49133a98be129fd9711c6 Reviewed-on: https://skia-review.googlesource.com/80983 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Herb Derby <herb@google.com>
* Revert "Remove previous blur image implementation"Gravatar Herb Derby2017-12-06
| | | | | | | | | | | | | | | | | | | | | | | | This reverts commit a11346dd24fa7122246dc96ba15604713e460036. Reason for revert: Compile problems on linux Original change's description: > Remove previous blur image implementation > > Change-Id: Ie3bada767f0ba945cb17f174f179510768eb178d > Reviewed-on: https://skia-review.googlesource.com/77583 > Commit-Queue: Herb Derby <herb@google.com> > Reviewed-by: Mike Klein <mtklein@google.com> TBR=mtklein@google.com,herb@google.com Change-Id: I5f157fbc2fd4e2a37618dc9c0e72621ff9892ef6 No-Presubmit: true No-Tree-Checks: true No-Try: true Reviewed-on: https://skia-review.googlesource.com/81100 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Herb Derby <herb@google.com>
* Remove previous blur image implementationGravatar Herbert Derby2017-12-06
| | | | | | | Change-Id: Ie3bada767f0ba945cb17f174f179510768eb178d Reviewed-on: https://skia-review.googlesource.com/77583 Commit-Queue: Herb Derby <herb@google.com> Reviewed-by: Mike Klein <mtklein@google.com>
* Add Store3 to Sk2fGravatar Chris Dalton2017-12-01
| | | | | | | | Bug: skia: Change-Id: I0377e6a1dd8259e944f7902a5c68af524fa588c7 Reviewed-on: https://skia-review.googlesource.com/79382 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Chris Dalton <csmartdalton@google.com>
* add Load2() to Sk4fGravatar Mike Klein2017-12-01
| | | | | | | | | and test it. Change-Id: Ib0c2cf93c63d8d3c36a7d4d60bbec4ecede29bc7 Reviewed-on: https://skia-review.googlesource.com/78480 Reviewed-by: Chris Dalton <csmartdalton@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Fix Sk8b reading too many bytesGravatar Herb Derby2017-11-16
| | | | | | | Change-Id: I0e94ef1620b54405a23470507e2b2c4bb54731c9 Reviewed-on: https://skia-review.googlesource.com/72860 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Herb Derby <herb@google.com>
* Support for direct gaussian blur evaluationGravatar Herb Derby2017-11-02
| | | | | | | Change-Id: I1b00ba2720648b75fce47d3f4d0f56fb8f2cd171 Reviewed-on: https://skia-review.googlesource.com/67041 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Herb Derby <herb@google.com>
* Add mulHi to SkNxGravatar Herb Derby2017-10-11
| | | | | | | | | | | | | Add mulHi to base SkNx, and specialize implementations for Sk4u for neon and sse. Add casts for converting from uint8_t by 4 to uint32_t by 4. Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Change-Id: I29a32e2ad9812a47fff841ceca334e562362836f Reviewed-on: https://skia-review.googlesource.com/57960 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Herb Derby <herb@google.com>
* make most of SkColorPriv.h privateGravatar Cary Clark2017-09-15
| | | | | | | | | | | | | created new file src/core/SkColorData.h for internal consumption. Note that many of the functions there are unused as well. Bug: skia: 6898 R: reed@google.com Change-Id: I25bfd5a9c21f53558c4ca65a77eb5d322d897c6d Reviewed-on: https://skia-review.googlesource.com/46848 Commit-Queue: Cary Clark <caryclark@google.com> Reviewed-by: Mike Reed <reed@google.com>
* Sk4i version of blur.Gravatar Herb Derby2017-09-14
| | | | | | | | | | | | | | | | | For the blur_1.50_normal_low_quality benchmark, this code goes from about 120us to 85us. The original implementation executes at about 95us. This changed in controlled by the flag: SK_SUPPORT_LEGACY_SLOW_SMALL_BLUR BUG=chromium:759070 CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Change-Id: If722cb8ffd8c47a94b7a6b4e6dd26fd1474b6209 Reviewed-on: https://skia-review.googlesource.com/45300 Commit-Queue: Herb Derby <herb@google.com> Reviewed-by: Mike Klein <mtklein@chromium.org>
* Add missing methods to neon/sse SkNx implementationsGravatar Chris Dalton2017-08-28
| | | | | | | | | | Adds negate, abs, sqrt to Sk2f and/or Sk4f. Bug: skia: Change-Id: I0688dae45b32ff94abcc0525ef1f09d666f9c6e9 Reviewed-on: https://skia-review.googlesource.com/39642 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Chris Dalton <csmartdalton@google.com>
* make SkOpts functions inline, not staticGravatar Mike Klein2017-08-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | When Skia's built with an interestingly advanced instruction set baseline like SSSE3 or SSE4.1, we end up with two distinct copies of some SkOpts functions, one default in SkOpts.o and one specialization from SkOpts_{ssse3,sse41}.o. These functions are static, and so are technically unrelated, even though they're the same code compiled with the same instructions available. They're going to be identical. What we want here is to remove static but mark them as inline instead. In this case inline means "if the linker sees multiple copies of this, that's cool, just pick any one arbitrarily". That's just what we want. Now, when I disassemble a binary before and after this change, I do see the redundant routines removed. However, the file size change is minimal... I suspect that this must mean the linker has noticed that we had identical code and physically folded the two logically independent routines. I don't know how prevalent this optimization is, though, so it doesn't hurt to give it more of a "one copy please" hint with inline. There may also be a difference here between the binary size (~unchanged) and the in-memory layout of that binary? Change-Id: Id9c8f0ffc84aa1c9a066c22b623d34adab281857 Reviewed-on: https://skia-review.googlesource.com/37501 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Ben Wagner <bungeman@google.com>
* remove code associated with legacy affine imageshadersGravatar Mike Reed2017-08-23
| | | | | | | | | | | requires https://skia-review.googlesource.com/c/33180 CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: I226e120cc5aebe393bda8bc069e7927fdc981a0e Reviewed-on: https://skia-review.googlesource.com/36800 Commit-Queue: Mike Reed <reed@google.com> Reviewed-by: Florin Malita <fmalita@chromium.org>
* explicitly vectorize sk_memset{16,32,64}Gravatar Mike Klein2017-08-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ought to help clients who don't enable autovectorization. With autovectorization enabled, this new version is like, hyper-vectorized compared to the old autovectorization. Instead of handling 128 bytes max per loop, it now handles up to 512 bytes per loop. Pretty exciting. Locally perf effects are a mix, but we'd expect this to help Chrome unambiguously if they've turned off autovectorization. $ out/ok bench:samples=100 sw filter:match=memset32_\\d\* serial Before: [memset32_100000] 16ms @0 20.1ms @99 20.2ms @100 [memset32_10000] 1.07ms @0 1.26ms @99 1.31ms @100 [memset32_1000] 73.9µs @0 89.4µs @99 90.1µs @100 [memset32_100] 8.59µs @0 9.74µs @99 9.96µs @100 [memset32_10] 7.45µs @0 8.96µs @99 8.99µs @100 [memset32_1] 2.29µs @0 2.81µs @99 2.92µs @100 After: [memset32_100000] 16.2ms @0 17.3ms @99 17.3ms @100 [memset32_10000] 1.06ms @0 1.18ms @99 1.23ms @100 [memset32_1000] 72µs @0 75.6µs @99 84.7µs @100 [memset32_100] 9.14µs @0 10.6µs @99 10.7µs @100 [memset32_10] 5.43µs @0 5.88µs @99 5.99µs @100 [memset32_1] 3.43µs @0 3.65µs @99 3.83µs @100 BUG=chromium:755391 Change-Id: If9059a30ca7a345f1f7c37bd51473c29e8bb8922 Reviewed-on: https://skia-review.googlesource.com/34746 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Retry cleaning up SkLinearBitmapPipeline.Gravatar Mike Klein2017-07-20
| | | | | | | | | | | | | | | | | | This is mostly dead code. In order to make it truly dead, we need to opt drawing unpremul images into SkRasterPipelineBlitter. They had been handled by SkLinearBitmapPipeline, but can't be draw by SkBitmapProcLegacyShader. Drawing unpremul images is tested by the GM all_variants_8888, which gave us trouble last time around (serialize-8888 drew right, 8888 wrong) but now draws fine. I think this was probably also the root of the revert, drawing some unpremul image in Chrome's tests somewhere. Change-Id: I453f9df44ade807316935921cbae82961e2f08aa Reviewed-on: https://skia-review.googlesource.com/24862 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Assume HQ is handled by pipeline, delete legacy code-pathGravatar Mike Reed2017-07-20
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: If6f0d0a57463bf99a66d674e65a62ce3931d0116 Reviewed-on: https://skia-review.googlesource.com/24644 Commit-Queue: Mike Reed <reed@google.com> Reviewed-by: Mike Klein <mtklein@chromium.org>
* Simplify Sk4i's Min/MaxGravatar Yuqian Li2017-07-14
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: Id82a58f2f4d947894bb710cb3190c873b20b98eb Reviewed-on: https://skia-review.googlesource.com/23404 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Yuqian Li <liyuqian@google.com>
* Implement Sk4i's abs, min, maxGravatar Yuqian Li2017-07-12
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: Ia9ec3f72095e1c744f88df7bb990d99e0f87d578 Reviewed-on: https://skia-review.googlesource.com/22720 Commit-Queue: Yuqian Li <liyuqian@google.com> Reviewed-by: Herb Derby <herb@google.com>
* remove unreachable perspective code for imageshaderGravatar Mike Reed2017-07-12
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: If9a7df3e1c387098b00bf1cc1a37c36c6d256ef1 Reviewed-on: https://skia-review.googlesource.com/22348 Commit-Queue: Florin Malita <fmalita@chromium.org> Reviewed-by: Florin Malita <fmalita@chromium.org>
* header cleanupGravatar Hal Canary2017-07-05
| | | | | | | Change-Id: I3f7667a1357194ae2bdd341ad9d46eb93920f404 Reviewed-on: https://skia-review.googlesource.com/21374 Reviewed-by: Brian Salomon <bsalomon@google.com> Commit-Queue: Hal Canary <halcanary@google.com>
* remove unreachable samples for non-N32 imageshadersGravatar Mike Reed2017-06-29
| | | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: I2d9a22b22d72c81a742b8fd497797bff8174915b Reviewed-on: https://skia-review.googlesource.com/21264 Commit-Queue: Eric Boren <borenet@google.com> Commit-Queue: Mike Reed <reed@google.com> Reviewed-by: Florin Malita <fmalita@chromium.org>
* Revert "remove a bit more dead code"Gravatar Mike Reed2017-06-26
| | | | | | | | | | | | | | | | | | | | | | | This reverts commit d9b1fe02a677ec44bc1c99809f3ee7eb08708137. Reason for revert: try to fix chrome roll Original change's description: > remove a bit more dead code > > Change-Id: I61484672e88d6bb4f75833ee89e7178c4f34d610 > Reviewed-on: https://skia-review.googlesource.com/20780 > Commit-Queue: Mike Klein <mtklein@google.com> > Reviewed-by: Herb Derby <herb@google.com> TBR=mtklein@chromium.org,mtklein@google.com,herb@google.com # Not skipping CQ checks because original CL landed > 1 day ago. Change-Id: I03dcd344dfb138261d9421b0692d12e4ed431100 Reviewed-on: https://skia-review.googlesource.com/20822 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Reed <reed@google.com>
* remove a bit more dead codeGravatar Mike Klein2017-06-24
| | | | | | | Change-Id: I61484672e88d6bb4f75833ee89e7178c4f34d610 Reviewed-on: https://skia-review.googlesource.com/20780 Commit-Queue: Mike Klein <mtklein@google.com> Reviewed-by: Herb Derby <herb@google.com>
* remove unused SkFilterProcsGravatar Mike Reed2017-06-21
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: I7f9abb3ecb1c7b4fd18a703198c54c6d8f5e1ef7 Reviewed-on: https://skia-review.googlesource.com/20429 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Reed <reed@google.com>
* Refactor SkBlurImageFilter_opts.h for readability.Gravatar Mike Klein2017-06-13
| | | | | | | | | | | This is mostly reindentation to make each platform's code clear, and rewriting comments to match the usual way we orient vector comments these days (in memory order). Change-Id: I7ceb98c5af88980e74b6a124507e0ef1900fc731 Reviewed-on: https://skia-review.googlesource.com/19663 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* remove unneeded proc fieldsGravatar Mike Reed2017-06-12
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: Ibf997c8d19a045d41d3e92b8db63c36f8fa10b3e Reviewed-on: https://skia-review.googlesource.com/19441 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Reed <reed@google.com>
* replace 4f procs with pipeline (only called in 2 places by ganesh)Gravatar Mike Reed2017-06-09
| | | | | | | | | | | enables lots of code to delete CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: I13631ead68a9232bd8c13c5ef54727f44def26ca Reviewed-on: https://skia-review.googlesource.com/19278 Reviewed-by: Mike Klein <mtklein@google.com> Commit-Queue: Mike Reed <reed@google.com>
* remove unused xfermode methodsGravatar Mike Reed2017-06-06
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: Ibc7d581bcc40134ee7cf57bb65fee2d70e119bc7 Reviewed-on: https://skia-review.googlesource.com/18842 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Reed <reed@google.com>
* simplify sse41::srcover_srgb_srgbGravatar Mike Klein2017-05-30
| | | | | | | | Most importantly, remove the undefined behavior implied by "delta". Change-Id: I8f9740804ec74dd40b049eafd4f0d51b36ce3237 Reviewed-on: https://skia-review.googlesource.com/18140 Reviewed-by: Herb Derby <herb@google.com>
* remove sse2::srcover_srgb_srgbGravatar Mike Klein2017-05-30
| | | | | | | Change-Id: Icc570d8a8f1df1dea202e1d234433491122b9b67 Reviewed-on: https://skia-review.googlesource.com/18135 Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Faster and more accurate blit_row_s32a_opaque for ARMGravatar Matteo Franchin2017-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change ARM implementation of alpha blending to work on 8 pixels at a time (using NEON). Also improve the accuracy of alpha blending by using a formula based on SkMulDiv255Round rather than SkPMSrcOver. Note that a number of variations of this code were considered. Here are some notes: - A 16 pixels at a time version was considered. This performs well for the case of extreme alpha (all-opaque or all-transparent pixels), but performs worst than the 8 pixels version when there are frequent transitions of alpha. Also gcc 6.2.1 seems to have troubles with register pressure when using this version. - If the branch to detect the fully-opaque or fully-transparent cases is removed, then the performance increases significantly for images which are all partially transparent (especially on ARM Cortex A72), but can significantly decrease for images that are almost fully opaque or fully transparent. This implementation is a compromise to the effects described above. This patch produces a ~10% improvement on the nanobench's sub-scores repeatTile_BGRA_8888_A, constXTile_MM_filter_trans, constXTile_CC_trans, constXTile_RR_filter_trans when running on ARM Cortex A72. Improvements of greater magnitude (20% to 30%) are observed when running on ARM Cortex A53. CQ_INCLUDE_TRYBOTS=skia.primary:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Change-Id: I1f0c9f549057613bbffd26e6651f3beeb0019af9 Bug: skia: Reviewed-on: https://skia-review.googlesource.com/16520 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>
* move sk_memset?? to SkOptsGravatar Mike Klein2017-05-23
| | | | | | | | | This lets the compiler generate AVX versions with wider writes. Change-Id: Ia63825e70c72bdb4d14bef97d8b4ea4be54c9d84 Reviewed-on: https://skia-review.googlesource.com/17715 Reviewed-by: Florin Malita <fmalita@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>
* remove old 565 destination optsGravatar Mike Klein2017-05-06
| | | | | | | | | | This is not an important format, and the code is dead or close to it. The code is an occasional maintenance burden so I'd like it gone. Change-Id: I4ad921533abf3211e6a81e6e475b848795eea060 Reviewed-on: https://skia-review.googlesource.com/15600 Reviewed-by: Mike Reed <reed@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* CRC32 no longer restricted to ARM64Gravatar Amaury Le Leyzour2017-05-04
| | | | | | | | | | | | On a simple benchmark the CRC32 version is about 3x faster on ARM Cortex A57 (Aarch32) than the Murmur3 scalar version. BUG=skia: Change-Id: I71515e8463a33924998b837ff9f32202690dd2fe Reviewed-on: https://skia-review.googlesource.com/15480 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org>