path: root/src/opts/SkBlurImageFilter_opts.h
Commit message | Author | Age
* Update SkOpts namespaces. | mtklein | 2016-07-13

    If we make sure all SkOpts functions are static, we can give the namespaces any name we like. This lets us drop the sk_ prefix and give a real indication of the default SIMD instruction set rather than just saying sk_default. Both of these changes help debugger, profiler, and crash report readability.

    Perhaps more importantly, keeping these functions static helps prevent accidentally linking in unused versions of functions, as you see here with sk_avx::srcover_srgb_srgb().

    This requires we update SkBlend_opts tests and benches to call SkOpts functions through SkOpts rather than declaring the methods externally. In practice this drops testing of the SSE2 version on machines with SSE4. If we still really need to test/bench the compile time best SIMD level version of this method against the runtime detected best, we can include SkBlend_opts.h into the tests or benches directly, similar to what we do for the trivial, brute-force, or best non-SIMD versions.

    BUG=skia:
    GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145833002
    CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Review-Url: https://codereview.chromium.org/2145833002
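The effect of keeping every per-instruction-set function static can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Skia's actual code: the namespaces `portable` and `unrolled`, the function `sum_bytes`, and the `Opts::Init` hook are hypothetical names invented for the example. Each variant has internal linkage, so its namespace can be named freely, and callers reach the selected variant only through a function pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical baseline variant. `static` gives it internal linkage, so the
// namespace name never becomes part of an externally visible symbol.
namespace portable {
    static uint32_t sum_bytes(const uint8_t* p, int n) {
        assert(n >= 0);
        uint32_t sum = 0;
        for (int i = 0; i < n; i++) { sum += p[i]; }
        return sum;
    }
}

// Hypothetical stand-in for a SIMD-specialized variant (plain unrolled C++ here).
namespace unrolled {
    static uint32_t sum_bytes(const uint8_t* p, int n) {
        assert(n >= 0);
        uint32_t sum = 0;
        int i = 0;
        for (; i + 4 <= n; i += 4) { sum += p[i] + p[i + 1] + p[i + 2] + p[i + 3]; }
        for (; i < n; i++)         { sum += p[i]; }
        return sum;
    }
}

// Hypothetical dispatch table: callers only ever go through this pointer.
namespace Opts {
    using SumFn = uint32_t (*)(const uint8_t*, int);
    static SumFn sum_bytes = portable::sum_bytes;  // compile-time default

    // Swap in the faster variant when the (assumed) runtime check allows it.
    static void Init(bool haveFastPath) {
        if (haveFastPath) { sum_bytes = unrolled::sum_bytes; }
    }
}
```

Tests and benchmarks written against this shape would call `Opts::sum_bytes(...)` rather than declaring `unrolled::sum_bytes` externally, which mirrors the testing change the commit message describes.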
* Move immintrin/arm_neon includes to where they are used. | mtklein | 2016-06-09

    On my Mac (so, immintrin), this improves compile time, both wall and cpu, by about 16%. To test I ran this on an SSD with files hot in their caches:

        $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
          ninja -C out/Release -t clean && \
          time ninja -C out/Release

        Before: 159 wall / 3367 cpu
                159 wall / 3368 cpu

        After:  137 wall / 2860 cpu
                136 wall / 2863 cpu

    I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc. That made no significant difference, so I've kept immintrin for its simplicity.

    BUG=skia:
    GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
    CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    TBR=reed@google.com
    No public API changes.

    Committed: https://skia.googlesource.com/skia/+/12dfaaa53c23f3d03050bde8f64136ac1f44164a
    Review-Url: https://codereview.chromium.org/2045633002
* Revert of Move immintrin/arm_neon includes to where they are used. (patchset #2 id:20001 of https://codereview.chromium.org/2045633002/ ) | mtklein | 2016-06-07

    Reason for revert:
    Appears to have broken the ARMv7 aspect of the Google3 roll in bizarre seemingly-unrelated ways.

    Original issue's description:
    > Move immintrin/arm_neon includes to where they are used.
    >
    > On my Mac (so, immintrin), this improves compile time, both wall and cpu,
    > by about 16%. To test I ran this on an SSD with files hot in their caches:
    >
    > $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
    >   ninja -C out/Release -t clean && \
    >   time ninja -C out/Release
    >
    > Before: 159 wall / 3367 cpu
    >         159 wall / 3368 cpu
    >
    > After:  137 wall / 2860 cpu
    >         136 wall / 2863 cpu
    >
    > I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc.
    > That made no significant difference, so I've kept immintrin for its simplicity.
    >
    > BUG=skia:
    > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
    > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
    >
    > TBR=reed@google.com
    > No public API changes.
    >
    > Committed: https://skia.googlesource.com/skia/+/12dfaaa53c23f3d03050bde8f64136ac1f44164a

    TBR=herb@google.com,mtklein@chromium.org
    # Skipping CQ checks because original CL landed less than 1 day ago.
    NOPRESUBMIT=true
    NOTREECHECKS=true
    NOTRY=true
    BUG=skia:

    Review-Url: https://codereview.chromium.org/2046213002
* Move immintrin/arm_neon includes to where they are used. | mtklein | 2016-06-07

    On my Mac (so, immintrin), this improves compile time, both wall and cpu, by about 16%. To test I ran this on an SSD with files hot in their caches:

        $ env CC=/usr/bin/clang CXX=/usr/bin/clang++ ./gyp_skia && \
          ninja -C out/Release -t clean && \
          time ninja -C out/Release

        Before: 159 wall / 3367 cpu
                159 wall / 3368 cpu

        After:  137 wall / 2860 cpu
                136 wall / 2863 cpu

    I also tried further refining immintrin down to emmintrin / tmmintrin / smmintrin etc. That made no significant difference, so I've kept immintrin for its simplicity.

    BUG=skia:
    GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2045633002
    CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    TBR=reed@google.com
    No public API changes.

    Review-Url: https://codereview.chromium.org/2045633002
* Make SkBlurImageFilter capable of cropping during blur (raster path) | senorblanco | 2015-11-02

    SkBlurImageFilter can currently only process a source image which is larger than or equal to the destination rect. If the source image (or crop rect) is smaller, it is padded out to dest size with transparent black via the 6-param version of applyCropRect().

    Fixing this requires modifying all the flavours of RGBA box_blur() to accept a src crop rect.

    BUG=skia:4502, skia:4526
    CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Committed: https://skia.googlesource.com/skia/+/1b82ceb737c73327412f2e8a91748481e1aec9e4
    Review URL: https://codereview.chromium.org/1415653003
* Revert of Make SkBlurImageFilter capable of cropping during blur (patchset #16 id:400001 of https://codereview.chromium.org/1415653003/ ) | senorblanco | 2015-11-02

    Reason for revert:
    ASAN failures (see https://codereview.chromium.org/1415653003/)

    Original issue's description:
    > Make SkBlurImageFilter capable of cropping during blur (raster path)
    >
    > SkBlurImageFilter can currently only process a source image
    > which is larger than or equal to the destination rect. If
    > the source image (or crop rect) is smaller, it is padded
    > out to dest size with transparent black via the 6-param
    > version of applyCropRect().
    >
    > Fixing this requires modifying all the flavours of RGBA
    > box_blur() to accept a src crop rect.
    >
    > BUG=skia:4502, skia:4526
    > CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
    >
    > Committed: https://skia.googlesource.com/skia/+/1b82ceb737c73327412f2e8a91748481e1aec9e4

    TBR=mtklein@google.com,reed@google.com
    NOPRESUBMIT=true
    NOTREECHECKS=true
    NOTRY=true
    BUG=skia:4502, skia:4526

    Review URL: https://codereview.chromium.org/1428053002
* Make SkBlurImageFilter capable of cropping during blur (raster path) | senorblanco | 2015-11-02

    SkBlurImageFilter can currently only process a source image which is larger than or equal to the destination rect. If the source image (or crop rect) is smaller, it is padded out to dest size with transparent black via the 6-param version of applyCropRect().

    Fixing this requires modifying all the flavours of RGBA box_blur() to accept a src crop rect.

    BUG=skia:4502, skia:4526
    CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Review URL: https://codereview.chromium.org/1415653003
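The cropping behaviour described in the commit above can be illustrated with a small brute-force sketch. It is an assumption-laden simplification, not the real box_blur(): the function name `box_sum_cropped` and its parameters are hypothetical, it handles one channel in one dimension, and it produces an unnormalized sum instead of an average. What it shows is the source covering only part of the destination row, with every column outside the source read as transparent black (0) at sampling time rather than by padding the source beforehand.

```cpp
#include <cassert>
#include <vector>

// Brute-force 1-D box sum over a destination row of dstWidth pixels.
// The source covers only destination columns [srcLeft, srcLeft + src.size());
// src[0] maps to destination column srcLeft. Columns outside that range
// contribute 0, i.e. transparent black.
static std::vector<int> box_sum_cropped(const std::vector<int>& src, int srcLeft,
                                        int dstWidth, int left, int right) {
    assert(dstWidth >= 0 && left >= 0 && right >= 0);
    const int srcRight = srcLeft + static_cast<int>(src.size());

    // Read one destination column, clipping against the source crop.
    auto sample = [&](int x) {
        return (x >= srcLeft && x < srcRight) ? src[x - srcLeft] : 0;
    };

    std::vector<int> dst(dstWidth);
    for (int x = 0; x < dstWidth; x++) {
        int sum = 0;
        // Window spans `left` columns before x and `right` columns after it.
        for (int k = x - left; k <= x + right; k++) { sum += sample(k); }
        dst[x] = sum;
    }
    return dst;
}
```

For example, a two-pixel source `{10, 20}` placed at column 2 of a six-pixel row with a three-wide window yields `{0, 10, 30, 30, 20, 0}`.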
* SkBlurImageFilter_opts: optimize NEON box_blur_double in separate loops. | senorblanco | 2015-10-28

    Stop leaning so hard on the branch predictor, and pull the conditionals out of the loops for box_blur_double() (NEON).

    This is conceptually the same change as https://codereview.chromium.org/1426583004/ for the NEON double-pixel loop.

    R=mtklein@google.com
    BUG=skia:4526
    CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Review URL: https://codereview.chromium.org/1412793009
* SkBlurImageFilter_opt.h: break conditions into separate loops. | senorblanco | 2015-10-28

    This gives ~15% improvement on blur_image on Linux Z620, and should allow me to implement cropping without incurring a perf hit.

    BUG=skia:
    CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Review URL: https://codereview.chromium.org/1426583004
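Breaking the conditions into separate loops can be sketched on a simplified sliding-window box sum. The two functions below are hypothetical illustrations (one channel, one dimension, unnormalized sums), not the code from the commit. `box_sum_branchy` tests both window edges on every iteration; `box_sum_split` produces the same output with three branch-free loops. The sketch assumes the whole window fits in the row (`left + right + 1 <= width`), which is asserted.

```cpp
#include <cassert>
#include <vector>

// Sliding-window box sum with both edge checks inside the loop.
static std::vector<int> box_sum_branchy(const std::vector<int>& src, int left, int right) {
    const int width = static_cast<int>(src.size());
    assert(left >= 0 && right >= 0 && left + right + 1 <= width);
    std::vector<int> dst(width);
    int sum = 0;
    for (int i = 0; i <= right; i++) { sum += src[i]; }  // window for x == 0
    for (int x = 0; x < width; x++) {
        dst[x] = sum;
        if (x + right + 1 < width) { sum += src[x + right + 1]; }  // incoming pixel
        if (x - left >= 0)         { sum -= src[x - left]; }       // outgoing pixel
    }
    return dst;
}

// Same result, with the conditions hoisted into three separate loops.
static std::vector<int> box_sum_split(const std::vector<int>& src, int left, int right) {
    const int width = static_cast<int>(src.size());
    assert(left >= 0 && right >= 0 && left + right + 1 <= width);
    std::vector<int> dst(width);
    int sum = 0;
    for (int i = 0; i <= right; i++) { sum += src[i]; }
    int x = 0;
    // Leading edge: the window is still growing, nothing leaves yet.
    for (; x < left; x++) { dst[x] = sum; sum += src[x + right + 1]; }
    // Interior: one pixel enters and one leaves on every step.
    for (; x < width - right - 1; x++) {
        dst[x] = sum;
        sum += src[x + right + 1];
        sum -= src[x - left];
    }
    // Trailing edge: the window is shrinking, nothing enters any more.
    for (; x < width; x++) { dst[x] = sum; sum -= src[x - left]; }
    return dst;
}
```

The interior loop carries most of the iterations for typical radii, so removing its branches is where a change of this kind would be expected to pay off.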
* move reinterpret_cast into SK_PREFETCH | mtklein | 2015-10-28

    no public API changes

    TBR=reed@google.com
    BUG=skia:

    Review URL: https://codereview.chromium.org/1419573011
* Refactor SkBlurImageFilter_Opts.h. | senorblanco | 2015-10-27

    Refactor box_blur() into a single driver function which SSE*, NEON and generic code paths can use. I've used macros to do this in order to keep debug performance reasonable, but it's fairly ugly. I'm open to other suggestions.

    BUG=skia:
    CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot

    Review URL: https://codereview.chromium.org/1408003007
* Port morphology to SkOpts. | mtklein | 2015-08-04

    Nothing too fancy. Direction enums become enum classes so they don't get all confused. An alternative is to create one single Direction enum that both blur and morphology opts use.

    BUG=skia:4117

    Review URL: https://codereview.chromium.org/1267343004
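Why enum classes keep the two direction types from being confused can be shown with a short sketch. The enum and function names here (`BlurDirection`, `MorphDirection`, `blur_step`, `morph_step`) are hypothetical stand-ins, not necessarily the names used in SkOpts. Scoped enumerations do not convert implicitly to each other or to `int`, so passing a blur direction to a morphology routine fails to compile even though both enums have identically named enumerators.

```cpp
#include <cassert>
#include <type_traits>

// Two scoped enums with identically named enumerators. The names live in
// separate scopes, so they never collide.
enum class BlurDirection  { kX, kY };
enum class MorphDirection { kX, kY };

// Step, in pixels, between consecutive samples along the given direction.
static int blur_step(BlurDirection dir, int rowPixels) {
    assert(rowPixels > 0);
    return dir == BlurDirection::kX ? 1 : rowPixels;
}

static int morph_step(MorphDirection dir, int rowPixels) {
    assert(rowPixels > 0);
    return dir == MorphDirection::kX ? 1 : rowPixels;
}

// The mix-ups the commit message worries about are rejected at compile time.
static_assert(!std::is_convertible<BlurDirection, MorphDirection>::value,
              "a blur direction cannot be passed where a morphology direction is expected");
static_assert(!std::is_convertible<BlurDirection, int>::value,
              "scoped enums do not decay to int");
```

A call such as `morph_step(BlurDirection::kX, 64)` would therefore be a compile error, whereas two unscoped enums that both declared `kX` and `kY` in the same scope would conflict.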
* Port SkBlurImage opts to SkOpts. | mtklein | 2015-08-04

    +268 -535 lines

    I also rearranged the code a little bit to encapsulate itself better, mostly replacing static helper functions with lambdas. This also let me merge the SSE2 and SSE4.1 code paths.

    BUG=skia:4117

    Review URL: https://codereview.chromium.org/1264103004