aboutsummaryrefslogtreecommitdiffhomepage
path: root/tests/SkNxTest.cpp
Commit message (Collapse)AuthorAge
* Implement Sk2f::Store4Gravatar Chris Dalton2018-02-07
| | | | | | | | Bug: skia: Change-Id: I2adb983d68625d327e7c00e53b6ae4703b46252f Reviewed-on: https://skia-review.googlesource.com/104761 Commit-Queue: Chris Dalton <csmartdalton@google.com> Reviewed-by: Mike Klein <mtklein@chromium.org>
* Add Store3 to Sk2fGravatar Chris Dalton2017-12-01
| | | | | | | | Bug: skia: Change-Id: I0377e6a1dd8259e944f7902a5c68af524fa588c7 Reviewed-on: https://skia-review.googlesource.com/79382 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Chris Dalton <csmartdalton@google.com>
* add Load2() to Sk4fGravatar Mike Klein2017-12-01
| | | | | | | | | and test it. Change-Id: Ib0c2cf93c63d8d3c36a7d4d60bbec4ecede29bc7 Reviewed-on: https://skia-review.googlesource.com/78480 Reviewed-by: Chris Dalton <csmartdalton@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Add mulHi to SkNxGravatar Herb Derby2017-10-11
| | | | | | | | | | | | | Add mulHi to base SkNx, and specialize implementations for Sk4u for neon and sse. Add casts for converting from uint8_t by 4 to uint32_t by 4. Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Change-Id: I29a32e2ad9812a47fff841ceca334e562362836f Reviewed-on: https://skia-review.googlesource.com/57960 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Herb Derby <herb@google.com>
* Add missing methods to neon/sse SkNx implementationsGravatar Chris Dalton2017-08-28
| | | | | | | | | | Adds negate, abs, sqrt to Sk2f and/or Sk4f. Bug: skia: Change-Id: I0688dae45b32ff94abcc0525ef1f09d666f9c6e9 Reviewed-on: https://skia-review.googlesource.com/39642 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Chris Dalton <csmartdalton@google.com>
* Implement Sk4i's abs, min, maxGravatar Yuqian Li2017-07-12
| | | | | | | | | CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD Bug: skia: Change-Id: Ia9ec3f72095e1c744f88df7bb990d99e0f87d578 Reviewed-on: https://skia-review.googlesource.com/22720 Commit-Queue: Yuqian Li <liyuqian@google.com> Reviewed-by: Herb Derby <herb@google.com>
* Make load4 and store4 part of SkNx properly.Gravatar Mike Klein2016-10-06
| | | | | | | | | | | | | | | Every type now nominally has Load4() and Store4() methods. The ones that we use are implemented. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=3046 CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Change-Id: I7984f0c2063ef8acbc322bd2e968f8f7eaa0d8fd Reviewed-on: https://skia-review.googlesource.com/3046 Reviewed-by: Matt Sarett <msarett@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org>
* Support Float32 output from SkColorSpaceXformGravatar msarett2016-09-16
| | | | | | | | | | | | | | | * Adds Float32 support to SkColorSpaceXform * Changes API to allows clients to ask for F32, updates clients to new API * Adds Sk4f_load4 and Sk4f_store4 to SkNx * Make use of new xform in SkGr.cpp BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003 CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Committed: https://skia.googlesource.com/skia/+/43d6651111374b5d1e4ddd9030dcf079b448ec47 Review-Url: https://codereview.chromium.org/2339233003
* Revert of Support Float32 output from SkColorSpaceXform (patchset #7 ↵Gravatar msarett2016-09-16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | id:140001 of https://codereview.chromium.org/2339233003/ ) Reason for revert: Hitting an assert Original issue's description: > Support Float32 output from SkColorSpaceXform > > * Adds Float32 support to SkColorSpaceXform > * Changes API to allows clients to ask for F32, updates clients to > new API > * Adds Sk4f_load4 and Sk4f_store4 to SkNx > * Make use of new xform in SkGr.cpp > > BUG=skia: > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003 > CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot > > Committed: https://skia.googlesource.com/skia/+/43d6651111374b5d1e4ddd9030dcf079b448ec47 TBR=brianosman@google.com,mtklein@google.com,scroggo@google.com,mtklein@chromium.org,bsalomon@google.com # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review-Url: https://codereview.chromium.org/2347473007
* Support Float32 output from SkColorSpaceXformGravatar msarett2016-09-16
| | | | | | | | | | | | | | * Adds Float32 support to SkColorSpaceXform * Changes API to allows clients to ask for F32, updates clients to new API * Adds Sk4f_load4 and Sk4f_store4 to SkNx * Make use of new xform in SkGr.cpp BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2339233003 CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review-Url: https://codereview.chromium.org/2339233003
* Expand _01 half<->float limitation to _finite. Simplify.Gravatar mtklein2016-07-15
| | | | | | | | | | | | | | | | | | | | | | | It's become clear we need to sometimes deal with values <0 or >1. I'm not yet convinced we care about NaN or +-inf. We had some fairly clever tricks and optimizations here for NEON and SSE. I've thrown them out in favor of a single implementation. If we find the specializations mattered, we can certainly figure out how to extend them to this new range/domain. This happens to add a vectorized float -> half for ARMv7, which was missing from the _01 version. (The SSE strategy was not portable to platforms that flush denorm floats to zero.) I've tested the full float range for FloatToHalf on my desktop and a 5x. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003 CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90 Review-Url: https://codereview.chromium.org/2145663003
* Revert of Expand _01 half<->float limitation to _finite. Simplify. ↵Gravatar mtklein2016-07-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (patchset #7 id:120001 of https://codereview.chromium.org/2145663003/ ) Reason for revert: Unit tests fail on Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast Original issue's description: > Expand _01 half<->float limitation to _finite. Simplify. > > It's become clear we need to sometimes deal with values <0 or >1. > I'm not yet convinced we care about NaN or +-inf. > > We had some fairly clever tricks and optimizations here for NEON > and SSE. I've thrown them out in favor of a single implementation. > If we find the specializations mattered, we can certainly figure out > how to extend them to this new range/domain. > > This happens to add a vectorized float -> half for ARMv7, which was > missing from the _01 version. (The SSE strategy was not portable to > platforms that flush denorm floats to zero.) > > I've tested the full float range for FloatToHalf on my desktop and a 5x. > > BUG=skia: > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003 > CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-Fast-Trybot > > Committed: https://skia.googlesource.com/skia/+/3296bee70d074bb8094b3229dbe12fa016657e90 TBR=msarett@google.com,mtklein@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review-Url: https://codereview.chromium.org/2151023003
* Expand _01 half<->float limitation to _finite. Simplify.Gravatar mtklein2016-07-14
| | | | | | | | | | | | | | | | | | | | | | It's become clear we need to sometimes deal with values <0 or >1. I'm not yet convinced we care about NaN or +-inf. We had some fairly clever tricks and optimizations here for NEON and SSE. I've thrown them out in favor of a single implementation. If we find the specializations mattered, we can certainly figure out how to extend them to this new range/domain. This happens to add a vectorized float -> half for ARMv7, which was missing from the _01 version. (The SSE strategy was not portable to platforms that flush denorm floats to zero.) I've tested the full float range for FloatToHalf on my desktop and a 5x. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2145663003 CQ_INCLUDE_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review-Url: https://codereview.chromium.org/2145663003
* SkNx refreshGravatar mtklein2016-03-21
| | | | | | | | | | | | | | | | | | | | - rearrange a bit - fewer macros - hooks for all operators - add left and right scalar operator overrides - add +=, &=, <<=, etc. - add SkNx_split() and SkNx_join() - simplify the many rsqrt() and invert() options to just what we actually use This refactoring pointed out that our float <-> int NEON conversions are not specialized, so I've implemented them. It seems nice that this is an error rather than silently falling back to serial code. It's unclear to me if split/join want to be external, static methods, or non-static methods (SkNx_join(), Sk4f::Join(), x.join()). Time will tell? BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1812233003 CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1812233003
* SkNx: kth<...>() -> [...]Gravatar mtklein2016-02-21
| | | | | | | | | | Just some syntax cleanup. No real change: kth<...>() was calling [...] already. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1714363002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1714363002
* fast sk4f <-> sk4i SSE methodsGravatar mtklein2016-02-17
| | | | | | | | BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1707673002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1707673002
* Sk4f: add floor()Gravatar mtklein2016-02-09
| | | | | | | | | | | | BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685773002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Committed: https://skia.googlesource.com/skia/+/86c6c4935171a1d2d6a9ffbff37ec6dac1326614 CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus7-GPU-Tegra3-Arm7-Release-Trybot,Test-Android-GCC-Nexus9-GPU-TegraK1-Arm64-Release-Trybot Review URL: https://codereview.chromium.org/1685773002
* Revert of Sk4f: add floor() (patchset #3 id:40001 of ↵Gravatar mtklein2016-02-09
| | | | | | | | | | | | | | | | | | | | | | | | | https://codereview.chromium.org/1685773002/ ) Reason for revert: build break must be this, right? Original issue's description: > Sk4f: add floor() > > BUG=skia: > GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685773002 > CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot > > Committed: https://skia.googlesource.com/skia/+/86c6c4935171a1d2d6a9ffbff37ec6dac1326614 TBR=herb@google.com,mtklein@chromium.org # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1679343004
* Sk4f: add floor()Gravatar mtklein2016-02-09
| | | | | | | | BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1685773002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1685773002
* sknx refactoringGravatar mtklein2016-02-09
| | | | | | | | | | | | | | | | | | | | | | | | - trim unused specializations (Sk4i, Sk2d) and apis (SkNx_dup) - expand apis a little * v[0] == v.kth<0>() * SkNx_shuffle can now convert to different-sized vectors, e.g. Sk2f <-> Sk4f - remove anonymous namespace I believe it's safe to remove the anonymous namespace right now. We're worried about violating the One Definition Rule; the anonymous namespace protected us from that. In Release builds, this is mostly moot, as everything tends to inline completely. In Debug builds, violating the ODR is at worst an inconvenience, time spent trying to figure out why the bot is broken. Now that we're building with SSE2/NEON everywhere, very few bots have even a chance about getting confused by two definitions of the same type or function. Where we do compile variants depending on, e.g., SSSE3, we do so in static inline functions. These are not subject to the ODR. I plan to follow up with a tedious .kth<...>() -> [...] auto-replace. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1683543002 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1683543002
* fix float <---> uint16_t conversion, with Mike's tests.Gravatar mtklein2016-02-08
| | | | | | | | | | BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1679713002 patch from issue 1679713002 at patchset 1 (http://crrev.com/1679713002#ps1) CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1676893002
* add SkNx::abs(), for now only implemented for Sk4fGravatar mtklein2016-01-15
| | | | | | | | | | | There's no reason we couldn't implement this for all ints and floats; just don't want to land unused code. BUG=skia: GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1590843003 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1590843003
* Unify some SkNx codeGravatar mtklein2015-12-14
| | | | | | | | | | | | | | | - one base case and one N=1 case instead of two each (or three with doubles) - use SkNx_cast instead of FromBytes/toBytes - 4-at-a-time Sk4f::ToBytes becomes a special standalone Sk4f_ToBytes If I did everything right, this'll be perf- and pixel- neutral. https://gold.skia.org/search2?issue=1526523003&unt=true&query=source_type%3Dgm&master=false BUG=skia: CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1526523003
* Add SkNx_cast().Gravatar mtklein2015-11-20
| | | | | | | | | | | | | | | | | | | | SkNx_cast() can cast between any of our vector types, provided they have the same number of elements. Any types should work with the default implementation, and we can drop in specializations as needed, like the SSE and NEON Sk4f -> Sk4i I included here as an example. To make this work, I made some internal name changes: SkNi<N,T> -> SkNx<N, T> SkNf<N> -> SkNx<N, float> User aliases (Sk4f, Sk16b, etc.) stay the same. We can land this first (it's PS1) if that makes things easier. BUG=skia: CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1464623002
* prune unused SkNx featuresGravatar mtklein2015-11-09
| | | | | | | | | | | | | | | - remove float -> int conversion, keeping float -> byte - remove support for doubles I was thinking of specializing Sk8f for AVX. This will help keep the complexity down. This may cause minor diffs in radial gradients: toBytes() rounds where castTrunc() truncated. But I don't see any diffs in Gold. https://gold.skia.org/search2?issue=1411563008&unt=true&query=source_type%3Dgm&master=false BUG=skia:4117 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Review URL: https://codereview.chromium.org/1411563008
* Require Sk4f::toBytes() clampsGravatar mtklein2015-09-01
| | | | | | | | BUG=skia:4117 CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot;client.skia.android:Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Release-Trybot Review URL: https://codereview.chromium.org/1312053004
* 3-15% speedup to HardLight / Overlay xfermodes.Gravatar mtklein2015-07-14
| | | | | | | | | | | | | | | While investigating my bug (skia:4052) I saw this TODO and figured it'd make me feel better about an otherwise unsuccessful investigation. This speeds up HardLight and Overlay (same code) by about 15% with SSE, mostly by rewriting the logic from 1 cheap comparison and 2 expensive div255() calls to 2 cheap comparisons and 1 expensive div255(). NEON speeds up by a more modest ~3%. BUG=skia: Review URL: https://codereview.chromium.org/1230663005
* Add a GM that reproduces layout test failures with my new xfermode code.Gravatar mtklein2015-07-13
| | | | | | | | | | | | | | Inspired by https://storage.googleapis.com/chromium-layout-test-archives/linux_blink_rel/69169/layout-test-results/results.html I think the root cause is overflow. Also, adds tests for Sk16b::operator<(). It wasn't wrong, but it was suspect (used in all three of these xfermode implementations) and so it's best to have tests. BUG=skia: Review URL: https://codereview.chromium.org/1228393006
* Update some Sk4px APIs.Gravatar mtklein2015-06-22
| | | | | | | | | | | | | | | Mostly this is about ergonomics, making it easier to do good operations and hard / impossible to do bad ones. - SkAlpha / SkPMColor constructors become static factories. - Remove div255TruncNarrow(), rename div255RoundNarrow() to div255(). In practice we always want to round, and the narrowing to 8-bit is contextually obvious. - Rename fastMulDiv255Round() approxMulDiv255() to stress it's approximate-ness over its speed. Drop Round for the same reason as above... we should always round. - Add operator overloads so we don't have to keep throwing in seemingly-random Sk4px() or Sk4px::Wide() casts. - use operator*() for 8-bit x 8-bit -> 16-bit math. It's always what we want, and there's generally no 8x8->8 alternative. - MapFoo can take a const Func&. Don't think it makes a big difference, but nice to do. BUG=skia: Review URL: https://codereview.chromium.org/1202013002
* Thorough tests for saturatedAdd and mulDiv255Round.Gravatar mtklein2015-06-15
| | | | | | | | | | BUG=skia:3951 Committed: https://skia.googlesource.com/skia/+/ce9d11189a5924b47c3629063b72bae9d466c2c7 CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot Review URL: https://codereview.chromium.org/1184113003
* Revert of Thorough tests for saturatedAdd and mulDiv255Round. (patchset #1 ↵Gravatar mtklein2015-06-15
| | | | | | | | | | | | | | | | | | | | | | id:1 of https://codereview.chromium.org/1184113003/) Reason for revert: https://code.google.com/p/skia/issues/detail?id=3951 Original issue's description: > Thorough tests for saturatedAdd and mulDiv255Round. > > BUG=skia: > > Committed: https://skia.googlesource.com/skia/+/ce9d11189a5924b47c3629063b72bae9d466c2c7 TBR=fmalita@chromium.org,mtklein@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1177123004
* Thorough tests for saturatedAdd and mulDiv255Round.Gravatar mtklein2015-06-15
| | | | | | BUG=skia: Review URL: https://codereview.chromium.org/1184113003
* Remove overly-promiscuous SkNx syntax sugar.Gravatar mtklein2015-06-10
| | | | | | | | | | | | I haven't figured out a pithy way to have these apply to only classes originating from SkNx, so let's just remove them. There aren't too many use cases, and it's not really any less readable without them. Semantically, this is a no-op. BUG=skia: Review URL: https://codereview.chromium.org/1167153002
* add Min to SkNi, specialized for u8 and u16 on SSE and NEONGravatar mtklein2015-05-14
| | | | | | | | | | | 0x8001 / 0x7fff don't seem to work, but we were close: 0x8000 does. I plan to use this to implement the Difference xfermode, and it seems generally handy. BUG=skia: Review URL: https://codereview.chromium.org/1133933004
* Split rsqrt into rsqrt{0,1,2}, with increasing cost and precision on ARMGravatar mtklein2015-04-27
| | | | | | | | | | | | This is a logical no-op. Everything was using the equivalent of rsqrt1() before, and is now after. BUG=skia: Committed: https://skia.googlesource.com/skia/+/9de16283fdc8cc0d31a84f503578d0ecea4e8297 CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu-GCC-Arm64-Debug-Android-Trybot Review URL: https://codereview.chromium.org/1109913002
* Revert of Split rsqrt into rsqrt{0,1,2}, with increasing cost and precision ↵Gravatar mtklein2015-04-27
| | | | | | | | | | | | | | | | | | | | | | | | on ARM (patchset #2 id:20001 of https://codereview.chromium.org/1109913002/) Reason for revert: arm64 typos Original issue's description: > Split rsqrt into rsqrt{0,1,2}, with increasing cost and precision on ARM > > This is a logical no-op. Everything was using the equivalent of rsqrt1() before, and is now after. > > BUG=skia: > > Committed: https://skia.googlesource.com/skia/+/9de16283fdc8cc0d31a84f503578d0ecea4e8297 TBR=reed@google.com,mtklein@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1105233003
* Split rsqrt into rsqrt{0,1,2}, with increasing cost and precision on ARMGravatar mtklein2015-04-27
| | | | | | | | This is a logical no-op. Everything was using the equivalent of rsqrt1() before, and is now after. BUG=skia: Review URL: https://codereview.chromium.org/1109913002
* Mike's radial gradient CL with better float -> int.Gravatar mtklein2015-04-27
| | | | | | | | | | | | | | | | | | patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001) This looks quite launchable. radial_gradient3, min of 100 samples: N5: 985µs -> 946µs MBP: 395µs -> 279µs On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table? BUG=skia: CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Mac10.8-Clang-Arm7-Debug-Android-Trybot,Build-Ubuntu-GCC-Arm7-Release-Android_NoNeon-Trybot Committed: https://skia.googlesource.com/skia/+/abf6c5cf95e921fae59efb487480e5b5081cf0ec Review URL: https://codereview.chromium.org/1109643002
* Revert of Mike's radial gradient CL with better float -> int. (patchset #7 ↵Gravatar mtklein2015-04-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | id:120001 of https://codereview.chromium.org/1109643002/) Reason for revert: compile failures. Original issue's description: > Mike's radial gradient CL with better float -> int. > > patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001) > > This looks quite launchable. radial_gradient3, min of 100 samples: > N5: 985µs -> 946µs > MBP: 395µs -> 279µs > > On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table? > > BUG=skia: > > CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot > > Committed: https://skia.googlesource.com/skia/+/abf6c5cf95e921fae59efb487480e5b5081cf0ec TBR=reed@google.com,robertphillips@google.com,mtklein@chromium.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG=skia: Review URL: https://codereview.chromium.org/1109883003
* Mike's radial gradient CL with better float -> int.Gravatar mtklein2015-04-27
| | | | | | | | | | | | | | | | patch from issue 1072303005 at patchset 40001 (http://crrev.com/1072303005#ps40001) This looks quite launchable. radial_gradient3, min of 100 samples: N5: 985µs -> 946µs MBP: 395µs -> 279µs On my MBP, most of the meat looks like it's now in reading the cache and writing to dst one color at a time. Is that something we could do in float math rather than with a lookup table? BUG=skia: CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Debug-Trybot,Test-Android-GCC-Nexus9-CPU-Denver-Arm64-Debug-Trybot Review URL: https://codereview.chromium.org/1109643002
* Sk4h and Sk8h for SSEGravatar mtklein2015-04-14
| | | | | | | | | | These will underly the SkPMFloat-like class for uint16_t components. Sk4h will back a single-pixel version, and Sk8h any larger number than that. BUG=skia: Review URL: https://codereview.chromium.org/1088883005
* Use switch operator[](int) to kth<int>() so we can use vget_lane.Gravatar mtklein2015-04-03
| | | | | | | | | #floats BUG=skia: BUG=skia:3592 Review URL: https://codereview.chromium.org/1059743002
* Refactor Sk2x<T> + Sk4x<T> into SkNf<N,T> and SkNi<N,T>Gravatar mtklein2015-03-30
The primary feature this delivers is SkNf and SkNd for arbitrary power-of-two N. Non-specialized types or types larger than 128 bits should now Just Work (and we can drop in a specialization to make them faster). Sk4s is now just a typedef for SkNf<4, SkScalar>; Sk4d is SkNf<4, double>, Sk2f SkNf<2, float>, etc. This also makes implementing new specializations easier and more encapsulated. We're now using template specialization, which means the specialized versions don't have to leak out so much from SkNx_sse.h and SkNx_neon.h. This design leaves us room to grow up, e.g to SkNf<8, SkScalar> == Sk8s, and to grown down too, to things like SkNi<8, uint16_t> == Sk8h. To simplify things, I've stripped away most APIs (swizzles, casts, reinterpret_casts) that no one's using yet. I will happily add them back if they seem useful. You shouldn't feel bad about using any of the typedef Sk4s, Sk4f, Sk4d, Sk2s, Sk2f, Sk2d, Sk4i, etc. Here's how you should feel: - Sk4f, Sk4s, Sk2d: feel awesome - Sk2f, Sk2s, Sk4d: feel pretty good No public API changes. TBR=reed@google.com BUG=skia:3592 Review URL: https://codereview.chromium.org/1048593002