path: root/src/jumper/SkJumper_misc.h
Commit message (Author, Date)
* fold SkJumper_vectors.h into SkJumper_stages.cpp (Mike Klein, 2018-01-01)
  This brings a little more symmetry to _stages.cpp and _stages_lowp.cpp.

  Change-Id: Icfcbd3f264ab97d8445ad8e14c25b4a07c780aea
  Reviewed-on: https://skia-review.googlesource.com/90030
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* a little SkJumper tidy up (Mike Klein, 2017-12-12)
  I noticed these little bits while working on that old-Clang fix.

  - We can force-inline anytime we've got Clang, not just when JUMPER_IS_OFFLINE.
  - The _aarch64 and _vfp4 WRAP functions are dead code, as they're never compiled offline now.

  Change-Id: I5850daded2ffcfe50ceeadc43f89fa8597df3387
  Reviewed-on: https://skia-review.googlesource.com/84060
  Commit-Queue: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Florin Malita <fmalita@chromium.org>
  Reviewed-by: Florin Malita <fmalita@chromium.org>
* remove vfpv4 requirement for SkJumper on ARMv7 (Mike Klein, 2017-12-12)
  VFPv4 gives us two interesting features:

  - FMA
  - f16<->f32 conversions

  Even without FMAs, NEON still has non-fused MLA instructions. We don't really care about the fusedness of those mul-adds, so losing FMA here is kind of no big deal.

  We already maintain portable code to do f16<->f32 conversions, so it's not much of a maintenance hit to use that instead of the native instructions. To my knowledge software F16 rendering is not a performance critical mode of operation for any of our users.

  This drops our minimum requirement to basically just having NEON. Devices like the Nexus 7 2012 will now take SkJumper fast paths instead of portable code. (Though actually, we've only ever required NEON for _lowp... only the float code also needed vfpv4).

  The main file to look at here is actually SkJumper_vectors.h, where you will see all the substantive changes. The rest just kind of tears down most of the old complexity, and adds ABI to put just a little of it back. :)

  Change-Id: Ia9237117698729c91e5fa51126baf80748093bf4
  Bug: skia:
  Reviewed-on: https://skia-review.googlesource.com/83521
  Commit-Queue: Mike Klein <mtklein@chromium.org>
  Reviewed-by: Florin Malita <fmalita@chromium.org>
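The portable f16<->f32 path mentioned in this message can be pictured with a scalar sketch like the one below. It is not the code from SkJumper_vectors.h: the name from_half() is assumed as the counterpart of to_half(), and the sketch deliberately simplifies by flushing denormal halves to zero, truncating instead of rounding, and ignoring infinities, NaNs, and values too large for a half.

```cpp
#include <cstdint>
#include <cstring>

// Simplified scalar sketch (assumption, not Skia's implementation).
// Half layout: 1 sign bit, 5 exponent bits (bias 15), 10 mantissa bits.
static inline float from_half(std::uint16_t h) {
    std::uint32_t sem = h;              // sign, exponent, mantissa
    std::uint32_t s   = sem & 0x8000u;  // sign bit
    std::uint32_t em  = sem ^ s;        // exponent and mantissa
    if (em < 0x0400u) {
        return 0.0f;                    // zero or denormal half: flush to zero
    }
    // Move the sign to bit 31, widen the mantissa from 10 to 23 bits,
    // and rebias the exponent from 15 to 127.
    std::uint32_t bits = (s << 16) + (em << 13) + ((127u - 15u) << 23);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

static inline std::uint16_t to_half(float f) {
    std::uint32_t sem;
    std::memcpy(&sem, &f, sizeof sem);
    std::uint32_t s  = sem & 0x80000000u;
    std::uint32_t em = sem ^ s;
    if (em < 0x38800000u) {
        return 0;                       // smaller than the smallest normal half
    }
    // Inverse of from_half(): narrow the mantissa (truncating) and rebias.
    return static_cast<std::uint16_t>((s >> 16) + (em >> 13) - ((127u - 15u) << 10));
}
```

Everything here is integer shifts, masks, and adds, which is why a path of this shape only needs baseline vector integer support rather than the VFPv4 conversion instructions.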
* Move context types into STAGE() macros. (Mike Klein, 2017-09-28)
  This is something I came up with while writing _lowp.cpp.

  This should all be a logical no-op, but there are some code generation changes. I'm not exactly sure why.

  Change-Id: Iaad36b5298b37fe26ebd375a147a48852f98e1e4
  Reviewed-on: https://skia-review.googlesource.com/52003
  Commit-Queue: Mike Klein <mtklein@chromium.org>
  Reviewed-by: Herb Derby <herb@google.com>
* centralize SI, Ctx, and load_and_inc() (Mike Klein, 2017-09-15)
  We've got independent definitions of SI, LazyCtx/Ctx, and load_and_inc() in _stages.cpp and _lowp.cpp. It's a good time to centralize them, taking _stages.cpp's SI and load_and_inc(), and _lowp's Ctx.

  SI and load_and_inc() are uninterestingly different. Using _lowp's Ctx will let us get its prettier typed stage definitions into _stages.cpp, but that is not done here.

  This is a pure refactor with no generated code changes.

  Change-Id: I53260b0fdc71a77bf9e3ed6f3df3a2a4cbd2392b
  Reviewed-on: https://skia-review.googlesource.com/47181
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
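For readers unfamiliar with the helpers named here, a minimal sketch of SI and load_and_inc() might look as follows. It assumes the pipeline program is a flat array of void* slots that each stage reads sequentially, and that SI is a "static inline" shorthand that force-inlines under Clang; neither detail is spelled out in the log, so treat both as assumptions.

```cpp
// Assumption: SI marks small helpers static and, with Clang, always inlined.
#if defined(__clang__)
    #define SI __attribute__((always_inline)) static inline
#else
    #define SI static inline
#endif

// Read the next pointer-sized slot of the program as a T, then advance
// the caller's cursor past it. T is expected to be a pointer type.
template <typename T>
SI T load_and_inc(void**& program) {
    return (T)*program++;
}
```

Taking the cursor by reference is the design point: each call consumes exactly one slot, so a stage that needs a context pointer and then the next stage's address can fetch them with two consecutive calls.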
* split up JUMPER define (Mike Klein, 2017-08-28)
  Whether JUMPER is defined is starting to get a little overloaded:

  - are we compiling offline (defined) or as part of Skia (!defined)?
  - are we using Clang vector extensions (defined) or scalars (!defined)?

  This splits JUMPER into these two separate concerns:

  - JUMPER_IS_OFFLINE
  - JUMPER_IS_SCALAR, JUMPER_IS_NEON, JUMPER_IS_AVX2, etc.

  The upshot is that we'll now use Clang vector extensions when available for our "portable" baseline. On x86-64 and ARMv8 compiled by Clang, we're guaranteed to pick up SSE2 and NEON respectively. Our -Fast bot should even get all the way to AVX2.

  Another CL will do some refactoring in SkJumper to remove the redundant copies of guaranteed vector code on x86-64 and ARMv8. I didn't want to do that here yet to demonstrate that there is zero effect on the .S files from this CL.

  Change-Id: Ib5e8f00b35e8721b2cc7180e294840ffaf9dddce
  Reviewed-on: https://skia-review.googlesource.com/39500
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
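The second concern, picking an instruction set independently of whether the build is offline, could be expressed with a preprocessor ladder along these lines. JUMPER_IS_SCALAR, JUMPER_IS_NEON, and JUMPER_IS_AVX2 are the names from the message; JUMPER_IS_SSE2, the ordering of the checks, and the jumper_backend() helper are assumptions added for illustration.

```cpp
// Sketch only: choose one JUMPER_IS_* flavor from compiler-provided macros.
#if !defined(__clang__)
    #define JUMPER_IS_SCALAR 1     // no Clang vector extensions available
#elif defined(__ARM_NEON)
    #define JUMPER_IS_NEON 1
#elif defined(__AVX2__)
    #define JUMPER_IS_AVX2 1
#elif defined(__SSE2__)
    #define JUMPER_IS_SSE2 1       // hypothetical name, covered by "etc." in the message
#else
    #define JUMPER_IS_SCALAR 1
#endif

// Hypothetical helper that reports which flavor was selected.
static inline const char* jumper_backend() {
#if defined(JUMPER_IS_NEON)
    return "neon";
#elif defined(JUMPER_IS_AVX2)
    return "avx2";
#elif defined(JUMPER_IS_SSE2)
    return "sse2";
#else
    return "scalar";
#endif
}
```

Because x86-64 always provides SSE2 and ARMv8 always provides NEON, a ladder like this only falls back to scalar code under Clang on targets without a guaranteed vector unit.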
* move load_and_inc() and LazyCtx into SkJumper_misc.h (Mike Klein, 2017-06-02)
  This is a no-op refactor that'll help keep the interesting diff more focused in the lowp CL. The lowp stages will use these unaltered, so SkJumper_misc.h is a good place for them.

  Change-Id: I7fb6327ade29ac884194517d94ac4303ed1079e0
  Reviewed-on: https://skia-review.googlesource.com/18484
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* add unaligned_store() (Mike Klein, 2017-05-15)
  Don't know why I never wrote unaligned_store() to mirror unaligned_load(), but now I have. This replaces all remaining memcpy() in SkJumper_stages.cpp, which is nice.

  The generated stage code didn't change.

  Change-Id: I714c1072a975d7fa268a4b06c13f06557bf0c12c
  Reviewed-on: https://skia-review.googlesource.com/16870
  Reviewed-by: Mike Reed <reed@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
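Since the message says unaligned_store() replaced the remaining memcpy() calls, a plausible shape for the pair is a thin typed wrapper around memcpy(). The signatures below are an assumption, not copied from SkJumper_misc.h.

```cpp
#include <cstdint>
#include <cstring>

// Read a T from memory that may not be aligned for T.
template <typename T, typename P>
static inline T unaligned_load(const P* p) {
    T v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Write a T to memory that may not be aligned for T.
template <typename T, typename P>
static inline void unaligned_store(P* p, T v) {
    std::memcpy(p, &v, sizeof(v));
}
```

Compilers typically lower a fixed-size memcpy() like this to a single unaligned move, which is consistent with the note that the generated stage code didn't change.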
* Add evenly spaced stops and unify gradient contexts (Herb Derby, 2017-05-15)
  Change-Id: I17ac13b9d1ea6765e2c1a2b53aa6975eab408856
  Reviewed-on: https://skia-review.googlesource.com/16713
  Commit-Queue: Herb Derby <herb@google.com>
  Reviewed-by: Mike Klein <mtklein@chromium.org>
* finish up constants (Mike Klein, 2017-05-01)
  For whatever reason, if I swap the condition in the if_then_else tests from < to >= and swap the then/else values, I can use constants in hsl_to_rgb. Still don't understand why, but I'll take it. I suspect it has something to do with SSE, IEEE, and NaN, but I don't care enough to speculate any more concretely.

  This does that, removes C() and _f, updates some comments, and adds a guard in build_stages.py to yell if it sees trouble like LCPI40_4...

  This reminds me to try -ffast-math soon. I think that was mostly held back by constants.

  Change-Id: I3f8a37a4d4642f77422ce3261b750061e9e604a3
  Reviewed-on: https://skia-review.googlesource.com/14942
  Reviewed-by: Herb Derby <herb@google.com>
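One concrete way the two forms of the test can differ is in how they treat NaN, which fits the author's suspicion without confirming it as the cause. The scalar sketch below is an illustration only: the real if_then_else() works on vectors and masks, and pick_lt() and pick_ge_swapped() are hypothetical names.

```cpp
#include <limits>

// Scalar stand-in for the vector select used by the stages.
static inline float if_then_else(bool c, float t, float e) { return c ? t : e; }

// Original form: choose t when x < y.
static inline float pick_lt(float x, float y, float t, float e) {
    return if_then_else(x < y, t, e);
}

// Rewritten form: flip the comparison and swap the then/else values.
static inline float pick_ge_swapped(float x, float y, float t, float e) {
    return if_then_else(x >= y, e, t);
}
```

For ordered inputs the two functions always agree. When x or y is NaN, both comparisons are false under IEEE 754, so pick_lt() returns e while pick_ge_swapped() returns t. That means a compiler that honors IEEE semantics cannot treat the two as interchangeable and may generate different code for each.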
* jumper, remove C(int) (Mike Klein, 2017-04-27)
  This finishes off integer constants... they should all be normal now.

  CQ_INCLUDE_TRYBOTS=skia.primary:Test-Win7-MSVC-Golo-CPU-AVX-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE41,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE2

  Change-Id: I66ecc6533807fc59bb5ac9d3c5f7ab9e6e1f0d7f
  Reviewed-on: https://skia-review.googlesource.com/14528
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* jumper, replace _i with normal constants (Mike Klein, 2017-04-27)
  So far I only seem to be encountering constant pools with float constants, so integer constants should be easy to make normal.

  This just removes _i. There might be a couple integer constants generated with C() too... they'll be the next CL.

  CQ_INCLUDE_TRYBOTS=skia.primary:Test-Win7-MSVC-Golo-CPU-AVX-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE41,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE2

  Change-Id: Icc82cbc660d1e33bcdb5282072fb86cb5190d901
  Reviewed-on: https://skia-review.googlesource.com/14527
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
* jumper, split store_f16 into to_half, store4 (Mike Klein, 2017-04-04)
  Pretty much the same deal as the last CL going the other direction: split store_f16 into to_half() and store4(). Platforms that had fused strategies here get a little less optimal, but the code's easier to follow, maintain, and reuse.

  Also adds widen_cast() to encapsulate the fairly common pattern of expanding one of our logical vector types (e.g. 8-byte U16) up to the width of the physical vector type (e.g. 16-byte __m128i). This operation is deeply understood by Clang, and often is a no-op. I could make bit_cast() do this, but it seems clearer to have two names.

  Change-Id: I7ba5bb4746acfcaa6d486379f67e07baee3820b2
  Reviewed-on: https://skia-review.googlesource.com/11204
  Reviewed-by: Herb Derby <herb@google.com>
  Commit-Queue: Mike Klein <mtklein@chromium.org>
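The distinction between the two names can be sketched as below: bit_cast() requires equal sizes, while widen_cast() copies a smaller value into the low bytes of a larger one. The struct types stand in for the real vector types, the sketch namespace is only there to avoid clashing with other definitions, and zeroing the upper bytes is a choice made here for determinism; the message does not say what the real helper leaves in them.

```cpp
#include <cstdint>
#include <cstring>

namespace sketch {

// Hypothetical stand-ins: an 8-byte logical vector and a 16-byte physical one.
struct U16x4  { std::uint16_t v[4]; };
struct Wide16 { std::uint16_t v[8]; };

// Reinterpret the bits of a value as another type of exactly the same size.
template <typename Dst, typename Src>
static inline Dst bit_cast(const Src& src) {
    static_assert(sizeof(Dst) == sizeof(Src), "bit_cast needs equal sizes");
    Dst dst;
    std::memcpy(&dst, &src, sizeof(Dst));
    return dst;
}

// Place a smaller value in the low bytes of a strictly larger type.
template <typename Dst, typename Src>
static inline Dst widen_cast(const Src& src) {
    static_assert(sizeof(Dst) > sizeof(Src), "widen_cast needs a larger Dst");
    Dst dst{};                               // upper bytes zeroed in this sketch
    std::memcpy(&dst, &src, sizeof(Src));
    return dst;
}

}  // namespace sketch
```

Keeping the size checks in separate functions means a mismatched size is a compile error at the call site rather than a silent truncation or a partially filled value.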
* Refactor and recomment SkJumper_stages.cpp. (Mike Klein, 2017-04-03)
  SkJumper_stages.cpp is starting to get unwieldy. This spins some logical parts out into their own headers. I will follow up by moving more of the very specific f16/f32 load/store logic into SkJumper_vectors.h too.

  Change-Id: I2a3a055e9d1b65f56983d05649270772a4c69f31
  Reviewed-on: https://skia-review.googlesource.com/11133
  Reviewed-by: Mike Klein <mtklein@chromium.org>
  Commit-Queue: Mike Klein <mtklein@chromium.org>