| Commit message (Collapse) | Author | Age |
| |
This brings a little more symmetry to _stages.cpp and _stages_lowp.cpp.
Change-Id: Icfcbd3f264ab97d8445ad8e14c25b4a07c780aea
Reviewed-on: https://skia-review.googlesource.com/90030
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
I noticed these little bits while working on that old-Clang fix.
- We can force-inline anytime we've got Clang,
not just when JUMPER_IS_OFFLINE.
- The _aarch64 and _vfp4 WRAP functions are dead code,
as they're never compiled offline now.
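The first point could look roughly like the macro below. This is a minimal sketch, assuming the helper macro is named SI (a name used elsewhere in this log) and that non-Clang compilers simply fall back to `static inline`; add_one() is a hypothetical function used only for illustration.

```cpp
#include <cassert>

// Force inlining whenever Clang is the compiler, whether or not the
// code is being compiled offline. Other compilers get a plain hint.
#if defined(__clang__)
    #define SI __attribute__((always_inline)) static inline
#else
    #define SI static inline
#endif

// Hypothetical helper, only here to show the macro in use.
SI int add_one(int x) { return x + 1; }

// Small self-check showing the helper behaves like an ordinary function.
static void self_check() { assert(add_one(41) == 42); }
```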
Change-Id: I5850daded2ffcfe50ceeadc43f89fa8597df3387
Reviewed-on: https://skia-review.googlesource.com/84060
Commit-Queue: Mike Klein <mtklein@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
VFPv4 gives us two interesting features:
- FMA
- f16<->f32 conversions
Even without FMAs, NEON still has non-fused MLA instructions. We don't
really care about the fusedness of those mul-adds, so losing FMA here is
kind of no big deal.
We already maintain portable code to do f16<->f32 conversions, so it's
not much of a maintenance hit to use that instead of the native
instructions. To my knowledge software F16 rendering is not a
performance critical mode of operation for any of our users.
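For readers unfamiliar with what a portable f16<->f32 conversion involves, here is a minimal scalar sketch. It is not the code this commit refers to: the function names, the truncating rounding, and the decision to flush half denormals to zero are all assumptions made for illustration, and infinities, NaNs, and out-of-range inputs are deliberately not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: half bits -> float, using only integer math.
// Assumes the input is zero, denormal (flushed to zero), or a normal half.
static float from_half(uint16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000) << 16;  // move sign to bit 31
    uint32_t em   = h & 0x7fff;                    // exponent and mantissa
    uint32_t bits = sign;
    if (em >= 0x0400) {                            // normal half value
        // Widen the mantissa by 13 bits and rebias the exponent (15 -> 127).
        bits |= (em << 13) + ((127u - 15u) << 23);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical helper: float -> half bits, truncating extra mantissa bits.
// Assumes the input magnitude fits in half's normal range or is tiny.
static uint16_t to_half(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    uint32_t sign = (bits >> 16) & 0x8000;         // move sign to bit 15
    uint32_t em   = bits & 0x7fffffff;
    if (em < 0x38800000) {                         // below 2^-14: flush to zero
        return (uint16_t)sign;
    }
    // Narrow the mantissa by 13 bits and rebias the exponent (127 -> 15).
    return (uint16_t)(sign | ((em >> 13) - ((127u - 15u) << 10)));
}

// Round-trips a few exactly representable values.
static bool roundtrip_ok() {
    const float values[] = {0.0f, 0.5f, 1.0f, -2.0f};
    for (float v : values) {
        if (from_half(to_half(v)) != v) return false;
    }
    return true;
}
```

The native VFPv4 instructions do this in one step; the trade-off described above is a few extra integer operations per conversion in exchange for a lower hardware requirement.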
This drops our minimum requirement to basically just having NEON.
Devices like the Nexus 7 2012 will now take SkJumper fast paths
instead of portable code. (Though actually, we've only ever
required NEON for _lowp... only the float code also needed vfpv4).
The main file to look at here is actually SkJumper_vectors.h,
where you will see all the substantive changes. The rest just
kind of tears down most of the old complexity, and adds ABI
to put just a little of it back. :)
Change-Id: Ia9237117698729c91e5fa51126baf80748093bf4
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/83521
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
This is something I came up with while writing _lowp.cpp.
This should all be a logical no-op, but there are some code generation
changes. I'm not exactly sure why.
Change-Id: Iaad36b5298b37fe26ebd375a147a48852f98e1e4
Reviewed-on: https://skia-review.googlesource.com/52003
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
| |
We've got independent definitions of SI, LazyCtx/Ctx, and load_and_inc()
in _stages.cpp and _lowp.cpp. It's a good time to centralize them,
taking _stages.cpp's SI and load_and_inc(), and _lowp's Ctx.
SI and load_and_inc() are uninterestingly different. Using _lowp's
Ctx will let us get its prettier typed stage definitions into
_stages.cpp, but that is not done here.
This is a pure refactor with no generated code changes.
Change-Id: I53260b0fdc71a77bf9e3ed6f3df3a2a4cbd2392b
Reviewed-on: https://skia-review.googlesource.com/47181
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Whether JUMPER is defined is starting to get a little overloaded:
- are we compiling offline (defined) or as part of Skia (!defined)?
- are we using Clang vector extensions (defined) or scalars (!defined)?
This splits JUMPER into these two separate concerns:
- JUMPER_IS_OFFLINE
- JUMPER_IS_SCALAR, JUMPER_IS_NEON, JUMPER_IS_AVX2, etc.
The upshot is that we'll now use Clang vector extensions when available
for our "portable" baseline. On x86-64 and ARMv8 compiled by Clang,
we're guaranteed to pick up SSE2 and NEON respectively. Our -Fast
bot should even get all the way to AVX2.
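The second concern could be expressed with a preprocessor ladder like the one below. This is a sketch under stated assumptions: only the names JUMPER_IS_SCALAR, JUMPER_IS_NEON, and JUMPER_IS_AVX2 come from this message; JUMPER_IS_SSE2, the order of the checks, and jumper_backend() are hypothetical. JUMPER_IS_OFFLINE would be defined separately by the offline build and is not touched here.

```cpp
#include <cassert>
#include <cstring>

// Pick exactly one instruction-set flavor. Without Clang's vector
// extensions we fall back to scalar code; with them, we use whatever
// the target is guaranteed to support.
#if !defined(__clang__)
    #define JUMPER_IS_SCALAR
#elif defined(__ARM_NEON)
    #define JUMPER_IS_NEON
#elif defined(__AVX2__)
    #define JUMPER_IS_AVX2
#elif defined(__SSE2__)
    #define JUMPER_IS_SSE2   // assumed name, not taken from the commit message
#else
    #define JUMPER_IS_SCALAR
#endif

// Hypothetical helper that reports which flavor was selected.
static const char* jumper_backend() {
#if defined(JUMPER_IS_NEON)
    return "neon";
#elif defined(JUMPER_IS_AVX2)
    return "avx2";
#elif defined(JUMPER_IS_SSE2)
    return "sse2";
#else
    return "scalar";
#endif
}

// Exactly one branch above is compiled in, so the name is never empty.
static void self_check() { assert(std::strlen(jumper_backend()) > 0); }
```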
Another CL will do some refactoring in SkJumper to remove the redundant
copies of guaranteed vector code on x86-64 and ARMv8. I didn't want to
do that here yet to demonstrate that there is zero effect on the .S
files from this CL.
Change-Id: Ib5e8f00b35e8721b2cc7180e294840ffaf9dddce
Reviewed-on: https://skia-review.googlesource.com/39500
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
This is a no-op refactor that'll help keep the interesting diff more
focused in the lowp CL. The lowp stages will use these unaltered, so
SkJumper_misc.h is a good place for them.
Change-Id: I7fb6327ade29ac884194517d94ac4303ed1079e0
Reviewed-on: https://skia-review.googlesource.com/18484
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Don't know why I never wrote unaligned_store() to mirror
unaligned_load(), but now I have. This replaces all
remaining memcpy() in SkJumper_stages.cpp, which is nice.
The generated stage code didn't change.
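A pair of helpers like this is commonly written as thin wrappers over memcpy(). The sketch below assumes that shape; the exact signatures are a guess, and roundtrip_at_odd_offset() is a hypothetical check added for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a T from a pointer that may not be aligned for T.
template <typename T, typename P>
static inline T unaligned_load(const P* p) {
    T v;
    std::memcpy(&v, p, sizeof(v));
    return v;
}

// Mirror image: write a T to a pointer that may not be aligned for T.
template <typename T, typename P>
static inline void unaligned_store(P* p, T v) {
    std::memcpy(p, &v, sizeof(v));
}

// Hypothetical check: store and reload a 32-bit value at an odd offset,
// and confirm the neighboring bytes are left alone.
static bool roundtrip_at_odd_offset() {
    unsigned char buf[8] = {0};
    unaligned_store(buf + 1, uint32_t{0xDEADBEEFu});
    return unaligned_load<uint32_t>(buf + 1) == 0xDEADBEEFu
        && buf[0] == 0
        && buf[5] == 0;
}
```

Compilers typically turn a fixed-size memcpy() like this into a single load or store, which is consistent with the generated code not changing.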
Change-Id: I714c1072a975d7fa268a4b06c13f06557bf0c12c
Reviewed-on: https://skia-review.googlesource.com/16870
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Change-Id: I17ac13b9d1ea6765e2c1a2b53aa6975eab408856
Reviewed-on: https://skia-review.googlesource.com/16713
Commit-Queue: Herb Derby <herb@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
For whatever reason, if I swap the condition in the if_then_else tests
from < to >= and swap the then/else values, I can use constants in
hsl_to_rgb. Still don't understand why, but I'll take it. I suspect it
has something to do with SSE, IEEE, and NaN, but I don't care enough to
speculate any more concretely.
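For reference, the two forms of the test agree for every ordered pair of inputs and differ only when an operand is NaN. The scalar sketch below illustrates that fact; it does not claim to explain the code generation difference, and the signature of if_then_else() and the pick_* helpers are assumptions made for illustration.

```cpp
#include <cassert>
#include <limits>

// Scalar stand-in for a vector select: returns t where c is true, else e.
static float if_then_else(bool c, float t, float e) { return c ? t : e; }

// Original form: test with <.
static float pick_lt(float x, float y, float a, float b) {
    return if_then_else(x < y, a, b);
}

// Swapped form: test with >= and swap the then/else values.
static float pick_ge(float x, float y, float a, float b) {
    return if_then_else(x >= y, b, a);
}

// Hypothetical check: the two forms agree for ordered inputs, but pick
// opposite values for NaN, because both x < y and x >= y are then false.
static bool forms_differ_only_on_nan() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    return pick_lt(1.0f, 2.0f, 10.0f, 20.0f) == pick_ge(1.0f, 2.0f, 10.0f, 20.0f)
        && pick_lt(3.0f, 2.0f, 10.0f, 20.0f) == pick_ge(3.0f, 2.0f, 10.0f, 20.0f)
        && pick_lt(nan, 2.0f, 10.0f, 20.0f) == 20.0f
        && pick_ge(nan, 2.0f, 10.0f, 20.0f) == 10.0f;
}
```

Note that this check relies on IEEE NaN semantics, so it is not meaningful when compiled with -ffast-math.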
This does that, removes C() and _f, updates some comments, and adds a
guard in build_stages.py to yell if it sees trouble like LCPI40_4...
This reminds me to try -ffast-math soon. I think that was mostly held
back by constants.
Change-Id: I3f8a37a4d4642f77422ce3261b750061e9e604a3
Reviewed-on: https://skia-review.googlesource.com/14942
Reviewed-by: Herb Derby <herb@google.com>
| |
This finishes off integer constants... they should all be normal now.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Win7-MSVC-Golo-CPU-AVX-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE41,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE2
Change-Id: I66ecc6533807fc59bb5ac9d3c5f7ab9e6e1f0d7f
Reviewed-on: https://skia-review.googlesource.com/14528
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
So far I only seem to be encountering constant pools with float
constants, so integer constants should be easy to make normal.
This just removes _i. There might be a couple integer constants
generated with C() too... they'll be the next CL.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Win7-MSVC-Golo-CPU-AVX-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE41,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Release-SK_CPU_LIMIT_SSE2
Change-Id: Icc82cbc660d1e33bcdb5282072fb86cb5190d901
Reviewed-on: https://skia-review.googlesource.com/14527
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Pretty much the same deal as the last CL going the other direction:
split store_f16 into to_half() and store4(). Platforms that had fused
strategies here get a little less optimal, but the code's easier to
follow, maintain, and reuse.
Also adds widen_cast() to encapsulate the fairly common pattern of
expanding one of our logical vector types (e.g. 8-byte U16) up to the
width of the physical vector type (e.g. 16-byte __m128i). This
operation is deeply understood by Clang, and often is a no-op.
I could make bit_cast() do this, but it seems clearer to have two names.
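A widening cast of this kind can be sketched with memcpy(). The version below is an illustration under assumptions, not the code from this CL: it uses plain structs (the hypothetical U16x4 and U16x8) in place of real vector types so it compiles anywhere, and it zero-fills the upper bytes to keep the behavior fully defined, whereas a real implementation might leave them unspecified.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Copy a narrow value into the low bytes of a wider one.
template <typename Dst, typename Src>
static inline Dst widen_cast(const Src& src) {
    static_assert(sizeof(Dst) > sizeof(Src), "widen_cast only grows");
    Dst dst{};                               // upper bytes zeroed in this sketch
    std::memcpy(&dst, &src, sizeof(Src));
    return dst;
}

// Hypothetical stand-ins for an 8-byte logical vector and a 16-byte
// physical vector such as __m128i.
struct U16x4 { uint16_t v[4]; };
struct U16x8 { uint16_t v[8]; };

// Hypothetical check: the low four lanes survive the cast unchanged.
static bool low_lanes_preserved() {
    U16x4 narrow = {{1, 2, 3, 4}};
    U16x8 wide = widen_cast<U16x8>(narrow);
    return wide.v[0] == 1 && wide.v[1] == 2
        && wide.v[2] == 3 && wide.v[3] == 4;
}
```

Keeping this separate from bit_cast() lets bit_cast() insist that source and destination sizes match, which is one way to read the "two names" choice above.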
Change-Id: I7ba5bb4746acfcaa6d486379f67e07baee3820b2
Reviewed-on: https://skia-review.googlesource.com/11204
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
|
SkJumper_stages.cpp is starting to get unwieldy.
This spins some logical parts out into their own headers.
I will follow up by moving more of the very specific
f16/f32 load/store logic into SkJumper_vectors.h too.
Change-Id: I2a3a055e9d1b65f56983d05649270772a4c69f31
Reviewed-on: https://skia-review.googlesource.com/11133
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
|