| Commit message | Author | Age |
| |
This is a reland of 339133f82c30cd3080672db28e6f72c894cba05a
Original change's description:
> start cleaning up non-skcms SkColorSpaceXforms
>
> I think this gets rid of
> - SkColorSpaceXform_Base
> - SkColorSpaceXform_XYZ
> - SkColorSpaceXform_A2B
> and lots of support code. Might be more left to clean up?
>
> Change-Id: I560d974d1e879dfd6a63ee2244a3dd88bd495c8a
> Reviewed-on: https://skia-review.googlesource.com/129512
> Commit-Queue: Brian Osman <brianosman@google.com>
> Auto-Submit: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Brian Osman <brianosman@google.com>
Change-Id: I33ee0d8bcfd72c401823a2e7d5168c9ecc9a5181
Reviewed-on: https://skia-review.googlesource.com/129624
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
| |
This reverts commit 339133f82c30cd3080672db28e6f72c894cba05a.
Reason for revert: broke NinePatchDrawableTest.testGetPadding? stranger things have happened.
Original change's description:
> start cleaning up non-skcms SkColorSpaceXforms
>
> I think this gets rid of
> - SkColorSpaceXform_Base
> - SkColorSpaceXform_XYZ
> - SkColorSpaceXform_A2B
> and lots of support code. Might be more left to clean up?
>
> Change-Id: I560d974d1e879dfd6a63ee2244a3dd88bd495c8a
> Reviewed-on: https://skia-review.googlesource.com/129512
> Commit-Queue: Brian Osman <brianosman@google.com>
> Auto-Submit: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Brian Osman <brianosman@google.com>
TBR=mtklein@chromium.org,brianosman@google.com
Change-Id: I9e76195481b8658b34936aeece278d81c286c0fa
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/129680
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
| |
I think this gets rid of
- SkColorSpaceXform_Base
- SkColorSpaceXform_XYZ
- SkColorSpaceXform_A2B
and lots of support code. Might be more left to clean up?
Change-Id: I560d974d1e879dfd6a63ee2244a3dd88bd495c8a
Reviewed-on: https://skia-review.googlesource.com/129512
Commit-Queue: Brian Osman <brianosman@google.com>
Auto-Submit: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
| |
Docs-Preview: https://skia.org/?cl=112204
Bug: skia:
Change-Id: I10042a0200db00bd8ff8078467c409b1cf191f50
Reviewed-on: https://skia-review.googlesource.com/112204
Commit-Queue: Ethan Nicholas <ethannicholas@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
This is a reland of 78cb579f33943421afc8423a39867fcfd69fed44
This time, lowp stages are controlled by !defined(JUMPER_IS_SCALAR), not
by defined(__clang__). The two are usually the same, except when we opt
Clang builds into JUMPER_IS_SCALAR artificially.
Some Google3 builds use compilers old enough that they barf when
compiling our NEON code. It's conceivably also possible to define
JUMPER_IS_SCALAR yourself, but I don't think anyone does that.
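A rough sketch of the gating described above. Only JUMPER_IS_SCALAR and __clang__ come from this change; EXAMPLE_HAS_LOWP and example_has_lowp() are hypothetical names used for illustration, not the real Skia macros.

```cpp
// JUMPER_IS_SCALAR can be set even in a Clang build (for example when an old
// compiler cannot build the NEON code), so lowp keys off it directly rather
// than off __clang__.
#if !defined(JUMPER_IS_SCALAR)
    #define EXAMPLE_HAS_LOWP 1   // hypothetical name, illustration only
#else
    #define EXAMPLE_HAS_LOWP 0
#endif

// Lets ordinary code query the decision made by the preprocessor.
constexpr bool example_has_lowp() { return EXAMPLE_HAS_LOWP != 0; }
```

Building with -DJUMPER_IS_SCALAR would flip example_has_lowp() to false regardless of which compiler is in use.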
Original change's description:
> Reland "make SkJumper stages normal Skia code"
>
> This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4
>
> Now with fixed #include paths in SkRasterPipeline_opts.h,
> and -ffp-contract=fast for the :hsw target to minimize
> diffs on non-Windows Clang AVX2/AVX-512 bots.
>
> Original change's description:
> > make SkJumper stages normal Skia code
> >
> > Enough clients are using Clang now that we can say, use Clang to build
> > if you want these software pipeline stages to go fast.
> >
> > This lets us drop the offline build aspect of SkJumper stages, instead
> > building as part of Skia using the SkOpts framework.
> >
> > I think everything should work, except I've (temporarily) removed
> > AVX-512 support. I will put this back in a follow up.
> >
> > I have had to drop Windows down to __vectorcall and our narrower
> > stage calling convention that keeps the d-registers on the stack.
> > I tried forcing sysv_abi, but that crashed Clang. :/
> >
> > Added a TODO to up the same narrower stage calling convention
> > for lowp stages... we just *don't* today, for no good reason.
> >
> > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
> > Reviewed-on: https://skia-review.googlesource.com/110641
> > Commit-Queue: Mike Klein <mtklein@chromium.org>
> > Reviewed-by: Herb Derby <herb@google.com>
> > Reviewed-by: Florin Malita <fmalita@chromium.org>
>
> Change-Id: I44f2c03d33958e3807747e40904b6351957dd448
> Reviewed-on: https://skia-review.googlesource.com/112742
> Reviewed-by: Mike Klein <mtklein@chromium.org>
Change-Id: I3d71197d4bbb19ca4a94961a97fa2e54d5cbfb0d
Reviewed-on: https://skia-review.googlesource.com/112744
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
| |
This reverts commit 78cb579f33943421afc8423a39867fcfd69fed44.
Reason for revert: lowp should be controlled by !defined(JUMPER_IS_SCALAR), not defined(__clang__). So close.
Original change's description:
> Reland "make SkJumper stages normal Skia code"
>
> This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4
>
> Now with fixed #include paths in SkRasterPipeline_opts.h,
> and -ffp-contract=fast for the :hsw target to minimize
> diffs on non-Windows Clang AVX2/AVX-512 bots.
>
> Original change's description:
> > make SkJumper stages normal Skia code
> >
> > Enough clients are using Clang now that we can say, use Clang to build
> > if you want these software pipeline stages to go fast.
> >
> > This lets us drop the offline build aspect of SkJumper stages, instead
> > building as part of Skia using the SkOpts framework.
> >
> > I think everything should work, except I've (temporarily) removed
> > AVX-512 support. I will put this back in a follow up.
> >
> > I have had to drop Windows down to __vectorcall and our narrower
> > stage calling convention that keeps the d-registers on the stack.
> > I tried forcing sysv_abi, but that crashed Clang. :/
> >
> > Added a TODO to up the same narrower stage calling convention
> > for lowp stages... we just *don't* today, for no good reason.
> >
> > Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
> > Reviewed-on: https://skia-review.googlesource.com/110641
> > Commit-Queue: Mike Klein <mtklein@chromium.org>
> > Reviewed-by: Herb Derby <herb@google.com>
> > Reviewed-by: Florin Malita <fmalita@chromium.org>
>
> Change-Id: I44f2c03d33958e3807747e40904b6351957dd448
> Reviewed-on: https://skia-review.googlesource.com/112742
> Reviewed-by: Mike Klein <mtklein@chromium.org>
TBR=mtklein@chromium.org,herb@google.com,fmalita@chromium.org
Change-Id: Ie64da98f5187d44e03c0ce05d7cb189d4a6e6663
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/112743
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
| |
This is a reland of 22e536e3a1a09405d1c0e6f071717a726d86e8d4
Now with fixed #include paths in SkRasterPipeline_opts.h,
and -ffp-contract=fast for the :hsw target to minimize
diffs on non-Windows Clang AVX2/AVX-512 bots.
Original change's description:
> make SkJumper stages normal Skia code
>
> Enough clients are using Clang now that we can say, use Clang to build
> if you want these software pipeline stages to go fast.
>
> This lets us drop the offline build aspect of SkJumper stages, instead
> building as part of Skia using the SkOpts framework.
>
> I think everything should work, except I've (temporarily) removed
> AVX-512 support. I will put this back in a follow up.
>
> I have had to drop Windows down to __vectorcall and our narrower
> stage calling convention that keeps the d-registers on the stack.
> I tried forcing sysv_abi, but that crashed Clang. :/
>
> Added a TODO to up the same narrower stage calling convention
> for lowp stages... we just *don't* today, for no good reason.
>
> Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
> Reviewed-on: https://skia-review.googlesource.com/110641
> Commit-Queue: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Herb Derby <herb@google.com>
> Reviewed-by: Florin Malita <fmalita@chromium.org>
Change-Id: I44f2c03d33958e3807747e40904b6351957dd448
Reviewed-on: https://skia-review.googlesource.com/112742
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
This reverts commit 22e536e3a1a09405d1c0e6f071717a726d86e8d4.
Reason for revert: wrong include path :/
Original change's description:
> make SkJumper stages normal Skia code
>
> Enough clients are using Clang now that we can say, use Clang to build
> if you want these software pipeline stages to go fast.
>
> This lets us drop the offline build aspect of SkJumper stages, instead
> building as part of Skia using the SkOpts framework.
>
> I think everything should work, except I've (temporarily) removed
> AVX-512 support. I will put this back in a follow up.
>
> I have had to drop Windows down to __vectorcall and our narrower
> stage calling convention that keeps the d-registers on the stack.
> I tried forcing sysv_abi, but that crashed Clang. :/
>
> Added a TODO to up the same narrower stage calling convention
> for lowp stages... we just *don't* today, for no good reason.
>
> Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
> Reviewed-on: https://skia-review.googlesource.com/110641
> Commit-Queue: Mike Klein <mtklein@chromium.org>
> Reviewed-by: Herb Derby <herb@google.com>
> Reviewed-by: Florin Malita <fmalita@chromium.org>
TBR=mtklein@chromium.org,herb@google.com,fmalita@chromium.org
Change-Id: I2bdc709c80cdfa6b13ff24e024b3721bef887f46
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/112741
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Enough clients are using Clang now that we can say, use Clang to build
if you want these software pipeline stages to go fast.
This lets us drop the offline build aspect of SkJumper stages, instead
building as part of Skia using the SkOpts framework.
I think everything should work, except I've (temporarily) removed
AVX-512 support. I will put this back in a follow up.
I have had to drop Windows down to __vectorcall and our narrower
stage calling convention that keeps the d-registers on the stack.
I tried forcing sysv_abi, but that crashed Clang. :/
Added a TODO to up the same narrower stage calling convention
for lowp stages... we just *don't* today, for no good reason.
Change-Id: Iaaa792ffe4deab3508d2dc5d0008c163c24b3383
Reviewed-on: https://skia-review.googlesource.com/110641
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
Remove SkColorSpaceXform_base from the inheritance chain. Move some
enums only used within XYZ to be class scoped. Eliminate redundant
context structs, and just use the SkJumper types directly.
SkColorSpaceXform_Base still exists, but just to hold a couple static
functions. Trying to untangle the dst gamma table mess next.
Bug: skia:
Change-Id: I6d2b7807c33e61a0d7df74e334356567d8a2c0e0
Reviewed-on: https://skia-review.googlesource.com/112601
Commit-Queue: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
Bug: skia:
Change-Id: If6481d202bf22a95f1dea0c5bf7d84698b63869a
Reviewed-on: https://skia-review.googlesource.com/109241
Commit-Queue: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
Plenty more to follow-up:
- gradients
- gpu impl
Bug: skia:7638
Change-Id: I8e54fd0e24921f040f178c793b36c7fb855b136e
Reviewed-on: https://skia-review.googlesource.com/107420
Commit-Queue: Mike Reed <reed@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
Same sort of deal as before, now with all three new formats.
While I was at it, I made sure RGBA 8888 and BGRA 8888 both work too.
We don't want the 101010's in lowp, but 888x should be fine.
After looking at the DM images on monitors at work, I decided to
re-enable dither even on 10-bit images.
Looking at the GMs in 888x or 101010x is interesting... I think we must
not be clearing the memory allocated for layers? Seems like we want to
allocate layers as 8888?
Change-Id: I3a85b4f00877792a6425a7e7eb31eacb04ae9218
Reviewed-on: https://skia-review.googlesource.com/101640
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
dx and dy are already size_t, so no need to demote them to int,
and demoting to int gets dicey in terms of wrap-around.
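To illustrate the wrap-around concern, here is a small sketch. The helper names demote_to_int and pixel_offset are hypothetical, and the result of the narrowing assumes a typical target with 32-bit int and two's-complement conversion.

```cpp
#include <cstddef>

// Narrowing a large size_t to int wraps on typical targets, turning a valid
// coordinate into a negative number.
inline int demote_to_int(size_t v) { return static_cast<int>(v); }

// Keeping the arithmetic in size_t preserves the full value.
inline size_t pixel_offset(size_t dx, size_t dy, size_t stride) {
    return dy * stride + dx;
}
```

On such a target, demote_to_int(3000000000u) comes back negative, while pixel_offset keeps working with the original magnitude.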
Change-Id: I98eb31ef7aa35fa2c2aa5be27cdc0b4dc7dfd008
Reviewed-on: https://skia-review.googlesource.com/99500
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Bug: skia:7459
Change-Id: Iccc2588f80e22b13ed5d23656b8c75d7b7058a36
Reviewed-on: https://skia-review.googlesource.com/92700
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Yuqian Li <liyuqian@google.com>
| |
The updated algorithm matches our new GPU algorithm
(https://skia.org/dev/design/conical) and it brings
about 7%-26% speedup. In the next CL, I'll simplify
the GPU code by reusing the CPU code in this CL.
7.20% faster in gradient_conical_clamp_hicolor
8.94% faster in gradient_conicalZero_clamp_hicolor
10.00% faster in gradient_conicalOut_clamp_hicolor
11.72% faster in gradient_conicalOutZero_clamp_hicolor
13.62% faster in gradient_conical_clamp_3color
16.52% faster in gradient_conicalZero_clamp_3color
17.48% faster in gradient_conical_clamp
17.70% faster in gradient_conical_clamp_shallow
20.60% faster in gradient_conicalOut_clamp_3color
20.98% faster in gradient_conicalOutZero_clamp_3color
21.79% faster in gradient_conicalZero_clamp
22.48% faster in gradient_conicalOut_clamp
26.13% faster in gradient_conicalOutZero_clamp
Bug: skia:
Change-Id: Ia159495e1c77658cb28e48c9edf84938464e501c
Reviewed-on: https://skia-review.googlesource.com/90262
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
This brings a little more symmetry to _stages.cpp and _stages_lowp.cpp.
Change-Id: Icfcbd3f264ab97d8445ad8e14c25b4a07c780aea
Reviewed-on: https://skia-review.googlesource.com/90030
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
It looks like we can specialize hot image shaders into their
own single stages for a good speedup on both x86 and ARM.
I've started here with bilerp_clamp_8888, and will
follow up with bgra and 565, and lowp versions of those,
and probably also the same for nearest neighbors.
All pixels are identical in GMs.
This time, rewrite the loop over sample points to be a little
friendlier to 32-bit x86 code generation. The previous version
created an object file indirection feature build_stages.py can't handle.
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Android-Clang-NexusPlayer-CPU-Moorefield-x86-Release-All-Android,Test-Android-Clang-NexusPlayer-GPU-PowerVR-x86-Release-All-Android
Change-Id: I150b6af4a5b89e009dc04ca69e1857892e173deb
Reviewed-on: https://skia-review.googlesource.com/89180
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
This is more consistent with our other SK_BUILD_FOR_... macros,
and less likely to collide with other preprocessor logic.
(Luckily, this was defined in public.bzl, so we can do this
all in one CL in the Skia repo.)
Change-Id: I5f232888288c9c53fad445545d983d0fb0b4add8
Reviewed-on: https://skia-review.googlesource.com/86940
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
This reverts commit 8a64e52a98d178be13fd137b3b3a3c6aff457d85.
Reason for revert:
Test-Android-Clang-NexusPlayer-CPU-Moorefield-x86-Release-All-Android
Test-Android-Clang-NexusPlayer-GPU-PowerVR-x86-Release-All-Android
Original change's description:
> attempt 2: add experimental bilerp_clamp_8888 stage
>
> It looks like we can specialize hot image shaders into their
> own single stages for a good speedup on both x86 and ARM.
>
> I've started here with bilerp_clamp_8888, and will
> follow up with bgra and 565, and lowp versions of those,
> and probably also the same for nearest neighbors.
>
> All pixels are identical in GMs.
>
> Change-Id: Ib5ed6e528efd9e3eed96ba67d02fbec2e8133a81
> Reviewed-on: https://skia-review.googlesource.com/86860
> Reviewed-by: Mike Klein <mtklein@chromium.org>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
TBR=mtklein@chromium.org,liyuqian@google.com
Change-Id: I34409a7b4aee4fd54baee44f7fc53bd0982500fe
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/86601
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
| |
It looks like we can specialize hot image shaders into their
own single stages for a good speedup on both x86 and ARM.
I've started here with bilerp_clamp_8888, and will
follow up with bgra and 565, and lowp versions of those,
and probably also the same for nearest neighbors.
All pixels are identical in GMs.
Change-Id: Ib5ed6e528efd9e3eed96ba67d02fbec2e8133a81
Reviewed-on: https://skia-review.googlesource.com/86860
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Instead of trying to carefully manage the in-gamut / out-of-gamut state
of the pipeline, let's do what a GPU would do, clamping to representable
range in any float -> integer conversion.
Most effects doing table lookups now clamp themselves internally, and
the store_foo() methods clamp when the destination is fixed point. In
turn the from_srgb() conversions and all future transfer function stages
can care less about this stuff.
If I'm thinking right, the _lowp side of things need not change at all,
and that will soften the performance impact of this change. Anything
that was fast to begin with was probably running a _lowp pipeline.
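A minimal scalar sketch of clamping at the float -> integer boundary, assuming an 8-bit unorm destination. The name to_unorm8 is hypothetical and this is not the actual store_foo() code; NaN inputs are not handled here.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Clamp to the representable range [0,1] first, then scale and round.
// Earlier stages no longer need to keep values in gamut themselves.
inline uint8_t to_unorm8(float v) {
    float clamped = std::min(std::max(v, 0.0f), 1.0f);
    return static_cast<uint8_t>(std::lrint(clamped * 255.0f));
}
```

Out-of-gamut values such as -0.5f or 1.7f then land on 0 and 255 instead of wrapping.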
Bug: skia:7419
Change-Id: Id2e080ac240a97b900a1ac131c85d9e15f70af32
Reviewed-on: https://skia-review.googlesource.com/85740
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
| |
The path involving approx_log2() and approx_pow2() does not produce 0.
And it's probably not a good idea to think about what approx_log2(0) is
anyway.
Change-Id: If5f48298c5bd5565ae808ebdfbd02649f4dd3046
Reviewed-on: https://skia-review.googlesource.com/85840
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Some interesting things are starting to fall out already,
like the fact that I needed to add a gamma_dst stage to
be able to draw into gamma-transfer-fn destinations.
I've also had to pass an SkAlphaType through to the linearize
functions so that they can maintain premul invariants. I'm not
sure this is actually a good idea... if you can, please double-
check my logic at SkRasterPipeline.cpp:128?
If it's correct logic, I'm going to need to do it all over the place.
But I imagine you don't do this and somehow get away with it.
Change-Id: I42cd9b161b54287d674225103ad9e19f8b388959
Reviewed-on: https://skia-review.googlesource.com/84680
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Brian Osman <brianosman@google.com>
| |
We're still working out why these intrinsics aren't available.
Long term options:
- get the intrinsics to work
- convert to inline asm
Change-Id: I07edf1944daf01842f01b26ad874f62314d0f68f
Reviewed-on: https://skia-review.googlesource.com/84222
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
We need to be a bit more pedantic here to support builds that may be
using AVX2 as part of their baseline but perhaps not enabling all the
related features SkJumper would like to use.
E.g. we've seen Tensorflow build with AVX2 and FMA, but not F16C.
So check all three {AVX2,FMA,F16C}, and only then build stages in HSW
mode. I've updated the define as a reminder.
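The three-way check could look roughly like this. __AVX2__, __FMA__ and __F16C__ are the usual GCC/Clang predefined feature macros; EXAMPLE_BASELINE_IS_HSW is a hypothetical name, since the message does not give the updated define.

```cpp
// Treat the baseline as Haswell-class only when all three features are on.
// A build with AVX2 and FMA but no F16C falls through to the #else branch.
#if defined(__AVX2__) && defined(__FMA__) && defined(__F16C__)
    #define EXAMPLE_BASELINE_IS_HSW 1   // hypothetical macro name
#else
    #define EXAMPLE_BASELINE_IS_HSW 0
#endif

constexpr bool example_baseline_is_hsw() { return EXAMPLE_BASELINE_IS_HSW != 0; }
```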
This only affects builds using these features for their _baseline_
stages... the offline-compiled stages in SkJumper_generated.S are
not affected.
Change-Id: I9bfb3bae3589d35043b748782cefa8c213726d6a
Reviewed-on: https://skia-review.googlesource.com/84221
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
I noticed these little bits while working on that old-Clang fix.
- We can force-inline anytime we've got Clang,
not just when JUMPER_IS_OFFLINE.
- The _aarch64 and _vfp4 WRAP functions are dead code,
as they're never compiled offline now.
Change-Id: I5850daded2ffcfe50ceeadc43f89fa8597df3387
Reviewed-on: https://skia-review.googlesource.com/84060
Commit-Queue: Mike Klein <mtklein@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
Older versions of Clang, at least vanilla 3.9 and "Apple LLVM version
8.1.0 (clang-802.0.42)" seem to crash when compiling SkJumper_stages.cpp
with NEON for ARMv7 without at least -O1.
So detect that case, and fall back to scalar code.
Change-Id: I3c1595da491bef38c18f47f96690700c67fdc70e
Reviewed-on: https://skia-review.googlesource.com/83980
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
VFPv4 gives us two interesting features:
- FMA
- f16<->f32 conversions
Even without FMAs, NEON still has non-fused MLA instructions. We don't
really care about the fusedness of those mul-adds, so losing FMA here is
kind of no big deal.
We already maintain portable code to do f16<->f32 conversions, so it's
not much of a maintenance hit to use that instead of the native
instructions. To my knowledge software F16 rendering is not a
performance critical mode of operation for any of our users.
This drops our minimum requirement to basically just having NEON.
Devices like the Nexus 7 2012 will now take SkJumper fast paths
instead of portable code. (Though actually, we've only ever
required NEON for _lowp... only the float code also needed vfpv4).
The main file to look at here is actually SkJumper_vectors.h,
where you will see all the substantive changes. The rest just
kind of tears down most of the old complexity, and adds ABI
to put just a little of it back. :)
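For readers unfamiliar with portable f16<->f32 conversion, here is a scalar sketch of the general bit-shifting technique. It is not the code in SkJumper_vectors.h, and it makes simplifying assumptions: denormal halfs flush to zero, the mantissa is truncated rather than rounded, and infinities, NaNs and floats too large for a half are not handled.

```cpp
#include <cstdint>
#include <cstring>

// half -> float: move sign, exponent and mantissa into place, then rebias
// the exponent from 15 (half) to 127 (float).
inline float half_to_float(uint16_t h) {
    uint32_t sign = h & 0x8000u;
    uint32_t em   = h ^ sign;                 // exponent and mantissa bits
    if (em < 0x0400u) { return 0.0f; }        // zero or denormal half
    uint32_t bits = (sign << 16) + (em << 13) + ((127u - 15u) << 23);
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// float -> half: the same steps in reverse.
inline uint16_t float_to_half(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    uint32_t sign = bits & 0x80000000u;
    uint32_t em   = bits ^ sign;
    if (em < 0x38800000u) { return 0; }       // too small for a normal half
    return static_cast<uint16_t>((sign >> 16) + (em >> 13) - ((127u - 15u) << 10));
}
```

For example, the half bit pattern 0x3C00 maps to 1.0f and back again.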
Change-Id: Ia9237117698729c91e5fa51126baf80748093bf4
Bug: skia:
Reviewed-on: https://skia-review.googlesource.com/83521
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
Here's the tiny performance gain:
$ python tools/calmbench/calmbench.py firstsecond --extraarg "-m conic"
firstsecond (compared to master) is likely
4.23% faster in gradient_conicalOut_clamp_3color
4.23% faster in gradient_conicalOutZero_clamp_3color
4.79% faster in gradient_conical_clamp_shallow_dither
6.04% faster in gradient_conical_clamp_3color
6.04% faster in gradient_conicalZero_clamp_3color
6.42% faster in gradient_conicalOut_clamp
6.43% faster in gradient_conicalOutZero_clamp
6.74% faster in gradient_conical_clamp
6.98% faster in gradient_conical_clamp_shallow
6.98% faster in gradient_conicalZero_clamp
Bug: skia:
Change-Id: Id74866908b99753ed8b16a657d3f67c9255d0043
Reviewed-on: https://skia-review.googlesource.com/76561
Commit-Queue: Yuqian Li <liyuqian@google.com>
Reviewed-by: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
This reverts commit a7fa3377d24643d86117159f8a58d2ee66880a4d.
Reason for revert: lots of crashing GPU bots.
Original change's description:
> add experimental bilerp_clamp_8888 stage
>
> It looks like we can specialize hot image shaders into their
> own single stages for a good speedup on both x86 and ARM.
>
> I've started here with bilerp_clamp_8888, and will
> follow up with bgra and 565, and lowp versions of those,
> and probably also the same for nearest neighbors.
>
> All pixels are identical in GMs.
>
> Change-Id: I2f6995767cd38053d670b8d0bfdb71b687803d70
> Reviewed-on: https://skia-review.googlesource.com/82100
> Reviewed-by: Yuqian Li <liyuqian@google.com>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
TBR=mtklein@chromium.org,mtklein@google.com,liyuqian@google.com
Change-Id: If70abb91b69bcd781e395dd3ac05ff1eebb1169f
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/83340
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
It looks like we can specialize hot image shaders into their
own single stages for a good speedup on both x86 and ARM.
I've started here with bilerp_clamp_8888, and will
follow up with bgra and 565, and lowp versions of those,
and probably also the same for nearest neighbors.
All pixels are identical in GMs.
Change-Id: I2f6995767cd38053d670b8d0bfdb71b687803d70
Reviewed-on: https://skia-review.googlesource.com/82100
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Turns out I was defining JUMPER_HAS_NEON_LOWP,
but checking JUMPER_NEON_HAS_LOWP.
🤦
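Why this kind of typo goes unnoticed: an undefined macro inside #if defined(...) is simply false, with no diagnostic. A small sketch, where the two macro names come from the message above and kMisspelledCheck and kCorrectCheck are hypothetical:

```cpp
// The define uses one word order...
#define JUMPER_HAS_NEON_LOWP 1

// ...and the check uses another, so this branch is silently never taken.
#if defined(JUMPER_NEON_HAS_LOWP)
    constexpr bool kMisspelledCheck = true;
#else
    constexpr bool kMisspelledCheck = false;
#endif

// Spelling the check the same way as the define behaves as intended.
#if defined(JUMPER_HAS_NEON_LOWP)
    constexpr bool kCorrectCheck = true;
#else
    constexpr bool kCorrectCheck = false;
#endif
```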
Change-Id: Ib328190ce35a367bf3d08d8e66f0ab8791ccb8b2
Reviewed-on: https://skia-review.googlesource.com/82320
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
705.32 -> 457.76 gradient_sweep_clamp_3color
609.38 -> 345.34 gradient_radial1_clamp_3color
Change-Id: I0165ac8f004ee095ada4f12b33db0a94ae39fca3
Reviewed-on: https://skia-review.googlesource.com/69902
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Florin Malita <fmalita@chromium.org>
| |
This reverts commit a3dd5ec3a769fb833ce77878cd4e551c15e5074d.
Reason for revert: breaking build on Build-Debian9-Clang-x86_64_Release-Fast
Original change's description:
> more powerful map()
>
> Change-Id: Icbae002999a295e3a9d1d2e6046e686784d5f608
> Reviewed-on: https://skia-review.googlesource.com/69901
> Reviewed-by: Florin Malita <fmalita@chromium.org>
> Commit-Queue: Mike Klein <mtklein@chromium.org>
TBR=mtklein@chromium.org,fmalita@chromium.org
Change-Id: Ice989dd6a6b2786f318791dd91f2c06f689cb979
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://skia-review.googlesource.com/70105
Reviewed-by: Greg Daniel <egdaniel@google.com>
Commit-Queue: Greg Daniel <egdaniel@google.com>
| |
Change-Id: Icbae002999a295e3a9d1d2e6046e686784d5f608
Reviewed-on: https://skia-review.googlesource.com/69901
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Change-Id: I15f83a72645fed0ed8dca9c9aad66c5db5eb247a
Reviewed-on: https://skia-review.googlesource.com/69920
Commit-Queue: Florin Malita <fmalita@chromium.org>
Reviewed-by: Mike Klein <mtklein@chromium.org>
| |
I was originally going to add these to help test a lowp dither, but
after looking at diffs I don't think lowp dither is a good idea.
Non-dithered lowp gradients look fine to me so far.
I'd have done conics, but they scare me.
Change-Id: I8f5e75aec726983186214845ca38cfa0d54496b3
Reviewed-on: https://skia-review.googlesource.com/66460
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Florin Malita <fmalita@chromium.org>
| |
The 4444 image in all_bitmap_configs now draws slightly differently before
and after serialization. (It's serialized as 8888.) Still looks fine.
Change-Id: I1396cf1550b6769a1734ed25d59bd5b1866dfacd
Reviewed-on: https://skia-review.googlesource.com/65960
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
1) Move a couple stages around in the enum to places
that make more sense, and gauss_a_to_rgba in the code too.
2) mirror the SkRasterPipeline stage enum with either:
LOWP(st): the stage is implemented in low precision
TODO(st): the stage should be lowp, but isn't
NOPE(st): the stage shouldn't be done in lowp.
3) statically enforce that all stages are covered by one of
LOWP, TODO, or NOPE.
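One way the static coverage check in (3) might be sketched. The stage list EXAMPLE_STAGES and the counters are hypothetical stand-ins, and this version only verifies that the number of classified stages matches the number of stages, not their names or order.

```cpp
// Hypothetical stand-in for the real stage enum; M is applied to each stage.
#define EXAMPLE_STAGES(M) M(load_8888) M(store_8888) M(srcover) M(dither)

// Count the stages in the list.
#define COUNT(st) +1
constexpr int kAllStages = 0 EXAMPLE_STAGES(COUNT);
#undef COUNT

// Mirror the list, tagging each stage. Every tag adds one to the tally.
#define LOWP(st) +1   // implemented in low precision
#define TODO(st) +1   // should be lowp, but isn't yet
#define NOPE(st) +1   // shouldn't be done in lowp
constexpr int kClassified = 0
    LOWP(load_8888)
    LOWP(store_8888)
    TODO(srcover)
    NOPE(dither);
#undef LOWP
#undef TODO
#undef NOPE

// Adding a stage to EXAMPLE_STAGES without classifying it fails to compile.
static_assert(kClassified == kAllStages,
              "every stage must be covered by LOWP, TODO, or NOPE");
```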
Change-Id: I06c7a7e470663ef73bf652c1b65c0d3c89f0d767
Reviewed-on: https://skia-review.googlesource.com/63800
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Change-Id: I5629e74c4c13ddb9217fd3c2df3388030fa03f0c
Reviewed-on: https://skia-review.googlesource.com/63780
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
Chrome generally uses BGRA buffers, so srcover_rgba_8888 isn't really
doing them any good. Probably a good idea to cover both kN32 options
any time we specialize like this?
There's one small diff, so I've lazily guarded this by
SK_LEGACY_LOWP_STAGES, which I want to rebaseline today anyway.
Change-Id: Ice672aa01a3fc83be0798580d6730a54df075478
Reviewed-on: https://skia-review.googlesource.com/63301
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Mike Reed <reed@google.com>
| |
This fills out a couple more matrix and gather stages.
Deletes a not particularly important unit test that was using a
scale matrix in a weird, non-lowp compatible way.
This will require guards for Blink layout tests.
Change-Id: I54cb228ff541f771e8f4758f07d26c5161d48af3
Reviewed-on: https://skia-review.googlesource.com/62520
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
This method is a little simpler macro-wise,
and makes it easier to guard new lowp stages:
LOWP(foo)
LOWP(bar)
#ifndef SK_LEGACY_LOWP_BAZ
LOWP(baz)
#endif
Change-Id: I06392f5cf7a04651e7bf47e79f10f7da8520f5ab
Reviewed-on: https://skia-review.googlesource.com/63141
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
This is a no-op refactor.
It's just always surprised me that the matrix_scale_translate
stage expects [tx ty sx sy], when scales precede the translates
in the names and in both normal row-major and column-major matrix
layouts.
This switches to [sx sy tx ty], scale then translate.
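A scalar sketch of the new layout, assuming the context is a plain array of four floats. The Point struct and the function signature are illustrative only, not the real stage.

```cpp
struct Point { float x, y; };

// Context laid out as [sx sy tx ty]: scales first, then translates.
inline Point matrix_scale_translate(Point p, const float m[4]) {
    return { p.x * m[0] + m[2],     // x' = x*sx + tx
             p.y * m[1] + m[3] };   // y' = y*sy + ty
}
```

With m = {2, 3, 10, 20}, the point (1, 1) maps to (12, 23).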
Change-Id: I2d88701121ae8013facd5a28bb0ff520211db5a6
Reviewed-on: https://skia-review.googlesource.com/62541
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
We're going to want to assign types to the stages depending on their
inputs and outputs:
GG: x,y -> x,y
GP: x,y -> r,g,b,a
PP: r,g,b,a -> r,g,b,a
(There are a couple other degenerate cases here, where a stage ignores
its inputs or creates no outputs, but we can always just pretend their
null input or output is one type or the other arbitrarily.)
The GG stages will be pretty much entirely float code, and the GP stages
a mix of float math and byte stuff.
Since we've chosen U16 to match our register size in _lowp land,
we'll unpack each F register across two of those for transport between
stages. This is a notional, free operation in both directions.
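The "notional" unpacking can be pictured as a pure reinterpretation of bytes. The sketch below assumes 8 lanes and uses std::array in place of real vector registers; kLanes, split and join are hypothetical names.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Assumed width for illustration: 8 lanes, so F is twice the size of U16.
constexpr int kLanes = 8;
using F   = std::array<float,    kLanes>;
using U16 = std::array<uint16_t, kLanes>;
static_assert(sizeof(F) == 2 * sizeof(U16), "one F fills exactly two U16s");

// Reinterpret the bytes of one F as two U16s. No values are converted.
inline void split(const F& f, U16* lo, U16* hi) {
    std::memcpy(lo, &f, sizeof(U16));
    std::memcpy(hi, reinterpret_cast<const char*>(&f) + sizeof(U16), sizeof(U16));
}

// Reassemble the original F from the two halves.
inline F join(const U16& lo, const U16& hi) {
    F f;
    std::memcpy(&f, &lo, sizeof(U16));
    std::memcpy(reinterpret_cast<char*>(&f) + sizeof(U16), &hi, sizeof(U16));
    return f;
}
```

Because only the interpretation of the bits changes, a split followed by a join returns the original floats exactly.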
Change-Id: I605311d0dc327a1a3a9d688173d9498c1658e715
Reviewed-on: https://skia-review.googlesource.com/60800
Reviewed-by: Herb Derby <herb@google.com>
Reviewed-by: Florin Malita <fmalita@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
As this array grows longer it causes troublesome code generation
when we're compiling offline, but it's easy as an argument.
Change-Id: I53526443f534f29d3bff17c3aec24a9e916c9b86
Reviewed-on: https://skia-review.googlesource.com/60564
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
| |
Today (x,y) are the integer coordinates of the first destination pixel
we're working on. By renaming them (dx,dy), we free up the names (x,y)
for working (i.e. _source_) x and y.
Until now we've generally just been continuing to call those (r,g), but
in the _lowp code that won't be possible (r+g hold x together, b+a y)
but we'll have the ability to just give them proper names x and y.
Change-Id: Id5faa09c4406116df5df7494efc6cb23659e9a2f
Reviewed-on: https://skia-review.googlesource.com/60820
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
It's properly 16 today because of HSW/lowp stages handling 16 pixels at
a time, but it hasn't yet had an effect on lowp so we didn't notice.
As we add lowp shader stages this will start to matter,
so might as well bump it up to 16 now.
(One day _skx lowp stages could bump this up to 32.)
Change-Id: Idd8185c08e12dc657389a35bf659662c9670f98a
Reviewed-on: https://skia-review.googlesource.com/60565
Reviewed-by: Mike Klein <mtklein@chromium.org>
Commit-Queue: Mike Klein <mtklein@chromium.org>
| |
There are non-zero values of a that make infinite 1.0f/a.
Let's just check for the real thing we care about, that
scale is finite.
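A scalar sketch of the idea, under the assumption that the scale is used to divide out alpha and that falling back to 0 is acceptable. The name example_inverse_scale and the fallback value are assumptions, not taken from the change.

```cpp
#include <cmath>
#include <limits>

// Testing a != 0 is not enough: a tiny (e.g. denormal) a is non-zero but
// 1.0f/a still overflows to infinity. Check the scale itself instead.
inline float example_inverse_scale(float a) {
    float scale = 1.0f / a;
    return std::isfinite(scale) ? scale : 0.0f;
}
```

Both a == 0 and a == std::numeric_limits<float>::denorm_min() take the fallback path, while ordinary values such as 0.5f yield 2.0f.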
Bug: skia:7123
Change-Id: If97574c9f3f2f0b73c749d0bea9aa19e6114f4d1
Reviewed-on: https://skia-review.googlesource.com/58460
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
|