| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- rearrange a bit
- fewer macros
- hooks for all operators
- add left and right scalar operator overrides
- add +=, &=, <<=, etc.
- add SkNx_split() and SkNx_join()
- simplify the many rsqrt() and invert() options to just what we actually use
This refactoring pointed out that our float <-> int NEON conversions are not specialized, so I've implemented them. It seems nice that this is an error rather than silently falling back to serial code.
It's unclear to me if split/join want to be external, static methods, or non-static methods (SkNx_join(), Sk4f::Join(), x.join()). Time will tell?
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1812233003
CQ_EXTRA_TRYBOTS=client.skia.android:Test-Android-GCC-Nexus5-CPU-NEON-Arm7-Release-Trybot;client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1812233003
|
|
|
|
|
|
|
|
|
|
| |
Just some syntax cleanup. No real change: kth<...>() was calling [...] already.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1714363002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1714363002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- trim unused specializations (Sk4i, Sk2d) and apis (SkNx_dup)
- expand apis a little
* v[0] == v.kth<0>()
* SkNx_shuffle can now convert to different-sized vectors, e.g. Sk2f <-> Sk4f
- remove anonymous namespace
I believe it's safe to remove the anonymous namespace right now.
We're worried about violating the One Definition Rule; the anonymous namespace protected us from that.
In Release builds, this is mostly moot, as everything tends to inline completely.
In Debug builds, violating the ODR is at worst an inconvenience, time spent trying to figure out why the bot is broken.
Now that we're building with SSE2/NEON everywhere, very few bots have even a chance about getting confused by two definitions of the same type or function. Where we do compile variants depending on, e.g., SSSE3, we do so in static inline functions. These are not subject to the ODR.
I plan to follow up with a tedious .kth<...>() -> [...] auto-replace.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1683543002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1683543002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This disables a few tests in DM:
- one BlurLargeImage GM maybe is really broken
- FontMgrAndroidParser uses libexpat, which I've not (yet?) built from source,
so MSAN can't see into it.
This extends some of the MSAN stifling we added around SkImageDecoder_libjpeg to SkCodec, and skips .wbmps, .pngs, and .bmps. We're only seeing issues in colortables for .png and .bmp.
I think I can probably back out disabling Codec and the RAW image decodes...
they should all be covered by the libjpeg stifles.
BUG=skia:4550,skia:4900
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1673663002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug-MSAN-Trybot
TBR=msarett@google.com
Review URL: https://codereview.chromium.org/1673663002
|
|
|
|
|
|
|
|
|
|
| |
This means we can remove a lot of explicit casts in code that uses SkNx.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1650653002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1650653002
|
|
|
|
|
|
|
|
|
|
|
| |
This gets rid of those unsightly lambdas,
and makes the file more consistent both with itself and with Sk4px.
BUG=skia:4765
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1569373002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1569373002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that MSVC + __vectorcall don't play well together,
so back ourselves out into a situation where we don't need it.
- Inline transfermode functions. This removes the need for SK_VECTORCALL.
- Remove 565 destination specializations.
Blending into 565 is not speed-critical enough to merit the code bloat.
- Removing 565 specializations means a bunch of Sk4px code is now dead.
8888 xfermodes generally speed up a bit from inlining, smoothly ranging from no change down to 0.65x for the fastest functions like Plus or Modulate.
565 xfermodes generally slow down because we're doing 565 -> 8888 and 8888->565 conversion serially[1] and using the stack, smoothly ranging from no change up to 2x slower for the fastest functions like Plus and Modulate.
[1] the 565->8888 conversion is actually being autovectorized
BUG=skia:4765,skia:4776
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1565223002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
No public API changes.
TBR=reed@google.com
Review URL: https://codereview.chromium.org/1565223002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- one base case and one N=1 case instead of two each (or three with doubles)
- use SkNx_cast instead of FromBytes/toBytes
- 4-at-a-time Sk4f::ToBytes becomes a special standalone Sk4f_ToBytes
If I did everything right, this'll be perf- and pixel- neutral.
https://gold.skia.org/search2?issue=1526523003&unt=true&query=source_type%3Dgm&master=false
BUG=skia:
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1526523003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally this was a performance win, even on devices without AVX due
to unrolling, but on ARM+NEON it looks like that unrolling hurt a bit.
while (...) { blend a pixel }
~~~>
while (...) { blend two pixels }
if (n % 2) { blend last pixel }
BUG=chromium:555278
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1465483002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Xfermode_ColorDodge_aa 10.3ms -> 7.85ms 0.76x
Xfermode_SoftLight_aa 13.8ms -> 10.2ms 0.74x
Xfermode_ColorBurn_aa 10.7ms -> 7.82ms 0.73x
Xfermode_SoftLight 33.6ms -> 23.2ms 0.69x
Xfermode_ColorDodge 25ms -> 16.5ms 0.66x
Xfermode_ColorBurn 26.1ms -> 16.6ms 0.63x
Ought to be no pixel diffs:
https://gold.skia.org/search2?issue=1432903002&unt=true&query=source_type%3Dgm&master=false
Incidental stuff:
I made the SkNx(T) constructors implicit to make writing math expressions simpler.
This allows us to write expressions like
Sk4f v;
...
v = v*4;
rather than
Sk4f v;
...
v = v * Sk4f(4);
As written it only works when the constant is on the right-hand side,
so expressions like `(Sk4f(1) - da)` have to stay for now. I plan on
following up with a CL that lets those become `(1 - da)` too.
BUG=skia:4117
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1432903002
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to express shuffles more directly in code while also giving us a
convenient point to platform-specify particular shuffles for particular types.
No specializations yet. Everyone just uses the (pretty good) default option.
BUG=skia:
Review URL: https://codereview.chromium.org/1301413006
|
|
|
|
|
|
|
|
|
|
|
|
| |
This switches over SkXfermodes_opts.h and SkColorMatrixFilter to use Sk4f,
and converts the SkPMFloat benches to Sk4f benches.
No pixels should change here, and no code beyond the Sk4f_ benches should change speed.
The benches are faster than the old versions.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1324743002
|
|
|
|
|
|
| |
DOCS_PREVIEW= https://skia.org/?cl=1316233002
Review URL: https://codereview.chromium.org/1316233002
|
|
|
|
|
|
| |
DOCS_PREVIEW= https://skia.org/?cl=1316123003
Review URL: https://codereview.chromium.org/1316123003
|
|
|
|
|
|
|
|
| |
Remember failed attempt https://codereview.chromium.org/1286093004/ ? I think this one is simpler and safer and even technically legal C++.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1296183004
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At head they're s,d[,aa] in SkXfermode_opts.h but Sk4px::Map* expect d,s[,aa]
so we ended up having to write weird little lambda shims to match impedance.
There's no reason for these to disagree, and d,s[,aa] is the One True Order
(because no matter what you're doing in graphics, there's always a dst).
Should be no perf or image diff, though I'm suspicious it might help MSVC code generation.
BUG=skia:4117
Committed: https://skia.googlesource.com/skia/+/6028a8476504022fe40b6870b1460b5e4a80969f
CQ_EXTRA_TRYBOTS=client.skia:Test-Win8-MSVC-ShuttleB-CPU-AVX2-x86-Release-Trybot
Review URL: https://codereview.chromium.org/1289903002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
#1 id:1 of https://codereview.chromium.org/1289903002/ )
Reason for revert:
?
Original issue's description:
> Normalize SkXfermode_opts.h argument order as d,s[,aa].
>
> At head they're s,d[,aa] in SkXfermode_opts.h but Sk4px::Map* expect d,s[,aa]
> so we ended up having to write weird little lambda shims to match impedance.
>
> There's no reason for these to disagree, and d,s[,aa] is the One True Order
> (because no matter what you're doing in graphics, there's always a dst).
>
> Should be no perf or image diff, though I'm suspicious it might help MSVC code generation.
>
> BUG=skia:4117
>
> Committed: https://skia.googlesource.com/skia/+/6028a8476504022fe40b6870b1460b5e4a80969f
TBR=djsollen@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:4117
Review URL: https://codereview.chromium.org/1284363002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
id:1 of https://codereview.chromium.org/1286093004/ )
Reason for revert:
Maybe causing test / gold problems?
Original issue's description:
> Refactor to put SkXfermode_opts inside SK_OPTS_NS.
>
> Without this refactor I was getting warnings previously about having code
> inside namespace SK_OPTS_NS (e.g. namespace sse2, namespace neon) referring to
> code inside an anonymous namespace (Sk4px, SkPMFloat, Sk4f, etc) [1].
>
> That low-level code was in an anonymous namespace to allow multiple independent
> copies of its methods to be instantiated without the linker getting confused /
> offended about violating the One Definition Rule. This was only happening in
> Debug mode where the methods were not being inlined.
>
> To fix this all, I've force-inlined the methods of the low-level code and
> removed the anonymous namespace.
>
> BUG=skia:4117
>
>
> [1] Here is what those errors looked like:
>
> In file included from ../../../../src/core/SkOpts.cpp:18:0:
> ../../../../src/opts/SkXfermode_opts.h:193:7: error: 'portable::Sk4pxXfermode' has a field 'portable::Sk4pxXfermode::fProc4' whose type uses the anonymous namespace [-Werror]
> class Sk4pxXfermode : public SkProcCoeffXfermode {
> ^
> ../../../../src/opts/SkXfermode_opts.h:193:7: error: 'portable::Sk4pxXfermode' has a field 'portable::Sk4pxXfermode::fAAProc4' whose type uses the anonymous namespace [-Werror]
> ../../../../src/opts/SkXfermode_opts.h:235:7: error: 'portable::SkPMFloatXfermode' has a field 'portable::SkPMFloatXfermode::fProcF' whose type uses the anonymous namespace [-Werror]
> class SkPMFloatXfermode : public SkProcCoeffXfermode {
> ^
> cc1plus: all warnings being treated as errors
>
> Committed: https://skia.googlesource.com/skia/+/b07bee3121680b53b98b780ac08d14d374dd4c6f
TBR=djsollen@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:4117
Review URL: https://codereview.chromium.org/1284333002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At head they're s,d[,aa] in SkXfermode_opts.h but Sk4px::Map* expect d,s[,aa]
so we ended up having to write weird little lambda shims to match impedance.
There's no reason for these to disagree, and d,s[,aa] is the One True Order
(because no matter what you're doing in graphics, there's always a dst).
Should be no perf or image diff, though I'm suspicious it might help MSVC code generation.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1289903002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this refactor I was getting warnings previously about having code
inside namespace SK_OPTS_NS (e.g. namespace sse2, namespace neon) referring to
code inside an anonymous namespace (Sk4px, SkPMFloat, Sk4f, etc) [1].
That low-level code was in an anonymous namespace to allow multiple independent
copies of its methods to be instantiated without the linker getting confused /
offended about violating the One Definition Rule. This was only happening in
Debug mode where the methods were not being inlined.
To fix this all, I've force-inlined the methods of the low-level code and
removed the anonymous namespace.
BUG=skia:4117
[1] Here is what those errors looked like:
In file included from ../../../../src/core/SkOpts.cpp:18:0:
../../../../src/opts/SkXfermode_opts.h:193:7: error: 'portable::Sk4pxXfermode' has a field 'portable::Sk4pxXfermode::fProc4' whose type uses the anonymous namespace [-Werror]
class Sk4pxXfermode : public SkProcCoeffXfermode {
^
../../../../src/opts/SkXfermode_opts.h:193:7: error: 'portable::Sk4pxXfermode' has a field 'portable::Sk4pxXfermode::fAAProc4' whose type uses the anonymous namespace [-Werror]
../../../../src/opts/SkXfermode_opts.h:235:7: error: 'portable::SkPMFloatXfermode' has a field 'portable::SkPMFloatXfermode::fProcF' whose type uses the anonymous namespace [-Werror]
class SkPMFloatXfermode : public SkProcCoeffXfermode {
^
cc1plus: all warnings being treated as errors
Review URL: https://codereview.chromium.org/1286093004
|
|
|
|
|
|
|
|
|
|
|
|
| |
+268 -535 lines
I also rearranged the code a little bit to encapsulate itself better,
mostly replacing static helper functions with lambdas. This also
let me merge the SSE2 and SSE4.1 code paths.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1264103004
|
|
Renames Sk4pxXfermode.h to SkXfermode_opts.h,
and refactors it a tiny bit internally.
This moves xfermode optimization from being "compile-time everywhere but NEON"
to simply "runtime everywhere". I don't anticipate any effect on perf or
correctness.
BUG=skia:4117
Review URL: https://codereview.chromium.org/1264543006
|