| Commit message (Collapse) | Author | Age |
| |
BUG=skia:
TBR=
Review URL: https://codereview.chromium.org/1212393002
| |
HardLight, Overlay, Darken, and Lighten are all
~2x faster with SSE, ~25% faster with NEON.
This covers all previously-implemented NEON xfermodes.
3 previous SSE xfermodes remain. Those need division
and sqrt, so I'm planning on using SkPMFloat for them.
It'll help the readability and NEON speed if I move that
into [0,1] space first.
The main new concept here is c.thenElse(t,e), which behaves like
(c ? t : e) except, of course, both t and e are evaluated. This allows
us to emulate conditionals with vectors.
This also removes the concept of SkNb. Instead of a standalone bool
vector, each SkNi or SkNf will just return their own types for
comparisons. Turns out to be a lot more manageable this way.
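As a rough illustration of the two ideas above (comparisons that return the vector's own type, and thenElse as a branch-free select), here is a minimal scalar sketch. Vec4 and lane_min are hypothetical names invented for this example; this is not the SkNi, SkNf, or Sk4px code, and a real SIMD version would use a single select instruction instead of loops.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 4-lane integer vector, for illustration only.
struct Vec4 {
    std::array<uint32_t, 4> lane;

    // The comparison returns a Vec4 rather than a separate bool vector:
    // each lane is all ones where the test holds and zero where it does not.
    Vec4 operator<(const Vec4& o) const {
        Vec4 m{};
        for (int i = 0; i < 4; i++) {
            m.lane[i] = lane[i] < o.lane[i] ? 0xFFFFFFFFu : 0u;
        }
        return m;
    }

    // Treats *this as a mask. Per lane, keeps t where the mask is set and
    // e where it is clear. Both t and e were fully computed by the caller.
    Vec4 thenElse(const Vec4& t, const Vec4& e) const {
        Vec4 r{};
        for (int i = 0; i < 4; i++) {
            r.lane[i] = (lane[i] & t.lane[i]) | (~lane[i] & e.lane[i]);
        }
        return r;
    }
};

// Per-lane minimum, written the way (a < b ? a : b) would be for scalars.
inline Vec4 lane_min(const Vec4& a, const Vec4& b) {
    return (a < b).thenElse(a, b);
}
```

Because the mask has the same type as the data, the select is plain bitwise arithmetic and no separate bool-vector type is needed.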
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/b9d4163bebab0f5639f9c5928bb5fc15f472dddc
CQ_EXTRA_TRYBOTS=client.skia.compile:Build-Ubuntu-GCC-Arm64-Debug-Android-Trybot
Review URL: https://codereview.chromium.org/1196713004
| |
of https://codereview.chromium.org/1196713004/)
Reason for revert:
64-bit ARM build failures.
Original issue's description:
> Implement four more xfermodes with Sk4px.
>
> HardLight, Overlay, Darken, and Lighten are all
> ~2x faster with SSE, ~25% faster with NEON.
>
> This covers all previously-implemented NEON xfermodes.
> 3 previous SSE xfermodes remain. Those need division
> and sqrt, so I'm planning on using SkPMFloat for them.
> It'll help the readability and NEON speed if I move that
> into [0,1] space first.
>
> The main new concept here is c.thenElse(t,e), which behaves like
> (c ? t : e) except, of course, both t and e are evaluated. This allows
> us to emulate conditionals with vectors.
>
> This also removes the concept of SkNb. Instead of a standalone bool
> vector, each SkNi or SkNf will just return their own types for
> comparisons. Turns out to be a lot more manageable this way.
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/b9d4163bebab0f5639f9c5928bb5fc15f472dddc
TBR=reed@google.com,mtklein@chromium.org
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1205703008
| |
HardLight, Overlay, Darken, and Lighten are all
~2x faster with SSE, ~25% faster with NEON.
This covers all previously-implemented NEON xfermodes.
3 previous SSE xfermodes remain. Those need division
and sqrt, so I'm planning on using SkPMFloat for them.
It'll help the readability and NEON speed if I move that
into [0,1] space first.
The main new concept here is c.thenElse(t,e), which behaves like
(c ? t : e) except, of course, both t and e are evaluated. This allows
us to emulate conditionals with vectors.
This also removes the concept of SkNb. Instead of a standalone bool
vector, each SkNi or SkNf will just return their own types for
comparisons. Turns out to be a lot more manageable this way.
BUG=skia:
Review URL: https://codereview.chromium.org/1196713004
| |
- Once in SkXfermode as usual to pick up compile-time SSE and NEON
- Once in SkXfermode_arm_neon to pick up run-time NEON
This allows us to start cleaning up SkXfermode_arm_neon as we've done
for SkXfermode_SSE2. I'm saving this catharsis for a day when I need it.
The Sk4px xfermodes are generally faster than the existing NEON procs,
so this should also have the side effect of a perf win there.
This means our new Plus-AA code works for runtime NEON too.
BUG=skia:3852
Review URL: https://codereview.chromium.org/1150313003
| |
The compiler may choose to use x30 for a local loop counter;
ensure it's saved. Patch from kevin.petit@arm.com,
verified by benm@google.com.
R=djsollen@google.com
Review URL: https://codereview.chromium.org/786273003
| |
This was needed for pictures before v33, and we're now requiring v35+.
Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc
BUG=skia:
Committed: https://skia.googlesource.com/skia/+/52c293547b973f7fb5de3c83f5062b07d759ab88
Review URL: https://codereview.chromium.org/769953002
| |
https://codereview.chromium.org/769953002/)
Reason for revert:
Breaks canary builds. Will reland after the Chromium change lands.
Original issue's description:
> Remove SK_SUPPORT_LEGACY_DEEPFLATTENING.
>
> This was needed for pictures before v33, and we're now requiring v35+.
>
> Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc
>
> BUG=skia:
>
> Committed: https://skia.googlesource.com/skia/+/52c293547b973f7fb5de3c83f5062b07d759ab88
TBR=reed@google.com,mtklein@chromium.org
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/768183002
| |
This was needed for pictures before v33, and we're now requiring v35+.
Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc
BUG=skia:
Review URL: https://codereview.chromium.org/769953002
| |
factory/public-constructor for the class. We want to *not* rely on private constructors, and not rely on calling through the inheritance hierarchy for either flattening or unflattening (CreateProc).
Refactoring pattern:
1. Guard the existing constructor(readbuffer) with the legacy build flag.
2. If you are an instantiable subclass, implement CreateProc(readbuffer) to create a new instance from the buffer params (or return NULL).
If you're a shader subclass:
1. You must read/write the local matrix if your class accepts that in its factory/constructor, else ignore it.
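The refactoring pattern above might look roughly like the following sketch. ReadBuffer, OffsetEffect, and the SKETCH_LEGACY_DEEPFLATTENING macro are hypothetical stand-ins invented for this example, not Skia's classes or flag, and std::unique_ptr is used here in place of a raw pointer that may be NULL.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical read buffer: hands out 32-bit words and reports underflow.
class ReadBuffer {
public:
    explicit ReadBuffer(std::vector<int32_t> words) : fWords(std::move(words)) {}

    bool readInt(int32_t* out) {
        if (fPos >= fWords.size()) return false;
        *out = fWords[fPos++];
        return true;
    }

private:
    std::vector<int32_t> fWords;
    size_t fPos = 0;
};

// Hypothetical instantiable subclass.
class OffsetEffect {
public:
    // Public constructor: the only path used for new instances.
    OffsetEffect(int32_t dx, int32_t dy) : fDx(dx), fDy(dy) {}

    // Step 2: read the params from the buffer, then build through the
    // public constructor. Returns null if the buffer runs short.
    static std::unique_ptr<OffsetEffect> CreateProc(ReadBuffer& buffer) {
        int32_t dx = 0, dy = 0;
        if (!buffer.readInt(&dx) || !buffer.readInt(&dy)) return nullptr;
        return std::make_unique<OffsetEffect>(dx, dy);
    }

    int32_t dx() const { return fDx; }
    int32_t dy() const { return fDy; }

#ifdef SKETCH_LEGACY_DEEPFLATTENING  // hypothetical flag name
    // Step 1: the old buffer constructor survives only behind the flag.
    explicit OffsetEffect(ReadBuffer& buffer) {
        buffer.readInt(&fDx);
        buffer.readInt(&fDy);
    }
#endif

private:
    int32_t fDx = 0;
    int32_t fDy = 0;
};
```

With the flag undefined, the buffer can only reach the class through CreateProc, which is the point of the refactoring.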
R=robertphillips@google.com, mtklein@google.com, senorblanco@google.com, senorblanco@chromium.org, sugoi@chromium.org
Author: reed@google.com
Review URL: https://codereview.chromium.org/395603002
| |
Currently the NEON code for Xfermodes performs well on arm64
targets except for dstout and dstin which are significantly
slower than the C code. This patch fixes this and gives
further improvements on other modes.
Here are some perf results:
+------------+------------+------------+
| mode | Cortex-A53 | Cortex-A57 |
+------------+------------+------------+
| multiply | +24.58% | +23.71% |
+------------+------------+------------+
| exclusion | +22.72% | +22.05% |
+------------+------------+------------+
| difference | +34.67% | +36.82% |
+------------+------------+------------+
| hardlight | +17.07% | +14.74% |
+------------+------------+------------+
| lighten | +38.21% | +32.87% |
+------------+------------+------------+
| darken | +37.59% | +32.99% |
+------------+------------+------------+
| overlay | +17.36% | +16.88% |
+------------+------------+------------+
| screen | +52.56% | +54.43% |
+------------+------------+------------+
| modulate | +62.85% | +61.32% |
+------------+------------+------------+
| plus | +91.52% | +117.41% |
+------------+------------+------------+
| xor | +42.86% | +43.38% |
+------------+------------+------------+
| dstatop | +48.46% | +48.99% |
+------------+------------+------------+
| srcatop | +50.50% | +48.51% |
+------------+------------+------------+
| dstout | +67.83% | +78.09% |
+------------+------------+------------+
| srcout | +69.02% | +78.26% |
+------------+------------+------------+
| dstin | +70.92% | +79.24% |
+------------+------------+------------+
| srcin | +68.90% | +78.23% |
+------------+------------+------------+
| dstover | +73.80% | +68.10% |
+------------+------------+------------+
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=skia
R=mtklein@google.com, djsollen@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/350343002
| |
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions in two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partially merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
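The MatrixTest item above turns on the fact that a fused multiply-accumulate rounds once, while a separate multiply and add round twice. A small sketch of that difference, using std::fma from the standard library; the function names and the chosen inputs are illustrative and are not taken from MatrixTest.

```cpp
#include <cmath>

// Multiply, round to double, then add. The volatile store keeps the
// compiler from contracting the two operations into one fused instruction.
double unfused_mac(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Single rounding at the end: a * b is kept exact until c is added.
double fused_mac(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = b = 1 + 2^-30 and c = -(1 + 2^-29), the exact product is 1 + 2^-29 + 2^-60. The unfused version drops the 2^-60 term when rounding the product and returns 0, while the fused version returns 2^-60. A test that compares results computed along different paths can therefore see a larger gap once the compiler starts fusing, which is what -ffp-contract=off prevents.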
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
Committed: http://code.google.com/p/skia/source/detail?r=13980
R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@14025 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
(https://codereview.chromium.org/143423004/)
Reason for revert:
GYP's failing on most (all?) bots.
Original issue's description:
> ARM Skia NEON patches - 35 - First AArch64 support
>
> Aarch64 support
>
> This change contains the necessary modifications to have Skia build and
> run properly on an ARMv8 processor in aarch64 execution state.
>
> Here's a list of the changes:
>
> - add an arm64 target to the build system + SK_CPU_ARM64 flag
>
> - MatrixTest was failing when built in Release mode. Fused MAC
> instructions were generated which made some intermediate results
> more accurate. As the test relies on result comparison, the more
> precise results when compared to others led to a gap bigger than
> what was tolerated. As I don't know if some actual skia code relies
> on results being comparable, I've disabled fused MAC instruction
> with -ffp-contract=off for arm64.
>
> - Modify include/core/SkOnce.h to have barriers work.
>
> - SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
>
> - use existing Xfermode optimisations with modifications that can be
> removed in the future when toolchains are ready. Also save a few
> instructions in two Xfermodes (will apply to ARM too).
>
> - use existing SkBoxBlur and SkMorphology optimisations.
>
> - use existing SkBlitMask optimisations
>
> - use existing BitmapProcState and Convolution optimisations.
>
> Future changes will include:
>
> - Blitters (only partially merged upstream)
>
> - SkUtils (there's little value in sending asm optimisations without
> having them benchmarked on real hardware).
>
> Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
>
> BUG=skia:
>
> Committed: http://code.google.com/p/skia/source/detail?r=13980
R=djsollen@google.com, reed@google.com, halcanary@google.com, kevin.petit@arm.com
TBR=djsollen@google.com, halcanary@google.com, kevin.petit@arm.com, reed@google.com
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Author: mtklein@google.com
Review URL: https://codereview.chromium.org/216113005
git-svn-id: http://skia.googlecode.com/svn/trunk@13983 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Aarch64 support
This change contains the necessary modifications to have Skia build and
run properly on an ARMv8 processor in aarch64 execution state.
Here's a list of the changes:
- add an arm64 target to the build system + SK_CPU_ARM64 flag
- MatrixTest was failing when built in Release mode. Fused MAC
instructions were generated which made some intermediate results
more accurate. As the test relies on result comparison, the more
precise results when compared to others led to a gap bigger than
what was tolerated. As I don't know if some actual skia code relies
on results being comparable, I've disabled fused MAC instruction
with -ffp-contract=off for arm64.
- Modify include/core/SkOnce.h to have barriers work.
- SK_CPU_ARM64 implies SK_ARM_NEON_MODE_ALWAYS.
- use existing Xfermode optimisations with modifications that can be
removed in the future when toolchains are ready. Also save a few
instructions in two Xfermodes (will apply to ARM too).
- use existing SkBoxBlur and SkMorphology optimisations.
- use existing SkBlitMask optimisations
- use existing BitmapProcState and Convolution optimisations.
Future changes will include:
- Blitters (only partially merged upstream)
- SkUtils (there's little value in sending asm optimisations without
having them benchmarked on real hardware).
Signed-off-by: Kevin PETIT <kevin.petit@arm.com>
BUG=skia:
R=djsollen@google.com, reed@google.com, mtklein@google.com, halcanary@google.com
Author: kevin.petit@arm.com
Review URL: https://codereview.chromium.org/143423004
git-svn-id: http://skia.googlecode.com/svn/trunk@13980 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
This change is motivated by the desire to see the text information in the debugger when not in developer mode. It is structured so users can disable it if the capability is not wanted.
R=bsalomon@google.com
Author: robertphillips@google.com
Review URL: https://codereview.chromium.org/197763008
git-svn-id: http://skia.googlecode.com/svn/trunk@13795 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Eliminates SkFlattenable{Read,Write}Buffer, promoting SkOrdered{Read,Write}Buffer
a step each in the hierarchy.
What used to be this:
SkFlattenableWriteBuffer -> SkOrderedWriteBuffer
SkFlattenableReadBuffer -> SkOrderedReadBuffer
SkFlattenableReadBuffer -> SkValidatingReadBuffer
is now
SkWriteBuffer
SkReadBuffer -> SkValidatingReadBuffer
Benefits:
- code is simpler, names are less wordy
- the generic SkFlattenableFooBuffer code in SkPaint was incorrect; removed
- write buffers are completely devirtualized, important for record speed
This refactoring was mostly mechanical. You aren't going to find anything
interesting in files with less than 10 lines changed.
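The flattened hierarchy described above could be pictured with a sketch like this one. The class names follow the message, but the members and bodies are hypothetical and are not Skia's definitions. The only points illustrated are that the write side has no virtual methods and that the validating reader is the one remaining subclass.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shape: no virtual methods, so writeInt can be inlined.
class WriteBuffer final {
public:
    void writeInt(int32_t v) { fData.push_back(v); }
    const std::vector<int32_t>& data() const { return fData; }

private:
    std::vector<int32_t> fData;
};

// Hypothetical shape: the base reader trusts its input.
class ReadBuffer {
public:
    explicit ReadBuffer(const std::vector<int32_t>& data) : fData(data) {}
    virtual ~ReadBuffer() = default;

    virtual int32_t readInt() { return fData[fPos++]; }

protected:
    std::vector<int32_t> fData;
    size_t fPos = 0;
};

// Hypothetical shape: the validating reader checks bounds and records
// failure instead of reading past the end.
class ValidatingReadBuffer : public ReadBuffer {
public:
    using ReadBuffer::ReadBuffer;

    int32_t readInt() override {
        if (fPos >= fData.size()) {
            fValid = false;
            return 0;
        }
        return fData[fPos++];
    }

    bool isValid() const { return fValid; }

private:
    bool fValid = true;
};
```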
BUG=skia:
R=reed@google.com, scroggo@google.com, djsollen@google.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/134163010
git-svn-id: http://skia.googlecode.com/svn/trunk@13245 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
In some cases, it's easy to provide a NEON version of the 1-pixel modeprocs.
Combined with https://codereview.chromium.org/23724013/ (merged) it allows
up to 35% speed improvement on Xfermodes when aa is non-NULL.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, reed@google.com, mtklein@google.com, luisjoseromeroesclusa@hotmail.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/104883004
git-svn-id: http://skia.googlecode.com/svn/trunk@12525 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: add a NEON version of SkFourByteInterp
Brings a modest performance improvement on its own in
ProcXfermodes when aa is neither zero nor FF. Combined
with 1-pixel NEON modeprocs, it brings up to 35% speed
improvement on the aa case.
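For readers unfamiliar with the operation, a four-byte interpolation blends each of the four 8-bit channels of two packed pixels by a coverage value. The scalar sketch below shows the general idea only; four_byte_interp is a hypothetical name, the mapping of aa to a 1..256 scale is an assumption, and this is neither Skia's SkFourByteInterp source nor the NEON version from this patch.

```cpp
#include <cstdint>

// Blends src over dst per channel, weighted by aa in [0, 255].
inline uint32_t four_byte_interp(uint32_t src, uint32_t dst, uint8_t aa) {
    // Assumed mapping of [0, 255] to [1, 256], so aa == 255 yields src exactly.
    const uint32_t scale = aa + 1u;
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const uint32_t s = (src >> shift) & 0xFF;
        const uint32_t d = (dst >> shift) & 0xFF;
        // Weighted average of the two channels, truncated back to 8 bits.
        const uint32_t c = (s * scale + d * (256 - scale)) >> 8;
        out |= c << shift;
    }
    return out;
}
```

The four channels are independent, which is why the operation maps naturally onto one SIMD register per pixel.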
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/23724013
git-svn-id: http://skia.googlecode.com/svn/trunk@12448 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: xfer16
This adds support for 16bit Xfermodes. It also tunes the gcc test
macros in xfer32() to add compatibility for gcc > 4.
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/33063002
git-svn-id: http://skia.googlecode.com/svn/trunk@12192 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
Committed: http://code.google.com/p/skia/source/detail?r=11813
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11843 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
https://codereview.chromium.org/26627004) due to Chromium compilation failures.
git-svn-id: http://skia.googlecode.com/svn/trunk@11833 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
Committed: http://code.google.com/p/skia/source/detail?r=11777
R=djsollen@google.com, mtklein@google.com, reed@google.com, robertphillips@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11813 2bbb7eff-a529-9590-31e7-b0007b416f81
| |
to Chromium compilation failure
git-svn-id: http://skia.googlecode.com/svn/trunk@11799 2bbb7eff-a529-9590-31e7-b0007b416f81
|
Xfermode: NEON implementation of SIMD procs
This patch contains a NEON implementation for a number of Xfermodes.
It provides a big speedup on Xfermode benchmarks (currently up to 3x
with gcc4.7 but up to 10x when gcc produces optimal code for it).
Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
BUG=
R=djsollen@google.com, mtklein@google.com, reed@google.com
Author: kevin.petit.arm@gmail.com
Review URL: https://codereview.chromium.org/26627004
git-svn-id: http://skia.googlecode.com/svn/trunk@11777 2bbb7eff-a529-9590-31e7-b0007b416f81
|