| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Moves CPU feature detection to its own file.
- Cleans up some redundant feature detection scattered around core/ and opts/.
- Can now detect a few new CPU features:
* F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
* FMA -> Intel FMA instructions, added at the same time as AVX2
* VFP_FP16 -> ARM f16<->f32 instructions, quite common
* NEON_FMA -> ARM FMA instructions, also quite common
* SSE and SSE3... why not?
This new internal API makes it very cheap to do fine-grained runtime CPU
feature detection. Redundant calls to SkCpu::Supports() should be eliminated
and it's hoistable out of loops. It compiles away entirely when we have the
appropriate instructions available at compile time.
This means we can call it to guard even a little snippet of 1 or 2 instructions
right where needed and let inlining hoist the check (if any at all) up to
somewhere that doesn't hurt performance. I've explained how I made this work
in the private section of the new header.
Once this lands and bakes a bit, I'll start following up with CLs to use it more
and to add a bunch of those little 1-2 instruction snippets we've been wanting,
e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
(for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607
Review URL: https://codereview.chromium.org/1890483002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
of https://codereview.chromium.org/1890483002/ )
Reason for revert:
many unexpected GM diffs across GPU+CPU configs on Windows (hopefully just text masks on GPU?). seems like we pick a different srcover variant in some places.
Original issue's description:
> Move CPU feature detection to its own file.
>
> - Moves CPU feature detection to its own file.
> - Cleans up some redundant feature detection scattered around core/ and opts/.
> - Can now detect a few new CPU features:
> * F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
> * FMA -> Intel FMA instructions, added at the same time as AVX2
> * VFP_FP16 -> ARM f16<->f32 instructions, quite common
> * NEON_FMA -> ARM FMA instructions, also quite common
> * SSE and SSE3... why not?
>
> This new internal API makes it very cheap to do fine-grained runtime CPU
> feature detection. Redundant calls to SkCpu::Supports() should be eliminated
> and it's hoistable out of loops. It compiles away entirely when we have the
> appropriate instructions available at compile time.
>
> This means we can call it to guard even a little snippet of 1 or 2 instructions
> right where needed and let inlining hoist the check (if any at all) up to
> somewhere that doesn't hurt performance. I've explained how I made this work
> in the private section of the new header.
>
> Once this lands and bakes a bit, I'll start following up with CLs to use it more
> and to add a bunch of those little 1-2 instruction snippets we've been wanting,
> e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
> (for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
>
> BUG=skia:
> GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
> CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
>
> Committed: https://skia.googlesource.com/skia/+/872ea29357439f05b1f6995dd300fc054733e607
TBR=fmalita@chromium.org,herb@google.com,reed@google.com,mtklein@chromium.org
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG=skia:
Review URL: https://codereview.chromium.org/1892643003
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Moves CPU feature detection to its own file.
- Cleans up some redundant feature detection scattered around core/ and opts/.
- Can now detect a few new CPU features:
* F16C -> Intel f16<->f32 instructions, added between AVX and AVX2
* FMA -> Intel FMA instructions, added at the same time as AVX2
* VFP_FP16 -> ARM f16<->f32 instructions, quite common
* NEON_FMA -> ARM FMA instructions, also quite common
* SSE and SSE3... why not?
This new internal API makes it very cheap to do fine-grained runtime CPU
feature detection. Redundant calls to SkCpu::Supports() should be eliminated
and it's hoistable out of loops. It compiles away entirely when we have the
appropriate instructions available at compile time.
This means we can call it to guard even a little snippet of 1 or 2 instructions
right where needed and let inlining hoist the check (if any at all) up to
somewhere that doesn't hurt performance. I've explained how I made this work
in the private section of the new header.
Once this lands and bakes a bit, I'll start following up with CLs to use it more
and to add a bunch of those little 1-2 instruction snippets we've been wanting,
e.g. cvtps2ph, cvtph2ps, ptest, pmulld, pmovzxbd, blendvps, pshufb, roundps
(for floor) on x86, and vcvt.f32.f16, vcvt.f16.f32 on ARM.
BUG=skia:
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1890483002
CQ_EXTRA_TRYBOTS=client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot
Review URL: https://codereview.chromium.org/1890483002
|
|
|
|
|
|
| |
GOLD_TRYBOT_URL= https://gold.skia.org/search2?unt=true&query=source_type%3Dgm&master=false&issue=1842753002
Review URL: https://codereview.chromium.org/1842753002
|
|
|
|
|
|
| |
DOCS_PREVIEW= https://skia.org/?cl=1316233002
Review URL: https://codereview.chromium.org/1316233002
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we just 'define' them, but not attribute anything to them, like
'1' for example, cpp expands it to nothing and that breaks the "#if"
clauses.
To fix that, uses "#if defined(...)" which will correctly check if your
macro name was defined or not.
BUG=skia:2850
TEST=make most
R=robertphillips@google.com
Review URL: https://codereview.chromium.org/628763005
|
|
|
|
|
|
|
|
|
|
|
|
| |
This CL also removes the debug capability of runtime switching
in/out of NEON mode as it uses deprecated APIs.
BUG=skia:1061
R=tomhudson@google.com
Author: djsollen@google.com
Review URL: https://codereview.chromium.org/506033003
|
|
|
|
|
|
|
|
|
|
|
| |
R=djsollen@google.com, reed@google.com
BUG=skia:
Author: george@mozilla.com
Review URL: https://codereview.chromium.org/189263015
git-svn-id: http://skia.googlecode.com/svn/trunk@13750 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
| |
detection on Android.
R=scroggo@google.com
Review URL: https://codereview.chromium.org/22193002
git-svn-id: http://skia.googlecode.com/svn/trunk@10530 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch ensures that when Skia is built for Chromium, it will
always use the Android NDK's cpu-features helper library to detect
NEON at runtime.
This is needed because sandboxed Chromium renderer processes cannot
access /proc, and the probing performed in SkUtilsArm.cpp will never
work. As such, the NEON code paths will never be used even when the
device supports them.
Chromium has special code that ensures that the browser process
passes the CPU features flags to every renderer process, but
Skia needs to use android_getCpuFeatures() to get them.
See http://crbug.com/164154 for full details.
Review URL: https://codereview.appspot.com/7102045
git-svn-id: http://skia.googlecode.com/svn/trunk@7149 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
| |
https://codereview.appspot.com/6465078/)
This CL is part I of IV (I broke down the 1280 files into 4 CLs).
Review URL: https://codereview.appspot.com/6485054
git-svn-id: http://skia.googlecode.com/svn/trunk@5262 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements dynamic ARM NEON support for the functions
implemented by src/core/SkBitmapProcState_matrixProcs.cpp.
- Because the SkBitmapProcState_matrix_{clamp,repeat}.h headers
are NEON-specific, they are renamed with a _neon.h suffix, and
moved to src/opts/ (from src/core/)
- Add a new file src/opts/SkBitmapProcState_matrixProcs_neon.cpp
which implements the NEON code paths for all builds, and add
it to the 'opts_neon' static library.
- Modify SkBitmapProcState_matrixProcs.cpp to select the right
code-path depending on our build configuration. Note that in
the case where 'arm_neon == 1', we do not embed regular ARM
code paths in the final binary. Only 'arm_neon_optional == 1'
builds will contain both regular and NEON code paths at the
same time.
Note that there doesn't seem to be a simple way to put the
NEON-specific selection from that currently is in
SkBitmapProcState_matrixProcs.cpp into src/opts/. Doing so
would require much more drastic restructuring. This is also
true of the other SkBitmapProcState source files that will
be touched in a future patch.
Review URL: https://codereview.appspot.com/6453065
git-svn-id: http://skia.googlecode.com/svn/trunk@4888 2bbb7eff-a529-9590-31e7-b0007b416f81
|
|
This patch adds minimal support for dynamic ARM NEON support,
i.e. the ability to probe the CPU at runtime for NEON and
provide alternate code paths when it is available.
- Add include/core/SkUtilsArm.h, which declares a few helper
macros (e.g. SK_NEON_ARM_IS_DYNAMIC), plus the handy
function 'sk_cpu_arm_has_neon()' which returns true if
the target CPU supports the ARM NEON instruction set.
Note that the header is in include/core/ because it will
have to be included from NEON-specific code under src/code/
It would probably be more logical to put it under include/opts/
instead, but this would require moving all the NEON-specific
stuff under src/code/ into src/opts/, which is not trivial
due to the way the code is currently architected.
- Add src/core/SkUtilsArm.cpp which implements
'sk_cpu_arm_has_neon' for ARM-based Linux systems, only
when SK_NEON_ARM_IS_DYNAMIC is true.
(For other cases, 'sk_cpu_arm_has_neon' is an inline function
that returns a constant 'true' or 'false' value).
There is no user-level accessible CPUID instruction on ARM,
so do all CPU feature probing by parsing /proc/cpuinfo.
This is Linux-specific.
For Debug build types, the CPU probing result is printed
to the Android log (or Linux command-line) for easier
debugging.
- Create a new 'opts_neon' target (static library) which shall
contain all the NEON-specific code paths for the library.
This is necessary because -mfpu=neon impacts also non-scalar
code. Just like with -mssse3 on x86, we can't build the rest
of the library with this flag.
Note that for now, we only include memset16_neon and
memset32_neon in this library.
- Modify opts_check_arm.cpp to implement SK_ARM_NEON_IS_DYNAMIC
properly.
Compared to a 'xoom' build, the only difference is the use of
NEON-optimized memset16/32 functions. Later patches will move
more NEON-specific code paths to 'opts_neon'.
Review URL: https://codereview.appspot.com/6247058
git-svn-id: http://skia.googlecode.com/svn/trunk@4069 2bbb7eff-a529-9590-31e7-b0007b416f81
|