author | 2017-10-02 11:43:20 -0700 | |
---|---|---|
committer | 2017-10-02 22:05:49 +0000 | |
commit | b437351d880fd17ea2bb8fd0997da7754a32903c (patch) | |
tree | ee6f99644debae9e8e643ad49977084ee2ac8502 /src/jumper/build_stages.py | |
parent | 099fa0fb9801a138f12cf7cdf46b6581d81acce8 (diff) |
add _skx stages
This just makes sure all the plumbing is in place to use the Skylake
Xeon subset of AVX-512 instructions. So far,
- no Windows
- no lowp
- nothing explicitly making use of AVX-512 registers or instructions
This initial pass should run essentially identically to the _hsw AVX2
code we've been using previously. Clang _does_ use AVX-512-only
instructions to implement some of the higher-level concepts we've coded,
but it's really a pretty subtle difference.
Next steps will bump N from 8 to 16 and start threading through an
AVX-512-friendly mask instead of tail. I'll also want to take a harder
look at how we do blending, like if_then_else(): the default codegen
there doesn't really take advantage of AVX-512 the way I'd like.
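
As a rough illustration of the "mask instead of tail" idea, here is a minimal Python sketch of how a tail count could map to a per-lane bitmask once N is 16. The helper name `tail_to_mask` is hypothetical, and the convention that tail == 0 means "all N lanes active" is an assumption for this sketch, not something this commit states.

```python
N = 16  # lanes per vector after the planned bump from 8 to 16


def tail_to_mask(tail: int, lanes: int = N) -> int:
    """Return a bitmask with one bit set per active lane.

    Assumption (not from the commit): tail == 0 means every lane is
    active; otherwise tail is the number of active low lanes.
    """
    if not 0 <= tail < lanes:
        raise ValueError("tail must be in [0, lanes)")
    if tail == 0:
        return (1 << lanes) - 1   # full vector: all bits set
    return (1 << tail) - 1        # partial vector: low `tail` bits set


full = tail_to_mask(0)     # all 16 lanes
partial = tail_to_mask(3)  # lanes 0, 1, 2 only
```

A mask of this shape is what AVX-512 masked loads and stores consume directly, which is presumably why it is friendlier than branching on tail.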
CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-Clang-GCE-CPU-AVX512-x86_64-Debug
Change-Id: I6c9442488a449ea4770617bb22b2669859cc92e2
Reviewed-on: https://skia-review.googlesource.com/54062
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
Diffstat (limited to 'src/jumper/build_stages.py')
-rwxr-xr-x | src/jumper/build_stages.py | 7 |
1 files changed, 6 insertions, 1 deletions
diff --git a/src/jumper/build_stages.py b/src/jumper/build_stages.py
index 52a8c8ba40..a5b6280b80 100755
--- a/src/jumper/build_stages.py
+++ b/src/jumper/build_stages.py
@@ -99,10 +99,15 @@ subprocess.check_call(clang + cflags + hsw + win +
                       ['-c', stages_lowp] +
                       ['-o', 'win_lowp_hsw.o'])
 
+skx = ['-march=skylake-avx512']
+subprocess.check_call(clang + cflags + skx +
+                      ['-c', stages] +
+                      ['-o', 'skx.o'])
+
 # Merge x86-64 object files to deduplicate constants.
 # (No other platform has more than one specialization.)
 subprocess.check_call(['ld', '-r', '-o', 'merged.o',
-                       'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
+                       'skx.o', 'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
                        'lowp_hsw.o', 'lowp_sse41.o', 'lowp_sse2.o'])
 subprocess.check_call(['ld', '-r', '-o', 'win_merged.o',
                        'win_hsw.o', 'win_avx.o', 'win_sse41.o', 'win_sse2.o',
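
The pattern in this change (compile one object per instruction-set variant, then merge with `ld -r`) can be sketched as below. This only builds the argument lists and does not invoke any tools. The helper names `compile_cmd` and `merge_cmd`, and the placeholder values for `clang`, `cflags`, and the source file name, are assumptions for illustration; the real values are defined elsewhere in build_stages.py.

```python
def compile_cmd(clang, cflags, arch_flags, source, obj):
    """Build the argv list for compiling `source` into `obj` for one variant."""
    return clang + cflags + arch_flags + ['-c', source] + ['-o', obj]


def merge_cmd(objs, out='merged.o'):
    """Build the argv list for a relocatable link (ld -r) of several objects."""
    return ['ld', '-r', '-o', out] + list(objs)


# Placeholder values; the real script defines its own clang, cflags and stages.
clang = ['clang']
cflags = ['-O3']
stages = 'stages.cpp'

skx = ['-march=skylake-avx512']  # flag taken from the diff above
skx_compile = compile_cmd(clang, cflags, skx, stages, 'skx.o')

# Object order mirrors the updated merge list in the diff.
merge = merge_cmd(['skx.o', 'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
                   'lowp_hsw.o', 'lowp_sse41.o', 'lowp_sse2.o'])
```

Each list could then be passed to `subprocess.check_call`, as the script does. Note that the diff adds `skx.o` only to `merged.o`, consistent with the "no Windows" and "no lowp" notes in the commit message.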