path: root/src/jumper/build_stages.py
author: Mike Klein <mtklein@chromium.org> 2017-10-02 11:43:20 -0700
committer: Skia Commit-Bot <skia-commit-bot@chromium.org> 2017-10-02 22:05:49 +0000
commit: b437351d880fd17ea2bb8fd0997da7754a32903c
tree: ee6f99644debae9e8e643ad49977084ee2ac8502 /src/jumper/build_stages.py
parent: 099fa0fb9801a138f12cf7cdf46b6581d81acce8
add _skx stages
This just makes sure all the plumbing is in place to use the Skylake Xeon
subset of AVX-512 instructions. So far,
  - no Windows
  - no lowp
  - nothing explicitly making use of AVX-512 registers or instructions

This initial pass should run essentially identically to the _hsw AVX2
code we've been using previously. Clang _does_ use AVX-512-only
instructions to implement some of the higher-level concepts we've coded,
but it's really a pretty subtle difference.

Next steps will bump N from 8 to 16 and start threading through an
AVX-512-friendly mask instead of tail. I'll also want to take a harder
look at how we do blending like if_then_else()... the default codegen
here doesn't really take advantage of AVX-512 the way I'd like here.

CQ_INCLUDE_TRYBOTS=skia.primary:Test-Debian9-Clang-GCE-CPU-AVX512-x86_64-Debug

Change-Id: I6c9442488a449ea4770617bb22b2669859cc92e2
Reviewed-on: https://skia-review.googlesource.com/54062
Commit-Queue: Mike Klein <mtklein@chromium.org>
Reviewed-by: Herb Derby <herb@google.com>
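The plumbing this change adds follows the pattern the script already uses: compile the stages source once per instruction-set specialization, then merge the resulting objects with a relocatable link. A minimal sketch of that pattern is below. It only builds the argument lists and does not invoke any tool. The helper names compile_cmd and merge_cmd, and the values given to clang, cflags, and stages, are placeholders chosen for illustration and are not taken from build_stages.py; only the skx flag and the object file names come from the diff.

```python
# Sketch only: builds argv lists without running clang or ld.

def compile_cmd(clang, cflags, arch_flags, source, output):
    """Return the argv for compiling one specialization of `source`."""
    return clang + cflags + arch_flags + ['-c', source] + ['-o', output]


def merge_cmd(objects, output='merged.o'):
    """Return the argv for a relocatable link (ld -r) of `objects`."""
    return ['ld', '-r', '-o', output] + list(objects)


# Placeholder values (assumptions); the real script defines its own.
clang = ['clang']
cflags = ['-O3']
stages = 'stages.cpp'

# Flag taken from the diff: target the Skylake Xeon subset of AVX-512.
skx = ['-march=skylake-avx512']

skx_cmd = compile_cmd(clang, cflags, skx, stages, 'skx.o')

# Object list as it appears in the diff after this change.
merged_cmd = merge_cmd(['skx.o', 'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
                        'lowp_hsw.o', 'lowp_sse41.o', 'lowp_sse2.o'])

print(' '.join(skx_cmd))
print(' '.join(merged_cmd))
```

Passing either list to subprocess.check_call would run the command and raise if it exits non-zero, which is how the script in the diff below invokes clang and ld.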
Diffstat (limited to 'src/jumper/build_stages.py')
-rwxr-xr-x src/jumper/build_stages.py | 7
1 files changed, 6 insertions, 1 deletions
diff --git a/src/jumper/build_stages.py b/src/jumper/build_stages.py
index 52a8c8ba40..a5b6280b80 100755
--- a/src/jumper/build_stages.py
+++ b/src/jumper/build_stages.py
@@ -99,10 +99,15 @@ subprocess.check_call(clang + cflags + hsw + win +
                       ['-c', stages_lowp] +
                       ['-o', 'win_lowp_hsw.o'])
 
+skx = ['-march=skylake-avx512']
+subprocess.check_call(clang + cflags + skx +
+                      ['-c', stages] +
+                      ['-o', 'skx.o'])
+
 # Merge x86-64 object files to deduplicate constants.
 # (No other platform has more than one specialization.)
 subprocess.check_call(['ld', '-r', '-o', 'merged.o',
-                       'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
+                       'skx.o', 'hsw.o', 'avx.o', 'sse41.o', 'sse2.o',
                        'lowp_hsw.o', 'lowp_sse41.o', 'lowp_sse2.o'])
 subprocess.check_call(['ld', '-r', '-o', 'win_merged.o',
                        'win_hsw.o', 'win_avx.o', 'win_sse41.o', 'win_sse2.o',