path: root/src/jumper/build_stages.py
Commit message | Author | Date
* SkJumper: update to Clang 4.0 | Mike Klein | 2017-03-15
    This Clang makes some new decisions about what (not) to inline. Luckily, liberal use of the 'inline' keyword steers it back in the right direction. This new code draws the same, and generally looks improved.
    Change-Id: I0ab6e1c884e6b339d01ae46a08a848e36dcc535a
    Reviewed-on: https://skia-review.googlesource.com/9702
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper: more constants, _f and _i literals. | Mike Klein | 2017-03-14
    Generalize the set of section types to avoid, adding another type (.rodata). I've kept K for iota only. Maybe one day...
    Change-Id: Ie5678a2ea00fefe550bc0e6dcab32f98c31d3fae
    Reviewed-on: https://skia-review.googlesource.com/9403
    Commit-Queue: Mike Klein <mtklein@chromium.org>
    Reviewed-by: Herb Derby <herb@google.com>
* Back to code as data arrays, this time in .text. | Mike Klein | 2017-03-07
    This technique lets us generate a single source file, use the C++ preprocessor, and avoid the pain of working with assemblers. By using the section attribute or declspec allocate, we can put these data arrays into the .text section, making them ordinary code. This is like the previous solution, except it should actually run.
    CQ_INCLUDE_TRYBOTS=skia.primary:Test-Win2k8-MSVC-GCE-CPU-AVX2-x86_64-Debug,Test-Mac-Clang-MacMini6.2-CPU-AVX-x86_64-Debug,Test-Ubuntu-Clang-GCE-CPU-AVX2-x86_64-Debug,Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Debug
    Change-Id: Ide7675f6cf32eb4831ff02906acbdc3faaeaa684
    Reviewed-on: https://skia-review.googlesource.com/9336
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
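The "code as data arrays" idea in the commit above can be illustrated with a small generator. This is only a sketch, not the actual contents of build_stages.py: the function name `emit_code_array`, the `CODE` macro, the MSVC segment name "code", and the array name `sk_just_return` are all assumptions made for illustration. It shows how a script might emit a C++ byte array tagged with a section attribute (GCC/Clang) or declspec allocate (MSVC).

```python
def emit_code_array(name, code):
    """Return C++ source declaring `code` (a bytes object) as an array placed in a code section.

    Hypothetical helper: the macro and segment names below are illustrative assumptions.
    """
    header = [
        "#if defined(_MSC_VER)",
        '    #pragma section("code", read, execute)',
        '    #define CODE extern "C" __declspec(allocate("code"))',
        "#else",
        '    #define CODE extern "C" __attribute__((section(".text")))',
        "#endif",
        "",
    ]
    body = ["CODE const unsigned char %s[] = {" % name]
    # Emit eight bytes per line so the generated source stays readable.
    for i in range(0, len(code), 8):
        chunk = code[i:i + 8]
        body.append("    " + ",".join("0x%02x" % b for b in chunk) + ",")
    body.append("};")
    return "\n".join(header + body) + "\n"


if __name__ == "__main__":
    # 0xc3 is the x86 `ret` instruction; the array name is made up.
    print(emit_code_array("sk_just_return", bytes([0xc3])))
```

Because the output is ordinary C++ source, it can go through the preprocessor and the normal compiler, which is the benefit the commit message describes.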
* SkJumper: be more precise by rejecting data sections. | Mike Klein | 2017-03-02
    This allows %rip addressing as long as it's not going into a data section. This lets us use switch tables, avoiding loops and stack.
    On HSW,
      SkRasterPipeline_f16: 90 -> 63
      SkRasterPipeline_srgb: 170 -> 97
    Change-Id: I3ca2e4ff819b70beea78be75579f9d80c06979e8
    Reviewed-on: https://skia-review.googlesource.com/9146
    Reviewed-by: Herb Derby <herb@google.com>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper: allow the compiler to generate FMAs | Mike Klein | 2017-03-02
    Today we use mad() to get FMAs where possible. -ffp-contract=fast lets the compiler generate them if it spots an opportunity. It looks like it's found a mix of FMAs and FMSs. I will follow up by seeing if we can relax the use of mad(). Quick experiments say no, but less quick experiments may say otherwise.
    Change-Id: I5228811cfbf11cccc0d715672a464fd1e1cea3b0
    Reviewed-on: https://skia-review.googlesource.com/9136
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
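For readers unfamiliar with what fusing changes numerically, here is a minimal sketch. It is not Skia's mad(); the function names are hypothetical, and it uses Python doubles rather than the 32-bit floats the stages operate on. An unfused multiply-add rounds twice (after the multiply and after the add), while a fused one rounds once, which the sketch emulates with exact rational arithmetic.

```python
from fractions import Fraction


def unfused_mul_add(a, b, c):
    """a*b is rounded to a double first, then the sum is rounded again."""
    return a * b + c


def fused_mul_add(a, b, c):
    """Emulate a fused multiply-add: compute a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs chosen so that the exact product 1 - 2**-60 is not representable.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

print(unfused_mul_add(a, b, c))  # the product rounds to 1.0, so the tiny term is lost
print(fused_mul_add(a, b, c))    # the single rounding keeps the -2**-60 term
```

This difference is why letting the compiler contract expressions can change generated results slightly, and why it is an opt-in flag.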
* Some small SkJumper refactoring. | Mike Klein | 2017-03-02
    No generated code changes.
    Change-Id: I2d480b5391f8246a01118766a9522d528a87f75a
    Reviewed-on: https://skia-review.googlesource.com/9129
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper: upgrade to Clang 3.9 | Mike Klein | 2017-03-01
    Mostly I think this will help me handle the AVX tails better. But there are some wins here already, particularly in AVX and ARM code.
    Change-Id: Ie79b4c2c4ab455277c313f15d360cbf8e4bb7836
    Reviewed-on: https://skia-review.googlesource.com/9126
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Reviewed-by: Herb Derby <herb@google.com>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper: reformat .S files | Mike Klein | 2017-02-23
    Decimal byte encoding makes more horizontal space for comments, which are the only thing you really want to read. No code change here.
    Change-Id: I674d78c898976063b0d89b747af41c62dc294303
    Reviewed-on: https://skia-review.googlesource.com/8899
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
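A rough sketch of the decimal byte encoding mentioned above, assuming each instruction arrives as a string of hex bytes plus its disassembly text. The helper name `format_byte_line`, the column width, and the `//` comment style are assumptions for illustration, not the script's actual output format.

```python
def format_byte_line(hex_bytes, comment):
    """Format one instruction as a decimal .byte directive with a trailing comment.

    `hex_bytes` is a whitespace-separated string such as "c5 fc"; hypothetical helper.
    """
    decimals = ",".join(str(int(h, 16)) for h in hex_bytes.split())
    # Pad the byte list to a fixed width so the comments line up in one column.
    return "  .byte  %-36s// %s" % (decimals, comment)


if __name__ == "__main__":
    # 0xc3 is the x86 `ret` instruction, written here in AT&T syntax.
    print(format_byte_line("c3", "retq"))
```

Writing the bytes in decimal with no `0x` prefixes takes fewer columns than a hex list, leaving the space for the disassembly comment.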
* Add AVX to the SkJumper mix. | Mike Klein | 2017-02-23
    AVX is a nice little halfway point between SSE4.1 and HSW, in terms of instructions available, performance, and availability.
    Intel chips have had AVX since ~2011, compared to ~2013 for HSW and ~2007 for SSE4.1. Like HSW it's got 8-wide 256-bit float vectors, but integer (and double) operations are essentially still only 128-bit. It also doesn't have F16 conversion or FMA instructions.
    It doesn't look like this is going to be a burden to maintain, and only adds a few KB of code size. In exchange, we now run 8x wide on 45% to 70% of x86 machines, depending on the OS.
    In my brief testing, speed eerily resembles exact geometric progression:
      SSE4.1: 1x speed (baseline)
      AVX: ~sqrt(2)x speed
      HSW: ~2x speed
    This adds all the basic plumbing for AVX but leaves it disabled. I'll flip it on once I've implemented the f16 TODOs.
    Change-Id: I1c378dabb8a06386646371bf78ade9e9432b006f
    Reviewed-on: https://skia-review.googlesource.com/8898
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
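The "geometric progression" remark can be checked with a couple of lines of arithmetic. The sketch below takes the approximate speeds quoted in the commit message as exact values (an assumption; they are rough measurements) and shows that each step multiplies speed by the same factor, sqrt(2).

```python
import math


def step_ratios(speeds):
    """Return the ratio between each consecutive pair of speeds."""
    return [later / earlier for earlier, later in zip(speeds, speeds[1:])]


# Relative speeds for SSE4.1, AVX, HSW, idealized from the commit message.
speeds = [1.0, math.sqrt(2), 2.0]

if __name__ == "__main__":
    for ratio in step_ratios(speeds):
        print(ratio)
```

In a geometric progression the consecutive ratios are equal, so AVX sitting at the geometric mean of 1x and 2x is what makes the three figures line up.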
* SkJumper: Windows | Mike Klein | 2017-02-21
    - Compile stages with -DWIN to pick up MS-specific start_pipeline().
    - Add SkJumper_generated_win.S with MS-specific assembly.
    - Add a minimal asm tool to our GN Windows toolchain.
    The SkRasterPipeline_f16 benchmark runs ~4x faster on my desktop.
    Change-Id: Ia45afb4ecb6a055e2c0e43f0f54f59e081c23b7f
    Reviewed-on: https://skia-review.googlesource.com/8778
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper: aarch64 and armv7 | Mike Klein | 2017-02-18
    Change-Id: Ie356b062372af3516a437d27bafa20d98e28edd6
    Reviewed-on: https://skia-review.googlesource.com/8678
    Commit-Queue: Mike Klein <mtklein@chromium.org>
    Reviewed-by: Mike Klein <mtklein@chromium.org>
* SkJumper: start on asm | Mike Klein | 2017-02-17
    Will follow up with Linux, then Android aarch64 and armv7, then iOS, then Windows. I took some opportunities to refactor.
    CQ_INCLUDE_trybots=skia.primary:Test-Mac-Clang-MacMini6.2-CPU-AVX-x86_64-Debug,Perf-Mac-Clang-MacMini6.2-CPU-AVX-x86_64-Debug
    Change-Id: Ifcf1edabdfe5df0a91bd089f09523aba95cdf5ef
    Reviewed-on: https://skia-review.googlesource.com/8611
    Commit-Queue: Mike Klein <mtklein@chromium.org>
    Reviewed-by: Herb Derby <herb@google.com>
* SkJumper: make some room for wider instructions. | Mike Klein | 2017-02-16
    No real change here.
    Change-Id: I56449c292585038901d78902e6aeb68203e36351
    Reviewed-on: https://skia-review.googlesource.com/8476
    Reviewed-by: Mike Klein <mtklein@chromium.org>
    Commit-Queue: Mike Klein <mtklein@chromium.org>
* SkJumper | Mike Klein | 2017-02-16
    Change-Id: If9f73e712e429564fef58ccb838c212ec8d2e68c
    Reviewed-on: https://skia-review.googlesource.com/8525
    Commit-Queue: Mike Klein <mtklein@chromium.org>
    Reviewed-by: Herb Derby <herb@google.com>