| Commit message | Author | Age |
|
I also updated the dump feature to work with aarch64, and included comments on how to disassemble an aarch64 dump.
Looking at an aarch64 dump made it immediately obvious that the jump offset was off by 1.
Change-Id: I17fa6ee44779e8be69ab4582e338c88212aba36c
Reviewed-on: https://skia-review.googlesource.com/6841
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
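
The off-by-1 jump offset mentioned above is easy to hit on aarch64, where conditional branch offsets are counted in 4-byte instructions relative to the branch instruction itself, not the following one. The sketch below is not the SkSplicer code; `encode_b_ne` and `backward_offset` are hypothetical helpers, and it only assumes the standard `b.cond` encoding (base `0x54000000`, 19-bit signed immediate at bit 5, condition `ne` = 1).

```cpp
#include <cassert>
#include <cstdint>

// Encode `b.ne` with a signed offset measured in 4-byte instructions,
// relative to the address of the branch instruction itself.
// (Hypothetical helper for illustration, not taken from SkSplicer.cpp.)
uint32_t encode_b_ne(int32_t offset_in_instructions) {
    const uint32_t imm19 = static_cast<uint32_t>(offset_in_instructions) & 0x7ffff;
    return 0x54000001u | (imm19 << 5);
}

// Offset from the branch at instruction index `branch` back to `loop_start`.
// Measuring from `branch + 1` instead (as on x86, where offsets are relative
// to the next instruction) would land one instruction too early.
int32_t backward_offset(int32_t loop_start, int32_t branch) {
    return loop_start - branch;
}
```

For example, a branch at instruction index 10 that should return to index 2 needs `encode_b_ne(backward_offset(2, 10))`, an offset of -8 instructions.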
|
The new test would fail without the change in SkSplicer.cpp to call fSpliced(x,x+body) instead of fSpliced(x,body). The rest of the changes are cosmetic, mostly renaming n to limit.
Change-Id: Iae28802d0adb91e962ed3ee60fa5a4334bd140f9
Reviewed-on: https://skia-review.googlesource.com/6837
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
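
A minimal sketch of why fSpliced(x,body) and fSpliced(x,x+body) differ, assuming the second argument is an absolute end index (which the rename from n to limit suggests). `run_spliced`, `run_buggy`, and `run_fixed` are hypothetical stand-ins, not the real SkSplicer code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the spliced loop: visits indices in [x, limit)
// and returns how many it visited. `limit` is an end index, not a count.
size_t run_spliced(std::vector<int>& touched, size_t x, size_t limit) {
    size_t visited = 0;
    for (; x < limit; x++) {
        touched[x]++;
        visited++;
    }
    return visited;
}

// Passing a count where an end index is expected only works when x == 0.
size_t run_buggy(std::vector<int>& touched, size_t x, size_t body) {
    return run_spliced(touched, x, body);
}

// Converting the count to an end index covers all `body` elements.
size_t run_fixed(std::vector<int>& touched, size_t x, size_t body) {
    return run_spliced(touched, x, x + body);
}
```

With x = 4 and body = 8, the buggy call stops at index 8 and covers only 4 elements, while the fixed call covers all 8. A test that starts at x = 0 would not catch the difference.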
|
Seems to be working. The jump to loop_start might be a little off, but not by much. Correctness is really still a big TODO.
$ adb shell 'cd /data/local/tmp; ./monobench SkRasterPipeline 200'
SkRasterPipeline_…
200 …f16_compile 1x …f16_run 1.42x …srgb_compile 2.21x …srgb_run 2.59x
Change-Id: I0e1acc6404cf3ce8084d9ef8011cbe0b5f1fd6e3
Reviewed-on: https://skia-review.googlesource.com/6811
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
|
I think I may have cracked the compile-ahead-of-time-splice-at-runtime nut.
This compiles stages ahead of time using clang, then splices them together at runtime. This means the stages can be written in simple C++, with some mild restrictions.
This performs identically to our Xbyak experiment, and already supports more stages. As written this stands alone from SkRasterPipeline_opts.h, but I'm fairly confident that the bulk (the STAGE implementations) can ultimately be shared.
As of PS 25 or so, this also supports all the stages used by bench/SkRasterPipelineBench.cpp:
SkRasterPipeline_…
400 …f16_compile 1x …f16_run 1.38x …srgb_compile 1.89x …srgb_run 2.21x
That is, ~30% faster than baseline for f16, ~15% faster for sRGB.
Change-Id: I1ec7dcb769613713ce56978c58038f606f87d63d
Reviewed-on: https://skia-review.googlesource.com/6733
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@chromium.org>
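
The compile-ahead-of-time, splice-at-runtime idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SkSplicer implementation: `StageCode` and `splice` are hypothetical names, the stage bytes are placeholders, and the steps a real splicer also needs (a loop around the stages, copying into executable memory) are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical precompiled stage: a span of machine-code bytes produced
// offline by the compiler and embedded in the binary.
struct StageCode {
    const uint8_t* bytes;
    size_t         size;
};

// Concatenate the stages' code, in pipeline order, into one buffer.
// Making the buffer executable and calling into it is platform-specific
// and omitted here.
std::vector<uint8_t> splice(const std::vector<StageCode>& stages) {
    std::vector<uint8_t> buf;
    for (const StageCode& s : stages) {
        buf.insert(buf.end(), s.bytes, s.bytes + s.size);
    }
    return buf;
}
```

Because splicing is only a byte copy, each stage's code has to work wherever it lands in the buffer, which is presumably where the "mild restrictions" on the C++ stages come from.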
|