| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This way, we could have more stack memory on Google3: if each of the
two branche has 12K stack memory, Google3 would believe that it needs 24K
stack memory; but using SkSTArenaAlloc, we could use 12K stack memory to
handle those two branches.
Bug: skia:
Change-Id: Ie9234226cd4ba93b5be2ebeb95ab771031354f97
Reviewed-on: https://skia-review.googlesource.com/42101
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
| |
Bug: skia:
Change-Id: I5c98220498c71ced4565f492335cef2a372d0765
Reviewed-on: https://skia-review.googlesource.com/41743
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Always inline (Clang previously ignored inline and got 25% slower)
2. SIMD everywhere other than x86 gcc:
non-SIMD is only faster in my desktop with gcc;
with Clang on my desktop, SIMD is 50% faster than non-SIMD.
3. Allocate 4x memory instead of 2x when running out of space:
on old Android devices with Linux kernel 3.10 (e.g., Nexus 6P, 5X),
the alloc/memcpy will triger a major bottleneck in kernel (30% of
the running time). Such bottleneck goes away (the kernel is no
longer doing stupid things during alloc/memcpy) in Linux kernel
3.18 (e.g., Pixel), and that's why DAA is much faster on Pixel than
on Nexus 6P.
I think maybe I should adopt SkRasterPipeline for device-specific
optimizations.
Bug: skia:
Change-Id: I0408aa7671a5f1b39aad3bec25f8fc994ff5a1bb
Reviewed-on: https://skia-review.googlesource.com/30820
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
| |
Bug: skia:
Change-Id: I9f86f71247379713ffaf14e5c704c2ac4c6f9cbd
Reviewed-on: https://skia-review.googlesource.com/26861
Reviewed-by: Mike Reed <reed@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that google3 is using -fstack-usage to determine whether
we exceed 16k.
Bug: skia:
Change-Id: I259ff7fc0e6614dde83eb340f0a17efbc52ebf57
Reviewed-on: https://skia-review.googlesource.com/26940
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
|
|
|
| |
It seems that the compiler added the stack size of two "if" branches,
rather than get the max of them...
Bug: skia:
Change-Id: Idf6b47cafd84c9a53a7b8dafb38f815e08094100
Reviewed-on: https://skia-review.googlesource.com/26780
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Replace deprecated function call and reduce stack usage in g3
Bug: skia:
Change-Id: Ib49ccecef4711c92ea2e62e772d98c0f5097e30d
TBR: reed@google.com, caryclark@google.com
Reviewed-on: https://skia-review.googlesource.com/26565
Reviewed-by: Yuqian Li <liyuqian@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|
|
DAA is:
1. Much simpler than AAA.
SkScan_AAAPath.cpp is about 1700 lines.
SkScan_DAAPath.cpp is about 300 lines.
The whole DAA CL is only about 800 lines.
2. Much faster than AAA for complicated paths.
The speedup applies to GL backend (including ccpr)!
Here's the frame time of 'SampleApp --slide Chart' on macbook pro:
AAA-raster: 33ms
DAA-raster: 21ms
AAA-gl: 30ms
DAA-gl: 20ms
AAA-ccpr: 18ms
DAA-ccpr: 12ms
My linux desktop doesn't have SSE3 so the speedup is smaller
(~25% for Chart). I believe that DAA is so fast that I can enable
it for any paths (AAA is not enabled by default for complicated
paths because it is slow; hence our older supersampling scan
converter is used for stroking on Chart for AAA-xxx config.)
3. The SkCoverageDelta is suitable for threaded backend with
out-of-order concurrent scan conversion as commented in the source
code. Maybe we can also just send deltas to GPU.
4. Similar to most analytic path renderers, the quality is on the best
ground-truth level, unless there are intersections within a pixel.
The intersections look good to my eyes although theoretically that
could be arbitrary far from the ground truth (see my AAA slides).
5. For simple paths, such as circle, triangle, rrect, etc., DAA is
slower than AAA. But DAA is faster than our older supersampling
scan converter in most cases. As those simple paths usually don't
constitute the bottleneck of a picture (skp or svg), I strongly
recommend use DAA.
6. DAA also heavily favors blitMask so it may work quite well with
SkRasterPipeline and SkRasterPipelineBlitter.
Finally, please check https://skia-review.googlesource.com/c/22420/
which accelerate DAA by specializing blitCoverageDeltas for
SkARGB32_Blitter and SkARGB32_Black_Blitter. It brings a little(<5%)
speedup. But I couldn't figure out how to reduce the duplicate code
so I don't intend to land it.
Bug: skia:
Change-Id: I3b7ed6a727447922e645b1acb737a506e7c09a4c
Reviewed-on: https://skia-review.googlesource.com/19666
Reviewed-by: Mike Reed <reed@google.com>
Reviewed-by: Cary Clark <caryclark@google.com>
Commit-Queue: Yuqian Li <liyuqian@google.com>
|