path: root/src/opts
Commit message (author, date)
* Optimize highQualityFilter (qiankun.miao, 2014-11-25)
  portable version:
    before: 10M 1 806µs 807µs 810µs 821µs 1% █▂▁▁▃▁▁▁█▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
    after:  10M 1 566µs 568µs 569µs 579µs 1% ▄▂▂█▂▁▁▁▃▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
  SSE version:
    before: 10M 1 485µs 486µs 487µs 494µs 1% ▇▂▁▁▁▁█▂▁▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
    after:  10M 1 419µs 420µs 421µs 430µs 1% ▅▃▂▁▁█▂▁▁▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
  BUG=skia:
  Review URL: https://codereview.chromium.org/759603002
* Add SkBlendARGB32_SSE2() to clean up code (qiankun.miao, 2014-11-25)
  Related nanobench results:
  before:
    maxrss loops min median mean max stddev samples config bench
    10M 2 31.9µs 32.4µs 33.3µs 38.7µs 6% █▄▂▂▂▁▂▁▁▁ 8888 bitmap_BGRA_8888_A_scale_bicubic
    10M 13 43.8µs 51.8µs 49.6µs 57.9µs 11% ▁▁▁▁▂▆▇▆▅█ 8888 bitmap_BGRA_8888_A_scale_bilerp
    10M 13 23.7µs 24.3µs 26µs 32.7µs 13% ▅█▆▁▁▁▁▂▁▁ 8888 bitmap_Index_8_A
    10M 4 1.68µs 1.7µs 4.09µs 25.4µs 183% █▁▁▁▁▁▁▁▁▁ 8888 text_16_AA_88
    10M 144 1.76µs 1.77µs 1.78µs 1.81µs 1% █▂▇▂▅▁▁▁▁▁ 8888 text_16_AA_FF
    10M 10 4.7µs 5.34µs 5.61µs 8.63µs 21% █▂▂▃▂▁▁▁▁▄ 8888 rotated_rects_aa_alternating_transparent_and_opaque_src
    10M 50 4.44µs 4.47µs 4.5µs 4.71µs 2% █▅▃▂▂▂▁▁▁▁ 8888 rotated_rects_aa_changing_opaque_src
    10M 51 4.39µs 4.78µs 5.21µs 6.62µs 17% ▁▆▆▇▁▁█▁▂▂ 8888 rotated_rects_aa_same_opaque_src
    10M 50 4.47µs 5.79µs 5.43µs 6.14µs 11% ▄▂▁▃▇▇▆▇▇█ 8888 rotated_rects_aa_alternating_transparent_and_opaque_srcover
    10M 30 4.35µs 6.06µs 5.84µs 7.63µs 16% ▅▅▅▄▅▅▄█▁▁ 8888 rotated_rects_aa_changing_transparent_srcover
    10M 44 4.31µs 4.51µs 4.76µs 6.25µs 13% ▄▂▂▁█▃▁▃▁▁ 8888 rotated_rects_aa_changing_opaque_srcover
    10M 46 4.36µs 4.42µs 4.75µs 6.19µs 14% ▆█▃▁▁▁▁▁▁▁ 8888 rotated_rects_aa_same_transparent_srcover
    10M 47 4.29µs 4.35µs 4.44µs 5.15µs 6% ▃▂▂▁▁█▁▁▁▁ 8888 rotated_rects_aa_same_opaque_srcover
    10M 3 39.1µs 39.2µs 50.7µs 153µs 71% █▁▁▁▁▁▁▁▁▁ 8888 rectori
    10M 1 2.3ms 2.31ms 2.35ms 2.74ms 6% ▁▁▁▁▁▁▁▁█▂ 8888 maskcolor
    10M 1 2.33ms 2.34ms 2.53ms 3.14ms 11% ▁▁▁▁▁▁▅█▄▄ 8888 maskopaque
    10M 11 15µs 15.3µs 15.7µs 18.3µs 7% ▅▃▂▂▁▁▁▁█▁ 8888 rrects_3_stroke_4
    10M 46 3.99µs 4.07µs 4.14µs 4.54µs 4% █▅▅▃▂▂▁▁▁▁ 8888 rrects_3
    10M 16 15.6µs 15.9µs 16.1µs 17.5µs 4% █▄▃▂▂▂▁▂▁▁ 8888 ovals_3_stroke_4
    10M 40 5.09µs 5.18µs 5.23µs 5.67µs 3% █▅▃▂▂▁▃▁▁▁ 8888 ovals_3
    10M 231 1.92µs 1.93µs 1.94µs 2µs 1% █▃▂▁▃▁▁▁▁▁ 8888 zeroradroundrect
    10M 924 3.88µs 3.93µs 4.11µs 4.95µs 9% ▁█▆▃▁▁▁▁▁▁ 8888 arbroundrect
    10M 8 8.11µs 8.47µs 8.48µs 8.85µs 3% █▅▇▄▄▂▁▄▄▆ 8888 merge_large
    10M 14 6.71µs 6.92µs 6.96µs 7.46µs 3% ▃▆▁█▃▃▃▂▂▁ 8888 merge_small
    11M 2 225µs 227µs 229µs 233µs 1% ███▃▇▂▃▁▃▂ 8888 displacement_full_large
    16M 1 381µs 401µs 401µs 421µs 3% ▅▅▅█▆▄▄▃▃▁ 8888 displacement_alpha_large
    19M 1 507µs 508µs 509µs 512µs 0% █▃▂▆▂▂▃▂▃▁ 8888 displacement_zero_large
    19M 19 9µs 9.11µs 9.15µs 9.67µs 2% ▄▂▂▂█▂▁▁▁▂ 8888 displacement_full_small
    19M 5 54.2µs 54.5µs 54.9µs 58µs 2% █▃▂▂▁▁▃▁▁▁ 8888 blurroundrect_WH[100x100]_cr[90]
    20M 1 229µs 230µs 231µs 240µs 2% █▄▃▂▂▁▁▁▁▂ 8888 GM_varied_text_clipped_no_lcd
    20M 1 267µs 269µs 270µs 279µs 1% █▄▃▂▂▂▂▂▁▁ 8888 GM_varied_text_ignorable_clip_no_lcd
    22M 1 1.95ms 1.97ms 2.03ms 2.46ms 8% ▁▁▁▁▁▁▁▂█▃ 8888 GM_convex_poly_clip
  after:
    maxrss loops min median mean max stddev samples config bench
    10M 2 31.5µs 32.3µs 32.8µs 37.2µs 5% █▄▃▂▂▂▁▁▁▁ 8888 bitmap_BGRA_8888_A_scale_bicubic
    10M 13 43.9µs 44µs 44.1µs 44.9µs 1% █▂▁▁▁▆▁▁▁▂ 8888 bitmap_BGRA_8888_A_scale_bilerp
    10M 19 22.7µs 23.3µs 25.6µs 32.4µs 14% ▁▁▁▁▁▅▆▁▅█ 8888 bitmap_Index_8_A
    10M 5 1.79µs 1.97µs 3.85µs 21.1µs 158% █▁▁▁▁▁▁▁▁▁ 8888 text_16_AA_88
    10M 141 1.83µs 1.83µs 1.85µs 1.93µs 2% ▅▁▁█▁▁▁▁▁▁ 8888 text_16_AA_FF
    10M 10 4.65µs 4.92µs 5.06µs 6.56µs 11% █▃▃▂▂▂▁▁▁▁ 8888 rotated_rects_aa_alternating_transparent_and_opaque_src
    10M 51 4.35µs 4.48µs 4.83µs 6.68µs 17% ▂▁▁▁▁▁▁▂▆█ 8888 rotated_rects_aa_changing_opaque_src
    10M 51 4.38µs 4.79µs 4.85µs 5.84µs 11% ▁█▁▃▃▁▄▁▄▇ 8888 rotated_rects_aa_same_opaque_src
    10M 32 5.58µs 6.24µs 6.1µs 6.39µs 5% █▂█▆▁▇▄▅▇▇ 8888 rotated_rects_aa_alternating_transparent_and_opaque_srcover
    10M 42 4.28µs 5.59µs 5.11µs 6.01µs 15% ▂▂█▇█▂▁▆▁▇ 8888 rotated_rects_aa_changing_transparent_srcover
    10M 48 4.24µs 4.33µs 4.58µs 6.46µs 15% ▁▁▁▁▁█▃▂▁▁ 8888 rotated_rects_aa_changing_opaque_srcover
    10M 48 4.28µs 4.3µs 4.4µs 5.12µs 6% ▂▂▁▁▁▁▁▁▁█ 8888 rotated_rects_aa_same_transparent_srcover
    10M 46 4.24µs 4.29µs 4.66µs 7.11µs 20% ▁▁▁▁▁▁▁▁▃█ 8888 rotated_rects_aa_same_opaque_srcover
    10M 3 39.3µs 39.4µs 51.4µs 154µs 70% █▁▁▁▁▁▁▁▁▁ 8888 rectori
    10M 1 2.32ms 2.43ms 2.53ms 3.14ms 11% ▁▁▁▁▂▄█▃▅▁ 8888 maskcolor
    10M 1 2.33ms 2.37ms 2.54ms 3.21ms 12% ▁▁▁▁▁▂█▅▆▁ 8888 maskopaque
    10M 10 15.3µs 15.6µs 15.8µs 17.2µs 4% █▅▃▂▂▂▁▁▁▁ 8888 rrects_3_stroke_4
    10M 46 4.03µs 4.09µs 4.15µs 4.47µs 4% █▄▆▂▂▂▁▁▁▁ 8888 rrects_3
    10M 15 15.9µs 16.2µs 16.3µs 17.8µs 4% █▄▃▂▂▂▁▁▁▁ 8888 ovals_3_stroke_4
    10M 40 5.14µs 5.26µs 5.29µs 5.72µs 3% █▅▃▂▂▁▂▂▁▁ 8888 ovals_3
    10M 222 1.91µs 1.99µs 2.21µs 2.91µs 19% ▂▁▁▁▁▁▂▇▇█ 8888 zeroradroundrect
    10M 462 3.9µs 3.96µs 4.23µs 5.22µs 12% ▆▄█▁▂▁▁▁▁▁ 8888 arbroundrect
    10M 8 8.2µs 8.59µs 8.62µs 8.97µs 3% ▆▄█▄▅▃▁▆▄█ 8888 merge_large
    10M 14 6.73µs 6.88µs 6.86µs 7.08µs 2% ▄█▁▂▄▂▅▄▂▅ 8888 merge_small
    11M 2 221µs 234µs 237µs 263µs 5% ▄▃▃▃▄▃▂▁▇█ 8888 displacement_full_large
    16M 1 387µs 416µs 427µs 471µs 7% ▇█▁▃▃▁▃▃▇▆ 8888 displacement_alpha_large
    19M 1 512µs 521µs 528µs 594µs 5% █▂▂▂▁▁▂▃▁▁ 8888 displacement_zero_large
    19M 18 9.06µs 9.12µs 9.13µs 9.23µs 1% █▃▃▃▄▃▆▁▅▅ 8888 displacement_full_small
    19M 5 55.6µs 55.9µs 56.5µs 59.5µs 2% █▃▂▁▁▁▁▁▅▁ 8888 blurroundrect_WH[100x100]_cr[90]
    20M 1 229µs 233µs 235µs 254µs 3% █▄▃▂▂▁▁▂▁▁ 8888 GM_varied_text_clipped_no_lcd
    20M 1 270µs 271µs 272µs 278µs 1% █▄▃▂▂▂▁▂▁▇ 8888 GM_varied_text_ignorable_clip_no_lcd
    22M 1 1.96ms 2ms 2.06ms 2.45ms 7% ▂▂▁▁▁▁▁▃█▄ 8888 GM_convex_poly_clip
  BUG=skia:
  Review URL: https://codereview.chromium.org/754733002
* Cleanup with SkAlphaMulQ_SSE2() (qiankun.miao, 2014-11-25)
  Related nanobench results:
  before:
    10M 18 7.03µs 7.31µs 7.38µs 8.46µs 6% ▂▁▂▂▂▃▄▁█▁ 8888 bitmaprect_80_filter_identity
    10M 43 6.96µs 6.97µs 6.99µs 7.19µs 1% ▁▂▁▁▁▁▁█▁▁ 8888 bitmaprect_80_nofilter_identity
    10M 14 35.7µs 35.8µs 35.9µs 36.3µs 1% ▃▂▁▂▁█▂▁▁▁ 8888 bitmap_BGRA_8888_update_scale_bilerp
    10M 16 35.5µs 35.6µs 35.7µs 36.3µs 1% █▅▂▁▁▁▃▂▁▁ 8888 bitmap_BGRA_8888_update_volatile_scale_bilerp
    10M 16 35.4µs 35.4µs 35.5µs 36.8µs 1% ▂▁█▁▁▁▁▂▁▁ 8888 bitmap_BGRA_8888_scale_bilerp
    10M 25 16.4µs 16.6µs 16.7µs 17.4µs 2% ▂▁▁▂▁▁▁▅▅█ 8888 bitmap_Index_8
    10M 15 37.9µs 38µs 38µs 38.4µs 0% ▄▆▂▁▁▁█▂▁▁ 8888 bitmap_RGB_565
    10M 33 11.1µs 11.1µs 11.1µs 11.2µs 0% ▆▂█▂▂▂▁▁▂▁ 8888 bitmap_BGRA_8888_scale
  after:
    10M 9 7.04µs 7.06µs 7.1µs 7.32µs 1% █▅▂▁▁▂▁▁▁▁ 8888 bitmaprect_80_filter_identity
    10M 18 7.01µs 7.02µs 7.05µs 7.25µs 1% █▂▁▁▁▁▁▁▁▁ 8888 bitmaprect_80_nofilter_identity
    10M 5 33.9µs 34µs 34.1µs 34.5µs 1% █▃▂▂▁▁▁▅▃▂ 8888 bitmap_BGRA_8888_update_scale_bilerp
    10M 7 35.5µs 35.5µs 35.6µs 36.3µs 1% ▃▂▂▁▂▁▂▁█▂ 8888 bitmap_BGRA_8888_update_volatile_scale_bilerp
    10M 7 35.5µs 35.5µs 35.7µs 36.8µs 1% ▂▁▁▁▁▁▁▁▁█ 8888 bitmap_BGRA_8888_scale_bilerp
    10M 11 16.4µs 16.4µs 16.4µs 16.6µs 0% █▂▁▁▂▁▁▁▂▁ 8888 bitmap_Index_8
    10M 7 37.3µs 37.4µs 38.4µs 47.8µs 9% ▁▁▁▁▁▁▁▁▁█ 8888 bitmap_RGB_565
    10M 33 11µs 11µs 11.1µs 11.2µs 1% ▄█▅▃▂▁▁▁▁▁ 8888 bitmap_BGRA_8888_scale
  BUG=skia:
  Review URL: https://codereview.chromium.org/755573002
* Cleanup of S32_D565_Opaque_SSE2() (qiankun.miao, 2014-11-24)
  BUG=skia:
  Review URL: https://codereview.chromium.org/725693003
* Optimize SkAlphaMulQ_SSE2 (qiankun.miao, 2014-11-14)
  These two mask clears are unnecessary, because _mm_srli_epi16 fills the high byte of each word with 0.
  BUG=skia:
  Review URL: https://codereview.chromium.org/724333003
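The reasoning in this commit message can be checked with a scalar model of a single 16-bit lane. The sketch below is not the Skia code: the function names are hypothetical, and it assumes the shift count is 8 and the scale is in the range 0..256, as is usual for this kind of alpha multiply. It compares "shift then mask" against "shift only" for every input.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 16-bit SIMD lane: a logical right shift by 8,
// which is what _mm_srli_epi16(x, 8) performs on each lane.
inline uint16_t srl16_by_8(uint16_t lane) {
    return static_cast<uint16_t>(lane >> 8);
}

// Hypothetical "before" form: shift, then clear the high byte with a mask.
inline uint16_t scale_lane_with_mask(uint8_t channel, unsigned scale) {
    assert(scale <= 256);  // assumed scale range
    uint16_t product = static_cast<uint16_t>(channel * scale);  // at most 255 * 256
    return static_cast<uint16_t>(srl16_by_8(product) & 0x00FF);
}

// Hypothetical "after" form: the shift alone already zeroes the high byte.
inline uint16_t scale_lane_without_mask(uint8_t channel, unsigned scale) {
    assert(scale <= 256);  // assumed scale range
    uint16_t product = static_cast<uint16_t>(channel * scale);
    return srl16_by_8(product);
}

// Exhaustively compares both forms over all channel values and scales.
inline bool mask_is_redundant() {
    for (unsigned c = 0; c < 256; ++c) {
        for (unsigned s = 0; s <= 256; ++s) {
            uint8_t channel = static_cast<uint8_t>(c);
            if (scale_lane_with_mask(channel, s) != scale_lane_without_mask(channel, s)) {
                return false;
            }
        }
    }
    return true;
}
```

Because a logical shift by 8 leaves at most 8 significant bits in a 16-bit lane, the extra AND cannot change the result, which is the observation the commit relies on.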
* Fix race in supports_simd(). (mtklein, 2014-10-13)
  Local statics are not thread safe in Chrome. Use an SkLazyPtr instead.
  See https://code.google.com/p/chromium/issues/detail?id=418041
  BUG=418041
  Review URL: https://codereview.chromium.org/655573002
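As an illustration of caching a feature probe without relying on thread-safe function-local statics, here is a minimal sketch using std::atomic. It is not SkLazyPtr and does not reproduce its interface; probe_simd(), supports_simd_cached() and the counters are hypothetical names, and the probe result is a placeholder.

```cpp
#include <atomic>

enum : int { kUnknown = 0, kNo = 1, kYes = 2 };

// Namespace-scope atomics are constant-initialized, so no function-local
// static guard is involved.
static std::atomic<int> g_simd_state{kUnknown};
static std::atomic<int> g_probe_calls{0};  // only here to make the sketch observable

// Hypothetical stand-in for the real CPU feature probe (e.g. a CPUID query).
static bool probe_simd() {
    g_probe_calls.fetch_add(1, std::memory_order_relaxed);
    return true;  // placeholder result
}

// Returns the cached answer. If two threads race on the first call, both may
// run the probe, but they store the same value, so the result stays consistent.
bool supports_simd_cached() {
    int state = g_simd_state.load(std::memory_order_acquire);
    if (state == kUnknown) {
        state = probe_simd() ? kYes : kNo;
        g_simd_state.store(state, std::memory_order_release);
    }
    return state == kYes;
}
```

The design choice here is to tolerate a duplicated probe instead of blocking: the probe is assumed to be idempotent and cheap, so a benign double computation is simpler than a lock.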
* Improve SkARGB32_A8_BlitMask_SSE2 (jmuizelaar, 2014-10-09)
  With clang this:
  - movzbl -3(%rbx), %edx
  - pxor %xmm5, %xmm5
  - pinsrw $0, %edx, %xmm5
  - pinsrw $1, %edx, %xmm5
  - movzbl -2(%rbx), %edx
  - pinsrw $2, %edx, %xmm5
  - pinsrw $3, %edx, %xmm5
  - movzbl -1(%rbx), %edx
  - pinsrw $4, %edx, %xmm5
  - pinsrw $5, %edx, %xmm5
  - movzbl (%rbx), %edx
  - pinsrw $6, %edx, %xmm5
  - pinsrw $7, %edx, %xmm5
  becomes:
  + movd (%rbx), %xmm4
  + punpcklbw %xmm9, %xmm4
  + punpcklwd %xmm4, %xmm4
  And clang already does better codegen than msvc 2013 on this.
  BUG=skia:
  Review URL: https://codereview.chromium.org/609823003
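The three-instruction sequence in this message widens four mask bytes into eight 16-bit lanes, each byte duplicated. A rough intrinsics sketch of that idea follows. It is not the Skia source: the function names are hypothetical, and it assumes the second operand of the punpcklbw step is a zero register. A scalar fallback is included so the sketch also builds without SSE2.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Scalar reference: 4 mask bytes a0..a3 become lanes a0,a0,a1,a1,a2,a2,a3,a3.
inline void expand_mask_scalar(const uint8_t mask[4], uint16_t out[8]) {
    for (int i = 0; i < 4; ++i) {
        out[2 * i] = mask[i];
        out[2 * i + 1] = mask[i];
    }
}

#if defined(__SSE2__)
// SSE2 version mirroring the movd / punpcklbw / punpcklwd sequence.
inline void expand_mask_sse2(const uint8_t mask[4], uint16_t out[8]) {
    int32_t packed;
    std::memcpy(&packed, mask, sizeof(packed));      // unaligned-safe 32-bit load
    __m128i v = _mm_cvtsi32_si128(packed);           // movd
    v = _mm_unpacklo_epi8(v, _mm_setzero_si128());   // punpcklbw: bytes to words
    v = _mm_unpacklo_epi16(v, v);                    // punpcklwd: duplicate each word
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
}
#endif

// Picks the SSE2 path when the compiler targets it, else the scalar path.
inline void expand_mask(const uint8_t mask[4], uint16_t out[8]) {
#if defined(__SSE2__)
    expand_mask_sse2(mask, out);
#else
    expand_mask_scalar(mask, out);
#endif
}
```

Loading all four bytes at once and unpacking them avoids the per-byte scalar loads and lane inserts shown in the "before" listing.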
* Enable highQualityFilter_SSE2 (qiankun.miao, 2014-09-04)
  With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40% performance improvement on desktop i7-3770.
  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51
  CQ_EXTRA_TRYBOTS=tryserver.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/525283002
* Revert of Enable highQualityFilter_SSE2 (patchset #1 id:1 of https://codereview.chromium.org/525283002/) (mtklein, 2014-09-03)
  Reason for revert: Color order looks wrong on Macs:
  Before: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/12823183142873462143.png
  After: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/13683040204546320578.png
  Original issue's description:
  > Enable highQualityFilter_SSE2
  >
  > With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40%
  > performance improvement on desktop i7-3770.
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51
  R=humper@google.com, qiankun.miao@intel.com
  TBR=humper@google.com, qiankun.miao@intel.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Author: mtklein@google.com
  Review URL: https://codereview.chromium.org/539523002
* Enable highQualityFilter_SSE2 (qiankun.miao, 2014-09-03)
  With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40% performance improvement on desktop i7-3770.
  BUG=skia:
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/525283002
* Disable SSE4 S32A_Opaque blit. (mtklein, 2014-09-03)
  This code sometimes generates a build warning that bothers Chrome.
  BUG=399842,skia:2906
  R=reed@google.com, mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/538463003
* Remove dead code in SkBitmapFilter_opts_SSE2.h/cpp (qiankun.miao, 2014-09-03)
  BUG=skia:
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/530673002
* Disable NEON procs for box blur as it produces invalid results (djsollen, 2014-09-02)
  R=reed@google.com, mtklein@google.com, senorblanco@google.com
  TBR=reed@google.com
  BUG=skia:2845
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/527973002
* Revert of Disable NEON procs for box blur as it produces invalid results (patchset #1 id:1 of https://codereview.chromium.org/520963002/) (djsollen, 2014-09-02)
  Reason for revert: failing more GMs than expected.
  Original issue's description:
  > Disable NEON procs for box blur as it produces invalid results
  >
  > BUG=skia:2845
  >
  > Committed: https://skia.googlesource.com/skia/+/4a1764688c990fb926aaeab538497dad52768d99
  R=senorblanco@google.com, mtklein@google.com
  TBR=mtklein@google.com, senorblanco@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2845
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/531023002
* Disable NEON procs for box blur as it produces invalid results (djsollen, 2014-09-02)
  BUG=skia:2845
  R=senorblanco@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/520963002
* Disable Neon optimization of bad S32A/D565 blend. (mtklein, 2014-08-22)
  BUG=skia:2797
  Committed: https://skia.googlesource.com/skia/+/84cab93186fbe3e87d931fea73cb31b70ff5017b
  R=mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/497823002
* Disable Neon optimization of bad S32A/D565 blend. (mtklein, 2014-08-22)
  BUG=skia:2797
  R=mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/497823002
* disable neon proc that is triggering asserts (reed, 2014-08-22)
  BUG=skia:2845
  R=mtklein@google.com
  Author: reed@google.com
  Review URL: https://codereview.chromium.org/498733002
* Simplify flattening to just write enough to call the factory/public-constructor for the class. (reed, 2014-08-21)
  We want to *not* rely on private constructors, and not rely on calling through the inheritance hierarchy for either flattening or unflattening (CreateProc).
  Refactoring pattern:
  1. Guard the existing constructor(readbuffer) with the legacy build-flag.
  2. If you are an instantiable subclass, implement CreateProc(readbuffer) to create a new instance from the buffer params (or return NULL).
  If you're a shader subclass:
  1. You must read/write the local matrix if your class accepts that in its factory/constructor, else ignore it.
  R=robertphillips@google.com, mtklein@google.com, senorblanco@google.com, senorblanco@chromium.org, sugoi@chromium.org
  Author: reed@google.com
  Review URL: https://codereview.chromium.org/395603002
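The refactoring pattern described above can be sketched roughly as follows. This is an illustrative toy, not Skia's SkFlattenable API: WriteBuffer, ReadBuffer, OffsetEffect and their members are hypothetical stand-ins. The point is only that flatten() writes exactly the factory arguments, and CreateProc() reads them back and calls the public factory instead of a private buffer constructor.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical minimal write buffer.
struct WriteBuffer {
    std::vector<int32_t> data;
    void writeInt(int32_t v) { data.push_back(v); }
};

// Hypothetical minimal read buffer; readInt() fails when the data runs out.
class ReadBuffer {
public:
    explicit ReadBuffer(const std::vector<int32_t>& data) : data_(data), pos_(0) {}
    bool readInt(int32_t* out) {
        if (pos_ >= data_.size()) return false;
        *out = data_[pos_++];
        return true;
    }
private:
    const std::vector<int32_t>& data_;
    std::size_t pos_;
};

// Hypothetical effect class with a public factory.
class OffsetEffect {
public:
    static std::unique_ptr<OffsetEffect> Make(int32_t dx, int32_t dy) {
        return std::unique_ptr<OffsetEffect>(new OffsetEffect(dx, dy));
    }
    // Write just enough to call Make() again later.
    void flatten(WriteBuffer& buf) const {
        buf.writeInt(dx_);
        buf.writeInt(dy_);
    }
    // Read the factory arguments and go through Make(); null on bad input.
    static std::unique_ptr<OffsetEffect> CreateProc(ReadBuffer& buf) {
        int32_t dx = 0, dy = 0;
        if (!buf.readInt(&dx) || !buf.readInt(&dy)) return nullptr;
        return Make(dx, dy);
    }
    int32_t dx() const { return dx_; }
    int32_t dy() const { return dy_; }
private:
    OffsetEffect(int32_t dx, int32_t dy) : dx_(dx), dy_(dy) {}
    int32_t dx_;
    int32_t dy_;
};
```

Routing unflattening through the factory means any validation the factory performs also applies to deserialized objects.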
* Turn off NEON SkBoxBlurGetPlatformProcs for ARM64 (for now) (halcanary, 2014-08-20)
  BUG=skia:2845
  R=djsollen@google.com, senorblanco@google.com, senorblanco@chromium.org
  Author: halcanary@google.com
  Review URL: https://codereview.chromium.org/491973002
* Let skia build with clang's integrated assembler. (thakis, 2014-08-11)
  1. vuzpq is a gcc instruction. Replace it with the equivalent vuzp (see http://llvm.org/PR20423)
  2. .func / .endfunc only have an effect with -gstabs, which we don't use. As it's unused and clang doesn't support it, remove .func / .endfunc (also see http://llvm.org/20424)
  BUG=chromium:124610
  R=mtklein@google.com
  Author: thakis@chromium.org
  Review URL: https://codereview.chromium.org/461693004
* Replace a pre-UAL instruction with its modern form. (thakis, 2014-08-11)
  See the notes in the Chromium bug, and http://llvm.org/20427
  BUG=chromium:124610,skia:900
  R=djsollen@google.com, mtklein@google.com
  Author: thakis@chromium.org
  Review URL: https://codereview.chromium.org/455903002
* Fix S32A_D565_Opaque for RGBA on arm64 (kevin.petit, 2014-08-09)
  Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
  BUG=skia:2813
  R=halcanary@google.com, djsollen@google.com, mtklein@google.com
  Author: kevin.petit@arm.com
  Review URL: https://codereview.chromium.org/458453002
* Disable suspect NEON function for 64-bit Android (djsollen, 2014-08-07)
  R=halcanary@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/451633006
* Add query for block dimensions of a given format (krajcevski, 2014-07-29)
  R=robertphillips@google.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/422023006
* Enable the SSSE3 compile time check on all platforms (4th attempt) (djsollen, 2014-07-24)
  BUG=skia:2746
  R=bungeman@google.com, robertphillips@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/414033002
* Revert of Enable the SSSE3 compile time check on all platforms. (https://codereview.chromium.org/403583002/) (bungeman, 2014-07-23)
  Reason for revert: This is blocking the roll. Chromium Windows trybots (like win_chromium_x64_rel) are crashing in the SSSE3 code (for example SkCanvasVideoRenderTest.CroppedFrame).
  Original issue's description:
  > Enable the SSSE3 compile time check on all platforms (3rd attempt)
  >
  > BUG=skia:2746
  >
  > Committed: https://skia.googlesource.com/skia/+/933834851f9d48fbd85b728cc92e1f0134bfaa4e
  R=halcanary@google.com, mtklein@google.com, djsollen@google.com
  TBR=djsollen@google.com, halcanary@google.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2746
  Author: bungeman@google.com
  Review URL: https://codereview.chromium.org/418523002
* Enable the SSSE3 compile time check on all platforms (3rd attempt) (djsollen, 2014-07-22)
  BUG=skia:2746
  R=halcanary@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/403583002
* Add support for NEON intrinsics to speed up texture compression. We can (krajcevski, 2014-07-14)
  now convert the time that we would have spent uploading the texture to compressing it giving a net 50% memory savings for these things.
  Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/390453002
* Revert of Enable the SSSE3 compile time check on all platforms. (https://codereview.chromium.org/391693004/) (halcanary, 2014-07-14)
  Reason for revert: windows fail
  Original issue's description:
  > Enable the SSSE3 compile time check on all platforms.
  >
  > BUG=skia:2746
  >
  > Committed: https://skia.googlesource.com/skia/+/ee349531446ae2a8336b0903e05d0b2150d2131f
  R=mtklein@google.com, djsollen@google.com
  TBR=djsollen@google.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2746
  Author: halcanary@google.com
  Review URL: https://codereview.chromium.org/390063002
* Enable the SSSE3 compile time check on all platforms. (djsollen, 2014-07-14)
  BUG=skia:2746
  R=mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/391693004
* MIPS: added optimization for SkRGB16_Opaque_Blitter::blitMask (djordje.pesut, 2014-07-14)
  gain is ~30%
  R=djsollen@google.com
  Author: djordje.pesut@imgtec.com
  Review URL: https://codereview.chromium.org/357693002
* Revert of Add support for NEON intrinsics to speed up texture compression. We can (https://codereview.chromium.org/390453002/) (krajcevski, 2014-07-11)
  Reason for revert: Breaking chrome.
  Original issue's description:
  > Add support for NEON intrinsics to speed up texture compression. We can
  > now convert the time that we would have spent uploading the texture to
  > compressing it giving a net 50% memory savings for these things.
  >
  > Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  TBR=kevin.petit@arm.com, mtklein@google.com, robertphillips@google.com
  NOTREECHECKS=true
  NOTRY=true
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/384053003
* Add support for NEON intrinsics to speed up texture compression. We can (krajcevski, 2014-07-11)
  now convert the time that we would have spent uploading the texture to compressing it giving a net 50% memory savings for these things.
  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/390453002
* MIPS: added optimizations for functions from SkBitmapProcState (djordje.pesut, 2014-07-08)
  gain is ~30%
  following functions are optimized:
    SI8_D16_nofilter_DX
    SI8_opaque_D32_nofilter_DX
  R=djsollen@google.com, teodora.petrovic@gmail.com
  Author: djordje.pesut@imgtec.com
  Review URL: https://codereview.chromium.org/336533003
* Add return to SkBoxBlurGetPlatformProcs_SSE4. (scroggo, 2014-07-07)
  This fixes Android build.
  R=reed@google.com, mtklein@google.com
  Author: scroggo@google.com
  Review URL: https://codereview.chromium.org/378613002
* Add SSE4 version of BlurImage optimizations. (henrik.smiding, 2014-07-07)
  Adds an SSE4.1 version of the existing BlurImage optimizations. Performance of blur_image_filter_* benchmarks show a 10-50% improvement on Linux/Ubuntu Core i7.
  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
  R=mtklein@google.com, reed@chromium.org
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/366593004
* Revert of Add SSE4 version of BlurImage optimizations. (https://codereview.chromium.org/366593004/) (reed, 2014-07-06)
  Reason for revert: breaks linker on chrome
  [04:36:09.966000] [503/5965] LIB obj\chrome\installer_util.lib
  [04:36:10.466000] FAILED: C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True skia.dll "C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:skia.dll.lib /DLL /OUT:skia.dll @skia.dll.rsp" 2 mt.exe rc.exe "obj\skia\skia.skia.dll.intermediate.manifest" obj\skia\skia.skia.dll.generated.manifest
  [04:36:10.466000] skia.opts_check_x86.obj : error LNK2019: unresolved external symbol "bool __cdecl SkBoxBlurGetPlatformProcs_SSE4(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs_SSE4@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z) referenced in function "bool __cdecl SkBoxBlurGetPlatformProcs(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z)
  [04:36:10.466000]
  [04:36:10.466000] skia.dll : fatal error LNK1120: 1 unresolved externals
  Original issue's description:
  > Add SSE4 version of BlurImage optimizations.
  >
  > Adds an SSE4.1 version of the existing BlurImage optimizations.
  > Performance of blur_image_filter_* benchmarks show a 10-50%
  > improvement on Linux/Ubuntu Core i7.
  >
  > Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  >
  > Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
  R=mtklein@google.com, henrik.smiding@intel.com
  TBR=henrik.smiding@intel.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  Author: reed@chromium.org
  Review URL: https://codereview.chromium.org/375503003
* Add SSE4 version of BlurImage optimizations. (henrik.smiding, 2014-07-04)
  Adds an SSE4.1 version of the existing BlurImage optimizations. Performance of blur_image_filter_* benchmarks show a 10-50% improvement on Linux/Ubuntu Core i7.
  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  R=mtklein@google.com
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/366593004
* Exclude Clang on Windows too. Comment this up a bit. (mtklein, 2014-07-02)
  BUG=391016
  R=tomhudson@chromium.org, mtklein@google.com, rnk@chromium.org, thakis@chromium.org
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/363983004
* Disable assembly code in MemorySanitizer builds. (Mike Klein, 2014-07-02)
  MemorySanitizer is an uninitialized memory use detector which is used in Chromium, and does not presently support assembly code.
  BUG=chromium:344505, chromium:373739
  R=mtklein@google.com
  Review URL: https://codereview.chromium.org/367973005
* Hide symbols in S32A_Opaque_BlitRow32_SSE4 (henrik.smiding, 2014-07-01)
  Marks the symbols in the S32A_Opaque_BlitRow32_SSE4 files as hidden, so Chromium can build. Also enables the optimizations.
  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  R=mtklein@google.com, joakim.landberg@intel.com
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/368573002
* Revert of Re-enable SSE4. (https://codereview.chromium.org/357593003/) (mtklein, 2014-06-30)
  Reason for revert: Mac Chrome builders still failing.
  Original issue's description:
  > Re-enable SSE4.
  >
  > I will roll this into Chrome with https://codereview.chromium.org/332393003.
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/a75b0fadbdec4214afec6dd727fd224d34ed164f
  R=reed@google.com, mtklein@chromium.org
  TBR=mtklein@chromium.org, reed@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Author: mtklein@google.com
  Review URL: https://codereview.chromium.org/337093004
* Re-enable SSE4. (mtklein, 2014-06-30)
  I will roll this into Chrome with https://codereview.chromium.org/332393003.
  BUG=skia:
  R=reed@google.com, mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/357593003
* ARM Skia NEON patches - 41 - arm64: SkXfermode::xfer32 (kevin.petit, 2014-06-30)
  Currently the NEON code for Xfermodes performs well on arm64 targets except for dstout and dstin which are significantly slower than the C code. This patch fixes this and gives further improvements on other modes.
  Here are some perf results:
  +------------+------------+------------+
  | mode       | Cortex-A53 | Cortex-A57 |
  +------------+------------+------------+
  | multiply   | +24.58%    | +23.71%    |
  | exclusion  | +22.72%    | +22.05%    |
  | difference | +34.67%    | +36.82%    |
  | hardlight  | +17.07%    | +14.74%    |
  | lighten    | +38.21%    | +32.87%    |
  | darken     | +37.59%    | +32.99%    |
  | overlay    | +17.36%    | +16.88%    |
  | screen     | +52.56%    | +54.43%    |
  | modulate   | +62.85%    | +61.32%    |
  | plus       | +91.52%    | +117.41%   |
  | xor        | +42.86%    | +43.38%    |
  | dstatop    | +48.46%    | +48.99%    |
  | srcatop    | +50.50%    | +48.51%    |
  | dstout     | +67.83%    | +78.09%    |
  | srcout     | +69.02%    | +78.26%    |
  | dstin      | +70.92%    | +79.24%    |
  | srcin      | +68.90%    | +78.23%    |
  | dstover    | +73.80%    | +68.10%    |
  +------------+------------+------------+
  Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
  BUG=skia
  R=mtklein@google.com, djsollen@google.com
  Author: kevin.petit@arm.com
  Review URL: https://codereview.chromium.org/350343002
* Disable SSE4 code. (mtklein, 2014-06-27)
  Chrome canary failing to link chrome: http://108.170.220.120:10115/builders/Canary-Chrome-Ubuntu13.10-Ninja-x86_64-ToT/builds/1009/steps/BuildChrome/logs/stdio
  BUG=skia:
  NOTRY=true
  R=mtklein@google.com, rmistry@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/361493002
* Refactor bitmap scaler to make it easier to migrate rest of chrome to use it (humper, 2014-06-27)
  Previously, the set of platform-specific function pointers to do fast convolution (e.g., NEON, SSE) was passed in a structure to the scaler. I refactored this so that the scaler fills in these function pointers after it's called, so the caller doesn't have to worry about it.
  R=mtklein@google.com
  TBR=mtklein
  NOTRY=True
  Author: humper@google.com
  Review URL: https://codereview.chromium.org/354193002
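The shape of this refactor, where the scaler selects its own platform procs instead of receiving them from the caller, might look roughly like the sketch below. All names (ConvolveProcs, platform_convolve_procs, RowScaler, halve_row_portable) are hypothetical and the "convolution" is reduced to a trivial 2:1 row averaging; none of this is the actual Skia interface.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical table of function pointers for the inner loops.
struct ConvolveProcs {
    void (*halve_row)(const uint8_t* src, int dst_count, uint8_t* dst) = nullptr;
};

// Portable fallback: average each pair of source samples, rounding up on ties.
static void halve_row_portable(const uint8_t* src, int dst_count, uint8_t* dst) {
    for (int i = 0; i < dst_count; ++i) {
        dst[i] = static_cast<uint8_t>((src[2 * i] + src[2 * i + 1] + 1) / 2);
    }
}

// Hypothetical platform hook. A real build would overwrite entries with
// NEON or SSE variants when the CPU supports them; this sketch installs none.
static void platform_convolve_procs(ConvolveProcs* procs) {
    (void)procs;
}

// The scaler fills in its own procs, so callers never see ConvolveProcs.
class RowScaler {
public:
    RowScaler() {
        procs_.halve_row = halve_row_portable;
        platform_convolve_procs(&procs_);
    }
    std::vector<uint8_t> halve(const std::vector<uint8_t>& src) const {
        std::vector<uint8_t> dst(src.size() / 2);  // a trailing odd sample is dropped
        procs_.halve_row(src.data(), static_cast<int>(dst.size()), dst.data());
        return dst;
    }
private:
    ConvolveProcs procs_;
};
```

With this arrangement a caller only constructs the scaler and calls it, which is what makes it easier for other code to adopt.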
* Add SSE4 optimization of S32A_Opaque_Blitrow (henrik.smiding, 2014-06-27)
  Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD instruction set. Special case for when alpha is zero or opaque. Performance increase of 10%-400% compared to the existing SSE2 optimization (measured on Silvermont architecture). Noticeable in ~25 different skia bench subtests, especially in bitmap_8888_*, repeatTile_*, and morph_*.
    bitmap_8888_A - 100% faster
    bitmap_8888_A_source_transparent - 250% faster
    bitmap_8888_A_source_opaque - 25% faster
    bitmap_8888_A_scale_bicubic - 75% faster
  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
  Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
  R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/289473009
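The "special case for when alpha is zero or opaque" can be illustrated with a scalar source-over blend on premultiplied 32-bit pixels. This is a portable sketch, not the SSE4 code from the commit: the function names are hypothetical, and it assumes premultiplied input with alpha in the top byte, so a pixel with alpha 0 is fully zero.

```cpp
#include <cstdint>

// Multiplies all four 8-bit channels of c by scale/256 (scale in 0..256).
inline uint32_t alpha_mul_q(uint32_t c, unsigned scale) {
    const uint32_t mask = 0x00FF00FF;
    uint32_t rb = ((c & mask) * scale) >> 8;   // channels in bytes 0 and 2
    uint32_t ag = ((c >> 8) & mask) * scale;   // channels in bytes 1 and 3
    return (rb & mask) | (ag & ~mask);
}

// General source-over for premultiplied pixels.
inline uint32_t src_over(uint32_t src, uint32_t dst) {
    return src + alpha_mul_q(dst, 256 - (src >> 24));
}

// Row blit with the two fast paths: skip transparent, copy opaque.
inline void blit_row_srcover(uint32_t* dst, const uint32_t* src, int count) {
    for (int i = 0; i < count; ++i) {
        uint32_t alpha = src[i] >> 24;
        if (alpha == 0) {
            continue;                 // transparent source: dst is unchanged
        }
        if (alpha == 255) {
            dst[i] = src[i];          // opaque source: plain copy
            continue;
        }
        dst[i] = src_over(src[i], dst[i]);
    }
}
```

Under the premultiplied assumption both fast paths give the same result as the general formula, so they only save work; rows that are mostly transparent or mostly opaque would be expected to benefit most, which is consistent with the benchmark names listed in the commit message.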
* ARM Skia NEON patches - 40 - arm64: S32A_D565_Opaque (kevin.petit, 2014-06-26)
  Here are some perf results:
  +-------+------------+------------+
  | count | Cortex-A53 | Cortex-A57 |
  +-------+------------+------------+
  | 1     | -2.54%     | -5.39%     |
  | 2     | -0.66%     | -2.08%     |
  | 4     | -11.13%    | 0.00%      |
  | 8     | -5.79%     | -1.30%     |
  | 16    | 71.60%     | 93.27%     |
  | 64    | 30.99%     | 57.35%     |
  | 256   | 25.41%     | 52.59%     |
  | 1024  | 25.56%     | 53.76%     |
  +-------+------------+------------+
  Signed-off-by: Kévin PETIT <kevin.petit@arm.com>
  BUG=skia:
  R=mtklein@google.com, djsollen@google.com
  Author: kevin.petit@arm.com
  Review URL: https://codereview.chromium.org/346843003
* Fix SkBlitRow_opts_arm so that it works on ARM v4t. (george, 2014-06-20)
  Original Mozilla bug: https://bugzilla.mozilla.org/show_bug.cgi?id=901208
  R=reed@google.com, mtklein@google.com, reed1
  BUG=skia:
  Author: george@mozilla.com
  Review URL: https://codereview.chromium.org/337853003