path: root/src/opts
Commit message (author, date)
* Even more win64 warning fixes (bsalomon, 2014-12-12)
  Review URL: https://codereview.chromium.org/800993002
* Add SSSE3 acceleration for S32_D16_filter_DX (qiankun.miao, 2014-12-10)
  With this CL, related nanobench can be improved for 565 config.

  bitmap_BGRA_8888_update_scale_bilerp           76.1us -> 46.7us  0.61x
  bitmap_BGRA_8888_scale_bilerp                  78.7us -> 47us    0.6x
  bitmap_BGRA_8888_update_volatile_scale_bilerp  82.7us -> 46.9us  0.57x

  BUG=skia:
  Review URL: https://codereview.chromium.org/788853002
* Avoid crash on some 64b ARM NEON platforms. (tomhudson, 2014-12-09)
  The compiler may choose to use x30 for a local loop counter; ensure it's saved.

  Patch from kevin.petit@arm.com, verified by benm@google.com.

  R=djsollen@google.com
  Review URL: https://codereview.chromium.org/786273003
* Add SSSE3 acceleration for S32_D16_filter_DXDY (qiankun.miao, 2014-12-04)
  With this CL, related nanobench can be improved for 565 config.

  bitmap_BGRA_8888_scale_rotate_bilerp                  115us -> 70.5us  0.61x
  bitmap_BGRA_8888_update_volatile_scale_rotate_bilerp  115us -> 70.5us  0.61x
  bitmap_BGRA_8888_update_scale_rotate_bilerp           112us -> 68us    0.6x

  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/45a05780867a06b9f8a8d5240cf6c5d5a2c15a35
  Review URL: https://codereview.chromium.org/773753002
* Revert of Add SSSE3 acceleration for S32_D16_filter_DXDY (patchset #3 id:40001 of https://codereview.chromium.org/773753002/) (jam, 2014-12-03)
  Reason for revert:
  breaks build when not using SSE3, since the two method definitions differ in parameter types (typo)

  Original issue's description:
  > Add SSSE3 acceleration for S32_D16_filter_DXDY
  >
  > With this CL, related nanobench can be improved for 565 config.
  > bitmap_BGRA_8888_scale_rotate_bilerp 115us -> 70.5us 0.61x
  > bitmap_BGRA_8888_update_volatile_scale_rotate_bilerp 115us -> 70.5us 0.61x
  > bitmap_BGRA_8888_update_scale_rotate_bilerp 112us -> 68us 0.6x
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/45a05780867a06b9f8a8d5240cf6c5d5a2c15a35

  TBR=mtklein@google.com,qkmiao@gmail.com,qiankun.miao@intel.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/761103003
* Add SSSE3 acceleration for S32_D16_filter_DXDY (qiankun.miao, 2014-12-02)
  With this CL, related nanobench can be improved for 565 config.

  bitmap_BGRA_8888_scale_rotate_bilerp                  115us -> 70.5us  0.61x
  bitmap_BGRA_8888_update_volatile_scale_rotate_bilerp  115us -> 70.5us  0.61x
  bitmap_BGRA_8888_update_scale_rotate_bilerp           112us -> 68us    0.6x

  BUG=skia:
  Review URL: https://codereview.chromium.org/773753002
* SkColorTable locking serves no purpose anymore. (mtklein, 2014-12-02)
  The only thing the unlock methods were doing was assert their balance.
  This removes the unlock methods and renames the lock methods "read".

  BUG=skia:
  Review URL: https://codereview.chromium.org/719213008
* Remove SK_SUPPORT_LEGACY_DEEPFLATTENING. (mtklein, 2014-12-01)
  This was needed for pictures before v33, and we're now requiring v35+.

  Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc

  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/52c293547b973f7fb5de3c83f5062b07d759ab88
  Review URL: https://codereview.chromium.org/769953002
* Revert of Remove SK_SUPPORT_LEGACY_DEEPFLATTENING. (patchset #1 id:1 of https://codereview.chromium.org/769953002/) (mtklein, 2014-12-01)
  Reason for revert:
  Breaks canary builds. Will reland after the Chromium change lands.

  Original issue's description:
  > Remove SK_SUPPORT_LEGACY_DEEPFLATTENING.
  >
  > This was needed for pictures before v33, and we're now requiring v35+.
  >
  > Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/52c293547b973f7fb5de3c83f5062b07d759ab88

  TBR=reed@google.com,mtklein@chromium.org
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Review URL: https://codereview.chromium.org/768183002
* Remove SK_SUPPORT_LEGACY_DEEPFLATTENING. (mtklein, 2014-12-01)
  This was needed for pictures before v33, and we're now requiring v35+.

  Will follow up with the same for skia/ext/pixel_ref_utils_unittest.cc

  BUG=skia:
  Review URL: https://codereview.chromium.org/769953002
* Eliminate static initializers in SkColor_SSE2.h. (mtklein, 2014-11-25)
  Chrome hates static initializers. Two global masks can become a single local mask instead.

  Perf looks like a no-op:

  $ c --match bitmaprect_80 bitmap_RGBA --config 8888
  bitmap_RGBA_8888_scale                   13.7us -> 14.1us  1.03x
  bitmap_RGBA_8888_update_volatile         4.53us -> 4.6us   1.02x
  bitmap_RGBA_8888                         4.55us -> 4.61us  1.01x
  bitmap_RGBA_8888_update                  4.64us -> 4.67us  1.01x
  bitmap_RGBA_8888_A_source_stripes_three  9.66us -> 9.71us  1.01x
  bitmaprect_80_filter_identity            10.6us -> 10.5us  0.99x
  bitmaprect_80_nofilter_identity          10.5us -> 10.4us  0.99x

  TBR=reed@google.com
  BUG=skia:
  Review URL: https://codereview.chromium.org/762453002
* Optimize highQualityFilter (qiankun.miao, 2014-11-25)
  portable version:
  before:
  10M 1 806µs 807µs 810µs 821µs 1% █▂▁▁▃▁▁▁█▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
  after:
  10M 1 566µs 568µs 569µs 579µs 1% ▄▂▂█▂▁▁▁▃▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic

  SSE version:
  before:
  10M 1 485µs 486µs 487µs 494µs 1% ▇▂▁▁▁▁█▂▁▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic
  after:
  10M 1 419µs 420µs 421µs 430µs 1% ▅▃▂▁▁█▂▁▁▁ 8888 bitmap_BGRA_8888_A_scale_rotate_bicubic

  BUG=skia:
  Review URL: https://codereview.chromium.org/759603002
* Add SkBlendARGB32_SSE2() to clean up code (qiankun.miao, 2014-11-25)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Related nanobench results: before: maxrss loops min median mean max stddev samples config bench 10M 2 31.9µs 32.4µs 33.3µs 38.7µs 6% █▄▂▂▂▁▂▁▁▁ 8888 bitmap_BGRA_8888_A_scale_bicubic 10M 13 43.8µs 51.8µs 49.6µs 57.9µs 11% ▁▁▁▁▂▆▇▆▅█ 8888 bitmap_BGRA_8888_A_scale_bilerp 10M 13 23.7µs 24.3µs 26µs 32.7µs 13% ▅█▆▁▁▁▁▂▁▁ 8888 bitmap_Index_8_A 10M 4 1.68µs 1.7µs 4.09µs 25.4µs 183% █▁▁▁▁▁▁▁▁▁ 8888 text_16_AA_88 10M 144 1.76µs 1.77µs 1.78µs 1.81µs 1% █▂▇▂▅▁▁▁▁▁ 8888 text_16_AA_FF 10M 10 4.7µs 5.34µs 5.61µs 8.63µs 21% █▂▂▃▂▁▁▁▁▄ 8888 rotated_rects_aa_alternating_transparent_and_opaque_src 10M 50 4.44µs 4.47µs 4.5µs 4.71µs 2% █▅▃▂▂▂▁▁▁▁ 8888 rotated_rects_aa_changing_opaque_src 10M 51 4.39µs 4.78µs 5.21µs 6.62µs 17% ▁▆▆▇▁▁█▁▂▂ 8888 rotated_rects_aa_same_opaque_src 10M 50 4.47µs 5.79µs 5.43µs 6.14µs 11% ▄▂▁▃▇▇▆▇▇█ 8888 rotated_rects_aa_alternating_transparent_and_opaque_srcover 10M 30 4.35µs 6.06µs 5.84µs 7.63µs 16% ▅▅▅▄▅▅▄█▁▁ 8888 rotated_rects_aa_changing_transparent_srcover 10M 44 4.31µs 4.51µs 4.76µs 6.25µs 13% ▄▂▂▁█▃▁▃▁▁ 8888 rotated_rects_aa_changing_opaque_srcover 10M 46 4.36µs 4.42µs 4.75µs 6.19µs 14% ▆█▃▁▁▁▁▁▁▁ 8888 rotated_rects_aa_same_transparent_srcover 10M 47 4.29µs 4.35µs 4.44µs 5.15µs 6% ▃▂▂▁▁█▁▁▁▁ 8888 rotated_rects_aa_same_opaque_srcover 10M 3 39.1µs 39.2µs 50.7µs 153µs 71% █▁▁▁▁▁▁▁▁▁ 8888 rectori 10M 1 2.3ms 2.31ms 2.35ms 2.74ms 6% ▁▁▁▁▁▁▁▁█▂ 8888 maskcolor 10M 1 2.33ms 2.34ms 2.53ms 3.14ms 11% ▁▁▁▁▁▁▅█▄▄ 8888 maskopaque 10M 11 15µs 15.3µs 15.7µs 18.3µs 7% ▅▃▂▂▁▁▁▁█▁ 8888 rrects_3_stroke_4 10M 46 3.99µs 4.07µs 4.14µs 4.54µs 4% █▅▅▃▂▂▁▁▁▁ 8888 rrects_3 10M 16 15.6µs 15.9µs 16.1µs 17.5µs 4% █▄▃▂▂▂▁▂▁▁ 8888 ovals_3_stroke_4 10M 40 5.09µs 5.18µs 5.23µs 5.67µs 3% █▅▃▂▂▁▃▁▁▁ 8888 ovals_3 10M 231 1.92µs 1.93µs 1.94µs 2µs 1% █▃▂▁▃▁▁▁▁▁ 8888 zeroradroundrect 10M 924 3.88µs 3.93µs 4.11µs 4.95µs 9% ▁█▆▃▁▁▁▁▁▁ 8888 
arbroundrect 10M 8 8.11µs 8.47µs 8.48µs 8.85µs 3% █▅▇▄▄▂▁▄▄▆ 8888 merge_large 10M 14 6.71µs 6.92µs 6.96µs 7.46µs 3% ▃▆▁█▃▃▃▂▂▁ 8888 merge_small 11M 2 225µs 227µs 229µs 233µs 1% ███▃▇▂▃▁▃▂ 8888 displacement_full_large 16M 1 381µs 401µs 401µs 421µs 3% ▅▅▅█▆▄▄▃▃▁ 8888 displacement_alpha_large 19M 1 507µs 508µs 509µs 512µs 0% █▃▂▆▂▂▃▂▃▁ 8888 displacement_zero_large 19M 19 9µs 9.11µs 9.15µs 9.67µs 2% ▄▂▂▂█▂▁▁▁▂ 8888 displacement_full_small 19M 5 54.2µs 54.5µs 54.9µs 58µs 2% █▃▂▂▁▁▃▁▁▁ 8888 blurroundrect_WH[100x100]_cr[90] 20M 1 229µs 230µs 231µs 240µs 2% █▄▃▂▂▁▁▁▁▂ 8888 GM_varied_text_clipped_no_lcd 20M 1 267µs 269µs 270µs 279µs 1% █▄▃▂▂▂▂▂▁▁ 8888 GM_varied_text_ignorable_clip_no_lcd 22M 1 1.95ms 1.97ms 2.03ms 2.46ms 8% ▁▁▁▁▁▁▁▂█▃ 8888 GM_convex_poly_clip after: maxrss loops min median mean max stddev samples config bench 10M 2 31.5µs 32.3µs 32.8µs 37.2µs 5% █▄▃▂▂▂▁▁▁▁ 8888 bitmap_BGRA_8888_A_scale_bicubic 10M 13 43.9µs 44µs 44.1µs 44.9µs 1% █▂▁▁▁▆▁▁▁▂ 8888 bitmap_BGRA_8888_A_scale_bilerp 10M 19 22.7µs 23.3µs 25.6µs 32.4µs 14% ▁▁▁▁▁▅▆▁▅█ 8888 bitmap_Index_8_A 10M 5 1.79µs 1.97µs 3.85µs 21.1µs 158% █▁▁▁▁▁▁▁▁▁ 8888 text_16_AA_88 10M 141 1.83µs 1.83µs 1.85µs 1.93µs 2% ▅▁▁█▁▁▁▁▁▁ 8888 text_16_AA_FF 10M 10 4.65µs 4.92µs 5.06µs 6.56µs 11% █▃▃▂▂▂▁▁▁▁ 8888 rotated_rects_aa_alternating_transparent_and_opaque_src 10M 51 4.35µs 4.48µs 4.83µs 6.68µs 17% ▂▁▁▁▁▁▁▂▆█ 8888 rotated_rects_aa_changing_opaque_src 10M 51 4.38µs 4.79µs 4.85µs 5.84µs 11% ▁█▁▃▃▁▄▁▄▇ 8888 rotated_rects_aa_same_opaque_src 10M 32 5.58µs 6.24µs 6.1µs 6.39µs 5% █▂█▆▁▇▄▅▇▇ 8888 rotated_rects_aa_alternating_transparent_and_opaque_srcover 10M 42 4.28µs 5.59µs 5.11µs 6.01µs 15% ▂▂█▇█▂▁▆▁▇ 8888 rotated_rects_aa_changing_transparent_srcover 10M 48 4.24µs 4.33µs 4.58µs 6.46µs 15% ▁▁▁▁▁█▃▂▁▁ 8888 rotated_rects_aa_changing_opaque_srcover 10M 48 4.28µs 4.3µs 4.4µs 5.12µs 6% ▂▂▁▁▁▁▁▁▁█ 8888 rotated_rects_aa_same_transparent_srcover 10M 46 4.24µs 4.29µs 4.66µs 7.11µs 20% ▁▁▁▁▁▁▁▁▃█ 8888 rotated_rects_aa_same_opaque_srcover 10M 
3 39.3µs 39.4µs 51.4µs 154µs 70% █▁▁▁▁▁▁▁▁▁ 8888 rectori 10M 1 2.32ms 2.43ms 2.53ms 3.14ms 11% ▁▁▁▁▂▄█▃▅▁ 8888 maskcolor 10M 1 2.33ms 2.37ms 2.54ms 3.21ms 12% ▁▁▁▁▁▂█▅▆▁ 8888 maskopaque 10M 10 15.3µs 15.6µs 15.8µs 17.2µs 4% █▅▃▂▂▂▁▁▁▁ 8888 rrects_3_stroke_4 10M 46 4.03µs 4.09µs 4.15µs 4.47µs 4% █▄▆▂▂▂▁▁▁▁ 8888 rrects_3 10M 15 15.9µs 16.2µs 16.3µs 17.8µs 4% █▄▃▂▂▂▁▁▁▁ 8888 ovals_3_stroke_4 10M 40 5.14µs 5.26µs 5.29µs 5.72µs 3% █▅▃▂▂▁▂▂▁▁ 8888 ovals_3 10M 222 1.91µs 1.99µs 2.21µs 2.91µs 19% ▂▁▁▁▁▁▂▇▇█ 8888 zeroradroundrect 10M 462 3.9µs 3.96µs 4.23µs 5.22µs 12% ▆▄█▁▂▁▁▁▁▁ 8888 arbroundrect 10M 8 8.2µs 8.59µs 8.62µs 8.97µs 3% ▆▄█▄▅▃▁▆▄█ 8888 merge_large 10M 14 6.73µs 6.88µs 6.86µs 7.08µs 2% ▄█▁▂▄▂▅▄▂▅ 8888 merge_small 11M 2 221µs 234µs 237µs 263µs 5% ▄▃▃▃▄▃▂▁▇█ 8888 displacement_full_large 16M 1 387µs 416µs 427µs 471µs 7% ▇█▁▃▃▁▃▃▇▆ 8888 displacement_alpha_large 19M 1 512µs 521µs 528µs 594µs 5% █▂▂▂▁▁▂▃▁▁ 8888 displacement_zero_large 19M 18 9.06µs 9.12µs 9.13µs 9.23µs 1% █▃▃▃▄▃▆▁▅▅ 8888 displacement_full_small 19M 5 55.6µs 55.9µs 56.5µs 59.5µs 2% █▃▂▁▁▁▁▁▅▁ 8888 blurroundrect_WH[100x100]_cr[90] 20M 1 229µs 233µs 235µs 254µs 3% █▄▃▂▂▁▁▂▁▁ 8888 GM_varied_text_clipped_no_lcd 20M 1 270µs 271µs 272µs 278µs 1% █▄▃▂▂▂▁▂▁▇ 8888 GM_varied_text_ignorable_clip_no_lcd 22M 1 1.96ms 2ms 2.06ms 2.45ms 7% ▂▂▁▁▁▁▁▃█▄ 8888 GM_convex_poly_clip BUG=skia: Review URL: https://codereview.chromium.org/754733002
* Cleanup with SkAlphaMulQ_SSE2() (qiankun.miao, 2014-11-25)
  Related nanobench results:

  before:
  10M 18 7.03µs 7.31µs 7.38µs 8.46µs 6% ▂▁▂▂▂▃▄▁█▁ 8888 bitmaprect_80_filter_identity
  10M 43 6.96µs 6.97µs 6.99µs 7.19µs 1% ▁▂▁▁▁▁▁█▁▁ 8888 bitmaprect_80_nofilter_identity
  10M 14 35.7µs 35.8µs 35.9µs 36.3µs 1% ▃▂▁▂▁█▂▁▁▁ 8888 bitmap_BGRA_8888_update_scale_bilerp
  10M 16 35.5µs 35.6µs 35.7µs 36.3µs 1% █▅▂▁▁▁▃▂▁▁ 8888 bitmap_BGRA_8888_update_volatile_scale_bilerp
  10M 16 35.4µs 35.4µs 35.5µs 36.8µs 1% ▂▁█▁▁▁▁▂▁▁ 8888 bitmap_BGRA_8888_scale_bilerp
  10M 25 16.4µs 16.6µs 16.7µs 17.4µs 2% ▂▁▁▂▁▁▁▅▅█ 8888 bitmap_Index_8
  10M 15 37.9µs 38µs 38µs 38.4µs 0% ▄▆▂▁▁▁█▂▁▁ 8888 bitmap_RGB_565
  10M 33 11.1µs 11.1µs 11.1µs 11.2µs 0% ▆▂█▂▂▂▁▁▂▁ 8888 bitmap_BGRA_8888_scale

  after:
  10M 9 7.04µs 7.06µs 7.1µs 7.32µs 1% █▅▂▁▁▂▁▁▁▁ 8888 bitmaprect_80_filter_identity
  10M 18 7.01µs 7.02µs 7.05µs 7.25µs 1% █▂▁▁▁▁▁▁▁▁ 8888 bitmaprect_80_nofilter_identity
  10M 5 33.9µs 34µs 34.1µs 34.5µs 1% █▃▂▂▁▁▁▅▃▂ 8888 bitmap_BGRA_8888_update_scale_bilerp
  10M 7 35.5µs 35.5µs 35.6µs 36.3µs 1% ▃▂▂▁▂▁▂▁█▂ 8888 bitmap_BGRA_8888_update_volatile_scale_bilerp
  10M 7 35.5µs 35.5µs 35.7µs 36.8µs 1% ▂▁▁▁▁▁▁▁▁█ 8888 bitmap_BGRA_8888_scale_bilerp
  10M 11 16.4µs 16.4µs 16.4µs 16.6µs 0% █▂▁▁▂▁▁▁▂▁ 8888 bitmap_Index_8
  10M 7 37.3µs 37.4µs 38.4µs 47.8µs 9% ▁▁▁▁▁▁▁▁▁█ 8888 bitmap_RGB_565
  10M 33 11µs 11µs 11.1µs 11.2µs 1% ▄█▅▃▂▁▁▁▁▁ 8888 bitmap_BGRA_8888_scale

  BUG=skia:
  Review URL: https://codereview.chromium.org/755573002
* Cleanup of S32_D565_Opaque_SSE2() (qiankun.miao, 2014-11-24)
  BUG=skia:
  Review URL: https://codereview.chromium.org/725693003
* Optimize SkAlphaMulQ_SSE2 (qiankun.miao, 2014-11-14)
  These two mask clears are unnecessary, because _mm_srli_epi16 fills the high byte of each word with 0.

  BUG=skia:
  Review URL: https://codereview.chromium.org/724333003
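The reasoning in this commit can be checked with a scalar model. The sketch below is not the Skia code: it assumes the shift count is 8 (the message only says the high byte ends up zero) and models a single 16-bit lane of `_mm_srli_epi16` with a plain integer shift. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 16-bit lane under _mm_srli_epi16(v, 8): a logical
// right shift moves the high byte into the low byte and shifts in zeros.
inline uint16_t srl16_by_8(uint16_t lane) {
    return static_cast<uint16_t>(lane >> 8);
}

// Checks every possible lane value: clearing the high byte after the shift
// never changes the result, so a mask applied at that point is redundant.
inline bool high_byte_mask_is_redundant() {
    for (uint32_t v = 0; v <= 0xFFFFu; ++v) {
        const uint16_t shifted = srl16_by_8(static_cast<uint16_t>(v));
        if (static_cast<uint16_t>(shifted & 0x00FFu) != shifted) {
            return false;
        }
    }
    return true;
}

// Usage: both assertions hold.
inline void check_shift_example() {
    assert(srl16_by_8(0xABCD) == 0x00AB);
    assert(high_byte_mask_is_redundant());
}
```

Because the check is exhaustive over all 65536 lane values, it covers every lane of a real 128-bit register as well, since the intrinsic treats lanes independently.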
* Fix race in supports_simd(). (mtklein, 2014-10-13)
  Local statics are not thread safe in Chrome. Use an SkLazyPtr instead.

  See https://code.google.com/p/chromium/issues/detail?id=418041

  BUG=418041
  Review URL: https://codereview.chromium.org/655573002
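A minimal sketch of caching a feature probe without relying on guarded local statics follows. It does not use SkLazyPtr, whose interface is not shown in this log; it swaps in a plain `std::atomic<int>` instead. The names `probe_simd`, `probe_calls`, and `supports_simd_cached` are hypothetical, and the probe result is hard-coded for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical call counter, used only to make the caching observable.
inline std::atomic<int>& probe_calls() {
    static std::atomic<int> calls{0};  // constant-initialized, needs no guard
    return calls;
}

// Hypothetical stand-in for the real CPU feature probe.
inline bool probe_simd() {
    probe_calls().fetch_add(1);
    return true;  // assumption for the sketch: pretend SIMD is available
}

inline bool supports_simd_cached() {
    // 0 = not probed yet, 1 = unsupported, 2 = supported.
    // The atomic is constant-initialized, so no thread-unsafe lazy
    // initialization of the variable itself takes place.
    static std::atomic<int> state{0};
    int s = state.load(std::memory_order_acquire);
    if (s == 0) {
        // Racing threads may each probe once, but the probe is idempotent
        // and every thread stores the same answer.
        s = probe_simd() ? 2 : 1;
        state.store(s, std::memory_order_release);
    }
    return s == 2;
}

// Usage: the second call is served from the cache.
inline void cached_probe_example() {
    const bool first = supports_simd_cached();
    const bool second = supports_simd_cached();
    assert(first == second);
}
```

The design trades a possible duplicate probe under contention for freedom from locks and from compiler-generated static guards.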
* Improve SkARGB32_A8_BlitMask_SSE2 (jmuizelaar, 2014-10-09)
  With clang this:

  - movzbl -3(%rbx), %edx
  - pxor %xmm5, %xmm5
  - pinsrw $0, %edx, %xmm5
  - pinsrw $1, %edx, %xmm5
  - movzbl -2(%rbx), %edx
  - pinsrw $2, %edx, %xmm5
  - pinsrw $3, %edx, %xmm5
  - movzbl -1(%rbx), %edx
  - pinsrw $4, %edx, %xmm5
  - pinsrw $5, %edx, %xmm5
  - movzbl (%rbx), %edx
  - pinsrw $6, %edx, %xmm5
  - pinsrw $7, %edx, %xmm5

  becomes:

  + movd (%rbx), %xmm4
  + punpcklbw %xmm9, %xmm4
  + punpcklwd %xmm4, %xmm4

  And clang already does better codegen than msvc 2013 on this.

  BUG=skia:
  Review URL: https://codereview.chromium.org/609823003
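The lane layout that both instruction sequences produce can be modelled in scalar code. This is a sketch, not the Skia source: it assumes `%xmm9` holds zero at that point (the listing does not show it being set), and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Step 1, modelling punpcklbw against a zero register (an assumption):
// each of the four mask bytes is zero-extended to a 16-bit word.
inline std::array<uint16_t, 4> zero_extend_bytes(const std::array<uint8_t, 4>& m) {
    return {{m[0], m[1], m[2], m[3]}};
}

// Step 2, modelling punpcklwd of a register with itself: each word is
// duplicated into two adjacent 16-bit lanes.
inline std::array<uint16_t, 8> duplicate_words(const std::array<uint16_t, 4>& w) {
    return {{w[0], w[0], w[1], w[1], w[2], w[2], w[3], w[3]}};
}

// Four 8-bit coverage values in, eight 16-bit lanes out, matching what the
// thirteen-instruction pinsrw sequence built one lane at a time.
inline std::array<uint16_t, 8> widen_mask(const std::array<uint8_t, 4>& m) {
    return duplicate_words(zero_extend_bytes(m));
}

// Usage: every mask byte ends up in two neighbouring lanes.
inline void widen_mask_example() {
    const std::array<uint8_t, 4> mask = {{10, 20, 30, 255}};
    const std::array<uint16_t, 8> lanes = widen_mask(mask);
    assert(lanes[0] == 10 && lanes[1] == 10);
    assert(lanes[6] == 255 && lanes[7] == 255);
}
```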
* Enable highQualityFilter_SSE2 (qiankun.miao, 2014-09-04)
  With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40% performance improvement on desktop i7-3770.

  BUG=skia:
  Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51
  CQ_EXTRA_TRYBOTS=tryserver.skia:Test-Mac10.6-MacMini4.1-GeForce320M-x86_64-Release-Trybot
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/525283002
* Revert of Enable highQualityFilter_SSE2 (patchset #1 id:1 of https://codereview.chromium.org/525283002/) (mtklein, 2014-09-03)
  Reason for revert:
  Color order looks wrong on Macs:

  Before: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/12823183142873462143.png
  After: http://chromium-skia-gm.commondatastorage.googleapis.com/gm/bitmap-64bitMD5/filterbitmap_image_mandrill_16.png/13683040204546320578.png

  Original issue's description:
  > Enable highQualityFilter_SSE2
  >
  > With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40%
  > performance improvement on desktop i7-3770.
  >
  > BUG=skia:
  >
  > Committed: https://skia.googlesource.com/skia/+/b381fa10d8079c58928058bb8a6db32b39f05e51

  R=humper@google.com, qiankun.miao@intel.com
  TBR=humper@google.com, qiankun.miao@intel.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:
  Author: mtklein@google.com
  Review URL: https://codereview.chromium.org/539523002
* Enable highQualityFilter_SSE2 (qiankun.miao, 2014-09-03)
  With SSE2, bitmap_BGRA_8888_A_scale_rotate_bicubic gains about 40% performance improvement on desktop i7-3770.

  BUG=skia:
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/525283002
* Disable SSE4 S32A_Opaque blit. (mtklein, 2014-09-03)
  This code sometimes generates a build warning that bothers Chrome.

  BUG=399842,skia:2906
  R=reed@google.com, mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/538463003
* Remove dead code in SkBitmapFilter_opts_SSE2.h/cpp (qiankun.miao, 2014-09-03)
  BUG=skia:
  R=mtklein@google.com, humper@google.com
  Author: qiankun.miao@intel.com
  Review URL: https://codereview.chromium.org/530673002
* Disable NEON procs for box blur as it produces invalid results (djsollen, 2014-09-02)
  R=reed@google.com, mtklein@google.com, senorblanco@google.com
  TBR=reed@google.com
  BUG=skia:2845
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/527973002
* Revert of Disable NEON procs for box blur as it produces invalid results (patchset #1 id:1 of https://codereview.chromium.org/520963002/) (djsollen, 2014-09-02)
  Reason for revert:
  failing more GMs than expected.

  Original issue's description:
  > Disable NEON procs for box blur as it produces invalid results
  >
  > BUG=skia:2845
  >
  > Committed: https://skia.googlesource.com/skia/+/4a1764688c990fb926aaeab538497dad52768d99

  R=senorblanco@google.com, mtklein@google.com
  TBR=mtklein@google.com, senorblanco@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2845
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/531023002
* Disable NEON procs for box blur as it produces invalid results (djsollen, 2014-09-02)
  BUG=skia:2845
  R=senorblanco@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/520963002
* Disable Neon optimization of bad S32A/D565 blend. (mtklein, 2014-08-22)
  BUG=skia:2797
  Committed: https://skia.googlesource.com/skia/+/84cab93186fbe3e87d931fea73cb31b70ff5017b
  R=mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/497823002
* Disable Neon optimization of bad S32A/D565 blend. (mtklein, 2014-08-22)
  BUG=skia:2797
  R=mtklein@google.com
  Author: mtklein@chromium.org
  Review URL: https://codereview.chromium.org/497823002
* disable neon proc that is triggering asserts (reed, 2014-08-22)
  BUG=skia:2845
  R=mtklein@google.com
  Author: reed@google.com
  Review URL: https://codereview.chromium.org/498733002
* Simplify flattening to just write enough to call the factory/public-constructor for the class. (reed, 2014-08-21)
  We want to *not* rely on private constructors, and not rely on calling through the inheritance hierarchy for either flattening or unflattening (CreateProc).

  Refactoring pattern:

  1. Guard the existing constructor(readbuffer) with the legacy build flag.
  2. If you are an instantiable subclass, implement CreateProc(readbuffer) to create a new instance from the buffer params (or return NULL).

  If you're a shader subclass:

  1. You must read/write the local matrix if your class accepts that in its factory/constructor, else ignore it.

  R=robertphillips@google.com, mtklein@google.com, senorblanco@google.com, senorblanco@chromium.org, sugoi@chromium.org
  Author: reed@google.com
  Review URL: https://codereview.chromium.org/395603002
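The refactoring pattern described in this commit can be sketched roughly as follows. Everything here is hypothetical and much simpler than Skia's real classes: `ReadBuffer`, `Effect`, and `OffsetEffect` are made-up stand-ins, and only the shape of the idea is shown: flatten writes just the factory parameters, and a static `CreateProc` reads them back and calls the public factory, returning null on bad input.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical minimal read buffer over a list of 32-bit values.
class ReadBuffer {
public:
    explicit ReadBuffer(std::vector<int32_t> words) : fWords(std::move(words)) {}
    bool readInt(int32_t* out) {
        if (fPos >= fWords.size()) return false;  // ran out of data
        *out = fWords[fPos++];
        return true;
    }
private:
    std::vector<int32_t> fWords;
    std::size_t fPos = 0;
};

// Hypothetical base class: subclasses only know how to write their params.
class Effect {
public:
    virtual ~Effect() = default;
    virtual void flatten(std::vector<int32_t>* out) const = 0;
};

class OffsetEffect : public Effect {
public:
    // Public factory: the only way to build an instance.
    static std::unique_ptr<OffsetEffect> Make(int32_t dx, int32_t dy) {
        return std::unique_ptr<OffsetEffect>(new OffsetEffect(dx, dy));
    }
    // Unflattening reads the factory parameters and calls the factory;
    // no buffer-taking constructor and no base-class calls are involved.
    static std::unique_ptr<Effect> CreateProc(ReadBuffer& buffer) {
        int32_t dx = 0, dy = 0;
        if (!buffer.readInt(&dx) || !buffer.readInt(&dy)) return nullptr;
        return Make(dx, dy);
    }
    // Flattening writes exactly what Make() needs.
    void flatten(std::vector<int32_t>* out) const override {
        out->push_back(fDx);
        out->push_back(fDy);
    }
private:
    OffsetEffect(int32_t dx, int32_t dy) : fDx(dx), fDy(dy) {}
    int32_t fDx;
    int32_t fDy;
};

// Usage: flatten, recreate through CreateProc, flatten again, and compare.
inline bool round_trips(int32_t dx, int32_t dy) {
    std::vector<int32_t> first;
    OffsetEffect::Make(dx, dy)->flatten(&first);
    ReadBuffer buffer(first);
    std::unique_ptr<Effect> copy = OffsetEffect::CreateProc(buffer);
    assert(copy != nullptr);
    std::vector<int32_t> second;
    copy->flatten(&second);
    return first == second;
}
```

Keeping the serialized form equal to the factory's argument list means validation lives in one place, the factory, rather than being duplicated in a deserializing constructor.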
* Turn off NEON SkBoxBlurGetPlatformProcs for ARM64 (for now) (halcanary, 2014-08-20)
  BUG=skia:2845
  R=djsollen@google.com, senorblanco@google.com, senorblanco@chromium.org
  Author: halcanary@google.com
  Review URL: https://codereview.chromium.org/491973002
* Let skia build with clang's integrated assembler. (thakis, 2014-08-11)
  1. vuzpq is a gcc instruction. Replace it with the equivalent vuzp (see http://llvm.org/PR20423)
  2. .func / .endfunc only have an effect with -gstabs, which we don't use. As it's unused and clang doesn't support it, remove .func / .endfunc (also see http://llvm.org/20424)

  BUG=chromium:124610
  R=mtklein@google.com
  Author: thakis@chromium.org
  Review URL: https://codereview.chromium.org/461693004
* Replace a pre-UAL instruction with its modern form. (thakis, 2014-08-11)
  See the notes in the Chromium bug, and http://llvm.org/20427

  BUG=chromium:124610,skia:900
  R=djsollen@google.com, mtklein@google.com
  Author: thakis@chromium.org
  Review URL: https://codereview.chromium.org/455903002
* Fix S32A_D565_Opaque for RGBA on arm64 (kevin.petit, 2014-08-09)
  Signed-off-by: Kévin PETIT <kevin.petit@arm.com>

  BUG=skia:2813
  R=halcanary@google.com, djsollen@google.com, mtklein@google.com
  Author: kevin.petit@arm.com
  Review URL: https://codereview.chromium.org/458453002
* Disable suspect NEON function for 64-bit Android (djsollen, 2014-08-07)
  R=halcanary@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/451633006
* Add query for block dimensions of a given format (krajcevski, 2014-07-29)
  R=robertphillips@google.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/422023006
* Enable the SSSE3 compile time check on all platforms (4th attempt) (djsollen, 2014-07-24)
  BUG=skia:2746
  R=bungeman@google.com, robertphillips@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/414033002
* Revert of Enable the SSSE3 compile time check on all platforms. (https://codereview.chromium.org/403583002/) (bungeman, 2014-07-23)
  Reason for revert:
  This is blocking the roll. Chromium Windows trybots (like win_chromium_x64_rel) are crashing in the SSSE3 code (for example SkCanvasVideoRenderTest.CroppedFrame).

  Original issue's description:
  > Enable the SSSE3 compile time check on all platforms (3rd attempt)
  >
  > BUG=skia:2746
  >
  > Committed: https://skia.googlesource.com/skia/+/933834851f9d48fbd85b728cc92e1f0134bfaa4e

  R=halcanary@google.com, mtklein@google.com, djsollen@google.com
  TBR=djsollen@google.com, halcanary@google.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2746
  Author: bungeman@google.com
  Review URL: https://codereview.chromium.org/418523002
* Enable the SSSE3 compile time check on all platforms (3rd attempt) (djsollen, 2014-07-22)
  BUG=skia:2746
  R=halcanary@google.com, mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/403583002
* Add support for NEON intrinsics to speed up texture compression. (krajcevski, 2014-07-14)
  We can now convert the time that we would have spent uploading the texture to compressing it, giving a net 50% memory savings for these things.

  Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a
  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/390453002
* Revert of Enable the SSSE3 compile time check on all platforms. (https://codereview.chromium.org/391693004/) (halcanary, 2014-07-14)
  Reason for revert:
  windows fail

  Original issue's description:
  > Enable the SSSE3 compile time check on all platforms.
  >
  > BUG=skia:2746
  >
  > Committed: https://skia.googlesource.com/skia/+/ee349531446ae2a8336b0903e05d0b2150d2131f

  R=mtklein@google.com, djsollen@google.com
  TBR=djsollen@google.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  BUG=skia:2746
  Author: halcanary@google.com
  Review URL: https://codereview.chromium.org/390063002
* Enable the SSSE3 compile time check on all platforms. (djsollen, 2014-07-14)
  BUG=skia:2746
  R=mtklein@google.com
  Author: djsollen@google.com
  Review URL: https://codereview.chromium.org/391693004
* MIPS: added optimization for SkRGB16_Opaque_Blitter::blitMask (djordje.pesut, 2014-07-14)
  gain is ~30%

  R=djsollen@google.com
  Author: djordje.pesut@imgtec.com
  Review URL: https://codereview.chromium.org/357693002
* Revert of Add support for NEON intrinsics to speed up texture compression. We can (https://codereview.chromium.org/390453002/) (krajcevski, 2014-07-11)
  Reason for revert:
  Breaking chrome.

  Original issue's description:
  > Add support for NEON intrinsics to speed up texture compression. We can
  > now convert the time that we would have spent uploading the texture to
  > compressing it giving a net 50% memory savings for these things.
  >
  > Committed: https://skia.googlesource.com/skia/+/bc9205be0a1094e312da098348601398c210dc5a

  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  TBR=kevin.petit@arm.com, mtklein@google.com, robertphillips@google.com
  NOTREECHECKS=true
  NOTRY=true
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/384053003
* Add support for NEON intrinsics to speed up texture compression. (krajcevski, 2014-07-11)
  We can now convert the time that we would have spent uploading the texture to compressing it, giving a net 50% memory savings for these things.

  R=robertphillips@google.com, mtklein@google.com, kevin.petit@arm.com
  Author: krajcevski@google.com
  Review URL: https://codereview.chromium.org/390453002
* MIPS: added optimizations for functions from SkBitmapProcState (djordje.pesut, 2014-07-08)
  gain is ~30%

  following functions are optimized:
  SI8_D16_nofilter_DX
  SI8_opaque_D32_nofilter_DX

  R=djsollen@google.com, teodora.petrovic@gmail.com
  Author: djordje.pesut@imgtec.com
  Review URL: https://codereview.chromium.org/336533003
* Add return to SkBoxBlurGetPlatformProcs_SSE4. (scroggo, 2014-07-07)
  This fixes Android build.

  R=reed@google.com, mtklein@google.com
  Author: scroggo@google.com
  Review URL: https://codereview.chromium.org/378613002
* Add SSE4 version of BlurImage optimizations. (henrik.smiding, 2014-07-07)
  Adds an SSE4.1 version of the existing BlurImage optimizations.
  Performance of blur_image_filter_* benchmarks show a 10-50% improvement on Linux/Ubuntu Core i7.

  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

  Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665
  R=mtklein@google.com, reed@chromium.org
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/366593004
* Revert of Add SSE4 version of BlurImage optimizations. (https://codereview.chromium.org/366593004/) (reed, 2014-07-06)
  Reason for revert:
  breaks linker on chrome

  [04:36:09.966000] [503/5965] LIB obj\chrome\installer_util.lib
  [04:36:10.466000] FAILED: C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-with-manifests environment.x86 True skia.dll "C:\Users\chrome-bot\buildbot\third_party\depot_tools\python276_bin\python.exe gyp-win-tool link-wrapper environment.x86 False link.exe /nologo /IMPLIB:skia.dll.lib /DLL /OUT:skia.dll @skia.dll.rsp" 2 mt.exe rc.exe "obj\skia\skia.skia.dll.intermediate.manifest" obj\skia\skia.skia.dll.generated.manifest
  [04:36:10.466000] skia.opts_check_x86.obj : error LNK2019: unresolved external symbol "bool __cdecl SkBoxBlurGetPlatformProcs_SSE4(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs_SSE4@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z) referenced in function "bool __cdecl SkBoxBlurGetPlatformProcs(void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int),void (__cdecl**)(unsigned int const *,int,unsigned int *,int,int,int,int,int))" (?SkBoxBlurGetPlatformProcs@@YA_NPAP6AXPBIHPAIHHHHH@Z222@Z)
  [04:36:10.466000]
  [04:36:10.466000] skia.dll : fatal error LNK1120: 1 unresolved externals

  Original issue's description:
  > Add SSE4 version of BlurImage optimizations.
  >
  > Adds an SSE4.1 version of the existing BlurImage optimizations.
  > Performance of blur_image_filter_* benchmarks show a 10-50%
  > improvement on Linux/Ubuntu Core i7.
  >
  > Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
  >
  > Committed: https://skia.googlesource.com/skia/+/2830632ce93c97ed7647b13348365ea92e4ea665

  R=mtklein@google.com, henrik.smiding@intel.com
  TBR=henrik.smiding@intel.com, mtklein@google.com
  NOTREECHECKS=true
  NOTRY=true
  Author: reed@chromium.org
  Review URL: https://codereview.chromium.org/375503003
* Add SSE4 version of BlurImage optimizations. (henrik.smiding, 2014-07-04)
  Adds an SSE4.1 version of the existing BlurImage optimizations.
  Performance of blur_image_filter_* benchmarks show a 10-50% improvement on Linux/Ubuntu Core i7.

  Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

  R=mtklein@google.com
  Author: henrik.smiding@intel.com
  Review URL: https://codereview.chromium.org/366593004