path: root/src/opts/SkBlitRow_opts_SSE2.h
Commit message (Author, Age)
* Cleanup of SSE optimization files. (commit-bot@chromium.org, 2014-04-30)

    General cleanup of optimization files for x86/SSEx.

    Renamed the opts_check_SSE2.cpp file to _x86, since it's not specific to SSE2.
    Commented out the ColorRect32 optimization, since it's disabled anyway, to make it more visible.
    Also fixed a lot of indentation, inclusion guards, spelling, copyright headers, braces, whitespace, and sorting of includes.

    Author: henrik.smiding@intel.com
    Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>

    R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com

    Review URL: https://codereview.chromium.org/264603002

    git-svn-id: http://skia.googlecode.com/svn/trunk@14464 2bbb7eff-a529-9590-31e7-b0007b416f81
* SSE2 implementation of S32A_D565_Opaque_Dither (commit-bot@chromium.org, 2014-03-07)

    With the benchmarks run using the command-line options "--forceDither true --forceBlend 1", almost all benchmarks that exercise S32A_D565_Opaque_Dither improve by about 20%-70%. Here are the data on an i7-3770:

    benchmark                                           before     after  improvement
    verts                                              4314.81   3627.64       15.93%
    constXTile_MM_filter_trans                         1434.22    432.82       69.82%
    constXTile_CC_filter_trans_scale                   1440.17    437.00       69.66%
    constXTile_RR_filter_trans                         1436.96    431.93       69.94%
    constXTile_MM_trans_scale                          1436.33    435.77       69.66%
    constXTile_CC_trans                                1433.12    431.36       69.90%
    constXTile_RR_trans_scale                          1436.13    436.06       69.64%
    constXTile_MM_filter                               1411.55    408.06       71.09%
    constXTile_CC_filter_scale                         1416.68    414.18       70.76%
    constXTile_RR_filter                               1429.46    409.81       71.33%
    constXTile_MM_scale                                1415.00    412.56       70.84%
    constXTile_CC                                      1410.32    408.36       71.04%
    constXTile_RR_scale                                1413.26    413.16       70.77%
    repeatTile_4444_A                                  1922.01    879.03       54.27%
    repeatTile_4444_A                                  1430.68    818.34       42.80%
    repeatTile_4444_X                                  1817.43    816.63       55.07%
    maskshader                                         5911.09   5895.46        0.26%
    gradient_create_alpha                                 4.41      4.41       -0.15%
    gradient_conical_clamp_3color                     35298.71  27574.34       21.88%
    gradient_conical_clamp_hicolor                    35262.15  27538.99       21.90%
    gradient_conical_clamp                            35276.21  27599.80       21.76%
    gradient_radial2_mirror                           20846.74  12969.39       37.79%
    gradient_radial2_clamp_hicolor                    21848.12  13967.57       36.07%
    gradient_radial2_clamp                            21829.95  13978.57       35.97%
    bitmap_4444_A_scale_rotate_bicubic                  105.31     87.13       17.26%
    bitmap_4444_A_scale_bicubic                          73.69     47.76       35.20%
    bitmap_4444_update_scale_rotate_bilerp              125.65     87.86       30.08%
    bitmap_4444_update_volatile_scale_rotate_bilerp     125.50     87.65       30.16%
    bitmap_4444_scale_rotate_bilerp                     124.46     87.91       29.37%
    bitmap_4444_A_scale_rotate_bilerp                   105.09     87.27       16.96%
    bitmap_4444_update_scale_bilerp                     106.78     63.28       40.74%
    bitmap_4444_update_volatile_scale_bilerp            106.66     63.66       40.32%
    bitmap_4444_scale_bilerp                            106.70     63.19       40.78%
    bitmap_4444_A_scale_bilerp                           83.05     62.25       25.04%
    bitmap_a8                                            98.11     52.76       46.22%
    bitmap_a8_A                                          98.24     52.85       46.20%

    BUG=
    R=mtklein@google.com

    Author: qiankun.miao@intel.com

    Review URL: https://codereview.chromium.org/179443003

    git-svn-id: http://skia.googlecode.com/svn/trunk@13699 2bbb7eff-a529-9590-31e7-b0007b416f81
* SSE2 implementation of S32_D565_Opaque_Dither (commit-bot@chromium.org, 2014-03-07)

    Run benchmarks with the command-line option "--forceDither true". The results show that all benchmarks that exercise S32_D565_Opaque_Dither benefit from this SSE2 optimization. Here are the data on an i7-3770:

    benchmark                                           before     after  improvement
    constXTile_MM_filter                                900.93    217.75       75.83%
    constXTile_CC_filter_scale                          907.59    225.65       75.14%
    constXTile_RR_filter                                903.33    219.41       75.71%
    constXTile_MM_scale                                 902.45    221.46       75.46%
    constXTile_CC                                       898.55    218.37       75.70%
    constXTile_RR_scale                                 902.69    222.35       75.37%
    repeatTile_4444_X                                   938.53    240.49       74.38%
    gradient_radial2_mirror                           16999.49  11540.39       32.11%
    gradient_radial2_clamp_hicolor                    17943.38  12501.71       30.33%
    gradient_radial2_clamp                            17816.36  12492.04       29.88%
    bitmaprect_FF_filter_trans                           47.81     10.98       77.03%
    bitmaprect_FF_nofilter_trans                         47.79     10.91       77.18%
    bitmaprect_FF_filter_identity                        47.74     10.89       77.18%
    bitmaprect_FF_nofilter_identity                      47.83     10.89       77.24%
    bitmap_4444_update_scale_rotate_bilerp              100.45     76.84       23.50%
    bitmap_4444_update_volatile_scale_rotate_bilerp     100.80     76.70       23.91%
    bitmap_4444_scale_rotate_bilerp                     100.43     77.18       23.15%
    bitmap_4444_update_scale_bilerp                      79.00     49.03       37.93%
    bitmap_4444_update_volatile_scale_bilerp             78.90     48.87       38.06%
    bitmap_4444_scale_bilerp                             78.92     48.81       38.16%
    bitmap_4444_update                                   42.19     11.53       72.68%
    bitmap_4444_update_volatile                          42.28     11.49       72.82%
    bitmap_a8                                            60.37     29.75       50.72%
    bitmap_4444                                          42.19     11.52       72.69%

    BUG=
    R=mtklein@google.com

    Author: qiankun.miao@intel.com

    Review URL: https://codereview.chromium.org/181293002

    git-svn-id: http://skia.googlecode.com/svn/trunk@13698 2bbb7eff-a529-9590-31e7-b0007b416f81
* SSE2 implementation of S32_D565_Opaque (commit-bot@chromium.org, 2014-02-24)

    Benchmarks hitting this path can benefit from this patch. Here are the data:

    benchmark                                           before     after  improvement
    gradient_radial2_mirror                           10885.52  10849.48        0.33%
    gradient_radial2_clamp_hicolor                    11819.69  11644.83        1.48%
    gradient_radial2_clamp                            11816.10  11649.91        1.41%
    bitmaprect_FF_filter_trans                            6.27      4.88       22.17%
    bitmaprect_FF_nofilter_trans                          6.27      4.88       22.17%
    bitmaprect_FF_filter_identity                         6.31      4.86       22.98%
    bitmaprect_FF_nofilter_identity                       6.25      4.86       22.24%
    bitmap_4444_update                                    6.26      5.05       19.33%
    bitmap_4444_update_volatile                           6.21      5.06       18.52%
    bitmap_4444                                           6.22      5.06       18.65%

    BUG=
    R=mtklein@google.com

    Author: qiankun.miao@intel.com

    Review URL: https://codereview.chromium.org/172083003

    git-svn-id: http://skia.googlecode.com/svn/trunk@13556 2bbb7eff-a529-9590-31e7-b0007b416f81
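
The percentage listed after each before/after pair in the commit message above appears to be the relative reduction, (before - after) / before. The commit message states neither the formula nor the unit of the measurements, so this is an assumption. A minimal Python sketch that recomputes a few rows under that assumption (the function name `improvement_percent` is hypothetical, not part of Skia):

```python
def improvement_percent(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage.

    Assumes lower is better (for example a time measurement); the unit
    is not stated in the commit message.
    """
    if before == 0:
        raise ValueError("before must be non-zero")
    return (before - after) / before * 100.0


# Rows copied from the commit message: (name, before, after, reported %).
rows = [
    ("bitmaprect_FF_filter_trans", 6.27, 4.88, 22.17),
    ("bitmaprect_FF_filter_identity", 6.31, 4.86, 22.98),
    ("bitmap_4444_update", 6.26, 5.05, 19.33),
]

for name, before, after, reported in rows:
    computed = improvement_percent(before, after)
    print(f"{name}: computed {computed:.2f}%, reported {reported:.2f}%")
```

Small disagreements in the last digit are possible wherever the published before/after values were themselves rounded before the percentage was taken.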
* SSE2 implementation of S32A_D565_Opaque (commit-bot@chromium.org, 2014-02-19)

    A microbenchmark of S32A_D565_Opaque() shows a 3x speedup after SSE optimization with various counts on an i7-3770.

    BUG=
    R=mtklein@google.com, reed@google.com

    Author: qiankun.miao@intel.com

    Review URL: https://codereview.chromium.org/138163013

    git-svn-id: http://skia.googlecode.com/svn/trunk@13495 2bbb7eff-a529-9590-31e7-b0007b416f81
* revert 4799-4801 -- red and blue are reversed on windows and linux (reed@google.com, 2012-07-27)

    git-svn-id: http://skia.googlecode.com/svn/trunk@4803 2bbb7eff-a529-9590-31e7-b0007b416f81
* use SK_RESTRICT instead of __restrict__ (reed@google.com, 2012-07-27)

    git-svn-id: http://skia.googlecode.com/svn/trunk@4801 2bbb7eff-a529-9590-31e7-b0007b416f81
* land http://codereview.appspot.com/6327044/ (reed@google.com, 2012-07-27)

    SSE optimization for 565 pixel format -- by Lei

    git-svn-id: http://skia.googlecode.com/svn/trunk@4799 2bbb7eff-a529-9590-31e7-b0007b416f81
* SSE2 version of blit_lcd16, courtesy of Jin Yang. (tomhudson@google.com, 2012-02-14)

    Yields 25-30% speedup on Windows (32b), 4-7% on Linux (64b, less register pressure), not invoked on Mac (lcd text is 32b instead of 16b).

    Followup: GDI system settings on Windows can suppress LCD text for small fonts, interfering with our benchmarks. (http://code.google.com/p/skia/issues/detail?id=483)

    http://codereview.appspot.com/5617058/

    git-svn-id: http://skia.googlecode.com/svn/trunk@3189 2bbb7eff-a529-9590-31e7-b0007b416f81
* move LCD blits into opts, so they can have assembly versions (reed@google.com, 2011-10-18)

    git-svn-id: http://skia.googlecode.com/svn/trunk@2484 2bbb7eff-a529-9590-31e7-b0007b416f81
* Automatic update of all copyright notices to reflect new license terms. (epoger@google.com, 2011-07-28)

    I have manually examined all of these diffs and restored a few files that seem to require manual adjustment.

    The following files still need to be modified manually, in a separate CL:

    android_sample/SampleApp/AndroidManifest.xml
    android_sample/SampleApp/res/layout/layout.xml
    android_sample/SampleApp/res/menu/sample.xml
    android_sample/SampleApp/res/values/strings.xml
    android_sample/SampleApp/src/com/skia/sampleapp/SampleApp.java
    android_sample/SampleApp/src/com/skia/sampleapp/SampleView.java
    experimental/CiCarbonSampleMain.c
    experimental/CocoaDebugger/main.m
    experimental/FileReaderApp/main.m
    experimental/SimpleCocoaApp/main.m
    experimental/iOSSampleApp/Shared/SkAlertPrompt.h
    experimental/iOSSampleApp/Shared/SkAlertPrompt.m
    experimental/iOSSampleApp/SkiOSSampleApp-Base.xcconfig
    experimental/iOSSampleApp/SkiOSSampleApp-Debug.xcconfig
    experimental/iOSSampleApp/SkiOSSampleApp-Release.xcconfig
    gpu/src/android/GrGLDefaultInterface_android.cpp
    gyp/common.gypi
    gyp_skia
    include/ports/SkHarfBuzzFont.h
    include/views/SkOSWindow_wxwidgets.h
    make.bat
    make.py
    src/opts/memset.arm.S
    src/opts/memset16_neon.S
    src/opts/memset32_neon.S
    src/opts/opts_check_arm.cpp
    src/ports/SkDebug_brew.cpp
    src/ports/SkMemory_brew.cpp
    src/ports/SkOSFile_brew.cpp
    src/ports/SkXMLParser_empty.cpp
    src/utils/ios/SkImageDecoder_iOS.mm
    src/utils/ios/SkOSFile_iOS.mm
    src/utils/ios/SkStream_NSData.mm
    tests/FillPathTest.cpp

    Review URL: http://codereview.appspot.com/4816058

    git-svn-id: http://skia.googlecode.com/svn/trunk@1982 2bbb7eff-a529-9590-31e7-b0007b416f81
* http://codereview.appspot.com/3980041/ (reed@google.com, 2011-03-09)

    Add blitmask procs (with optional platform acceleration)

    patch by yaojie.yan

    git-svn-id: http://skia.googlecode.com/svn/trunk@910 2bbb7eff-a529-9590-31e7-b0007b416f81
* More SSE2-ification; fix for gcc -msse2. (senorblanco@chromium.org, 2009-11-16)

    Review URL: http://codereview.appspot.com/154163

    git-svn-id: http://skia.googlecode.com/svn/trunk@428 2bbb7eff-a529-9590-31e7-b0007b416f81