path: root/src/opts/SkBlitRow_opts_SSE4.h
Commit message (Author, Age)
* Disable SSE4 S32A_Opaque blit. (mtklein, 2014-09-03)
    This code sometimes generates a build warning that bothers Chrome.

    BUG=399842,skia:2906
    R=reed@google.com, mtklein@google.com
    Author: mtklein@chromium.org
    Review URL: https://codereview.chromium.org/538463003
* Exclude Clang on Windows too. Comment this up a bit. (mtklein, 2014-07-02)
    BUG=391016
    R=tomhudson@chromium.org, mtklein@google.com, rnk@chromium.org, thakis@chromium.org
    Author: mtklein@chromium.org
    Review URL: https://codereview.chromium.org/363983004
* Disable assembly code in MemorySanitizer builds. (Mike Klein, 2014-07-02)
    MemorySanitizer is an uninitialized memory use detector which is used in
    Chromium, and does not presently support assembly code.

    BUG=chromium:344505, chromium:373739
    R=mtklein@google.com
    Review URL: https://codereview.chromium.org/367973005
* Hide symbols in S32A_Opaque_BlitRow32_SSE4 (henrik.smiding, 2014-07-01)
    Marks the symbols in the S32A_Opaque_BlitRow32_SSE4 files as hidden, so
    Chromium can build. Also enables the optimizations.

    Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    R=mtklein@google.com, joakim.landberg@intel.com
    Author: henrik.smiding@intel.com
    Review URL: https://codereview.chromium.org/368573002
* Revert of Re-enable SSE4. (https://codereview.chromium.org/357593003/) (mtklein, 2014-06-30)
    Reason for revert:
    Mac Chrome builders still failing.

    Original issue's description:
    > Re-enable SSE4.
    >
    > I will roll this into Chrome with https://codereview.chromium.org/332393003.
    >
    > BUG=skia:
    >
    > Committed: https://skia.googlesource.com/skia/+/a75b0fadbdec4214afec6dd727fd224d34ed164f

    R=reed@google.com, mtklein@chromium.org
    TBR=mtklein@chromium.org, reed@google.com
    NOTREECHECKS=true
    NOTRY=true
    BUG=skia:
    Author: mtklein@google.com
    Review URL: https://codereview.chromium.org/337093004
* Re-enable SSE4. (mtklein, 2014-06-30)
    I will roll this into Chrome with https://codereview.chromium.org/332393003.

    BUG=skia:
    R=reed@google.com, mtklein@google.com
    Author: mtklein@chromium.org
    Review URL: https://codereview.chromium.org/357593003
* Disable SSE4 code. (mtklein, 2014-06-27)
    Chrome canary failing to link chrome:
    http://108.170.220.120:10115/builders/Canary-Chrome-Ubuntu13.10-Ninja-x86_64-ToT/builds/1009/steps/BuildChrome/logs/stdio

    BUG=skia:
    NOTRY=true
    R=mtklein@google.com, rmistry@google.com
    Author: mtklein@chromium.org
    Review URL: https://codereview.chromium.org/361493002
* Add SSE4 optimization of S32A_Opaque_Blitrow (henrik.smiding, 2014-06-27)
    Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
    instruction set. Special case for when alpha is zero or opaque.

    Performance increase of 10%-400% compared to the existing SSE2
    optimization (measured on Silvermont architecture).
    Noticeable in ~25 different skia bench subtests, especially in
    bitmap_8888_*, repeatTile_*, and morph_*.

    bitmap_8888_A - 100% faster
    bitmap_8888_A_source_transparent - 250% faster
    bitmap_8888_A_source_opaque - 25% faster
    bitmap_8888_A_scale_bicubic - 75% faster

    Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
    Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9
    R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
    Author: henrik.smiding@intel.com
    Review URL: https://codereview.chromium.org/289473009
* Revert of Add SSE4 optimization of S32A_Opaque_Blitrow (https://codereview.chromium.org/289473009/) (mtklein, 2014-06-17)
    NOTREECHECKS=true
    NOTRY=true

    Reason for revert:
    Valgrind bot's seeing this code use uninitialized memory, and it's somehow
    blocking our roll into Chrome too:

    > ld: warning: could not create compact unwind for S32A_Opaque_BlitRow32_SSE4_asm:
    > stack subq instruction is too different from dwarf stack size
    > [10339/10982 | 3247.792] PACKAGE FRAMEWORK "Chromium Framework.framework",
    > POSTBUILDS
    > FAILED: ./gyp-mac-tool package-framework "Chromium Framework.framework" A &&
    > (export
    > BUILT_PRODUCTS_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
    > export CONFIGURATION=Release; export CONTENTS_FOLDER_PATH="Chromium
    > Framework.framework/Versions/A"; export
    > DYLIB_INSTALL_NAME_BASE=@executable_path/../Versions/37.0.2056.0; export
    > EXECUTABLE_NAME="Chromium Framework"; export EXECUTABLE_PATH="Chromium
    > Framework.framework/Versions/A/Chromium Framework"; export
    > FULL_PRODUCT_NAME="Chromium Framework.framework"; export
    > INFOPLIST_PATH="Chromium Framework.framework/Versions/A/Resources/Info.plist";
    > export LD_DYLIB_INSTALL_NAME="@executable_path/../Versions/37.0.2056.0/Chromium
    > Framework.framework/Chromium Framework"; export MACH_O_TYPE=mh_dylib; export
    > PRODUCT_NAME="Chromium Framework"; export
    > PRODUCT_TYPE=com.apple.product-type.framework; export
    > SDKROOT=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.6.sdk;
    > export
    > SRCROOT=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/../../chrome;
    > export SOURCE_ROOT="${SRCROOT}"; export
    > TARGET_BUILD_DIR=/Volumes/data/b/build/slave/mac_gpu/build/src/out/Release;
    > export TEMP_DIR="${TMPDIR}"; export UNLOCALIZED_RESOURCES_FOLDER_PATH="Chromium
    > Framework.framework/Versions/A/Resources"; export WRAPPER_NAME="Chromium
    > Framework.framework"; (cd ../../chrome && ../build/mac/tweak_info_plist.py
    > "--breakpad=1" "--breakpad_uploads=0" "--keystone=0" "--scm=1"
    > "--branding=Chromium" && ln -fns Versions/Current/Libraries
    > "${BUILT_PRODUCTS_DIR}/${WRAPPER_NAME}/Libraries" &&
    > tools/build/mac/verify_order _ChromeMain
    > "${BUILT_PRODUCTS_DIR}/${EXECUTABLE_PATH}"); G=$?; ((exit $G) || rm -rf
    > 'Chromium Framework.framework') && exit $G) && touch "Chromium
    > Framework.framework"
    > tools/build/mac/verify_order: unordered symbols in
    > /Volumes/data/b/build/slave/mac_gpu/build/src/out/Release/Chromium
    > Framework.framework/Versions/A/Chromium Framework:
    > S32A_Opaque_BlitRow32_SSE4_asm
    > _S32A_Opaque_BlitRow32_SSE4_asm
    > ninja: build stopped: subcommand failed.

    Original issue's description:
    > Add SSE4 optimization of S32A_Opaque_Blitrow
    >
    > Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
    > instruction set. Special case for when alpha is zero or opaque.
    >
    > Performance increase of 10%-400% compared to the existing SSE2
    > optimization (measured on Silvermont architecture).
    > Noticeable in ~25 different skia bench subtests, especially in
    > bitmap_8888_*, repeatTile_*, and morph_*.
    >
    > bitmap_8888_A - 100% faster
    > bitmap_8888_A_source_transparent - 250% faster
    > bitmap_8888_A_source_opaque - 25% faster
    > bitmap_8888_A_scale_bicubic - 75% faster
    >
    > Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    >
    > Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
    >
    > Committed: https://skia.googlesource.com/skia/+/b5c281e1e06af3be804309877de1dac6145686b9

    R=reed@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com, henrik.smiding@intel.com, mtklein@chromium.org
    Author: mtklein@google.com
    Review URL: https://codereview.chromium.org/336413007
* Add SSE4 optimization of S32A_Opaque_Blitrow (henrik.smiding, 2014-06-17)
    Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
    instruction set. Special case for when alpha is zero or opaque.

    Performance increase of 10%-400% compared to the existing SSE2
    optimization (measured on Silvermont architecture).
    Noticeable in ~25 different skia bench subtests, especially in
    bitmap_8888_*, repeatTile_*, and morph_*.

    bitmap_8888_A - 100% faster
    bitmap_8888_A_source_transparent - 250% faster
    bitmap_8888_A_source_opaque - 25% faster
    bitmap_8888_A_scale_bicubic - 75% faster

    Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e
    R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
    Author: henrik.smiding@intel.com
    Review URL: https://codereview.chromium.org/289473009
* Revert of Add SSE4 optimization of S32A_Opaque_Blitrow (https://codereview.chromium.org/289473009/) (jvanverth, 2014-06-05)
    Reason for revert:
    Buildbot failures on Mac 10.6 and Mac 10.7.

    R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com, henrik.smiding@intel.com
    TBR=reed@google.com
    NOTRY=True

    Original issue's description:
    > Add SSE4 optimization of S32A_Opaque_Blitrow
    >
    > Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
    > instruction set. Special case for when alpha is zero or opaque.
    >
    > Performance increase of 10%-400% compared to the existing SSE2
    > optimization (measured on Silvermont architecture).
    > Noticeable in ~25 different skia bench subtests, especially in
    > bitmap_8888_*, repeatTile_*, and morph_*.
    >
    > bitmap_8888_A - 100% faster
    > bitmap_8888_A_source_transparent - 250% faster
    > bitmap_8888_A_source_opaque - 25% faster
    > bitmap_8888_A_scale_bicubic - 75% faster
    >
    > Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    >
    > Committed: https://skia.googlesource.com/skia/+/e2527b147679b0c43019fae7d59cc3777d2d097e

    Author: jvanverth@google.com
    Review URL: https://codereview.chromium.org/311053009
* Add SSE4 optimization of S32A_Opaque_Blitrow (henrik.smiding, 2014-06-05)
    Adds optimization of Skia S32A_Opaque_Blitrow blitter using SSE4.2 SIMD
    instruction set. Special case for when alpha is zero or opaque.

    Performance increase of 10%-400% compared to the existing SSE2
    optimization (measured on Silvermont architecture).
    Noticeable in ~25 different skia bench subtests, especially in
    bitmap_8888_*, repeatTile_*, and morph_*.

    bitmap_8888_A - 100% faster
    bitmap_8888_A_source_transparent - 250% faster
    bitmap_8888_A_source_opaque - 25% faster
    bitmap_8888_A_scale_bicubic - 75% faster

    Signed-off-by: Henrik Smiding <henrik.smiding@intel.com>
    R=reed@google.com, mtklein@google.com, tomhudson@google.com, djsollen@google.com, joakim.landberg@intel.com
    Author: henrik.smiding@intel.com
    Review URL: https://codereview.chromium.org/289473009
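
Note on the "special case for when alpha is zero or opaque" mentioned in the commits above: the following is a minimal scalar sketch of that idea, not the SSE4 code from these commits. It assumes premultiplied 32-bit pixels with alpha in the top byte and a source-over blend; the names scale_pixel and blit_row_srcover are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: multiply each 8-bit channel of a packed 32-bit pixel
// by scale/256, with scale in [0, 256]. Two channels are processed per multiply.
static inline uint32_t scale_pixel(uint32_t c, uint32_t scale) {
    const uint32_t mask = 0x00FF00FFu;
    uint32_t lo = (((c & mask) * scale) >> 8) & mask;   // bytes 0 and 2
    uint32_t hi = (((c >> 8) & mask) * scale) & ~mask;  // bytes 1 and 3
    return lo | hi;
}

// Hypothetical scalar blit: source-over of premultiplied src onto dst.
// Assumption: alpha lives in bits 24..31 of each pixel.
void blit_row_srcover(uint32_t* dst, const uint32_t* src, int count) {
    assert(count >= 0);
    for (int i = 0; i < count; ++i) {
        const uint32_t s = src[i];
        const uint32_t a = s >> 24;
        if (a == 0) {
            continue;        // fully transparent source: dst is left untouched
        }
        if (a == 255) {
            dst[i] = s;      // fully opaque source: plain copy, no blend math
            continue;
        }
        // General case: dst = src + dst * (256 - alpha) / 256 per channel.
        dst[i] = s + scale_pixel(dst[i], 256 - a);
    }
}
```

The two early-outs skip the multiply entirely, which is consistent with the larger gains the commit message reports for the source_transparent benchmark; a SIMD version would presumably apply the same tests to several pixels at once.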