author | 2013-08-05 20:25:57 +0000 | |
---|---|---|
committer | 2013-08-05 20:25:57 +0000 | |
commit | 9772a52f0d9e540d2a360dde2aab0ad41c90b1d8 | |
tree | a72b03a99a6528745060c487a1f267c647d64ec6 | |
parent | 67ed64e9aa70f5a95a2d309f9b73dc0009f3ed8c |
Minor sk_memset{16|32}_SSE2 optimization.
Using explicitly indexed references allows some compilers to generate more efficient loops. For gcc 4.6.3:
613c18: 83 ea 10 sub $0x10,%edx
613c1b: 66 0f 7f 07 movdqa %xmm0,(%rdi)
613c1f: 66 0f 7f 47 10 movdqa %xmm0,0x10(%rdi)
613c24: 66 0f 7f 47 20 movdqa %xmm0,0x20(%rdi)
613c29: 66 0f 7f 47 30 movdqa %xmm0,0x30(%rdi)
613c2e: 48 83 c7 40 add $0x40,%rdi
613c32: 83 fa 0f cmp $0xf,%edx
613c35: 7f e1 jg 613c18 <_Z16sk_memset32_SSE2Pjji+0x38>
vs. previous:
613c18: 83 ea 10 sub $0x10,%edx
613c1b: 66 0f 7f 07 movdqa %xmm0,(%rdi)
613c1f: 66 0f 7f 47 10 movdqa %xmm0,0x10(%rdi)
613c24: 66 0f 7f 47 20 movdqa %xmm0,0x20(%rdi)
613c29: 48 83 c7 40 add $0x40,%rdi
613c2d: 83 fa 0f cmp $0xf,%edx
613c30: 66 0f 7f 47 f0 movdqa %xmm0,-0x10(%rdi)
613c35: 7f e1 jg 613c18 <_Z16sk_memset32_SSE2Pjji+0x38>
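This yields a 0.2% to 1% improvement in the memset micro benchmarks. In the previous code the fourth store was scheduled after the add and addressed as -0x10(%rdi), so it depended on the updated %rdi; the gain is presumably due to avoiding a stall on that store after the %rdi increment.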
R=reed@google.com, senorblanco@chromium.org
Author: fmalita@chromium.org
Review URL: https://chromiumcodereview.appspot.com/21703003
git-svn-id: http://skia.googlecode.com/svn/trunk@10545 2bbb7eff-a529-9590-31e7-b0007b416f81