| Commit message (Collapse) | Author | Age |
The whole point of memcpy32_sse2_unalign is that we didn't align dst128
and src128. So it's not safe at all to cast them back to dst and src.
That tells the compiler that dst/src are 128-bit aligned, and then it
autovectorizes the cleanup while-loop using that (false) knowledge with
aligned SSE instructions.
This leads to crashes on memcpy32_sse2_unalign_10, which is small enough
that we actually get non-16-byte aligned memory. The larger size
benches could be crashing too, but they're big enough allocations that
they're probably always 16-byte aligned anyway.
BUG=skia:2589
R=fmalita@chromium.org, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/291893008
git-svn-id: http://skia.googlecode.com/svn/trunk@14851 2bbb7eff-a529-9590-31e7-b0007b416f81
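
The failure mode described in this commit can be illustrated with a minimal sketch. This is not the actual Skia benchmark code: the function name copy32_sse2_unaligned and its signature are hypothetical. It assumes the fix amounts to keeping dst and src as uint32_t pointers throughout and casting to __m128i* only at the unaligned load/store call sites, so the scalar cleanup loop never sees a pointer that was derived from a 128-bit type. When SSE2 is not available at compile time, the sketch falls back to the scalar loop alone.

```cpp
#include <cassert>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper, not the code from the Skia bench.
// Copies `count` 32-bit values without assuming 16-byte alignment.
void copy32_sse2_unaligned(uint32_t* dst, const uint32_t* src, size_t count) {
    assert(count == 0 || (dst != nullptr && src != nullptr));
#if defined(__SSE2__)
    // Main loop: 16 bytes (four 32-bit values) per iteration, using the
    // unaligned load/store intrinsics. The __m128i* casts exist only
    // inside these calls; dst and src stay uint32_t pointers.
    while (count >= 4) {
        __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst), v);
        src += 4;
        dst += 4;
        count -= 4;
    }
#endif
    // Cleanup loop: at most three values remain when SSE2 is in use.
    // The pointers here were never __m128i*, so the compiler has no
    // reason to assume more than 4-byte alignment if it vectorizes this.
    for (; count > 0; --count) {
        *dst++ = *src++;
    }
}
```

Calling it with pointers that are offset by one element from a 16-byte boundary, and with a count such as 10 that is not a multiple of four, exercises both the unaligned main loop and the cleanup loop.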
|
BUG=skia:2589
R=fmalita@chromium.org, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/292203009
git-svn-id: http://skia.googlecode.com/svn/trunk@14843 2bbb7eff-a529-9590-31e7-b0007b416f81
|
git-svn-id: http://skia.googlecode.com/svn/trunk@14817 2bbb7eff-a529-9590-31e7-b0007b416f81
|
This compares 32-bit copies using memcpy, autovectorization, and, when SSE2 is
available, aligned and unaligned SSE2.
Running this on my desktop (Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz), I see
all four perform essentially the same, except Clang's autovectorization looks
a little better than GCC's. memcpy is calling libc 2.19's __memcpy_sse2_unaligned.
BUG=skia:
R=reed@google.com, qiankun.miao@intel.com, mtklein@google.com
Author: mtklein@chromium.org
Review URL: https://codereview.chromium.org/290533002
git-svn-id: http://skia.googlecode.com/svn/trunk@14799 2bbb7eff-a529-9590-31e7-b0007b416f81
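
A rough sketch of the kind of comparison this commit describes, under stated assumptions. None of the names below (copy32_memcpy, copy32_loop, copy32_sse2_aligned, nanos_per_copy) come from the Skia bench; they are hypothetical, and the timing helper is a plain std::chrono loop rather than Skia's benchmark harness. It shows three of the variants: memcpy, a plain loop left for the compiler to autovectorize, and aligned SSE2, which requires both pointers to be 16-byte aligned.

```cpp
#include <cassert>
#include <chrono>
#include <cstring>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Variant 1: defer to the C library.
void copy32_memcpy(uint32_t* dst, const uint32_t* src, size_t count) {
    std::memcpy(dst, src, count * sizeof(uint32_t));
}

// Variant 2: a plain loop that the compiler may autovectorize.
void copy32_loop(uint32_t* dst, const uint32_t* src, size_t count) {
    for (size_t i = 0; i < count; ++i) dst[i] = src[i];
}

bool is_aligned16(const void* p) {
    return reinterpret_cast<uintptr_t>(p) % 16 == 0;
}

// Variant 3: aligned SSE2. Callers must pass 16-byte aligned pointers.
void copy32_sse2_aligned(uint32_t* dst, const uint32_t* src, size_t count) {
    assert(is_aligned16(dst) && is_aligned16(src));
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 4 <= count; i += 4) {
        __m128i v = _mm_load_si128(reinterpret_cast<const __m128i*>(src + i));
        _mm_store_si128(reinterpret_cast<__m128i*>(dst + i), v);
    }
#endif
    for (; i < count; ++i) dst[i] = src[i];  // scalar tail
}

using CopyFn = void (*)(uint32_t*, const uint32_t*, size_t);

// Average wall-clock nanoseconds per call over `loops` calls.
// An optimizing compiler may simplify repeated identical copies, so
// treat the result as indicative only.
double nanos_per_copy(CopyFn fn, uint32_t* dst, const uint32_t* src,
                      size_t count, int loops) {
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < loops; ++i) fn(dst, src, count);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::nano>(stop - start).count() / loops;
}
```

Passing each variant to nanos_per_copy with the same buffers and a few different counts gives a crude side-by-side comparison; the numbers depend heavily on compiler, flags, and CPU.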
|