diff options
author | Benoit Steiner <benoit.steiner.goog@gmail.com> | 2016-05-17 09:17:26 -0700 |
---|---|---|
committer | Benoit Steiner <benoit.steiner.goog@gmail.com> | 2016-05-17 09:17:26 -0700 |
commit | 5fa27574dd29fc5753a0b9d39b1fe5c15668e658 (patch) | |
tree | 25b9d2f33d4c1cbfb4b36412ebce9ea8c763da00 /unsupported/Eigen/CXX11/src | |
parent | 86da77cb9bc12e716cc909ada59379c0d16bb22b (diff) |
Allow vectorized padding on GPU. This helps speed things up a little
Before:
BM_padding/10 5000000 460 217.03 MFlops/s
BM_padding/80 5000000 460 13899.40 MFlops/s
BM_padding/640 5000000 461 888421.17 MFlops/s
BM_padding/4K 5000000 460 54316322.55 MFlops/s
After:
BM_padding/10 5000000 454 220.20 MFlops/s
BM_padding/80 5000000 455 14039.86 MFlops/s
BM_padding/640 5000000 452 904968.83 MFlops/s
BM_padding/4K 5000000 411 60750049.21 MFlops/s
Diffstat (limited to 'unsupported/Eigen/CXX11/src')
-rw-r--r-- | unsupported/Eigen/CXX11/src/Tensor/TensorPadding.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/unsupported/Eigen/CXX11/src/Tensor/TensorPadding.h b/unsupported/Eigen/CXX11/src/Tensor/TensorPadding.h index 742339f9e..266eea0a0 100644 --- a/unsupported/Eigen/CXX11/src/Tensor/TensorPadding.h +++ b/unsupported/Eigen/CXX11/src/Tensor/TensorPadding.h @@ -93,7 +93,7 @@ struct TensorEvaluator<const TensorPaddingOp<PaddingDimensions, ArgType>, Device static const int PacketSize = internal::unpacket_traits<PacketReturnType>::size; enum { - IsAligned = false, + IsAligned = true, PacketAccess = TensorEvaluator<ArgType, Device>::PacketAccess, Layout = TensorEvaluator<ArgType, Device>::Layout, CoordAccess = true, |