path: root/Eigen/src/Core/util/XprHelper.h
author: Francesco Mazzoli <f@mazzo.li> 2020-01-13 15:11:22 +0100
committer: Gael Guennebaud <g.gael@free.fr> 2020-02-07 18:16:16 +0100
commit: 5ca10480b0756e40b0723d90adeba8506291fc7c
tree: 56a2a8f4bddead832bbc60977c7675580fbdac1b /Eigen/src/Core/util/XprHelper.h
parent: f584bd9b308b4e62787aed370bd829474510e598
avoid selecting half-packets when unnecessary
See <https://stackoverflow.com/questions/59709148/ensuring-that-eigen-uses-avx-vectorization-for-a-certain-operation> for an explanation of the problem this solves.

In short, before this commit the half-packet is selected whenever the array / matrix size is not a multiple of `unpacket_traits<PacketType>::size`, where `PacketType` starts out as the full packet. For example, for data of 100 `float`s, `Packet4f` is selected rather than `Packet8f`, because 100 is not a multiple of 8, the size of `Packet8f`.

This commit switches to selecting the half-packet only if the size is less than the packet size, which seems to make more sense.

As I stated in the SO post, I'm not sure that I'm understanding the issue correctly, but this fix resolves the issue in my program. Moreover, `make check` passes, with the exception of lines 614 and 616 in `test/packetmath.cpp`, which also fail on master on my machine:

    CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i0, internal::pbessel_i0);
    ...
    CHECK_CWISE1_IF(PacketTraits::HasBessel, numext::bessel_i1, internal::pbessel_i1);
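The effect of the changed stop condition can be reproduced outside Eigen with a small compile-time sketch. Everything below is a simplified stand-in rather than Eigen's actual code: `Packet4` and `Packet8` are hypothetical types that model only a width and a `half` relationship, and `OldRule`, `NewRule`, and `find_best` are illustrative names mirroring the `Stop` expressions before and after this commit.

```cpp
#include <type_traits>

// Hypothetical stand-ins for Eigen packets: only the width and the
// "half" relationship are modelled. The narrowest packet is its own half.
struct Packet4 { static constexpr int size = 4; using half = Packet4; };
struct Packet8 { static constexpr int size = 8; using half = Packet4; };

constexpr int Dynamic = -1;  // mirrors Eigen's marker for run-time sizes

// Stop condition before the commit: keep the packet only if Size is a
// multiple of its width (or nothing narrower exists).
template <int Size, typename P>
struct OldRule {
  static constexpr bool value = Size == Dynamic || (Size % P::size) == 0 ||
                                std::is_same<P, typename P::half>::value;
};

// Stop condition after the commit: keep the packet as soon as at least
// one full packet fits into Size.
template <int Size, typename P>
struct NewRule {
  static constexpr bool value = Size == Dynamic || Size >= P::size ||
                                std::is_same<P, typename P::half>::value;
};

// Simplified selection loop: halve the packet until the rule says stop.
template <template <int, typename> class Rule, int Size, typename P,
          bool Stop = Rule<Size, P>::value>
struct find_best;

template <template <int, typename> class Rule, int Size, typename P>
struct find_best<Rule, Size, P, true> { using type = P; };

template <template <int, typename> class Rule, int Size, typename P>
struct find_best<Rule, Size, P, false> {
  using type = typename find_best<Rule, Size, typename P::half>::type;
};

// 100 elements: the old rule drops to the 4-wide packet, the new one keeps 8.
static_assert(find_best<OldRule, 100, Packet8>::type::size == 4, "old rule");
static_assert(find_best<NewRule, 100, Packet8>::type::size == 8, "new rule");
// 6 elements: both rules fall back to the 4-wide packet.
static_assert(find_best<NewRule, 6, Packet8>::type::size == 4, "small size");
```

The checks run at compile time, so the snippet only needs to compile. Under these assumptions the two rules differ only for sizes that are at least one full packet wide but not a multiple of the packet width.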
Diffstat (limited to 'Eigen/src/Core/util/XprHelper.h')
-rw-r--r--  Eigen/src/Core/util/XprHelper.h  2
1 files changed, 1 insertions, 1 deletions
diff --git a/Eigen/src/Core/util/XprHelper.h b/Eigen/src/Core/util/XprHelper.h
index fd2db56a4..26aa609fe 100644
--- a/Eigen/src/Core/util/XprHelper.h
+++ b/Eigen/src/Core/util/XprHelper.h
@@ -195,7 +195,7 @@ template<typename T> struct unpacket_traits
};
template<int Size, typename PacketType,
- bool Stop = Size==Dynamic || (Size%unpacket_traits<PacketType>::size)==0 || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value>
+ bool Stop = Size==Dynamic || Size >= unpacket_traits<PacketType>::size || is_same<PacketType,typename unpacket_traits<PacketType>::half>::value>
struct find_best_packet_helper;
template< int Size, typename PacketType>