Update and extend doc on alignment issues.

author: Gael Guennebaud <g.gael@free.fr> 2016-12-11 22:45:32 +0100
committer: Gael Guennebaud <g.gael@free.fr> 2016-12-11 22:45:32 +0100
commit: 57acb05eefec9f59dafc95fd386b3f6d81040962 (patch)
tree: c895a648fddcc794f030ff049ca92b0dfd1d5735
parent: 76fca221347339e210ba3a24d2739d8dd147c15c (diff)
3 files changed, 18 insertions, 9 deletions
diff --git a/doc/FixedSizeVectorizable.dox b/doc/FixedSizeVectorizable.dox
index 49e38af76..0012465ca 100644
--- a/doc/FixedSizeVectorizable.dox
+++ b/doc/FixedSizeVectorizable.dox
@@ -1,6 +1,6 @@
 namespace Eigen {
 
-/** \eigenManualPage TopicFixedSizeVectorizable Fixed-size vectorizable Eigen objects
+/** \eigenManualPage TopicFixedSizeVectorizable Fixed-size vectorizable %Eigen objects
 
 The goal of this page is to explain what we mean by "fixed-size vectorizable".
 
@@ -23,15 +23,15 @@ Examples include:
 
 \section FixedSizeVectorizable_explanation Explanation
 
-First, "fixed-size" should be clear: an Eigen object has fixed size if its number of rows and its number of columns are fixed at compile-time. So for example Matrix3f has fixed size, but MatrixXf doesn't (the opposite of fixed-size is dynamic-size).
+First, "fixed-size" should be clear: an %Eigen object has fixed size if its number of rows and its number of columns are fixed at compile-time. So for example \ref Matrix3f has fixed size, but \ref MatrixXf doesn't (the opposite of fixed-size is dynamic-size).
 
-The array of coefficients of a fixed-size Eigen object is a plain "static array", it is not dynamically allocated. For example, the data behind a Matrix4f is just a "float array[16]".
+The array of coefficients of a fixed-size %Eigen object is a plain "static array", it is not dynamically allocated. For example, the data behind a \ref Matrix4f is just a "float array[16]".
 
 Fixed-size objects are typically very small, which means that we want to handle them with zero runtime overhead -- both in terms of memory usage and of speed.
 
-Now, vectorization (both SSE and AltiVec) works with 128-bit packets. Moreover, for performance reasons, these packets need to be have 128-bit alignment.
+Now, vectorization works with 128-bit packets (e.g., SSE, AltiVec, NEON), 256-bit packets (e.g., AVX), or 512-bit packets (e.g., AVX512). Moreover, for performance reasons, these packets are most efficiently read and written if they have the same alignment as the packet size, that is 16 bytes, 32 bytes, and 64 bytes respectively.
 
-So it turns out that the only way that fixed-size Eigen objects can be vectorized, is if their size is a multiple of 128 bits, or 16 bytes. Eigen will then request 16-byte alignment for these objects, and henceforth rely on these objects being aligned so no runtime check for alignment is performed.
+So it turns out that fixed-size %Eigen objects are best vectorized when their size is a multiple of 16 bytes (or more). %Eigen will then request 16-byte alignment (or more) for these objects, and henceforth rely on these objects being aligned to achieve maximal efficiency.
 
 */
 
diff --git a/doc/PassingByValue.dox b/doc/PassingByValue.dox
index bf4d0ef4b..9254fe6d8 100644
--- a/doc/PassingByValue.dox
+++ b/doc/PassingByValue.dox
@@ -4,21 +4,21 @@ namespace Eigen {
 
 Passing objects by value is almost always a very bad idea in C++, as this means useless copies, and one should pass them by reference instead.
 
-With Eigen, this is even more important: passing \ref TopicFixedSizeVectorizable "fixed-size vectorizable Eigen objects" by value is not only inefficient, it can be illegal or make your program crash! And the reason is that these Eigen objects have alignment modifiers that aren't respected when they are passed by value.
+With %Eigen, this is even more important: passing \ref TopicFixedSizeVectorizable "fixed-size vectorizable Eigen objects" by value is not only inefficient, it can be illegal or make your program crash! And the reason is that these %Eigen objects have alignment modifiers that aren't respected when they are passed by value.
 
-So for example, a function like this, where v is passed by value:
+For example, a function like this, where \c v is passed by value:
 
 \code
 void my_function(Eigen::Vector2d v);
 \endcode
 
-needs to be rewritten as follows, passing v by reference:
+needs to be rewritten as follows, passing \c v by const reference:
 
 \code
 void my_function(const Eigen::Vector2d& v);
 \endcode
 
-Likewise if you have a class having a Eigen object as member:
+Likewise if you have a class having an %Eigen object as member:
 
 \code
 struct Foo
diff --git a/doc/UnalignedArrayAssert.dox b/doc/UnalignedArrayAssert.dox
index 95d95a2d5..0f7022973 100644
--- a/doc/UnalignedArrayAssert.dox
+++ b/doc/UnalignedArrayAssert.dox
@@ -115,6 +115,15 @@ If you want to know why defining EIGEN_DONT_VECTORIZE does not by itself disable
 It doesn't disable the assertion, because otherwise code that runs fine without vectorization would suddenly crash when enabling vectorization.
 It doesn't disable 16-byte alignment, because that would mean that vectorized and non-vectorized code are not mutually ABI-compatible. This ABI compatibility is very important, even for people who develop only an in-house application, as for instance one may want to have in the same application a vectorized path and a non-vectorized path.
 
+\section checkmycode How can I check my code is safe regarding alignment issues?
+
+Unfortunately, C++ offers no way to detect any of the aforementioned shortcomings at compile time (though static analysers are becoming more and more powerful and could detect some of them).
+Even at runtime, all we can do is catch invalid unaligned allocations and trigger the explicit assertion mentioned at the beginning of this page.
+Therefore, if your program runs fine on a given system with some given compilation flags, this does not guarantee that your code is safe. For instance, on most 64-bit systems buffers are aligned on 16-byte boundaries, so if you do not enable the AVX instruction set, your code will run fine. On the other hand, the same code may assert when moved to a more exotic platform, or when enabling AVX instructions, which require 32-byte alignment by default.
+
+The situation is not hopeless though. Assuming your code is well covered by unit tests, you can check its alignment safety by linking it to a custom malloc library returning 8-byte aligned buffers only. This way all alignment shortcomings should show up. To this end, you must also compile your program with \link TopicPreprocessorDirectivesPerformance EIGEN_MALLOC_ALREADY_ALIGNED=0 \endlink.
+
+
 */
 
 }
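
The "custom malloc returning 8-byte aligned buffers only" idea from the new `checkmycode` section can be sketched in plain C++ as follows. This is a minimal illustration, not part of Eigen or of this commit: `Packet4f`, `is_aligned`, `misaligned_malloc` and `misaligned_free` are hypothetical names, and the sketch does not show how to substitute the allocator for the system `malloc` at link time, which is platform-specific. It also assumes that `std::uintptr_t` values map linearly onto addresses, as they do on mainstream platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a fixed-size vectorizable object:
// 4 floats = 16 bytes, with an explicit 16-byte alignment request.
struct alignas(16) Packet4f {
    float data[4];
};

// True if p lies on an `alignment`-byte boundary.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical test allocator: returns a buffer that is 8-byte aligned
// but deliberately NOT 16-byte aligned, so code that silently relies on
// 16-byte aligned heap memory is exposed.
inline void* misaligned_malloc(std::size_t size) {
    // Extra room: stored base pointer + up to 15 bytes of padding + 8-byte shift.
    unsigned char* base = static_cast<unsigned char*>(std::malloc(size + 32));
    if (base == nullptr) return nullptr;

    const std::uintptr_t base_addr = reinterpret_cast<std::uintptr_t>(base);
    const std::uintptr_t start = base_addr + sizeof(void*);
    const std::uintptr_t aligned16 = (start + 15) & ~static_cast<std::uintptr_t>(15);
    const std::size_t offset = static_cast<std::size_t>(aligned16 + 8 - base_addr);

    unsigned char* user = base + offset;
    // Remember the original pointer just before the user buffer.
    std::memcpy(user - sizeof(void*), &base, sizeof(void*));

    assert(is_aligned(user, 8) && !is_aligned(user, 16));
    return user;
}

// Releases a buffer obtained from misaligned_malloc.
inline void misaligned_free(void* user) {
    if (user == nullptr) return;
    unsigned char* base = nullptr;
    std::memcpy(&base, static_cast<unsigned char*>(user) - sizeof(void*), sizeof(void*));
    std::free(base);
}
```

With an allocator of this kind in place, placing a `Packet4f` (or, by analogy, a fixed-size vectorizable object) directly into the returned buffer would violate its alignment requirement. That is the kind of mistake the documented test setup is meant to surface through the unaligned-array assertion.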