update topic page on products

author: Gael Guennebaud <g.gael@free.fr> 2010-07-04 10:37:32 +0200
committer: Gael Guennebaud <g.gael@free.fr> 2010-07-04 10:37:32 +0200
commit: 0c25f868c7a834e31a2c58bd6a6bf717270608a5 (patch)
tree: dfb7accf770ecf7520111e5497c8bb1fa8a029bd
parent: 41ea92d3559e03278ecc7d61bfd3f8fc2087697e (diff)
2 files changed, 50 insertions, 26 deletions
diff --git a/doc/I02_HiPerformance.dox b/doc/I02_HiPerformance.dox
index a7d271109..7f0ce1569 100644
--- a/doc/I02_HiPerformance.dox
+++ b/doc/I02_HiPerformance.dox
@@ -1,11 +1,11 @@
 
 namespace Eigen {
 
-/** \page TopicHiPerformance Using Eigen with high performance
+/** \page TopicWritingEfficientProductExpression Writing efficient matrix product expressions
 
 In general achieving good performance with Eigen does no require any special effort:
 simply write your expressions in the most high level way. This is especially true
-for small fixed size matrices. For large matrices, however, it might useful to
+for small fixed size matrices. For large matrices, however, it might be useful to
 take some care when writing your expressions in order to minimize useless evaluations
 and optimize the performance.
 In this page we will give a brief overview of the Eigen's internal mechanism to simplify
@@ -16,7 +16,7 @@ all kind of matrix products and triangular solvers.
 Indeed, in Eigen we have implemented a set of highly optimized routines which are very similar
 to BLAS's ones. Unlike BLAS, those routines are made available to user via a high level and
 natural API. Each of these routines can compute in a single evaluation a wide variety of expressions.
-Given an expression, the challenge is then to map it to a minimal set of primitives.
+Given an expression, the challenge is then to map it to a minimal set of routines.
 As explained latter, this mechanism has some limitations, and knowing them will allow
 you to write faster code by making your expressions more Eigen friendly.
 
@@ -29,20 +29,20 @@ perform the following operation:
 where A, B, and C are column and/or row major matrices (or sub-matrices),
 alpha is a scalar value, and op1, op2 can be transpose, adjoint, conjugate, or the identity.
 When Eigen detects a matrix product, it analyzes both sides of the product to extract a
-unique scalar factor alpha, and for each side its effective storage (order and shape) and conjugate state.
+unique scalar factor alpha, and for each side, its effective storage order, shape, and conjugation states.
 More precisely each side is simplified by iteratively removing trivial expressions such as scalar multiple,
-negate and conjugate. Transpose and Block expressions are not evaluated and they only modify the storage order
+negation and conjugation. Transpose and Block expressions are not evaluated and they only modify the storage order
 and shape. All other expressions are immediately evaluated.
 For instance, the following expression:
-\code m1.noalias() -= s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2)  \endcode
+\code m1.noalias() -= s4 * (s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2))  \endcode
 is automatically simplified to:
-\code m1.noalias() += (s1*s2*conj(s3)) * m2.adjoint() * m3.conjugate() \endcode
+\code m1.noalias() += (s1*s2*conj(s3)*s4) * m2.adjoint() * m3.conjugate() \endcode
 which exactly matches our GEMM routine.
 
 \subsection GEMM_Limitations Limitations
 Unfortunately, this simplification mechanism is not perfect yet and not all expressions which could be
 handled by a single GEMM-like call are correctly detected.
-<table class="tutorial_code">
+<table class="tutorial_code" style="width:100%">
 <tr>
 <td>Not optimal expression</td>
 <td>Evaluated as</td>
@@ -50,16 +50,21 @@ handled by a single GEMM-like call are correctly detected.
 <td>Comments</td>
 </tr>
 <tr>
-<td>\code m1 += m2 * m3; \endcode</td>
-<td>\code temp = m2 * m3; m1 += temp; \endcode</td>
-<td>\code m1.noalias() += m2 * m3; \endcode</td>
+<td>\code
+m1 += m2 * m3; \endcode</td>
+<td>\code
+temp = m2 * m3;
+m1 += temp; \endcode</td>
+<td>\code
+m1.noalias() += m2 * m3; \endcode</td>
 <td>Use .noalias() to tell Eigen the result and right-hand-sides do not alias. 
     Otherwise the product m2 * m3 is evaluated into a temporary.</td>
 </tr>
 <tr>
 <td></td>
 <td></td>
-<td>\code m1.noalias() += s1 * (m2 * m3); \endcode</td>
+<td>\code
+m1.noalias() += s1 * (m2 * m3); \endcode</td>
 <td>This is a special feature of Eigen. Here the product between a scalar
     and a matrix product does not evaluate the matrix product but instead it
     returns a matrix product expression tracking the scalar scaling factor. <br>
@@ -67,32 +72,49 @@ handled by a single GEMM-like call are correctly detected.
     temporary as in the next example.</td>
 </tr>
 <tr>
-<td>\code m1.noalias() += (m2 * m3).transpose(); \endcode</td>
-<td>\code temp = m2 * m3; m1 += temp.transpose(); \endcode</td>
-<td>\code m1.noalias() += m3.adjoint() * m3.adjoint(); \endcode</td>
+<td>\code
+m1.noalias() += (m2 * m3).adjoint(); \endcode</td>
+<td>\code
+temp = m2 * m3;
+m1 += temp.adjoint(); \endcode</td>
+<td>\code
+m1.noalias() += m3.adjoint()
+              * m2.adjoint(); \endcode</td>
 <td>This is because the product expression has the EvalBeforeNesting bit which
     enforces the evaluation of the product by the Tranpose expression.</td>
 </tr>
 <tr>
-<td>\code m1 = m1 + m2 * m3; \endcode</td>
-<td>\code temp = (m2 * m3).lazy(); m1 = m1 + temp; \endcode</td>
-<td>\code m1 += (m2 * m3).lazy(); \endcode</td>
+<td>\code
+m1 = m1 + m2 * m3; \endcode</td>
+<td>\code
+temp = m2 * m3;
+m1 = m1 + temp; \endcode</td>
+<td>\code m1.noalias() += m2 * m3; \endcode</td>
 <td>Here there is no way to detect at compile time that the two m1 are the same,
     and so the matrix product will be immediately evaluated.</td>
 </tr>
 <tr>
-<td>\code m1.noalias() = m4 + m2 * m3; \endcode</td>
-<td>\code temp = m2 * m3; m1 = m4 + temp; \endcode</td>
-<td>\code m1 = m4; m1.noalias() += m2 * m3; \endcode</td>
+<td>\code
+m1.noalias() = m4 + m2 * m3; \endcode</td>
+<td>\code
+temp = m2 * m3;
+m1 = m4 + temp; \endcode</td>
+<td>\code
+m1 = m4;
+m1.noalias() += m2 * m3; \endcode</td>
 <td>First of all, here the .noalias() in the first expression is useless because
     m2*m3 will be evaluated anyway. However, note how this expression can be rewritten
-    so that no temporary is evaluated. (tip: for very small fixed size matrix
+    so that no temporary is required. (tip: for very small fixed size matrix
     it is slighlty better to rewrite it like this: m1.noalias() = m2 * m3; m1 += m4;</td>
 </tr>
 <tr>
-<td>\code m1.noalias() += ((s1*m2).block(....) * m3); \endcode</td>
-<td>\code temp = (s1*m2).block(....); m1 += temp * m3; \endcode</td>
-<td>\code m1.noalias() += s1 * m2.block(....) * m3; \endcode</td>
+<td>\code
+m1.noalias() += (s1*m2).block(..) * m3; \endcode</td>
+<td>\code
+temp = (s1*m2).block(..);
+m1 += temp * m3; \endcode</td>
+<td>\code
+m1.noalias() += s1 * m2.block(..) * m3; \endcode</td>
 <td>This is because our expression analyzer is currently not able to extract trivial
     expressions nested in a Block expression. Therefore the nested scalar
     multiple cannot be properly extracted.</td>
diff --git a/doc/Overview.dox b/doc/Overview.dox
index 8bbdf1f1a..60687fdf9 100644
--- a/doc/Overview.dox
+++ b/doc/Overview.dox
@@ -28,12 +28,14 @@ For a first contact with Eigen, the best place is to have a look at the \ref Get
     - \ref TutorialGeometry
     - \ref TutorialSparseMatrix
   - \ref QuickRefPage
-  - \b Advanced \b topics
+  - <b>Advanced topics</b>
     - \ref TopicLazyEvaluation
     - \ref TopicLinearAlgebraDecompositions
     - \ref TopicCustomizingEigen
     - \ref TopicInsideEigenExample
     - \ref TopicHiPerformance
+  - <b>Topics on getting high performances</b>
+    - \ref TopicWritingEfficientProductExpression
   - <b>Topics related to alignment issues</b>
     - \ref TopicUnalignedArrayAssert
     - \ref TopicFixedSizeVectorizable
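
Note on the scalar-factor example in doc/I02_HiPerformance.dox: the patched page states that `m1.noalias() -= s4 * (s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2))` simplifies to `m1.noalias() += (s1*s2*conj(s3)*s4) * m2.adjoint() * m3.conjugate()`. The sketch below checks that algebraic identity numerically. It deliberately does not use Eigen: `Mat2`, `make`, `original_form`, `simplified_form` and the other helpers are hypothetical names invented for this illustration, and the code says nothing about how Eigen's expression analyzer is implemented. It only confirms that the two forms compute the same matrix.

```cpp
#include <algorithm>
#include <array>
#include <complex>

using Scalar = std::complex<double>;
// Hypothetical tiny 2x2 complex matrix type used only for this check (not Eigen).
using Mat2 = std::array<std::array<Scalar, 2>, 2>;

// Build a 2x2 matrix from its entries in row-major order.
Mat2 make(Scalar a, Scalar b, Scalar c, Scalar d) {
    Mat2 m{};
    m[0][0] = a; m[0][1] = b;
    m[1][0] = c; m[1][1] = d;
    return m;
}

// Plain matrix product a * b.
Mat2 mul(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            for (int k = 0; k < 2; ++k)
                r[i][j] += a[i][k] * b[k][j];
    return r;
}

// Scalar multiple s * a.
Mat2 scale(Scalar s, const Mat2& a) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r[i][j] = s * a[i][j];
    return r;
}

// Entry-wise complex conjugate (no transposition).
Mat2 conjugate(const Mat2& a) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r[i][j] = std::conj(a[i][j]);
    return r;
}

// Conjugate transpose.
Mat2 adjoint(const Mat2& a) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r[i][j] = std::conj(a[j][i]);
    return r;
}

// Entry-wise sum a + b.
Mat2 add(const Mat2& a, const Mat2& b) {
    Mat2 r{};
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            r[i][j] = a[i][j] + b[i][j];
    return r;
}

// Largest absolute entry-wise difference between a and b.
double max_abs_diff(const Mat2& a, const Mat2& b) {
    double d = 0.0;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            d = std::max(d, std::abs(a[i][j] - b[i][j]));
    return d;
}

// m1 -= s4 * (s1 * m2.adjoint() * (-(s3*m3).conjugate()*s2)), evaluated literally.
Mat2 original_form(const Mat2& m1, const Mat2& m2, const Mat2& m3,
                   Scalar s1, Scalar s2, Scalar s3, Scalar s4) {
    Mat2 right = scale(s2, scale(Scalar(-1.0), conjugate(scale(s3, m3))));
    Mat2 left = scale(s1, adjoint(m2));
    Mat2 product = scale(s4, mul(left, right));
    return add(m1, scale(Scalar(-1.0), product));  // the "-=" update
}

// m1 += (s1*s2*conj(s3)*s4) * m2.adjoint() * m3.conjugate(), one GEMM-like step.
Mat2 simplified_form(const Mat2& m1, const Mat2& m2, const Mat2& m3,
                     Scalar s1, Scalar s2, Scalar s3, Scalar s4) {
    Scalar alpha = s1 * s2 * std::conj(s3) * s4;
    return add(m1, scale(alpha, mul(adjoint(m2), conjugate(m3))));
}
```

Calling `original_form` and `simplified_form` with the same inputs and comparing the results with `max_abs_diff` should give a difference at the level of floating-point rounding. The negation inside the right-hand factor is what turns `-=` into `+=`, and the conjugation is what turns `s3` into `conj(s3)` in the extracted factor.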