aboutsummaryrefslogtreecommitdiffhomepage
path: root/tvmet-1.7.1/doc/notes.dox
blob: 2b07ed4a53649b1ad8c7e12757027e477afc6fb5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
/*
 * $Id: notes.dox,v 1.20 2004/04/11 01:29:41 opetzold Exp $
 */

/**
   \page notes Some Notes ...

   <p>Contents:</p>
   -# \ref optimizing
   -# \ref temporaries
   -# \ref operators
   -# \ref threads
   -# \ref expressions
   -# \ref adl
   -# \ref alias
   -# \ref spec_meta_func
   -# \ref mmv




   \section optimizing ... on optimizing

   This depends heavily on compiler and the flags used. The code produced with
   -O could be better than with -O2 even on gcc-2.9x suite. To get the best
   results, you should examine the assembler code generated by your compiler.
   Maybe I will write a benchmark suite for different compiler options one day.
   (Maybe you could contribute?)






   \section temporaries ... on temporaries

   The use of expression templates (ET) and meta templates (MT) allows the
   generated code to avoid the creation of many temporaries -- especially with
   standard mathematical and assignment operations. There are times that you
   have to use actual temporaries e.g. when swapping variables of type
   double -- with integer values you can use the XOR operator.

   Some implementations are using a loop with temporaries even if there is a
   solution with ET. Than the loops are faster than MT.

   \sa \ref mmv





   \section operators ... on operators and namespace element_wise

   Some operations on matrices and vectors are not available at first glance.
   These are defined in the namespace <code>element_wise</code> because they
   are element wise (and not strictly mathematical) operations.

   But there is more: some functions do element wise operations per se (e.g.
   vector addition) and are NOT inside namespace element_wise. Furthermore,
   all comparison operators perform element wise operations.

   \sa \ref compare





   \section threads ... about Threads

   This library is not thread safe. It's designed for small math operations where
   the overhead for locking policies is too great.  If you require locking for
   a multi-threaded application, you will need to write a wrapper.





   \section expressions ... on expressions

   The first versions of %tvmet had only one expression (Xpr) which was shared
   for both vectors and matrices.  This was working fine, but limited tvmet's
   use for arithmetic expressions expressions on complex values. For this
   reason, I had to separate expression types for vectors and matrices. The
   same problem appeared when using the eval() function for evaluating these
   expressions. (Which index operator should handle it?) Unfortunately, the
   use of separate expression types vastly increases the number of operators
   and functions needed to make the library viable. Fortunately, most boundary
   checks are not necessary since they are done at compile time (such as those
   needed by the assignment operator, etc).





   \section adl ... on namespaces and Koenig Lookup

   IMO, the cleanest way would be to put all functions into their own
   namespace <code>Functional</code> instead of naming them with the
   <code>fncl_</code> prefix they currently have. (I did beforehand, and
   have thought better since).  Unfortunately, this technique doesn't work well.
   I got compiler errors like:

   \code
   template <class T> Functional::xyt<T>' is not a function
   conflict with `template <class E> xyz(Xpr<E>)' in call to `xyz'
   \endcode

   when trying:

   \code
   typedef Vector<double, 3> vector3d;
   vector3d t1(1,2,3);
   vector3d t2(t1);
   vector3d r;
   r = sqrt( t1 * t2 );
   \endcode

   ADL (argument dependent lookup), aka Koenig Lookup, is causing the
   compiler to check for a match in namespace Functional, since the
   template instantiation is part of Functional (the Xpr stuff), it matches
   before the global namespace (the Vector stuff) is checked. Writing:

   \code
   r = ::sqrt( t1 * t2 );
   \endcode

   seems to solve the problem at first glance. However, to force the user of
   the library into this syntax is painful and could probably run cause other
   problems with other namespaces (I haven't checked this). Therefore, all
   "Functionals" have the prefix fncl_.





   \section alias ... about aliasing

   tvmet assumes that all matrices and vectors are alias free. These means that
   source and destination memory layout of matrices and vectors never overlaps
   (during an operation).

   This is very easy to understood if you see a matrix-vector product. Both
   contain different data in different (unique, non-overlapping) memory
   regions -- hence, they are alias free. Contrast this with a matrix-matrix
   multiply which maybe can have an aliasing, e.g. \f$A = A * B\f$.
   When source and destination memory regions are the same, the computed results
   may be wrong. (Probably they will be.) But, \f$C = A * B\f$ is alias free.

   Let's see an example in detail:

   \par Example:
   \code
   Matrix<double,3,3>		M1;
   M1 = 1,2,3,4,5,6,7,8,9;

   cout << "M1 = " << M1 << endl;
   M1 = trans(M1);
   cout << "M1 = " << M1 << endl;
   \endcode

   \par Output:
   \code
   M1 = Matrix<d, 3, 3> = [
     [1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]
   ]
   M1 = Matrix<d, 3, 3> = [
     [1, 4, 7],
     [4, 5, 8],
     [7, 8, 9]
   ]
   \endcode

   As you can see, the lower triangular matrix isn't what you expected due to
   the aliasing. These results depends on the compiler optimizations, too.

   Unfortunately, to avoid the aliasing problem, you must use temporaries
   as shown here:

   \par Example:
   \code
    matrix_type 				temp_A(A);
    A = temp_A * B;
    cout << "matrix_type temp_A(A);\n"
         << "A = temp_A * B = " << A << endl;
   \endcode

   Anyway, it seems there is a small exception (no guarantee, since it's
   compiler dependent I assume) for element wise operations with matrices
   or vectors on right hand side.

   Starting with tvmet release 1.4.1 there is a new function alias. These
   function use a proxy to call special member functions of the %Matrix/Vector
   class. These member functions introduce the temporary for you.

   \par Example:
   \code
   typedef tvmet::Matrix<double, 3, 3>		matrix_type;
   matrix_type					M;
   std::generate(M.begin(), M.end(),
                 tvmet::util::Incrementor<matrix_type::value_type>());
   std::cout << "M = " << M << std::endl;

   alias(M) = trans(M);
   std::cout << "M = " << M << std::endl;
   \endcode

   with the expected

   \par Output:
   \code
   M = [
     [1, 4, 7],
     [2, 5, 8],
     [3, 6, 9]
   ]
   \endcode

   These function/proxy will work for the element wise operators +=, -=, *= and /=
   with expressions, e.g. as trans() returns.

   \sa \ref assign_op





   \section spec_meta_func ... special Meta-Template Functions

   From a principle point of view, there is no need for some special functions
   for %Matrix and %Vector functions, namely \f$M^T\, x\f$, \f$M^T\,M\f$,
   \f$M\,M^T\f$, and \f$(M\,M)^T\f$.

   Unfortunately, the g++ compiler throws in the towel sometimes even on
   transposing matrices. Because of this, %tvmet offers specialized functions
   which speed up at runtime (about factor 2 ... 3) using meta templates.

   \par Example:
   \code
   using namespace tvmet;

   Matrix<double, 6, 3>			M1(0); // will be transposed to be conform to vector size
   Vector<double, 6> 			v1(0);
   Vector<double, 3> 			v2(0);

   M1 = ...
   v1 = ...

   v2 = Mtx_prod(M1, v1);		// equal to: v2 = trans(M1)*v1;
   \endcode

   BTW, the %Matrix-%Matrix \f$M\,M\f$ and %Matrix-%Vector \f$M\,x\f$
   products use Meta-Templates, too.

   \sa \ref Mtx_prod
   \sa \ref MMt_prod
   \sa \ref MtM_prod
   \sa \ref trans_prod





   \section mmv ... about Matrix-Matrix-Vector and Matrix-Matrix-Matrix-operations

   The problem is related to the optimizer - due to the expression and meta
   templates used.

   Internally, an expression template may contain other expression templates
   (meta templates inside as well as) too - the compiler will unroll all of
   these expression into a single resultant expression (which is a hard job).
   Sometimes the code generated from this is worse (from a performance point
   of view) than just using simple temporaries.

   You can chain matrix-matrix and matrix-vector operations without writing
   temporaries by yourself (if this is what you want).

   \par from examples/hspiess.cc:
   \code
   tvmet::Matrix<double,3,2>	B;
   tvmet::Matrix<double,3,3>	D;
   tvmet::Matrix<double,2,2>	K;

   B =
     -0.05,  0,
      0,     0.05,
      0.05, -0.05;
   D =
      2000, 1000, 0,
      1000, 2000, 0,
      0,    0,    500;

   K = trans(B) * D * B;
   \endcode

   The performance can be sub optimal due to the increasing complexity
   of operations. This can be reduced by a user specified temporary:

   \par from examples/hspiess.cc:
   \code
   // as before

   K = tvmet::Matrix<double,2,3>(trans(B) * D) * B;
   \endcode

   or
   \code
   K = prod(tvmet::Matrix<double,2,3>(prod(trans(B), D)), B);
   \endcode

   At this moment an intelligent cache and pre-evaluating strategy is
   missing by %tvmet.

   \sa \ref spec_meta_func
   \sa some notes \ref temporaries



*/


// Local Variables:
// mode:c++
// End: