namespace Eigen {

/** \page TutorialSparse Tutorial page 9 - Sparse Matrix
    \ingroup Tutorial

\li \b Previous: \ref TutorialGeometry
\li \b Next: TODO

\b Table \b of \b contents \n
  - \ref TutorialSparseIntro
  - \ref TutorialSparseFilling
  - \ref TutorialSparseFeatureSet
  - \ref TutorialSparseDirectSolvers
<hr>

\section TutorialSparseIntro Sparse matrix representations

In many applications (e.g., finite element methods) it is common to deal with very large matrices where only a few coefficients are different from zero.  In such cases, memory consumption can be reduced and performance increased by using a specialized representation storing only nonzero coefficients. Such a matrix is called a sparse matrix.

\b Declaring \b sparse \b matrices \b and \b vectors \n
The SparseMatrix class is the main sparse matrix representation of Eigen's sparse module; it offers high performance, low memory usage, and compatibility with most sparse linear algebra packages. These advantages come at the cost of some loss of flexibility, particularly during the assembly of the sparse matrix; consequently, a variant called DynamicSparseMatrix is offered which is tailored for low-level sparse matrix assembly. Both of them can be either row major or column major:

\code
#include <Eigen/Sparse>
SparseMatrix<std::complex<float> > m1(1000,2000);         // declare a 1000x2000 col-major compressed sparse matrix of complex<float>
SparseMatrix<double,RowMajor> m2(1000,2000);              // declare a 1000x2000 row-major compressed sparse matrix of double
DynamicSparseMatrix<std::complex<float> > m3(1000,2000);  // declare a 1000x2000 col-major dynamic sparse matrix of complex<float>
DynamicSparseMatrix<double,RowMajor> m4(1000,2000);       // declare a 1000x2000 row-major dynamic sparse matrix of double
\endcode

Although a sparse matrix could also be used to represent a sparse vector, for that purpose it is better to use the specialized SparseVector class:
\code
SparseVector<std::complex<float> > v1(1000); // declare a column sparse vector of complex<float> of size 1000
SparseVector<double,RowMajor> v2(1000);      // declare a row sparse vector of double of size 1000
\endcode
As with dense vectors, the size of a sparse vector denotes its dimension and not the number of nonzero coefficients. At allocation time, neither sparse matrices nor sparse vectors have any nonzero coefficients; they correspond to the "all zeros" matrix or vector.


\b Overview \b of \b the \b internal \b sparse \b storage \n
In order to get the most out of Eigen's sparse objects, it is important to have a rough idea of the way they are represented internally. The SparseMatrix class implements the widely-used Compressed Column (or Row) Storage scheme. It consists of three compact arrays: one for the coefficient values, and two for the indices of the nonzero entries. However, the indices are \em not stored as a direct list of (column, row) pairs; instead, the beginning of each column (or row) is encoded as a pointer index.  For instance, let \c m be a column-major sparse matrix. Then its nonzero coefficients are sequentially stored in memory in column-major order (\em values). A second array of integers stores the respective row index of each coefficient (\em inner \em indices). Finally, a third array of integers, with one entry per column followed by a trailing end marker, stores the index in the previous arrays of the first element of each column (\em outer \em indices).

Here is an example, with the matrix:
<table class="manual">
<tr><td>0</td><td>3</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>22</td><td>0</td><td>0</td><td>0</td><td>17</td></tr>
<tr><td>7</td><td>5</td><td>0</td><td>1</td><td>0</td></tr>
<tr><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>0</td><td>0</td><td>14</td><td>0</td><td>8</td></tr>
</table>

and its internal representation using the Compressed Column Storage format:
<table class="manual">
<tr><td>Values:</td>        <td>22</td><td>7</td><td>3</td><td>5</td><td>14</td><td>1</td><td>17</td><td>8</td></tr>
<tr><td>Inner indices:</td> <td> 1</td><td>2</td><td>0</td><td>2</td><td> 4</td><td>2</td><td> 1</td><td>4</td></tr>
</table>
Outer indices:<table class="manual"><tr><td>0</td><td>2</td><td>4</td><td>5</td><td>6</td><td>\em 8 </td></tr></table>
The final (emphasized) entry is the total number of nonzeros; it marks the end of the last column.
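
The three arrays can be reproduced with a few lines of plain C++. The following is only an illustrative sketch using the standard library, not Eigen's implementation; the names \c CompressedColumns and \c compress are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative compressed column storage; NOT Eigen's actual implementation.
struct CompressedColumns {
  std::vector<double> values;       // nonzero coefficients, column by column
  std::vector<int>    innerIndices; // row index of each stored coefficient
  std::vector<int>    outerIndices; // start of each column, plus one end marker
};

// Compress a dense matrix given as a vector of rows (all rows of equal length).
CompressedColumns compress(const std::vector<std::vector<double> >& dense)
{
  assert(!dense.empty());
  const std::size_t rows = dense.size();
  const std::size_t cols = dense[0].size();
  CompressedColumns ccs;
  for (std::size_t j = 0; j < cols; ++j) {
    // Column j starts where the previous columns ended.
    ccs.outerIndices.push_back(static_cast<int>(ccs.values.size()));
    for (std::size_t i = 0; i < rows; ++i) {
      assert(dense[i].size() == cols);
      if (dense[i][j] != 0.0) {
        ccs.values.push_back(dense[i][j]);
        ccs.innerIndices.push_back(static_cast<int>(i));
      }
    }
  }
  // Trailing end marker: the total number of stored coefficients.
  ccs.outerIndices.push_back(static_cast<int>(ccs.values.size()));
  return ccs;
}
```

Applied to the 5x5 matrix above, \c compress yields the values and inner indices listed in the table, and the outer indices 0, 2, 4, 5, 6 followed by the end marker 8.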

As you might guess, here the storage order is even more important than with dense matrices. We will therefore often make a clear difference between the \em inner and \em outer dimensions. For instance, it is efficient to loop over the coefficients of an \em inner \em vector (e.g., a column of a column-major matrix), but completely inefficient to do the same for an \em outer \em vector (e.g., a row of a column-major matrix).
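
To make the asymmetry concrete, here is a hedged sketch in plain C++ (not Eigen code; \c CompressedColumns, \c sumOfColumn and \c sumOfRow are hypothetical names). Summing a column touches only that column's slice of the arrays, whereas summing a row has to scan every stored coefficient.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative column-major compressed storage; NOT Eigen's implementation.
struct CompressedColumns {
  std::vector<double> values;       // nonzero coefficients, column by column
  std::vector<int>    innerIndices; // row index of each stored coefficient
  std::vector<int>    outerIndices; // start of each column, plus one end marker
};

// Inner vector (a column): visits only the entries stored for column j.
double sumOfColumn(const CompressedColumns& m, std::size_t j)
{
  assert(j + 1 < m.outerIndices.size());
  double sum = 0.0;
  for (int k = m.outerIndices[j]; k < m.outerIndices[j + 1]; ++k)
    sum += m.values[k];
  return sum;
}

// Outer vector (a row): must inspect the row index of every stored coefficient.
double sumOfRow(const CompressedColumns& m, int i)
{
  double sum = 0.0;
  for (std::size_t k = 0; k < m.values.size(); ++k)
    if (m.innerIndices[k] == i)
      sum += m.values[k];
  return sum;
}
```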

The SparseVector class implements the same compressed storage scheme but, of course, without any outer index buffer.

Since all nonzero coefficients of such a matrix are sequentially stored in memory, inserting a new nonzero near the "beginning" of the matrix can be extremely costly. As described below (\ref TutorialSparseFilling), one strategy is to fill nonzero coefficients in order. In cases where this is not possible, Eigen's sparse module also provides a DynamicSparseMatrix class which allows efficient random insertion. DynamicSparseMatrix is essentially implemented as an array of SparseVector, where the values and inner-indices arrays have been split into multiple small and resizable arrays. Assuming the number of nonzeros per inner vector is relatively small, this modification allows for very fast random insertion at the cost of a slight memory overhead (due to extra memory preallocated by each inner vector to avoid an expensive memory reallocation at every insertion) and a loss of compatibility with other sparse libraries used by some of our high-level solvers. Once complete, a DynamicSparseMatrix can be converted to a SparseMatrix to permit usage of these sparse libraries.
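
The difference in insertion cost can be sketched in plain C++ as follows. This is a simplified model under stated assumptions, not Eigen's code: the types and the functions \c insertCompressed and \c insertDynamic are hypothetical, and both assume that the coefficient (i,j) is not stored yet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Illustrative column-major compressed storage; NOT Eigen's implementation.
struct CompressedColumns {
  std::vector<double> values;
  std::vector<int>    innerIndices;
  std::vector<int>    outerIndices; // start of each column, plus one end marker
};

// Compressed storage: everything stored after the new entry has to move.
void insertCompressed(CompressedColumns& m, int i, int j, double value)
{
  assert(j >= 0 && static_cast<std::size_t>(j) + 1 < m.outerIndices.size());
  // Find the slot inside column j that keeps its row indices sorted.
  std::vector<int>::iterator first = m.innerIndices.begin() + m.outerIndices[j];
  std::vector<int>::iterator last  = m.innerIndices.begin() + m.outerIndices[j + 1];
  const std::ptrdiff_t pos = std::lower_bound(first, last, i) - m.innerIndices.begin();
  // Both inserts shift all coefficients stored after pos: O(nnz) work.
  m.values.insert(m.values.begin() + pos, value);
  m.innerIndices.insert(m.innerIndices.begin() + pos, i);
  // Every later column now starts one slot further.
  for (std::size_t k = static_cast<std::size_t>(j) + 1; k < m.outerIndices.size(); ++k)
    ++m.outerIndices[k];
}

// One small resizable array per column: (row, value) pairs sorted by row.
typedef std::vector<std::pair<int, double> > SparseColumn;

// Per-column storage: only the entries of column j can be shifted.
void insertDynamic(std::vector<SparseColumn>& columns, int i, int j, double value)
{
  assert(j >= 0 && static_cast<std::size_t>(j) < columns.size());
  SparseColumn& col = columns[j];
  SparseColumn::iterator it = col.begin();
  while (it != col.end() && it->first < i)
    ++it;
  col.insert(it, std::make_pair(i, value));
}
```

In the second variant the work per insertion is bounded by the number of nonzeros in one inner vector, which is why the text above assumes that this number is relatively small.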

To summarize, it is recommended to use SparseMatrix whenever possible, and reserve the use of DynamicSparseMatrix to assemble a sparse matrix in cases when a SparseMatrix is not flexible enough. The respective pros/cons of both representations are summarized in the following table:

<table class="manual">
<tr><td></td> <td>SparseMatrix</td><td>DynamicSparseMatrix</td></tr>
<tr><td>memory efficiency</td><td>***</td><td>**</td></tr>
<tr><td>sorted insertion</td><td>***</td><td>***</td></tr>
<tr><td>random insertion \n in sorted inner vector</td><td>**</td><td>**</td></tr>
<tr><td>sorted insertion \n in random inner vector</td><td>-</td><td>***</td></tr>
<tr><td>random insertion</td><td>-</td><td>**</td></tr>
<tr><td>coeff wise unary operators</td><td>***</td><td>***</td></tr>
<tr><td>coeff wise binary operators</td><td>***</td><td>***</td></tr>
<tr><td>matrix products</td><td>***</td><td>**(*)</td></tr>
<tr><td>transpose</td><td>**</td><td>***</td></tr>
<tr><td>redux</td><td>***</td><td>**</td></tr>
<tr><td>*= scalar</td><td>***</td><td>**</td></tr>
<tr><td>Compatibility with high-level solvers \n (TAUCS, Cholmod, SuperLU, UmfPack)</td><td>***</td><td>-</td></tr>
</table>


\b Matrix \b and \b vector \b properties \n

Here mat and vec represent any sparse-matrix and sparse-vector type, respectively.

<table class="manual">
<tr><td>Standard \n dimensions</td><td>\code
mat.rows()
mat.cols()\endcode</td>
<td>\code
vec.size() \endcode</td>
</tr>
<tr><td>Sizes along the \n inner/outer dimensions</td><td>\code
mat.innerSize()
mat.outerSize()\endcode</td>
<td></td>
</tr>
<tr><td>Number of \n nonzero coefficients</td><td>\code
mat.nonZeros() \endcode</td>
<td>\code
vec.nonZeros() \endcode</td></tr>
</table>


\b Iterating \b over \b the \b nonzero \b coefficients \n

Iterating over the coefficients of a sparse matrix can be done only in the same order as the storage order. Here is an example:
<table class="manual">
<tr><td>
\code
SparseMatrixType mat(rows,cols);
for (int k=0; k<mat.outerSize(); ++k)
  for (SparseMatrixType::InnerIterator it(mat,k); it; ++it)
  {
    it.value();
    it.row();   // row index
    it.col();   // col index (here it is equal to k)
    it.index(); // inner index, here it is equal to it.row()
  }
\endcode
</td><td>
\code
SparseVector<double> vec(size);
for (SparseVector<double>::InnerIterator it(vec); it; ++it)
{
  it.value(); // == vec[ it.index() ]
  it.index();
}
\endcode
</td></tr>
</table>


\section TutorialSparseFilling Filling a sparse matrix

Because of the special storage scheme of a SparseMatrix, adding new nonzero entries can have consequences for performance. For instance, the cost of a purely random insertion into a SparseMatrix is O(nnz), where nnz is the current number of nonzero coefficients.  In order to cover all use cases with best efficiency, Eigen provides various mechanisms, from the easiest but slowest, to the fastest but most restrictive.

If you don't have any prior knowledge about the order in which your matrix will be filled, then the best choice is to use a DynamicSparseMatrix. With a DynamicSparseMatrix, you can add or modify any coefficient at any time using the coeffRef(row,col) method. Here is an example:
\code
DynamicSparseMatrix<float> aux(1000,1000);
aux.reserve(estimated_number_of_non_zero); // optional
for (...)
  for each j                          // the j can be random
    for each i interacting with j     // the i can be random
      aux.coeffRef(i,j) += foo(i,j);
\endcode
Then the DynamicSparseMatrix object can be converted to a compact SparseMatrix to be used, e.g., by one of our supported solvers:
\code
SparseMatrix<float> mat(aux);
\endcode

In order to optimize this process, instead of the generic coeffRef(i,j) method one can also use:
 - \code m.insert(i,j) = value; \endcode which assumes the coefficient of coordinate (i,j) does not already exist (otherwise this is a programming error and your program will stop).
 - \code m.insertBack(i,j) = value; \endcode which, in addition to the requirements of insert(), also assumes that the coefficient of coordinate (i,j) will be inserted at the end of the target inner-vector. More precisely, if the matrix m is column-major, then the row index of the last nonzero coefficient of the j-th column must be smaller than i.


The SparseMatrix class also supports random insertion via the insert() method. However, it should only be used when the inserted coefficient is nearly the last one of the compact storage array. In practice, this means it should be used only to perform random (or sorted) insertion into the current inner-vector while filling the inner-vectors in increasing order. Moreover, with a SparseMatrix an insertion session must be closed by a call to finalize() before any use of the matrix. Here is an example for a column-major matrix:

\code
SparseMatrix<float> mat(1000,1000);
mat.reserve(estimated_number_of_non_zero);  // optional
for each j                                  // should be in increasing order for performance reasons
  for each i interacting with j             // the i can be random
    mat.insert(i,j) = foo(i,j);
mat.finalize();                             // optional for a DynamicSparseMatrix
\endcode

Finally, the fastest way to fill a SparseMatrix object is to insert the elements in purely increasing order (increasing inner index per outer index, and increasing outer index) using the insertBack() function:

\code
SparseMatrix<float> mat(1000,1000);
mat.reserve(estimated_number_of_non_zero);  // optional
for(int j=0; j<1000; ++j)
{
  mat.startVec(j);                          // optional for a DynamicSparseMatrix
  for each i interacting with j             // with increasing i
    mat.insertBack(i,j) = foo(i,j);
}
mat.finalize();                             // optional for a DynamicSparseMatrix
\endcode
Note that there is also an insertBackByOuterInner(Index outer, Index inner) function which allows one to write code agnostic to the storage order.
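
Why purely sorted insertion is the cheapest case can be seen in a toy model of a column-major builder, given here as a plain C++ sketch. It is not Eigen's implementation, and the names \c ColumnMajorBuilder, \c startColumn, \c appendBack and \c finish are hypothetical stand-ins that loosely mirror startVec(), insertBack() and finalize(). Every operation is a simple append; nothing is ever shifted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy append-only builder for column-major compressed storage.
struct ColumnMajorBuilder {
  std::vector<double> values;
  std::vector<int>    innerIndices;
  std::vector<int>    outerIndices;

  // Begin column j; columns must be started in increasing order, none skipped.
  void startColumn(int j)
  {
    assert(j >= 0 && static_cast<std::size_t>(j) == outerIndices.size());
    outerIndices.push_back(static_cast<int>(values.size()));
  }

  // Append (i, value) to the current column; row indices must strictly increase.
  void appendBack(int i, double value)
  {
    assert(!outerIndices.empty());
    assert(innerIndices.size() == static_cast<std::size_t>(outerIndices.back())
           || innerIndices.back() < i);
    values.push_back(value);
    innerIndices.push_back(i);
  }

  // Close the session by writing the trailing end marker.
  void finish()
  {
    outerIndices.push_back(static_cast<int>(values.size()));
  }
};
```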

\section TutorialSparseFeatureSet Supported operators and functions

In the following \em sm denotes a sparse matrix, \em sv a sparse vector, \em dm a dense matrix, and \em dv a dense vector.
In Eigen's sparse module we chose to expose only the subset of the dense matrix API which can be efficiently implemented. Moreover, not every combination is allowed; for instance, it is not possible to add two sparse matrices having two different storage orders. On the other hand, it is perfectly fine to evaluate a sparse matrix or expression to a matrix having a different storage order:
\code
SparseMatrixType sm1, sm2, sm3;
sm3 = sm1.transpose() + sm2;                    // invalid, because transpose() changes the storage order
sm3 = SparseMatrixType(sm1.transpose()) + sm2;  // correct, because evaluation reformats as column-major
\endcode

Here are some examples of supported operations:
\code
sm1 *= 0.5;
sm4 = sm1 + sm2 + sm3;          // only if sm1, sm2 and sm3 have the same storage order
sm3 = sm1 * sm2;
dv3 = sm1 * dv2;
dm3 = sm1 * dm2;
dm3 = dm2 * sm1;
sm3 = sm1.cwiseProduct(sm2);    // only if sm1 and sm2 have the same storage order
dv2 = sm1.triangularView<Upper>().solve(dv2);
\endcode
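
As an indication of why a product such as <tt>dv3 = sm1 * dv2</tt> maps well onto compressed storage, here is a minimal plain C++ sketch of a column-major sparse matrix times dense vector product. It is an illustration only, not Eigen's kernel; \c CompressedColumns and \c multiply are hypothetical names, and the number of rows is passed explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative column-major compressed storage; NOT Eigen's implementation.
struct CompressedColumns {
  std::vector<double> values;
  std::vector<int>    innerIndices; // row index of each stored coefficient
  std::vector<int>    outerIndices; // start of each column, plus one end marker
};

// y = m * x, accumulating x[j] times column j for each column in turn.
std::vector<double> multiply(const CompressedColumns& m, std::size_t rows,
                             const std::vector<double>& x)
{
  assert(m.outerIndices.size() == x.size() + 1);
  std::vector<double> y(rows, 0.0);
  for (std::size_t j = 0; j < x.size(); ++j)
    for (int k = m.outerIndices[j]; k < m.outerIndices[j + 1]; ++k)
      y[m.innerIndices[k]] += m.values[k] * x[j];
  return y;
}
```

Each stored coefficient is visited exactly once, in storage order, so the cost is proportional to the number of nonzeros.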

The product of a sparse \em symmetric matrix A with a dense matrix (or vector) d can be optimized by specifying the symmetry of A using selfadjointView:
\code
res = A.selfadjointView<>() * d;        // if all coefficients of A are stored
res = A.selfadjointView<Upper>() * d;   // if only the upper part of A is stored
res = A.selfadjointView<Lower>() * d;   // if only the lower part of A is stored
\endcode
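
The idea behind the <tt>Upper</tt> variant can be sketched in plain C++: when only the upper triangle is stored, each off-diagonal coefficient contributes twice, once for itself and once for its mirrored counterpart. This is a hedged illustration, not Eigen's implementation; for brevity it uses a coordinate list rather than compressed storage, and \c Entry and \c symmetricUpperProduct are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One stored coefficient of the upper triangle (row <= col).
struct Entry {
  int    row;
  int    col;
  double value;
};

// res = A * d for a symmetric A of which only the upper triangle is stored.
std::vector<double> symmetricUpperProduct(const std::vector<Entry>& upper,
                                          const std::vector<double>& d)
{
  std::vector<double> res(d.size(), 0.0);
  for (std::size_t k = 0; k < upper.size(); ++k) {
    const Entry& e = upper[k];
    assert(e.row >= 0 && e.row <= e.col);
    assert(static_cast<std::size_t>(e.col) < d.size());
    res[e.row] += e.value * d[e.col];
    if (e.row != e.col)                  // mirror the off-diagonal coefficient
      res[e.col] += e.value * d[e.row];
  }
  return res;
}
```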


\section TutorialSparseDirectSolvers Using the direct solvers

To solve a sparse problem you currently have to use one or several of the following "unsupported" modules:
- \ref SparseExtra_Module
  - \b solvers: SparseLLT<SparseMatrixType>, SparseLDLT<SparseMatrixType> (\#include <Eigen/SparseExtra>)
  - \b notes: built-in basic LLT and LDLT solvers
- \ref CholmodSupport_Module
  - \b solver: SparseLLT<SparseMatrixType, Cholmod> (\#include <Eigen/CholmodSupport>)
  - \b notes: LLT solving using Cholmod, requires a SparseMatrix object (recommended for symmetric/selfadjoint problems)
- \ref UmfPackSupport_Module
  - \b solver: SparseLU<SparseMatrixType, UmfPack> (\#include <Eigen/UmfPackSupport>)
  - \b notes: LU solving using UmfPack, requires a SparseMatrix object (recommended for square matrices)
- \ref SuperLUSupport_Module
  - \b solver: SparseLU<SparseMatrixType, SuperLU> (\#include <Eigen/SuperLUSupport>)
  - \b notes: LU solving using SuperLU, requires a SparseMatrix object (recommended for square matrices)
- \ref TaucsSupport_Module
  - \b solver: SparseLLT<SparseMatrixType, Taucs> (\#include <Eigen/TaucsSupport>)
  - \b notes: LLT solving using Taucs, requires a SparseMatrix object (not recommended)

\warning These modules are currently considered to be unsupported because 1) they are not documented, and 2) their API is likely to change in the future.

Here is a typical example:
\code
#include <Eigen/UmfPackSupport>
// ...
SparseMatrix<double> A;
// fill A
VectorXd b, x;
// fill b
// solve Ax = b using UmfPack:
SparseLU<SparseMatrix<double>,UmfPack> lu_of_A(A);
if(!lu_of_A.succeeded()) {
  // decomposition failed
  return;
}
if(!lu_of_A.solve(b,&x)) {
  // solving failed
  return;
}
\endcode

See also the class SparseLLT, class SparseLU, and class SparseLDLT.

\li \b Next: TODO

*/

}