aboutsummaryrefslogtreecommitdiffhomepage
path: root/doc/TutorialSparse.dox
blob: 2b3fdc61c5c165db44e833e02f129e75e5b686b7 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
namespace Eigen {

/** \page TutorialSparse Tutorial - Getting started with the sparse module
    \ingroup Tutorial

<div class="eimainmenu">\ref index "Overview"
  | \ref TutorialCore "Core features"
  | \ref TutorialGeometry "Geometry"
  | \ref TutorialAdvancedLinearAlgebra "Advanced linear algebra"
  | \b Sparse \b matrix
</div>

\b Table \b of \b contents \n
  - \ref TutorialSparseIntro
  - \ref TutorialSparseFilling
  - \ref TutorialSparseFeatureSet
  - \ref TutorialSparseDirectSolvers
<hr>

\section TutorialSparseIntro Introduction

In many applications (e.g., finite element methods) it is common to deal with very large matrices where only a few coefficients are different than zero. Both in term of memory consumption and performance, it is fundamental to use an adequate representation storing only nonzero coefficients. Such a matrix is called a sparse matrix.

In order to get the best of the Eigen's sparse matrix, it is important to have a rough idea of the way they are internally stored. In Eigen we chose to use the common and generic Compressed Column/Row Storage scheme. Let m be a column-major sparse matrix. Then its nonzero coefficients are sequentially stored in memory in a col-major order (\em values). A second array of integer stores the respective row index of each coefficient (\em inner \em indices). Finally, a third array of integer, having the same length than the number of columns, stores the index in the previous arrays of the first element of each column (\em outer \em indices).

Here is an example, with the matrix:
<table>
<tr><td>0</td><td>3</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>22</td><td>0</td><td>0</td><td>0</td><td>17</td></tr>
<tr><td>7</td><td>5</td><td>0</td><td>1</td><td>0</td></tr>
<tr><td>0</td><td>0</td><td>0</td><td>0</td><td>0</td></tr>
<tr><td>0</td><td>0</td><td>14</td><td>0</td><td>8</td></tr>
</table>

and its internal representation using the Compressed Column Storage format:
<table>
<tr><td>Values:</td>        <td>22</td><td>7</td><td>3</td><td>5</td><td>14</td><td>1</td><td>17</td><td>8</td></tr>
<tr><td>Inner indices:</td> <td> 1</td><td>2</td><td>0</td><td>2</td><td> 4</td><td>2</td><td> 1</td><td>4</td></tr>
</table>
Outer indices:<table><tr><td>0</td><td>2</td><td>4</td><td>5</td><td>6</td><td>\em 7 </td></tr></table>

As you can guess, here the storage order is even more important than with dense matrix. We will therefore often make a clear difference between the \em inner dimension and the \em outer dimension. For instance, it is easy to loop over the coefficients of an \em inner \em vector (e.g., a column of a col-major matrix), but very inefficient to do the same for an \em outer \em vector (e.g., a row of a col-major matrix).


\b Eigen's \b sparse \b matrix \b and \b vector \b class \n
Before using any sparse feature you must include the Sparse header file:
\code
#include <Eigen/Sparse>
\endcode
In Eigen, sparse matrices are handled by the SparseMatrix class:
\code
SparseMatrix<std::complex<float> > m1(1000,2000); // declare a 1000x2000 col-major sparse matrix of complex<float>
SparseMatrix<double,RowMajor> m2(1000,2000);      // declare a 1000x2000 row-major sparse matrix of double
\endcode
where \c Scalar is the type of the coefficient and \c Options is a set of bit flags allowing to control the shape and storage of the matrix. A typical choice for \c Options is \c RowMajor to get a row-major matrix (the default is col-major).

Unlike the dense path, the sparse module provides a special class to declare a sparse vector:
\code
SparseVector<std::complex<float> > v1(1000); // declare a column sparse vector of complex<float> of size 1000
SparseVector<double,RowMajor> v2(1000);      // declare a row sparse vector of double of size 1000
\endcode
Note that here the size of a vector denotes its dimension and not the number of nonzero coefficients which is initially zero.

\b Matrix \b and \b vector \b properties \n

<table>
<tr><td>Standard \n dimensions</td><td>\code
mat.rows()
mat.cols()\endcode</td>
<td>\code
vec.size() \endcode</td>
</tr>
<tr><td>Sizes along the \n inner/outer dimensions</td><td>\code
mat.innerSize()
mat.outerSize()\endcode</td>
<td></td>
</tr>
<tr><td>Number of non \n zero coefficiens</td><td>\code
mat.nonZeros() \endcode</td>
<td>\code
vec.nonZeros() \endcode</td></tr>
</table>

\b Iterating \b over \b the \b nonzero \b coefficients \n

Iterating over the coefficients of a sparse matrix can be done only in the same order than the storage order. Here is an example:
<table>
<tr><td>
\code
SparseMatrix<double> m1(rows,cols);
for (int k=0; k<m1.outerSize(); ++k)
  for (SparseMatrix<double>::InnerIterator it(m1,k); it; ++it)
  {
    it.value();
    it.row();   // row index
    it.col();   // col index (here it is equal to k)
    it.index(); // inner index, here it is equal to it.row()
  }
\endcode
</td><td>
\code
SparseVector<double> v1(size);
for (SparseVector<double>::InnerIterator it(v1); it; ++it)
{
  it.value(); // == v1[ it.index() ]
  it.index(); 
}
\endcode
</td></tr>
</table>


\section TutorialSparseFilling Filling a sparse matrix

Unlike dense matrix, filling a sparse matrix efficiently is a though problem which requires some special care from the user. Indeed, directly filling in a random order a matrix in a compressed storage format would requires a lot of searches and memory copies, and some trade of between the efficiency and the ease of use have to be found. In Eigen we currently provide three ways to set the elements of a sparse matrix:

1 - If you can set the coefficients in exactly the same order that the storage order, then the matrix can be filled directly and very efficiently. Here is an example initializing a random, row-major sparse matrix:
\code
SparseMatrix<double,RowMajor> m(rows,cols);
m.startFill(rows*cols*percent_of_non_zero); // estimate of the number of nonzeros (optional)
for (int i=0; i\<rows; ++i)
  for (int j=0; j\<cols; ++j)
    if (rand()\<percent_of_non_zero)
      m.fill(i,j) = rand();
m.endFill();
\endcode

2 - If you can set each outer vector in a consistent order, but do not have sorted data for each inner vector, then you can use fillrand() instead of fill():
\code
SparseMatrix<double,RowMajor> m(rows,cols);
m.startFill(rows*cols*percent_of_non_zero); // estimate of the number of nonzeros (optional)
for (int i=0; i\<rows; ++i)
  for (int k=0; k\<cols*percent_of_non_zero; ++k)
      m.fillrand(i,rand(0,cols)) = rand();
m.endFill();
\endcode
The fillrand() function performs a sorted insertion into an array sequentially stored in memory and requires a copy of all coefficients stored after its target position. This method is therefore reserved for matrices having only a few elements per row/column (up to 50) and works better if the insertion are almost sorted.

3 - Eventually, if none of the above solution is practicable for you, then you have to use a RandomSetter which temporarily wraps the matrix into a more flexible hash map allowing complete random accesses:
\code
SparseMatrix<double,RowMajor> m(rows,cols);
{
  RandomSetter<SparseMatrix<double,RowMajor> > setter(m);
  for (int k=0; k\<cols*rows*percent_of_non_zero; ++k)
    setter(rand(0,rows), rand(0,cols)) = rand();
}
\endcode
The matrix \c m is set at the destruction of the setter, hence the use of a nested block. This imposed syntax has the advantage to emphasize the critical section where m is not valid and cannot be used.


\section TutorialSparseFeatureSet Supported operators and functions

In the following \em sm denote a sparse matrix, \em sv a sparse vector, \em dm a dense matrix, and \em dv a dense vector.
In Eigen's sparse module we chose to expose only the subset of the dense matrix API which can be efficiently implemented. Moreover, all combinations are not always possible. For instance, it is not possible to add two sparse matrices having two different storage order. On the other hand it is perfectly fine to evaluate a sparse matrix/expression to a matrix having a different storage order:
\code
SparseMatrixType sm1, sm2, sm3;
sm3 = sm1.transpose() + sm2;                    // invalid
sm3 = SparseMatrixType(sm1.transpose()) + sm2;  // correct
\endcode

Here are some examples of the supported operations:
\code
s_1 *= 0.5;
sm4 = sm1 + sm2 + sm3;          // only if s_1, s_2 and s_3 have the same storage order
sm3 = sm1 * sm2;
dv3 = sm1 * dv2;
dm3 = sm1 * dm2;
dm3 = dm2 * sm1;
sm3 = sm1.cwise() * sm2;        // only if s_1 and s_2 have the same storage order
dv2 = sm1.marked<UpperTriangular>().solveTriangular(dv2);
\endcode

The product of a sparse matrix A time a dense matrix/vector dv with A symmetric can be optimized by telling that to Eigen:
\code
res = A.marked<SeflAdjoint>() * dv;                   // if all coefficients of A are stored
res = A.marked<SeflAdjoint|UpperTriangular>() * dv;   // if only the upper part of A is stored
res = A.marked<SeflAdjoint|LowerTriangular>() * dv;   // if only the lower part of A is stored
\endcode


\section TutorialSparseDirectSolvers Using the direct solvers

TODO

\subsection TutorialSparseDirectSolvers_LLT LLT
Cholmod, Taucs.

\subsection TutorialSparseDirectSolvers_LDLT LDLT


\subsection TutorialSparseDirectSolvers_LU LU
SuperLU, UmfPack.

*/

}