| Commit message (Collapse) | Author | Age |
|
|
|
| |
* some simplifications and fixes in cache friendly products
|
|
|
|
|
|
| |
and vector * row-major products. Currently, it is enabled only is the matrix
has DirectAccessBit flag and the product is "large enough".
Added the respective unit tests in test/product/cpp.
|
|
|
|
| |
Cwise.
|
|
|
|
|
|
| |
(using either a cache friendly strategy or re-using dot-product
vectorized implementation)
* add LinearAccessBit to Transpose
|
|
|
|
|
|
| |
such that
lazyAssign overloads of <xpr> are automatically called (this also reduces assign instansiations)
|
| |
|
|
|
|
|
| |
- fix compilation in product.cpp with std::complex
- fix bug in MatrixBase::operator!=
|
|
|
|
| |
can be seen in Eigen/src/Core/Cwise.h.
|
|
|
|
|
| |
* add comment in Product.h about CanVectorizeInner
* fix typo in test/product.cpp
|
|
|
|
|
| |
* added some tests for product and swap
* overload .swap() for dynamic-sized matrix of same size
|
|
|
|
|
| |
(equivalent to the GEMM blas routine)
* added a GEMM benchmark
|
|
|
|
|
|
|
| |
might be twice faster fot small fixed size matrix
* added a sparse triangular solver (sparse version
of inverseProduct)
* various other improvements in the Sparse module
|
|
|
|
|
|
|
| |
* added complete implementation of sparse matrix product
(with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
=> Eigen outperforms in all transposed/density configurations !
|
| |
|
|
|
|
|
| |
* allow default Matrix constructor in dynamic size, defaulting to (1,
1), this is convenient in mandelbrot example.
|
| |
|
|
|
|
|
|
|
|
| |
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* added some glue to Eigen/Core (SparseBit, ei_eval, Matrix)
* add two new sparse matrix types:
HashMatrix: based on std::map (for random writes)
LinkedVectorMatrix: array of linked vectors
(for outer coherent writes, e.g. to transpose a matrix)
* add a SparseSetter class to easily set/update any kind of matrices, e.g.:
{ SparseSetter<MatrixType,RandomAccessPattern> wrapper(mymatrix);
for (...) wrapper->coeffRef(rand(),rand()) = rand(); }
* automatic shallow copy for RValue
* and a lot of mess !
plus:
* remove the remaining ArrayBit related stuff
* don't use alloca in product for very large memory allocation
|
|
|
|
|
|
|
|
| |
to "public:method()" i.e. reimplementing the generic method()
from MatrixBase.
improves compilation speed by 7%, reduces almost by half the call depth
of trivial functions, making gcc errors and application backtraces
nicer...
|
|
|
|
|
|
|
|
|
|
|
|
| |
* introduce packet(int), make use of it in linear vectorized paths
--> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions's flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
doesn't imply the block xpr should have it too.
|
|
|
|
| |
on architectures having a packed-mul-add assembly instruction.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* make the conj functor vectorizable: it is just identity in real case,
and complex doesn't use the vectorized path anyway.
* fix bug in Block: a 3x1 block in a 4x4 matrix (all fixed-size)
should not be vectorizable, since in fixed-size we are assuming
the size to be a multiple of packet size. (Or would you prefer
Vector3d to be flagged "packetaccess" even though no packet access
is possible on vectors of that type?)
* rename:
isOrtho for vectors ---> isOrthogonal
isOrtho for matrices ---> isUnitary
* add normalize()
* reimplement normalized with quotient1 functor
|
|
|
|
| |
* add vdw benchmark from Tim's real-world use case
|
| |
|
|
|
|
|
|
|
|
| |
- uses the common "Compressed Column Storage" scheme
- supports every unary and binary operators with xpr template
assuming binaryOp(0,0) == 0 and unaryOp(0) = 0 (otherwise a sparse
matrix doesnot make sense)
- this is the first commit, so of course, there are still several shorcommings !
|
|
|
|
|
| |
vectorization....
now the sum benchmark runs 3x faster with vectorization than without.
|
|
|
|
|
|
| |
(could come back to redux after it has been vectorized,
and could serve as a starting point for that)
also make the abs2 functor vectorizable (for real types).
|
|
|
|
|
|
|
|
|
| |
packet access, it is not certain that it will bring a performance
improvement: benchmarking needed.
* improve logic choosing slice vectorization.
* fix typo in SSE packet math, causing crash in unaligned case.
* fix bug in Product, causing crash in unaligned case.
* add TEST_SSE3 CMake option.
|
|
|
|
|
| |
- convertions are done trough constructors and operator=
- added a EulerAngles class
|
|
|
|
| |
to be evaluated, it is enough to just read them.
|
|
|
|
|
|
|
|
| |
- matrix-scalar addition/subtraction operators, e.g.:
m.array() += 0.5;
- matrix/matrix comparison operators, e.g.:
if (m1.array() < m2.array()) {}
* fix compilation issues with Transform and gcc < 4.1
|
|
|
|
| |
enums to int is enough to get compile time constants with ICC.
|
|
|
|
|
|
| |
* make Matrix2f (and similar) vectorized using linear path
* fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4
(cannot get Transform compiles with gcc 3.3/3.4, see the FIXME)
|
|
|
|
|
|
|
|
| |
* use ProductReturnType<>::Type to get the correct Product xpr type
* Product is no longer instanciated for xpr types which are evaluated
* vectorization of "a.transpose() * b" for the normal product (small and fixed-size matrix)
* some cleanning
* removed ArrayBase
|
| |
|
|
|
|
|
|
|
|
|
| |
now have the Like1D flag.
* Big renaming:
packetCoeff ---> packet
VectorizableBit ---> PacketAccessBit
Like1DArrayBit ---> LinearAccessBit
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.
* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
to allow vectorization of row-vectors.
* fix compilation in benchmark, and a warning in Transpose.
|
|
|
|
|
|
| |
representation in Transform via the template static class
ToRotationMatrix.
Added a lightweight AngleAxis class (similar to Rotation2D).
|
|
|
|
|
|
|
|
|
|
| |
to optimize matrix-diag and diag-matrix products without
making Product over complicated.
* compilation fixes in Tridiagonalization and HessenbergDecomposition
in the case of 2x2 matrices.
* added an Orientation2D small class with similar interface than Quaternion
(used by Transform to handle 2D and 3D orientations seamlessly)
* added a couple of features in Transform.
|
|
|
|
|
| |
homography.
Fix indentation in Quaternion.h
|
|
|
|
|
|
| |
(as new members to SelfAdjointEigenSolver)
The QR module now depends on Cholesky.
* Fix Transpose to correctly preserve the *TriangularBit.
|
|
|
|
| |
To try it with the unit tests set the cmake variable TEST_LIB to ON.
|
|
|
|
|
|
|
| |
them in the ei_traits, so that they're guaranteed even if the user
specified his own non-default flags (like before).
Measured to not make compilation any slower.
|
|
|
|
|
|
|
|
|
| |
flags. This ensures that unless explicitly messed up otherwise,
a Matrix type is equal to its own Eval type. This seriously reduces
the number of types instantiated. Measured +13% compile speed, -7%
binary size.
* Improve doc of Matrix template parameters.
|
|
|
|
|
|
|
| |
This is the first step towards a non-selfadjoint eigen solver.
Notes:
- We might consider merging Tridiagonalization and Hessenberg toghether ?
- Or we could factorize some code into a Householder class (could also be shared with QR)
|
|
|
|
|
|
| |
- works for complex
- allows direct access to the matrix R
* removed the scale by the matrix dimensions in MatrixBase::isMuchSmallerThan(scalar)
|
|
|
|
|
|
| |
which is even better optimized by the compiler.
* Quaternion no longer inherits MatrixBase. Instead it stores the coefficients
using a Matrix<> and provides only relevant methods.
|
|
|
|
|
| |
* fix a couple of compilation issues when unrolling is disabled
* reduce default unrolling limit to a more reasonable value
|