Commit message | Author | Age
...
* some documentation fixes (Cwise* and Cholesky) | Gael Guennebaud | 2008-05-22
* * improved product performance: | Gael Guennebaud | 2008-05-22
    - fallback to normal product for small dynamic matrices
    - overloaded "c += (a * b).lazy()" to avoid the expensive and useless temporary and setZero() in such very common cases.
  * fix a couple of issues with the flags
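The saving described for "c += (a * b).lazy()" can be illustrated without Eigen. The sketch below is not Eigen's implementation: `Mat`, `addProductViaTemporary` and `addProductInPlace` are hypothetical names, and the point is only to contrast a product that goes through a zero-filled temporary with one that accumulates directly into the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal row-major matrix; not an Eigen type.
struct Mat {
    std::size_t rows, cols;
    std::vector<double> data;
    Mat(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Naive route: evaluate t = a * b into a zero-filled temporary, then add t to c.
void addProductViaTemporary(Mat& c, const Mat& a, const Mat& b) {
    Mat t(a.rows, b.cols);  // extra allocation plus zero fill
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < b.cols; ++j)
            for (std::size_t k = 0; k < a.cols; ++k)
                t(i, j) += a(i, k) * b(k, j);
    for (std::size_t i = 0; i < c.rows; ++i)
        for (std::size_t j = 0; j < c.cols; ++j)
            c(i, j) += t(i, j);
}

// Fused route: accumulate straight into c, with no temporary and no zero fill.
// Assumes c does not alias a or b.
void addProductInPlace(Mat& c, const Mat& a, const Mat& b) {
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k)
            for (std::size_t j = 0; j < b.cols; ++j)
                c(i, j) += a(i, k) * b(k, j);
}
```

Both functions produce the same result; the second skips one allocation, one zero fill and one extra pass over the output.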
* restored the product test | Gael Guennebaud | 2008-05-22
* remove Like1DArrayBit in Transpose | Gael Guennebaud | 2008-05-22
* update of the testing framework: | Gael Guennebaud | 2008-05-22
  replaced the QTestLib framework by custom macros and an (optional) custom script to run the tests from ctest.
* Fix compilation issues with MSVC and NVCC. | Gael Guennebaud | 2008-05-15
  Added a few typedefs of complex return types in MatrixBase (needed by MSVC)
* Introduce generic Flagged xpr, remove already Lazy.h and Temporary.h | Benoit Jacob | 2008-05-14
  Rename DefaultLostFlagMask --> HereditaryBits
* * Clean up the eigenvalue solver a bit: if the matrix is known to be | Gael Guennebaud | 2008-05-13
    selfadjoint at compile time, then it returns real eigenvalues.
  * Fix a couple of bugs with the new product.
* -fix certain #includes | Benoit Jacob | 2008-05-12
  -fix CMakeLists, public headers weren't getting installed
* * Added several casts to int of the enums (needed for some compilers) | Gael Guennebaud | 2008-05-12
  * Fix a mistake in CwiseNullary.
  * Added a CoreDeclarations header that declares only the forward declarations and related basic stuff.
* put inline keywords everywhere appropriate. So we no longer need to pass | Benoit Jacob | 2008-05-12
  -finline-limit=1000 to gcc to get good performance. By the way some cleanup.
* updated product test to carefully test all scalar types | Gael Guennebaud | 2008-05-12
  and fix an issue in the triangular test
* * Draft of an eigenvalue solver | Gael Guennebaud | 2008-05-12
    (does not support complex and does not re-use the QR decomposition)
  * Rewrite the cache friendly product to have only one instance per scalar type! This significantly speeds up compilation and reduces executable size. The current drawback is that some trivial expressions, such as conjugate or negate, might be evaluated.
  * Renamed "cache optimal" to "cache friendly"
  * Added the ability to directly access matrix data of some expressions via:
    - the stride()/_stride() methods
    - DirectAccessBit flag (replaces ReferencableBit)
* move arch-specific code to arch/SSE and arch/AltiVec subdirs. | Benoit Jacob | 2008-05-12
  rename the noarch PacketMath.h to DummyPacketMath.h
* * Give Konstantinos a copyright line | Benoit Jacob | 2008-05-12
  * Fix compilation of Inverse.h with vectorisation
  * Introduce EIGEN_GNUC_AT_LEAST(x,y) macro doing future-proof (e.g. gcc v5.0) check
  * Only use ProductWIP if vectorisation is enabled
  * rename EIGEN_ALWAYS_INLINE -> EIGEN_INLINE with fall-back to inline keyword
  * some cleanup/indentation
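A "future-proof" compiler version check has to compare the major version before the minor one; testing the minor version alone would wrongly reject gcc 5.0 when asking for "at least 4.2". The sketch below shows one plausible shape for such a macro. `SKETCH_GNUC_AT_LEAST` and `kGcc42OrNewer` are hypothetical names, and the actual definition of EIGEN_GNUC_AT_LEAST may differ.

```cpp
#include <cassert>

// Hypothetical macro; only the idea comes from the commit message above.
// True when the compiler identifies as GCC x.y or newer: either the major
// version is strictly greater, or it is equal and the minor is large enough.
#ifdef __GNUC__
#define SKETCH_GNUC_AT_LEAST(x, y) \
    ((__GNUC__ > (x)) || (__GNUC__ == (x) && __GNUC_MINOR__ >= (y)))
#else
#define SKETCH_GNUC_AT_LEAST(x, y) 0
#endif

// Example use in a preprocessor conditional.
#if SKETCH_GNUC_AT_LEAST(4, 2)
const bool kGcc42OrNewer = true;
#else
const bool kGcc42OrNewer = false;
#endif
```

Compilers that do not define `__GNUC__` always take the fallback branch.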
* only include SSE3 headers if compiling with SSE3 support | Benoit Jacob | 2008-05-08
* removed "sort brief" in doxygen documentation | Gael Guennebaud | 2008-05-08
* * Added ReferencableBit flag to know if coeffRef is available. | Gael Guennebaud | 2008-05-08
    (needed by the new product implementation)
  * Make the packet* members templates to support aligned and unaligned access. This makes Block vectorizable. Combined with ReferencableBit, we should be able to determine at runtime (in some specific cases) whether an aligned vectorization is possible.
  * Improved the new product implementation to robustly handle all cases; it now passes all the tests.
  * Renamed the packet version ei_predux to ei_preduxp to avoid name collision.
* * split PacketMath.h to SSE and Altivec specific files | Gael Guennebaud | 2008-05-05
  * improved the flexibility of the new product implementation, now all sizes seem to be properly handled.
* * Started support for unaligned vectorization. | Gael Guennebaud | 2008-05-05
  * Introduce a new highly optimized matrix-matrix product for large matrices. The code is still highly experimental and it is activated only if you define EIGEN_WIP_PRODUCT at compile time. Currently the third dimension of the product must be a multiple of the packet size (x4 for floats) and the right-hand side matrix must be column major. Moreover, currently c = a*b; actually computes c += a*b !! Therefore, the code is provided for experimentation purposes only! These limitations will be fixed sooner or later so that it can become the default product implementation.
* * Patch by Konstantinos Margaritis: AltiVec vectorization. | Benoit Jacob | 2008-05-03
  * Fix several warnings, temporarily disable determinant test.
* slightly improved the cache friendly product to use mul-add only | Gael Guennebaud | 2008-05-03
* added packet mul-add function (ei_pmad) and updated Product to use it. | Gael Guennebaud | 2008-05-03
  this changes nothing for the current SSE architecture but might be helpful for altivec/cell and upcoming AMD processors.
* Removed ei_pload1, use posix_memalign to allocate aligned memory, | Gael Guennebaud | 2008-05-02
  and make Product ok when only one side is vectorizable (and the product is still vectorized)
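For readers unfamiliar with posix_memalign, a minimal sketch of a 16-byte aligned allocation follows. The helper names `allocAlignedFloats` and `isAligned16` are hypothetical and do not come from Eigen; the 16-byte figure is assumed here because it matches the width of an SSE packet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Hypothetical helper: returns storage for n floats aligned on a 16-byte
// boundary, or nullptr on failure. Release the memory with free().
float* allocAlignedFloats(std::size_t n) {
    void* p = nullptr;
    // posix_memalign returns 0 on success; the alignment must be a power of
    // two and a multiple of sizeof(void*), which 16 satisfies.
    if (posix_memalign(&p, 16, n * sizeof(float)) != 0) return nullptr;
    return static_cast<float*>(p);
}

// Checks whether a pointer sits on a 16-byte boundary.
bool isAligned16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}
```

This only applies to POSIX systems; other platforms need a different allocation call.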
* added a test for triangular matrices | Gael Guennebaud | 2008-05-02
* Make products always eval into expressions. Improves performance | Benoit Jacob | 2008-05-02
  in benchmark. Still not as fast as explicit eval(), strangely.
* fix flag and cost computations for nested expressions | Gael Guennebaud | 2008-05-01
* nullary xpr are now vectorized | Gael Guennebaud | 2008-05-01
* Enable vectorization of product with dynamic matrices, | Gael Guennebaud | 2008-05-01
  extended cache optimal product to work in any row/column major situations, and a few bugfixes (forgot to add the Cholesky header, vectorization of CwiseBinary)
* some cleaning in Cholesky and removed evil ei_sqrt of complex | Gael Guennebaud | 2008-04-27
* * added ei_sqrt for complex | Gael Guennebaud | 2008-04-27
  * updated Cholesky to support complex
  * correct result_type for abs and abs2 functors
* added Cholesky module | Gael Guennebaud | 2008-04-27
* Fixed a couple of issues introduced in previous commits. | Gael Guennebaud | 2008-04-26
  Added a test for Triangular.
* Added triangular assignment, e.g.: | Gael Guennebaud | 2008-04-26
    m.upper() = a+b;
  only updates the upper triangular part of m.
  Note that:
    m = (a+b).upper();
  updates all coefficients of m (but half of the additions will be skipped)
  Updated back/forward substitution to better use Eigen's capability.
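The difference between the two assignments can be mimicked with plain arrays. This is an illustrative sketch, not Eigen code: `Mat3`, `assignUpperOfSum` and `assignSumUpper` are hypothetical names, and it assumes that the coefficients outside the upper triangle of `(a+b).upper()` read as zero.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

const std::size_t N = 3;
using Mat3 = std::array<std::array<double, N>, N>;

// Analogue of "m.upper() = a + b": only coefficients with j >= i are written;
// the strictly lower part of m keeps its previous values.
void assignUpperOfSum(Mat3& m, const Mat3& a, const Mat3& b) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = i; j < N; ++j)
            m[i][j] = a[i][j] + b[i][j];
}

// Analogue of "m = (a + b).upper()": every coefficient of m is overwritten.
// Below the diagonal the result is zero (an assumption of this sketch), so
// the addition is skipped there.
void assignSumUpper(Mat3& m, const Mat3& a, const Mat3& b) {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            m[i][j] = (j >= i) ? a[i][j] + b[i][j] : 0.0;
}
```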
* Added Triangular expression to extract upper or lower (strictly or not) | Gael Guennebaud | 2008-04-26
  part of a matrix. Triangular also provides an optimised method for forward and backward substitution. Further optimizations regarding assignments and products might come later.
  Updated determinant() to take into account triangular matrices.
  Started the QR module with a QR decomposition algorithm. Help needed to build a QR algorithm (eigen solver) based on it.
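Backward substitution, mentioned above, solves U x = b for an upper-triangular U by working from the last row upwards. The following is a textbook sketch under the assumption of a row-major n x n matrix with a nonzero diagonal; `backSubstitute` is a hypothetical name and this is not the optimised method the commit refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solves U x = b where U is upper triangular, stored row-major as n*n doubles.
// Assumes every diagonal entry of U is nonzero.
std::vector<double> backSubstitute(const std::vector<double>& U,
                                   const std::vector<double>& b) {
    const std::size_t n = b.size();
    std::vector<double> x(n);
    for (std::size_t i = n; i-- > 0;) {  // rows n-1 down to 0
        double s = b[i];
        for (std::size_t j = i + 1; j < n; ++j)
            s -= U[i * n + j] * x[j];    // subtract the already-solved unknowns
        x[i] = s / U[i * n + i];
    }
    return x;
}
```

Forward substitution for a lower-triangular matrix is the mirror image, iterating from the first row downwards.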
* fix a bug in determinant of 4x4 matrices and a small type issue in Inverse | Gael Guennebaud | 2008-04-26
* added a tough test to check the determinant that currently fails | Gael Guennebaud | 2008-04-25
* Various fixes in: | Gael Guennebaud | 2008-04-25
  - vector to vector assign
  - PartialRedux
  - Vectorization criteria of Product
  - returned type of normalized
  - SSE integer mul
* Make the explicit vectorization much more flexible: | Gael Guennebaud | 2008-04-25
  - support dynamic sizes
  - support arbitrary matrix size when the matrix can be seen as a 1D array (except for fixed size matrices where the size in bytes must be a multiple of 16, this is to allow compact storage of a vector of matrices)
  Note that the explicit vectorization is still experimental and far from completely tested.
* forgot to add a file in the previous commit | Gael Guennebaud | 2008-04-24
* Fix a couple of issues with the vectorization. In particular, default ei_p* | Gael Guennebaud | 2008-04-24
  functions are provided to handle unsupported types seamlessly.
  Added a generic null-ary expression with null-ary functors. They replace Zero, Ones, Identity and Random.
* give up on OpenMP... for now | Benoit Jacob | 2008-04-18
* - add _packetCoeff() to Inverse, allowing vectorization. | Benoit Jacob | 2008-04-16
  - let Inverse take template parameter MatrixType instead of ExpressionType, in order to reduce executable code size when taking inverses of xpr's.
  - introduce ei_corrected_matrix_flags : the flags template parameter to the Matrix class is only a suggestion. This is also useful in ei_eval.
* +5% optimization in 4x4 inverse: | Benoit Jacob | 2008-04-15
  -only evaluate block expressions for which that is beneficial
  -don't check for invertibility unless requested
* for 4x4 matrices implement the special algorithm that Markos proposed, | Benoit Jacob | 2008-04-15
  falling back to the general algorithm in the bad case.
* - optimized determinant calculations for small matrices (size <= 4) | Benoit Jacob | 2008-04-14
    (only 30 muls for size 4)
  - rework the matrix inversion: now using cofactor technique for size<=3, so the ugly unrolling is now only used for size 4, and even there I'm looking to get rid of it.
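One well-known way to reach 30 multiplications for a 4x4 determinant is to expand along the first two rows: six 2x2 minors from rows 0-1 (12 multiplications), six from rows 2-3 (12 more), and six products to combine them. Whether the commit uses exactly this scheme is an assumption; `Mat4` and `det4` are hypothetical names.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<double, 4>, 4>;

// 4x4 determinant via 2x2 minors of rows {0,1} and rows {2,3}.
// Multiplication count: 12 + 12 + 6 = 30.
double det4(const Mat4& a) {
    // 2x2 minors taken from rows 0 and 1 (column pairs 01, 02, 03, 12, 13, 23).
    const double s0 = a[0][0] * a[1][1] - a[1][0] * a[0][1];
    const double s1 = a[0][0] * a[1][2] - a[1][0] * a[0][2];
    const double s2 = a[0][0] * a[1][3] - a[1][0] * a[0][3];
    const double s3 = a[0][1] * a[1][2] - a[1][1] * a[0][2];
    const double s4 = a[0][1] * a[1][3] - a[1][1] * a[0][3];
    const double s5 = a[0][2] * a[1][3] - a[1][2] * a[0][3];
    // 2x2 minors taken from rows 2 and 3 (same column pairs, numbered c0..c5).
    const double c0 = a[2][0] * a[3][1] - a[3][0] * a[2][1];
    const double c1 = a[2][0] * a[3][2] - a[3][0] * a[2][2];
    const double c2 = a[2][0] * a[3][3] - a[3][0] * a[2][3];
    const double c3 = a[2][1] * a[3][2] - a[3][1] * a[2][2];
    const double c4 = a[2][1] * a[3][3] - a[3][1] * a[2][3];
    const double c5 = a[2][2] * a[3][3] - a[3][2] * a[2][3];
    // Each top minor is paired with the bottom minor on the complementary columns.
    return s0 * c5 - s1 * c4 + s2 * c3 + s3 * c2 - s4 * c1 + s5 * c0;
}
```

A naive cofactor expansion along one row needs 40 multiplications, so sharing the 2x2 minors is where the saving comes from.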
* when evaluating an xpr, the result can now be vectorizable | Benoit Jacob | 2008-04-14
  even if the xpr itself wasn't vectorizable.
* * Start of the LU module, with matrix inversion already there and | Benoit Jacob | 2008-04-14
    fully optimized.
  * Even if LargeBit is set, only parallelize for large enough objects (controlled by EIGEN_PARALLELIZATION_TRESHOLD).
* * Add fixed-size template versions of corner(), start(), end(). | Benoit Jacob | 2008-04-12
  * Use them to write an unrolled path in echelon.cpp, as an experiment before I do this LU module.
  * For floating-point types, make ei_random() use an amplitude of 1.
* - cleaner use of OpenMP (no code duplication anymore) | Benoit Jacob | 2008-04-11
    using a macro and _Pragma.
  - use OpenMP also in cacheOptimalProduct and in the vectorized paths as well
  - kill the vector assignment unroller. implement in operator= the logic for assigning a row-vector in a col-vector.
  - CMakeLists support for building tests/examples with -fopenmp and/or -msse2
  - updates in bench/, especially replace identity() by ones() which prevents underflows from perturbing bench results.
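The "macro and _Pragma" approach avoids writing each loop twice (once with an OpenMP pragma, once without), because `_Pragma` can appear inside a macro expansion whereas `#pragma` cannot. The sketch below shows the general pattern only; `SKETCH_PRAGMA`, `SKETCH_PARALLEL_FOR` and `addInto` are hypothetical names, not the macros introduced by the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical macros. With OpenMP enabled (e.g. -fopenmp) the loop below is
// parallelised; otherwise the macro expands to nothing and the same loop runs
// serially, so the loop body is written only once.
#ifdef _OPENMP
#define SKETCH_PRAGMA(x) _Pragma(#x)
#define SKETCH_PARALLEL_FOR SKETCH_PRAGMA(omp parallel for)
#else
#define SKETCH_PARALLEL_FOR
#endif

// Adds src to dst element-wise. Assumes src has at least dst.size() elements.
void addInto(std::vector<double>& dst, const std::vector<double>& src) {
    const long n = static_cast<long>(dst.size());
    SKETCH_PARALLEL_FOR
    for (long i = 0; i < n; ++i)
        dst[i] += src[i];
}
```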