Commit message | Author | Age
* Fixed a couple of issues introduced in previous commits. | Gael Guennebaud | 2008-04-26
  Added a test for Triangular.
* Added triangular assignment, e.g.: | Gael Guennebaud | 2008-04-26
    m.upper() = a+b;
  only updates the upper triangular part of m. Note that:
    m = (a+b).upper();
  updates all coefficients of m (but half of the additions will be skipped).
  Updated back/forward substitution to better use Eigen's capabilities.
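The difference between the two forms above can be illustrated without Eigen. This is a minimal sketch, not Eigen's implementation: `Mat3`, `assign_upper` and `assign_from_upper` are hypothetical names introduced here, and the matrix is a plain nested `std::array`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3x3 row-major matrix type, for illustration only (not an Eigen type).
using Mat3 = std::array<std::array<double, 3>, 3>;

// Mimics "m.upper() = a+b": only coefficients with col >= row are written,
// so the strictly lower part of m keeps its previous values.
void assign_upper(Mat3& m, const Mat3& a, const Mat3& b) {
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = i; j < 3; ++j)
            m[i][j] = a[i][j] + b[i][j];
}

// Mimics "m = (a+b).upper()": every coefficient of m is written, but the
// addition is only performed for the upper part; the rest is set to zero.
void assign_from_upper(Mat3& m, const Mat3& a, const Mat3& b) {
    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 3; ++j)
            m[i][j] = (j >= i) ? a[i][j] + b[i][j] : 0.0;
}
```

In the first function the lower part of `m` is left untouched; in the second it is overwritten, which is the distinction the commit message draws.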
* Added Triangular expression to extract upper or lower (strictly or not) | Gael Guennebaud | 2008-04-26
  part of a matrix. Triangular also provides an optimised method for forward
  and backward substitution. Further optimizations regarding assignments and
  products might come later.
  Updated determinant() to take into account triangular matrices.
  Started the QR module with a QR decomposition algorithm. Help needed to
  build a QR algorithm (eigen solver) based on it.
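For readers unfamiliar with forward substitution, here is a minimal sketch of the textbook algorithm in plain C++. It is not the code from this commit; the function name `forward_substitution` and the flat row-major storage are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Solves L * x = b for x, where L is a lower triangular n x n matrix stored
// row-major in a flat vector. Only the lower part of L (including the
// diagonal) is read. Precondition: all diagonal entries of L are non-zero.
std::vector<double> forward_substitution(const std::vector<double>& L,
                                         const std::vector<double>& b) {
    const std::size_t n = b.size();
    assert(L.size() == n * n);
    std::vector<double> x(n);
    for (std::size_t i = 0; i < n; ++i) {
        double sum = b[i];
        // Subtract the contribution of the unknowns that are already solved.
        for (std::size_t j = 0; j < i; ++j)
            sum -= L[i * n + j] * x[j];
        x[i] = sum / L[i * n + i];
    }
    return x;
}
```

Backward substitution for an upper triangular matrix follows the same pattern, iterating over the rows from last to first.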
* fix a bug in determinant of 4x4 matrices and a small type issue in Inverse | Gael Guennebaud | 2008-04-26
* added a tough test to check the determinant that currently fails | Gael Guennebaud | 2008-04-25
* Various fixes in: | Gael Guennebaud | 2008-04-25
  - vector to vector assign
  - PartialRedux
  - Vectorization criteria of Product
  - returned type of normalized
  - SSE integer mul
* Make the explicit vectorization much more flexible: | Gael Guennebaud | 2008-04-25
  - support dynamic sizes
  - support arbitrary matrix size when the matrix can be seen as a 1D array
    (except for fixed size matrices where the size in bytes must be a
    multiple of 16, this is to allow compact storage of a vector of matrices)
  Note that the explicit vectorization is still experimental and far from
  being completely tested.
* forgot to add a file in the previous commit | Gael Guennebaud | 2008-04-24
* Fix a couple of issues with the vectorization. In particular, default ei_p* | Gael Guennebaud | 2008-04-24
  functions are provided to handle unsupported types seamlessly.
  Added a generic null-ary expression with null-ary functors. They replace
  Zero, Ones, Identity and Random.
* give up on OpenMP... for now | Benoit Jacob | 2008-04-18
* - add _packetCoeff() to Inverse, allowing vectorization. | Benoit Jacob | 2008-04-16
  - let Inverse take template parameter MatrixType instead of ExpressionType,
    in order to reduce executable code size when taking inverses of xpr's.
  - introduce ei_corrected_matrix_flags: the flags template parameter to the
    Matrix class is only a suggestion. This is also useful in ei_eval.
* +5% optimization in 4x4 inverse: | Benoit Jacob | 2008-04-15
  - only evaluate block expressions for which that is beneficial
  - don't check for invertibility unless requested
* for 4x4 matrices implement the special algorithm that Markos proposed, | Benoit Jacob | 2008-04-15
  falling back to the general algorithm in the bad case.
* - optimized determinant calculations for small matrices (size <= 4) | Benoit Jacob | 2008-04-14
    (only 30 muls for size 4)
  - rework the matrix inversion: now using cofactor technique for size <= 3,
    so the ugly unrolling is now only used for size 4, and even there I'm
    looking to get rid of it.
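One well-known way to reach 30 multiplications for a 4x4 determinant is to expand along pairs of rows using 2x2 minors: 12 multiplications for the six minors of the top two rows, 12 for the six minors of the bottom two rows, and 6 to combine them. The sketch below shows that technique in plain C++; whether this commit uses exactly this scheme is an assumption, and `Mat4` and `det4` are hypothetical names.

```cpp
#include <array>
#include <cassert>

// Hypothetical 4x4 row-major matrix type, for illustration only.
using Mat4 = std::array<std::array<double, 4>, 4>;

// Determinant of a 4x4 matrix via 2x2 minors of rows {0,1} and rows {2,3}.
double det4(const Mat4& a) {
    // Six 2x2 minors taken from rows 0 and 1 (12 multiplications).
    const double s0 = a[0][0] * a[1][1] - a[1][0] * a[0][1];
    const double s1 = a[0][0] * a[1][2] - a[1][0] * a[0][2];
    const double s2 = a[0][0] * a[1][3] - a[1][0] * a[0][3];
    const double s3 = a[0][1] * a[1][2] - a[1][1] * a[0][2];
    const double s4 = a[0][1] * a[1][3] - a[1][1] * a[0][3];
    const double s5 = a[0][2] * a[1][3] - a[1][2] * a[0][3];
    // Six 2x2 minors taken from rows 2 and 3 (12 multiplications).
    const double c5 = a[2][2] * a[3][3] - a[3][2] * a[2][3];
    const double c4 = a[2][1] * a[3][3] - a[3][1] * a[2][3];
    const double c3 = a[2][1] * a[3][2] - a[3][1] * a[2][2];
    const double c2 = a[2][0] * a[3][3] - a[3][0] * a[2][3];
    const double c1 = a[2][0] * a[3][2] - a[3][0] * a[2][2];
    const double c0 = a[2][0] * a[3][1] - a[3][0] * a[2][1];
    // Combine complementary minors (6 multiplications).
    return s0 * c5 - s1 * c4 + s2 * c3 + s3 * c2 - s4 * c1 + s5 * c0;
}
```

A naive cofactor expansion along one row needs 40 multiplications, since each of the four 3x3 cofactors costs 9 and scaling each costs one more, so sharing the 2x2 minors is where the saving comes from.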
* when evaluating an xpr, the result can now be vectorizable | Benoit Jacob | 2008-04-14
  even if the xpr itself wasn't vectorizable.
* * Start of the LU module, with matrix inversion already there and | Benoit Jacob | 2008-04-14
    fully optimized.
  * Even if LargeBit is set, only parallelize for large enough objects
    (controlled by EIGEN_PARALLELIZATION_TRESHOLD).
* * Add fixed-size template versions of corner(), start(), end(). | Benoit Jacob | 2008-04-12
  * Use them to write an unrolled path in echelon.cpp, as an experiment
    before I do this LU module.
  * For floating-point types, make ei_random() use an amplitude of 1.
* - cleaner use of OpenMP (no code duplication anymore) | Benoit Jacob | 2008-04-11
    using a macro and _Pragma.
  - use OpenMP also in cacheOptimalProduct and in the vectorized paths as well
  - kill the vector assignment unroller. implement in operator= the logic for
    assigning a row-vector in a col-vector.
  - CMakeLists support for building tests/examples with -fopenmp and/or -msse2
  - updates in bench/, especially replace identity() by ones() which prevents
    underflows from perturbing bench results.
* Merge Gael's experimental OpenMP parallelization support into Assign.h. | Benoit Jacob | 2008-04-11
* added a vectorized version of Product::_cacheOptimalProduct, | Gael Guennebaud | 2008-04-10
  added the possibility to disable the vectorization using
  EIGEN_DONT_VECTORIZE (some architectures have SSE support by default)
* * add typedefs for matrices/vectors with LargeBit | Benoit Jacob | 2008-04-10
  * add -pedantic to CXXFLAGS
  * cleanup intricate expressions with && and || which gave warnings because
    of "missing" parentheses
  * fix compile error in NumTraits, apparently discovered by -pedantic
* split those files in util/ | Benoit Jacob | 2008-04-10
  some more renaming
* * rename XprCopy -> Nested | Benoit Jacob | 2008-04-10
  * rename OperatorEquals -> Assign
  * move Util.h and FwDecl.h to a util/ subdir
* fix operator precedence bugs in the computation | Gael Guennebaud | 2008-04-09
  of the VectorizableBit flag, now benchmark.cpp is properly vectorized
* a better bugfix in ei_matrix_operator_equals_packet_unroller | Gael Guennebaud | 2008-04-09
* bugfix in ei_matrix_operator_equals_packet_unroller | Gael Guennebaud | 2008-04-09
* Added initial experimental support for explicit vectorization. | Gael Guennebaud | 2008-04-09
  Currently only the following platform/operations are supported:
  - SSE2 compatible architecture
  - compiler compatible with Intel's SSE2 intrinsics
  - float, double and int data types
  - fixed size matrices with a storage major dimension multiple of 4
    (or 2 for double)
  - scalar-matrix product, component wise: +,-,*,min,max
  - matrix-matrix product only if the left matrix is vectorizable and column
    major or the right matrix is vectorizable and row major, e.g.:
    a.transpose() * b is not vectorized with the default column major storage.
  To use it you must define EIGEN_VECTORIZE and EIGEN_INTEL_PLATFORM.
* finish making use of CoeffReadCost and the new XprCopy everywhere | Benoit Jacob | 2008-04-08
  seems appropriate to me.
* - merge ei_xpr_copy and ei_eval_if_needed_before_nesting | Benoit Jacob | 2008-04-06
  - make use of CoeffReadCost to determine when to unroll the loops, for now
    only in Product.h and in OperatorEquals.h
  performance remains the same: generally still not as good as before the
  big changes.
* fix compilation (finish removal of EIGEN_UNROLLED_LOOPS) | Benoit Jacob | 2008-04-05
* fixes as discussed with Gael on IRC. Mainly, in Fuzzy.h, and Dot.h, use | Benoit Jacob | 2008-04-05
  ei_xpr_copy to evaluate args when needed. Had to introduce an ugly trick
  with ei_unref as when the XprCopy type is a reference one can't directly
  access member typedefs such as Scalar.
* * make use of the EvalBeforeNestingBit and EvalBeforeAssigningBit | Gael Guennebaud | 2008-04-05
    in ei_xpr_copy and operator=, respectively.
  * added Matrix::lazyAssign() when EvalBeforeAssigningBit must be skipped
    (mainly internal use only)
  * all expressions are now stored by const reference
  * added Temporary xpr: .temporary() must be called on any temporary
    expression not directly returned by a function (mainly internal use only)
  * moved all functors in the Functors.h header
  * added some preliminary stuff for the explicit vectorization
* * added cwise comparisons | Gael Guennebaud | 2008-04-03
  * added "all" and "any" special redux operators
  * added support for bool matrices
  * added support for cost model of STL functors via ei_functor_traits
    (by default ei_functor_traits queries the functor member Cost)
* current state of the mess. One line fails in the tests, and | Benoit Jacob | 2008-04-03
  useless copies are made when evaluating nested expressions.
  Changes:
  - kill LazyBit, introduce EvalBeforeNestingBit and EvalBeforeAssigningBit
  - product and random don't evaluate immediately anymore
  - eval() always evaluates
  - change the value of Dynamic to some large positive value, in preparation
    of future simplifications
* More clever evaluation of arguments: now it occurs earlier, in operator*, | Benoit Jacob | 2008-04-03
  before the Product<> type is constructed. This resets template depth on
  each intermediate evaluation, and gives simpler code.
  Introducing ei_eval_if_expensive<Derived, n> which evaluates Derived if
  it's worth it given that each of its coeffs will be accessed n times.
  Operator* uses this with adequate values of n to evaluate args exactly
  when needed.
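The decision described above can be pictured as a compile-time cost comparison. The sketch below is only an illustration of the idea under a simple assumed cost model; it is not the rule used by ei_eval_if_expensive, and `kPlainReadCost` and `worth_evaluating` are hypothetical names.

```cpp
#include <cassert>

// Assumed cost of reading one coefficient from an already evaluated (plain)
// matrix. Hypothetical unit chosen for this sketch.
constexpr int kPlainReadCost = 1;

// Returns true when evaluating an expression into a temporary first is
// cheaper than recomputing each coefficient on every access.
//   without a temporary: accesses * coeffReadCost
//   with a temporary:    coeffReadCost (compute once) + accesses * plain reads
constexpr bool worth_evaluating(int coeffReadCost, int accesses) {
    return coeffReadCost + accesses * kPlainReadCost < accesses * coeffReadCost;
}
```

Under this model a plain matrix (cost 1) is never copied, while an expression that is expensive per coefficient and read several times, as the operands of a matrix product are, is evaluated first.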
* fix a compilation issue with gcc-3.3 and ei_result_of | Gael Guennebaud | 2008-04-03
* - new: recursive costs system, useful to determine automatically | Benoit Jacob | 2008-04-03
    when to evaluate arguments and when to meta-unroll.
  - use it in Product to determine when to eval args. not yet used to
    determine when to unroll. for now, not used anywhere else but that'll
    follow.
  - fix badness of my last commit
* - remove Eval/EvalOMP (moving them to a disabled/ subdir in order | Benoit Jacob | 2008-03-31
    to preserve SVN history). They are made useless by the new
    ei_eval_unless_lazy.
  - introduce a generic Eval member typedef so one can do e.g.
      T t; U u; Product<T, U>::Eval m; m = t*u;
* Make use of the LazyBit, introduce .lazy(), remove lazyProduct. | Benoit Jacob | 2008-03-31
* * introduce recursive Flags system for the expressions | Benoit Jacob | 2008-03-30
    -- currently 3 flags: RowMajor, Lazy and Large
    -- only RowMajor actually used for now
  * many minor improvements
* * fix compilation with gcc-4.0 which doesn't like "using" too much | Benoit Jacob | 2008-03-29
  * add Eigen:: in some macros to allow using them from outside of namespace
    Eigen
  Problems and solutions communicated by Gael.
* look at that subtle difference in Product.h... | Benoit Jacob | 2008-03-26
  the cacheOptimal is only good for large enough matrices. When taking a
  block in a fixed-size (hence small) matrix, the SizeAtCompileTime is
  Dynamic hence that's not a good indicator. This example shows that the
  good indicator is MaxSizeAtCompileTime.
  Result: +10% speed in echelon.cpp
* * add Gael copyright lines on 2 more files | Benoit Jacob | 2008-03-26
  * macro renaming: EIGEN_NDEBUG becomes EIGEN_NO_DEBUG as this is much
    better (and similar to Qt) and EIGEN_CUSTOM_ASSERT becomes
    EIGEN_USE_CUSTOM_ASSERT
  * protect Core header by a EIGEN_CORE_H
* * #define EIGEN_NDEBUG now also disables asserts. Useful | Benoit Jacob | 2008-03-26
    to disable eigen's asserts without disabling one's own program's asserts.
    Notice that Eigen code should now use ei_assert() instead of assert().
  * Remove findBiggestCoeff() as it's now almost redundant.
  * Improve echelon.cpp: inner for loop replaced by xprs.
  * remove useless "(*this)." here and there. I think they were first
    introduced by automatic search&replace.
  * fix compilation in Visitor.h (issue triggered by echelon.cpp)
  * improve comment on swap().
* * support for matrix-scalar quotient with integer scalar types. | Gael Guennebaud | 2008-03-21
  * added cache efficient matrix-matrix product.
    - provides a huge speed-up for large matrices.
    - currently it is enabled when an explicit unrolling is not possible.
* * cleanup: in public api docs, don't put \sa links to \internal things. | Benoit Jacob | 2008-03-17
    (the global funcs in MathFunctions.h and Fuzzy.h don't count as internal).
  * Mainpage.dox. Add a few prospective Eigen users; change the recommended
    -finline-limit from 10000 to 1000. The reason is: it could be harmful to
    have too big a value here, couldn't it? (e.g. exceedingly large
    executables, cache misses). Looking at gcc, a value of 900 would exactly
    mean "determine the inlining of all functions as if they were marked with
    'inline' keyword". So a value of 1000 seems a reasonable round number. In
    the benchmark that motivated this (TestEigenSolvers) a value of 400 is
    enough on my system.
* update to fix compilation | Benoit Jacob | 2008-03-16
* * Added a generic *redux* mini framework allowing custom redux operations | Gael Guennebaud | 2008-03-16
    as well as partial redux (vertical or horizontal redux).
    Includes shortcuts for: sum, minCoeff and maxCoeff.
    There is no shortcut for the partial redux.
  * Added a generic *visitor* mini framework. A visitor is a custom object
    sequentially applied on each coefficient with knowledge of its value and
    coordinates. It is currently used to implement minCoeff(int*,int*) and
    maxCoeff(int*,int*). findBiggestCoeff is now a shortcut for
    "this->cwiseAbs().maxCoeff(i,j)"
  * Added coeff-wise min and max.
  * fixed an issue with ei_pow(int,int) and gcc < 4.3 or ICC
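The visitor idea described above can be sketched in a few lines of plain C++. This is an illustration of the pattern, not the code in Visitor.h: `Matrix`, `visit`, `MaxCoeffVisitor` and `max_coeff` are hypothetical names, and the `init`/`operator()` split is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix, for illustration only.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;
    double operator()(std::size_t i, std::size_t j) const {
        return data[i * cols + j];
    }
};

// Applies the visitor to each coefficient in turn, passing the value and its
// coordinates. The first coefficient initializes the visitor's state.
template <typename Visitor>
void visit(const Matrix& m, Visitor& visitor) {
    assert(m.rows > 0 && m.cols > 0);
    visitor.init(m(0, 0), 0, 0);
    for (std::size_t i = 0; i < m.rows; ++i)
        for (std::size_t j = 0; j < m.cols; ++j)
            if (i != 0 || j != 0) visitor(m(i, j), i, j);
}

// Visitor that remembers the largest coefficient seen and where it was.
struct MaxCoeffVisitor {
    double value = 0.0;
    std::size_t row = 0, col = 0;
    void init(double v, std::size_t i, std::size_t j) { value = v; row = i; col = j; }
    void operator()(double v, std::size_t i, std::size_t j) {
        if (v > value) { value = v; row = i; col = j; }
    }
};

// Counterpart of a maxCoeff(row, col) call, built on top of the visitor.
double max_coeff(const Matrix& m, std::size_t* row, std::size_t* col) {
    MaxCoeffVisitor v;
    visit(m, v);
    *row = v.row;
    *col = v.col;
    return v.value;
}
```

A minimum visitor differs only in the comparison, which is why a single traversal routine can serve both operations.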
* - introduce sum() returning the sum of the coeffs of a vector | Benoit Jacob | 2008-03-15
  - reimplement trace() as just diagonal().sum()
  - apidoc fixes
* - expand MathFunctions.h to provide more functions, like exp, log... | Benoit Jacob | 2008-03-14
  - add cwiseExp(), cwiseLog()...
    --> for example, doing a gamma-correction on a bitmap image stored as
        an array of floats is a simple matter of:
          Eigen::Map<VectorXf> m = VectorXf::map(bitmap,size);
          m = m.cwisePow(gamma);
  - apidoc improvements, reorganization of the \name's
  - remove obsolete examples
  - remove EIGEN_ALWAYS_INLINE on lazyProduct(), it seems useless.
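For comparison, the gamma-correction example above amounts to the following loop when written without Eigen. This is a minimal sketch using only the standard library; the function name `gamma_correct` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Raises each of the `size` floats in `bitmap` to the power `gamma`, in place.
// This is the coefficient-wise operation that m.cwisePow(gamma) expresses.
void gamma_correct(float* bitmap, std::size_t size, float gamma) {
    assert(bitmap != nullptr || size == 0);
    for (std::size_t i = 0; i < size; ++i)
        bitmap[i] = std::pow(bitmap[i], gamma);
}
```

The mapped-vector form in the commit message expresses the same loop as a single expression over the existing buffer, without copying the pixels.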