| Commit message | Author | Age |
|
|
|
|
|
|
|
|
| |
functions
are provided to handle unsupported types seamlessly.
Added a generic nullary expression with nullary functors. They replace
Zero, Ones, Identity and Random.
|
| |
|
|
|
|
|
|
|
|
|
| |
- let Inverse take template parameter MatrixType instead
of ExpressionType, in order to reduce executable code size
when taking inverses of xpr's.
- introduce ei_corrected_matrix_flags : the flags template
parameter to the Matrix class is only a suggestion. This
is also useful in ei_eval.
|
|
|
|
|
|
|
| |
(only 30 muls for size 4)
- rework the matrix inversion: now using cofactor technique for size<=3,
so the ugly unrolling is now used only for size 4, and even there
I'm looking to get rid of it.
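For illustration only, the cofactor technique for a 3x3 inverse can be sketched as follows. This is a standalone sketch, not the code from this commit: the `Mat3` alias and the functions `cofactor3` and `inverse3` are hypothetical names, and the exact zero test on the determinant is a simplification.

```cpp
#include <array>
#include <cmath>

// Hypothetical 3x3 matrix type for this sketch (row-major, m[row][col]).
using Mat3 = std::array<std::array<double, 3>, 3>;

// Cofactor C(i,j). Using cyclic indices makes the (-1)^(i+j) sign implicit.
inline double cofactor3(const Mat3& m, int i, int j) {
    const int i1 = (i + 1) % 3, i2 = (i + 2) % 3;
    const int j1 = (j + 1) % 3, j2 = (j + 2) % 3;
    return m[i1][j1] * m[i2][j2] - m[i1][j2] * m[i2][j1];
}

// Writes the inverse of m into out and returns true, or returns false if the
// determinant is exactly zero (a real implementation would use a tolerance).
inline bool inverse3(const Mat3& m, Mat3& out) {
    // Laplace expansion of the determinant along the first row.
    const double det = m[0][0] * cofactor3(m, 0, 0)
                     + m[0][1] * cofactor3(m, 0, 1)
                     + m[0][2] * cofactor3(m, 0, 2);
    if (det == 0.0) return false;
    // inverse = adjugate / det, where the adjugate is the transposed cofactor matrix.
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            out[j][i] = cofactor3(m, i, j) / det;
    return true;
}
```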
|
|
|
|
| |
even if the xpr itself wasn't vectorizable.
|
|
|
|
|
|
| |
fully optimized.
* Even if LargeBit is set, only parallelize for large enough objects
(controlled by EIGEN_PARALLELIZATION_TRESHOLD).
|
|
|
|
|
|
|
| |
* Use them to write an unrolled path in echelon.cpp, as an
experiment before I do this LU module.
* For floating-point types, make ei_random() use an amplitude
of 1.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
using a macro and _Pragma.
- use OpenMP also in cacheOptimalProduct and in the
vectorized paths as well
- kill the vector assignment unroller. implement in
operator= the logic for assigning a row-vector in
a col-vector.
- CMakeLists support for building tests/examples
with -fopenmp and/or -msse2
- updates in bench/, especially replace identity()
by ones() which prevents underflows from perturbing
bench results.
|
| |
|
|
|
|
|
| |
added the possibility to disable the vectorization using EIGEN_DONT_VECTORIZE
(some architectures have SSE support by default)
|
|
|
|
|
|
|
|
| |
* add -pedantic to CXXFLAGS
* clean up intricate expressions with && and ||
which gave warnings because of "missing" parentheses
* fix compile error in NumTraits, apparently discovered
by -pedantic
|
|
|
|
| |
some more renaming
|
|
|
|
|
| |
* rename OperatorEquals -> Assign
* move Util.h and FwDecl.h to a util/ subdir
|
|
|
|
| |
of the VectorizableBit flag, now benchmark.cpp is properly vectorized
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently only the following platform/operations are supported:
- SSE2 compatible architecture
- compiler compatible with Intel's SSE2 intrinsics
- float, double and int data types
- fixed-size matrices with a storage major dimension multiple of 4 (or 2 for double)
- scalar-matrix product, component-wise: +,-,*,min,max
- matrix-matrix product only if the left matrix is vectorizable and column major
or the right matrix is vectorizable and row major, e.g.:
a.transpose() * b is not vectorized with the default column major storage.
To use it you must define EIGEN_VECTORIZE and EIGEN_INTEL_PLATFORM.
|
|
|
|
| |
seems appropriate to me.
|
|
|
|
|
|
|
| |
- make use of CoeffReadCost to determine when to unroll the loops,
for now only in Product.h and in OperatorEquals.h
performance remains the same: generally still not as good as before the
big changes.
|
| |
|
|
|
|
|
|
| |
ei_xpr_copy to evaluate args when needed. Had to introduce an ugly
trick with ei_unref as when the XprCopy type is a reference one can't
directly access member typedefs such as Scalar.
|
|
|
|
|
|
|
|
|
|
|
| |
in ei_xpr_copy and operator=, respectively.
* added Matrix::lazyAssign() when EvalBeforeAssigningBit must be skipped
(mainly internal use only)
* all expressions are now stored by const reference
* added Temporary xpr: .temporary() must be called on any temporary expression
not directly returned by a function (mainly internal use only)
* moved all functors in the Functors.h header
* added some preliminary stuff for the explicit vectorization
|
|
|
|
|
|
|
| |
* added "all" and "any" special redux operators
* added support for bool matrices
* added support for cost model of STL functors via ei_functor_traits
(By default ei_functor_traits queries the functor member Cost)
|
|
|
|
|
|
|
|
|
|
| |
useless copies are made when evaluating nested expressions.
Changes:
- kill LazyBit, introduce EvalBeforeNestingBit and EvalBeforeAssigningBit
- product and random don't evaluate immediately anymore
- eval() always evaluates
- change the value of Dynamic to some large positive value,
in preparation of future simplifications
|
|
|
|
|
|
|
|
| |
before the Product<> type is constructed. This resets template depth on each
intermediate evaluation, and gives simpler code. Introducing
ei_eval_if_expensive<Derived, n> which evaluates Derived if it's worth it
given that each of its coeffs will be accessed n times. Operator*
uses this with adequate values of n to evaluate args exactly when needed.
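The decision described above can be pictured as a compile-time cost comparison. The sketch below is an assumption-laden illustration, not the actual ei_eval_if_expensive: the names `eval_if_expensive_sketch`, `PlainMatrix` and `SumExpr`, the per-type `CoeffReadCost` values, and the cost formula itself are all hypothetical.

```cpp
#include <type_traits>

// Hypothetical types: each exposes the cost of reading one coefficient.
struct PlainMatrix { static constexpr int CoeffReadCost = 1; };  // stored data
struct SumExpr     { static constexpr int CoeffReadCost = 3; };  // lazy a+b, assumed cost

// Decide at compile time whether Xpr should be evaluated into a temporary,
// given that each of its coefficients will be read N times.
template <typename Xpr, int N>
struct eval_if_expensive_sketch {
    // Reading every coefficient N times straight from the expression.
    static constexpr int cost_lazy = N * Xpr::CoeffReadCost;
    // Evaluating once (one expression read plus one store, assumed cost 1),
    // then reading the plain temporary N times at cost 1 each.
    static constexpr int cost_eval = Xpr::CoeffReadCost + 1 + N;
    static constexpr bool evaluate = cost_eval < cost_lazy;
    // Nest a temporary when evaluation pays off, else keep a reference.
    using type = typename std::conditional<evaluate, PlainMatrix, const Xpr&>::type;
};
```

Under these assumed costs, a sum expression read four times is evaluated first, while the same expression read once is left lazy.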
|
| |
|
|
|
|
|
|
|
|
| |
when to evaluate arguments and when to meta-unroll.
-use it in Product to determine when to eval args. not yet used
to determine when to unroll. for now, not used anywhere else but
that'll follow.
-fix badness of my last commit
|
|
|
|
|
|
|
| |
to preserve SVN history). They are made useless by the new
ei_eval_unless_lazy.
- introduce a generic Eval member typedef so one can do e.g.
T t; U u; Product<T, U>::Eval m; m = t*u;
|
| |
|
|
|
|
|
|
| |
-- currently 3 flags: RowMajor, Lazy and Large
-- only RowMajor actually used for now
* many minor improvements
|
|
|
|
|
|
| |
* add Eigen:: in some macros to allow using them from outside
of namespace Eigen
Problems and solutions communicated by Gael.
|
|
|
|
|
|
|
|
|
| |
the cacheOptimal is only good for large enough matrices.
When taking a block in a fixed-size (hence small) matrix,
the SizeAtCompileTime is Dynamic hence that's not a good
indicator. This example shows that the good indicator is
MaxSizeAtCompileTime.
Result: +10% speed in echelon.cpp
|
|
|
|
|
|
|
| |
* macro renaming: EIGEN_NDEBUG becomes EIGEN_NO_DEBUG
as this is much better (and similar to Qt) and
EIGEN_CUSTOM_ASSERT becomes EIGEN_USE_CUSTOM_ASSERT
* protect Core header by a EIGEN_CORE_H
|
|
|
|
|
|
|
|
|
|
|
|
| |
to disable eigen's asserts without disabling one's own program's
asserts. Notice that Eigen code should now use ei_assert()
instead of assert().
* Remove findBiggestCoeff() as it's now almost redundant.
* Improve echelon.cpp: inner for loop replaced by xprs.
* remove useless "(*this)." here and there. I think they were
first introduced by automatic search&replace.
* fix compilation in Visitor.h (issue triggered by echelon.cpp)
* improve comment on swap().
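The library-private assertion mentioned in this entry can be sketched along these lines. The macro names `MYLIB_NO_DEBUG` and `mylib_assert` and the function `checked_div` are hypothetical stand-ins, chosen so as not to guess at the exact definitions used in Eigen at this revision.

```cpp
#include <cassert>

// Hypothetical switch: defining MYLIB_NO_DEBUG silences the library's own
// checks, while the program's plain assert() stays governed by NDEBUG.
#ifdef MYLIB_NO_DEBUG
#define mylib_assert(x) ((void)0)
#else
#define mylib_assert(x) assert(x)
#endif

// Library code uses mylib_assert() instead of assert().
inline int checked_div(int a, int b) {
    mylib_assert(b != 0);
    return a / b;
}
```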
|
|
|
|
|
|
| |
* added cache efficient matrix-matrix product.
- provides a huge speed-up for large matrices.
- currently it is enabled when an explicit unrolling is not possible.
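As a rough picture of what a cache-efficient product does, here is a generic blocked (tiled) matrix multiplication. It is only a sketch of the general technique, not the algorithm added in this commit; the function name `blocked_product`, the flat row-major storage and the default block size of 32 are assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// c = a * b for n x n row-major matrices stored in flat vectors.
// Work proceeds tile by tile so that each tile stays in cache while reused.
inline void blocked_product(const std::vector<double>& a,
                            const std::vector<double>& b,
                            std::vector<double>& c,
                            std::size_t n, std::size_t block = 32) {
    c.assign(n * n, 0.0);
    for (std::size_t ii = 0; ii < n; ii += block)
        for (std::size_t kk = 0; kk < n; kk += block)
            for (std::size_t jj = 0; jj < n; jj += block) {
                const std::size_t iEnd = std::min(ii + block, n);
                const std::size_t kEnd = std::min(kk + block, n);
                const std::size_t jEnd = std::min(jj + block, n);
                for (std::size_t i = ii; i < iEnd; ++i)
                    for (std::size_t k = kk; k < kEnd; ++k) {
                        const double aik = a[i * n + k];
                        // The innermost loop walks rows of b and c contiguously.
                        for (std::size_t j = jj; j < jEnd; ++j)
                            c[i * n + j] += aik * b[k * n + j];
                    }
            }
}
```

For small matrices the tiling buys nothing, which is consistent with the entry's remark that the speed-up concerns large matrices.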
|
|
|
|
|
|
|
|
|
|
|
| |
(the global funcs in MathFunctions.h and Fuzzy.h don't count as internal).
* Mainpage.dox. Add a few prospective Eigen users; change the recommended
-finline-limit from 10000 to 1000. The reason is: it could be harmful to have
too large a value here, couldn't it? (e.g. exceedingly large executables, cache
misses). Looking at gcc, a value of 900 would exactly mean "determine the inlining
of all functions as if they were marked with 'inline' keyword". So a value of
1000 seems a reasonable round number. In the benchmark that motivated this
(TestEigenSolvers) a value of 400 is enough on my system.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
as well as partial redux (vertical or horizontal redux).
Includes shortcuts for: sum, minCoeff and maxCoeff.
There is no shortcut for the partial redux.
* Added a generic *visitor* mini framework. A visitor is a custom object
sequentially applied on each coefficient with knowledge of its value and
coordinates.
It is currently used to implement minCoeff(int*,int*) and maxCoeff(int*,int*).
findBiggestCoeff is now a shortcut for "this->cwiseAbs().maxCoeff(i,j)"
* Added coeff-wise min and max.
* fixed an issue with ei_pow(int,int) and gcc < 4.3 or ICC
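The visitor idea from this entry, an object applied to each coefficient together with its coordinates, might look roughly like the sketch below. The types `DenseMatrix` and `MaxCoeffVisitor` and the functions `visit` and `max_coeff` are hypothetical and self-contained; they do not reproduce the interface in Visitor.h.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row-major dense matrix, assumed non-empty.
struct DenseMatrix {
    int rows, cols;
    std::vector<double> data;
    double operator()(int i, int j) const {
        return data[static_cast<std::size_t>(i) * cols + j];
    }
};

// Apply the visitor to every coefficient in order, passing value and coordinates.
template <typename Visitor>
void visit(const DenseMatrix& m, Visitor& v) {
    v.init(m(0, 0), 0, 0);
    for (int i = 0; i < m.rows; ++i)
        for (int j = 0; j < m.cols; ++j)
            if (i != 0 || j != 0) v(m(i, j), i, j);
}

// Visitor that tracks the largest coefficient and where it was first seen.
struct MaxCoeffVisitor {
    double value; int row, col;
    void init(double x, int i, int j) { value = x; row = i; col = j; }
    void operator()(double x, int i, int j) {
        if (x > value) { value = x; row = i; col = j; }
    }
};

// Analogue of maxCoeff(int*, int*): returns the maximum and reports its position.
inline double max_coeff(const DenseMatrix& m, int* row, int* col) {
    MaxCoeffVisitor v;
    visit(m, v);
    *row = v.row;
    *col = v.col;
    return v.value;
}
```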
|
|
|
|
|
| |
- reimplement trace() as just diagonal().sum()
- apidoc fixes
|
|
|
|
|
|
|
|
|
|
|
| |
- add cwiseExp(), cwiseLog()...
--> for example, doing a gamma-correction on a bitmap image stored as
an array of floats is a simple matter of:
Eigen::Map<VectorXf> m = VectorXf::map(bitmap,size);
m = m.cwisePow(gamma);
- apidoc improvements, reorganization of the \name's
- remove obsolete examples
- remove EIGEN_ALWAYS_INLINE on lazyProduct(), it seems useless.
|
| |
|
|
|
|
|
|
|
|
| |
internal classes: AaBb -> ei_aa_bb
IntAtRunTimeIfDynamic -> ei_int_if_dynamic
unify UNROLLING_LIMIT (there was no reason to have operator= use
a higher limit)
etc...
|
| |
|
|
|
|
|
|
| |
Finally the importing macro is named EIGEN_BASIC_PUBLIC_INTERFACE
because it does not only import the ei_traits, it also makes the base class
a friend, etc.
|
|
|
|
|
|
| |
template parameter "Scalar" is removed. This is achieved by introducing a
template <typename Derived> struct Scalar to achieve a forward-declaration of
the Scalar typedefs.
|
|
|
|
|
| |
internally uses OpenMP if enabled at compile time.
* added a bench/ folder with a couple benchmarks and benchmark tools.
|
|
|
|
|
|
| |
Matrix3i mat; Vector2i vec(33,66);
mat << vec.transpose(), 99,
vec, Matrix2i::random();
|
|
|
|
|
|
|
|
|
|
| |
If the number of coefficients does not match the matrix size, then an assertion is raised.
No support for xpr on the right side for the moment.
* Added support for assertion checking. This allows testing that an assertion is indeed raised
when it should be.
* Fixed a mistake in the CwiseUnary example.
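The comma-initializer behaviour described in this entry, including the assertion on a coefficient-count mismatch, can be approximated with a small helper object returned by operator<<. This is a hedged sketch with hypothetical names (`IntMatrix`, `CommaInit`), restricted to scalar coefficients, and it relies on the helper temporary not being copied on return (guaranteed from C++17, and elided by common compilers before that).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical row-major integer matrix for this sketch.
class IntMatrix {
public:
    IntMatrix(int rows, int cols)
        : cols_(cols), data_(static_cast<std::size_t>(rows) * cols, 0) {}

    int operator()(int i, int j) const {
        return data_[static_cast<std::size_t>(i) * cols_ + j];
    }

    // Helper that receives coefficients one by one through operator,.
    class CommaInit {
    public:
        CommaInit(IntMatrix& m, int first) : m_(m), count_(0) { push(first); }
        CommaInit& operator,(int v) { push(v); return *this; }
        // At the end of the full expression, check that the matrix was filled.
        ~CommaInit() { assert(count_ == m_.data_.size() && "too few coefficients"); }
    private:
        void push(int v) {
            assert(count_ < m_.data_.size() && "too many coefficients");
            m_.data_[count_++] = v;
        }
        IntMatrix& m_;
        std::size_t count_;
    };

    // "m << a, b, c, d" parses as "((((m << a), b), c), d)".
    CommaInit operator<<(int first) { return CommaInit(*this, first); }

private:
    int cols_;
    std::vector<int> data_;
};
```

With this sketch, `IntMatrix m(2, 2); m << 1, 2, 3, 4;` fills the matrix row by row.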
|
|
|
|
|
|
| |
- compatible with current STL's functors as well as with the extension proposal (TR1)
* thanks to the above, Cast and ScalarMultiple have been removed
* benchmark_suite is more flexible (compiler and matrix size)
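To see why functor-based coefficient-wise operations make dedicated Cast and ScalarMultiple classes unnecessary, consider this minimal sketch. It works on `std::vector<double>` rather than on Eigen expressions, and `cwise_unary`, `ScalarMultipleOp` and `CastOp` are hypothetical names.

```cpp
#include <functional>
#include <vector>

// Apply any unary functor to each coefficient; the result's element type
// is whatever the functor returns.
template <typename UnaryOp>
auto cwise_unary(const std::vector<double>& v, UnaryOp op)
    -> std::vector<decltype(op(0.0))> {
    std::vector<decltype(op(0.0))> out;
    out.reserve(v.size());
    for (double x : v) out.push_back(op(x));
    return out;
}

// Multiplication by a scalar is just one more functor.
struct ScalarMultipleOp {
    double factor;
    double operator()(double x) const { return x * factor; }
};

// Casting is another one.
template <typename To>
struct CastOp {
    To operator()(double x) const { return static_cast<To>(x); }
};
```

STL functors such as `std::negate<double>` plug into the same function without any adapter.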
|
|
|
|
|
|
| |
were always instantiated.
* the unrolling limits are configurable at compile time.
|
| |
|