| Commit message | Author | Age |
| |
|
|
|
|
|
|
|
|
|
| |
- rename EvalBeforeAssignBit to MayAliasBit
- make .lazy() remove the MayAliasBit only, and mark it as deprecated
- add a NoAlias pseudo expression, and MatrixBase::noalias() function
Todo:
- we have to decide whether += and -= should assume no aliasing by default
- once we agree on the API: update the Sparse module and the unit tests accordingly.
|
| |
|
| |
|
|
|
|
| |
we have to do runtime checks and we don't unroll, so it's only good for large enough sizes
|
|
|
|
| |
ei_assign_traits are printed
|
| |
|
| |
|
|
|
|
| |
it never made very precise sense. but now does it still make any?
|
| |
|
|
|
|
|
|
| |
ensures they can't be bypassed (e.g.
until now it was possible to bypass the static assert on sizes)
|
|
|
|
|
|
|
| |
types
* fix issues in Product revealed by this test
* in Dot.h forbid mixing of different types (at least for now, might allow real.dot(complex) in the future).
|
|
|
|
|
|
| |
* use _mm_malloc/_mm_free on platforms other than Linux or MSVC (e.g., Cygwin, OS X)
* replace a lot of inline keywords by EIGEN_STRONG_INLINE to compensate for
poor MSVC inlining
|
|
|
|
|
| |
* fix some "unused variable" warnings in the tests; there remains a libstdc++ "deprecated"
warning which I haven't looked much into
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- in matrix-matrix product, static assert on the two scalar types to be the same.
- Similarly in CwiseBinaryOp. POTENTIALLY CONTROVERSIAL: we no longer allow binary
ops to take two different scalar types. The functors that we defined take two args
of the same type anyway; also, we still allow the return type to be different.
Again, the reason is that different scalar types are incompatible with vectorization.
Better to have users realize explicitly what mixing different numeric types costs
in terms of performance.
See comment in CwiseBinaryOp constructor.
- This allowed fixing a small mistake in test/regression.cpp, which mixed float and double
- Remove redundant semicolon (;) after static asserts
|
| |
|
|
|
|
|
| |
Anyway: LinearVectorization+CompleteUnrolling actually uses the InnerVectorization
unrollers, so these two cases could be merged into a single one...
|
|
|
|
| |
* keep going on the doc: added a short geometry tutorial
|
| |
|
|
|
|
|
| |
* Bug fixes in Euler angle snippet, Assign and MapBase
* Started a "quick start guide" (draft state)
|
|
|
|
|
|
| |
IRC).
extended the documentation of the triangular solver.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- added a MapBase base xpr on top of which Map and the specialization
of Block are implemented
- MapBase forces both aligned loads (and aligned stores, see below) in expressions
such as "x.block(...) += other_expr"
* Significant vectorization improvement:
- added an AlignedBit flag meaning the first coeff/packet is aligned,
which avoids generating extra code to deal with the first unaligned part
- removed all unaligned stores when no unrolling
- removed unaligned loads in Sum when the input has the DirectAccessBit flag
* Some code simplification in CacheFriendly product
* Some minor documentation improvements
|
|
|
|
|
| |
Assign, in preparation for new Swap impl reusing Assign code.
remove last remnant of old Inverse class in Transform.
|
|
|
|
|
|
|
|
| |
- added explicit enum to int conversion where needed
- if a function is not defined as declared and the return type is "tricky"
then the type must be typedefined somewhere. A "tricky return type" can be:
* a template class with a default parameter which depends on another template parameter
* a nested template class, or type of a nested template class
|
|
|
|
|
|
| |
and vector * row-major products. Currently, it is enabled only if the matrix
has the DirectAccessBit flag and the product is "large enough".
Added the respective unit tests in test/product.cpp.
|
|
|
|
| |
Cwise.
|
|
|
|
|
| |
* add comment in Product.h about CanVectorizeInner
* fix typo in test/product.cpp
|
|
|
|
|
|
|
| |
* added complete implementation of sparse matrix product
(with a little glue in Eigen/Core)
* added an exhaustive bench of sparse products including GMM++ and MTL4
=> Eigen outperforms in all transposed/density configurations !
|
|
|
|
|
|
|
|
| |
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* introduce packet(int), make use of it in linear vectorized paths
--> completely fixes the slowdown noticed in benchVecAdd.
* generalize coeff(int) to linear-access xprs
* clarify the access flag bits
* rework api dox in Coeffs.h and util/Constants.h
* improve certain expressions' flags, allowing more vectorization
* fix bug in Block: start(int) and end(int) returned dyn*dyn size
* fix bug in Block: just because the Eval type has packet access
doesn't imply the block xpr should have it too.
|
| |
|
|
|
|
|
|
| |
(could come back to redux after it has been vectorized,
and could serve as a starting point for that)
also make the abs2 functor vectorizable (for real types).
|
|
|
|
|
|
|
|
|
| |
packet access, it is not certain that it will bring a performance
improvement: benchmarking needed.
* improve logic choosing slice vectorization.
* fix typo in SSE packet math, causing crash in unaligned case.
* fix bug in Product, causing crash in unaligned case.
* add TEST_SSE3 CMake option.
|
|
|
|
| |
enums to int is enough to get compile time constants with ICC.
|
|
|
|
|
|
| |
* make Matrix2f (and similar) vectorized using linear path
* fix a couple of warnings and compilation issues with ICC and gcc 3.3/3.4
(cannot get Transform to compile with gcc 3.3/3.4, see the FIXME)
|
|
|
|
|
|
|
|
|
| |
now have the Like1D flag.
* Big renaming:
packetCoeff ---> packet
VectorizableBit ---> PacketAccessBit
Like1DArrayBit ---> LinearAccessBit
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
** Much better organization
** Fix a few bugs
** Add the ability to unroll only the inner loop
** Add an unrolled path to the Like1D vectorization. Not well tested.
** Add placeholder for sliced vectorization. Unimplemented.
* Rework of corrected_flags:
** improve rules determining vectorizability
** for vectors, the storage-order is indifferent, so we tweak it
to allow vectorization of row-vectors.
* fix compilation in benchmark, and a warning in Transpose.
|
|
|
|
|
| |
* fix a couple of compilation issues when unrolling is disabled
* reduce default unrolling limit to a more reasonable value
|
|
|
|
| |
(see notes in Core/util/StaticAssert.h for details)
|
|
|
|
|
| |
as it speeds up compilation.
* fix minor typo introduced in the previous commit
|
|
|
|
|
|
| |
in MatrixBase work
* removed product_selector and cleaned Product.h a bit
* cleaned Assign.h a bit
|
|
|
|
|
| |
* bugfix in Assign and cache friendly product (weird that it worked before)
* improved argument evaluation in Product
|
|
|
|
|
|
|
|
|
|
| |
Triangular class
- full meta-unrolling in Part
- move inverseProduct() to MatrixBase
- compilation fix in ProductWIP: introduce a meta-selector to only do
direct access on types that support it.
- phase out the old Product, remove the WIP_DIRTY stuff.
- misc renaming and fixes
|
|
|
|
|
|
| |
* Fix a mistake in CwiseNullary.
* Added a CoreDeclarions header that contains only the forward declarations
and related basic stuff.
|
|
|
|
| |
-finline-limit=1000 to gcc to get good performance. By the way some cleanup.
|
|
|
|
|
|
|
|
| |
* Fix compilation of Inverse.h with vectorisation
* Introduce EIGEN_GNUC_AT_LEAST(x,y) macro doing future-proof (e.g. gcc v5.0) check
* Only use ProductWIP if vectorisation is enabled
* rename EIGEN_ALWAYS_INLINE -> EIGEN_INLINE with fall-back to inline keyword
* some cleanup/indentation
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Introduce a new highly optimized matrix-matrix product for large
matrices. The code is still highly experimental and it is activated
only if you define EIGEN_WIP_PRODUCT at compile time.
Currently the third dimension of the product must be a multiple of
the packet size (a multiple of 4 for floats) and the right-hand side matrix
must be column major.
Moreover, currently c = a*b; actually computes c += a*b!
Therefore, the code is provided for experimentation purposes only!
These limitations will be fixed sooner or later, so that this can become
the default product implementation.
|