| Commit message (Collapse) | Author | Age |
|
|
|
| |
to compile (duplicate symbols).
|
| |
|
| |
|
|
|
|
|
|
|
| |
and various cleaning in Altivec code. Altivec vectorization have been re-enabled
in CoreDeclaration
* added copy constructors in non empty functors because I observed weird behavior with
std::complex<>
|
| |
|
|
|
|
|
|
|
|
| |
very small fixed size matrices)
* bugfix in Dot unroller
* added special random generator for the unit tests and reduced the tolerance threshold by an order of magnitude
this fixes issues with sum.cpp but other tests still failed sometimes, this have to be carefully checked...
|
| |
|
|
|
|
| |
and bug fix in PacketMath/SSE
|
|
|
|
| |
* fix warning in SolveTriangular
|
|
|
|
|
| |
This allows much faster code dealing with unligned as
in the updated matrix-vector product functions.
|
|
|
|
|
|
|
|
| |
* faster matrix-matrix and matrix-vector products (especially for not aligned cases)
* faster tridiagonalization (make it using our matrix-vector impl.)
Others:
* fix Flags of Map
* split the test_product to two smaller ones
|
|
|
|
|
|
| |
Documentation:
* add an overview for each module.
* add an example for .all() and Cwise::operator<
|
|
|
|
|
|
|
|
|
|
| |
- conflicts with operator * overloads
- discard the use of ei_pdiv for interger
(g++ handles operators on __m128* types, this is why it worked)
- weird behavior of icc in fixed size Block() constructor complaining
the initializer of m_blockRows and m_blockCols were missing while
we are in fixed size (maybe this hide deeper problem since this is a
recent one, but icc gives only little feedback)
|
|
|
|
|
|
|
| |
* Improve the efficiency of matrix*vector in unaligned cases
* Trivial fixes in the destructors of MatrixStorage
* Removed the matrixNorm in test/product.cpp (twice faster and
that assumed the matrix product was ok while checking that !!)
|
|
|
|
|
|
|
|
| |
* rework PacketMath and DummyPacketMath, make these actual template
specializations instead of just overriding by non-template inline
functions
* introduce ei_ploadt and ei_pstoret, make use of them in Map and Matrix
* remove Matrix::map() methods, use Map constructors instead.
|
|
|
|
| |
* add vdw benchmark from Tim's real-world use case
|
|
|
|
|
|
|
|
|
| |
packet access, it is not certain that it will bring a performance
improvement: benchmarking needed.
* improve logic choosing slice vectorization.
* fix typo in SSE packet math, causing crash in unaligned case.
* fix bug in Product, causing crash in unaligned case.
* add TEST_SSE3 CMake option.
|
|
|
|
| |
-finline-limit=1000 to gcc to get good performance. By the way some cleanup.
|
|
rename the noarch PacketMath.h to DummyPacketMath.h
|