Commit message (Author, Age)
* improve block-size heuristic (Gael Guennebaud, 2010-07-20)
|
* fix openmp version (Gael Guennebaud, 2010-07-20)
|
* fix declaration of pack_lhs in trsm (Gael Guennebaud, 2010-07-20)
|
* uncomment commented code for debug (Gael Guennebaud, 2010-07-20)
|
* report a true assert when not checking for an assertion (Gael Guennebaud, 2010-07-20)
|
* it appears only the "on the left" case was tested (Gael Guennebaud, 2010-07-20)
|
* fix trmm and symm wrt lhs packing (Gael Guennebaud, 2010-07-20)
|
* fix compilation by including file in correct order (Gael Guennebaud, 2010-07-19)
|
* * fix SelfCwiseBinaryOp traits and handling of mixed types (Gael Guennebaud, 2010-07-19)
|   * improve compilation error in case of type mismatch
* explicitely disable vectorization for mixed coeff based products (Gael Guennebaud, 2010-07-19)
|
* fix lhs packing in the case of real * complex products (Gael Guennebaud, 2010-07-19)
|
* port Jacobi to new ei_pset1/ei_pload API (Gael Guennebaud, 2010-07-19)
|
* * fix compilation of mixed scalar product (Gael Guennebaud, 2010-07-19)
|   * optimize mixed scalar products
* * fix a couple of remaining issues with previous commit, (Gael Guennebaud, 2010-07-19)
|   * merge ei_product_blocking_traits into ei_gepb_traits
* * _mm_loaddup_pd is slow (Gael Guennebaud, 2010-07-19)
|   * optimize SSE ei_ploaddup<Packet4f>
* wip: extend the gebp kernel to optimize complex and mixed products (Gael Guennebaud, 2010-07-19)
|
* update mixing type test (Gael Guennebaud, 2010-07-15)
|
* update unit test for new API (Gael Guennebaud, 2010-07-15)
|
* add support for mixing type in trsv (Gael Guennebaud, 2010-07-13)
|
* optimize non fused MADD, and add a flatten attribute macro to enforce inlining within a function (Gael Guennebaud, 2010-07-13)
|
* matrix product: move the alpha factor to gebp instead of the packing, clean some temporaries, etc. (Gael Guennebaud, 2010-07-12)
|
* mixing types step 3: (Gael Guennebaud, 2010-07-11)
|   - improve support of colmajor by vector and matrix - matrix
|   - now all configurations are well handled, but the perf are not always very good
* make colmaj * vector uses pointers only (Gael Guennebaud, 2010-07-11)
|
* mixing types in product step 2: (Gael Guennebaud, 2010-07-11)
|   * pload* and pset1 are now templated on the packet type
|   * gemv routines are now embeded into a structure with a consistent API with respect to gemm
|   * some configurations of vector * matrix and matrix * matrix works fine, some need more work...
* sync (Gael Guennebaud, 2010-07-10)
|\
| * * generalize rowmajor by vector (Gael Guennebaud, 2010-07-10)
| |   * fix weird compilation error when constructing a matrix with a row by matrix product
| * fix compilation: make the check_coordinates* functions const (Gael Guennebaud, 2010-07-10)
| |
| * let ei_pset1 use _mm_loaddup_pd. Not a significant speed improvement, but also not a speed regression, and replaces 3 instructions by 1 single instruction. (Benoit Jacob, 2010-07-09)
| |
| * Added NEON/Complex.h, ~3.5x faster than scalar std::complex<float> (Konstantinos Margaritis, 2010-07-10)
| |   minor fix in AltiVec Complex.h
| * disable MSVC optimization when the underlying compiler is ICC (Gael Guennebaud, 2010-07-09)
| |
| * move ei_conj_if to a more appropriate file (Gael Guennebaud, 2010-07-09)
| |
| * forgot to commit ei_p4f_FORWARD; (Konstantinos Margaritis, 2010-07-09)
| |
| * forgot to add the Complex.h include for AltiVec. (Konstantinos Margaritis, 2010-07-09)
| |
| * Altivec port of Complex.h. (Konstantinos Margaritis, 2010-07-09)
| |   Note: For some reason g++ 4.4 is >200% slower than g++ 4.3 on altivec code.
| |   The same benchmark (bench_gemm) was tested, on the same hardware/OS (G4/Debian testing), with same CFLAGS.
| |   With some code reorganizing I managed to get some minor gain on 4.4, but I just could not reach 4.3 speed.
| |   This is most likely a bug, but I'm waiting to see if it's fixed on 4.5. I'll look into this a bit more.
| * Be consistent in how the tutorial pages link together. (Jitse Niesen, 2010-07-09)
| |
| * Small changes to tutorial page 2 (matrix arithmetic): (Jitse Niesen, 2010-07-09)
| |   * slightly more extensive discussion of aliasing
| |   * layout: put example code and output side-by-side
| |   * add some links, etc
* | fix a few weird issues with gcc 4.3 32bits and complex<float> (Gael Guennebaud, 2010-07-09)
| |
| * bench: use of Eigen/Array is deprecated + fix includes for iostream (Thomas Capricelli, 2010-07-09)
| |
* | fix SliceVectorizedTraversal for packetsize==1 (Gael Guennebaud, 2010-07-08)
| |
* | extend vectorization_logic (Gael Guennebaud, 2010-07-08)
| |
| * Added more redux types/examples in tutorial and fixed some display issues (Carlos Becker, 2010-07-08)
| |
| * Reductions/Broadcasting/Visitor Tutorial added to index (Carlos Becker, 2010-07-08)
| |
| * Reductions/Broadcasting/Visitor Tutorial added (Carlos Becker, 2010-07-08)
| |
* | scalars fitting in a single packet requires more work, step 1 (Gael Guennebaud, 2010-07-08)
| |   * add a, Alignable trait
| |   * update LinearVectorization assignment
* | compilation fix (Gael Guennebaud, 2010-07-08)
| |
| * enabling aligned loads/store for complex<double> is much more tricky, so the temporary fix is to always perform unaligned load/store (Gael Guennebaud, 2010-07-07)
| |
* | an attempt to fix wrong unaligned store (Gael Guennebaud, 2010-07-07)
| |
* | update to support mixin types (Gael Guennebaud, 2010-07-07)
| |
* | support for real * complex matrix product - step 1 (works for some special cases) (Gael Guennebaud, 2010-07-07)
| |
| * mention that array = matrix is fine too (Gael Guennebaud, 2010-07-07)
|/