path: root/Eigen/src/Core/arch
Date, author: commit message
* 2009-11-04, Hauke Heibel: Let's try to stick to the original code, thus activate the fix of #62 only for 64 bit builds.
* 2009-11-04, Hauke Heibel: Direct access of the packet structs fixes bug #62 and does not seem to influence compiler optimization.
* 2009-10-05, Benoit Jacob: we were already aligning fixed-size objects whose size is a multiple of 16 bytes to a 16-byte boundary; now we also align fixed-size objects whose size is a multiple of 8 bytes to an 8-byte boundary. That's only useful for now for double, not e.g. for Vector2f, but that didn't seem to hurt. Am I missing something? Do you prefer that we don't align Vector2f at all? Also, improvements in test_unalignedassert.
* 2009-09-17, Gael Guennebaud: clean the commented asm instructions because now I'm sure the previous fix is ok
* 2009-09-17, Gael Guennebaud: fix #53: performance regression, hopefully I did not resurrect another perf. issue...
* 2009-08-09, Gael Guennebaud: make custom asm directive volatile
* 2009-08-07, Gael Guennebaud:
  * implement a second level of micro blocking (faster for small sizes)
  * work around GCC's bad implementation of _mm_set1_p*
* 2009-07-10, Gael Guennebaud: finally directly calling the low-level products is faster
* 2009-06-26, Benoit Jacob: only disable the inline ASM if we're NEITHER gcc nor icc. right ??
* 2009-06-24, Gael Guennebaud: re-enable the fast unaligned loads for gcc and icc using inline assembly (this avoids incompatible pointer casts and lets us specify the dependency on the data explicitly)
* 2009-06-23, Gael Guennebaud: use the slower unaligned load intrinsics in ei_ploadu because GCC messes up my tricks
* 2009-05-22, Benoit Jacob: remove sentence "Eigen itself is part of the KDE project." It never made very precise sense, but now does it still make any?
* 2009-05-06, Gael Guennebaud:
  * compilation fixes for gcc 3.3
  * test Part::swap
* 2009-05-04, Benoit Jacob: fix warnings with unused static functions
* 2009-04-22, Gael Guennebaud: make the ei_p* math functions overloads instead of template specializations
* 2009-04-09, Benoit Jacob: more patches from Hauke Heibel: compilation/warning fixes from VC++
* 2009-04-09, Gael Guennebaud: relicence Julien Pommier's SSE code to Eigen's licenses
* 2009-04-06, Benoit Jacob:
  * fix the binary bloat issue, Rohit's idea was the good one
  * a few dox fixes (alloc routines do return 0 on error) and forgot to update version number in CMakeLists
* 2009-03-27, Gael Guennebaud: add vectorization of sqrt for float
* 2009-03-25, Gael Guennebaud: for some reason passing the argument by const reference killed the perf (in the packet version of sin, cos, exp, log), so let's pass them by value. Also, improve the perf of ei_plog by reducing dependencies.
* 2009-03-25, Gael Guennebaud: add SSE2 versions of sin, cos, log, exp using code from Julien Pommier. They are for float only, and they return exactly the same result as the standard versions in about 90% of the cases. Otherwise the max error is below 1e-7. However, for very large values (>1e3) the accuracy of sin and cos slightly decreases. They are about 3 or 4 times faster than 4 calls to their respective standard versions. So, is it ok to enable them by default in their respective functors?
* 2009-03-20, Konstantinos A. Margaritis: ei_pnegate implemented for AltiVec
* 2009-03-20, Gael Guennebaud: add vectorization of unary operator-() (the AltiVec version is probably broken)
* 2009-03-09, Gael Guennebaud: add the vectorization of abs
* 2009-03-08, Gael Guennebaud: slight optimization of SSE base integer mul (thanks to Rohit Garg)
* 2009-03-03, Gael Guennebaud: add much faster versions of unaligned stores (and slightly faster unaligned loads)
* 2009-02-23, Laurent Montel: Add COMPONENT Devel
* 2009-02-12, Konstantinos A. Margaritis: no reason for 3 vec_mins, 2 are enough apparently in ei_predux_min
* 2009-02-12, Konstantinos A. Margaritis: modified ei_predux_min/max to actually use altivec instructions
* 2009-02-12, Gael Guennebaud:
  * exit Sum.h, exit Prod.h, welcome vectorization of redux() !
  * add vectorization for minCoeff and maxCoeff
* 2009-02-10, Gael Guennebaud: add ei_predux_mul for AltiVec
* 2009-02-10, Gael Guennebaud:
  * add ei_predux_mul internal function
  * apply Ricard Marxer's prod() patch with fixes for the vectorized path
* 2009-02-08, Konstantinos A. Margaritis: fixed preserve_mask definition for AltiVec (needed __vector keyword)
* 2009-02-06, Gael Guennebaud: add bench_reverse, draft of a reverse vectorization for AltiVec, make global Scaling function static
* 2009-02-06, Gael Guennebaud: Add vectorization of Reverse (was more tricky than I thought) and simplify the index based functions
* 2009-01-29, Gael Guennebaud: fix MSVC internal compilation error
* 2009-01-22, Benoit Jacob: fix a bunch of warnings (actual issues) reported by Frank
* 2008-12-19, Gael Guennebaud:
  * fix a vectorization issue in Product
  * use _mm_malloc/_mm_free on platforms other than Linux or MSVC (e.g., cygwin, OSX)
  * replace a lot of inline keywords by EIGEN_STRONG_INLINE to compensate for poor MSVC inlining
* 2008-12-16, Benoit Jacob: Hopefully fix compilation of SSE Packetmath with MSVC. The reason why we didn't realize until now that it didn't compile at all with MSVC is that before today the SSE2 detection didn't work with MSVC.
* 2008-08-27, Benoit Jacob: Missing inline keywords in AltiVec/PacketMath were making Avogadro fail to compile (duplicate symbols).
* 2008-08-27, Benoit Jacob: remove double ;
* 2008-08-26, Benoit Jacob: replace vector by __vector to prevent conflict with std::vector
* 2008-08-25, Gael Guennebaud:
  * patch from Konstantinos Margaritis: bugfix in Altivec version of ei_pdiv and various cleaning in Altivec code. Altivec vectorization has been re-enabled in CoreDeclaration
  * added copy constructors in non empty functors because I observed weird behavior with std::complex<>
* 2008-08-25, Benoit Jacob: Shut up two bogus gcc 4.3 warnings
* 2008-08-22, Gael Guennebaud:
  * bugfix in SolveTriangular found by Timothy Hunter (did not compile for very small fixed size matrices)
  * bugfix in Dot unroller
  * added special random generator for the unit tests and reduced the tolerance threshold by an order of magnitude; this fixes issues with sum.cpp but other tests still failed sometimes, this has to be carefully checked...
* 2008-08-22, Gael Guennebaud: patch from Konstantinos Margaritis: Altivec vectorization is resurrected !
* 2008-08-20, Gael Guennebaud: Add a packetmath unit test, re-enable the comma-initializer unit test, and bug fix in PacketMath/SSE
* 2008-08-19, Benoit Jacob:
  * fix bug found by Boudewijn Rempt: no CMakeLists in arch/ subdir
  * fix warning in SolveTriangular
* 2008-08-03, Gael Guennebaud: Added an ei_palign function to align a packet from two others. This allows much faster code dealing with unaligned data, as in the updated matrix-vector product functions.
* 2008-08-01, Gael Guennebaud:
  Optimizations:
  * faster matrix-matrix and matrix-vector products (especially for not aligned cases)
  * faster tridiagonalization (make it using our matrix-vector impl.)
  Others:
  * fix Flags of Map
  * split the test_product to two smaller ones
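
The ei_palign entry above describes building one packet out of two neighbouring ones. A minimal scalar sketch of that idea follows, assuming 4-lane float packets modeled as a `std::array`. The names `Packet4` and `palign_sketch` are hypothetical and are not Eigen's API; the log does not show the real signature, and an actual implementation would use SIMD shuffles rather than a loop.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a 4-lane SIMD packet of floats (assumption,
// not Eigen's packet type).
using Packet4 = std::array<float, 4>;

// Returns the 4 consecutive lanes that start at lane `Offset` in the
// concatenation [first | second]. Offset == 0 returns `first` unchanged.
template <std::size_t Offset>
Packet4 palign_sketch(const Packet4& first, const Packet4& second) {
    static_assert(Offset < 4, "offset must be smaller than the packet size");
    Packet4 out{};
    for (std::size_t i = 0; i < 4; ++i) {
        const std::size_t j = i + Offset;
        // Lanes past the end of `first` are taken from `second`.
        out[i] = (j < 4) ? first[j] : second[j - 4];
    }
    return out;
}
```

With two aligned loads of neighbouring packets, a call such as `palign_sketch<1>(a, b)` yields the packet that an unaligned load one element past the start of `a` would have produced, which is presumably why the log entry links it to faster handling of unaligned data.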