path: root/Eigen/src/Core/products
Commit message [Author, Age]
...
| * Remove the rotating kernel. It was only useful on some ARM CPUs (Qualcomm Krait) that are not as ubiquitous today as they were when I introduced it. [Benoit Jacob, 2016-05-24]
| |
| * Don't optimize the processing of the last rows of a matrix matrix product in cases that violate the assumptions made by the optimized code path. [Benoit Steiner, 2016-05-23]
| |
* | Pulled latest updates from upstream [Benoit Steiner, 2016-04-29]
|\|
| * Made the index type a template parameter to evaluateProductBlockingSizes [Benoit Steiner, 2016-04-27]
| |     Use numext::mini and numext::maxi instead of std::min/std::max to compute blocking sizes.
| * Deleted extraneous comma. [Benoit Steiner, 2016-04-15]
| |
| * Improved the matrix multiplication blocking in the case where mr is not a power of 2 (e.g. on Haswell CPUs). [Benoit Steiner, 2016-04-15]
| |
| * Fix trmv for mixing types. [Gael Guennebaud, 2016-04-15]
| |
| * Added ability to access the cache sizes from the tensor devices [Benoit Steiner, 2016-04-14]
| |
| * Work around a division by zero when outerstride==0 [Gael Guennebaud, 2016-04-13]
| |
* | Pull latest updates from upstream [Benoit Steiner, 2016-04-11]
|\|
| * Cleanup obsolete assign_scalar_eig2mkl helper. [Gael Guennebaud, 2016-04-11]
| |
| * Remove all references to MKL in BLAS wrappers. [Gael Guennebaud, 2016-04-11]
| |
| * Fix long to int conversion in BLAS API. [Gael Guennebaud, 2016-04-11]
| |
| * Silence unused warning. [Gael Guennebaud, 2016-04-11]
| |
| * Relax dependency on MKL for EIGEN_USE_BLAS [Gael Guennebaud, 2016-04-11]
| |
| * Removed executable bit from header files [Benoit Steiner, 2016-03-23]
| |
| * bug #1161: fix division by zero for huge scalar types [Gael Guennebaud, 2016-02-03]
| |
* | Updated the matrix multiplication code to make it compile with AVX512 enabled. [Benoit Steiner, 2016-02-01]
| |
| * Fix tri = complex * real product, and add respective unit test. [Gael Guennebaud, 2016-01-27]
| |
| * Remove dead code. [Gael Guennebaud, 2016-01-26]
| |
| * Re-enable blocking on rows in non-l3 blocking mode. [Gael Guennebaud, 2016-01-26]
| |
| * Make sure that micro-panel-size is smaller than blocking sizes (otherwise we might get a buffer overflow) [Gael Guennebaud, 2016-01-26]
| |
| * Make sure that block sizes are smaller than input matrix sizes. [Gael Guennebaud, 2016-01-26]
| |
| * bug #51: add block preallocation mechanism to selfadjoint*matrix product. [Gael Guennebaud, 2016-01-25]
| |
| * bug #51: make general_matrix_matrix_triangular_product use L3-blocking helper so that general symmetric rank-updates and general-matrix-to-triangular products do not trigger dynamic memory allocation for fixed size matrices. [Gael Guennebaud, 2016-01-25]
| |
| * bug #1151: remove useless critical section [Gael Guennebaud, 2016-01-21]
| |
* | Disabled part of the matrix matrix peeling code that's incompatible with 512 bit registers [Benoit Steiner, 2015-12-21]
| |
| * Fix compilation of MKL support. [Gael Guennebaud, 2015-12-11]
|/
* Fixes internal compiler error while compiling with VC2015 Update1 x64. [Nikolay Fedorov, 2015-12-03]
|
* Fix degenerate cases in syrk and trsm [Gael Guennebaud, 2015-11-30]
|
* Use a class constructor to initialize CPU cache sizes [Chris Jones, 2015-11-20]
|     Using a static instance of a class to initialize the values for the CPU cache sizes guarantees thread-safe initialization of the values when using C++11. Therefore, under C++11 it is no longer necessary to call Eigen::initParallel() before calling any Eigen functions on different threads.
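The mechanism this commit relies on is the C++11 rule that a function-local static object is initialized exactly once, even if several threads reach the declaration at the same time. A minimal sketch of that pattern follows; `CacheSizesSketch`, `cache_sizes()`, and the byte values are hypothetical illustrations, not the names or defaults Eigen actually uses.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for a cache-size holder. Names and values are
// illustrative assumptions, not Eigen's real ones.
struct CacheSizesSketch {
  std::ptrdiff_t l1, l2, l3;

  CacheSizesSketch() : l1(32 * 1024), l2(256 * 1024), l3(2 * 1024 * 1024) {
    // Record each construction so callers can check it happened only once.
    ++construction_count();
  }

  static std::atomic<int>& construction_count() {
    static std::atomic<int> n{0};
    return n;
  }
};

// Accessor built around a function-local static. Under C++11 the constructor
// runs once, on first use, and concurrent first calls block until it is done.
const CacheSizesSketch& cache_sizes() {
  static CacheSizesSketch instance;
  return instance;
}
```

Any thread can call `cache_sizes()` without a prior setup call; before C++11 the same code would need an explicit one-time initialization step on the main thread.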
* Avoid any openmp calls if multi-threading is explicitly disabled at runtime. [Gael Guennebaud, 2015-10-22]
|
* Improve numerical accuracy in LLT and triangular solve by using true scalar divisions (instead of x * (1/y)) [Gael Guennebaud, 2015-10-18]
|
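The accuracy difference comes from rounding: `x / y` is rounded once, whereas `x * (1/y)` rounds the reciprocal and then rounds the product again. The sketch below illustrates the effect on plain doubles; the function names are hypothetical and this is not the code path touched by the commit.

```cpp
// Single correctly rounded operation.
double divide_direct(double x, double y) { return x / y; }

// Two roundings: one for the reciprocal, one for the product.
double divide_via_reciprocal(double x, double y) {
  const double inv = 1.0 / y;
  return x * inv;
}

// Counts the integers n in [1, limit] for which n * (1/n) differs from n / n.
// Assumes IEEE 754 double arithmetic without fast-math style optimizations.
int count_reciprocal_mismatches(int limit) {
  int mismatches = 0;
  for (int n = 1; n <= limit; ++n) {
    const double y = static_cast<double>(n);
    if (divide_via_reciprocal(y, y) != divide_direct(y, y)) {
      ++mismatches;
    }
  }
  return mismatches;
}
```

A commonly cited case is n = 49, where the reciprocal form does not round back to exactly 1.0. The reciprocal form is usually faster when many values are divided by the same y, so the commit trades some speed for accuracy.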
* Remove dead code in selfadjoint_matrix_vector_product [Gael Guennebaud, 2015-10-09]
|
* Optimize a bit complex selfadjoint * vector product. [Gael Guennebaud, 2015-10-09]
|
* bug #1043: Avoid integer conversion sign warning [Christoph Hertzberg, 2015-08-19]
|
* Generalize first_aligned to take the requested alignment as a template parameter, and add a first_default_aligned variant calling first_aligned with the requirement of the largest packet for the given scalar type. [Gael Guennebaud, 2015-08-06]
|
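To make the idea of an alignment template parameter concrete, here is a standalone sketch that returns the index of the first element of an array whose address is a multiple of the requested alignment. The name `first_aligned_sketch` and its exact signature are assumptions for illustration; the real Eigen helper may differ in interface and edge-case handling.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Returns the index of the first element of `array` aligned on `Alignment`
// bytes, or `size` if no element in the range qualifies.
// Hypothetical sketch, not Eigen's implementation.
template <std::size_t Alignment, typename Scalar>
std::ptrdiff_t first_aligned_sketch(const Scalar* array, std::ptrdiff_t size) {
  static_assert((Alignment & (Alignment - 1)) == 0,
                "Alignment must be a power of two");
  static_assert(Alignment % sizeof(Scalar) == 0 || sizeof(Scalar) % Alignment == 0,
                "Alignment and sizeof(Scalar) must divide one another");

  const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(array);
  if (addr % sizeof(Scalar) != 0) {
    // The array is not even aligned on its scalar size, so stepping by whole
    // elements can never reach an aligned address.
    return size;
  }
  const std::size_t misalignment = addr % Alignment;
  if (misalignment == 0) {
    return 0;
  }
  // Number of whole elements to skip to reach the next aligned address.
  const std::ptrdiff_t offset =
      static_cast<std::ptrdiff_t>((Alignment - misalignment) / sizeof(Scalar));
  return std::min(offset, size);
}
```

A "default aligned" variant in this spirit would simply forward to the function above with the alignment of the widest SIMD packet available for `Scalar`.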
* Enable runtime stack alignment in gemm_blocking_space. [Gael Guennebaud, 2015-08-06]
|
* bug #973: update macro-level control of alignment by introducing user-controllable EIGEN_MAX_ALIGN_BYTES and EIGEN_MAX_STATIC_ALIGN_BYTES macros. This changeset also removes EIGEN_ALIGN (replaced by EIGEN_MAX_ALIGN_BYTES>0), EIGEN_ALIGN_STATICALLY (replaced by EIGEN_MAX_STATIC_ALIGN_BYTES>0), EIGEN_USER_ALIGN*, EIGEN_ALIGN_DEFAULT (replaced by EIGEN_ALIGN_MAX). [Gael Guennebaud, 2015-07-29]
|
* bug #923: fix EIGEN_USE_BLAS mode [Gael Guennebaud, 2015-06-23]
|
* Remove a few deprecated internal expressions [Gael Guennebaud, 2015-06-19]
|
* Fix shadow warnings triggered by clang [Gael Guennebaud, 2015-06-09]
|
* Abandon blocking size lookup table approach. Not performing as well in real world as in microbenchmark. [Benoit Jacob, 2015-05-19]
|
* Improved the blocking strategy to speedup multithreaded tensor contractions. [Benoit Steiner, 2015-04-09]
|
* add a note on bug #992 [Gael Guennebaud, 2015-04-08]
|
* bug #992: don't select a 3p GEMM path with non-vectorizable scalar types, this hits unsupported paths in symm/triangular products code [Benoit Jacob, 2015-04-07]
|
* Only use blocking sizes LUTs for single-thread products for now [Benoit Jacob, 2015-03-31]
|
* Fix computeProductBlockingSizes with m==0, and add respective unit test. [Gael Guennebaud, 2015-03-31]
|
* use unsigned short instead of uint16_t which doesn't exist in c++98 [Benoit Jacob, 2015-03-17]
|
* Similar to cset 3589a9c115a892ea3ca5dac74d71a1526764cb38, also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment [Benoit Jacob, 2015-03-16]
|