path: root/Eigen/src
Commit message [Author, Age]
...
| * Similar to cset 3589a9c115a892ea3ca5dac74d71a1526764cb38, also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment [Benoit Jacob, 2015-03-16]
| * fix bug in maxsize calculation, which would cause products of size > 2048 to address the lookup table out of bounds [Benoit Jacob, 2015-03-16]
| * Update Nexus 5 lookup table from combining now 2 runs of the benchmark, using the analyze-blocking-sizes partition tool. Gives better worst-case performance. [Benoit Jacob, 2015-03-16]
| * fix compilation with GCC 4.8 [Benoit Jacob, 2015-03-16]
| |
| * Fix bug in case where EIGEN_TEST_SPECIFIC_BLOCKING_SIZE is defined but false [Benoit Jacob, 2015-03-15]
| |
| * Provide an empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now. [Benoit Jacob, 2015-03-15]
| * actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment [Benoit Jacob, 2015-03-15]
| * Fix an unused-var warning [Benoit Jacob, 2015-03-15]
| |
| * Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables [Benoit Jacob, 2015-03-15]
| * organize a little our default cache sizes, and use a saner default L1 outside of x86 (10% faster on Nexus 5) [Benoit Jacob, 2015-03-13]
| * bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled. [Gael Guennebaud, 2015-03-13]
| * Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get an unbiased behavior. [Gael Guennebaud, 2015-03-13]
| * bug #949: add static assertion for incompatible scalar types in dense end-user decompositions. [Gael Guennebaud, 2015-03-13]
| * SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases) [Gael Guennebaud, 2015-03-13]
| * Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible. [Gael Guennebaud, 2015-03-13]
| * Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests. [Gael Guennebaud, 2015-03-13]
| |
| * Fix compilation of iterative solvers with dense matrices [Gael Guennebaud, 2015-03-09]
| |
| * Add typedefs for return types of SparseMatrixBase::selfadjointView [Gael Guennebaud, 2015-03-09]
| |
| * Add unit tests for CG and sparse-LLT for long int as storage-index [Gael Guennebaud, 2015-03-09]
| |
| * bug #963: make IncompleteLUT compatible with non-default storage index types. [Gael Guennebaud, 2015-03-09]
| |
| * Avoid underflow when blocking sizes are tuned manually. [Gael Guennebaud, 2015-03-06]
| |
| * bug #969: work around ambiguous calls to Ref using enable_if. [Gael Guennebaud, 2015-03-06]
| |
| * bug #978: early return for vanishing products [Gael Guennebaud, 2015-03-06]
| |
| * Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1) [Gael Guennebaud, 2015-03-06]
| * Improve product kernel: replace the previous dynamic loop-swapping strategy with a more general one: it consists of increasing the actual number of rows of the lhs's micro horizontal panel for small depth such that the L1 cache is fully exploited. [Gael Guennebaud, 2015-03-06]
| * Rename LSCG to LeastSquaresConjugateGradient [Gael Guennebaud, 2015-03-05]
| |
| * Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large" [Gael Guennebaud, 2015-03-05]
| * bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos. [Gael Guennebaud, 2015-03-04]
* | Fixed the optimized AVX implementation of the fast rsqrt function [Benoit Steiner, 2015-03-02]
| |
* | Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined. [Benoit Steiner, 2015-03-02]
* | Improved the default implementation of prsqrt [Benoit Steiner, 2015-02-28]
| |
* | Pulled latest updates from trunk [Benoit Steiner, 2015-02-27]
|\ \
* | | Added support for 32bit index on a per-tensor/tensor-expression basis. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary. [Benoit Steiner, 2015-02-27]
* | | Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts [Benoit Steiner, 2015-02-27]
* | | Added support for vectorized type casting of tensors [Benoit Steiner, 2015-02-27]
| | |
* | | Added support for fast reciprocal square root computation. [Benoit Steiner, 2015-02-26]
| | |
| | * Really use zero guess in ConjugateGradients::solve as documented and expected for consistency with other methods. [Jan Blechta, 2015-02-18]
| | * merge [Gael Guennebaud, 2015-03-04]
| | |\
| | * | Check for no-reallocation in SparseMatrix::insert (bug #974) [Gael Guennebaud, 2015-03-04]
| | | |
| | * | Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974) [Gael Guennebaud, 2015-03-04]
| | * | Add a CG-based solver for rectangular least-squares problems (bug #975). [Gael Guennebaud, 2015-03-04]
| | | |
| | | * Fix asm comments in 1px1 kernel [Benoit Jacob, 2015-03-03]
| | | |
| | | * Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp [Benoit Jacob, 2015-03-03]
| | | |
| | | * New scoring functor to select the pivot. This can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice. [Marc Glisse, 2015-03-03]
| | | * must also disable complex<double> when disabling double vectorization [Benoit Jacob, 2015-03-03]
| | |/
| | * Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics. [Benoit Jacob, 2015-03-03]
| | * HalfPacket also needed to be disabled for double, on ARMv8. [Benoit Jacob, 2015-03-02]
| | |
| | * Add SSE vectorization of Quaternion::conjugate. Significant speed-up when combined with products like q1*q2.conjugate() [Gael Guennebaud, 2015-03-02]
| | * Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loops within the product kernel. [Gael Guennebaud, 2015-02-27]
| | * Re-enable detection of min/max parentheses protection, and re-enable mpreal_support unit test. [Gael Guennebaud, 2015-02-27]
| |/