Commit message | Author | Age | ||
---|---|---|---|---|
... | ||||
| * | Similar to cset 3589a9c115a892ea3ca5dac74d71a1526764cb38, also in 2px4 kernel: actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment | 2015-03-16 | ||
| | | ||||
| * | fix bug in maxsize calculation, which would cause products of size > 2048 to address the lookup table out of bounds | 2015-03-16 | ||
| | | ||||
| * | Update Nexus 5 lookup table, now combining 2 runs of the benchmark using the analyze-blocking-sizes partition tool. Gives better worst-case performance. | 2015-03-16 | ||
| | | ||||
| * | fix compilation with GCC 4.8 | 2015-03-16 | ||
| | | ||||
| * | Fix bug in case where EIGEN_TEST_SPECIFIC_BLOCKING_SIZE is defined but false | 2015-03-15 | ||
| | | ||||
| * | Provide an empirical lookup table for blocking sizes measured on a Nexus 5. Only for float, only for Android on ARM 32bit for now. | 2015-03-15 | ||
| | | ||||
| * | actual_panel_rows computation should always be resilient to parameters not consistent with the known L1 cache size, see comment | 2015-03-15 | ||
| | | ||||
| * | Fix an unused-var warning | 2015-03-15 | ||
| | | ||||
| * | Refactor computeProductBlockingSizes to make room for the possibility of using lookup tables | 2015-03-15 | ||
| | | ||||
| * | Organize our default cache sizes a little, and use a saner default L1 outside of x86 (10% faster on Nexus 5) | 2015-03-13 | ||
| | | ||||
| * | bug #973, improve AVX support by enabling vectorization of Vector4i-like types, and enforcing alignment of Vector4f/Vector2d-like types to preserve compatibility with SSE and future Eigen versions that will vectorize them with AVX enabled. | 2015-03-13 | ||
| | | ||||
| * | Fix internal::random(x,y) for integer types. The previous implementation could return y+1. The new implementation uses rejection sampling to get unbiased behavior. | 2015-03-13 | ||
| | | ||||
| * | bug #949: add static assertion for incompatible scalar types in dense end-user decompositions. | 2015-03-13 | ||
| | | ||||
| * | SparseMatrix::insert: switch to a fully uncompressed mode if sequential insertion is not possible (otherwise an arbitrarily large amount of memory was preallocated in some cases) | 2015-03-13 | ||
| | | ||||
| * | Bound pre-allocation to the maximal size representable by StorageIndex and throw bad_alloc if that's not possible. | 2015-03-13 | ||
| | | ||||
| * | Add missing coeff/coeffRef members to Block<sparse>, and extend unit tests. | 2015-03-13 | ||
| | | ||||
| * | Fix compilation of iterative solvers with dense matrices | 2015-03-09 | ||
| | | ||||
| * | Add typedefs for return types of SparseMatrixBase::selfadjointView | 2015-03-09 | ||
| | | ||||
| * | Add unit tests for CG and sparse-LLT for long int as storage-index | 2015-03-09 | ||
| | | ||||
| * | bug #963: make IncompleteLUT compatible with non-default storage index types. | 2015-03-09 | ||
| | | ||||
| * | Avoid underflow when blocking sizes are tuned manually. | 2015-03-06 | ||
| | | ||||
| * | bug #969: work around ambiguous calls to Ref using enable_if. | 2015-03-06 | ||
| | | ||||
| * | bug #978: early return for vanishing products | 2015-03-06 | ||
| | | ||||
| * | Improve blocking heuristic: if the lhs fits within L1, then block on the rhs in L1 (allows keeping the packed rhs in L1) | 2015-03-06 | ||
| | | ||||
| * | Improve product kernel: replace the previous dynamic loop-swapping strategy by a more general one: it consists of increasing the actual number of rows of lhs's micro horizontal panel for small depth such that L1 cache is fully exploited. | 2015-03-06 | ||
| | | ||||
| * | Rename LSCG to LeastSquaresConjugateGradient | 2015-03-05 | ||
| | | ||||
| * | Product optimization: implement a dynamic loop-swapping strategy to improve memory accesses to the destination matrix in the case of K-rank-update like products, i.e., for products of the kind: "large x small" * "small x large" | 2015-03-05 | ||
| | | ||||
| * | bug #824: improve accuracy of Quaternion::angularDistance using atan2 instead of acos. | 2015-03-04 | ||
| | | ||||
* | | Fixed the optimized AVX implementation of the fast rsqrt function | 2015-03-02 | ||
| | | ||||
* | | Added an optimized version of rsqrt for SSE and AVX that is used when EIGEN_FAST_MATH is defined. | 2015-03-02 | ||
| | | ||||
* | | Improved the default implementation of prsqrt | 2015-02-28 | ||
| | | ||||
* | | Pulled latest updates from trunk | 2015-02-27 | ||
|\ \ | ||||
* | | | Added support for 32bit index on a per tensor/tensor expression. This enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary. | 2015-02-27 | ||
| | | | ||||
* | | | Switch to truncated casting when converting floating point types to integer. This ensures that vectorized casts are consistent with scalar casts | 2015-02-27 | ||
| | | | ||||
* | | | Added support for vectorized type casting of tensors | 2015-02-27 | ||
| | | | ||||
* | | | Added support for fast reciprocal square root computation. | 2015-02-26 | ||
| | | | ||||
| | * | Really use zero guess in ConjugateGradients::solve as documented and expected for consistency with other methods. | 2015-02-18 | ||
| | | | ||||
| | * | merge | 2015-03-04 | ||
| | |\ | ||||
| | * | | Check for no-reallocation in SparseMatrix::insert (bug #974) | 2015-03-04 | ||
| | | | | ||||
| | * | | Improve efficiency of SparseMatrix::insert/coeffRef for sequential outer-index insertion strategies (bug #974) | 2015-03-04 | ||
| | | | | ||||
| | * | | Add a CG-based solver for rectangular least-squares problems (bug #975). | 2015-03-04 | ||
| | | | | ||||
| | | * | Fix asm comments in 1px1 kernel | 2015-03-03 | ||
| | | | | ||||
| | | * | Add a benchmark-default-sizes action to benchmark-blocking-sizes.cpp | 2015-03-03 | ||
| | | | | ||||
| | | * | New scoring functor to select the pivot. This can be useful for non-floating point scalars, where choosing the biggest element is generally not the best choice. | 2015-03-03 | ||
| | | | | ||||
| | | * | must also disable complex<double> when disabling double vectorization | 2015-03-03 | ||
| | |/ | ||||
| | * | Work around an ICE in Clang 3.5 in the iOS toolchain with double NEON intrinsics. | 2015-03-03 | ||
| | | | ||||
| | * | HalfPacket also needed to be disabled for double, on ARMv8. | 2015-03-02 | ||
| | | | ||||
| | * | Add SSE vectorization of Quaternion::conjugate. Significant speed-up when combined with products like q1*q2.conjugate() | 2015-03-02 | ||
| | | | ||||
| | * | Increase unit-test L1 cache size to ensure we are doing at least 2 peeled loops within product kernel. | 2015-02-27 | ||
| | | | ||||
| | * | Re-enable detection of min/max parentheses protection, and re-enable mpreal_support unit test. | 2015-02-27 | ||
|/ | ||||