Graph | Commit message | Age | |
---|---|---|---|
* | bug #1543: improve linear indexing for general block expressions | 2018-07-10 | |
| | |||
* | Introduce the macro ei_declare_local_nested_eval to help allocate local temporaries on the stack via alloca, and let outer products make good use of it. If successful, we should use it everywhere nested_eval is used to declare local dense temporaries. | 2018-07-09 | |
| | |||
* | bug #1567: add optimized path for tensor broadcasting and 'Channel First' shape | 2018-07-09 | |
| | |||
* | Skip null numerators in triangular-vector-solve (as in BLAS TRSV). | 2018-07-09 | |
| | |||
* | Fix legitimate "declaration shadows a typedef" warning | 2018-07-09 | |
| | |||
* | Fix the Packet16h version of ptranspose | 2018-06-16 | |
| | The AVX512 version of ptranspose for PacketBlock<Packet16h,16> was reordering the PacketBlock argument incorrectly. This led to errors in the multiplication of matrices composed of 16-bit floats on AVX512 machines if at least one of the matrices was using RowMajor order. This error is responsible for one TensorFlow unit test failure on AVX512 machines: //tensorflow/python/kernel_tests:batch_matmul_op_test | ||
* | Fix a few issues with Packet16h | 2018-07-07 | |
| | |||
* | complete implementation of Packet16h (AVX512) | 2018-07-06 | |
| | |||
* | palign is not used anymore, so let's relax the unit test | 2018-07-06 | |
| | |||
* | test product kernel with half-floats. | 2018-07-06 | |
| | |||
* | Complete Packet8h implementation and test it in packetmath unit test | 2018-07-06 | |
| | |||
* | Add unit tests for inverse and selfadjoint-eigenvalues on CUDA | 2018-07-06 | |
| | |||
* | Extend CUDA support to matrix inversion and selfadjointeigensolver | 2018-06-11 | |
| | |||
* | bug #1565: help MSVC to generate not-too-bad ASM in reductions. | 2018-07-05 | |
| | |||
* | Implement custom inplace triangular product to avoid a temporary | 2018-07-03 | |
| | |||
* | Make is_same_dense compatible with different scalar types. | 2018-07-03 | |
| | |||
* | Activate dgmres unit test | 2018-07-02 | |
| | |||
* | Fix regression in changeset f05dea6b2326836e5e0243fbaffbece84b833d64: computeFromHessenberg can take any expression for matrixQ, not only a HouseholderSequence. | 2018-07-02 | |
| | |||
* | Simplify redux_evaluator using inheritance, and properly rename parameters in reducers. | 2018-07-02 | |
| | |||
* | bug #1562: optimize evaluation of small products of the form s*A*B by rewriting them as: s*(A.lazyProduct(B)) to save a costly temporary. Measured speedup from 2x to 5x... | 2018-07-02 | |
| | |||
* | Fix unit test | 2018-07-01 | |
| | |||
* | update comment | 2018-06-29 | |
| | |||
* | Merged in net147/eigen (pull request PR-411) | 2018-06-28 | |
|\ | Use std::complex constructor instead of assignment from scalar | ||
* | | Fix order of EIGEN_DEVICE_FUNC and return type | 2018-06-28 | |
| | | |||
| * | Use std::complex constructor instead of assignment from scalar | 2018-06-28 | |
|/ | Fixes the GCC "conversion to non-scalar type requested" compile error when using boost::multiprecision::cpp_dec_float_50 as the scalar type. | ||
* | First step towards a generic vectorised quaternion product | 2018-06-25 | |
| | |||
* | bug #1560: fix product with a 1x1 diagonal matrix | 2018-06-25 | |
| | |||
* | merge | 2018-06-22 | |
|\ | |||
* | | Fix typo in pbend for AltiVec. | 2018-06-22 | |
| | | |||
| * | Merged in rmlarsen/eigen2 (pull request PR-409) | 2018-06-21 | |
| |\ | |/ |/| | | | Fix oversharding bug in parallelFor. | ||
| * | bug #1555: compilation fix with XLC | 2018-06-21 | |
| | | |||
* | | Fix oversharding bug in parallelFor. | 2018-06-20 | |
|/ | |||
* | fix md5sum of lapack_addons | 2018-06-15 | |
| | |||
* | Merged in mfigurnov/eigen/gamma-der-a (pull request PR-403) | 2018-06-11 | |
|\ | Derivative of the incomplete Gamma function and the sample of a Gamma random variable. Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com> | ||
* | | bug #1531: add dedicated unit tests for NumDimensions | 2018-06-08 | |
| | | |||
* | | bug #1531: expose NumDimensions for solve and sparse expressions. | 2018-06-08 | |
| | | |||
* | | bug #1531: expose NumDimensions for compatibility with Tensor | 2018-06-08 | |
| | | |||
* | | bug #1550: prevent avoidable memory allocation in RealSchur | 2018-06-08 | |
| | | |||
* | | fix prototype | 2018-06-08 | |
| | | |||
* | | Fix the way matrix folder is passed to the tests. | 2018-06-08 | |
| | | |||
* | | Don't use std::equal_to inside cuda kernels since it's not supported. | 2018-06-07 | |
| | | |||
* | | Missing line during manual rebase of PR-374 | 2018-06-07 | |
| | | |||
| * | Merge from eigen/eigen | 2018-06-07 | |
| |\ | |||
| * | | Updated the stopping criteria in igammac_cf_impl. | 2018-06-07 | |
| | | Previously, when computing the derivative, it used a relative error threshold. Now it uses an absolute error threshold. The behavior for computing the value is unchanged. This makes more sense since we do not expect the derivative to often be close to zero. This change makes the derivatives about 30% faster across the board. The error for igamma_der_a is almost unchanged, while for gamma_sample_der_alpha it is a bit worse for float32 and unchanged for float64. | ||
| * | | Derivative of the incomplete Gamma function and the sample of a Gamma random variable. | 2018-06-06 | |
| | | In addition to igamma(a, x), this code implements: (1) igamma_der_a(a, x) = d igamma(a, x) / da, the derivative of igamma with respect to the parameter; (2) gamma_sample_der_alpha(alpha, sample), the reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter. The derivatives are computed by forward-mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode. | ||
* | | | Adding EIGEN_DEVICE_FUNC to Products, especially Dense2Dense Assignment specializations. Otherwise this causes problems with small fixed-size matrix multiplication (call to 0x00 in call_assignment_no_alias in debug mode, or a trap in release mode with CUDA 9.1). | 2018-03-14 | |
| |/ |/| | |||
* | | Merged in mfigurnov/eigen/fix-bessel (pull request PR-404) | 2018-06-07 | |
|\ \ | Fix compilation of special functions without C99 math. | ||
| * \ | Merge from eigen/eigen. | 2018-06-07 | |
| |\ \ | |/ / |/| | | |||
* | | | Fix some warnings in dox examples | 2018-06-07 | |
| | | | |||
* | | | Fix int versus Index | 2018-06-07 | |
| | | |