Commit message | Author | Age | |
---|---|---|---|
* | Fix some implicit literal to Scalar conversions in SparseCore | Gael Guennebaud | 2019-09-11 |
| | |||
* | bug #1741: fix SelfAdjointView::rankUpdate and product to triangular part ↵ | Gael Guennebaud | 2019-09-10 |
| | | | | for destination with non-trivial inner stride | ||
* | bug #1741: fix C.noalias() = A*C; with C.innerStride()!=1 | Gael Guennebaud | 2019-09-10 |
| | |||
* | Fix a circular dependency regarding pshift* functions and ↵ | Gael Guennebaud | 2019-09-06 |
| | | | | | | | GenericPacketMathFunctions. Another solution would have been to make pshift* fully generic template functions with partial specialization which is always a mess in c++03. | ||
* | Fix compilation without vector engine available (e.g., x86 with SSE disabled): | Gael Guennebaud | 2019-09-05 |
| | | | | -> ppolevl is required by ndtri even for the scalar path | ||
* | PR 621: Fix documentation of EIGEN_COMP_EMSCRIPTEN | David Tellenbach | 2019-03-21 |
| | |||
* | Fix doc issues regarding ndtri | Gael Guennebaud | 2019-09-04 |
| | |||
* | Fix possible warning regarding strict equality comparisons | Gael Guennebaud | 2019-09-04 |
| | |||
* | PR 681: Add ndtri function, the inverse of the normal distribution function. | Srinivas Vasudevan | 2019-08-12 |
| | |||
* | Change typedefs from private to protected to fix MSVC compilation | Eugene Zhulenev | 2019-09-03 |
| | |||
* | Allow move-only done callback in TensorAsyncDevice | Eugene Zhulenev | 2019-09-03 |
| | |||
* | Add test for const TensorMap underlying data mutation | Eugene Zhulenev | 2019-09-03 |
| | |||
* | TensorMap constness should not change underlying storage constness | Eugene Zhulenev | 2019-09-03 |
| | |||
* | Makes Scalar/RealScalar typedefs public in Pardiso's wrappers (see PR 688) | Gael Guennebaud | 2019-09-03 |
| | |||
* | Fixed Tensor documentation formatting. | Alberto Luaces | 2019-07-23 |
| | |||
* | More colamd cleanup: | Gael Guennebaud | 2019-09-03 |
| | | | | | | - Move colamd implementation in its own namespace to avoid polluting the internal namespace with Ok, Status, etc. - Fix signed/unsigned warning - move some ugly free functions as member functions | ||
* | Eigen_Colamd.h updated to replace constexpr with consts and enums. | Anshul Jaiswal | 2019-08-17 |
| | |||
* | Ordering.h edited to fix dependencies on Eigen_Colamd.h | Anshul Jaiswal | 2019-08-15 |
| | |||
* | Eigen_Colamd.h edited replacing macros with constexprs and functions. | Anshul Jaiswal | 2019-08-15 |
| | |||
* | Eigen_Colamd.h edited online with Bitbucket replacing constant #defines with ↵ | Anshul Jaiswal | 2019-07-21 |
| | | | | const definitions | ||
* | Updated Eigen_Colamd.h, namespacing macros ALIVE & DEAD as COLAMD_ALIVE & ↵ | Anshul Jaiswal | 2019-06-08 |
| | | | | | | COLAMD_DEAD to prevent conflicts with other libraries / code. | ||
* | Fix shadow warnings in TensorContractionThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Fix block mapper type name in TensorExecutor | Eugene Zhulenev | 2019-08-30 |
| | |||
* | evalSubExprsIfNeededAsync + async TensorContractionThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Revert accidentally removed <memory> header from ThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Asynchronous expression evaluation with TensorAsyncDevice | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Fix missing header inclusion and colliding definitions for half type ↵ | Rasmus Munk Larsen | 2019-08-30 |
| | | | | | | casting, which broke build with -march=native on Haswell/Skylake. | ||
* | Const correctness in TensorMap<const Tensor<T, ...>> expressions | Eugene Zhulenev | 2019-08-28 |
| | |||
* | Add more tests for corner cases of log1p and expm1. Add handling of infinite ↵ | Rasmus Munk Larsen | 2019-08-28 |
| | | | | arguments to log1p such that log1p(inf) = inf. | ||
* | Remove shadow warnings in TensorDeviceThreadPool | Eugene Zhulenev | 2019-08-28 |
| | |||
* | Revert changes to std_fallback::log1p that broke handling of arguments less ↵ | Rasmus Munk Larsen | 2019-08-27 |
| | | | | than -1. Fix packet op accordingly. | ||
* | Clean up float16 a.k.a. Eigen::half support in Eigen. Move the definition of ↵ | Rasmus Munk Larsen | 2019-08-27 |
| | | | | half to Core/arch/Default and move arch-specific packet ops to their respective sub-directories. | ||
* | Merged in ezhulenev/eigen-01 (pull request PR-683) | Rasmus Larsen | 2019-08-26 |
|\ | | | | | | | Asynchronous parallelFor in Eigen ThreadPoolDevice | ||
* | | Fix get_random_seed on Native Client | maratek | 2019-08-23 |
| | | | | | | | | | | Newlib in Native Client SDK does not provide ::random function. Implement get_random_seed for NaCl using ::rand, similarly to Windows version. | ||
| * | Asynchronous parallelFor in Eigen ThreadPoolDevice | Eugene Zhulenev | 2019-08-22 |
|/ | |||
* | Merged in jaopaulolc/eigen (pull request PR-679) | Christoph Hertzberg | 2019-08-22 |
|\ | | | | | | | Fixes for Altivec/VSX and compilation with clang on PowerPC | ||
* \ | Merged in rmlarsen/eigen (pull request PR-680) | Rasmus Larsen | 2019-08-22 |
|\ \ | | | | | | | | | | Implement vectorized versions of log1p and expm1 in Eigen using Kahan's formulas, and change the scalar implementations to properly handle infinite arguments. | ||
* | | | Remove XSMM support from Tensor module | Eugene Zhulenev | 2019-08-19 |
| | | | |||
| | * | Fix debug macros in p{load,store}u | João P. L. de Carvalho | 2019-08-14 |
| | | | |||
| | * | Add missing pcmp_XX methods for double/Packet2d | João P. L. de Carvalho | 2019-08-14 |
| | | | | | | | | | | | | This actually fixes an issue in unit-test packetmath_2 with pcmp_eq when it is compiled with clang. When pcmp_eq(Packet4f,Packet4f) is used instead of pcmp_eq(Packet2d,Packet2d), the unit-test does not pass due to NaN on ref vector. | ||
| * | | Implement vectorized versions of log1p and expm1 in Eigen using Kahan's ↵ | Rasmus Munk Larsen | 2019-08-12 |
|/ / | | | | | | | | | | | | | | | | | | | | | | | formulas, and change the scalar implementations to properly handle infinite arguments. Depending on instruction set, significant speedups are observed for the vectorized path: log1p wall time is reduced 60-93% (2.5x - 15x speedup); expm1 wall time is reduced 0-85% (1x - 7x speedup). The scalar path is slower by 20-30% due to the extra branch needed to handle +infinity correctly. Full benchmarks measured on Intel(R) Xeon(R) Gold 6154 here: https://bitbucket.org/snippets/rmlarsen/MXBkpM | ||
| * | Fix packed load/store for PowerPC's VSX | João P. L. de Carvalho | 2019-08-09 |
| | | | | | | | | | | | | | | | | The vec_vsx_ld/vec_vsx_st builtins were wrongly used for aligned load/store. In fact, they perform unaligned memory access and, even when the address is 16-byte aligned, they are much slower (at least 2x) than their aligned counterparts. For double/Packet2d, vec_xl/vec_xst should be preferred over vec_ld/vec_st, although the latter works when cast to float/Packet4f. Silences some weird warnings thrown by some GCC versions. Such warnings are not thrown by Clang. | ||
| * | Fix offset argument of ploadu/pstoreu for Altivec | João P. L. de Carvalho | 2019-08-09 |
| | | | | | | | | | | | | | | | | | | | | If no offset is given, then it should be zero. Also passes full address to vec_vsx_ld/st builtins. Removes useless _EIGEN_ALIGNED_PTR & _EIGEN_MASK_ALIGNMENT. Removes unnecessary casts. | ||
| * | bug #1718: Add cast to successfully compile with clang on PowerPC | João P. L. de Carvalho | 2019-08-09 |
|/ | | | | Ignoring -Wc11-extensions warnings thrown by clang at Altivec/PacketMath.h | ||
* | Fix bugs in log1p and expm1 where repeated using statements would clobber ↵ | Rasmus Munk Larsen | 2019-08-08 |
| | | | | | | each other. Add specializations for complex types since std::log1p and std::expm1 do not support complex. | ||
* | Guard against repeated definition of EIGEN_MPL2_ONLY | Rasmus Munk Larsen | 2019-08-07 |
| | |||
* | Disable tests for contraction with output kernels when using libxsmm, which ↵ | Rasmus Munk Larsen | 2019-08-07 |
| | | | | does not support this. | ||
* | [Eigen] Vectorize evaluation of coefficient-wise functions over tensor ↵ | Rasmus Munk Larsen | 2019-08-07 |
| | | | | | | | | | | | | blocks if the strides are known to be 1. Provides up to 20-25% speedup of the TF cross entropy op with AVX. A few benchmark numbers (name: old time/op, new time/op, delta): BM_Xent_16_10000_cpu: 448µs ± 3%, 389µs ± 2%, -13.21% (p=0.008 n=5+5); BM_Xent_32_10000_cpu: 575µs ± 6%, 454µs ± 3%, -21.00% (p=0.008 n=5+5); BM_Xent_64_10000_cpu: 933µs ± 4%, 712µs ± 1%, -23.71% (p=0.008 n=5+5) | ||
* | Clean up unnecessary namespace specifiers in TensorBlock.h. | Rasmus Munk Larsen | 2019-08-07 |
| | |||
* | Fix doc regarding alignment and c++17 | Gael Guennebaud | 2019-08-04 |
| |
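
Several entries in this log concern log1p: the 2019-08-12 commit implements it with Kahan's formula, and the 2019-08-27 and 2019-08-28 commits fix the handling of infinite arguments and of arguments less than -1. The scalar sketch below illustrates that formula together with the extra branch for +infinity. It is not Eigen's actual code: the function name `log1p_kahan` is hypothetical, and the choice of `double` is an assumption made for illustration.

```cpp
#include <cmath>

// Hypothetical scalar sketch of Kahan's formula for log(1 + x).
// Not taken from Eigen; names and types are illustrative assumptions.
double log1p_kahan(double x) {
  const double u = 1.0 + x;
  // If 1 + x rounds to exactly 1, then log(1 + x) ~= x to working precision.
  if (u == 1.0) return x;
  const double log_u = std::log(u);
  // u == log(u) holds only for u = +infinity; returning u here avoids
  // the inf / inf = NaN that the correction term would otherwise produce.
  if (u == log_u) return u;
  // Kahan's correction: rescale log(u) by x / (u - 1) to compensate for
  // the rounding error committed when forming u = 1 + x.
  return x * (log_u / (u - 1.0));
}
```

For x < -1 the sum u is negative, so `std::log(u)` yields NaN and the NaN propagates to the result, which matches the behaviour the 2019-08-27 commit describes restoring. The additional comparison against `log_u` is the kind of extra branch that the 2019-08-12 commit message says slows the scalar path.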