Commit message | Author | Age | |
---|---|---|---|
* | Fix compile issues for gcc 4.8. | Antonio Sanchez | 2021-07-01 |
| | - Move constructors can only be defaulted as NOEXCEPT if all members have NOEXCEPT move constructors. - gcc 4.8 has some funny parsing bug in `a < b->c`, thinking `b-` is a template parameter. | ||
* | Modify tensor argmin/argmax to always return first occurrence. | Antonio Sanchez | 2021-06-29 |
| | As written, depending on multithreading/GPU, the returned index from `argmin`/`argmax` is not currently stable. Here we modify the functors to always keep the first occurrence (i.e. if the value is equal to the current min/max, then keep the one with the smallest index). This is otherwise causing unpredictable results in some TF tests. | ||
* | Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN` and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions. | Rasmus Munk Larsen | 2020-10-13 |
| | |||
* | [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL. | Mehdi Goli | 2019-06-28 |
| | * Abstracting the pointer type so that both SYCL memory and pointer can be captured. * Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class. * Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node. * Adding SYCL macro for controlling loop unrolling. * Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes. | ||
* | Don't vectorize the MeanReducer unless pdiv is available. | Rasmus Munk Larsen | 2018-09-11 |
| | |||
* | Use numerically stable tree reduction in TensorReduction. | Rasmus Munk Larsen | 2018-09-11 |
| | |||
* | Move sigmoid functor to core. | Rasmus Munk Larsen | 2018-08-03 |
| | |||
* | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. | Mehdi Goli | 2018-08-01 |
| | |||
* | PR430: Convert count to the reducer type in MeanReducer | Eugene Zhulenev | 2018-07-19 |
| | Without explicit conversion TensorFlow fails to compile because pset1 template deduction fails: cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this) ->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}' in `return pdiv(vaccum, pset1<Packet>(packetCount_));`. Honestly I'm not sure why it works in Eigen tests, because the Eigen::half constructor is explicit, or why it stopped working in TF; I didn't find any relevant changes since the previous Eigen upgrade. `static_cast<T>(packetCount_)` breaks the cxx11_tensor_reductions test for Eigen::half, which is also quite surprising. | ||
* | bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. | Gael Guennebaud | 2018-07-19 |
| | |||
* | Rename clip2 to clamp. | Rasmus Munk Larsen | 2018-05-16 |
| | |||
* | Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow. | Rasmus Munk Larsen | 2018-05-16 |
| | |||
* | Add vectorized clip functor for Eigen Tensors. | Rasmus Munk Larsen | 2018-05-14 |
| | |||
* | Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer. | RJ Ryan | 2017-04-14 |
| | Improves support for std::complex types when compiling for CUDA. Expands on e2e9cdd16970914cf0a892fea5e7c4402b3ede41 and 2bda1b0d93fb627d0c500ec48b20302d44c32cb7. | ||
* | Deleted unnecessary semicolons | Benoit Steiner | 2016-11-18 |
| | |||
* | Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. | RJ Ryan | 2016-10-06 |
| | |||
* | Cleaned up the random number generation code. | Benoit Steiner | 2016-10-04 |
| | |||
* | Updated the tensor sum and mean reducer to enable them to process complex numbers on CUDA GPUs. | Benoit Steiner | 2016-09-28 |
| | |||
* | Made the gaussian generator usable on GPU | Benoit Steiner | 2016-09-22 |
| | |||
* | Fix order of "static inline". | Gael Guennebaud | 2016-09-16 |
| | |||
* | bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX) | Gael Guennebaud | 2016-09-08 |
| | |||
* | Fix CUDA build broken by changes to min and max reduction. | Rasmus Munk Larsen | 2016-09-02 |
| | |||
* | Adjust Tensor module wrt recent change in nullary functor | Gael Guennebaud | 2016-09-01 |
| | |||
* | Fix bugs to make min- and max reducers work correctly with IEEE infinities. | Rasmus Munk Larsen | 2016-08-31 |
| | |||
* | Simplified the code that dispatches vectorized reductions on GPU | Benoit Steiner | 2016-06-09 |
| | |||
* | Fixed definition of some of the reducer_traits | Benoit Steiner | 2016-06-09 |
| | |||
* | Improved support for vectorization of 16-bit floats | Benoit Steiner | 2016-06-09 |
| | |||
* | Advertize the packet api of the tensor reducers iff the corresponding packet primitives are available. | Benoit Steiner | 2016-05-18 |
| | |||
* | Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. | Rasmus Munk Larsen | 2016-05-17 |
| | |||
* | Improvements to parallelFor. | Rasmus Munk Larsen | 2016-05-12 |
| | | | | Move some scalar functors from TensorFunctors. to Eigen core. | ||
* | Worked around compilation errors with older versions of gcc | Benoit Steiner | 2016-05-11 |
| | |||
* | Added support for fp16 to the sigmoid functor. | Benoit Steiner | 2016-05-10 |
| | |||
* | Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. | Benoit Steiner | 2016-04-19 |
| | |||
* | Merge upstream updates. | Rasmus Munk Larsen | 2016-04-14 |
|\ | |||
* | | Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. | Rasmus Munk Larsen | 2016-04-14 |
| * | Added support for fp16 to the sigmoid function | Benoit Steiner | 2016-04-14 |
|/ | |||
* | Added support for fmod | Benoit Steiner | 2016-03-28 |
| | |||
* | Improved support for integer modulo | Benoit Steiner | 2016-03-25 |
| | |||
* | Avoid mutable class members when possible | Benoit Steiner | 2016-03-17 |
| | |||
* | Allocate the mersenne twister used by the random number generators on the heap instead of on the stack since they tend to keep a lot of state (i.e. about 5k) around. | Benoit Steiner | 2016-03-17 |
| | |||
* | Enable the random number generators when compiling with visual studio | Benoit Steiner | 2016-03-09 |
| | |||
* | Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly. | Benoit Steiner | 2016-03-07 |
| | |||
* | Properly vectorized the random number generators | Benoit Steiner | 2016-02-26 |
| | |||
* | Marked the And and Or reducers as stateless. | Benoit Steiner | 2016-02-24 |
| | |||
* | Added support for tensor reductions on half floats | Benoit Steiner | 2016-02-19 |
| | |||
* | Don't attempt to vectorize mean reductions of integers since we can't use SSE or AVX instructions to divide 2 integers. | Benoit Steiner | 2015-12-22 |
| | |||
* | Made it possible to use the sigmoid functor within a CUDA kernel. | Benoit Steiner | 2015-12-04 |
| | |||
* | Added more missing EIGEN_DEVICE_FUNC | Benoit Steiner | 2015-11-06 |
| | |||
* | Added support for modulo operation | Benoit Steiner | 2015-11-05 |
| | |||
* | Added support for boolean reductions (ie 'and' & 'or' reductions) | Benoit Steiner | 2015-10-20 |
| |
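
The 2021-06-29 entry describes making `argmin`/`argmax` keep the first occurrence when values tie. The sketch below illustrates that tie-breaking rule only; it is not Eigen's implementation. `IndexValue`, `argmin_combine`, and `first_argmin_reversed` are hypothetical names introduced for illustration, and NaN handling is deliberately left out.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (index, value) pair used only for this sketch.
using IndexValue = std::pair<std::ptrdiff_t, float>;

// Combine two candidates: prefer the smaller value and, when the values are
// equal, the smaller index. Because ties are broken by index, the result does
// not depend on the order in which candidates are combined.
inline IndexValue argmin_combine(const IndexValue& a, const IndexValue& b) {
  if (b.second < a.second) return b;
  if (b.second == a.second && b.first < a.first) return b;
  return a;
}

// Reduce from back to front on purpose, to mimic a reduction whose visiting
// order differs from index order (as can happen with threads or a GPU).
// Returns -1 for an empty input.
inline std::ptrdiff_t first_argmin_reversed(const std::vector<float>& v) {
  if (v.empty()) return -1;
  const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(v.size());
  IndexValue best(n - 1, v[static_cast<std::size_t>(n - 1)]);
  for (std::ptrdiff_t i = n - 2; i >= 0; --i) {
    best = argmin_combine(best, IndexValue(i, v[static_cast<std::size_t>(i)]));
  }
  return best.first;
}
```

Without the index comparison in `argmin_combine`, the reversed traversal would report the last of several equal minima, which is the kind of order-dependent result the commit message describes.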