Commit message | Author | Age | |
---|---|---|---|
* | Add packet generic ops `predux_fmin`, `predux_fmin_nan`, `predux_fmax`, and `predux_fmax_nan` that implement reductions with `PropagateNaN`, and `PropagateNumbers` semantics. Add (slow) generic implementations for most reductions. | 2020-10-13 | |
* | [SYCL] This PR adds the minimum modifications to the Eigen unsupported module required to run it on devices supporting SYCL: abstracting the pointer type so that both SYCL memory and pointer can be captured; converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class; binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node; adding SYCL macro for controlling loop unrolling; modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes. | 2019-06-28 | |
* | Don't vectorize the MeanReducer unless pdiv is available. | 2018-09-11 | |
| | |||
* | Use numerically stable tree reduction in TensorReduction. | 2018-09-11 | |
| | |||
* | Move sigmoid functor to core. | 2018-08-03 | |
| | |||
* | Converting ad-hoc inline keyword to EIGEN_STRONG_INLINE MACRO. | 2018-08-01 | |
| | |||
* | PR430: Convert count to the reducer type in MeanReducer. Without explicit conversion Tensorflow fails to compile, pset1 template deduction fails: `cannot convert '((const Eigen::internal::MeanReducer<Eigen::half>*)this)->Eigen::internal::MeanReducer<Eigen::half>::packetCount_' (type 'const DenseIndex {aka const long int}') to type 'const type& {aka const Eigen::half&}'` at `return pdiv(vaccum, pset1<Packet>(packetCount_));`. Honestly I’m not sure why it works in Eigen tests, because Eigen::half constructor is explicit, and why it stopped working in TF, I didn’t find any relevant changes since previous Eigen upgrade. `static_cast<T>(packetCount_)` breaks cxx11_tensor_reductions test for Eigen::half, also quite surprising. | 2018-07-19 | |
* | bug #1569: fix Tensor<half>::mean() on AVX with respective unit test. | 2018-07-19 | |
| | |||
* | Rename clip2 to clamp. | 2018-05-16 | |
| | |||
* | Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow. | 2018-05-16 | |
* | Add vectorized clip functor for Eigen Tensors. | 2018-05-14 | |
| | |||
* | Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer. Improves support for std::complex types when compiling for CUDA. Expands on e2e9cdd16970914cf0a892fea5e7c4402b3ede41 and 2bda1b0d93fb627d0c500ec48b20302d44c32cb7. | 2017-04-14 | |
* | Deleted unnecessary semicolons | 2016-11-18 | |
| | |||
* | Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. | 2016-10-06 | |
* | Cleaned up the random number generation code. | 2016-10-04 | |
| | |||
* | Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus. | 2016-09-28 | |
* | Made the gaussian generator usable on GPU | 2016-09-22 | |
| | |||
* | Fix order of "static inline". | 2016-09-16 | |
| | |||
* | bug #1195: move NumTraits::Div<>::Cost to internal::scalar_div_cost (with some specializations in arch/SSE and arch/AVX) | 2016-09-08 | |
* | Fix CUDA build broken by changes to min and max reduction. | 2016-09-02 | |
| | |||
* | Adjust Tensor module wrt recent change in nullary functor | 2016-09-01 | |
| | |||
* | Fix bugs to make min- and max reducers work correctly with IEEE infinities. | 2016-08-31 | |
| | |||
* | Simplified the code that dispatches vectorized reductions on GPU | 2016-06-09 | |
| | |||
* | Fixed definition of some of the reducer_traits | 2016-06-09 | |
| | |||
* | Improved support for vectorization of 16-bit floats | 2016-06-09 | |
| | |||
* | Advertize the packet api of the tensor reducers iff the corresponding packet primitives are available. | 2016-05-18 | |
* | Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. | 2016-05-17 | |
* | Improvements to parallelFor. Move some scalar functors from TensorFunctors. to Eigen core. | 2016-05-12 | |
* | Worked around compilation errors with older versions of gcc | 2016-05-11 | |
| | |||
* | Added support for fp16 to the sigmoid functor. | 2016-05-10 | |
| | |||
* | Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. | 2016-04-19 | |
* | Merge upstream updates. | 2016-04-14 | |
|\ | |||
* | | Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. | 2016-04-14 | |
| * | Added support for fp16 to the sigmoid function | 2016-04-14 | |
|/ | |||
* | Added support for fmod | 2016-03-28 | |
| | |||
* | Improved support for integer modulo | 2016-03-25 | |
| | |||
* | Avoid mutable class members when possible | 2016-03-17 | |
| | |||
* | Allocate the mersenne twister used by the random number generators on the heap instead of on the stack since they tend to keep a lot of state (i.e. about 5k) around. | 2016-03-17 | |
* | Enable the random number generators when compiling with visual studio | 2016-03-09 | |
| | |||
* | Use NumTraits::highest() and NumTraits::lowest() instead of the std::numeric_limits to make the tensor min and max functors more CUDA friendly. | 2016-03-07 | |
* | Properly vectorized the random number generators | 2016-02-26 | |
| | |||
* | Marked the And and Or reducers as stateless. | 2016-02-24 | |
| | |||
* | Added support for tensor reductions on half floats | 2016-02-19 | |
| | |||
* | Don't attempt to vectorize mean reductions of integers since we can't use SSE or AVX instructions to divide 2 integers. | 2015-12-22 | |
* | Made it possible to use the sigmoid functor within a CUDA kernel. | 2015-12-04 | |
| | |||
* | Added more missing EIGEN_DEVICE_FUNC | 2015-11-06 | |
| | |||
* | Added support for modulo operation | 2015-11-05 | |
| | |||
* | Added support for boolean reductions (ie 'and' & 'or' reductions) | 2015-10-20 | |
| | |||
* | Added support for argmax/argmin | 2015-08-31 | |
| | |||
* | Use numext::mini/numext::maxi instead of std::min/std::max in the tensor code | 2015-08-28 | |
| |