Commit message | Author | Age | |
---|---|---|---|
* | Fixed a compilation error with nvcc 7. | 2016-04-19 | |
| | |||
* | Simplified the code that launches cuda kernels. | 2016-04-19 | |
| | |||
* | Don't take the address of a kernel on CUDA devices that don't support this feature. | 2016-04-19 | |
| | |||
* | Use numext::ceil instead of std::ceil | 2016-04-19 | |
| | |||
* | Avoid an unnecessary copy of the evaluator. | 2016-04-19 | |
| | |||
* | Use DenseIndex in the MeanReducer to avoid overflows when processing very large tensors. | 2016-04-19 | |
| | |||
* | Move the evalGemm method into the TensorContractionEvaluatorBase class to make it accessible from both the single and multithreaded contraction evaluators. | 2016-04-15 | |
| | |||
* | Deleted unnecessary variable | 2016-04-15 | |
| | |||
* | Fixed a few compilation warnings | 2016-04-15 | |
| | |||
* | Merged in rmlarsen/eigen (pull request PR-178). Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. | 2016-04-15 | |
|\ | |||
| * | Get rid of void* casting when calling EvalRange::run. | 2016-04-15 | |
| | | |||
* | | Added ability to access the cache sizes from the tensor devices | 2016-04-14 | |
| | | |||
* | | Added support for exclusive or | 2016-04-14 | |
| | | |||
| * | Eigen Tensor cost model part 2: Thread scheduling for standard evaluators and reductions. The cost model is turned off by default. | 2016-04-14 | |
| | | |||
* | | Added missing definition of PacketSize in the gpu evaluator of convolution | 2016-04-14 | |
| | | |||
* | | Merged in rmlarsen/eigen (pull request PR-177). Eigen Tensor cost model part 1. | 2016-04-14 | |
|\| | |||
* | | Prepared the migration to the new non blocking thread pool | 2016-04-14 | |
| | | |||
| * | Improvements to cost model. | 2016-04-14 | |
| | | |||
| * | Merge upstream updates. | 2016-04-14 | |
| |\ | |/ |/| | |||
| * | Eigen cost model part 1. This implements a basic recursive framework to estimate the cost of evaluating tensor expressions. | 2016-04-14 | |
| | | |||
* | | Silenced a compilation warning | 2016-04-14 | |
| | | |||
* | | Added support for fp16 to the sigmoid function | 2016-04-14 | |
|/ | |||
* | Defer the decision to vectorize tensor CUDA code to the meta kernel. This makes it possible to decide whether to vectorize depending on the capability of the target CUDA architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of capability >= 5.3. | 2016-04-12 | |
| | |||
* | Added missing EIGEN_DEVICE_FUNC to the tensor conversion code. | 2016-04-07 | |
| | |||
* | Added support for isinf, isnan, and isfinite checks to the tensor api | 2016-04-07 | |
| | |||
* | Fixed typos in the implementation of the zeta and polygamma ops. | 2016-04-06 | |
| | |||
* | Merge upstream. | 2016-04-01 | |
|\ | |||
* | | Fixed CUDA signature. | 2016-04-01 | |
| | | |||
| * | Merged eigen/eigen into default | 2016-04-01 | |
|/| | |||
* | | Added polygamma function. | 2016-04-01 | |
| | | |||
* | | Added zeta function. | 2016-04-01 | |
| | | |||
| * | Relaxed the condition used to gate the fft code. | 2016-03-31 | |
| | | |||
| * | Properly gate the fft code | 2016-03-31 | |
|/ | |||
* | Fixed an off-by-one bug in a debug assertion | 2016-03-30 | |
| | |||
* | Added NumTraits for type2index. | 2016-03-30 | |
| | |||
* | Fixed compilation warning | 2016-03-30 | |
| | |||
* | Added missing assignment operator to the TensorUInt128 class, and made misc small improvements | 2016-03-30 | |
| | |||
* | Fixed the formatting of the README. | 2016-03-29 | |
| | |||
* | Attempt to fix the formatting of the README | 2016-03-29 | |
| | |||
* | Added support for fmod | 2016-03-28 | |
| | |||
* | Made it possible to customize the threadpool | 2016-03-28 | |
| | |||
* | Fixed compilation warnings on arm | 2016-03-28 | |
| | |||
* | Prevent potential overflow. | 2016-03-28 | |
| | |||
* | Improved support for integer modulo | 2016-03-25 | |
| | |||
* | Avoid unnecessary conversions | 2016-03-23 | |
| | |||
* | Fixed compilation warning | 2016-03-23 | |
| | |||
* | Fixed compilation error | 2016-03-22 | |
| | |||
* | Pulled latest updates from trunk | 2016-03-22 | |
|\ | |||
* | | Use a single Barrier instead of a collection of Notifications to reduce the thread synchronization overhead | 2016-03-22 | |
| | | |||
| * | Fixed a couple of typos | 2016-03-22 | |
| | |