path: root/unsupported/Eigen/CXX11/src
* Enabled the use of fixed dimensions from within a CUDA kernel. (Benoit Steiner, 2016-01-11)
* Deleted unused variable. (Benoit Steiner, 2016-01-11)
* Silenced an nvcc compilation warning. (Benoit Steiner, 2016-01-11)
* Silenced several compilation warnings triggered by nvcc. (Benoit Steiner, 2016-01-11)
* Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152): alternative way of forcing instantiation of device kernels without causing warnings or requiring device-to-device kernel invocations. (Benoit Steiner, 2016-01-11)
* Fixed a bug in the dispatch of optimized reduction kernels. (Benoit Steiner, 2016-01-11)
* Re-enabled the optimized reduction CUDA code. (Benoit Steiner, 2016-01-11)
* Cleaned up a double-defined macro from the last commit. (Jeremy Barnes, 2016-01-10)
* Alternative way of forcing instantiation of device kernels without causing warnings or requiring device-to-device kernel invocations. This allows TensorFlow to work on SM 3.0 (i.e., Amazon EC2) machines. (Jeremy Barnes, 2016-01-10)
* Simplified the dispatch code. (Benoit Steiner, 2016-01-08)
* Made it possible to use arrays of size 0 on CUDA devices. (Benoit Steiner, 2016-01-08)
* Reworked the dispatch of optimized CUDA reduction kernels to work around an nvcc bug that prevented the code from compiling in optimized mode in some cases. (Benoit Steiner, 2016-01-08)
* Prevent nvcc from miscompiling the CUDA metakernel. Unfortunately this reintroduces some compilation warnings, but it's much better than having to deal with random assertion failures. (Benoit Steiner, 2016-01-08)
* Removed a couple of partial specializations that confuse nvcc and result in errors such as: error: more than one partial specialization matches the template argument list of class "Eigen::internal::get<3, Eigen::internal::numeric_list<std::size_t, 1UL, 1UL, 1UL, 1UL>>" ("Eigen::internal::get<n, Eigen::internal::numeric_list<T, a, as...>>" and "Eigen::internal::get<n, Eigen::internal::numeric_list<T, as...>>"). (Benoit Steiner, 2016-01-07)
* Fixed a typo. (Benoit Steiner, 2016-01-06)
* Optimized the performance of broadcasting of scalars. (Benoit Steiner, 2016-01-06)
* Improved the performance of reductions on CUDA devices. (Benoit Steiner, 2016-01-04)
* Added a 'divup' util to compute the ceiling of the quotient of two integers. (Benoit Steiner, 2016-01-04)
* Add missing ctor from uint. (Gael Guennebaud, 2015-12-30)
* Don't attempt to vectorize mean reductions of integers, since we can't use SSE or AVX instructions to divide 2 integers. (Benoit Steiner, 2015-12-22)
* Optimized the configuration of the outer reduction CUDA kernel. (Benoit Steiner, 2015-12-22)
* Added missing define. (Benoit Steiner, 2015-12-22)
* Made sure the optimized GPU reduction code is actually compiled. (Benoit Steiner, 2015-12-22)
* Optimized outer reduction on GPUs. (Benoit Steiner, 2015-12-22)
* Added missing const. (Benoit Steiner, 2015-12-21)
* Add alignment requirement for the local buffer used by the slicing op. (Benoit Steiner, 2015-12-18)
* Doubled the speed of full reductions on GPUs. (Benoit Steiner, 2015-12-18)
* Fixed a clang compilation warning triggered by the use of arrays of size 0. (Benoit Steiner, 2015-12-17)
* Silenced some compilation warnings triggered by nvcc. (Benoit Steiner, 2015-12-17)
* Made it possible to run tensor chipping operations on CUDA devices. (Benoit Steiner, 2015-12-17)
* Made the entire TensorFixedSize API callable from a CUDA kernel. (Benoit Steiner, 2015-12-14)
* Marked the tensor constructors as EIGEN_DEVICE_FUNC: this makes it possible to call them from a CUDA kernel. (Benoit Steiner, 2015-12-14)
* Merged in ebrevdo/eigen (pull request PR-148): add special functions to Eigen: lgamma, erf, erfc. (Gael Guennebaud, 2015-12-11)
* Fixed a typo in the constructor of tensors of rank 5. (Benoit Steiner, 2015-12-10)
* Fixed the coefficient accessors used for the 2d and 3d case when compiling without C++11 support. (Benoit Steiner, 2015-12-10)
* Add special functions to Eigen: lgamma, erf, erfc. Includes CUDA support and unit tests. (Eugene Brevdo, 2015-12-07)
* Fixed another compilation warning. (Benoit Steiner, 2015-12-07)
* Fixed compilation warnings. (Benoit Steiner, 2015-12-07)
* Use signed integers instead of unsigned ones more consistently in the codebase. (Benoit Steiner, 2015-12-04)
* Use integers instead of std::size_t to encode the number of dimensions in the Tensor class, since most of the code currently already uses integers. (Benoit Steiner, 2015-12-04)
* Made it possible to use the sigmoid functor within a CUDA kernel. (Benoit Steiner, 2015-12-04)
* Deleted redundant code. (Benoit Steiner, 2015-12-03)
* Added scalar_sign_op (both real and complex). (Mark Borgerding, 2015-11-24)
* Fixed a bug in TensorArgMax.h. (Benoit Steiner, 2015-11-23)
* Fixed the implementation of Eigen::internal::count_leading_zeros for MSVC. Also updated the code to silence bogus warnings generated by nvcc when compiling this function. (Benoit Steiner, 2015-11-23)
* Don't create more CUDA blocks than necessary. (Benoit Steiner, 2015-11-23)
* Made it possible to refer to a GPUDevice from code compiled with a regular C++ compiler. (Benoit Steiner, 2015-11-23)
* Deleted unused variable. (Benoit Steiner, 2015-11-23)
* Split TensorDeviceType.h into 3 files to make it more manageable. (Benoit Steiner, 2015-11-20)
* Added an option to force the usage of the Eigen array class instead of the std::array class. (Benoit Steiner, 2015-11-20)