aboutsummaryrefslogtreecommitdiffhomepage
path: root/unsupported/Eigen/CXX11/src/Tensor/TensorContractionGpu.h
Commit message (Collapse)AuthorAge
* Fix calls to device functions from host codeGravatar Nathan Luehr2021-05-11
|
* Undo the block size change.Gravatar Artem Belevich2019-12-09
| | | | .z *is* used by the EigenContractionKernelInternal().
* Improve performance of contraction kernelsGravatar Artem Belevich2019-12-05
| | | | | | | | | | * Force-inline implementations. They pass around pointers to shared memory blocks. Without inlining compiler must operate via generic pointers. Inlining allows compiler to detect that we're operating on shared memory which allows generation of substantially faster code. * Fixed a long-standing typo which resulted in launching 8x more kernels than we needed (.z dimension of the block is unused by the kernel).
* Remove __host__ annotation for device-only function.Gravatar Rasmus Munk Larsen2019-12-03
|
* Use EIGEN_DEVICE_FUNC macro instead of __device__.Gravatar Rasmus Munk Larsen2019-12-03
|
* Clean up CUDA/NVCC version macros and their use in Eigen, and a few other ↵Gravatar Rasmus Munk Larsen2019-05-31
| | | | CUDA build failures.
* bug #1654: fix compilation with cuda and no c++11Gravatar Gael Guennebaud2019-01-09
|
* Fix GPU support.Gravatar Gael Guennebaud2018-09-20
|
* Assert that no output kernel is defined for GPU contractionGravatar Eugene Zhulenev2018-07-18
|
* Fix TensorContractionOp evaluators for GPU and SYCLGravatar Eugene Zhulenev2018-07-17
|
* applying EIGEN_DECLARE_TEST to *gpu* testsGravatar Deven Desai2018-07-17
| | | | | | | | | | | | | Also, a few minor fixes for GPU tests running in HIP mode. 1. Adding an include for hip/hip_runtime.h in the Macros.h file For HIP __host__ and __device__ are macros which are defined in hip headers. Their definitions need to be included before their use in the file. 2. Fixing the compile failure in TensorContractionGpu introduced by the commit to "Fuse computations into the Tensor contractions using output kernel" 3. Fixing a HIP/clang specific compile error by making the struct-member assignment explicit
* Updates corresponding to the latest round of PR feedbackGravatar Deven Desai2018-07-11
| | | | | | | | | | | | | | The major changes are 1. Moving CUDA/PacketMath.h to GPU/PacketMath.h 2. Moving CUDA/MathFunctions.h to GPU/MathFunction.h 3. Moving CUDA/CudaSpecialFunctions.h to GPU/GpuSpecialFunctions.h The above three changes effectively enable the Eigen "Packet" layer for the HIP platform 4. Merging the "hip_basic" and "cuda_basic" unit tests into one ("gpu_basic") 5. Updating the "EIGEN_DEVICE_FUNC" marking in some places The change has been tested on the HIP and CUDA platforms.
* merging the CUDA and HIP implementation for the Tensor directory and the ↵Gravatar Deven Desai2018-06-20
| | | | unit tests
* renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and ↵Gravatar Deven Desai2018-06-20
unsupported/test directories