path: root/unsupported/Eigen/CXX11/src
Commit log (message, author, date):
* renaming *Cuda files to *Gpu in the unsupported/Eigen/CXX11/src/Tensor and unsupported/test directories (Deven Desai, 2018-06-20)
* updates based on PR feedback (Deven Desai, 2018-06-14)

  There are two major changes (and a few minor ones which are not listed here; see the PR discussion for details):

  1. The Eigen::half implementations for HIP and CUDA have been merged. This means that:
     * `CUDA/Half.h` and `HIP/hcc/Half.h` got merged into the new file `GPU/Half.h`
     * `CUDA/PacketMathHalf.h` and `HIP/hcc/PacketMathHalf.h` got merged into the new file `GPU/PacketMathHalf.h`
     * `CUDA/TypeCasting.h` and `HIP/hcc/TypeCasting.h` got merged into the new file `GPU/TypeCasting.h`

     After this change the `HIP/hcc` directory only contains one file, `math_constants.h`. That will go away too once that file becomes a part of the HIP install.

  2. New macros EIGEN_GPUCC, EIGEN_GPU_COMPILE_PHASE and EIGEN_HAS_GPU_FP16 have been added, and the code has been updated to use them where appropriate:
     * `EIGEN_GPUCC` is the same as `(EIGEN_CUDACC || EIGEN_HIPCC)`
     * `EIGEN_GPU_COMPILE_PHASE` is the same as `(EIGEN_CUDA_ARCH || EIGEN_HIP_DEVICE_COMPILE)`
     * `EIGEN_HAS_GPU_FP16` is the same as `(EIGEN_HAS_CUDA_FP16 || EIGEN_HAS_HIP_FP16)`
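As a rough illustration of the macro aliases this commit describes, here is a hedged preprocessor sketch (not the literal Eigen source; the real definitions live in Eigen's macro headers):

```cpp
// Sketch only: unified GPU macros derived from the CUDA/HIP-specific ones,
// following the equivalences listed in the commit message above.
#if defined(EIGEN_CUDACC) || defined(EIGEN_HIPCC)
#define EIGEN_GPUCC              // compiling with a GPU-capable compiler
#endif

#if defined(EIGEN_CUDA_ARCH) || defined(EIGEN_HIP_DEVICE_COMPILE)
#define EIGEN_GPU_COMPILE_PHASE  // currently in the device compilation pass
#endif

#if defined(EIGEN_HAS_CUDA_FP16) || defined(EIGEN_HAS_HIP_FP16)
#define EIGEN_HAS_GPU_FP16       // half-precision (fp16) support is available
#endif
```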
* syncing this fork with upstream (Deven Desai, 2018-06-13)
* Merge from eigen/eigen (Michael Figurnov, 2018-06-07)
* Fix typos found using codespell (Gael Guennebaud, 2018-06-07)
* Derivative of the incomplete Gamma function and the sample of a Gamma random variable (Michael Figurnov, 2018-06-06)

  In addition to igamma(a, x), this code implements:
  * igamma_der_a(a, x) = d igamma(a, x) / da, the derivative of igamma with respect to the parameter a
  * gamma_sample_der_alpha(alpha, sample), the reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter

  The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
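The commit notes that gamma_sample_der_alpha could in principle be derived from igamma_der_a. The underlying identity, stated here for context (standard implicit differentiation of the Gamma(alpha, 1) CDF, not text from the commit), is:

```latex
% x is a Gamma(\alpha, 1) sample and F(x;\alpha) = \mathrm{igamma}(\alpha, x) is its CDF.
\frac{\partial x}{\partial \alpha}
  \;=\; -\,\frac{\partial F/\partial \alpha}{\partial F/\partial x}
  \;=\; -\,\frac{\mathrm{igamma\_der\_a}(\alpha, x)\,\Gamma(\alpha)}{x^{\alpha-1} e^{-x}}
```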
* Adding support for using Eigen in HIP kernels (Deven Desai, 2018-06-06)

  This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.

  Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during an Eigen compile (irrespective of whether or not the underlying compiler is CUDACC/NVCC, e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).

  Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP-specific unit tests.
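A minimal usage sketch under the constraint the commit names (EIGEN_USE_HIP defined before any Eigen header, compiled with hipcc); the kernel body and sizes are illustrative assumptions, not code from the commit:

```cpp
#define EIGEN_USE_HIP                     // must come before the Eigen includes
#include <hip/hip_runtime.h>
#include <unsupported/Eigen/CXX11/Tensor>

// Illustrative kernel: Eigen's device-side math (numext) used in HIP device code.
__global__ void sqrt_inplace(float* data, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) data[i] = Eigen::numext::sqrt(data[i]);
}

int main() {
  const int n = 256;
  float* d = nullptr;
  hipMalloc(&d, n * sizeof(float));
  hipMemset(d, 0, n * sizeof(float));
  hipLaunchKernelGGL(sqrt_inplace, dim3(1), dim3(n), 0, 0, d, n);
  hipDeviceSynchronize();
  hipFree(d);
  return 0;
}
```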
* Performance improvements to tensor broadcast operation (Vamsi Sripathi, 2018-05-23)

  1. Added new packet functions using SIMD for the NByOne and OneByN cases
  2. Modified existing packet functions to reduce index calculations when the input stride is non-SIMD
  3. Added 4 test cases to cover the new packet functions
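For context, a hedged sketch of the OneByN broadcast shape the commit refers to (the sizes and broadcast factors here are illustrative only):

```cpp
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  // "OneByN" input: a single row replicated along the singleton dimension.
  Eigen::Tensor<float, 2> row(1, 256);
  row.setRandom();

  Eigen::array<Eigen::Index, 2> bcast = {64, 1};         // replicate the row 64 times
  Eigen::Tensor<float, 2> tiled = row.broadcast(bcast);  // resulting shape: (64, 256)
  return tiled.dimension(0) == 64 ? 0 : 1;
}
```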
* Merged in mfigurnov/eigen (pull request PR-400) (Benoit Steiner, 2018-06-05)

  Exponentially scaled modified Bessel functions of order zero and one.

  Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
* Add a ThreadPoolInterface* getter for ThreadPoolDevice (Penporn Koanantakool, 2018-06-02)
* Exponentially scaled modified Bessel functions of order zero and one (Michael Figurnov, 2018-05-31)

  The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained as i0(x) = exp(|x|) i0e(x). The code is ported from Cephes and tested against SciPy.
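The scaling relation quoted above, written out for both orders (the i1 case is the analogous standard identity, added here for completeness):

```latex
i_0(x) = e^{|x|}\, i_{0e}(x), \qquad i_1(x) = e^{|x|}\, i_{1e}(x)
```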
* Hyperlink DOIs against preferred resolver (Katrin Leinweber, 2018-05-24)
* Merged in rmlarsen/eigen2 (pull request PR-393) (Benoit Steiner, 2018-05-16)

  Rename scalar_clip_op to scalar_clamp_op to prevent collision with existing functor in TensorFlow.
* Rename clip2 to clamp (Rasmus Munk Larsen, 2018-05-16)
* Rename scalar_clip_op to scalar_clip2_op to prevent collision with existing functor in TensorFlow (Rasmus Munk Larsen, 2018-05-16)
* Merged in didierjansen/eigen (pull request PR-360) (Benoit Steiner, 2018-05-16)

  Fix bugs and typos in the contraction example of the tensor README
* Add vectorized clip functor for Eigen Tensors (Rasmus Munk Larsen, 2018-05-14)
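An illustrative sketch of the clamping this functor vectorizes; it is written with the long-standing cwiseMax/cwiseMin composition since the exact member added by the commit is not shown in this log:

```cpp
#include <unsupported/Eigen/CXX11/Tensor>

int main() {
  Eigen::Tensor<float, 1> t(8);
  t.setRandom();
  // Clamp every element into [0.2, 0.8]; the commit's vectorized clip functor
  // performs this kind of element-wise clamping as a single fused operation.
  Eigen::Tensor<float, 1> clipped = t.cwiseMax(0.2f).cwiseMin(0.8f);
  return clipped.size() == 8 ? 0 : 1;
}
```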
* Enable RawAccess to tensor slices whenever possible (Benoit Steiner, 2018-04-30)

  Avoid 32-bit integer overflow in TensorSlicingOp
* Avoid using memcpy for non-POD elements (Weiming Zhao, 2018-04-11)
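A hedged illustration of the general issue (not the Eigen code itself): memcpy bypasses constructors and assignment operators, which is only valid for trivially copyable types, so the copy has to be dispatched on the element type:

```cpp
#include <algorithm>
#include <cstring>
#include <type_traits>

// Illustrative helper: byte-wise copy only when it is semantically safe.
template <typename T>
void copy_elements(T* dst, const T* src, std::size_t n) {
  if (std::is_trivially_copyable<T>::value) {
    std::memcpy(dst, src, n * sizeof(T));  // fast path for POD-like element types
  } else {
    std::copy(src, src + n, dst);          // invokes the element's copy assignment
  }
}
```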
* Fix typos in the contraction example of tensor README (Lee.Deokjae, 2018-01-06)
* Update the padding computation for PADDING_SAME to be consistent with TensorFlow (Benoit Steiner, 2018-01-30)
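For reference, TensorFlow's SAME-padding convention that this commit aligns with (stated from the TensorFlow documentation rather than from the commit; dilation is omitted for brevity):

```latex
\text{out} = \left\lceil \tfrac{\text{in}}{\text{stride}} \right\rceil, \qquad
\text{pad}_{\text{total}} = \max\bigl((\text{out} - 1)\,\text{stride} + k - \text{in},\; 0\bigr), \qquad
\text{pad}_{\text{before}} = \left\lfloor \text{pad}_{\text{total}} / 2 \right\rfloor
```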
* Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs (RJ Ryan, 2017-12-31)

  https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689
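Background on the change (general FFT math, not taken from the commit): the twiddle factors are the roots of unity below, and evaluating each one directly from its angle avoids the error accumulation of the recurrence W^k = W^{k-1} W:

```latex
W_N^{k} = e^{-2\pi i k / N}
        = \cos\!\left(\tfrac{2\pi k}{N}\right) - i \sin\!\left(\tfrac{2\pi k}{N}\right),
        \qquad k = 0, 1, \dots, N - 1
```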
* Workaround nvcc 9.0 issue. See PR 351 (Gael Guennebaud, 2017-12-15)

  https://bitbucket.org/eigen/eigen/pull-requests/351
* Update the padding computation for PADDING_SAME to be consistent with TensorFlow (Yangzihao Wang, 2017-12-12)
* Merged in JonasMu/eigen (pull request PR-329) (Benoit Steiner, 2017-10-27)

  Added an example for a contraction to a scalar value to README.md

  Approved-by: Jonas Harsch <jonas.harsch@gmail.com>
* Merged in benoitsteiner/opencl (pull request PR-341) (Benoit Steiner, 2017-10-17)
* Specialize ThreadPoolDevice::enqueueNotification for the case with no args (Rasmus Munk Larsen, 2017-10-13)

  As an example, this reduces the binary size of a TensorFlow demo app for Android by about 2.5%.
* Changes required for new ComputeCpp CE version (Mehdi Goli, 2017-09-18)
* Fix cut-and-paste error (Rasmus Munk Larsen, 2017-09-08)
* Avoid undefined behavior in Eigen::TensorCostModel::numThreads (Rasmus Munk Larsen, 2017-09-08)

  If the cost is large enough then the thread count can be larger than the maximum representable int, so just casting it to an int is undefined behavior. Contributed by phurst@google.com.
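A hedged sketch of the general fix pattern (illustrative only; the function name and bounds are assumptions, not the Eigen source): clamp the floating-point estimate to a representable range before converting to int:

```cpp
#include <algorithm>

// Convert an estimated thread count (a double that may be huge) to int
// without the undefined behavior of an out-of-range float-to-int cast.
int clamped_thread_count(double estimated, int max_threads) {
  double bounded = std::min(estimated, static_cast<double>(max_threads));
  bounded = std::max(bounded, 1.0);
  return static_cast<int>(bounded);  // now guaranteed to lie in [1, max_threads]
}
```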
* Added an example for a contraction to a scalar value, e.g. a double contraction of two second-order tensors and how you can get the value of the result (Jonas Harsch, 2017-09-01)

  I lost one day to get this done, so I think it will help others. I also added Eigen:: to the IndexPair and array in the same example.
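A minimal sketch of such a full (double) contraction in the spirit of the README example the commit refers to; the tensor sizes and names are illustrative:

```cpp
#include <unsupported/Eigen/CXX11/Tensor>
#include <iostream>

int main() {
  Eigen::Tensor<double, 2> a(3, 3), b(3, 3);
  a.setRandom();
  b.setRandom();

  // Contract over both index pairs: result = sum_{i,j} a(i,j) * b(i,j).
  // The result is a rank-0 tensor whose single coefficient is the scalar value.
  Eigen::array<Eigen::IndexPair<int>, 2> dims = {
      Eigen::IndexPair<int>(0, 0), Eigen::IndexPair<int>(1, 1)};
  Eigen::Tensor<double, 0> result = a.contract(b, dims);
  std::cout << result() << std::endl;
  return 0;
}
```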
* Added support for CUDA 9.0 (Benoit Steiner, 2017-08-31)
* Fixing Argmax that was breaking upstream TensorFlow (Benoit Steiner, 2017-07-22)
* Add an EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases (Gael Guennebaud, 2017-07-17)
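A hedged sketch of what such aliases typically look like (the actual definitions live in Eigen's macro headers and may differ in detail):

```cpp
// EIGEN_NO_CUDA lets users opt out of CUDA code paths even when nvcc is detected;
// the aliases decouple Eigen code from the raw __CUDACC__ / __CUDA_ARCH__ names.
#if defined(__CUDACC__) && !defined(EIGEN_NO_CUDA)
#define EIGEN_CUDACC __CUDACC__
#endif

#if defined(__CUDA_ARCH__) && !defined(EIGEN_NO_CUDA)
#define EIGEN_CUDA_ARCH __CUDA_ARCH__
#endif
```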
* Code cleanup (Benoit Steiner, 2017-07-10)
* Fixed syntax errors generated by xcode (Benoit Steiner, 2017-07-09)
* Avoid relying on cxx11 features when possible (Benoit Steiner, 2017-07-08)
* Merged in benoitsteiner/opencl (pull request PR-323) (Benoit Steiner, 2017-07-07)

  Improved support for OpenCL
* Merged in hughperkins/eigen/add-endif-labels-TensorReductionCuda.h (pull request PR-315) (Benoit Steiner, 2017-07-07)

  Add labels to #ifdef, in TensorReductionCuda.h
* Merged in tntnatbry/eigen (pull request PR-319) (Benoit Steiner, 2017-07-07)

  Tensor Trace op
* Improved the randomness of the tensor random generator (Benoit Steiner, 2017-07-06)
* Fixed compilation warning (Benoit Steiner, 2017-07-06)
* Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14) (Benoit Steiner, 2017-07-06)

  Applying Benoit's comment for fixing ImageVolumePatch.
  * Applying Benoit's comment for fixing ImageVolumePatch. Fixing conflict on cmake file.
  * Fixing deallocation of the memory in the ImagePatch test for SYCL.
  * Fixing the automerge issue.
* Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10) (Benoit Steiner, 2017-06-28)

  DataDependancy
  * Wrapping the data type in the pointer class for SYCL in non-terminal nodes; not having that breaks the TensorFlow Conv2d code.
  * Applying Ronnan's comments.
  * Applying Benoit's comments.
* Add labels to #ifdef, in TensorReductionCuda.h (Hugh Perkins, 2017-06-06)
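The convention being referred to, shown on a hedged generic example (the macro name is illustrative, not necessarily the one used in TensorReductionCuda.h):

```cpp
#if defined(EIGEN_USE_GPU)
// ... GPU-specific reduction code ...
#endif  // defined(EIGEN_USE_GPU)  <- the trailing comment is the "label"
```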
* Merged in mehdi_goli/opencl/SYCLAlignAllocator (pull request PR-7) (Benoit Steiner, 2017-05-26)

  Fixing SYCL alignment issue required by TensorFlow.
* Merged eigen/eigen into default (Benoit Steiner, 2017-05-26)
* Applying Ronnan's comments (Mehdi Goli, 2017-05-26)
* Applying Benoit's comment; removing dead code (Mehdi Goli, 2017-05-25)
* Restore misplaced comment (a-doumoulakis, 2017-05-24)