path: root/unsupported/test
* syncing this fork with upstream (Deven Desai, 2018-06-13)
|\
| * Merge from eigen/eigen (Michael Figurnov, 2018-06-07)
| |\
| | * Merge from eigen/eigen. (Michael Figurnov, 2018-06-07)
| | |\
| | * | Fix compilation of special functions without C99 math. (Michael Figurnov, 2018-06-07)
| | | |
| | | |   The commit with Bessel functions i0e and i1e placed the ifdef/endif incorrectly, causing i0e/i1e to be undefined when EIGEN_HAS_C99_MATH=0. These functions do not actually require C99 math, so now they are always available.
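For illustration, a minimal sketch of the guard movement described above (hypothetical code, not the actual Eigen source): kernels that need only basic arithmetic are hoisted out of the EIGEN_HAS_C99_MATH block.

    // guard_sketch.cpp -- hypothetical illustration, not the actual Eigen source.
    #include <cmath>

    #ifndef EIGEN_HAS_C99_MATH
    #define EIGEN_HAS_C99_MATH 0  // simulate a toolchain without C99 math
    #endif

    #if EIGEN_HAS_C99_MATH
    // functions that genuinely need C99 math (erf, lgamma, ...) stay guarded here
    #endif

    // Always compiled, regardless of EIGEN_HAS_C99_MATH: a stand-in for the
    // exponentially scaled Bessel kernels, which use only basic arithmetic.
    double i0e_stub(double x) { return 1.0 / (1.0 + std::abs(x)); }  // not the real series

    int main() { return i0e_stub(1.0) > 0.0 ? 0 : 1; }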
| | | * Fix typos found using codespell (Gael Guennebaud, 2018-06-07)
| | | |
| * | | Derivative of the incomplete Gamma function and the sample of a Gamma random variable. (Michael Figurnov, 2018-06-06)
| |/ /
| | |
| | |   In addition to igamma(a, x), this code implements:
| | |   * igamma_der_a(a, x) = d igamma(a, x) / da -- the derivative of igamma with respect to the parameter a
| | |   * gamma_sample_der_alpha(alpha, sample) -- the reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter
| | |
| | |   The derivatives are computed by forward-mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
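A hedged usage sketch, assuming the new functions are exposed as coefficient-wise array functions in the unsupported SpecialFunctions module, mirroring the existing igamma(a, x):

    // igamma_der_sketch.cpp -- assumes Eigen::igamma_der_a and
    // Eigen::gamma_sample_der_alpha are array-wise free functions.
    #include <iostream>
    #include <unsupported/Eigen/SpecialFunctions>

    int main() {
      Eigen::ArrayXd a(3), x(3);
      a << 0.5, 1.0, 2.0;
      x << 0.5, 1.0, 2.0;

      // d igamma(a, x) / da, evaluated coefficient-wise.
      Eigen::ArrayXd dda = Eigen::igamma_der_a(a, x);

      // Reparameterization derivative of a Gamma(a, 1) sample with respect to a.
      Eigen::ArrayXd dalpha = Eigen::gamma_sample_der_alpha(a, x);

      std::cout << dda.transpose() << "\n" << dalpha.transpose() << "\n";
    }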
* / / Adding support for using Eigen in HIP kernels. (Deven Desai, 2018-06-06)
|/ /
| |
| |   This commit enables the use of Eigen in HIP kernels / on AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / on NVidia GPUs.
| |
| |   Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default during an Eigen compile, irrespective of whether or not the underlying compiler is CUDACC/NVCC (e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP version of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
| |
| |   Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP-specific unit tests.
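A minimal sketch of the opt-in this describes, assuming a HIP toolchain is available; the HIP-specific tests would then be enabled with something like "cmake -DEIGEN_TEST_HIP=ON":

    // hip_usage_sketch.cpp -- define EIGEN_USE_HIP before including the Tensor
    // module so the HIP versions of the GPU-related headers are selected.
    #define EIGEN_USE_HIP
    #include <unsupported/Eigen/CXX11/Tensor>

    int main() {
      Eigen::Tensor<float, 2> t(2, 2);  // host-side use; kernels are built by hipcc
      t.setConstant(1.0f);
      return t(0, 0) == 1.0f ? 0 : 1;
    }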
| * Performance improvements to tensor broadcast operation (Vamsi Sripathi, 2018-05-23)
|/
|
|   1. Added new packet functions using SIMD for the NByOne and OneByN cases (see the sketch below)
|   2. Modified existing packet functions to reduce index calculations when the input stride is non-SIMD
|   3. Added 4 test cases to cover the new packet functions
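For reference, a sketch of the shapes the first item targets: a 1-by-N (OneByN) or N-by-1 (NByOne) tensor replicated along its singleton dimension.

    // broadcast_sketch.cpp -- broadcasting a OneByN input along dimension 0.
    #include <unsupported/Eigen/CXX11/Tensor>

    int main() {
      Eigen::Tensor<float, 2> row(1, 4);  // OneByN input
      row.setRandom();

      Eigen::array<Eigen::Index, 2> factors{{3, 1}};         // replicate 3x along dim 0
      Eigen::Tensor<float, 2> out = row.broadcast(factors);  // 3 x 4 result
      return out.dimension(0) == 3 && out.dimension(1) == 4 ? 0 : 1;
    }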
* Exponentially scaled modified Bessel functions of order zero and one. (Michael Figurnov, 2018-05-31)
|
|   The functions are conventionally called i0e and i1e. The exponentially scaled versions are more numerically stable; the standard Bessel functions can be recovered as i0(x) = exp(|x|) * i0e(x). The code is ported from Cephes and tested against SciPy.
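A hedged usage sketch, assuming the functions are exposed as coefficient-wise array functions named i0e/i1e (later Eigen releases spell them bessel_i0e/bessel_i1e); the unscaled i0 is recovered via the identity above:

    // i0e_sketch.cpp -- recover i0(x) as exp(|x|) * i0e(x).
    #include <iostream>
    #include <unsupported/Eigen/SpecialFunctions>

    int main() {
      Eigen::ArrayXd x = Eigen::ArrayXd::LinSpaced(4, -2.0, 2.0);
      Eigen::ArrayXd i0 = x.abs().exp() * Eigen::i0e(x);
      std::cout << i0.transpose() << "\n";
    }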
* Use numext::maxi & numext::mini. (Rasmus Munk Larsen, 2018-05-14)
|
* Add vectorized clip functor for Eigen Tensors. (Rasmus Munk Larsen, 2018-05-14)
|
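A hedged usage sketch of the entry above, assuming the functor is exposed on tensors as clip(lo, hi):

    // clip_sketch.cpp -- clamp each coefficient of a tensor to [-1, 1].
    #include <unsupported/Eigen/CXX11/Tensor>

    int main() {
      Eigen::Tensor<float, 1> t(5);
      t.setValues({-2.f, -0.5f, 0.f, 0.5f, 2.f});
      Eigen::Tensor<float, 1> c = t.clip(-1.f, 1.f);  // coefficients clamped to [-1, 1]
      return c(0) == -1.f && c(4) == 1.f ? 0 : 1;
    }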
* Fix "used uninitialized" warningsGravatar Gael Guennebaud2018-04-24
|
* Workaround warning (Gael Guennebaud, 2018-04-24)
|
* Recent Adolc versions require C++11 (Christoph Hertzberg, 2018-04-13)
|
* Update the padding computation for PADDING_SAME to be consistent with TensorFlow. (Benoit Steiner, 2018-01-30)
|\
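For reference, a sketch of TensorFlow's documented SAME-padding arithmetic (based on TensorFlow's behavior, not this commit's diff): the output size is ceil(in / stride), and the total padding is split with any odd pixel going to the end.

    // same_padding_sketch.cpp
    #include <algorithm>
    #include <cstdio>

    int main() {
      const int in = 7, kernel = 3, stride = 2;
      const int out = (in + stride - 1) / stride;                       // ceil(in / stride)
      const int pad_total = std::max((out - 1) * stride + kernel - in, 0);
      const int pad_before = pad_total / 2;                             // extra pixel goes after
      const int pad_after = pad_total - pad_before;
      std::printf("out=%d pad=(%d,%d)\n", out, pad_before, pad_after);  // out=4 pad=(1,1)
    }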
* | Disable use of recurrence for computing twiddle factors. Fixes FFT precision issues for large FFTs. (RJ Ryan, 2017-12-31)
| |
| |   https://github.com/tensorflow/tensorflow/issues/10749#issuecomment-354557689
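A sketch of the numerical point: each directly computed twiddle factor carries a single rounding step, while the recurrence w_{k+1} = w_k * w_1 compounds rounding error across all n factors, which is what hurts large FFTs.

    // twiddle_sketch.cpp -- compare recurrence-based twiddles to direct ones.
    #include <algorithm>
    #include <cmath>
    #include <complex>
    #include <cstdio>

    int main() {
      const int n = 1 << 20;
      const double step = -2.0 * std::acos(-1.0) / n;  // -2*pi/n
      const std::complex<double> w1 = std::polar(1.0, step);
      std::complex<double> w = 1.0;
      double max_err = 0.0;
      for (int k = 0; k < n; ++k) {
        const std::complex<double> direct = std::polar(1.0, step * k);  // accurate
        max_err = std::max(max_err, std::abs(w - direct));
        w *= w1;  // recurrence: error accumulates with k
      }
      std::printf("max recurrence error: %g\n", max_err);  // grows with n
    }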
| * Update the padding computation for PADDING_SAME to be consistent with TensorFlow. (Yangzihao Wang, 2017-12-12)
|/
* Removed unnecessary #include (Benoit Steiner, 2017-10-22)
|
* Merged in infinitei/eigen (pull request PR-328) (Gael Guennebaud, 2017-09-06)
|\
| |
| |   bug #1464: Fixes construction of EulerAngles from 3D vector expression.
| |
| |   Approved-by: Tal Hadad <tal_hd@hotmail.com>
| |   Approved-by: Abhijit Kundu <abhijit.kundu@gatech.edu>
* | Added support for CUDA 9.0. (Benoit Steiner, 2017-08-31)
| |
| * bug #1464: Fixes construction of EulerAngles from 3D vector expression. (Abhijit Kundu, 2017-08-30)
|/
* Handle min/max/inf/etc issue in cuda_fp16.h directly in test/main.h (Gael Guennebaud, 2017-08-24)
|
* bug #1462: remove all occurrences of the deprecated __CUDACC_VER__ macro by introducing EIGEN_CUDACC_VER (Gael Guennebaud, 2017-08-24)
|
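A hedged sketch of what such a version shim typically looks like (the exact definition in Eigen's headers may differ): derive a single version number from the non-deprecated split macros, falling back to the old one.

    // cudacc_ver_sketch.cpp
    #if defined(__CUDACC_VER_MAJOR__) && defined(__CUDACC_VER_MINOR__)
      // CUDA >= 9 deprecates __CUDACC_VER__ in favor of the split macros.
      #define EIGEN_CUDACC_VER (__CUDACC_VER_MAJOR__ * 10000 + __CUDACC_VER_MINOR__ * 100)
    #elif defined(__CUDACC_VER__)
      #define EIGEN_CUDACC_VER __CUDACC_VER__
    #else
      #define EIGEN_CUDACC_VER 0  // not compiling with nvcc
    #endif

    int main() { return EIGEN_CUDACC_VER >= 0 ? 0 : 1; }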
* Merged in benoitsteiner/opencl (pull request PR-323) (Benoit Steiner, 2017-07-07)
|\
| |   Improved support for OpenCL
* | Merged in tntnatbry/eigen (pull request PR-319) (Benoit Steiner, 2017-07-07)
| |
| |   Tensor Trace op
| * Merged in mehdi_goli/upstr_benoit/TensorSYCLImageVolumePatchFixed (pull request PR-14) (Benoit Steiner, 2017-07-06)
|/
|
|   Applying Benoit's comment for fixing ImageVolumePatch:
|   * Applying Benoit's comment for fixing ImageVolumePatch. Fixing conflict in the cmake file.
|   * Fixing deallocation of the memory in the ImagePatch test for SYCL.
|   * Fixing the automerge issue.
* Merged in benoitsteiner/opencl (pull request PR-318) (Benoit Steiner, 2017-06-13)
|\
| |   Improved support for OpenCL
* | fix compilation in C++98 (Gael Guennebaud, 2017-06-09)
| |
| * Merged eigen/eigen into default (Benoit Steiner, 2017-05-26)
| |\
| * \ Merge changes from upstream (a-doumoulakis, 2017-05-24)
| |\ \
* | | | Specializing numeric_limits for AutoDiffScalar (Mmanu Chaturvedi, 2017-05-23)
| |_|/
|/| |
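A hedged sketch of the shape of such a specialization, shown on a stand-in type (Eigen ships its own specialization for AutoDiffScalar after this commit): forward the limits of the underlying value type.

    // numeric_limits_sketch.cpp
    #include <limits>

    template <typename DerType>
    struct AutoDiffScalarLike {  // stand-in for Eigen::AutoDiffScalar<DerType>
      using Scalar = typename DerType::Scalar;
      Scalar value;
      DerType derivatives;
    };

    struct VecD { using Scalar = double; };  // stand-in for a derivative vector type

    namespace std {
    template <typename DerType>
    struct numeric_limits<AutoDiffScalarLike<DerType>>
        : numeric_limits<typename DerType::Scalar> {};  // reuse the value type's traits
    }  // namespace std

    int main() {
      using AD = AutoDiffScalarLike<VecD>;
      static_assert(std::numeric_limits<AD>::is_specialized, "limits forwarded");
      return std::numeric_limits<AD>::epsilon() > 0 ? 0 : 1;
    }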
| | * Fixing CMake dependency for SYCL (Mehdi Goli, 2017-05-22)
| | |
| * | Add support for triSYCL (a-doumoulakis, 2017-05-05)
| |/
| |
| |   Eigen is now able to use triSYCL with the EIGEN_SYCL_TRISYCL and TRISYCL_INCLUDE_DIR options.
| |   Fix contraction kernel with correct nd_item dimension.
* / Use scalar_sum_op and scalar_quotient_op instead of operator+ and operator/ in MeanReducer. (RJ Ryan, 2017-04-14)
|/
|
|   Improves support for std::complex types when compiling for CUDA. Expands on e2e9cdd16970914cf0a892fea5e7c4402b3ede41 and 2bda1b0d93fb627d0c500ec48b20302d44c32cb7.
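A hedged sketch of the pattern (not the commit's diff): routing accumulation through Eigen's functors lets their specializations, e.g. for std::complex on CUDA, take effect where the raw operators may not.

    // mean_reducer_sketch.cpp
    #include <Eigen/Core>
    #include <complex>

    template <typename T>
    struct MeanReducerSketch {
      T sum = T(0);
      long count = 0;
      void reduce(const T& t) {
        sum = Eigen::internal::scalar_sum_op<T>()(sum, t);  // instead of sum += t
        ++count;
      }
      T finalize() const {
        return Eigen::internal::scalar_quotient_op<T>()(sum, T(count));  // instead of sum / count
      }
    };

    int main() {
      MeanReducerSketch<std::complex<float>> r;
      r.reduce({1.f, 2.f});
      r.reduce({3.f, 4.f});
      return r.finalize().real() == 2.f ? 0 : 1;
    }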
* Preserve file naming conventions (Benoit Steiner, 2017-04-04)
|
* Fixing TensorArgMaxSycl.h; removing a warning related to the type of dims being hardcoded to int in Argmax. (Mehdi Goli, 2017-03-28)
|
* Merged eigen/eigen into default (Benoit Steiner, 2017-03-15)
|\
* | Temporary: disables the cxx11_tensor_argmax_sycl test since it is causing a zombie thread (Luke Iwanski, 2017-03-15)
| |
| * Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases. (Rasmus Munk Larsen, 2017-03-09)
| |
| |   * Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record-yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work-stealing loop before yielding. This is wasteful for such ops, which process I/O.
| |   * Changes the number of iterations through the steal loop to be inversely proportional to the number of threads (see the sketch below). Since the time of each iteration is proportional to the number of threads, this yields a roughly constant spin time.
| |   * Implements a separate worker loop for the num_threads == 1 case, since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues, it might reverse the order in which ops are executed compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads that single-thread pools tend to be used for.
| |   * Stores num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
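A hedged sketch of the spin-budget idea from the second bullet (names and constants are illustrative, not the actual Eigen ThreadPool code):

    // spin_budget_sketch.cpp
    #include <thread>

    struct SpinPolicy {
      int num_threads;
      bool allow_spinning;  // the new hint: I/O-bound pools set this to false

      int steal_spin_budget() const {
        if (!allow_spinning || num_threads < 1) return 0;  // yield immediately
        return 5000 / num_threads;  // one steal pass costs O(num_threads),
      }                             // so total spin time stays roughly constant
    };

    void worker_wait(const SpinPolicy& p) {
      for (int i = 0; i < p.steal_spin_budget(); ++i) {
        // ... attempt to steal work from a randomly chosen victim queue ...
      }
      std::this_thread::yield();  // back off once the spin budget is spent
    }

    int main() {
      worker_wait(SpinPolicy{4, true});
      return 0;
    }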
* | Adding TensorIndexTuple and TensorTupleReduceOP backend (ArgMax/Min) for sycl; fixing the address-space issue for const TensorMap; converting all discard_write to write due to a data mismatch. (Mehdi Goli, 2017-03-07)
| |
* | Adding a sycl backend for TensorCustomOp; fixing the partial lhs modification issue on sycl when the rhs is TensorContraction, reduction or convolution; fixing the partial modification for memset when the sycl backend is used. (Mehdi Goli, 2017-02-28)
| |
* | Adding TensorVolumePatchOP.h for sycl (Mehdi Goli, 2017-02-24)
| |
* | Adding Sycl backend for TensorGenerator.h. (Mehdi Goli, 2017-02-22)
| |
* | Reducing the number of warnings. (Mehdi Goli, 2017-02-21)
| |
* | Adding Sycl backend for TensorImagePatchOP.h; adding Sycl backend for TensorInflation.h. (Mehdi Goli, 2017-02-20)
| |
* | Adding TensorLayoutSwapOp for sycl. (Mehdi Goli, 2017-02-15)
| |
* | Adding TensorPatch.h for sycl backend. (Mehdi Goli, 2017-02-15)
|/
* Adding TensorChippingOP for the sycl backend; fixing the index value in the verification operation for the cxx11_tensorChipping.cpp test (Mehdi Goli, 2017-02-13)
|
* Adding mean to TensorReductionSycl.h (Mehdi Goli, 2017-02-07)
|
* Fixing TensorReductionSycl for min and max. (Mehdi Goli, 2017-02-06)
|