path: root/unsupported/Eigen/CXX11/src/Tensor
Commit message (Author, Age)
* Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling (Eugene Zhulenev, 2019-11-26)
|
* Remove legacy block evaluation support (Eugene Zhulenev, 2019-11-12)
|
* Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs. (Rasmus Munk Larsen, 2019-11-11)
|
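The ordering constraint described in the race fix above can be sketched as follows. This is a minimal illustration under stated assumptions, not Eigen's code: `FakeDevice`, `FakeEvaluator`, `finish_async`, and `run_example` are hypothetical names that only record the order of calls.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins that only log which step ran.
struct FakeDevice {
  std::vector<std::string>* log;
  void deallocate(void* /*buffer*/) { log->push_back("deallocate"); }
};

struct FakeEvaluator {
  std::vector<std::string>* log;
  void cleanup() { log->push_back("cleanup"); }
};

// Release every resource that depends on the device first and invoke the
// completion callback last. The caller may destroy the device as soon as
// on_done() has run, so nothing may touch the device after that point.
void finish_async(FakeDevice& device, FakeEvaluator& evaluator, void* scratch,
                  std::function<void()> on_done) {
  device.deallocate(scratch);
  evaluator.cleanup();
  on_done();  // must be the final step
}

// Runs the sequence once and returns the recorded order of calls.
std::vector<std::string> run_example() {
  std::vector<std::string> log;
  FakeDevice device{&log};
  FakeEvaluator evaluator{&log};
  finish_async(device, evaluator, nullptr,
               [&log] { log.push_back("on_done"); });
  return log;
}
```

Taking the callback by value means it stays valid even if the object that originally owned it is torn down during cleanup.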
* Break loop dependence in TensorGenerator block access (Eugene Zhulenev, 2019-11-11)
|
* Add EIGEN_HAS_INTRINSIC_INT128 macro (Rasmus Munk Larsen, 2019-11-06)
|   Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.
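A feature macro of the kind this commit describes might look like the sketch below. The macro name `EXAMPLE_HAS_INTRINSIC_INT128`, the exact preprocessor condition, and the helper `mul_high_u64` are assumptions made for illustration; the commit message does not show the actual definition of EIGEN_HAS_INTRINSIC_INT128.

```cpp
#include <cstdint>

// Assumption: trust __int128 only when the compiler advertises it AND the
// toolchain is not Clang targeting Windows, where the log says the support
// is advertised but broken.
#if defined(__SIZEOF_INT128__) && !(defined(__clang__) && defined(_WIN32))
#define EXAMPLE_HAS_INTRINSIC_INT128 1
#else
#define EXAMPLE_HAS_INTRINSIC_INT128 0
#endif

// Returns the high 64 bits of the 128-bit product a * b.
inline std::uint64_t mul_high_u64(std::uint64_t a, std::uint64_t b) {
#if EXAMPLE_HAS_INTRINSIC_INT128
  return static_cast<std::uint64_t>(
      (static_cast<unsigned __int128>(a) * b) >> 64);
#else
  // Portable fallback: schoolbook multiplication on 32-bit halves.
  const std::uint64_t mask = 0xFFFFFFFFu;
  const std::uint64_t a_lo = a & mask, a_hi = a >> 32;
  const std::uint64_t b_lo = b & mask, b_hi = b >> 32;
  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;
  // Sum the middle terms; this cannot overflow 64 bits.
  const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & mask) + lo_hi;
  return hi_hi + (hi_lo >> 32) + (cross >> 32);
#endif
}
```

Call sites test a single project-level macro instead of `__SIZEOF_INT128__`, so a broken toolchain can be excluded in one place.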
* Rollback of PR-746 and partial rollback of https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494 . (Rasmus Munk Larsen, 2019-11-05)
|   std::array is still not supported in CUDA device code on Windows.
* Remove internal::smart_copy and replace with std::copy (Eugene Zhulenev, 2019-10-29)
|
* Prevent potential ODR in TensorExecutor (Eugene Zhulenev, 2019-10-28)
|
* Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738) (Rasmus Larsen, 2019-10-22)
|\
| |   Fix for the HIP build+test errors.
* | Add block evaluation V2 to TensorAsyncExecutor. (Rasmus Munk Larsen, 2019-10-22)
| |   Add async evaluation to a number of ops.
| * Fix for the HIP build+test errors. (Deven Desai, 2019-10-22)
| |   The errors were introduced by this commit :
| |
| |   After the above mentioned commit, some of the tests started failing with the following error
| |
| |   ```
| |   Built target cxx11_tensor_reduction
| |   Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
| |   In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
| |   In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
| |   /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
| |       DestinationBufferKind m_kind;
| |       ^
| |   /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
| |     DestinationBuffer m_destination;
| |     ^
| |   ```
| |
| |   For some reason HIPCC does not like device code to contain enum types which do not have the base-type explicitly declared.
| |
| |   The fix is trivial, explicitly state "int" as the basetype
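The fix described above, declaring the enum's base type explicitly, can be illustrated with a small sketch. The type and member names echo the error log, but the enumerator names and struct layout are assumptions for illustration and are not copied from TensorBlockV2.h.

```cpp
#include <type_traits>

// An unscoped enum with an explicitly declared underlying type (C++11).
// Without ": int" the compiler chooses the underlying type itself, which is
// the form the commit message says HIPCC rejected in device code.
enum DestinationBufferKind : int {
  kEmpty,       // illustrative enumerators, assumed for this sketch
  kContiguous,
  kStrided
};

// A field of the enum type, as in the struct named in the error log.
struct DestinationBuffer {
  DestinationBufferKind m_kind;
};

// The underlying type is now fixed by the declaration.
static_assert(
    std::is_same<std::underlying_type<DestinationBufferKind>::type, int>::value,
    "underlying type should be int");
```

Stating the base type does not change how the enumerators are used; it only pins down the storage type.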
* | Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers. (Rasmus Munk Larsen, 2019-10-18)
|/
* Propagate block evaluation preference through rvalue tensor expressions (Eugene Zhulenev, 2019-10-17)
|
* Cleanup Tensor block destination and materialized block storage allocation (Eugene Zhulenev, 2019-10-16)
|
* TensorBroadcasting support for random/uniform blocks (Eugene Zhulenev, 2019-10-16)
|
* Block evaluation for TensorGenerator/TensorReverse/TensorShuffling (Eugene Zhulenev, 2019-10-14)
|
* Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op (Eugene Zhulenev, 2019-10-10)
|
* Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing (Eugene Zhulenev, 2019-10-09)
|
* Add block evaluation to TensorEvalTo and fix few small bugs (Eugene Zhulenev, 2019-10-07)
|
* Fixing incorrect size in Tensor documentation. (Brian Zhao, 2019-10-04)
|
* Fix compilation warnings and errors with clang in TensorBlockV2 code and tests (Eugene Zhulenev, 2019-10-04)
|
* Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect (Eugene Zhulenev, 2019-10-02)
|
* Add beta to TensorContractionKernel and make memset optional (Eugene Zhulenev, 2019-10-02)
|
* Fix compilation warnings and errors with clang in TensorBlockV2 (Eugene Zhulenev, 2019-09-25)
|
* Fix a bug in a packed block type in TensorContractionThreadPool (Eugene Zhulenev, 2019-09-24)
|
* Choose TensorBlock StridedLinearCopy type statically (Eugene Zhulenev, 2019-09-24)
|
* Add new TensorBlock api implementation + tests (Eugene Zhulenev, 2019-09-24)
|
* Tensor block evaluation V2 support for unary/binary/broadcasting (Eugene Zhulenev, 2019-09-24)
|
* Fix (or mask away) conversion warnings introduced in 553caeb6a3bb545aef895f8fc9f219be44679017 . (Christoph Hertzberg, 2019-09-23)
|
* Add support for asynchronous evaluation of tensor casting expressions. (Rasmus Munk Larsen, 2019-09-19)
|
* Merging eigen/eigen. (Srinivas Vasudevan, 2019-09-16)
|\
* | Add Bessel functions to SpecialFunctions. (Srinivas Vasudevan, 2019-09-14)
| |   - Split SpecialFunctions files into a separate BesselFunctions file.
| |
| |   In particular add:
| |   - Modified bessel functions of the second kind k0, k1, k0e, k1e
| |   - Bessel functions of the first kind j0, j1
| |   - Bessel functions of the second kind y0, y1
| * Fix maybe-uninitialized warnings in TensorContractionThreadPool (Eugene Zhulenev, 2019-09-13)
| |
| * Use ThreadLocal container in TensorContractionThreadPool (Eugene Zhulenev, 2019-09-13)
|/
* PR 681: Add ndtri function, the inverse of the normal distribution function. (Srinivas Vasudevan, 2019-08-12)
|
* Allow move-only done callback in TensorAsyncDevice (Eugene Zhulenev, 2019-09-03)
|
* TensorMap constness should not change underlying storage constness (Eugene Zhulenev, 2019-09-03)
|
* Fixed Tensor documentation formatting. (Alberto Luaces, 2019-07-23)
|
* Fix shadow warnings in TensorContractionThreadPool (Eugene Zhulenev, 2019-08-30)
|
* Fix block mapper type name in TensorExecutor (Eugene Zhulenev, 2019-08-30)
|
* evalSubExprsIfNeededAsync + async TensorContractionThreadPool (Eugene Zhulenev, 2019-08-30)
|
* Asynchronous expression evaluation with TensorAsyncDevice (Eugene Zhulenev, 2019-08-30)
|
* Const correctness in TensorMap<const Tensor<T, ...>> expressions (Eugene Zhulenev, 2019-08-28)
|
* Remove shadow warnings in TensorDeviceThreadPool (Eugene Zhulenev, 2019-08-28)
|
* Merged in ezhulenev/eigen-01 (pull request PR-683) (Rasmus Larsen, 2019-08-26)
|\
| |   Asynchronous parallelFor in Eigen ThreadPoolDevice
* | Fix get_random_seed on Native Client (maratek, 2019-08-23)
| |   Newlib in Native Client SDK does not provide ::random function. Implement get_random_seed for NaCl using ::rand, similarly to Windows version.
| * Asynchronous parallelFor in Eigen ThreadPoolDevice (Eugene Zhulenev, 2019-08-22)
|/
* Remove XSMM support from Tensor module (Eugene Zhulenev, 2019-08-19)
|
* [Eigen] Vectorize evaluation of coefficient-wise functions over tensor blocks if the strides are known to be 1. (Rasmus Munk Larsen, 2019-08-07)
|   Provides up to 20-25% speedup of the TF cross entropy op with AVX.
|
|   A few benchmark numbers:
|
|   name                  old time/op  new time/op  delta
|   BM_Xent_16_10000_cpu  448µs ± 3%   389µs ± 2%   -13.21%  (p=0.008 n=5+5)
|   BM_Xent_32_10000_cpu  575µs ± 6%   454µs ± 3%   -21.00%  (p=0.008 n=5+5)
|   BM_Xent_64_10000_cpu  933µs ± 4%   712µs ± 1%   -23.71%  (p=0.008 n=5+5)
* Clean up unnecessary namespace specifiers in TensorBlock.h. (Rasmus Munk Larsen, 2019-08-07)
|