Commit message | Author | Age | |
---|---|---|---|
* | Add async evaluation support to TensorPadding/TensorImagePatch/TensorShuffling | Eugene Zhulenev | 2019-11-26 |
| | |||
* | Remove legacy block evaluation support | Eugene Zhulenev | 2019-11-12 |
| | |||
* | Fix a race in async tensor evaluation: Don't run on_done() until after device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs. | Rasmus Munk Larsen | 2019-11-11 |
| | |||
* | Break loop dependence in TensorGenerator block access | Eugene Zhulenev | 2019-11-11 |
| | |||
* | Add EIGEN_HAS_INTRINSIC_INT128 macro | Rasmus Munk Larsen | 2019-11-06 |
| | | | | Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice. | ||
* | Rollback of PR-746 and partial rollback of https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494 . std::array is still not supported in CUDA device code on Windows. | Rasmus Munk Larsen | 2019-11-05 |
| | |||
* | Remove internal::smart_copy and replace with std::copy | Eugene Zhulenev | 2019-10-29 |
| | |||
* | Prevent potential ODR in TensorExecutor | Eugene Zhulenev | 2019-10-28 |
| | |||
* | Merged in deven-amd/eigen-hip-fix-191018 (pull request PR-738) | Rasmus Larsen | 2019-10-22 |
|\ | | | | | | | Fix for the HIP build+test errors. | ||
* | | Add block evaluation V2 to TensorAsyncExecutor. | Rasmus Munk Larsen | 2019-10-22 |
| | | | | | | | | Add async evaluation to a number of ops. | ||
| * | Fix for the HIP build+test errors. | Deven Desai | 2019-10-22 |
| | | | | | The errors were introduced by this commit : After the above mentioned commit, some of the tests started failing with the following error: ``` Built target cxx11_tensor_reduction Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16: In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117: /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible DestinationBufferKind m_kind; ^ /home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible DestinationBuffer m_destination; ^ ``` For some reason HIPCC does not like device code to contain enum types that do not have the base type explicitly declared. The fix is trivial: explicitly state "int" as the base type. | ||
* | | Drop support for c++03 in Eigen tensor. Get rid of some code used to emulate c++11 functionality with older compilers. | Rasmus Munk Larsen | 2019-10-18 |
|/ | |||
* | Propagate block evaluation preference through rvalue tensor expressions | Eugene Zhulenev | 2019-10-17 |
| | |||
* | Cleanup Tensor block destination and materialized block storage allocation | Eugene Zhulenev | 2019-10-16 |
| | |||
* | TensorBroadcasting support for random/uniform blocks | Eugene Zhulenev | 2019-10-16 |
| | |||
* | Block evaluation for TensorGenerator/TensorReverse/TensorShuffling | Eugene Zhulenev | 2019-10-14 |
| | |||
* | Block evaluation for TensorGenerator + TensorReverse + fixed bug in tensor reverse op | Eugene Zhulenev | 2019-10-10 |
| | |||
* | Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing | Eugene Zhulenev | 2019-10-09 |
| | |||
* | Add block evaluation to TensorEvalTo and fix few small bugs | Eugene Zhulenev | 2019-10-07 |
| | |||
* | Fixing incorrect size in Tensor documentation. | Brian Zhao | 2019-10-04 |
| | |||
* | Fix compilation warnings and errors with clang in TensorBlockV2 code and tests | Eugene Zhulenev | 2019-10-04 |
| | |||
* | Add block evaluation to TensorReshaping/TensorCasting/TensorPadding/TensorSelect | Eugene Zhulenev | 2019-10-02 |
| | |||
* | Add beta to TensorContractionKernel and make memset optional | Eugene Zhulenev | 2019-10-02 |
| | |||
* | Fix compilation warnings and errors with clang in TensorBlockV2 | Eugene Zhulenev | 2019-09-25 |
| | |||
* | Fix a bug in a packed block type in TensorContractionThreadPool | Eugene Zhulenev | 2019-09-24 |
| | |||
* | Choose TensorBlock StridedLinearCopy type statically | Eugene Zhulenev | 2019-09-24 |
| | |||
* | Add new TensorBlock api implementation + tests | Eugene Zhulenev | 2019-09-24 |
| | |||
* | Tensor block evaluation V2 support for unary/binary/broadcasting | Eugene Zhulenev | 2019-09-24 |
| | |||
* | Fix (or mask away) conversion warnings introduced in 553caeb6a3bb545aef895f8fc9f219be44679017 . | Christoph Hertzberg | 2019-09-23 |
| | |||
* | Add support for asynchronous evaluation of tensor casting expressions. | Rasmus Munk Larsen | 2019-09-19 |
| | |||
* | Merging eigen/eigen. | Srinivas Vasudevan | 2019-09-16 |
|\ | |||
* | | Add Bessel functions to SpecialFunctions. | Srinivas Vasudevan | 2019-09-14 |
| | | | | | Split SpecialFunctions files into a separate BesselFunctions file. In particular, add: modified Bessel functions of the second kind (k0, k1, k0e, k1e); Bessel functions of the first kind (j0, j1); Bessel functions of the second kind (y0, y1). | ||
| * | Fix maybe-uninitialized warnings in TensorContractionThreadPool | Eugene Zhulenev | 2019-09-13 |
| | | |||
| * | Use ThreadLocal container in TensorContractionThreadPool | Eugene Zhulenev | 2019-09-13 |
|/ | |||
* | PR 681: Add ndtri function, the inverse of the normal distribution function. | Srinivas Vasudevan | 2019-08-12 |
| | |||
* | Allow move-only done callback in TensorAsyncDevice | Eugene Zhulenev | 2019-09-03 |
| | |||
* | TensorMap constness should not change underlying storage constness | Eugene Zhulenev | 2019-09-03 |
| | |||
* | Fixed Tensor documentation formatting. | Alberto Luaces | 2019-07-23 |
| | |||
* | Fix shadow warnings in TensorContractionThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Fix block mapper type name in TensorExecutor | Eugene Zhulenev | 2019-08-30 |
| | |||
* | evalSubExprsIfNeededAsync + async TensorContractionThreadPool | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Asynchronous expression evaluation with TensorAsyncDevice | Eugene Zhulenev | 2019-08-30 |
| | |||
* | Const correctness in TensorMap<const Tensor<T, ...>> expressions | Eugene Zhulenev | 2019-08-28 |
| | |||
* | Remove shadow warnings in TensorDeviceThreadPool | Eugene Zhulenev | 2019-08-28 |
| | |||
* | Merged in ezhulenev/eigen-01 (pull request PR-683) | Rasmus Larsen | 2019-08-26 |
|\ | | | | | | | Asynchronous parallelFor in Eigen ThreadPoolDevice | ||
* | | Fix get_random_seed on Native Client | maratek | 2019-08-23 |
| | | | | | | | | | | Newlib in Native Client SDK does not provide ::random function. Implement get_random_seed for NaCl using ::rand, similarly to Windows version. | ||
| * | Asynchronous parallelFor in Eigen ThreadPoolDevice | Eugene Zhulenev | 2019-08-22 |
|/ | |||
* | Remove XSMM support from Tensor module | Eugene Zhulenev | 2019-08-19 |
| | |||
* | [Eigen] Vectorize evaluation of coefficient-wise functions over tensor blocks if the strides are known to be 1. | Rasmus Munk Larsen | 2019-08-07 |
| | | | | Provides up to 20-25% speedup of the TF cross entropy op with AVX. A few benchmark numbers (old time/op, new time/op, delta): BM_Xent_16_10000_cpu: 448µs ± 3%, 389µs ± 2%, -13.21% (p=0.008 n=5+5); BM_Xent_32_10000_cpu: 575µs ± 6%, 454µs ± 3%, -21.00% (p=0.008 n=5+5); BM_Xent_64_10000_cpu: 933µs ± 4%, 712µs ± 1%, -23.71% (p=0.008 n=5+5). | ||
* | Clean up unnecessary namespace specifiers in TensorBlock.h. | Rasmus Munk Larsen | 2019-08-07 |
| |