| Commit message | Author | Date |
|---|---|---|
| Enabling per-device specialisation of packet size. | Mehdi Goli | 2018-08-01 |
| Rename Index to StorageIndex + use Eigen::Array and Eigen::Map when possible. | Eugene Zhulenev | 2018-07-27 |
| Add tiled evaluation support to TensorExecutor. | Eugene Zhulenev | 2018-07-25 |
| Avoid using memcpy for non-POD elements. | Weiming Zhao | 2018-04-11 |
| Add an EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH aliases. | Gael Guennebaud | 2017-07-17 |
| Merged in mehdi_goli/opencl/DataDependancy (pull request PR-10): wrapping the data type in the pointer class for SYCL in non-terminal nodes (not having that breaks the TensorFlow Conv2d code); applying Ronnan's comments; applying Benoit's comments. | Benoit Steiner | 2017-06-28 |
| Adding the TensorIndexTuple and TensorTupleReduceOP backends (ArgMax/Min) for SYCL; fixing the address-space issue for const TensorMap; converting all discard_write accesses to write due to a data mismatch. | Mehdi Goli | 2017-03-07 |
| Converting all parallel_for lambdas to functors to prevent kernel-name duplication errors; adding the tensorConcatinationOp backend for SYCL. | Mehdi Goli | 2016-12-16 |
| Adding the tensor contraction operation backend for SYCL; adding a test for the contractionOp SYCL backend; adding a temporary solution to prevent a memory leak in the buffer; cleaning up cxx11_tensor_buildins_sycl.h. | Mehdi Goli | 2016-12-14 |
| Merged with default. | Luke Iwanski | 2016-09-19 |
| Partial OpenCL support via SYCL, compatible with ComputeCpp CE. | Luke Iwanski | 2016-09-19 |
| Made the index type an explicit template parameter to help some compilers compile the code. | Benoit Steiner | 2016-09-02 |
| Adjust the Tensor module with respect to the recent change in the nullary functor. | Gael Guennebaud | 2016-09-01 |
| Force the inlining of a simple accessor. | Benoit Steiner | 2016-08-18 |
| Bug #1266: the half implementation has been moved to the half_impl namespace. | Benoit Steiner | 2016-07-29 |
| Moved assertions to the constructor to make the code more portable. | Benoit Steiner | 2016-06-06 |
| Add TernaryFunctors and the betainc SpecialFunction. TernaryFunctors and their executors allow operations on 3-tuples of inputs; the API is fully implemented for Arrays and Tensors, based on the binary functors. Ported the Cephes betainc function (regularized incomplete beta integral) to Eigen, with support for CPU and GPU, floats, doubles, and half types; added unit tests in array.cpp and cxx11_tensor_cuda.cu. Collapsed revision: merged helper methods for betainc across floats and doubles; added TensorGlobalFunctions with betainc() and removed betainc() from TensorBase; cleaned up the CwiseTernaryOp checks and changed igamma_helper to cephes_helper; merged incbcf and incbd into incbeta_cfe, plus more cleanup; updated TernaryOp and SpecialFunctions (betainc) based on review comments. | Eugene Brevdo | 2016-06-02 |
| Added the ability to load fp16 using the texture path; improved the performance of some reductions on fp16. | Benoit Steiner | 2016-05-11 |
| Deleted an unnecessary variable. | Benoit Steiner | 2016-04-15 |
| Eigen Tensor cost model, part 2: thread scheduling for standard evaluators and reductions. The cost model is turned off by default. | Rasmus Munk Larsen | 2016-04-14 |
| Eigen cost model, part 1: implements a basic recursive framework to estimate the cost of evaluating tensor expressions. | Rasmus Munk Larsen | 2016-04-14 |
| Fixed the tensor chipping code. | Benoit Steiner | 2016-03-08 |
| Decoupled the packet type definition from the definition of the tensor ops. All the vectorization is now defined in the tensor evaluators, which will make it possible to reliably support devices with different packet types in the same compilation unit. | Benoit Steiner | 2016-03-08 |
| Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression. | Benoit Steiner | 2016-01-19 |
| Added support for rank-0 tensors. | Benoit Steiner | 2015-10-29 |
| Fix the Tensor module with respect to the recent nullary functor change. | Gael Guennebaud | 2015-08-09 |
| Use NumTraits<T>::RequireInitialization instead of internal::is_arithmetic<T>::value to check whether it is possible to bypass the type constructor in the tensor code. | Benoit Steiner | 2015-07-07 |
| Only attempt to use the texture path on GPUs when it is supported by CUDA. | Benoit Steiner | 2015-07-06 |
| Sped up the assignment of a tensor to a tensor slice, as well as the assignment of a constant slice to a tensor. | Benoit Steiner | 2015-04-20 |
| Fixed the vectorized implementation of the Tensor select() method. | Benoit Steiner | 2015-03-25 |
| Silenced more compilation warnings. | Benoit Steiner | 2015-02-10 |
| Silenced a few compilation warnings. | Benoit Steiner | 2015-02-10 |
| Silenced several compilation warnings. | Benoit Steiner | 2015-02-10 |
| Fixed the return type of coefficient-wise operations. For example, the abs function returns a floating-point value when called on a complex input. | Benoit Steiner | 2015-01-14 |
| Fixed the return type of the coefficient-wise tensor operations. | Benoit Steiner | 2014-11-04 |
| Fixed the return types of unary and binary expressions to properly handle the case where it differs from the input type (e.g. abs(complex<float>)). | Benoit Steiner | 2014-10-16 |
| Misc improvements and cleanups. | Benoit Steiner | 2014-10-13 |
| Added support for in-place evaluation of simple tensor expressions; use memcpy to speed up tensor copies whenever possible. | Benoit Steiner | 2014-08-13 |
| Improved evaluation of tensor expressions when used as rvalues. | Benoit Steiner | 2014-07-08 |
| Reworked the expression evaluation mechanism to make it possible to efficiently compute convolutions and contractions in the future: the scheduling of computation is moved out of the assignment code and into a new TensorExecutor class; the assignment itself is now a regular node on the expression tree; the expression evaluators start by recursively evaluating all their subexpressions if needed. | Benoit Steiner | 2014-06-13 |
| TensorEval is now typed on the device: this will make it possible to use partial template specialization to optimize the strategy of each evaluator for each device type. Started work on partial evaluations. | Benoit Steiner | 2014-06-10 |
| Added support for tensor contractions; updated the expression evaluation mechanism to also compute the size of the tensor result; misc fixes and improvements. | Benoit Steiner | 2014-06-04 |
| Added support for additional tensor operations: comparison (<, <=, ==, !=, ...), selection, nullary ops such as random or constant generation, and misc unary ops such as log(), exp(), or a user-defined unaryExpr(). Cleaned up the code a little. | Benoit Steiner | 2014-05-22 |
| Vectorized the evaluation of tensor expressions (using SSE, AVX, NEON, ...); added the ability to parallelize the evaluation of a tensor expression over multiple CPU cores; added the ability to offload the evaluation of a tensor expression to a GPU. | Benoit Steiner | 2014-05-16 |
| Added support for fixed-sized tensors; improved support for tensor expressions. | Benoit Steiner | 2014-05-06 |
| Extended support for Tensors: added the ability to map a region of memory to a tensor; added basic support for unary and binary coefficient-wise expressions, such as addition or square root; provided an emulation layer to make it possible to compile the code with compilers (such as nvcc) that do not support C++11. | Benoit Steiner | 2014-04-28 |