| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original swap approach leads to potential undefined behavior (reading
uninitialized memory) and results in unnecessary copying of data for static
storage.
Here we pass down the move assignment to the underlying storage. Static
storage does a one-way copy, dynamic storage does a swap.
Modified the tests to no longer read from the moved-from matrix/tensor,
since that can lead to UB. Added a test to ensure we do not access
uninitialized memory in a move.
Fixes: #2119
|
|
|
|
| |
FixedDimensions.
|
|
|
|
|
|
|
| |
Removed m_dimension as instance member of TensorStorage with
FixedDimensions and instead use the template parameter. This
means that the sizeof a pure fixed-size storage is exactly
equal to the data it is storing.
|
|
|
|
|
|
|
|
| |
https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494
.
std::array is still not supported in CUDA device code on Windows.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
encoding it in the options.
|
| |
|
| |
|
|
|
|
| |
enables us to use 32bit indices to evaluate expressions on GPU faster while keeping the ability to use 64 bit indices to manipulate large tensors on CPU in the same binary.
|
|
|
|
| |
cuda kernels.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Updated expression evaluation mechanism to also compute the size of the tensor result
Misc fixes and improvements.
|
|
|
|
|
| |
Added the ability to parallelize the evaluation of a tensor expression over multiple cpu cores.
Added the ability to offload the evaluation of a tensor expression to a GPU.
|
|
|
|
| |
Improved support for tensor expressions.
|
|
|
|
|
|
| |
* Added ability to map a region of the memory to a tensor
* Added basic support for unary and binary coefficient wise expressions, such as addition or square root
* Provided an emulation layer to make it possible to compile the code with compilers (such as nvcc) that don't support cxx11.
|
| |
|
|
This commit adds an initial implementation of a class template Tensor
that allows for the storage of objects with more than two indices.
Currently, only storing data and setting the object to zero for POD
data types are implemented.
|