| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
branch.
* Unifying all loadLocalTile from lhs and rhs to an extract_block function.
* Adding get_tensor operation which was missing in TensorContractionMapper.
* Adding the -D method missing from cmake for Disable_Skinny Contraction operation.
* Wrapping all the indices in TensorScanSycl into Scan parameter struct.
* Fixing typo in Device SYCL
* Unifying load to private register for tall/skinny no shared
* Unifying load to vector tile for tensor-vector/vector-tensor operation
* Removing all the LHS/RHS class for extracting data from global
* Removing Outputfunction from TensorContractionSkinnyNoshared.
* Combining the local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel.
* Combining General Tensor-Vector and VectorTensor contraction into one kernel.
* Making double buffering optional for Tensor contraction when local memory is version is used.
* Modifying benchmark to accept custom Reduction Sizes
* Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host
* Adding Test for SYCL
* Modifying SYCL CMake
|
|
|
|
| |
sycl; fixing the address space issue for const TensorMap; converting all discard_write to write due to data missmatch.
|
|
|
|
| |
issue on sycl when the rhs is TensorContraction, reduction or convolution; Fixing the partial modification for memset when sycl backend is used.
|
| |
|
|
|
|
| |
TensorInflation.h.
|
| |
|
|
|
|
| |
verification operation for cxx11_tensorChipping.cpp test
|
|
|
|
| |
Tensor Contractsycl to be located in any place in the expression tree.
|
|
|
|
| |
contractionOp sycl backend; adding temporary solution to prevent memory leak in buffer; cleaning up cxx11_tensor_buildins_sycl.h
|
|
|
|
| |
TensorIntDiv.h; disabling cashsize for sycl in tensorDeviceDefault.h; adding sycl backend for StrideSliceOP ; removing sycl compiler warning for creating an array of size 0 in CXX11Meta.h; cleaning up the sycl backend code.
|
|
|
|
| |
sycl; cleaning up the sycl backend.
|
| |
|
| |
|
|
|
|
| |
Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size;
|
| |
|
|
|