Commit message | Author | Age | |
---|---|---|---|
* | Eigen moved the `scanLauncher` function inside the internal namespace. | mehdi-goli | 2020-05-11 |
| | This commit applies the following changes: - Moving the `scanLauncher` specialization inside the internal namespace to fix a compiler crash on TensorScan for the SYCL backend. - Replacing `SYCL/sycl.hpp` with `CL/sycl.hpp` in order to follow the SYCL 1.2.1 standard. - Minor fixes: commenting out an unused variable to avoid compiler warnings. | ||
* | [SYCL] Rebasing the SYCL support branch on top of the Eigen upstream master branch. | Mehdi Goli | 2019-11-28 |
| | * Unifying all loadLocalTile from lhs and rhs to an extract_block function. * Adding get_tensor operation which was missing in TensorContractionMapper. * Adding the -D method missing from cmake for Disable_Skinny Contraction operation. * Wrapping all the indices in TensorScanSycl into Scan parameter struct. * Fixing typo in Device SYCL * Unifying load to private register for tall/skinny no shared * Unifying load to vector tile for tensor-vector/vector-tensor operation * Removing all the LHS/RHS class for extracting data from global * Removing Outputfunction from TensorContractionSkinnyNoshared. * Combining the local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining the no-local memory version of tall/skinny and normal tensor contraction into one kernel. * Combining General Tensor-Vector and VectorTensor contraction into one kernel. * Making double buffering optional for Tensor contraction when the local memory version is used. * Modifying benchmark to accept custom Reduction Sizes * Disabling AVX optimization for SYCL backend on the host to allow SSE optimization to the host * Adding Test for SYCL * Modifying SYCL CMake | ||
* | Adding synchronisation to convolution kernel for sycl backend. | Mehdi Goli | 2017-03-13 |
| | |||
* | Fixing typo in sycl Benchmark. | Mehdi Goli | 2017-03-08 |
| | |||
* | Adding sycl Benchmarks. | Mehdi Goli | 2017-03-08 |
| | |||
* | Fixed the sycl benchmarking code | Benoit Steiner | 2016-12-22 |
| | |||
* | Partial OpenCL support via SYCL compatible with ComputeCpp CE. | Luke Iwanski | 2016-09-19 |
| | |||
* | Updated the README file for the tensor benchmarks | Benoit Steiner | 2016-05-25 |
| | |||
* | Improved the performance of tensor padding | Benoit Steiner | 2016-05-25 |
| | |||
* | Added benchmarks for contraction on CPU. | Benoit Steiner | 2016-05-13 |
| | |||
* | Added a benchmark to measure the performance of full reductions of 16 bit floats | Benoit Steiner | 2016-05-05 |
| | |||
* | Use index list for the striding benchmarks | Benoit Steiner | 2016-04-21 |
| | |||
* | Enable the benchmarks for algebraic and transcendental functions on fp16. | Benoit Steiner | 2016-04-12 |
| | |||
* | Turned on the contraction benchmarks for fp16 | Benoit Steiner | 2016-04-12 |
| | |||
* | Turn on the coeffWise benchmarks on fp16 | Benoit Steiner | 2016-04-07 |
| | |||
* | Fixed the type casting benchmarks for fp16 | Benoit Steiner | 2016-04-07 |
| | |||
* | Fixed the benchmarking of fp16 coefficient wise operations | Benoit Steiner | 2016-04-07 |
| | |||
* | Updated the benchmarking code to use Eigen::half instead of half | Benoit Steiner | 2016-03-24 |
| | |||
* | Made the tensor benchmarks compile on MacOS | Benoit Steiner | 2016-03-23 |
| | |||
* | Added benchmarks for full reduction | Benoit Steiner | 2016-02-29 |
| | |||
* | Improved the README | Benoit Steiner | 2016-02-27 |
| | |||
* | Added benchmarks for type casting of float16 | Benoit Steiner | 2016-02-26 |
| | |||
* | Added benchmarks for fp16 | Benoit Steiner | 2016-02-26 |
| | |||
* | Extended the tensor benchmark suite to support types other than floats | Benoit Steiner | 2016-02-23 |
| | |||
* | Updated the tensor benchmarking code to work with compilers that don't support cxx11. | Benoit Steiner | 2016-02-23 |
| | |||
* | Added 2 benchmarks to the suite of tensor benchmarks running on GPU | Benoit Steiner | 2016-01-30 |
| | |||
* | Fixed the tensor benchmarks on apple devices | Benoit Steiner | 2016-01-28 |
| | |||
* | Fixed clang related compilation error | Benoit Steiner | 2016-01-28 |
| | |||
* | Fixed a typo | Benoit Steiner | 2016-01-28 |
| | |||
* | Made sure the number of floating point operations done by a benchmark is computed using 64 bit integers to avoid overflows. | Benoit Steiner | 2016-01-28 |
| | |||
* | Added a readme to explain how to compile the tensor benchmarks. | Benoit Steiner | 2016-01-28 |
| | |||
* | Updated the benchmarking code to print the number of flops processed instead of the number of bytes. | Benoit Steiner | 2016-01-28 |
| | |||
* | Added extra tensor benchmarks | Benoit Steiner | 2016-01-28 |
| | |||
* | bugfix | Yangqing Jia | 2016-01-28 |
| | |||
* | benchmark modifications to make it compilable in a standalone fashion. | Yangqing Jia | 2016-01-28 |
| | |||
* | Added a few benchmarks for the tensor code | Benoit Steiner | 2015-01-26 |