Commit message | Author | Age | |
---|---|---|---|
* | Merged in benoitsteiner/opencl (pull request PR-253): OpenCL improvements | Benoit Steiner | 2016-11-19 |
|\ | |||
| * | Added the ability to run tests exclusively on OpenCL devices that are listed by sycl::device::get_devices(). | Benoit Steiner | 2016-11-18 |
| | | |||
* | | Deleted unnecessary semicolons | Benoit Steiner | 2016-11-18 |
| | | |||
| * | Cleaned up the sycl device code | Benoit Steiner | 2016-11-18 |
| | | |||
| * | Adding Benoit's changes to TensorDeviceSycl.h | Mehdi Goli | 2016-11-18 |
| | | |||
| * | Modifying TensorDeviceSycl.h to always create buffers of type uint8_t and convert them to the actual type at execution time on the device; adding the queue interface class to separate the lifespan of the sycl queue, and the buffers created for that queue, from Eigen::SyclDevice; modifying sycl tests to support evaluating the results for both row-major and column-major data layouts on all devices supported by SYCL (CPU, GPU, and Host). | Mehdi Goli | 2016-11-18 |
| | | |||
| * | Merged eigen/eigen into default | Benoit Steiner | 2016-11-17 |
| |\ | |/ |/| | |||
| * | Added a way to detect errors generated by the opencl device from the host | Benoit Steiner | 2016-11-17 |
| | | |||
| * | Cleanup | Benoit Steiner | 2016-11-17 |
| | | |||
| * | Created a test to check that the sycl runtime can successfully report errors (like division by 0). Small cleanup | Benoit Steiner | 2016-11-17 |
| | | |||
* | | Made TensorDeviceCuda.h compile on windows | Benoit Steiner | 2016-11-17 |
|/ | |||
* | Merged eigen/eigen into default | Benoit Steiner | 2016-11-14 |
|\ | |||
| * | Reduce dispatch overhead in parallelFor by only calling thread_pool.Schedule() for one of the two recursive calls in handleRange. This avoids going through the schedule path to push both recursive calls onto another thread-queue in the binary tree, but instead executes one of them on the main thread. At the leaf level this will still activate a full complement of threads, but will save up to 50% of the overhead in Schedule (random number generation, insertion in queue which includes signaling via atomics). | Rasmus Munk Larsen | 2016-11-14 |
| | | |||
* | | Adding extra test for non-fixed size to broadcast; Replacing stcl with sycl. | Mehdi Goli | 2016-11-14 |
| | | |||
* | | Adding TensorFixsize; adding sycl device memcpy; adding initial stage of slicing. | Mehdi Goli | 2016-11-14 |
| | | |||
* | | Adding comment to TensorDeviceSycl.h and cleaning the code. | Mehdi Goli | 2016-11-11 |
|/ | |||
* | Adding EIGEN_STRONG_INLINE back; using size() instead of dimensions.TotalSize() on Tensor. | Mehdi Goli | 2016-11-10 |
| | |||
* | adding the missing in eigen_assert! | Mehdi Goli | 2016-11-10 |
| | |||
* | Adding Memset; optimising memcpyDeviceToHost by removing double copying; | Mehdi Goli | 2016-11-10 |
| | |||
* | Fixed the formatting of the code | Benoit Steiner | 2016-11-08 |
| | |||
* | #if EIGEN_EXCEPTION -> #ifdef EIGEN_EXCEPTIONS. | Luke Iwanski | 2016-11-08 |
| | |||
* | Fix for SYCL queue initialisation. | Luke Iwanski | 2016-11-08 |
| | |||
* | Use try/catch only when exceptions are enabled. | Luke Iwanski | 2016-11-08 |
| | |||
* | Converting all sycl buffers to uninitialised device-only buffers; adding memcpyHostToDevice and memcpyDeviceToHost on syclDevice; modifying all examples to obey the new rules; moving sycl queue creation to the device based on Benoit's suggestion; removing the sycl-specific condition for returning m_result in TensorReduction.h according to Benoit's suggestion. | Mehdi Goli | 2016-11-08 |
| | |||
* | Removed the sycl include from Eigen/Core and moved it to Unsupported/Eigen/CXX11/Tensor; added TensorReduction for sycl (full reduction and partial reduction); added TensorReduction test case for sycl (full reduction and partial reduction); fixed the tile size on TensorSyclRun.h based on the device max work group size; | Mehdi Goli | 2016-11-04 |
| | |||
* | Disable vectorization on device only when compiling for sycl | Benoit Steiner | 2016-11-02 |
| | |||
* | Fixed the ambiguity in calling make_tuple for sycl backend. | Mehdi Goli | 2016-10-31 |
| | |||
* | Worked around Visual Studio compilation errors | Benoit Steiner | 2016-10-28 |
| | |||
* | Added missing template parameters | Benoit Steiner | 2016-10-28 |
| | |||
* | Workaround MSVC issue. | Gael Guennebaud | 2016-10-27 |
| | |||
* | Removed a template parameter for fixed sized tensors | Benoit Steiner | 2016-10-26 |
| | |||
* | Replaced tabs with spaces | Benoit Steiner | 2016-10-25 |
| | |||
* | Code cleanup | Benoit Steiner | 2016-10-25 |
| | |||
* | Merge latest updates from trunk | Benoit Steiner | 2016-10-20 |
|\ | |||
| * | Fixed a few typos in the ternary tensor expressions types | Benoit Steiner | 2016-10-19 |
| | | |||
* | | Fixing the code indentation in the TensorReduction.h file. | Mehdi Goli | 2016-10-14 |
| | | |||
* | | Merged ComputeCpp into default. | Luke Iwanski | 2016-10-14 |
|\ \ | |||
| * | | Reducing the code by generalising sycl backend functions/structs. | Mehdi Goli | 2016-10-14 |
| | | | |||
* | | | Merged eigen/eigen into default | Benoit Steiner | 2016-10-12 |
|\ \ \ | | |/ | |/| | |||
* | | | Merge the content of the ComputeCpp branch into the default branch | Benoit Steiner | 2016-10-07 |
|\ \ \ | | |/ | |/| | |||
| | * | Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*. | RJ Ryan | 2016-10-06 |
| |/ |/| | |||
| * | Pull the latest updates from trunk | Benoit Steiner | 2016-10-05 |
| |\ | |||
| * | | Fixed compilation warning | Benoit Steiner | 2016-10-05 |
| | | | |||
* | | | ::rand() returns a signed integer on win32 | Benoit Steiner | 2016-10-05 |
| | | | |||
* | | | Fixed a typo that impacts windows builds | Benoit Steiner | 2016-10-05 |
| |/ |/| | |||
* | | Silenced compilation warning | Benoit Steiner | 2016-10-04 |
| | | |||
* | | Cleanup the cuda executor code. | Benoit Steiner | 2016-10-04 |
| | | |||
* | | Cleaned up the random number generation code. | Benoit Steiner | 2016-10-04 |
| | | |||
* | | Updated the tensor sum and mean reducer to enable them to process complex numbers on cuda gpus. | Benoit Steiner | 2016-09-28 |
| | | |||
| * | Converting alias template to nested struct in order to be compatible with CXX-03 | Mehdi Goli | 2016-09-27 |
| | |