Commit message | Author | Age | |
---|---|---|---|
* | Silenced compilation warning | Benoit Steiner | 2017-03-15 |
| | |||
* | Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304) | Benoit Steiner | 2017-03-15 |
|\ | Fixed compilation with cuda-clang | ||
* | | better check array index before using it | Gael Guennebaud | 2017-03-15 |
| | | |||
* | | ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32. | Benoit Jacob | 2017-03-15 |
| | | |||
| * | Fixed compilation with cuda-clang | Ilya Biryukov | 2017-03-06 |
|/ | |||
* | Made the reduction code compile with cuda-clang | Benoit Steiner | 2017-03-14 |
| | |||
* | Get rid of Init(). | Rasmus Munk Larsen | 2017-03-10 |
| | |||
* | Use C++11 ctor forwarding to simplify code a bit. | Rasmus Munk Larsen | 2017-03-10 |
| | |||
* | Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases. | Rasmus Munk Larsen | 2017-03-09 |
| | * Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O. | ||
| | * This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time. | ||
| | * Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single thread pools tend to be used for. | ||
| | * Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size(). | ||
* | bug #1401: fix compilation of "cond ? x : -x" with x an AutoDiffScalar | Gael Guennebaud | 2017-03-08 |
| | |||
* | fix typo | Gael Guennebaud | 2017-03-07 |
| | |||
* | remove UTF8 symbol | Gael Guennebaud | 2017-03-07 |
| | |||
* | remove UTF8 symbols | Gael Guennebaud | 2017-03-07 |
| | |||
* | do not include std header within extern C | Gael Guennebaud | 2017-03-07 |
| | |||
* | bug #1400: fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY | Gael Guennebaud | 2017-03-07 |
| | |||
* | Made the Tensor code compile with clang 3.9 | Benoit Steiner | 2017-03-02 |
| | |||
* | Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that: | Benoit Steiner | 2017-03-01 |
| | * they're used consistently between the declaration and the definition of a function | ||
| | * we avoid calling host only methods from host device methods. | ||
* | Silenced a couple of compilation warnings | Benoit Steiner | 2017-03-01 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-03-01 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | Made most of the packet math primitives usable within CUDA kernel when compiling with clang | Benoit Steiner | 2017-02-28 |
* | Silenced clang compilation warning. | Benoit Steiner | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC | Benoit Steiner | 2017-02-28 |
| | |||
* | Made the TensorStorage class compile with clang 3.9 | Benoit Steiner | 2017-02-28 |
| | |||
* | Deleted extra: EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet. | Benoit Steiner | 2017-02-28 |
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers | Benoit Steiner | 2017-02-28 |
| | |||
* | bug #1396: add some missing EIGEN_DEVICE_FUNC | Gael Guennebaud | 2017-02-28 |
| | |||
* | Fix typo. | Gael Guennebaud | 2017-02-28 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops | Benoit Steiner | 2017-02-27 |
| | |||
* | Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods. | Benoit Steiner | 2017-02-27 |
| | |||
* | Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel | Benoit Steiner | 2017-02-27 |
* | Added missing EIGEN_DEVICE_FUNC qualifiers. | Benoit Steiner | 2017-02-27 |
| | |||
* | Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase | Benoit Steiner | 2017-02-27 |
* | Avoid unnecessary float to double conversions. | Benoit Steiner | 2017-02-27 |
| | |||
* | bug #1394: fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse); | Gael Guennebaud | 2017-02-20 |
| | |||
* | bug #1380: for Map<> as input of matrix exponential | Gael Guennebaud | 2017-02-20 |
| | |||
* | bug #1395: fix the use of compile-time vectors as inputs of JacobiSVD. | Gael Guennebaud | 2017-02-20 |
| | |||
* | Silent warning. | Gael Guennebaud | 2017-02-20 |
| | |||
* | Fix usage of CUDACC_VER | Gael Guennebaud | 2017-02-20 |
| | |||
* | Fix tracking of temporaries in unit tests | Gael Guennebaud | 2017-02-19 |
| | |||
* | Fix compilation. | Gael Guennebaud | 2017-02-18 |
| | |||
* | Use int32_t instead of int in NEON code. Some platforms with 16 bytes int supports ARM NEON. | Gael Guennebaud | 2017-02-17 |
* | bug #1393: enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2) | Gael Guennebaud | 2017-02-17 |
* | Size indices are signed. | Benoit Steiner | 2017-02-16 |
| | |||
* | Merged eigen/eigen into default | Benoit Steiner | 2017-02-14 |
|\ | |||
* | | Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test | Mehdi Goli | 2017-02-13 |
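
Note on the 2017-03-09 threadpool commit: its message says the number of steal-loop iterations becomes inversely proportional to the number of threads, and that a hint can turn spin waiting off. A minimal sketch of that arithmetic follows. The function name `spin_iterations` and the `budget` parameter are hypothetical and chosen for illustration; they are not taken from the Eigen sources, and the actual constants used by the commit are not stated in this log.

```cpp
#include <cassert>

// Hypothetical helper (not Eigen's API): how many times a worker should go
// through the work-stealing loop before it stops spinning and blocks.
//
// allow_spinning: the hint described in the commit message; false disables
//                 spin waiting entirely (useful for I/O-bound pools).
// num_threads:    size of the pool. One pass through the steal loop costs
//                 time roughly proportional to this value.
// budget:         assumed total spin budget, in "single-thread iterations".
inline int spin_iterations(bool allow_spinning, int num_threads, int budget) {
  assert(budget >= 0);
  // No spinning when the hint says so, or when the pool size is invalid.
  if (!allow_spinning || num_threads <= 0) return 0;
  // Dividing by the thread count keeps the total spin time roughly
  // constant: iterations * (cost per iteration ~ num_threads) ~ budget.
  return budget / num_threads;
}
```

With a budget of 1000, a 4-thread pool would spin for 250 iterations and a 1-thread pool for 1000, while any pool created with spinning disabled would not spin at all.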