Commit message (author, date)
* Silenced compilation warning (Benoit Steiner, 2017-03-15)
|
* Merged in ilya-biryukov/eigen/fix_clang_cuda_compilation (pull request PR-304) (Benoit Steiner, 2017-03-15)
|\    Fixed compilation with cuda-clang
* | better check array index before using it (Gael Guennebaud, 2017-03-15)
| |
* | ARM prefetch fixes: Implement prefetch on ARM64. Do not clobber cc on ARM32. (Benoit Jacob, 2017-03-15)
| |
| * Fixed compilation with cuda-clang (Ilya Biryukov, 2017-03-06)
|/
* Made the reduction code compile with cuda-clang (Benoit Steiner, 2017-03-14)
|
* Get rid of Init(). (Rasmus Munk Larsen, 2017-03-10)
|
* Use C++11 ctor forwarding to simplify code a bit. (Rasmus Munk Larsen, 2017-03-10)
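
"Ctor forwarding" here most likely refers to C++11 delegating constructors, which let one constructor call another and so remove the need for a shared Init() helper. The sketch below illustrates the language feature only; PoolOptions and its members are hypothetical names, not taken from the Eigen sources.

```cpp
// Hypothetical class, for illustration only (not Eigen code).
class PoolOptions {
 public:
  // Convenience constructor: delegates to the two-argument constructor
  // instead of duplicating the member initialisation or calling Init().
  explicit PoolOptions(int num_threads) : PoolOptions(num_threads, true) {}

  // Target constructor: the only place where members are initialised.
  PoolOptions(int num_threads, bool allow_spinning)
      : num_threads_(num_threads), allow_spinning_(allow_spinning) {}

  int num_threads() const { return num_threads_; }
  bool allow_spinning() const { return allow_spinning_; }

 private:
  int num_threads_;
  bool allow_spinning_;
};
```

Because every constructor funnels into one target constructor, members can stay const or reference-typed, which a separate Init() method would not allow.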
|
* Make the non-blocking threadpool more flexible and less wasteful of CPU cycles for high-latency use-cases. (Rasmus Munk Larsen, 2017-03-09)
|     - Adds a hint to ThreadPool allowing us to turn off spin waiting. Currently each reader and record yielder op in a graph creates a threadpool with a thread that spins for 1000 iterations through the work stealing loop before yielding. This is wasteful for such ops that process I/O.
|     - This also changes the number of iterations through the steal loop to be inversely proportional to the number of threads. Since the time of each iteration is proportional to the number of threads, this yields roughly a constant spin time.
|     - Implement a separate worker loop for the num_threads == 1 case since there is no point in going through the expensive steal loop. Moreover, since Steal() calls PopBack() on the victim queues it might reverse the order in which ops are executed, compared to the order in which they are scheduled, which is usually counter-productive for the types of I/O workloads the single-thread pools tend to be used for.
|     - Store num_threads in a member variable for simplicity and to avoid a data race between the thread creation loop and worker threads calling threads_.size().
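
The spin-count rule described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions: SpinPolicy, kSpinBudget, SpinIterations and UseSimpleWorkerLoop are hypothetical names, and the budget constant is an arbitrary illustrative value rather than the one used in the actual thread pool.

```cpp
#include <algorithm>

// Hypothetical names throughout; this is not the actual ThreadPool code.
struct SpinPolicy {
  bool allow_spinning;  // the "hint": set to false for I/O-bound pools
  int num_threads;      // stored once, so workers never call threads_.size()
};

// Illustrative total spin budget, counted in queue visits (assumed value).
constexpr int kSpinBudget = 5000;

// Number of passes through the steal loop before a worker blocks.
// One pass costs time proportional to num_threads, so dividing the budget
// by num_threads keeps the total spin time roughly constant.
inline int SpinIterations(const SpinPolicy& p) {
  if (!p.allow_spinning || p.num_threads <= 0) return 0;
  return std::max(1, kSpinBudget / p.num_threads);
}

// With a single thread there is no victim queue to steal from, so a simpler
// loop that runs tasks in scheduling order is enough.
inline bool UseSimpleWorkerLoop(const SpinPolicy& p) {
  return p.num_threads == 1;
}
```

With these assumed values, a 4-thread pool would make 1250 passes and a 50-thread pool 100 passes, while a pool created with the hint turned off would not spin at all.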
* bug #1401: fix compilation of "cond ? x : -x" with x an AutoDiffScalar (Gael Guennebaud, 2017-03-08)
|
* fix typo (Gael Guennebaud, 2017-03-07)
|
* remove UTF8 symbol (Gael Guennebaud, 2017-03-07)
|
* remove UTF8 symbols (Gael Guennebaud, 2017-03-07)
|
* do not include std header within extern C (Gael Guennebaud, 2017-03-07)
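
As a general illustration of the rule named in the commit above: standard headers declare templates and overloads that need C++ linkage, so they belong outside any extern "C" block. The sketch below shows the usual layout; my_c_api_length is a hypothetical function, not part of Eigen.

```cpp
// Standard headers are included at file scope, before any extern "C" block.
#include <cstring>

extern "C" {
// Only the C-linkage declarations live inside the block.
// Hypothetical function, for illustration only.
int my_c_api_length(const char* s);
}

// The definition inherits C linkage from the declaration above.
int my_c_api_length(const char* s) {
  return static_cast<int>(std::strlen(s));
}
```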
|
* bug #1400: fix stableNorm with EIGEN_DONT_ALIGN_STATICALLY (Gael Guennebaud, 2017-03-07)
|
* Made the Tensor code compile with clang 3.9 (Benoit Steiner, 2017-03-02)
|
* Adjusted the EIGEN_DEVICE_FUNC qualifiers to make sure that: (Benoit Steiner, 2017-03-01)
|     - they're used consistently between the declaration and the definition of a function
|     - we avoid calling host-only methods from host device methods.
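
The first point above can be sketched with a stand-in macro. SKETCH_DEVICE_FUNC and Scaler are hypothetical names chosen for this example; the macro is assumed to behave like a host/device qualifier, expanding to `__host__ __device__` under a CUDA compiler and to nothing otherwise.

```cpp
// Stand-in for a host/device qualifier macro (hypothetical name).
#if defined(__CUDACC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Hypothetical type, for illustration only.
struct Scaler {
  float factor;
  // Declaration carries the qualifier...
  SKETCH_DEVICE_FUNC float apply(float x) const;
};

// ...and the out-of-line definition repeats it, so both agree on where the
// function may run. clang's CUDA mode is stricter than nvcc about mismatches.
SKETCH_DEVICE_FUNC inline float Scaler::apply(float x) const {
  return factor * x;
}
```

On a host-only build the macro expands to nothing, so the same source compiles unchanged with an ordinary C++ compiler.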
* Silenced a couple of compilation warnings (Benoit Steiner, 2017-03-01)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-03-01)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
|
* Made most of the packet math primitives usable within a CUDA kernel when compiling with clang (Benoit Steiner, 2017-02-28)
* Silenced clang compilation warning. (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC (Benoit Steiner, 2017-02-28)
|
* Made the TensorStorage class compile with clang 3.9 (Benoit Steiner, 2017-02-28)
|
* Deleted extra EIGEN_DEVICE_FUNC: the QR and Cholesky code isn't ready to run on GPU yet. (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2017-02-28)
* bug #1396: add some missing EIGEN_DEVICE_FUNC (Gael Guennebaud, 2017-02-28)
|
* Fix typo. (Gael Guennebaud, 2017-02-28)
|
* Added missing EIGEN_DEVICE_FUNC to the SelfCwise binary ops (Benoit Steiner, 2017-02-27)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers to several nullary op methods. (Benoit Steiner, 2017-02-27)
|
* Declared the plset, ploadt_ro, and ploaddup packet primitives as usable within a gpu kernel (Benoit Steiner, 2017-02-27)
|
* Added missing EIGEN_DEVICE_FUNC qualifiers. (Benoit Steiner, 2017-02-27)
|
* Added EIGEN_DEVICE_FUNC to make the prototype of the EigenBase override match that of DenseBase (Benoit Steiner, 2017-02-27)
|
* Avoid unnecessary float-to-double conversions. (Benoit Steiner, 2017-02-27)
|
* bug #1394: fix compilation of SelfAdjointEigenSolver<Matrix>(sparse*sparse); (Gael Guennebaud, 2017-02-20)
|
* bug #1380: fix for Map<> as input of matrix exponential (Gael Guennebaud, 2017-02-20)
|
* bug #1395: fix the use of compile-time vectors as inputs of JacobiSVD. (Gael Guennebaud, 2017-02-20)
|
* Silence warning. (Gael Guennebaud, 2017-02-20)
|
* Fix usage of CUDACC_VER (Gael Guennebaud, 2017-02-20)
|
* Fix tracking of temporaries in unit tests (Gael Guennebaud, 2017-02-19)
|
* Fix compilation. (Gael Guennebaud, 2017-02-18)
|
* Use int32_t instead of int in NEON code. Some platforms with a 16-bit int support ARM NEON. (Gael Guennebaud, 2017-02-17)
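
The reasoning behind the commit above is that the width of plain int is implementation-defined, while code written for 32-bit SIMD lanes needs exactly 32 bits. The sketch below shows the idea in portable C++ without NEON intrinsics; Lanes4 and AddLanes are hypothetical names, not Eigen or NEON APIs.

```cpp
#include <climits>
#include <cstdint>

// int32_t is exactly 32 bits wherever it exists; plain int only guarantees
// at least 16 bits.
static_assert(sizeof(std::int32_t) * CHAR_BIT == 32,
              "int32_t must be exactly 32 bits wide");

// Hypothetical stand-in for a 128-bit register holding four 32-bit lanes.
struct Lanes4 {
  std::int32_t v[4];
};

// Lane-wise addition; the element type stays 32 bits on every platform.
inline Lanes4 AddLanes(const Lanes4& a, const Lanes4& b) {
  Lanes4 r;
  for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
  return r;
}
```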
* bug #1393: enable Matrix/Array explicit ctor from types with conversion operators (was ok with 3.2) (Gael Guennebaud, 2017-02-17)
|
* Size indices are signed. (Benoit Steiner, 2017-02-16)
|
* Merged eigen/eigen into default (Benoit Steiner, 2017-02-14)
|\
* | Adding TensorChippingOP for sycl backend; fixing the index value in the verification operation for cxx11_tensorChipping.cpp test (Mehdi Goli, 2017-02-13)