Commit message | Author | Age | |
---|---|---|---|
* | size_t -> int | 2016-06-03 | |
| | |||
* | Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool. | 2016-06-03 | |
| | |||
* | Fix MatrixFunctions module. | 2016-06-03 | |
| | |||
* | Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing. | 2016-06-02 | |
| | |||
* | Add syntactic sugar to Eigen tensors to allow more natural syntax. | 2016-06-02 | |
| | Specifically, this enables expressions involving: scalar + tensor, scalar * tensor, scalar / tensor, scalar - tensor. | ||
* | Add tensor scan op | 2016-06-02 | |
| | This is the initial implementation of a generic scan operation. Based on this, cumsum and cumprod methods have been added to TensorBase. | ||
* | Use a single PacketSize variable | 2016-06-01 | |
| | |||
* | Fixed compilation warning | 2016-06-01 | |
| | |||
* | Speedup a test | 2016-06-01 | |
| | |||
* | Silenced compilation warning generated by nvcc. | 2016-06-01 | |
| | |||
* | Added support for mean reductions on fp16 | 2016-06-01 | |
| | |||
* | Only enable optimized reductions of fp16 if the reduction functor supports them | 2016-05-31 | |
| | |||
* | Reimplement clamp as a static function. | 2016-05-27 | |
| | |||
* | Use NULL instead of nullptr to preserve the compatibility with cxx03 | 2016-05-27 | |
| | |||
* | Added a new operation to enable more powerful tensor indexing. | 2016-05-27 | |
| | |||
* | Fixed the warning "option '--relaxed-constexpr' has been deprecated and replaced by option '--expt-relaxed-constexpr'" generated by nvcc 7.5 | 2016-05-27 | |
| | |||
* | Fix compilation when defaulting to row-major | 2016-05-27 | |
| | |||
* | Fixed some compilation warnings | 2016-05-26 | |
| | |||
* | Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g. fp16 to float) | 2016-05-26 | |
| | |||
* | Resolved merge conflicts | 2016-05-26 | |
| | |||
* | Merged latest reduction improvements | 2016-05-26 | |
|\ | |||
* | | Improved the performance of inner reductions. | 2016-05-26 | |
| | | |||
* | | Improved the coverage of the fp16 reduction tests | 2016-05-26 | |
| | | |||
* | | Code cleanup. | 2016-05-26 | |
| | | |||
* | | Made the static storage class qualifier come first. | 2016-05-25 | |
| | | |||
* | | Deleted unnecessary explicit qualifiers. | 2016-05-25 | |
| | | |||
* | | Don't mark inline functions as static since it confuses the ICC compiler | 2016-05-25 | |
| | | |||
* | | Marked unused variables as such | 2016-05-25 | |
| | | |||
* | | Made the IndexPair code compile in non cxx11 mode | 2016-05-25 | |
| | | |||
* | | Made the index pair list code more portable across various compilers | 2016-05-25 | |
| | | |||
* | | Improved the performance of tensor padding | 2016-05-25 | |
| | | |||
* | | Added support for statically known lists of pairs of indices | 2016-05-25 | |
| | | |||
* | | There is no need to make the fp16 full reduction kernel a static function. | 2016-05-24 | |
| | | |||
* | | Fixed compilation warning | 2016-05-24 | |
| | | |||
* | | Merged in rmlarsen/eigen (pull request PR-188) | 2016-05-23 | |
|\ \ | Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. | ||
* | | | Silenced several double-promotion warnings | 2016-05-22 | |
| | | | |||
* | | | fixed macro name | 2016-05-22 | |
| | | | |||
* | | | Fix some sign-compare warnings | 2016-05-22 | |
| | | | |||
* | | | Make EIGEN_HAS_CONSTEXPR user configurable | 2016-05-20 | |
| | | | |||
* | | | Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable | 2016-05-20 | |
| | | | |||
* | | | Make EIGEN_HAS_RVALUE_REFERENCES user configurable | 2016-05-20 | |
| | | | |||
* | | | Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES | 2016-05-20 | |
| | | | |||
* | | | Remove std:: to enable custom scalar types. | 2016-05-19 | |
| | | | |||
| * | | Merged eigen/eigen into default | 2016-05-18 | |
| |\ \ | |||
| * \ \ | Merge. | 2016-05-18 | |
| |\ \ \ | |||
| * | | | | Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. | 2016-05-18 | |
| | | | | | |||
| | * | | | Reduce overhead for small tensors and cheap ops by short-circuiting the cost computation and block size calculation in parallelFor. | 2016-05-17 | |
| |/ / / | |||
| | | * | Merged latest updates from trunk | 2016-05-17 | |
| | | |\ | |||
| | | * | | Allow vectorized padding on GPU. This helps speed things up a little. | 2016-05-17 | |
| | | | | | Before: BM_padding/10 5000000 460 217.03 MFlops/s; BM_padding/80 5000000 460 13899.40 MFlops/s; BM_padding/640 5000000 461 888421.17 MFlops/s; BM_padding/4K 5000000 460 54316322.55 MFlops/s. After: BM_padding/10 5000000 454 220.20 MFlops/s; BM_padding/80 5000000 455 14039.86 MFlops/s; BM_padding/640 5000000 452 904968.83 MFlops/s; BM_padding/4K 5000000 411 60750049.21 MFlops/s. | ||
* | | | | | made a fix to the GMRES solver so that it now correctly reports the error achieved in the solution process | 2016-05-16 | |
| |/ / / |/| | | | |||