Commit message (Author, Date)
* Re-add executable flags to minimize changeset. (Ville Kallioniemi, 2016-01-22)
|
* Update to latest default branch (Ville Kallioniemi, 2016-01-21)
|\
* \ Make use of 32 bit ints explicit and remove executable bit from headers. (Ville Kallioniemi, 2016-01-21)
|\ \
| | * Pulled latest updates from trunk (Benoit Steiner, 2016-01-21)
| | |\
| | * | Fixed a constness bug (Benoit Steiner, 2016-01-21)
| | | |
| | | * bug #977: avoid division by 0 in normalize() and normalized(). (Gael Guennebaud, 2016-01-21)
| | | |
| | | * Fix compilation on old gcc+AVX (Gael Guennebaud, 2016-01-21)
| | | |
| | | * Add numext::sqrt function to enable custom optimized implementation. (Gael Guennebaud, 2016-01-21)
| | |/    This changeset adds two specializations for float/double on SSE. Those are mostly useful with GCC, for which std::sqrt adds an extra and costly check on the result of _mm_sqrt_*. Clang does not add this burden. In this changeset, only DenseBase::norm() makes use of it.
| | * bug #1151: remove useless critical section (Gael Guennebaud, 2016-01-21)
| | |
| | * fix clang warnings (Jan Prach, 2016-01-20)
| | |    "braces around scalar initializer"
| | * Pulled latest updates from the trunk (Benoit Steiner, 2016-01-20)
| | |\
| | * | Small cleanup and small fix to the contraction of row major tensors (Benoit Steiner, 2016-01-20)
| | | |
| | | * add upper|lower case in incomplete_cholesky unit test (Gael Guennebaud, 2016-01-21)
| | | |
| | * | Reduce the register pressure exerted by the tensor mappers whenever possible. This improves the performance of the contraction of a matrix with a vector by about 35%. (Benoit Steiner, 2016-01-20)
| | |/
| | * Pulled latest updates from trunk (Benoit Steiner, 2016-01-20)
| | |\
| | * | bug #1149: fix Pastix*::*parm() (Gael Guennebaud, 2016-01-20)
| | | |
| | * | bug #1148: silent Pastix by default (Gael Guennebaud, 2016-01-20)
| | | |
| | * | bug #1145: fix PastixSupport LLT/LDLT wrappers (missing resize prior to calls to selfAdjointView) (Gael Guennebaud, 2016-01-20)
| | | |
| | * | bug #1147: fix compilation of PastixSupport (Gael Guennebaud, 2016-01-20)
| | | |
| | * | Add static assertion to y(), z(), w() accessors (Gael Guennebaud, 2016-01-20)
| | | |
| * | | Remove executable bit from header files (Ville Kallioniemi, 2016-01-19)
| | | |
| * | | Use explicitly 32 bit integer types in constructors. (Ville Kallioniemi, 2016-01-19)
|/ / /
| | * Improved the formatting of the code (Benoit Steiner, 2016-01-19)
| |/
| * Moved the contraction mapping code to its own file to make the code more manageable. (Benoit Steiner, 2016-01-19)
| |
| * Improved code indentation (Benoit Steiner, 2016-01-19)
| |
| * Record whether the underlying tensor storage can be accessed directly during the evaluation of an expression. (Benoit Steiner, 2016-01-19)
| |
* | Add ctor for long (Ville Kallioniemi, 2016-01-17)
| |
| * Fixed a race condition that could affect some reductions on CUDA devices. (Benoit Steiner, 2016-01-15)
| |
| * Made it possible to compare tensor dimensions inside a CUDA kernel. (Benoit Steiner, 2016-01-15)
| |
| * Use warp shuffles instead of shared memory access to speed up the inner reduction kernel. (Benoit Steiner, 2016-01-14)
| |
| * Fixed a boundary condition bug in the outer reduction kernel (Benoit Steiner, 2016-01-14)
| |
| * Properly record the rank of reduced tensors in the tensor traits. (Benoit Steiner, 2016-01-13)
| |
| * Trigger the optimized matrix vector path more conservatively. (Benoit Steiner, 2016-01-12)
| |
| * Improved the performance of the contraction of a 2d tensor with a 1d tensor by a factor of 3 or more. This helps speed up LSTM neural networks. (Benoit Steiner, 2016-01-12)
| |
| * Reverted a previous change that tripped nvcc when compiling in debug mode. (Benoit Steiner, 2016-01-11)
| |
| * Made the blas utils usable from within a cuda kernel (Benoit Steiner, 2016-01-11)
| |
| * Silenced a few compilation warnings. (Benoit Steiner, 2016-01-11)
| |
| * Updated the tensor traits: the alignment is not part of the Flags enum anymore (Benoit Steiner, 2016-01-11)
| |
| * Enabled the use of fixed dimensions from within a cuda kernel. (Benoit Steiner, 2016-01-11)
| |
| * Deleted unused variable. (Benoit Steiner, 2016-01-11)
| |
| * Silenced a nvcc compilation warning (Benoit Steiner, 2016-01-11)
| |
| * Silenced several compilation warnings triggered by nvcc. (Benoit Steiner, 2016-01-11)
| |
| * Merged in jeremy_barnes/eigen/shader-model-3.0 (pull request PR-152) (Benoit Steiner, 2016-01-11)
| |\    Alternative way of forcing instantiation of device kernels without causing warnings or requiring device to device kernel invocations.
| * | Fixed a bug in the dispatch of optimized reduction kernels. (Benoit Steiner, 2016-01-11)
| | |
| * | Re-enabled the optimized reduction CUDA code. (Benoit Steiner, 2016-01-11)
| | |
| | * Cleaned up double-defined macro from last commit (Jeremy Barnes, 2016-01-10)
| | |
| | * Alternative way of forcing instantiation of device kernels without (Jeremy Barnes, 2016-01-10)
| |/    causing warnings or requiring device to device kernel invocations. This allows TensorFlow to work on SM 3.0 (i.e., Amazon EC2) machines.
| * merge (Gael Guennebaud, 2016-01-09)
| |\
| * | bug #1144: fix regression in x=y+A*x (aliasing), and move evaluator_traits::AssumeAliasing to evaluator_assume_aliasing. (Gael Guennebaud, 2016-01-09)
| | |
| | * Simplified the dispatch code. (Benoit Steiner, 2016-01-08)
| | |