Commit message Author Age
* Added SYCL include in Tensor.Gravatar Luke Iwanski2016-10-20
|
* Fixing the typo regarding missing #if needed for proper handling of exceptions in Eigen/Core.Gravatar Mehdi Goli2016-10-16
|
* Fixing the code indentation in the TensorReduction.h file.Gravatar Mehdi Goli2016-10-14
|
* Merged ComputeCpp to default.Gravatar Luke Iwanski2016-10-14
|\
| * Applying Benoit's comment to return the missing line back in Eigen/CoreGravatar Mehdi Goli2016-10-14
| |
* | Merged ComputeCpp into default.Gravatar Luke Iwanski2016-10-14
|\|
| * Reducing the code by generalising sycl backend functions/structs.Gravatar Mehdi Goli2016-10-14
| |
* | Merged eigen/eigen into defaultGravatar Benoit Steiner2016-10-12
|\ \
| * | Remove double ;;Gravatar Gael Guennebaud2016-10-12
| | |
| * | Fix SPQR for rectangular matricesGravatar Gael Guennebaud2016-10-12
| | |
| * | Fix outer-stride.Gravatar Gael Guennebaud2016-10-12
| | |
| * | Merged in rmlarsen/eigen (pull request PR-230)Gravatar Gael Guennebaud2016-10-12
| |\ \
| | | | Fix a bug in psqrt for SSE and AVX when EIGEN_FAST_MATH=1
| | * | Fix copy-paste error: Must use _mm256_cmp_ps for AVX.Gravatar Rasmus Munk Larsen2016-10-12
| | | |
| * | | bug #1325: fix compilation on NEON with clangGravatar Gael Guennebaud2016-10-12
| | | |
| * | | Manually define int16_t and uint16_t when compiling with Visual StudioGravatar Benoit Steiner2016-10-08
| | | |
| * | | Reenabled the use of variadic templates on tegra x1 provided that the latest version (i.e. JetPack 2.3) is used.Gravatar Benoit Steiner2016-10-08
| | | |
| * | | Cleaned up a regression testGravatar Benoit Steiner2016-10-08
| | | |
* | | | Merge the content of the ComputeCpp branch into the default branchGravatar Benoit Steiner2016-10-07
|\ \ \ \
| | |_|/
| |/| |
| | * | Remove static qualifier of free-functions (inline is enough and this helps ICC to find the right overload)Gravatar Gael Guennebaud2016-10-07
| | | |
| | * | Merged in rryan/eigen/tensorfunctors (pull request PR-233)Gravatar Benoit Steiner2016-10-06
| |/| |
|/| | |
| | | | Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.
| | * | Add a test that GPU complex product reductions match CPU reductions.Gravatar RJ Ryan2016-10-06
| | | |
| | * | Fully support complex types in SumReducer and MeanReducer when building for CUDA by using scalar_sum_op and scalar_product_op instead of operator+ and operator*.Gravatar RJ Ryan2016-10-06
| | | |
* | | | Added missing AVX intrinsics for fp16: in particular, implemented predux which is required by the matrix-vector code.Gravatar Benoit Steiner2016-10-06
| |/ /
|/| |
* | | Fix compilation of qr.inverse() for column and full pivoting variants.Gravatar Gael Guennebaud2016-10-06
| | |
| * | Fixed a couple of compilation warningsGravatar Benoit Steiner2016-10-05
| | |
| * | Pull the latest updates from trunkGravatar Benoit Steiner2016-10-05
| |\ \
| * | | Fixed compilation warningsGravatar Benoit Steiner2016-10-05
| | | |
| * | | Fixed compilation warningGravatar Benoit Steiner2016-10-05
| | | |
* | | | Increased the robustness of the reduction tests on fp16Gravatar Benoit Steiner2016-10-05
| | | |
* | | | Increase the tolerance to numerical noise.Gravatar Benoit Steiner2016-10-05
| | | |
* | | | ::rand() returns a signed integer on win32Gravatar Benoit Steiner2016-10-05
| | | |
* | | | Fixed a typo that impacts windows buildsGravatar Benoit Steiner2016-10-05
| |/ /
|/| |
* | | Silenced compilation warningGravatar Benoit Steiner2016-10-04
| | |
* | | Properly characterize the CUDA packet primitives for fp16 as device onlyGravatar Benoit Steiner2016-10-04
| | |
| | * Update comment for fast sqrt.Gravatar Rasmus Munk Larsen2016-10-04
| | |
| | * Update comment for fast sqrt.Gravatar Rasmus Munk Larsen2016-10-04
| | |
| | * Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.Gravatar Rasmus Munk Larsen2016-10-04
| |/
|/|
| | Benchmark speed in Giga-sqrts/s
| | Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
| | -----------------------------------------
| |                SSE      AVX
| | Fast=1         2.529G   4.380G
| | Fast=0         1.944G   1.898G
| | Fast=1 fixed   2.214G   3.739G
| |
| | This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions are almost negligible.
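The failure mode described in this commit message can be modelled in scalar code. The sketch below is not Eigen's SIMD implementation: the function names are hypothetical, `1.0f / std::sqrt(x)` stands in for a hardware reciprocal-sqrt estimate, and the masking logic is an assumption about one way such a bug can arise. It only illustrates how a mask meant to flush tiny inputs to zero can also swallow negative inputs and turn an expected NaN into -0.0.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// Hypothetical stand-in for a hardware reciprocal-sqrt estimate.
inline float rsqrt_estimate(float x) { return 1.0f / std::sqrt(x); }

// One Newton-Raphson refinement step for r ~ 1/sqrt(x).
inline float refine(float x, float r) { return r * (1.5f - 0.5f * x * r * r); }

// Assumed buggy variant: every input below FLT_MIN, negatives included,
// has its estimate masked to 0, so a negative x yields x * 0 = -0.0.
inline float fast_sqrt_buggy(float x) {
  float r = (x >= FLT_MIN) ? rsqrt_estimate(x) : 0.0f;
  return x * refine(x, r);
}

// Assumed fixed variant: only non-negative tiny inputs are masked, so a
// negative x keeps the NaN produced by the estimate.
inline float fast_sqrt_fixed(float x) {
  bool flush_to_zero = (x < FLT_MIN) && !(x < 0.0f);
  float r = flush_to_zero ? 0.0f : rsqrt_estimate(x);
  return x * refine(x, r);
}
```

Calling `fast_sqrt_buggy(-4.0f)` in this model gives a zero with the sign bit set, while `fast_sqrt_fixed(-4.0f)` gives NaN; both variants agree on ordinary positive inputs.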
* | Cleanup the cuda executor code.Gravatar Benoit Steiner2016-10-04
| |
* | Cleaned up the random number generation code.Gravatar Benoit Steiner2016-10-04
| |
* | Use explicit type casting to generate packets of zeros.Gravatar Benoit Steiner2016-10-04
| |
* | Improved support for compiling CUDA code with clang as the host compilerGravatar Benoit Steiner2016-10-03
| |
* | Added support for constant std::complex numbers on GPUGravatar Benoit Steiner2016-10-03
| |
* | bug #1317: fix performance regression with some Block expressions and clang by helping it to remove dead code. The trick is to get rid of the nested expression in the evaluator by copying only the required information (here, the strides).Gravatar Gael Guennebaud2016-10-01
| |
* | bug #1310: work around a compilation regression from 3.2 regarding triangular * homogeneousGravatar Gael Guennebaud2016-09-30
| |
| * Renamed the SYCL tests to follow the standard naming convention.Gravatar Benoit Steiner2016-09-30
| |
* | Fix angle rangeGravatar Gael Guennebaud2016-09-30
| |
* | Remove std:: prefixGravatar Gael Guennebaud2016-09-30
| |
* | bug #1312: Quaternion to AxisAngle conversion now ensures the angle will be in the range [-pi,pi]. This also increases accuracy when q.w is negative.Gravatar Gael Guennebaud2016-09-29
| |
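One way to obtain a bounded angle from a unit quaternion is sketched below. This is not taken from the Eigen sources: the `AxisAngle` struct and `to_axis_angle` function are hypothetical, and the atan2-based formula is an assumption chosen for illustration. It uses the magnitude of `w` for the angle and flips the axis when `w` is negative, which keeps the angle in [0, pi], inside the range stated in the commit message, and avoids evaluating an inverse cosine near its ill-conditioned end points.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical plain-data result type (not Eigen's AxisAngle class).
struct AxisAngle {
  double angle;    // radians
  double axis[3];  // unit vector
};

// Converts a unit quaternion (w, x, y, z) to an axis-angle pair.
inline AxisAngle to_axis_angle(double w, double x, double y, double z) {
  AxisAngle out = {0.0, {1.0, 0.0, 0.0}};  // identity rotation: arbitrary axis
  double n = std::sqrt(x * x + y * y + z * z);
  if (n != 0.0) {
    // atan2 with |w| yields a half-angle in [0, pi/2], so angle is in [0, pi].
    out.angle = 2.0 * std::atan2(n, std::fabs(w));
    // q and -q encode the same rotation; negate the axis when w < 0.
    if (w < 0.0) n = -n;
    out.axis[0] = x / n;
    out.axis[1] = y / n;
    out.axis[2] = z / n;
  }
  return out;
}
```

For example, the quaternion (-cos 0.1, 0, 0, sin 0.1) maps to an angle of about 0.2 rad around the negative z axis rather than an angle close to 2*pi around the positive z axis.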
* | bug #1308: fix compilation of some small products involving nullary-expressions.Gravatar Gael Guennebaud2016-09-29
| |
* | Updated the list of warnings to reflect the new message ids introduced in cuda 8.0Gravatar Benoit Steiner2016-09-28
| |