path: root/Eigen/src/Core/arch
Commit message (Author, Date)
* Undo changes in AltiVec --- I don't have any way to test there. (Benoit Jacob, 2016-06-28)
|
* Avoid global variables with static constructors in NEON/Complex.h (Benoit Jacob, 2016-06-28)
|
* bug #1240: Remove any assumption on NEON vector types. (Gael Guennebaud, 2016-06-09)
|
* Fix compile errors initializing packets on ARM DS-5 5.20 (Sean Templeton, 2016-06-03)
      The ARM DS-5 5.20 compiler fails with the following errors:
        "src/Core/arch/NEON/PacketMath.h", line 113: Error: #146: too many initializer values
          Packet4f countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
        "src/Core/arch/NEON/PacketMath.h", line 118: Error: #146: too many initializer values
          Packet4i countdown = EIGEN_INIT_NEON_PACKET4(0, 1, 2, 3);
        "src/Core/arch/NEON/Complex.h", line 30: Error: #146: too many initializer values
          static uint32x4_t p4ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET4(0x00000000, 0x80000000, 0x00000000, 0x80000000);
        "src/Core/arch/NEON/Complex.h", line 31: Error: #146: too many initializer values
          static uint32x2_t p2ui_CONJ_XOR = EIGEN_INIT_NEON_PACKET2(0x00000000, 0x80000000);
      The vector types are implemented as two doubles, hence the "too many initializer values" errors.
      Changed the code to use intrinsic load functions, which all compilers implementing NEON should have.
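      The fix can be sketched as follows. This is an illustration of the approach rather than the exact Eigen code, and the helper name is ours: build the vector by loading from a plain array through the NEON load intrinsic instead of brace-initializing the vector type.

        #include <arm_neon.h>

        // Every compiler that provides NEON intrinsics supports vld1q_f32,
        // so this avoids any assumption about how float32x4_t is modeled.
        static inline float32x4_t make_countdown_f32()
        {
          const float values[4] = { 0.0f, 1.0f, 2.0f, 3.0f };
          return vld1q_f32(values);
        }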
* Enable the vectorization of adds and mults of fp16 (Benoit Steiner, 2016-06-07)
|
* Add TernaryFunctors and the betainc SpecialFunction. (Eugene Brevdo, 2016-06-02)
      TernaryFunctors and their executors allow operations on 3-tuples of inputs.
      The API is fully implemented for Arrays and Tensors, based on the binary functors.
      Ported the cephes betainc function (regularized incomplete beta integral) to Eigen,
      with support for CPU and GPU, floats, doubles, and half types.
      Added unit tests in array.cpp and cxx11_tensor_cuda.cu.
      Collapsed revision:
      * Merged helper methods for betainc across floats and doubles.
      * Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
      * Cleaned up CwiseTernaryOp checks, changed igamma_helper to cephes_helper.
      * betainc: merged incbcf and incbd into incbeta_cfe, and more cleanup.
      * Updated TernaryOp and SpecialFunctions (betainc) based on review comments.
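      For reference, a minimal usage sketch at the Array level. It assumes the free function Eigen::betainc(a, b, x) exposed by the unsupported SpecialFunctions module; the argument order shown is part of that assumption.

        #include <Eigen/Core>
        #include <unsupported/Eigen/SpecialFunctions>
        #include <iostream>

        int main()
        {
          Eigen::ArrayXf a(3), b(3), x(3);
          a << 0.5f, 1.0f, 2.0f;
          b << 2.0f, 2.0f, 2.0f;
          x << 0.1f, 0.5f, 0.9f;
          // Coefficient-wise regularized incomplete beta integral I_x(a, b).
          std::cout << Eigen::betainc(a, b, x) << std::endl;
          return 0;
        }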
* Improved support for CUDA 8.0 (Benoit Steiner, 2016-05-31)
|
* Disable the use of MMX instructions since the code is broken on many platforms (Benoit Steiner, 2016-05-27)
|
* Deleted extra namespace (Benoit Steiner, 2016-05-26)
|
* Disable usage of MMX with MSVC. (Gael Guennebaud, 2016-05-26)
|
* Add missing inclusion of mmintrin.h (Gael Guennebaud, 2016-05-26)
|
* Silenced a compilation warning (Benoit Steiner, 2016-05-25)
|
* Specify the rounding mode in the correct location (Benoit Steiner, 2016-05-25)
|
* Explicitly specify the rounding mode when converting floats to fp16 (Benoit Steiner, 2016-05-25)
|
* Disable the use of MMX instructions on x86_64 since too many compilers only support them in 32-bit mode (Benoit Steiner, 2016-05-25)
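      A hedged sketch of the kind of guard this implies; the macro names below are hypothetical, not the ones used in Eigen. The idea is to consider MMX only when targeting 32-bit x86, since many 64-bit compilers ship incomplete MMX intrinsic support.

        // Hypothetical guard, for illustration only.
        #if defined(__MMX__) && !defined(__x86_64__) && !defined(_M_X64)
          #define EXAMPLE_USE_MMX 1
        #else
          #define EXAMPLE_USE_MMX 0
        #endif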
* Fix compilation with ICC. (Gael Guennebaud, 2016-05-25)
|
* Cleaned up the fp16 code a little more (Benoit Steiner, 2016-05-24)
|
* Cleaned up the fp16 code (Benoit Steiner, 2016-05-24)
|
* Remove now-unused protate PacketMath func (Benoit Jacob, 2016-05-24)
|
* Don't attempt to use MMX instructions with Visual Studio since they're only partially supported. (Benoit Steiner, 2016-05-24)
* Worked around missing clang intrinsic (Benoit Steiner, 2016-05-24)
|
* Use the generic ploadquad intrinsic since it does the job (Benoit Steiner, 2016-05-24)
|
* Worked around missing clang intrinsics (Benoit Steiner, 2016-05-24)
|
* Added missing EIGEN_DEVICE_FUNC qualifier (Benoit Steiner, 2016-05-23)
|
* Use the Index type instead of integers to specify the strides in pgather/pscatter (Benoit Steiner, 2016-05-23)
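      A sketch of what the change means at the call site; the helper below is hypothetical, not the Eigen kernel. Strides are now passed as Eigen::Index, the signed pointer-sized type Eigen uses for sizes and strides, instead of a plain int.

        #include <Eigen/Core>
        #include <array>

        // Hypothetical scalar fallback illustrating the signature style:
        // gather four values spaced 'stride' elements apart.
        template <typename Scalar>
        std::array<Scalar, 4> gather4(const Scalar* from, Eigen::Index stride)
        {
          return { from[0 * stride], from[1 * stride], from[2 * stride], from[3 * stride] };
        }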
* Added missing alignment in the fp16 packet traits (Benoit Steiner, 2016-05-23)
|
* ptranspose is not a template. (Benoit Steiner, 2016-05-23)
|
* Avoid unnecessary float to double conversion. (Benoit Steiner, 2016-05-23)
|
* Started to vectorize the processing of 16-bit floats on CPU. (Benoit Steiner, 2016-05-23)
|
* Replace the multiple constructors of the half type by a generic/templated constructor. This fixes an incompatibility with long double exposed by the previous commit. (Christoph Hertzberg, 2016-05-23)
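      The shape of the fix, sketched with a hypothetical type rather than Eigen's half: one templated constructor replaces a list of per-type overloads, so any arithmetic type, including long double, converts through the same float path.

        // Illustration only; Eigen's half stores the bits of an IEEE fp16, not a float.
        struct half_example {
          float value;
          template <typename T>
          explicit half_example(T v) : value(static_cast<float>(v)) {}
        };

        // A long double argument now selects the generic constructor:
        // half_example h(1.0L);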
* Make EIGEN_HAS_C99_MATH user configurable (Gael Guennebaud, 2016-05-20)
|
* Fixed a couple of bugs related to the Pascal family of GPUs (Benoit Steiner, 2016-05-11)
|
* Added the ability to load fp16 using the texture path. (Benoit Steiner, 2016-05-11)
      Improved the performance of some reductions on fp16.
* Misc fixes for fp16 (Benoit Steiner, 2016-05-11)
|
* Made predux_min and predux_max on fp16 less noisy (Benoit Steiner, 2016-05-11)
|
* __ldg is only available with CUDA architectures >= 3.5 (Benoit Steiner, 2016-05-11)
|
* Fixed a typo (Benoit Steiner, 2016-05-11)
|
* Added missing EIGEN_DEVICE_FUNC (Benoit Steiner, 2016-05-11)
|\
| * Added missing EIGEN_DEVICE_FUNC qualifiers (Benoit Steiner, 2016-05-11)
|/
* Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This improves the performance by 10 to 30%. (Benoit Steiner, 2016-05-10)
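      The underlying pattern can be sketched on a single scalar; the function name is ours, the real primitives operate on whole packets, and the sketch assumes Eigen::half is reachable through <Eigen/Core> as in the current devel branch. The idea: widen to float, reuse the float routine, round back to half.

        #include <Eigen/Core>
        #include <cmath>

        // Hedged scalar sketch of the widen/compute/narrow approach behind
        // plog/pexp/psqrt/prsqrt for fp16 packets.
        inline Eigen::half log_via_float(const Eigen::half& a)
        {
          return Eigen::half(std::log(static_cast<float>(a)));
        }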
* Added support for packet processing of fp16 on Kepler and Maxwell GPUs (Benoit Steiner, 2016-05-06)
|
* Relaxed the dummy precision for fp16 (Benoit Steiner, 2016-05-05)
|
* Fixed compilation error with CUDA >= 7.5 (Benoit Steiner, 2016-05-03)
|
* Made a cast explicit (Benoit Steiner, 2016-05-02)
|
* Fixed compilation errors generated by clang (Benoit Steiner, 2016-04-29)
|
* fpclassify isn't portable enough. In particular, the function's return values are not available on all the platforms Eigen supports: remove it from Eigen. (Benoit Steiner, 2016-04-27)
* Improved support for min and max on 16-bit floats when running on recent CUDA GPUs (Benoit Steiner, 2016-04-27)
* Added support for fpclassify in Eigen::numext (Benoit Steiner, 2016-04-27)
|
* Force the inlining of the << operator on half floats (Benoit Steiner, 2016-04-14)
|
* Inline the << operator on half floats (Benoit Steiner, 2016-04-14)
|