path: root/Eigen/src/Core/arch/CUDA/PacketMathHalf.h
Commit message (author, date)
* half implementation has been moved to half_impl namespace (Benoit Steiner, 2016-07-29)
* Fix CUDA compilation (Gael Guennebaud, 2016-07-22)
* More cleaning in half: (Gael Guennebaud, 2016-07-22)
* Enable the vectorization of adds and mults of fp16 (Benoit Steiner, 2016-06-07)
* Improved support for CUDA 8.0 (Benoit Steiner, 2016-05-31)
* Disable the use of MMX instructions since the code is broken on many platforms (Benoit Steiner, 2016-05-27)
* Disable usage of MMX with msvc. (Gael Guennebaud, 2016-05-26)
* Add missing inclusion of mmintrin.h (Gael Guennebaud, 2016-05-26)
* Silenced a compilation warning (Benoit Steiner, 2016-05-25)
* Specify the rounding mode in the correct location (Benoit Steiner, 2016-05-25)
* Explicitly specify the rounding mode when converting floats to fp16 (Benoit Steiner, 2016-05-25)
* Disable the use of MMX instructions on x86_64 since too many compilers only s... (Benoit Steiner, 2016-05-25)
* Cleaned up the fp16 code a little more (Benoit Steiner, 2016-05-24)
* Cleaned up the fp16 code (Benoit Steiner, 2016-05-24)
* Don't attempt to use MMX instructions with visualstudio since they're only pa... (Benoit Steiner, 2016-05-24)
* Use the generic ploadquad intrinsics since it does the job (Benoit Steiner, 2016-05-24)
* Worked around missing clang intrinsics (Benoit Steiner, 2016-05-24)
* Use the Index type instead of integers to specify the strides in pgather/psca... (Benoit Steiner, 2016-05-23)
* Added missing alignment in the fp16 packet traits (Benoit Steiner, 2016-05-23)
* ptranspose is not a template. (Benoit Steiner, 2016-05-23)
* Started to vectorize the processing of 16bit floats on CPU. (Benoit Steiner, 2016-05-23)
* Fixed a couple of bugs related to the Pascal family of GPUs (Benoit Steiner, 2016-05-11)
* Added the ability to load fp16 using the texture path. (Benoit Steiner, 2016-05-11)
* Made predux_min and predux_max on fp16 less noisy (Benoit Steiner, 2016-05-11)
* __ldg is only available with cuda architectures >= 3.5 (Benoit Steiner, 2016-05-11)
* Fixed a typo (Benoit Steiner, 2016-05-11)
* Added packet primitives to compute exp, log, sqrt and rsqrt on fp16. This imp... (Benoit Steiner, 2016-05-10)
* Added support for packet processing of fp16 on kepler and maxwell gpus (Benoit Steiner, 2016-05-06)
* Disabled the use of half2 on cuda devices of compute capability < 5.3 (Benoit Steiner, 2016-04-08)
* Fixed the packet_traits for half floats. (Benoit Steiner, 2016-04-08)
* Fixed packet_traits<half> (Benoit Steiner, 2016-04-06)
* Made half floats usable on hardware that doesn't support them natively. (Benoit Steiner, 2016-03-11)
* Fixed the +=, -=, *= and /= operators to return a reference (Benoit Steiner, 2016-03-10)
* Enable partial support for half floats on Kepler GPUs. (Benoit Steiner, 2016-03-03)
* Declare the half float type as arithmetic. (Benoit Steiner, 2016-02-22)
* Implemented the ptranspose function on half floats (Benoit Steiner, 2016-02-21)
* Added the ability to compute the absolute value of a half float (Benoit Steiner, 2016-02-21)
* Moved some of the fp16 operators outside the Eigen namespace to workaround so... (Benoit Steiner, 2016-02-20)
* Implemented the scalar division of 2 half floats (Benoit Steiner, 2016-02-19)
* Added support for operators +=, -=, *= and /= on CUDA half floats (Benoit Steiner, 2016-02-19)
* Added support for simple coefficient wise tensor expression using half floats... (Benoit Steiner, 2016-02-19)