author | Rasmus Munk Larsen <rmlarsen@google.com> | 2016-10-04 14:22:56 -0700 |
---|---|---|
committer | Rasmus Munk Larsen <rmlarsen@google.com> | 2016-10-04 14:22:56 -0700 |
commit | 3ed67cb0bb4af65fbf243df598604a8c7630bf7d (patch) | |
tree | 02c61529c1a3edab6c9894f271100a7488cd7cdc /README.md | |
parent | 6af5ac7e2749bdea7a31323855ef3b4333b91c3e (diff) |
Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
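The failure mode described above can be reproduced with a scalar sketch. This is not Eigen's packet code: the function names are hypothetical, and the bit-trick reciprocal square root stands in for whatever estimate the vectorized path uses. It assumes the bug arises because the reciprocal estimate is masked to zero for every input below the smallest normal float, which also covers negative inputs, so the final multiplication yields x * 0 = -0.0.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical stand-in for a fast reciprocal square root estimate:
// bit-trick initial guess followed by one Newton-Raphson step.
// Only meaningful for positive, finite, normal inputs.
inline float approx_rsqrt(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits = 0x5f3759dfu - (bits >> 1);
  float y;
  std::memcpy(&y, &bits, sizeof y);
  return y * (1.5f - 0.5f * x * y * y);
}

// Assumed shape of the buggy path: the estimate is zeroed for all inputs
// below the smallest normal float so that sqrt(0) gives 0 rather than
// 0 * inf. Negative inputs fail the same comparison, so the result is
// x * 0, which is -0.0 instead of NaN.
inline float fast_sqrt_masked(float x) {
  const float r =
      (x >= std::numeric_limits<float>::min()) ? approx_rsqrt(x) : 0.0f;
  return x * r;
}

// One possible repair: return NaN for negative inputs before applying the
// zero mask. Infinity is not handled in this sketch.
inline float fast_sqrt_fixed(float x) {
  if (x < 0.0f) return std::numeric_limits<float>::quiet_NaN();
  return fast_sqrt_masked(x);
}
```

Calling `fast_sqrt_masked(-4.0f)` produces a zero with the sign bit set, while `fast_sqrt_fixed(-4.0f)` produces NaN. The extra comparison in the repaired path is the kind of added work that would account for a slowdown relative to the unfixed fast version.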
Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
              SSE       AVX
Fast=1        2.529G    4.380G
Fast=0        1.944G    1.898G
Fast=1 fixed  2.214G    3.739G
This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions become almost negligible.
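A measurement of this kind can be sketched as follows. This is not the benchmark used for the numbers above: the function names are hypothetical, and it times plain `std::sqrt` in a scalar loop rather than Eigen's vectorized paths, so its absolute figures will not match the table.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Compute out[i] = sqrt(in[i]) for every element of in.
inline void sqrt_vector(const std::vector<float>& in, std::vector<float>& out) {
  out.resize(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) out[i] = std::sqrt(in[i]);
}

// Repeatedly take the sqrt of an n-element float vector (n must be > 0)
// and report throughput in Giga-sqrts/s. With n = 4096 the input and
// output buffers together occupy 32 KiB.
inline double giga_sqrts_per_second(std::size_t n, int reps) {
  std::vector<float> in(n), out(n);
  for (std::size_t i = 0; i < n; ++i) in[i] = static_cast<float>(i + 1);

  volatile float sink = 0.0f;  // discourages the compiler from dropping the work
  const auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < reps; ++r) {
    sqrt_vector(in, out);
    sink = out[static_cast<std::size_t>(r) % n];
  }
  const auto stop = std::chrono::steady_clock::now();
  (void)sink;

  const double seconds = std::chrono::duration<double>(stop - start).count();
  return static_cast<double>(n) * reps / seconds * 1e-9;
}
```

Calling `giga_sqrts_per_second(4096, 100000)` approximates the cache-resident case described above. Raising `n` well beyond the cache size should show the memory-bound regime, where the choice of sqrt implementation matters much less.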
Diffstat (limited to 'README.md')
0 files changed, 0 insertions, 0 deletions