author: Rasmus Munk Larsen <rmlarsen@google.com> 2016-10-04 14:22:56 -0700
committer: Rasmus Munk Larsen <rmlarsen@google.com> 2016-10-04 14:22:56 -0700
commit: 3ed67cb0bb4af65fbf243df598604a8c7630bf7d
tree: 02c61529c1a3edab6c9894f271100a7488cd7cdc /README.md
parent: 6af5ac7e2749bdea7a31323855ef3b4333b91c3e
Fix a bug in the implementation of Carmack's fast sqrt algorithm in Eigen (enabled by EIGEN_FAST_MATH), which causes the vectorized parts of the computation to return -0.0 instead of NaN for negative arguments.
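
The intended behavior for special inputs can be illustrated with a scalar sketch. This is not Eigen's vectorized implementation: the function names `approx_rsqrt` and `fast_sqrt_sketch` are hypothetical, and the bit-level initial guess with the well-known constant 0x5f3759df is only assumed to resemble what the commit message calls Carmack's algorithm. The point is that a sqrt built as x * rsqrt(x) needs explicit handling so that negative arguments yield NaN.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical scalar sketch; Eigen's real code operates on SIMD packets.
// Approximates 1/sqrt(x) for positive, normal x.
inline float approx_rsqrt(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));   // reinterpret float as integer
  bits = 0x5f3759dfu - (bits >> 1);       // bit-level initial guess
  float y;
  std::memcpy(&y, &bits, sizeof(y));
  // Two Newton-Raphson refinement steps for y ~ 1/sqrt(x).
  y = y * (1.5f - 0.5f * x * y * y);
  y = y * (1.5f - 0.5f * x * y * y);
  return y;
}

// sqrt(x) computed as x * rsqrt(x), with special cases handled explicitly.
inline float fast_sqrt_sketch(float x) {
  if (std::isnan(x) || x < 0.0f) {
    // Negative arguments must produce NaN, not -0.0.
    return std::numeric_limits<float>::quiet_NaN();
  }
  if (x == 0.0f || std::isinf(x)) {
    return x;  // sqrt(+-0) = +-0 and sqrt(+inf) = +inf
  }
  return x * approx_rsqrt(x);
}
```

With this sketch, `fast_sqrt_sketch(-1.0f)` is NaN, while `fast_sqrt_sketch(4.0f)` is close to 2 (approximate, since the result comes from Newton refinement rather than an exact square root).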

Benchmark speed in Giga-sqrts/s
Intel(R) Xeon(R) CPU E5-1650 v3 @ 3.50GHz
-----------------------------------------
                SSE       AVX
Fast=1          2.529G    4.380G
Fast=0          1.944G    1.898G
Fast=1 fixed    2.214G    3.739G

This table illustrates the worst case in terms of speed impact: it was measured by repeatedly computing the sqrt of an n=4096 float vector that fits in L1 cache. For large vectors the operation becomes memory bound and the differences between the versions become almost negligible.
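
A measurement of this kind could be set up roughly as follows. This is a minimal sketch, not the benchmark used for the numbers above: it calls `std::sqrt` instead of Eigen, and the helper names `giga_ops_per_sec` and `measure_gsqrts_per_sec` are hypothetical. It repeatedly takes the sqrt of a small float vector and reports throughput in Giga-sqrts/s.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts an operation count and elapsed time
// into giga-operations per second. Returns 0 for a non-positive time.
inline double giga_ops_per_sec(double ops, double seconds) {
  return seconds > 0.0 ? ops / seconds / 1e9 : 0.0;
}

// Hypothetical harness: computes sqrt over an n-element float vector
// `repeats` times and returns the measured throughput in Giga-sqrts/s.
inline double measure_gsqrts_per_sec(std::size_t n, std::size_t repeats) {
  std::vector<float> in(n), out(n);
  for (std::size_t i = 0; i < n; ++i) {
    in[i] = static_cast<float>(i + 1);
  }

  volatile float sink = 0.0f;  // discourages the compiler from dropping the work
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t r = 0; r < repeats; ++r) {
    for (std::size_t i = 0; i < n; ++i) {
      out[i] = std::sqrt(in[i]);
    }
    sink = sink + out[r % n];
  }
  const auto stop = std::chrono::steady_clock::now();

  const double seconds = std::chrono::duration<double>(stop - start).count();
  return giga_ops_per_sec(static_cast<double>(n) * static_cast<double>(repeats),
                          seconds);
}
```

Calling `measure_gsqrts_per_sec(4096, 100000)` keeps the working set within L1 cache, as in the commit message; a much larger `n` would be expected to shift the measurement toward memory bandwidth. The absolute numbers depend on compiler flags and hardware.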
Diffstat (limited to 'README.md')
0 files changed, 0 insertions, 0 deletions