path: root/unsupported/test/cxx11_tensor_intdiv.cpp
author: Benoit Steiner <benoit.steiner.goog@gmail.com> 2016-05-09 17:09:54 -0700
committer: Benoit Steiner <benoit.steiner.goog@gmail.com> 2016-05-09 17:09:54 -0700
commit: 4670d7d5ce2517b2e9201f1cf44ae62ef2284eb5
tree: 15f6faeec806068d4a786146ab92d557156321db /unsupported/test/cxx11_tensor_intdiv.cpp
parent: c3859a2b584730ed1cb155a6d9bc422592d8d26b
Improved the performance of full reductions on GPU:
Before:
  BM_fullReduction/10     200000       11751       8.51 MFlops/s
  BM_fullReduction/80       5000      523385      12.23 MFlops/s
  BM_fullReduction/640        50    36179326      11.32 MFlops/s
  BM_fullReduction/4K          1  2173517195      11.50 MFlops/s
After:
  BM_fullReduction/10     500000        5987      16.70 MFlops/s
  BM_fullReduction/80     200000       10636     601.73 MFlops/s
  BM_fullReduction/640     50000       58428    7010.31 MFlops/s
  BM_fullReduction/4K       1000     2006106   12461.95 MFlops/s
Diffstat (limited to 'unsupported/test/cxx11_tensor_intdiv.cpp')
0 files changed, 0 insertions, 0 deletions