author | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-08-02 11:18:13 -0700 |
---|---|---|
committer | Rasmus Munk Larsen <rmlarsen@google.com> | 2019-08-02 11:18:13 -0700 |
commit | e2999d4c388f3bc556a556befdcb51b1139e9d92 | |
tree | dd0d03cd89ce8dbc2b74b22d741a21b9e6bc0752 /doc | |
parent | f22b7283a3822524d03e4d93e6144bc1c9dd13a5 | |
Fix performance regressions due to https://bitbucket.org/eigen/eigen/pull-requests/662.
That change caused the device struct to be copied on every expression evaluation, which led to, for example, a 10% regression in the TensorFlow multinomial op on GPU:
Benchmark                       Time(ns)    CPU(ns) Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4     128173     231326       2922 1.610G items/s

VS

Benchmark                       Time(ns)    CPU(ns) Iterations
----------------------------------------------------------------------
BM_Multinomial_gpu_1_100000_4     146683     246914       2719 1.509G items/s
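
The copy-per-evaluation pattern described above can be illustrated with a minimal sketch. This is not the Eigen code touched by this commit: `FakeDevice`, `ByValueEvaluator`, `ByRefEvaluator`, `g_device_copies`, and `count_copies` are hypothetical names made up for illustration. The sketch assumes an evaluator object is constructed once per expression evaluation and only compares holding the device by value against holding it by const reference.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many times the (hypothetical) device struct is copy-constructed.
std::size_t g_device_copies = 0;

// Hypothetical stand-in for a device struct; not Eigen's actual type.
struct FakeDevice {
  int stream_id = 0;
  FakeDevice() = default;
  FakeDevice(const FakeDevice& other) : stream_id(other.stream_id) {
    ++g_device_copies;
  }
};

// Stores the device by value: every evaluator construction copies it.
struct ByValueEvaluator {
  explicit ByValueEvaluator(const FakeDevice& d) : device(d) {}
  const FakeDevice device;
};

// Stores a reference to the device: construction copies nothing.
struct ByRefEvaluator {
  explicit ByRefEvaluator(const FakeDevice& d) : device(d) {}
  const FakeDevice& device;
};

// Constructs one evaluator per simulated expression evaluation and
// returns the number of device copies made along the way.
template <typename Evaluator>
std::size_t count_copies(int evaluations) {
  FakeDevice device;
  g_device_copies = 0;
  for (int i = 0; i < evaluations; ++i) {
    Evaluator eval(device);
    assert(eval.device.stream_id == device.stream_id);
  }
  return g_device_copies;
}
```

With these definitions, `count_copies<ByValueEvaluator>(1000)` returns 1000 and `count_copies<ByRefEvaluator>(1000)` returns 0. The per-copy cost of a real device struct depends on its contents, so the sketch shows only where the extra copies come from, not how large the slowdown would be.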