author: Asim Shankar <ashankar@google.com> 2018-08-07 11:42:23 -0700
committer: TensorFlower Gardener <gardener@tensorflow.org> 2018-08-07 12:03:33 -0700
commit: c3d1f4bc30c2cc5e0999ac2b0f04d41d607cb1fe
tree: a321b14d8d73ceaa9683d6cbcc4609d1704599b6 /tensorflow/docs_src
parent: 01d734d3778c43ada5d56dce87b4f8ba61b5b560
[Docs]: Reduce over-estimation while measuring compute time.
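
To illustrate why the placement of the synchronizing call matters for the measurement, here is a minimal toy model in plain Python. It assumes the over-estimation comes from synchronizing (and copying to host memory) after every step rather than once after the loop. `ToyDevice`, `op_cost`, `sync_cost`, and both `measure_*` functions are hypothetical names invented for this sketch and are not TensorFlow APIs. The device keeps a simulated clock in arbitrary integer time units, so the numbers are deterministic and say nothing about real hardware.

```python
class ToyDevice:
    """Hypothetical stand-in for an asynchronous device (not a TensorFlow API)."""

    def __init__(self, op_cost, sync_cost):
        self.op_cost = op_cost      # simulated time units of device work per op
        self.sync_cost = sync_cost  # simulated time units per host copy
        self.pending = 0            # ops enqueued but not yet completed
        self.clock = 0              # simulated elapsed time

    def enqueue(self):
        # Returns immediately, like an op that is only queued on a stream.
        self.pending += 1

    def sync(self):
        # Wait for all pending ops, then pay for one copy to host memory.
        self.clock += self.pending * self.op_cost + self.sync_cost
        self.pending = 0


def measure_sync_every_step(device, steps):
    # Synchronizes inside the loop: the host copy is paid `steps` times.
    start = device.clock
    for _ in range(steps):
        device.enqueue()
        device.sync()
    return device.clock - start


def measure_sync_once(device, steps):
    # Synchronizes once after the loop: the host copy is paid a single time.
    start = device.clock
    for _ in range(steps):
        device.enqueue()
    device.sync()
    return device.clock - start


per_step = measure_sync_every_step(ToyDevice(op_cost=1, sync_cost=2), steps=200)
once = measure_sync_once(ToyDevice(op_cost=1, sync_cost=2), steps=200)
print(per_step, once)  # 600 202
```

In this model both variants do the same 200 units of compute. The per-step variant adds 200 host copies to the measurement, while the single-sync variant adds only one, which is the "little more than just the matmul operation time" that the new comment acknowledges.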
Diffstat (limited to 'tensorflow/docs_src')
-rw-r--r-- tensorflow/docs_src/guide/eager.md | 12
1 file changed, 9 insertions, 3 deletions
diff --git a/tensorflow/docs_src/guide/eager.md b/tensorflow/docs_src/guide/eager.md
index 3b54d6d2bb..24f6e4ee95 100644
--- a/tensorflow/docs_src/guide/eager.md
+++ b/tensorflow/docs_src/guide/eager.md
@@ -727,7 +727,13 @@ def measure(x, steps):
   start = time.time()
   for i in range(steps):
     x = tf.matmul(x, x)
-    _ = x.numpy()  # Make sure to execute op and not just enqueue it
+  # tf.matmul can return before completing the matrix multiplication
+  # (e.g., can return after enqueuing the operation on a CUDA stream).
+  # The x.numpy() call below will ensure that all enqueued operations
+  # have completed (and will also copy the result to host memory,
+  # so we're including a little more than just the matmul operation
+  # time).
+  _ = x.numpy()
   end = time.time()
   return end - start
@@ -751,8 +757,8 @@ Output (exact numbers depend on hardware):
```
Time to multiply a (1000, 1000) matrix by itself 200 times:
-CPU: 4.614904403686523 secs
-GPU: 0.5581181049346924 secs
+CPU: 1.46628093719 secs
+GPU: 0.0593810081482 secs
```
A `tf.Tensor` object can be copied to a different device to execute its