aboutsummaryrefslogtreecommitdiffhomepage
path: root/RELEASE.md
diff options
context:
space:
mode:
authorGravatar Michael Case <mikecase@google.com>2018-02-08 00:57:18 -0800
committerGravatar Michael Case <mikecase@google.com>2018-02-08 00:57:18 -0800
commit348bf0c436e4f571022bc08d0699e3b257125467 (patch)
treed6dcbde54352ce501c3e9115117e5bec07ce6579 /RELEASE.md
parentf7f7036d1cdc5716aff976fae0ea4d1b9a931b56 (diff)
parent78e4ed153a853536622ff606fc5f6c48a1573ac6 (diff)
Merge commit for internal changes
Diffstat (limited to 'RELEASE.md')
-rw-r--r--RELEASE.md23
1 files changed, 22 insertions, 1 deletions
diff --git a/RELEASE.md b/RELEASE.md
index 0fad3b5d41..0720a8c639 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -96,6 +96,27 @@ Yoni Tsafir, yordun, Yuan (Terry) Tang, Yuxin Wu, zhengdi, Zhengsheng Wei, ç”°ä¼
* Starting from 1.6 release, our prebuilt binaries will use AVX instructions.
This may break TF on older CPUs.
+## Known Bugs
+* Using XLA:GPU with CUDA 9 and CUDA 9.1 results in garbage results and/or
+ `CUDA_ILLEGAL_ADDRESS` failures.
+
+ Google discovered in mid-December 2017 that the PTX-to-SASS compiler in CUDA 9
+ and CUDA 9.1 sometimes does not properly compute the carry bit when
+ decomposing 64-bit address calculations with large offsets (e.g. `load [x +
+ large_constant]`) into 32-bit arithmetic in SASS.
+
+ As a result, these versions of `ptxas` miscompile most XLA programs which use
+ more than 4GB of temp memory. This results in garbage results and/or
+ `CUDA_ERROR_ILLEGAL_ADDRESS` failures.
+
+ A fix in CUDA 9.1.121 is expected in late February 2018. We do not expect a
+ fix for CUDA 9.0.x. Until the fix is available, the only workaround is to
+ [downgrade](https://developer.nvidia.com/cuda-toolkit-archive) to CUDA 8.0.x
+ or disable XLA:GPU.
+
+ TensorFlow will print a warning if you use XLA:GPU with a known-bad version of
+ CUDA; see e00ba24c4038e7644da417ddc639169b6ea59122.
+
## Major Features And Improvements
* [Eager execution](https://github.com/tensorflow/tensorflow/tree/r1.5/tensorflow/contrib/eager)
preview version is now available.
@@ -633,7 +654,7 @@ answered questions, and were part of inspiring discussions.
* Fixed LIBXSMM integration.
* Make decode_jpeg/decode_png/decode_gif handle all formats, since users frequently try to decode an image as the wrong type.
* Improve implicit broadcasting lowering.
-* Improving stability of GCS/Bigquery clients by a faster retrying of stale transmissions.
+* Improving stability of GCS/BigQuery clients by a faster retrying of stale transmissions.
* Remove OpKernelConstruction::op_def() as part of minimizing proto dependencies.
* VectorLaplaceDiag distribution added.
* Android demo no longer requires libtensorflow_demo.so to run (libtensorflow_inference.so still required)