author: Michael Case <mikecase@google.com> 2018-02-07 14:36:00 -0800
committer: TensorFlower Gardener <gardener@tensorflow.org> 2018-02-07 14:39:49 -0800
commit: d90054e7c0f41f4bab81df0548577a73b939a87a
tree: a15aea686a9d3f305e316d2a6ada0859ad8170d1 /RELEASE.md
parent: 8461760f9f6cde8ed97507484d2a879140141032
Merge changes from github.
PiperOrigin-RevId: 184897758
Diffstat (limited to 'RELEASE.md')
-rw-r--r-- RELEASE.md 27
1 files changed, 24 insertions, 3 deletions
diff --git a/RELEASE.md b/RELEASE.md
index fdf10407fd..b11b1e40db 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,18 +1,39 @@
# Release 1.5.0
## Breaking Changes
-* Prebuilt binaries are now built against CUDA 9 and cuDNN 7.
+* Prebuilt binaries are now built against CUDA 9.0 and cuDNN 7.
* Our Linux binaries are built using ubuntu 16 containers, potentially
introducing glibc incompatibility issues with ubuntu 14.
* Starting from 1.6 release, our prebuilt binaries will use AVX instructions.
This may break TF on older CPUs.
+## Known Bugs
+* Using XLA:GPU with CUDA 9 and CUDA 9.1 results in garbage results and/or
+ `CUDA_ILLEGAL_ADDRESS` failures.
+
+ Google discovered in mid-December 2017 that the PTX-to-SASS compiler in CUDA 9
+ and CUDA 9.1 sometimes does not properly compute the carry bit when
+ decomposing 64-bit address calculations with large offsets (e.g. `load [x +
+ large_constant]`) into 32-bit arithmetic in SASS.
+
+ As a result, these versions of `ptxas` miscompile most XLA programs which use
+ more than 4GB of temp memory. This results in garbage results and/or
+ `CUDA_ERROR_ILLEGAL_ADDRESS` failures.
+
+ A fix in CUDA 9.1.121 is expected in late February 2018. We do not expect a
+ fix for CUDA 9.0.x. Until the fix is available, the only workaround is to
+ [downgrade](https://developer.nvidia.com/cuda-toolkit-archive) to CUDA 8.0.x
+ or disable XLA:GPU.
+
+ TensorFlow will print a warning if you use XLA:GPU with a known-bad version of
+ CUDA; see e00ba24c4038e7644da417ddc639169b6ea59122.
+
## Major Features And Improvements
* [Eager execution](https://github.com/tensorflow/tensorflow/tree/r1.5/tensorflow/contrib/eager)
preview version is now available.
* [TensorFlow Lite](https://github.com/tensorflow/tensorflow/tree/r1.5/tensorflow/contrib/lite)
dev preview is now available.
-* CUDA 9 and cuDNN 7 support.
+* CUDA 9.0 and cuDNN 7 support.
* Accelerated Linear Algebra (XLA):
* Add `complex64` support to XLA compiler.
* `bfloat` support is now added to XLA infrastructure.
@@ -523,7 +544,7 @@ answered questions, and were part of inspiring discussions.
* Fixed LIBXSMM integration.
* Make decode_jpeg/decode_png/decode_gif handle all formats, since users frequently try to decode an image as the wrong type.
* Improve implicit broadcasting lowering.
-* Improving stability of GCS/Bigquery clients by a faster retrying of stale transmissions.
+* Improving stability of GCS/BigQuery clients by a faster retrying of stale transmissions.
* Remove OpKernelConstruction::op_def() as part of minimizing proto dependencies.
* VectorLaplaceDiag distribution added.
* Android demo no longer requires libtensorflow_demo.so to run (libtensorflow_inference.so still required)
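
The Known Bugs note added in the first hunk attributes the miscompile to a lost carry bit when a 64-bit address calculation is split into 32-bit arithmetic. The sketch below only illustrates that arithmetic effect in plain Python. It is not `ptxas` code, and the helper names (`split64`, `add64_via_32`) and the sample addresses are hypothetical, chosen so that the low halves overflow.

```python
MASK32 = 0xFFFFFFFF  # keeps the low 32 bits of a value


def split64(value):
    """Return the (low, high) 32-bit halves of a 64-bit unsigned value."""
    return value & MASK32, (value >> 32) & MASK32


def add64_via_32(base, offset, propagate_carry=True):
    """Add two 64-bit values using only 32-bit pieces.

    With propagate_carry=False the carry out of the low half is dropped,
    which mimics the class of error described in the release note.
    """
    base_lo, base_hi = split64(base)
    off_lo, off_hi = split64(offset)

    lo_sum = base_lo + off_lo
    carry = (lo_sum >> 32) if propagate_carry else 0

    lo = lo_sum & MASK32
    hi = (base_hi + off_hi + carry) & MASK32
    return (hi << 32) | lo


# Hypothetical address and offset whose low halves overflow 32 bits.
base = 0x0000_0001_F000_0000
offset = 0x2000_0000

correct = add64_via_32(base, offset)
broken = add64_via_32(base, offset, propagate_carry=False)

print(hex(correct))           # address with the carry applied
print(hex(broken))            # address with the carry dropped
print(hex(correct - broken))  # the two differ by exactly 2**32
```

When the low halves do not overflow, both variants agree, which is consistent with the note's observation that the problem shows up with large offsets and programs using more than 4GB of temp memory. When they do overflow, the computed address is off by 4 GiB, so a load either reads unrelated data or touches an invalid address.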