author Pete Warden <petewarden@google.com> 2016-12-07 16:55:40 -0800
committer TensorFlower Gardener <gardener@tensorflow.org> 2016-12-07 17:04:30 -0800
commit 8f5c4ad135968fe12910d6b5ca641a3b5ae7a888
tree 72b9d4d5ff10d5d03598aadeef92c94575eede25
parent 15a996fb56b656b98729ad77b8dde33a7c74b8a7
Update documentation to explain how to enable SSE/AVX when compiling from source
Change: 141376599
-rw-r--r-- tensorflow/g3doc/get_started/os_setup.md 22
1 files changed, 22 insertions, 0 deletions
diff --git a/tensorflow/g3doc/get_started/os_setup.md b/tensorflow/g3doc/get_started/os_setup.md
index 80a0860086..ec4f2e99cf 100644
--- a/tensorflow/g3doc/get_started/os_setup.md
+++ b/tensorflow/g3doc/get_started/os_setup.md
@@ -784,6 +784,28 @@ $ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
$ sudo pip install /tmp/tensorflow_pkg/tensorflow-0.11.0-py2-none-any.whl
```
+## Optimizing CPU performance
+
+To be compatible with as wide a range of machines as possible, TensorFlow
+uses only SSE4.1 SIMD instructions by default on x86 machines. Most modern PCs
+and Macs support more advanced instructions, so if you're building a binary
+that you'll run only on your own machine, you can enable them by adding
+`--copt=-march=native` to your bazel build command. For example:
+
+``` bash
+$ bazel build --copt=-march=native -c opt //tensorflow/tools/pip_package:build_pip_package
+```
+
+If you are distributing a binary but know the capabilities of the machines
+you'll be running on, you can choose the instruction sets manually with
+something like `--copt=-mavx`. You can also enable several
+features by passing multiple arguments, for example
+`--copt=-mavx2 --copt=-mfma`.
+
+If you run a binary built using SIMD instructions on a machine that doesn't
+support them, you'll see an illegal instruction error when that code is
+executed.
+
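One way to pick per-feature flags is to check which SIMD extensions the target CPU reports. The following is a minimal sketch that is not part of this commit or of TensorFlow. It assumes a Linux machine where `/proc/cpuinfo` has a `flags` line, and the helper name `simd_copts` is hypothetical. It maps a few CPU feature names onto the corresponding `--copt=-m...` compiler options.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of TensorFlow): turn a space-separated list of
# CPU feature flags, as found on the "flags" line of /proc/cpuinfo on Linux,
# into matching --copt arguments for bazel.
simd_copts() {
  local flags=" $1 " out=""
  local feature cpu_name
  for feature in sse4.2 avx avx2 fma; do
    # /proc/cpuinfo spells sse4.2 as "sse4_2"; the other names match as-is.
    cpu_name="${feature//./_}"
    if [[ "$flags" == *" $cpu_name "* ]]; then
      out+="--copt=-m${feature} "
    fi
  done
  # Drop the trailing space before printing.
  printf '%s\n' "${out% }"
}

# Linux only: print suggested options for the first CPU listed, if available.
if [[ -r /proc/cpuinfo ]]; then
  simd_copts "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
fi
```

The printed options can be pasted into the bazel build command in place of `--copt=-march=native`. Run the check on the machine that will execute the binary, not the build machine, since the two may support different instructions.
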
## Setting up TensorFlow for Development
If you're working on TensorFlow itself, it is useful to be able to test your