author Vijay Vasudevan <vrv@google.com> 2015-11-07 13:58:24 -0800
committer Vijay Vasudevan <vrv@google.com> 2015-11-07 13:58:24 -0800
commit fddaed524622417900d745fe8f115562c55ac49a (patch)
tree cabb2fc16540a27748b60329195966d535f48837 /tensorflow/g3doc/tutorials/deep_cnn/index.md
parent 7de9099a739c9dc62b1ca55c1eeef90acbfa7be9 (diff)
TensorFlow: Upstream commits to git.
Changes:
- More documentation edits, fixes to anchors, fixes to mathjax, new images, etc.
- Add rnn models to pip install package.

Base CL: 107312343
Diffstat (limited to 'tensorflow/g3doc/tutorials/deep_cnn/index.md')
-rw-r--r-- tensorflow/g3doc/tutorials/deep_cnn/index.md | 36
1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/tensorflow/g3doc/tutorials/deep_cnn/index.md b/tensorflow/g3doc/tutorials/deep_cnn/index.md
index 906093009e..be23e7ccaa 100644
--- a/tensorflow/g3doc/tutorials/deep_cnn/index.md
+++ b/tensorflow/g3doc/tutorials/deep_cnn/index.md
@@ -1,9 +1,9 @@
-# Convolutional Neural Networks
+# Convolutional Neural Networks <a class="md-anchor" id="AUTOGENERATED-convolutional-neural-networks"></a>
**NOTE:** This tutorial is intended for *advanced* users of TensorFlow
and assumes expertise and experience in machine learning.
-## Overview
+## Overview <a class="md-anchor" id="AUTOGENERATED-overview"></a>
CIFAR-10 classification is a common benchmark problem in machine learning. The
problem is to classify RGB 32x32 pixel images across 10 categories:
@@ -15,7 +15,7 @@ For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/ci
and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf)
by Alex Krizhevsky.
-### Goals
+### Goals <a class="md-anchor" id="AUTOGENERATED-goals"></a>
The goal of this tutorial is to build a relatively small convolutional neural
network (CNN) for recognizing images. In the process, this tutorial:
@@ -29,7 +29,7 @@ exercise much of TensorFlow's ability to scale to large models. At the same
time, the model is small enough to train fast in order to test new ideas and
experiments.
-### Highlights of the Tutorial
+### Highlights of the Tutorial <a class="md-anchor" id="AUTOGENERATED-highlights-of-the-tutorial"></a>
The CIFAR-10 tutorial demonstrates several important constructs for
designing larger and more sophisticated models in TensorFlow:
@@ -60,7 +60,7 @@ We also provide a multi-GPU version of the model which demonstrates:
We hope that this tutorial provides a launch point for building larger CNNs for
vision tasks on TensorFlow.
-### Model Architecture
+### Model Architecture <a class="md-anchor" id="AUTOGENERATED-model-architecture"></a>
The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of
alternating convolutions and nonlinearities. These layers are followed by fully
@@ -74,7 +74,7 @@ of training time on a GPU. Please see [below](#evaluating-a-model) and the code
for details. It consists of 1,068,298 learnable parameters and requires about
19.5M multiply-add operations to compute inference on a single image.
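
As a rough sketch of one such convolution/nonlinearity block (the kernel shapes, strides and pooling window below are illustrative, not the tutorial's exact configuration; see `cifar10.py` for the real architecture):

```python
import tensorflow as tf

def conv_block(images, kernel, biases):
    """One alternation of convolution and nonlinearity, followed by pooling."""
    conv = tf.nn.conv2d(images, kernel, strides=[1, 1, 1, 1], padding='SAME')
    relu = tf.nn.relu(tf.nn.bias_add(conv, biases))
    # Overlapping max pooling reduces spatial resolution before the next block.
    return tf.nn.max_pool(relu, ksize=[1, 3, 3, 1],
                          strides=[1, 2, 2, 1], padding='SAME')
```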
-## Code Organization
+## Code Organization <a class="md-anchor" id="AUTOGENERATED-code-organization"></a>
The code for this tutorial resides in
[`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/).
@@ -88,7 +88,7 @@ File | Purpose
[`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model.
-## CIFAR-10 Model
+## CIFAR-10 Model <a class="md-anchor" id="AUTOGENERATED-cifar-10-model"></a>
The CIFAR-10 network is largely contained in
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py).
@@ -105,7 +105,7 @@ adds operations that perform inference, i.e. classification, on supplied images.
add operations that compute the loss,
gradients, variable updates and visualization summaries.
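
The functions named here roughly form the module's public surface. A hypothetical skeleton, with argument names paraphrased from the prose (consult `cifar10.py` for the real signatures):

```python
def inference(images):
    """Build the graph that computes logits for the supplied images."""
    ...

def loss(logits, labels):
    """Add ops computing the total loss: cross entropy plus weight decay."""
    ...

def train(total_loss, global_step):
    """Add ops computing gradients, variable updates and summaries."""
    ...
```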
-### Model Inputs
+### Model Inputs <a class="md-anchor" id="AUTOGENERATED-model-inputs"></a>
The input part of the model is built by the functions `inputs()` and
`distorted_inputs()` which read images from the CIFAR-10 binary data files.
@@ -143,7 +143,7 @@ processing time. To prevent these operations from slowing down training, we run
them inside 16 separate threads which continuously fill a TensorFlow
[queue](../../api_docs/python/io_ops.md#shuffle_batch).
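
A minimal sketch of the batching step, assuming a reader pipeline already yields one (image, label) pair; the stand-in tensors and capacity constants are illustrative:

```python
import tensorflow as tf

image = tf.random_uniform([24, 24, 3])    # stand-in for one distorted image
label = tf.constant(1, dtype=tf.int32)    # stand-in for its label

images, labels = tf.train.shuffle_batch(
    [image, label],
    batch_size=128,
    num_threads=16,            # the 16 filling threads mentioned above
    capacity=20000,
    min_after_dequeue=10000)   # keeps the queue full enough to shuffle well
```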
-### Model Prediction
+### Model Prediction <a class="md-anchor" id="AUTOGENERATED-model-prediction"></a>
The prediction part of the model is constructed by the `inference()` function
which adds operations to compute the *logits* of the predictions. That part of
@@ -181,7 +181,7 @@ the CIFAR-10 model specified in
layers are locally connected and not fully connected. Try editing the
architecture to exactly replicate that fully connected model.
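
For orientation, the last step of `inference()` is a plain affine map onto the 10 classes; a hedged sketch (the function name, shapes and initializer values here are illustrative):

```python
import tensorflow as tf

def softmax_linear(features, num_classes=10):
    """Final affine layer producing unnormalized logits."""
    dim = features.get_shape()[1].value
    weights = tf.get_variable(
        'weights', [dim, num_classes],
        initializer=tf.truncated_normal_initializer(stddev=0.01))
    biases = tf.get_variable(
        'biases', [num_classes], initializer=tf.constant_initializer(0.0))
    # No softmax here: the loss op consumes the raw logits directly.
    return tf.matmul(features, weights) + biases
```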
-### Model Training
+### Model Training <a class="md-anchor" id="AUTOGENERATED-model-training"></a>
The usual method for training a network to perform N-way classification is
[multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
@@ -199,7 +199,7 @@ loss and all these weight decay terms, as returned by the `loss()` function.
We visualize it in TensorBoard with a [scalar_summary](../../api_docs/python/train.md?#scalar_summary):
![CIFAR-10 Loss](./cifar_loss.png "CIFAR-10 Total Loss")
-###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html)
+###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html) <a class="md-anchor" id="AUTOGENERATED--view-this-tensorboard-live---chrome-ff----tensorboard-cifar.html-"></a>
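
A sketch of how such a total loss can be assembled, assuming each layer already registered its L2 weight decay term in a `'losses'` collection (exact op names vary across TensorFlow versions):

```python
import tensorflow as tf

def total_loss(logits, dense_labels):
    """Cross entropy over dense one-hot labels, plus collected decay terms."""
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        logits=logits, labels=dense_labels)
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    # Total loss = cross entropy + all collected weight decay terms.
    return tf.add_n(tf.get_collection('losses'), name='total_loss')
```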
We train the model using standard
[gradient descent](https://en.wikipedia.org/wiki/Gradient_descent)
@@ -209,7 +209,7 @@ with a learning rate that
over time.
![CIFAR-10 Learning Rate Decay](./cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")
-###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html)
+###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html) <a class="md-anchor" id="AUTOGENERATED--view-this-tensorboard-live---chrome-ff----tensorboard-cifar.html-"></a>
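
A schedule like the one plotted can be expressed with `tf.train.exponential_decay`; the constants below are placeholders, not the tutorial's exact values:

```python
import tensorflow as tf

global_step = tf.Variable(0, trainable=False)
lr = tf.train.exponential_decay(0.1,             # initial learning rate
                                global_step,
                                decay_steps=100000,
                                decay_rate=0.1,
                                staircase=True)  # decay in discrete steps
```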
The `train()` function adds the operations needed to minimize the objective by
calculating the gradient and updating the learned variables (see
@@ -217,7 +217,7 @@ calculating the gradient and updating the learned variables (see
for details). It returns an operation that executes all of the calculations
needed to train and update the model for one batch of images.
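
In outline, such a returned op looks like the following sketch, where `total_loss` (a scalar loss tensor), `lr` and `global_step` are assumed from the sketches above:

```python
import tensorflow as tf

opt = tf.train.GradientDescentOptimizer(lr)
grads = opt.compute_gradients(total_loss)
apply_gradients_op = opt.apply_gradients(grads, global_step=global_step)
# Each session.run() of this op processes one batch and updates the model.
```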
-## Launching and Training the Model
+## Launching and Training the Model <a class="md-anchor" id="AUTOGENERATED-launching-and-training-the-model"></a>
We have built the model; let's now launch it and run the training operation with
the script `cifar10_train.py`.
@@ -302,7 +302,7 @@ values. See how the scripts use
[ExponentialMovingAverage](../../api_docs/python/train.md#ExponentialMovingAverage)
for this purpose.
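
A hedged sketch of that pattern (the decay value is illustrative, and `apply_gradients_op` is assumed from the training sketch above):

```python
import tensorflow as tf

ema = tf.train.ExponentialMovingAverage(decay=0.9999)
averages_op = ema.apply(tf.trainable_variables())

# Fold the averaging update into every training step.
with tf.control_dependencies([apply_gradients_op, averages_op]):
    train_op = tf.no_op(name='train')
```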
-## Evaluating a Model
+## Evaluating a Model <a class="md-anchor" id="AUTOGENERATED-evaluating-a-model"></a>
Let us now evaluate how well the trained model performs on a hold-out data set.
The model is evaluated by the script `cifar10_eval.py`. It constructs the model
@@ -346,7 +346,7 @@ the averaged parameters for the model and verify that the predictive performance
drops.
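
Two of the ingredients are sketched below, assuming `logits` and integer `labels` come from an evaluation input pipeline: counting top-1 hits, and restoring the moving-average ("shadow") value of each learned variable:

```python
import tensorflow as tf

# Boolean per example: was the true label the top prediction?
top_k_op = tf.nn.in_top_k(logits, labels, 1)

# Restore the averaged parameters rather than the raw learned ones.
ema = tf.train.ExponentialMovingAverage(0.9999)
saver = tf.train.Saver(ema.variables_to_restore())
# saver.restore(sess, checkpoint_path) then loads the shadow values.
```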
-## Training a Model Using Multiple GPU Cards
+## Training a Model Using Multiple GPU Cards <a class="md-anchor" id="AUTOGENERATED-training-a-model-using-multiple-gpu-cards"></a>
Modern workstations may contain multiple GPUs for scientific computation.
TensorFlow can leverage this environment to run the training operation
@@ -390,7 +390,7 @@ The GPUs are synchronized in operation. All gradients are accumulated from
the GPUs and averaged (see green box). The model parameters are updated with
the gradients averaged across all model replicas.
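
A compressed sketch of that flow in a later TensorFlow 1.x style; `num_gpus`, `inference`, `loss_fn`, `opt`, `global_step` and the per-tower input batches are assumed, and variable sharing between towers (covered in the next section) is elided:

```python
import tensorflow as tf

tower_grads = []
for i in range(num_gpus):
    with tf.device('/gpu:%d' % i):
        logits = inference(image_batches[i])
        tower_grads.append(opt.compute_gradients(loss_fn(logits, label_batches[i])))

# Average each variable's gradient across the towers, then apply once.
averaged = []
for grads_and_vars in zip(*tower_grads):
    grad = tf.reduce_mean(tf.stack([g for g, _ in grads_and_vars]), 0)
    averaged.append((grad, grads_and_vars[0][1]))
train_op = opt.apply_gradients(averaged, global_step=global_step)
```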
-### Placing Variables and Operations on Devices
+### Placing Variables and Operations on Devices <a class="md-anchor" id="AUTOGENERATED-placing-variables-and-operations-on-devices"></a>
Placing operations and variables on devices requires some special
abstractions.
@@ -414,7 +414,7 @@ All variables are pinned to the CPU and accessed via
in order to share them in a multi-GPU version.
See how-to on [Sharing Variables](../../how_tos/variable_scope/index.md).
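
A sketch of the pinning-plus-sharing pattern (the helper name and shapes are illustrative):

```python
import tensorflow as tf

def variable_on_cpu(name, shape, initializer):
    """Create (or fetch) a variable that lives on the CPU."""
    with tf.device('/cpu:0'):
        return tf.get_variable(name, shape, initializer=initializer)

# Every tower that opens this scope with reuse gets the same 'conv1/weights'.
with tf.variable_scope('conv1') as scope:
    kernel = variable_on_cpu('weights', [5, 5, 3, 64],
                             tf.truncated_normal_initializer(stddev=1e-2))
    scope.reuse_variables()
```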
-### Launching and Training the Model on Multiple GPU cards
+### Launching and Training the Model on Multiple GPU cards <a class="md-anchor" id="AUTOGENERATED-launching-and-training-the-model-on-multiple-gpu-cards"></a>
If you have several GPU cards installed on your machine you can use them to
train the model faster with the `cifar10_multi_gpu_train.py` script. It is a
@@ -446,7 +446,7 @@ you ask for more.
run on a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs
with a batch size of 64 and compare the training speed.
-## Next Steps
+## Next Steps <a class="md-anchor" id="AUTOGENERATED-next-steps"></a>
[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have
completed the CIFAR-10 tutorial.