From fddaed524622417900d745fe8f115562c55ac49a Mon Sep 17 00:00:00 2001
From: Vijay Vasudevan
Date: Sat, 7 Nov 2015 13:58:24 -0800
Subject: TensorFlow: Upstream commits to git.

Changes:
- More documentation edits, fixes to anchors, fixes to mathjax, new images, etc.
- Add rnn models to pip install package.

Base CL: 107312343
---
 tensorflow/g3doc/tutorials/deep_cnn/index.md | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/tensorflow/g3doc/tutorials/deep_cnn/index.md b/tensorflow/g3doc/tutorials/deep_cnn/index.md
index 906093009e..be23e7ccaa 100644
--- a/tensorflow/g3doc/tutorials/deep_cnn/index.md
+++ b/tensorflow/g3doc/tutorials/deep_cnn/index.md
@@ -1,9 +1,9 @@
-# Convolutional Neural Networks 
+# Convolutional Neural Networks
 
 **NOTE:** This tutorial is intended for *advanced* users of TensorFlow
 and assumes expertise and experience in machine learning.
 
-## Overview 
+## Overview
 
 CIFAR-10 classification is a common benchmark problem in machine learning.
 The problem is to classify RGB 32x32 pixel images across 10 categories:
@@ -15,7 +15,7 @@
 For more details refer to the
 [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html)
 and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf)
 by Alex Krizhevsky.
 
-### Goals 
+### Goals
 
 The goal of this tutorial is to build a relatively small convolutional neural
 network (CNN) for recognizing images. In the process, this tutorial:
@@ -29,7 +29,7 @@ exercise much of TensorFlow's ability to scale to large models. At the same
 time, the model is small enough to train fast in order to test new ideas and
 experiments.
 
-### Highlights of the Tutorial 
+### Highlights of the Tutorial
 
 The CIFAR-10 tutorial demonstrates several important constructs for designing
 larger and more sophisticated models in TensorFlow:
@@ -60,7 +60,7 @@ We also provide a multi-GPU version of the model which demonstrates:
 
 We hope that this tutorial provides a launch point for building larger CNNs for
 vision tasks on TensorFlow.
 
-### Model Architecture 
+### Model Architecture
 
 The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of
 alternating convolutions and nonlinearities. These layers are followed by fully
@@ -74,7 +74,7 @@ of training time on a GPU. Please see [below](#evaluating-a-model) and the code
 for details. It consists of 1,068,298 learnable parameters and requires about
 19.5M multiply-add operations to compute inference on a single image.
 
-## Code Organization 
+## Code Organization
 
 The code for this tutorial resides in
 [`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/).
@@ -88,7 +88,7 @@ File | Purpose
 [`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model.
 
 
-## CIFAR-10 Model 
+## CIFAR-10 Model
 
 The CIFAR-10 network is largely contained in
 [`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py).
@@ -105,7 +105,7 @@ adds operations that perform inference, i.e. classification, on supplied images.
 add operations that compute the loss, gradients, variable updates and
 visualization summaries.
 
-### Model Inputs 
+### Model Inputs
 
 The input part of the model is built by the functions `inputs()` and
 `distorted_inputs()` which read images from the CIFAR-10 binary data files.
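For readers following the tutorial text quoted in this patch, here is a minimal sketch of the kind of record reading that `inputs()` and `distorted_inputs()` build on. It assumes the CIFAR-10 binary layout (one label byte followed by a 3,072-byte depth-major image) and the queue-based input API of this TensorFlow generation; the helper name `read_cifar10` and the filenames are illustrative, not the tutorial's exact code.

```python
import tensorflow as tf

def read_cifar10(filename_queue):
    """Reads and parses one 32x32x3 CIFAR-10 example from the binary files."""
    label_bytes = 1                      # first byte of each record is the label
    height, width, depth = 32, 32, 3
    image_bytes = height * width * depth
    record_bytes = label_bytes + image_bytes

    # Every record has the same length, so a fixed-length reader suffices.
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    _, value = reader.read(filename_queue)
    record = tf.decode_raw(value, tf.uint8)

    label = tf.cast(tf.slice(record, [0], [label_bytes]), tf.int32)
    # The image bytes are stored depth-major; reshape, then transpose to
    # the [height, width, depth] layout the rest of the pipeline expects.
    depth_major = tf.reshape(tf.slice(record, [label_bytes], [image_bytes]),
                             [depth, height, width])
    image = tf.transpose(depth_major, [1, 2, 0])
    return image, label

# Illustrative usage: queue up the five CIFAR-10 training batches.
filenames = ['data_batch_%d.bin' % i for i in range(1, 6)]
image, label = read_cifar10(tf.train.string_input_producer(filenames))
```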
@@ -143,7 +143,7 @@ processing time. To prevent these operations from slowing down training, we run
 them inside 16 separate threads which continuously fill a TensorFlow
 [queue](../../api_docs/python/io_ops.md#shuffle_batch).
 
-### Model Prediction 
+### Model Prediction
 
 The prediction part of the model is constructed by the `inference()` function
 which adds operations to compute the *logits* of the predictions. That part of
@@ -181,7 +181,7 @@ the CIFAR-10 model specified in
 layers are locally connected and not fully connected. Try editing the
 architecture to exactly replicate that fully connected model.
 
-### Model Training 
+### Model Training
 
 The usual method for training a network to perform N-way classification is
 [multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
@@ -199,7 +199,7 @@ loss and all these weight decay terms, as returned by the `loss()` function.
 We visualize it in TensorBoard with a
 [scalar_summary](../../api_docs/python/train.md?#scalar_summary):
 
 ![CIFAR-10 Loss](./cifar_loss.png "CIFAR-10 Total Loss")
-###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html) 
+###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html)
 
 We train the model using standard
 [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent)
@@ -209,7 +209,7 @@ with a learning rate that
 over time.
 
 ![CIFAR-10 Learning Rate Decay](./cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")
-###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html) 
+###### [View this TensorBoard live! (Chrome/FF)](/tensorboard/cifar.html)
 
 The `train()` function adds the operations needed to minimize the objective by
 calculating the gradient and updating the learned variables (see
@@ -217,7 +217,7 @@ calculating the gradient and updating the learned variables (see
 for details). It returns an operation that executes all of the calculations
 needed to train and update the model for one batch of images.
 
-## Launching and Training the Model 
+## Launching and Training the Model
 
 Now that we have built the model, let's launch it and run the training
 operation with the script `cifar10_train.py`.
@@ -302,7 +302,7 @@ values. See how the scripts use
 [ExponentialMovingAverage](../../api_docs/python/train.md#ExponentialMovingAverage)
 for this purpose.
 
-## Evaluating a Model 
+## Evaluating a Model
 
 Let us now evaluate how well the trained model performs on a hold-out data set.
 The model is evaluated by the script `cifar10_eval.py`. It constructs the model
@@ -346,7 +346,7 @@ the averaged parameters for the model and verify that the predictive
 performance drops.
 
 
-## Training a Model Using Multiple GPU Cards 
+## Training a Model Using Multiple GPU Cards
 
 Modern workstations may contain multiple GPUs for scientific computation.
 TensorFlow can leverage this environment to run the training operation
@@ -390,7 +390,7 @@ The GPUs are synchronized in operation. All gradients are accumulated from the
 GPUs and averaged (see green box). The model parameters are updated with the
 gradients averaged across all model replicas.
 
-### Placing Variables and Operations on Devices 
+### Placing Variables and Operations on Devices
 
 Placing operations and variables on devices requires some special
 abstractions.
@@ -414,7 +414,7 @@ All variables are pinned to the CPU and accessed via
 in order to share them in a multi-GPU version.
 See how-to on
 [Sharing Variables](../../how_tos/variable_scope/index.md).
 
-### Launching and Training the Model on Multiple GPU cards 
+### Launching and Training the Model on Multiple GPU cards
 
 If you have several GPU cards installed on your machine, you can use them to
 train the model faster with the `cifar10_multi_gpu_train.py` script. It is a
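The two multi-GPU mechanisms described above, pinning shared variables to the CPU and averaging gradients across towers, can be sketched as follows. This is a hedged illustration against the 0.x-era API; `_variable_on_cpu` and `average_gradients` are names used here for exposition and may not match the scripts line for line.

```python
import tensorflow as tf

def _variable_on_cpu(name, shape, initializer):
    """Creates a variable in host memory so every GPU tower can share it."""
    with tf.device('/cpu:0'):
        return tf.get_variable(name, shape, initializer=initializer)

def average_gradients(tower_grads):
    """Averages the (gradient, variable) pairs computed by each GPU tower.

    tower_grads is a list with one entry per tower; each entry is the list
    returned by opt.compute_gradients() inside that tower.
    """
    average_grads = []
    # zip(*tower_grads) regroups the pairs so that the gradients for the same
    # variable, one from each tower, arrive together.
    for grad_and_vars in zip(*tower_grads):
        grads = [tf.expand_dims(g, 0) for g, _ in grad_and_vars]
        # Stack per-tower gradients and average over the tower dimension.
        grad = tf.reduce_mean(tf.concat(0, grads), 0)
        # Variables are shared across towers, so any tower's handle works.
        average_grads.append((grad, grad_and_vars[0][1]))
    return average_grads
```

Because each tower reuses the same variable scope, the averaged gradients are applied once to the single shared copy of the parameters, which is what keeps the towers synchronized.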
@@ -446,7 +446,7 @@ you ask for more.
 run on a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs
 with a batch size of 64 and compare the training speed.
 
-## Next Steps 
+## Next Steps
 
 [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have
 completed the CIFAR-10 tutorial.
-- 
cgit v1.2.3
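As a closing illustration of the training machinery the tutorial describes, gradient descent on the total loss, an exponentially decaying learning rate, and moving averages of the learned variables, here is a small sketch. The hyperparameter values and the helper name `train_sketch` are assumptions for exposition, not necessarily the tutorial's exact settings.

```python
import tensorflow as tf

def train_sketch(total_loss, global_step, batches_per_epoch):
    """Builds one training op: SGD with a decayed LR plus variable averaging."""
    # Decay the learning rate in staircase steps as training progresses
    # (a 0.1 initial rate and 10x decay every 350 epochs are assumed values).
    lr = tf.train.exponential_decay(0.1, global_step,
                                    350 * batches_per_epoch, 0.1,
                                    staircase=True)
    opt = tf.train.GradientDescentOptimizer(lr)
    apply_op = opt.apply_gradients(opt.compute_gradients(total_loss),
                                   global_step=global_step)

    # Maintain shadow copies of the trained variables; evaluation can restore
    # these averaged values, which often improves predictive performance.
    ema = tf.train.ExponentialMovingAverage(0.9999, global_step)
    with tf.control_dependencies([apply_op]):
        train_op = tf.group(ema.apply(tf.trainable_variables()))
    return train_op
```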