Diffstat (limited to 'tensorflow/docs_src/get_started/monitors.md')
-rw-r--r--  tensorflow/docs_src/get_started/monitors.md  406
1 files changed, 0 insertions, 406 deletions
diff --git a/tensorflow/docs_src/get_started/monitors.md b/tensorflow/docs_src/get_started/monitors.md
deleted file mode 100644
index 5606e95365..0000000000
--- a/tensorflow/docs_src/get_started/monitors.md
+++ /dev/null
@@ -1,406 +0,0 @@
-# Logging and Monitoring Basics with tf.contrib.learn
-
-When training a model, it’s often valuable to track and evaluate progress in
-real time. In this tutorial, you’ll learn how to use TensorFlow’s logging
-capabilities and the `Monitor` API to audit the in-progress training of a neural
-network classifier for categorizing irises. This tutorial builds on the code
-developed in @{$estimator$tf.estimator Quickstart}, so if you
-haven't yet completed that tutorial, you may want to explore it first,
-especially if you're looking for an intro/refresher on tf.contrib.learn basics.
-
-## Setup {#setup}
-
-For this tutorial, you'll be building upon the following code from
-@{$estimator$tf.estimator Quickstart}:
-
-```python
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import os
-
-import numpy as np
-import tensorflow as tf
-
-# Data sets
-IRIS_TRAINING = os.path.join(os.path.dirname(__file__), "iris_training.csv")
-IRIS_TEST = os.path.join(os.path.dirname(__file__), "iris_test.csv")
-
-def main(unused_argv):
-  # Load datasets.
-  training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
-      filename=IRIS_TRAINING, target_dtype=np.int, features_dtype=np.float32)
-  test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
-      filename=IRIS_TEST, target_dtype=np.int, features_dtype=np.float32)
-
-  # Specify that all features have real-value data
-  feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
-
-  # Build 3 layer DNN with 10, 20, 10 units respectively.
-  classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
-                                              hidden_units=[10, 20, 10],
-                                              n_classes=3,
-                                              model_dir="/tmp/iris_model")
-
-  # Fit model.
-  classifier.fit(x=training_set.data,
-                 y=training_set.target,
-                 steps=2000)
-
-  # Evaluate accuracy.
-  accuracy_score = classifier.evaluate(x=test_set.data,
-                                       y=test_set.target)["accuracy"]
-  print('Accuracy: {0:f}'.format(accuracy_score))
-
-  # Classify two new flower samples.
-  new_samples = np.array(
-      [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float)
-  y = list(classifier.predict(new_samples, as_iterable=True))
-  print('Predictions: {}'.format(str(y)))
-
-if __name__ == "__main__":
-  tf.app.run()
-```
-
-Copy the above code into a file, and download the corresponding
-[training](http://download.tensorflow.org/data/iris_training.csv) and
-[test](http://download.tensorflow.org/data/iris_test.csv) data sets to the same
-directory.
-
-In the following sections, you'll progressively make updates to the above code
-to add logging and monitoring capabilities. Final code incorporating all updates
-is [available for download
-here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/monitors/iris_monitors.py).
-
-## Overview
-
-The @{$estimator$tf.estimator Quickstart tutorial} walked through
-how to implement a neural net classifier to categorize iris examples into one of
-three species.
-
-But when [the code](#setup) from this tutorial is run, the output contains no
-logging that tracks how model training is progressing—only the results of the
-`print` statements that were included:
-
-```none
-Accuracy: 0.933333
-Predictions: [1 2]
-```
-
-Without any logging, model training feels like a bit of a black box; you can't
-see what's happening as TensorFlow steps through gradient descent, get a sense
-of whether the model is converging appropriately, or audit to determine whether
-[early stopping](https://en.wikipedia.org/wiki/Early_stopping) might be
-appropriate.
-
-One way to address this problem would be to split model training into multiple
-`fit` calls with smaller numbers of steps in order to evaluate accuracy
-progressively as training proceeds. However, this is not recommended practice,
-as it greatly slows down model training. Fortunately, tf.contrib.learn offers
-another solution: a @{tf.contrib.learn.monitors$Monitor API} designed to help
-you log metrics and evaluate your model while training is in progress. In the
-following sections, you'll learn how to enable logging in TensorFlow, set up a
-`ValidationMonitor` to do streaming evaluations, and visualize your metrics
-using TensorBoard.
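-
-For concreteness, that not-recommended alternative would look roughly like the
-following sketch, which reuses the `classifier`, `training_set`, and `test_set`
-objects from the [Setup](#setup) code (the loop length and step size here are
-arbitrary):
-
-```python
-# Not recommended: interleave short fit() calls with full evaluations.
-# Each fit() call rebuilds the graph and restores the model from the latest
-# checkpoint, so the repeated setup cost greatly slows down overall training.
-for block in range(20):
-  classifier.fit(x=training_set.data, y=training_set.target, steps=100)
-  accuracy = classifier.evaluate(x=test_set.data,
-                                 y=test_set.target)["accuracy"]
-  print("Accuracy after {} steps: {}".format((block + 1) * 100, accuracy))
-```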
-
-## Enabling Logging with TensorFlow
-
-TensorFlow uses five different levels for log messages. In order of ascending
-severity, they are `DEBUG`, `INFO`, `WARN`, `ERROR`, and `FATAL`. When you
-configure logging at any of these levels, TensorFlow will output all log
-messages corresponding to that level and all levels of higher severity. For
-example, if you set a logging level of `ERROR`, you'll get log output containing
-`ERROR` and `FATAL` messages, and if you set a level of `DEBUG`, you'll get log
-messages from all five levels.
-
-By default, TensorFlow is configured at a logging level of `WARN`, but when
-tracking model training, you'll want to adjust the level to `INFO`, which will
-provide additional feedback as `fit` operations are in progress.
-
-Add the following line to the beginning of your code (right after your
-`import`s):
-
-```python
-tf.logging.set_verbosity(tf.logging.INFO)
-```
-
-Now when you run the code, you'll see additional log output like the following:
-
-```none
-INFO:tensorflow:loss = 1.18812, step = 1
-INFO:tensorflow:loss = 0.210323, step = 101
-INFO:tensorflow:loss = 0.109025, step = 201
-```
-
-With `INFO`-level logging, tf.contrib.learn automatically outputs [training-loss
-metrics](https://en.wikipedia.org/wiki/Loss_function) to stderr after every 100
-steps.
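-
-Because these messages go to stderr, they scroll away with the rest of your
-terminal output. As a minimal sketch for also capturing them in a file,
-assuming (as in TensorFlow 1.x) that `tf.logging` is backed by the standard
-Python logger named "tensorflow":
-
-```python
-import logging
-
-# Assumption: tf.logging writes through the Python logger named "tensorflow".
-# Attach a FileHandler so log messages are also written to training.log.
-logging.getLogger("tensorflow").addHandler(logging.FileHandler("training.log"))
-```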
-
-## Configuring a ValidationMonitor for Streaming Evaluation
-
-Logging training loss is helpful to get a sense of whether your model is
-converging, but what if you want further insight into what's happening during
-training? tf.contrib.learn provides several high-level `Monitor`s you can attach
-to your `fit` operations to further track metrics and/or debug lower-level
-TensorFlow operations during model training, including:
-
-Monitor | Description
-------------------- | -----------
-`CaptureVariable` | Saves a specified variable's values into a collection at every _n_ steps of training
-`PrintTensor` | Logs a specified tensor's values at every _n_ steps of training
-`SummarySaver` | Saves @{tf.Summary} [protocol buffers](https://developers.google.com/protocol-buffers/) for a given tensor using a @{tf.summary.FileWriter} at every _n_ steps of training
-`ValidationMonitor` | Logs a specified set of evaluation metrics at every _n_ steps of training, and, if desired, implements early stopping under certain conditions
-
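-The rest of this tutorial focuses on `ValidationMonitor`, but the other
-monitors attach the same way: construct the monitor and pass it to `fit` via
-the `monitors` list. As a rough sketch only, assuming `PrintTensor`'s
-constructor accepts a list of tensor names and an `every_n` interval (check
-the @{tf.contrib.learn.monitors} reference for the exact signature), and using
-a purely hypothetical tensor name:
-
-```python
-# Sketch only: log the values of named graph tensors every 50 steps.
-# "dnn/logits/biases" is a hypothetical name; real names depend on the
-# model's graph.
-print_monitor = tf.contrib.learn.monitors.PrintTensor(
-    tensor_names=["dnn/logits/biases"],
-    every_n=50)
-
-classifier.fit(x=training_set.data,
-               y=training_set.target,
-               steps=2000,
-               monitors=[print_monitor])
-```
-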
-### Evaluating Every *N* Steps
-
-For the iris neural network classifier, while logging training loss, you might
-also want to simultaneously evaluate against test data to see how well the model
-is generalizing. You can accomplish this by configuring a `ValidationMonitor`
-with the test data (`test_set.data` and `test_set.target`), and setting how
-often to evaluate with `every_n_steps`. The default value of `every_n_steps` is
-`100`; here, set `every_n_steps` to `50` to evaluate after every 50 steps of
-model training:
-
-```python
-validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
-    test_set.data,
-    test_set.target,
-    every_n_steps=50)
-```
-
-Place this code right before the line instantiating the `classifier`.
-
-`ValidationMonitor`s rely on saved checkpoints to perform evaluation operations,
-so you'll want to modify the instantiation of the `classifier` to add a
-@{tf.contrib.learn.RunConfig} that includes
-`save_checkpoints_secs`, which specifies how many seconds should elapse between
-checkpoint saves during training. Because the iris data set is quite small, and
-thus trains quickly, it makes sense to set `save_checkpoints_secs` to 1 (saving
-a checkpoint every second) to ensure a sufficient number of checkpoints:
-
-```python
-classifier = tf.contrib.learn.DNNClassifier(
-    feature_columns=feature_columns,
-    hidden_units=[10, 20, 10],
-    n_classes=3,
-    model_dir="/tmp/iris_model",
-    config=tf.contrib.learn.RunConfig(save_checkpoints_secs=1))
-```
-
-NOTE: The `model_dir` parameter specifies an explicit directory
-(`/tmp/iris_model`) for model data to be stored; this directory path will be
-easier to reference later on than an autogenerated one. Each time you run the
-code, any existing data in `/tmp/iris_model` will be loaded, and model training
-will continue where it left off in the last run (e.g., running the script twice
-in succession will execute 4000 steps during training—2000 during each
-`fit` operation). To restart model training from scratch, delete
-`/tmp/iris_model` before running the code.
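-
-If you'd rather script that cleanup step, a minimal sketch using only the
-Python standard library is:
-
-```python
-import shutil
-
-# Remove any previous model data so training starts from scratch.
-shutil.rmtree("/tmp/iris_model", ignore_errors=True)
-```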
-
-Finally, to attach your `validation_monitor`, update the `fit` call to include a
-`monitors` param, which takes a list of all monitors to run during model
-training:
-
-```python
-classifier.fit(x=training_set.data,
-               y=training_set.target,
-               steps=2000,
-               monitors=[validation_monitor])
-```
-
-Now, when you rerun the code, you should see validation metrics in your log
-output, e.g.:
-
-```none
-INFO:tensorflow:Validation (step 50): loss = 1.71139, global_step = 0, accuracy = 0.266667
-...
-INFO:tensorflow:Validation (step 300): loss = 0.0714158, global_step = 268, accuracy = 0.966667
-...
-INFO:tensorflow:Validation (step 1750): loss = 0.0574449, global_step = 1729, accuracy = 0.966667
-```
-
-### Customizing the Evaluation Metrics with MetricSpec
-
-By default, if no evaluation metrics are specified, `ValidationMonitor` will log
-both [loss](https://en.wikipedia.org/wiki/Loss_function) and accuracy, but you
-can customize the list of metrics that will be run every 50 steps. To specify
-the exact metrics you'd like to run in each evaluation pass, you can add a
-`metrics` param to the `ValidationMonitor` constructor. `metrics` takes a dict
-of key/value pairs, where each key is the name you'd like logged for the metric,
-and the corresponding value is a
-[`MetricSpec`](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn/metric_spec.py)
-object.
-
-The `MetricSpec` constructor accepts four parameters:
-
-* `metric_fn`. The function that calculates and returns the value of a metric.
-  This can be a predefined function available in the
-  @{tf.contrib.metrics} module, such as
-  @{tf.contrib.metrics.streaming_precision} or
-  @{tf.contrib.metrics.streaming_recall}.
-
-  Alternatively, you can define your own custom metric function, which must
-  take `predictions` and `labels` tensors as arguments (a `weights` argument
-  can also optionally be supplied); a minimal sketch of such a function appears
-  after this list. The function must return the value of the metric in one of
-  two formats:
-
-    * A single tensor
-    * A pair of ops `(value_op, update_op)`, where `value_op` returns the
-      metric value and `update_op` performs a corresponding operation to
-      update internal model state.
-
-* `prediction_key`. The key of the tensor containing the predictions returned
-  by the model. This argument may be omitted if the model returns either a
-  single tensor or a dict with a single entry. For a `DNNClassifier` model,
-  class predictions will be returned in a tensor with the key
-  @{tf.contrib.learn.PredictionKey.CLASSES}.
-
-* `label_key`. The key of the tensor containing the labels returned by the
-  model, as specified by the model's @{$input_fn$`input_fn`}. As
-  with `prediction_key`, this argument may be omitted if the `input_fn`
-  returns either a single tensor or a dict with a single entry. In the iris
-  example in this tutorial, the `DNNClassifier` does not have an `input_fn`
-  (`x`,`y` data is passed directly to `fit`), so it's not necessary to provide
-  a `label_key`.
-
-* `weights_key`. *Optional*. The key of the tensor (returned by the
-  @{$input_fn$`input_fn`}) containing weights inputs for the
-  `metric_fn`.
-
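-Here is a minimal sketch of a custom metric function that follows the contract
-described above: it takes `predictions` and `labels` tensors and returns a
-`(value_op, update_op)` pair. The names `my_error_rate` and `error_rate_metric`
-are purely illustrative, and the function simply reuses
-@{tf.contrib.metrics.streaming_accuracy} internally:
-
-```python
-def my_error_rate(predictions, labels):
-  # Reuse the streaming accuracy metric and report 1 - accuracy instead.
-  accuracy, update_op = tf.contrib.metrics.streaming_accuracy(
-      predictions=predictions, labels=labels)
-  return 1.0 - accuracy, update_op
-
-# Illustrative only: wrap the custom function in a MetricSpec just like a
-# predefined metric.
-error_rate_metric = tf.contrib.learn.MetricSpec(
-    metric_fn=my_error_rate,
-    prediction_key=tf.contrib.learn.PredictionKey.CLASSES)
-```
-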
-The following code creates a `validation_metrics` dict that defines three
-metrics to log during model evaluation:
-
-* `"accuracy"`, using @{tf.contrib.metrics.streaming_accuracy}
- as the `metric_fn`
-* `"precision"`, using @{tf.contrib.metrics.streaming_precision}
- as the `metric_fn`
-* `"recall"`, using @{tf.contrib.metrics.streaming_recall}
- as the `metric_fn`
-
-```python
-validation_metrics = {
-    "accuracy":
-        tf.contrib.learn.MetricSpec(
-            metric_fn=tf.contrib.metrics.streaming_accuracy,
-            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
-    "precision":
-        tf.contrib.learn.MetricSpec(
-            metric_fn=tf.contrib.metrics.streaming_precision,
-            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
-    "recall":
-        tf.contrib.learn.MetricSpec(
-            metric_fn=tf.contrib.metrics.streaming_recall,
-            prediction_key=tf.contrib.learn.PredictionKey.CLASSES)
-}
-```
-
-Add the above code before the `ValidationMonitor` constructor. Then revise the
-`ValidationMonitor` constructor as follows to add a `metrics` parameter to log
-the accuracy, precision, and recall metrics specified in `validation_metrics`
-(loss is always logged, and doesn't need to be explicitly specified):
-
-```python
-validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
-    test_set.data,
-    test_set.target,
-    every_n_steps=50,
-    metrics=validation_metrics)
-```
-
-Rerun the code, and you should see precision and recall included in your log
-output, e.g.:
-
-```none
-INFO:tensorflow:Validation (step 50): recall = 0.0, loss = 1.20626, global_step = 1, precision = 0.0, accuracy = 0.266667
-...
-INFO:tensorflow:Validation (step 600): recall = 1.0, loss = 0.0530696, global_step = 571, precision = 1.0, accuracy = 0.966667
-...
-INFO:tensorflow:Validation (step 1500): recall = 1.0, loss = 0.0617403, global_step = 1452, precision = 1.0, accuracy = 0.966667
-```
-
-### Early Stopping with ValidationMonitor
-
-Note that in the above log output, by step 600, the model has already achieved
-precision and recall rates of 1.0. This raises the question as to whether model
-training could benefit from
-[early stopping](https://en.wikipedia.org/wiki/Early_stopping).
-
-In addition to logging eval metrics, `ValidationMonitor`s make it easy to
-implement early stopping when specified conditions are met, via three params:
-
-Param | Description
---------------------------------- | -----------
-`early_stopping_metric` | Metric that triggers early stopping (e.g., loss or accuracy) under the conditions specified in `early_stopping_rounds` and `early_stopping_metric_minimize`. Default is `"loss"`.
-`early_stopping_metric_minimize` | `True` if desired model behavior is to minimize the value of `early_stopping_metric`; `False` if desired model behavior is to maximize it. Default is `True`.
-`early_stopping_rounds` | The number of steps during which, if `early_stopping_metric` does not decrease (when `early_stopping_metric_minimize` is `True`) or increase (when it is `False`), training will be stopped. Default is `None`, which means early stopping will never occur.
-
-Make the following revision to the `ValidationMonitor` constructor, which
-specifies that if loss (`early_stopping_metric="loss"`) does not decrease
-(`early_stopping_metric_minimize=True`) over a period of 200 steps
-(`early_stopping_rounds=200`), model training will stop immediately at that
-point, and not complete the full 2000 steps specified in `fit`:
-
-```python
-validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
-    test_set.data,
-    test_set.target,
-    every_n_steps=50,
-    metrics=validation_metrics,
-    early_stopping_metric="loss",
-    early_stopping_metric_minimize=True,
-    early_stopping_rounds=200)
-```
-
-Rerun the code to see if model training stops early:
-
-```none
-...
-INFO:tensorflow:Validation (step 1150): recall = 1.0, loss = 0.056436, global_step = 1119, precision = 1.0, accuracy = 0.966667
-INFO:tensorflow:Stopping. Best step: 800 with loss = 0.048313818872.
-```
-
-Indeed, here training stops at step 1150, indicating that loss did not decrease
-over the final 200 steps and that, overall, step 800 produced the smallest loss
-value against the test data set. This suggests that additional calibration of
-hyperparameters, such as decreasing the step count, might further improve the
-model.
-
-## Visualizing Log Data with TensorBoard
-
-Reading through the log produced by `ValidationMonitor` provides plenty of raw
-data on model performance during training, but it may also be helpful to see
-visualizations of this data to get further insight into trends—for
-example, how accuracy is changing over step count. You can use TensorBoard (a
-separate program packaged with TensorFlow) to plot graphs like this by setting
-the `logdir` command-line argument to the directory where you saved your model
-training data (here, `/tmp/iris_model`). Run the following on your command line:
-
-<pre><strong>$ tensorboard --logdir=/tmp/iris_model/</strong>
-Starting TensorBoard 39 on port 6006</pre>
-
-Then navigate to `http://0.0.0.0:`*`<port_number>`* in your browser, where
-*`<port_number>`* is the port specified in the command-line output (here,
-`6006`).
-
-If you click on the accuracy field, you'll see an image like the following,
-which shows accuracy plotted against step count:
-
-![Accuracy over step count in TensorBoard](https://www.tensorflow.org/images/validation_monitor_tensorboard_accuracy.png "Accuracy over step count in TensorBoard")
-
-For more on using TensorBoard, see @{$summaries_and_tensorboard$TensorBoard: Visualizing Learning} and @{$graph_viz$TensorBoard: Graph Visualization}.