Diffstat (limited to 'tensorflow/g3doc/api_docs/python/train.md')
-rw-r--r-- | tensorflow/g3doc/api_docs/python/train.md | 1825 |
1 files changed, 1825 insertions, 0 deletions
diff --git a/tensorflow/g3doc/api_docs/python/train.md b/tensorflow/g3doc/api_docs/python/train.md new file mode 100644 index 0000000000..0c88968c5d --- /dev/null +++ b/tensorflow/g3doc/api_docs/python/train.md @@ -0,0 +1,1825 @@ +<!-- This file is machine generated: DO NOT EDIT! --> + +# Training +<!-- TOC-BEGIN This section is generated by neural network: DO NOT EDIT! --> +## Contents +* [Optimizers.](#AUTOGENERATED-optimizers.) + * [class tf.train.Optimizer](#Optimizer) + * [Usage](#AUTOGENERATED-usage) + * [Processing gradients before applying them.](#AUTOGENERATED-processing-gradients-before-applying-them.) + * [Gating Gradients](#AUTOGENERATED-gating-gradients) + * [Slots](#AUTOGENERATED-slots) + * [class tf.train.GradientDescentOptimizer](#GradientDescentOptimizer) + * [class tf.train.AdagradOptimizer](#AdagradOptimizer) + * [class tf.train.MomentumOptimizer](#MomentumOptimizer) + * [class tf.train.AdamOptimizer](#AdamOptimizer) + * [class tf.train.FtrlOptimizer](#FtrlOptimizer) + * [class tf.train.RMSPropOptimizer](#RMSPropOptimizer) +* [Gradient Computation.](#AUTOGENERATED-gradient-computation.) + * [tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)](#gradients) + * [class tf.AggregationMethod](#AggregationMethod) + * [tf.stop_gradient(input, name=None)](#stop_gradient) +* [Gradient Clipping](#AUTOGENERATED-gradient-clipping) + * [tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)](#clip_by_value) + * [tf.clip_by_norm(t, clip_norm, name=None)](#clip_by_norm) + * [tf.clip_by_average_norm(t, clip_norm, name=None)](#clip_by_average_norm) + * [tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)](#clip_by_global_norm) + * [tf.global_norm(t_list, name=None)](#global_norm) +* [Decaying the learning rate.](#AUTOGENERATED-decaying-the-learning-rate.) 
+ * [tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)](#exponential_decay) +* [Moving Averages.](#AUTOGENERATED-moving-averages.) + * [class tf.train.ExponentialMovingAverage](#ExponentialMovingAverage) +* [Coordinator and QueueRunner.](#AUTOGENERATED-coordinator-and-queuerunner.) + * [class tf.train.Coordinator](#Coordinator) + * [class tf.train.QueueRunner](#QueueRunner) + * [tf.train.add_queue_runner(qr, collection='queue_runners')](#add_queue_runner) + * [tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners')](#start_queue_runners) +* [Summary Operations.](#AUTOGENERATED-summary-operations.) + * [tf.scalar_summary(tags, values, collections=None, name=None)](#scalar_summary) + * [tf.image_summary(tag, tensor, max_images=None, collections=None, name=None)](#image_summary) + * [tf.histogram_summary(tag, values, collections=None, name=None)](#histogram_summary) + * [tf.nn.zero_fraction(value, name=None)](#zero_fraction) + * [tf.merge_summary(inputs, collections=None, name=None)](#merge_summary) + * [tf.merge_all_summaries(key='summaries')](#merge_all_summaries) +* [Adding Summaries to Event Files.](#AUTOGENERATED-adding-summaries-to-event-files.) + * [class tf.train.SummaryWriter](#SummaryWriter) + * [tf.train.summary_iterator(path)](#summary_iterator) +* [Training utilities.](#AUTOGENERATED-training-utilities.) + * [tf.train.global_step(sess, global_step_tensor)](#global_step) + * [tf.train.write_graph(graph_def, logdir, name, as_text=True)](#write_graph) + + +<!-- TOC-END This section was generated by neural network, THANKS FOR READING! --> + +This library provides a set of classes and functions that helps train models. + +## Optimizers. <div class="md-anchor" id="AUTOGENERATED-optimizers.">{#AUTOGENERATED-optimizers.}</div> + +The Optimizer base class provides methods to compute gradients for a loss and +apply gradients to variables. 
A collection of subclasses implement classic +optimization algorithms such as GradientDescent and Adagrad. + +You never instantiate the Optimizer class itself, but instead instantiate one +of the subclasses. + +- - - + +### class tf.train.Optimizer <div class="md-anchor" id="Optimizer">{#Optimizer}</div> + +Base class for optimizers. + +This class defines the API to add Ops to train a model. You never use this +class directly, but instead instantiate one of its subclasses such as +`GradientDescentOptimizer`, `AdagradOptimizer`, or `MomentumOptimizer`. + +### Usage <div class="md-anchor" id="AUTOGENERATED-usage">{#AUTOGENERATED-usage}</div> + +``` +# Create an optimizer with the desired parameters. +opt = GradientDescentOptimizer(learning_rate=0.1) +# Add Ops to the graph to minimize a cost by updating a list of variables. +# "cost" is a Tensor, and the list of variables contains variables.Variable +# objects. +opt_op = opt.minimize(cost, <list of variables>) +``` + +In the training program you will just have to run the returned Op. + +``` +# Execute opt_op to do one step of training: +opt_op.run() +``` + +### Processing gradients before applying them. <div class="md-anchor" id="AUTOGENERATED-processing-gradients-before-applying-them.">{#AUTOGENERATED-processing-gradients-before-applying-them.}</div> + +Calling `minimize()` takes care of both computing the gradients and +applying them to the variables. If you want to process the gradients +before applying them you can instead use the optimizer in three steps: + +1. Compute the gradients with `compute_gradients()`. +2. Process the gradients as you wish. +3. Apply the processed gradients with `apply_gradients()`. + +Example: + +``` +# Create an optimizer. +opt = GradientDescentOptimizer(learning_rate=0.1) + +# Compute the gradients for a list of variables. +grads_and_vars = opt.compute_gradients(loss, <list of variables>) + +# grads_and_vars is a list of tuples (gradient, variable). 
Do whatever you +# need to the 'gradient' part, for example cap them, etc. +capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars] + +# Ask the optimizer to apply the capped gradients. +opt.apply_gradients(capped_grads_and_vars) +``` + +- - - + +#### tf.train.Optimizer.__init__(use_locking, name) {#Optimizer.__init__} + +Create a new Optimizer. + +This must be called by the constructors of subclasses. + +##### Args: + + +* <b>use_locking</b>: Bool. If True, use locks to prevent concurrent updates + to variables. +* <b>name</b>: A non-empty string. The name to use for accumulators created + for the optimizer. + +##### Raises: + + +* <b>ValueError</b>: if name is malformed. + + + +- - - + +#### tf.train.Optimizer.minimize(loss, global_step=None, var_list=None, gate_gradients=1, name=None) {#Optimizer.minimize} + +Add operations to minimize 'loss' by updating 'var_list'. + +This method simply combines calls to compute_gradients() and +apply_gradients(). If you want to process the gradients before applying them, +call compute_gradients() and apply_gradients() explicitly instead of using +this function. + +##### Args: + + +* <b>loss</b>: A Tensor containing the value to minimize. +* <b>global_step</b>: Optional Variable to increment by one after the + variables have been updated. +* <b>var_list</b>: Optional list of variables.Variable to update to minimize + 'loss'. Defaults to the list of variables collected in the graph + under the key GraphKeys.TRAINABLE_VARIABLES. +* <b>gate_gradients</b>: How to gate the computation of gradients. Can be + GATE_NONE, GATE_OP, or GATE_GRAPH. +* <b>name</b>: Optional name for the returned operation. + +##### Returns: + + An Operation that updates the variables in 'var_list'. If 'global_step' + was not None, that operation also increments global_step. + +##### Raises: + + +* <b>ValueError</b>: if some of the variables are not variables.Variable objects.
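Stripped of graph construction, one `minimize()` step amounts to computing a gradient, applying it, and bumping the step counter. A plain-Python sketch on the illustrative loss `w**2` (no TensorFlow involved; all values are assumptions for illustration):

```python
# Plain-number sketch of one minimize() step on the illustrative loss w**2.
learning_rate = 0.1
w = 5.0            # a single "variable"
global_step = 0

grad = 2.0 * w                # compute_gradients: d(w**2)/dw = 2w
w -= learning_rate * grad     # apply_gradients: gradient descent update
global_step += 1              # minimize(global_step=...) increments this
```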
+ + +- - - + +#### tf.train.Optimizer.compute_gradients(loss, var_list=None, gate_gradients=1) {#Optimizer.compute_gradients} + +Compute gradients of "loss" for the variables in "var_list". + +This is the first part of minimize(). It returns a list +of (gradient, variable) pairs where "gradient" is the gradient +for "variable". Note that "gradient" can be a Tensor, an +IndexedSlices, or None if there is no gradient for the +given variable. + +##### Args: + + +* <b>loss</b>: A Tensor containing the value to minimize. +* <b>var_list</b>: Optional list of variables.Variable to update to minimize + "loss". Defaults to the list of variables collected in the graph + under the key GraphKeys.TRAINABLE_VARIABLES. +* <b>gate_gradients</b>: How to gate the computation of gradients. Can be + GATE_NONE, GATE_OP, or GATE_GRAPH. + +##### Returns: + + A list of (gradient, variable) pairs. + +##### Raises: + + +* <b>TypeError</b>: If var_list contains anything other than variables.Variable. +* <b>ValueError</b>: If some arguments are invalid. + + +- - - + +#### tf.train.Optimizer.apply_gradients(grads_and_vars, global_step=None, name=None) {#Optimizer.apply_gradients} + +Apply gradients to variables. + +This is the second part of minimize(). It returns an Operation that +applies gradients. + +##### Args: + + +* <b>grads_and_vars</b>: List of (gradient, variable) pairs as returned by + compute_gradients(). +* <b>global_step</b>: Optional Variable to increment by one after the + variables have been updated. +* <b>name</b>: Optional name for the returned operation. Defaults to the + name passed to the Optimizer constructor. + +##### Returns: + + An Operation that applies the specified gradients. If 'global_step' + was not None, that operation also increments global_step. + +##### Raises: + + +* <b>TypeError</b>: if grads_and_vars is malformed.
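The compute/process/apply split described above can be mimicked with plain Python numbers; here the `MyCapper` from the earlier example is modeled as a simple clamp (all names and values are illustrative, not TensorFlow API):

```python
# Plain-number sketch of compute_gradients -> process -> apply_gradients.
learning_rate = 0.1
variables = {"w": 2.0, "b": -1.0}
grads = {"w": 30.0, "b": -0.25}        # pretend compute_gradients() output

def my_capper(g, limit=1.0):
    # Model of the MyCapper processing step: clamp each gradient.
    return max(-limit, min(limit, g))

capped = {name: my_capper(g) for name, g in grads.items()}
for name, g in capped.items():         # apply_gradients: update each variable
    variables[name] -= learning_rate * g
```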
+ + + +### Gating Gradients <div class="md-anchor" id="AUTOGENERATED-gating-gradients">{#AUTOGENERATED-gating-gradients}</div> + +Both `minimize()` and `compute_gradients()` accept a `gate_gradients` argument +that controls the degree of parallelism during the application of the +gradients. + +The possible values are: `GATE_NONE`, `GATE_OP`, and `GATE_GRAPH`. + +<b>GATE_NONE</b>: Compute and apply gradients in parallel. This provides the +maximum parallelism in execution, at the cost of some non-reproducibility in +the results. For example, the two gradients of MatMul depend on the input +values: with `GATE_NONE` one of the gradients could be applied to one of the +inputs _before_ the other gradient is computed, resulting in non-reproducible +results. + +<b>GATE_OP</b>: For each Op, make sure all gradients are computed before they +are used. This prevents race conditions for Ops that generate gradients for +multiple inputs where the gradients depend on the inputs. + +<b>GATE_GRAPH</b>: Make sure all gradients for all variables are computed +before any one of them is used. This provides the least parallelism but can +be useful if you want to process all gradients before applying any of them. + +### Slots <div class="md-anchor" id="AUTOGENERATED-slots">{#AUTOGENERATED-slots}</div> + +Some optimizer subclasses, such as `MomentumOptimizer` and `AdagradOptimizer`, +allocate and manage additional variables associated with the variables to +train. These are called <i>Slots</i>. Slots have names and you can ask the +optimizer for the names of the slots that it uses. Once you have a slot name +you can ask the optimizer for the variable it created to hold the slot value. + +This can be useful if you want to debug a training algorithm, report stats +about the slots, etc. + +- - - + +#### tf.train.Optimizer.get_slot_names() {#Optimizer.get_slot_names} + +Return a list of the names of slots created by the Optimizer. + +See get_slot(). + +##### Returns: + + A list of strings.
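A slot can be pictured as per-variable side storage keyed by (variable, slot name). A hypothetical plain-Python model of an Adagrad-style "accumulator" slot, sketching the lookup behavior of `get_slot()` (this is not the actual TensorFlow implementation):

```python
# Hypothetical model of optimizer slots: named per-variable state.
slots = {}

def get_slot(var_name, slot_name):
    # Mirrors Optimizer.get_slot(): None if the slot was never created.
    return slots.get((var_name, slot_name))

def accumulate(var_name, grad):
    # Adagrad-style slot: running sum of squared gradients.
    key = (var_name, "accumulator")
    slots[key] = slots.get(key, 0.0) + grad * grad

accumulate("w", 3.0)
accumulate("w", 4.0)
```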
+ + +- - - + +#### tf.train.Optimizer.get_slot(var, name) {#Optimizer.get_slot} + +Return a slot named "name" created for "var" by the Optimizer. + +Some Optimizer subclasses use additional variables. For example +Momentum and Adagrad use variables to accumulate updates. This method +gives access to these Variables if for some reason you need them. + +Use get_slot_names() to get the list of slot names created by the Optimizer. + +##### Args: + + +* <b>var</b>: A variable passed to minimize() or apply_gradients(). +* <b>name</b>: A string. + +##### Returns: + + The Variable for the slot if it was created, None otherwise. + + + + +- - - + +### class tf.train.GradientDescentOptimizer <div class="md-anchor" id="GradientDescentOptimizer">{#GradientDescentOptimizer}</div> + +Optimizer that implements the gradient descent algorithm. + +- - - + +#### tf.train.GradientDescentOptimizer.__init__(learning_rate, use_locking=False, name='GradientDescent') {#GradientDescentOptimizer.__init__} + +Construct a new gradient descent optimizer. + +##### Args: + + +* <b>learning_rate</b>: A Tensor or a floating point value. The learning + rate to use. +* <b>use_locking</b>: If True, use locks for update operations. +* <b>name</b>: Optional name prefix for the operations created when applying + gradients. Defaults to "GradientDescent". + + + +- - - + +### class tf.train.AdagradOptimizer <div class="md-anchor" id="AdagradOptimizer">{#AdagradOptimizer}</div> + +Optimizer that implements the Adagrad algorithm. + +- - - + +#### tf.train.AdagradOptimizer.__init__(learning_rate, initial_accumulator_value=0.1, use_locking=False, name='Adagrad') {#AdagradOptimizer.__init__} + +Construct a new Adagrad optimizer. + +##### Args: + + +* <b>learning_rate</b>: A `Tensor` or a floating point value. The learning rate. +* <b>initial_accumulator_value</b>: A floating point value. + Starting value for the accumulators, must be positive. +* <b>use_locking</b>: If `True` use locks for update operations.
+* <b>name</b>: Optional name prefix for the operations created when applying + gradients. Defaults to "Adagrad". + +##### Raises: + + +* <b>ValueError</b>: If the initial_accumulator_value is invalid. + + + +- - - + +### class tf.train.MomentumOptimizer <div class="md-anchor" id="MomentumOptimizer">{#MomentumOptimizer}</div> + +Optimizer that implements the Momentum algorithm. + +- - - + +#### tf.train.MomentumOptimizer.__init__(learning_rate, momentum, use_locking=False, name='Momentum') {#MomentumOptimizer.__init__} + +Construct a new Momentum optimizer. + +##### Args: + + +* <b>learning_rate</b>: A `Tensor` or a floating point value. The learning rate. +* <b>momentum</b>: A `Tensor` or a floating point value. The momentum. +* <b>use_locking</b>: If `True` use locks for update operations. +* <b>name</b>: Optional name prefix for the operations created when applying + gradients. Defaults to "Momentum". + + + +- - - + +### class tf.train.AdamOptimizer <div class="md-anchor" id="AdamOptimizer">{#AdamOptimizer}</div> + +Optimizer that implements the Adam algorithm. + +- - - + +#### tf.train.AdamOptimizer.__init__(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, use_locking=False, name='Adam') {#AdamOptimizer.__init__} + +Construct a new Adam optimizer. + +Implementation is based on: http://arxiv.org/pdf/1412.6980v7.pdf + +Initialization: + +``` +m_0 <- 0 (Initialize initial 1st moment vector) +v_0 <- 0 (Initialize initial 2nd moment vector) +t <- 0 (Initialize timestep) +``` + +The update rule for `variable` with gradient `g` uses an optimization +described at the end of section2 of the paper: + +``` +t <- t + 1 +lr_t <- learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t) + +m_t <- beta1 * m_{t-1} + (1 - beta1) * g +v_t <- beta2 * v_{t-1} + (1 - beta2) * g * g +variable <- variable - lr_t * m_t / (sqrt(v_t) + epsilon) +``` + +The default value of 1e-8 for epsilon might not be a good default in +general. 
For example, when training an Inception network on ImageNet, a +current good choice is 1.0 or 0.1. + +##### Args: + + +* <b>learning_rate</b>: A Tensor or a floating point value. The learning rate. +* <b>beta1</b>: A float value or a constant float tensor. + The exponential decay rate for the 1st moment estimates. +* <b>beta2</b>: A float value or a constant float tensor. + The exponential decay rate for the 2nd moment estimates. +* <b>epsilon</b>: A small constant for numerical stability. +* <b>use_locking</b>: If True, use locks for update operations. +* <b>name</b>: Optional name for the operations created when applying gradients. + Defaults to "Adam". + + + +- - - + +### class tf.train.FtrlOptimizer <div class="md-anchor" id="FtrlOptimizer">{#FtrlOptimizer}</div> + +Optimizer that implements the FTRL algorithm. + +- - - + +#### tf.train.FtrlOptimizer.__init__(learning_rate, learning_rate_power=-0.5, initial_accumulator_value=0.1, l1_regularization_strength=0.0, l2_regularization_strength=0.0, use_locking=False, name='Ftrl') {#FtrlOptimizer.__init__} + +Construct a new FTRL optimizer. + +The Ftrl-proximal algorithm, short for Follow-the-regularized-leader, +is described in the paper [Ad Click Prediction: a View from the Trenches]( +https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf). + +It can give a good performance vs. sparsity tradeoff. + +Ftrl-proximal uses its own global base learning rate and can behave like +Adagrad with `learning_rate_power=-0.5`, or like gradient descent with +`learning_rate_power=0.0`. + +The effective learning rate is adjusted per parameter, relative to this +base learning rate as: + +``` +effective_learning_rate_i = (learning_rate / + pow(k + summed_squared_gradients_for_i, learning_rate_power)); +``` + +where k is the small constant `initial_accumulator_value`.
+ +Note that the real regularization coefficient of `|w|^2` for the objective +function is `1 / lambda_2` if specifying `l2 = lambda_2` as argument when +using this function. + +##### Args: + + +* <b>learning_rate</b>: A float value or a constant float `Tensor`. +* <b>learning_rate_power</b>: A float value, must be less than or equal to zero. +* <b>initial_accumulator_value</b>: The starting value for accumulators. + Only positive values are allowed. +* <b>l1_regularization_strength</b>: A float value, must be greater than or + equal to zero. +* <b>l2_regularization_strength</b>: A float value, must be greater than or + equal to zero. +* <b>use_locking</b>: If `True` use locks for update operations. +* <b>name</b>: Optional name prefix for the operations created when applying + gradients. Defaults to "Ftrl". + +##### Raises: + + +* <b>ValueError</b>: if one of the arguments is invalid. + + + +- - - + +### class tf.train.RMSPropOptimizer <div class="md-anchor" id="RMSPropOptimizer">{#RMSPropOptimizer}</div> + +Optimizer that implements the RMSProp algorithm. + +- - - + +#### tf.train.RMSPropOptimizer.__init__(learning_rate, decay, momentum=0.0, epsilon=1e-10, use_locking=False, name='RMSProp') {#RMSPropOptimizer.__init__} + +Construct a new RMSProp optimizer. + +##### Args: + + +* <b>learning_rate</b>: A Tensor or a floating point value. The learning rate. +* <b>decay</b>: Discounting factor for the history/coming gradient. +* <b>momentum</b>: A scalar tensor. +* <b>epsilon</b>: Small value to avoid a zero denominator. +* <b>use_locking</b>: If True, use locks for update operations. +* <b>name</b>: Optional name prefix for the operations created when applying + gradients. Defaults to "RMSProp". + + + + +## Gradient Computation. <div class="md-anchor" id="AUTOGENERATED-gradient-computation.">{#AUTOGENERATED-gradient-computation.}</div> + +TensorFlow provides functions to compute the derivatives for a given +TensorFlow computation graph, adding operations to the graph.
The +optimizer classes automatically compute derivatives on your graph, but +creators of new Optimizers or expert users can call the lower-level +functions below. + +- - - + +### tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None) <div class="md-anchor" id="gradients">{#gradients}</div> + +Constructs symbolic partial derivatives of `ys` w.r.t. x in `xs`. + +`ys` and `xs` are each a `Tensor` or a list of tensors. `grad_ys` +is a list of `Tensor`, holding the gradients received by the +`ys`. The list must be the same length as `ys`. + +`gradients()` adds ops to the graph to output the partial +derivatives of `ys` with respect to `xs`. It returns a list of +`Tensor` of length `len(xs)` where each tensor is the `sum(dy/dx)` +for y in `ys`. + +`grad_ys` is a list of tensors of the same length as `ys` that holds +the initial gradients for each y in `ys`. When `grad_ys` is None, +we fill in a tensor of '1's of the shape of y for each y in `ys`. A +user can provide their own initial `grad_ys` to compute the +derivatives using a different initial gradient for each y (e.g., if +one wanted to weight the gradient differently for each value in +each y). + +##### Args: + + +* <b>ys</b>: A `Tensor` or list of tensors to be differentiated. +* <b>xs</b>: A `Tensor` or list of tensors to be used for differentiation. +* <b>grad_ys</b>: Optional. A `Tensor` or list of tensors the same size as + `ys` and holding the gradients computed for each y in `ys`. +* <b>name</b>: Optional name to use for grouping all the gradient ops together. + Defaults to 'gradients'. +* <b>colocate_gradients_with_ops</b>: If True, try colocating gradients with + the corresponding op. +* <b>gate_gradients</b>: If True, add a tuple around the gradients returned + for an operation. This avoids some race conditions. +* <b>aggregation_method</b>: Specifies the method used to combine gradient terms.
+ Accepted values are constants defined in the class `AggregationMethod`. + +##### Returns: + + A list of `sum(dy/dx)` for each x in `xs`. + +##### Raises: + + +* <b>LookupError</b>: if one of the operations between `x` and `y` does not + have a registered gradient function. +* <b>ValueError</b>: if the arguments are invalid. + + +- - - + +### class tf.AggregationMethod <div class="md-anchor" id="AggregationMethod">{#AggregationMethod}</div> + +A class listing aggregation methods used to combine gradients. + +Computing partial derivatives can require aggregating gradient +contributions. This class lists the various methods that can +be used to combine gradients in the graph: + +* `ADD_N`: All of the gradient terms are summed as part of one + operation using the "AddN" op. It has the property that all + gradients must be ready before any aggregation is performed. +* `DEFAULT`: The system-chosen default aggregation method. + + +- - - + +### tf.stop_gradient(input, name=None) <div class="md-anchor" id="stop_gradient">{#stop_gradient}</div> + +Stops gradient computation. + +When executed in a graph, this op outputs its input tensor as-is. + +When building ops to compute gradients, this op prevents the contribution of +its inputs from being taken into account. Normally, the gradient generator adds ops +to a graph to compute the derivatives of a specified 'loss' by recursively +finding out inputs that contributed to its computation. If you insert this op +in the graph, its inputs are masked from the gradient generator. They are not +taken into account for computing gradients. + +This is useful any time you want to compute a value with TensorFlow but need +to pretend that the value was a constant. Some examples include: + +* The *EM* algorithm where the *M-step* should not involve backpropagation + through the output of the *E-step*.
+* Contrastive divergence training of Boltzmann machines where, when + differentiating the energy function, the training must not backpropagate + through the graph that generated the samples from the model. +* Adversarial training, where no backprop should happen through the adversarial + example generation process. + +##### Args: + + +* <b>input</b>: A `Tensor`. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A `Tensor`. Has the same type as `input`. + + + + +## Gradient Clipping <div class="md-anchor" id="AUTOGENERATED-gradient-clipping">{#AUTOGENERATED-gradient-clipping}</div> + +TensorFlow provides several operations that you can use to add clipping +functions to your graph. You can use these functions to perform general data +clipping, but they're particularly useful for handling exploding or vanishing +gradients. + +- - - + +### tf.clip_by_value(t, clip_value_min, clip_value_max, name=None) <div class="md-anchor" id="clip_by_value">{#clip_by_value}</div> + +Clips tensor values to a specified min and max. + +Given a tensor `t`, this operation returns a tensor of the same type and +shape as `t` with its values clipped to `clip_value_min` and `clip_value_max`. +Any values less than `clip_value_min` are set to `clip_value_min`. Any values +greater than `clip_value_max` are set to `clip_value_max`. + +##### Args: + + +* <b>t</b>: A `Tensor`. +* <b>clip_value_min</b>: A 0-D (scalar) `Tensor`. The minimum value to clip by. +* <b>clip_value_max</b>: A 0-D (scalar) `Tensor`. The maximum value to clip by. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A clipped `Tensor`. + + +- - - + +### tf.clip_by_norm(t, clip_norm, name=None) <div class="md-anchor" id="clip_by_norm">{#clip_by_norm}</div> + +Clips tensor values to a maximum L2-norm. + +Given a tensor `t`, and a maximum clip value `clip_norm`, this operation +normalizes `t` so that its L2-norm is less than or equal to `clip_norm'. 
+Specifically, if the L2-norm is already less than or equal to `clip_norm`, +then `t` is not modified. If the L2-norm is greater than `clip_norm`, then +this operation returns a tensor of the same type and shape as `t` with its +values set to: + +`t * clip_norm / l2norm(t)` + +In this case, the L2-norm of the output tensor is `clip_norm`. + +This operation is typically used to clip gradients before applying them with +an optimizer. + +##### Args: + + +* <b>t</b>: A `Tensor`. +* <b>clip_norm</b>: A 0-D (scalar) `Tensor` > 0. A maximum clipping value. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A clipped `Tensor`. + + +- - - + +### tf.clip_by_average_norm(t, clip_norm, name=None) <div class="md-anchor" id="clip_by_average_norm">{#clip_by_average_norm}</div> + +Clips tensor values to a maximum average L2-norm. + +Given a tensor `t`, and a maximum clip value `clip_norm`, this operation +normalizes `t` so that its average L2-norm is less than or equal to +`clip_norm'. Specifically, if the average L2-norm is already less than or +equal to `clip_norm`, then `t` is not modified. If the average L2-norm is +greater than `clip_norm`, then this operation returns a tensor of the same +type and shape as `t` with its values set to: + +`t * clip_norm / l2norm_avg(t)` + +In this case, the average L2-norm of the output tensor is `clip_norm`. + +This operation is typically used to clip gradients before applying them with +an optimizer. + +##### Args: + + +* <b>t</b>: A `Tensor`. +* <b>clip_norm</b>: A 0-D (scalar) `Tensor` > 0. A maximum clipping value. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A clipped `Tensor`. + + +- - - + +### tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None) <div class="md-anchor" id="clip_by_global_norm">{#clip_by_global_norm}</div> + +Clips values of multiple tensors by the ratio of the sum of their norms. 
+ +Given a tuple or list of tensors `t_list`, and a clipping ratio `clip_norm`, +this operation returns a list of clipped tensors `list_clipped` +and the global norm (`global_norm`) of all tensors in `t_list`. Optionally, +if you've already computed the global norm for `t_list`, you can specify +the global norm with `use_norm`. + +To perform the clipping, the values t_list[i] are set to: + +`t_list[i] * clip_norm / max(global_norm, clip_norm)` + +where: + +`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))` + +If `clip_norm > global_norm` then the entries in `t_list` remain as they are, +otherwise they're all shrunk by the global ratio. + +Any of the entries of `t_list` that are of type None are ignored. + +This is the correct way to perform gradient clipping (for example, see +R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training +Recurrent Neural Networks". http://arxiv.org/abs/1211.5063) + +However, it is slower than `clip_by_norm()` because all the parameters must be +ready before the clipping operation can be performed. + +##### Args: + + +* <b>t_list</b>: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None. +* <b>clip_norm</b>: A 0-D (scalar) `Tensor` > 0. The clipping ratio. +* <b>use_norm</b>: A 0-D (scalar) `Tensor` of type `float` (optional). The global + norm to use. If not provided, `global_norm()` is used to compute the norm. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + +* <b>list_clipped</b>: A list of `Tensors` of the same type as `list_t`. +* <b>global_norm</b>: A 0-D (scalar) `Tensor` representing the global norm. + +##### Raises: + + +* <b>TypeError</b>: If `t_list` is not a sequence. + + +- - - + +### tf.global_norm(t_list, name=None) <div class="md-anchor" id="global_norm">{#global_norm}</div> + +Computes the global norm of multiple tensors. + +Given a tuple or list of tensors `t_list`, this operation returns the +global norm of the elements in all tensors in `t_list`. 
The global norm is +computed as: + +`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))` + +Any entries in `t_list` that are of type None are ignored. + +##### Args: + + +* <b>t_list</b>: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None. +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A 0-D (scalar) `Tensor` of type `float`. + +##### Raises: + + +* <b>TypeError</b>: If `t_list` is not a sequence. + + + +## Decaying the learning rate. <div class="md-anchor" id="AUTOGENERATED-decaying-the-learning-rate.">{#AUTOGENERATED-decaying-the-learning-rate.}</div> +- - - + +### tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None) <div class="md-anchor" id="exponential_decay">{#exponential_decay}</div> + +Applies exponential decay to the learning rate. + +When training a model, it is often recommended to lower the learning rate as +the training progresses. This function applies an exponential decay function +to a provided initial learning rate. It requires a `global_step` value to +compute the decayed learning rate. You can just pass a TensorFlow variable +that you increment at each training step. + +The function returns the decayed learning rate. It is computed as: + +```python +decayed_learning_rate = learning_rate * + decay_rate ^ (global_step / decay_steps) +``` + +If the argument `staircase` is `True`, then `global_step / decay_steps` is an +integer division and the decayed learning rate follows a staircase function. + +Example: decay every 100000 steps with a base of 0.96: + +```python +... +global_step = tf.Variable(0, trainable=False) +starter_learning_rate = 0.1 +learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step, + 100000, 0.96, staircase=True) +optimizer = tf.train.GradientDescentOptimizer(learning_rate) +# Passing global_step to minimize() will increment it at each step.
+optimizer.minimize(...my loss..., global_step=global_step) +``` + +##### Args: + + +* <b>learning_rate</b>: A scalar `float32` or `float64` `Tensor` or a + Python number. The initial learning rate. +* <b>global_step</b>: A scalar `int32` or `int64` `Tensor` or a Python number. + Global step to use for the decay computation. Must not be negative. +* <b>decay_steps</b>: A scalar `int32` or `int64` `Tensor` or a Python number. + Must be positive. See the decay computation above. +* <b>decay_rate</b>: A scalar `float32` or `float64` `Tensor` or a + Python number. The decay rate. +* <b>staircase</b>: Boolean. If `True`, decay the learning rate at discrete intervals. +* <b>name</b>: String. Optional name of the operation. Defaults to 'ExponentialDecay'. + +##### Returns: + + A scalar `Tensor` of the same type as `learning_rate`. The decayed + learning rate. + + + +## Moving Averages. <div class="md-anchor" id="AUTOGENERATED-moving-averages.">{#AUTOGENERATED-moving-averages.}</div> + +Some training algorithms, such as GradientDescent and Momentum, often benefit +from maintaining a moving average of variables during optimization. Using the +moving averages for evaluations often improves results significantly. + +- - - + +### class tf.train.ExponentialMovingAverage <div class="md-anchor" id="ExponentialMovingAverage">{#ExponentialMovingAverage}</div> + +Maintains moving averages of variables by employing an exponential decay. + +When training a model, it is often beneficial to maintain moving averages of +the trained parameters. Evaluations that use averaged parameters sometimes +produce significantly better results than the final trained values. + +The `apply()` method adds shadow copies of trained variables and adds ops that +maintain a moving average of the trained variables in their shadow copies. +It is used when building the training model. The ops that maintain moving +averages are typically run after each training step.
+The `average()` and `average_name()` methods give access to the shadow
+variables and their names. They are useful when building an evaluation
+model, or when restoring a model from a checkpoint file. They help use the
+moving averages in place of the last trained values for evaluations.
+
+The moving averages are computed using exponential decay. You specify the
+decay value when creating the `ExponentialMovingAverage` object. The shadow
+variables are initialized with the same initial values as the trained
+variables. When you run the ops to maintain the moving averages, each
+shadow variable is updated with the formula:
+
+  `shadow_variable -= (1 - decay) * (shadow_variable - variable)`
+
+This is mathematically equivalent to the classic formula below, but the use
+of an `assign_sub` op (the `"-="` in the formula) allows concurrent lockless
+updates to the variables:
+
+  `shadow_variable = decay * shadow_variable + (1 - decay) * variable`
+
+Reasonable values for `decay` are close to 1.0, typically in the
+multiple-nines range: 0.999, 0.9999, etc.
+
+Example usage when creating a training model:
+
+```python
+# Create variables.
+var0 = tf.Variable(...)
+var1 = tf.Variable(...)
+# ... use the variables to build a training model...
+...
+# Create an op that applies the optimizer. This is what we usually
+# would use as a training op.
+opt_op = opt.minimize(my_loss, [var0, var1])
+
+# Create an ExponentialMovingAverage object
+ema = tf.train.ExponentialMovingAverage(decay=0.9999)
+
+# Create the shadow variables, and add ops to maintain moving averages
+# of var0 and var1.
+maintain_averages_op = ema.apply([var0, var1])
+
+# Create an op that will update the moving averages after each training
+# step. This is what we will use in place of the usual training op.
+with tf.control_dependencies([opt_op]):
+  training_op = tf.group(maintain_averages_op)
+
+...train the model by running training_op...
+```
+
+There are two ways to use the moving averages for evaluations:
+
+* Build a model that uses the shadow variables instead of the variables.
+  For this, use the `average()` method which returns the shadow variable
+  for a given variable.
+* Build a model normally but load the checkpoint files to evaluate by using
+  the shadow variable names. For this use the `average_name()` method. See
+  the [Saver class](train.md#Saver) for more information on restoring saved
+  variables.
+
+Example of restoring the shadow variable values:
+
+```python
+# Create a Saver that loads variables from their saved shadow values.
+shadow_var0_name = ema.average_name(var0)
+shadow_var1_name = ema.average_name(var1)
+saver = tf.train.Saver({shadow_var0_name: var0, shadow_var1_name: var1})
+saver.restore(...checkpoint filename...)
+# var0 and var1 now hold the moving average values
+```
+
+- - -
+
+#### tf.train.ExponentialMovingAverage.__init__(decay, num_updates=None, name='ExponentialMovingAverage') {#ExponentialMovingAverage.__init__}
+
+Creates a new ExponentialMovingAverage object.
+
+The `apply()` method has to be called to create shadow variables and add
+ops to maintain moving averages.
+
+The optional `num_updates` parameter allows one to tweak the decay rate
+dynamically. It is typical to pass the count of training steps, usually
+kept in a variable that is incremented at each step, in which case the
+decay rate is lower at the start of training. This makes moving averages
+move faster. If passed, the actual decay rate used is:
+
+  `min(decay, (1 + num_updates) / (10 + num_updates))`
+
+##### Args:
+
+
+* <b>decay</b>: Float. The decay to use.
+* <b>num_updates</b>: Optional count of number of updates applied to variables.
+* <b>name</b>: String. Optional prefix name to use for the name of ops added in
+    `apply()`.
+
+
+- - -
+
+#### tf.train.ExponentialMovingAverage.apply(var_list=None) {#ExponentialMovingAverage.apply}
+
+Maintains moving averages of variables.
+
+`var_list` must be a list of `Variable` or `Tensor` objects. This method
+creates shadow variables for all elements of `var_list`. Shadow variables
+for `Variable` objects are initialized to the variable's initial value.
+For `Tensor` objects, the shadow variables are initialized to 0.
+
+Shadow variables are created with `trainable=False` and added to the
+`GraphKeys.ALL_VARIABLES` collection. They will be returned by calls to
+`tf.all_variables()`.
+
+Returns an op that updates all shadow variables as described above.
+
+Note that `apply()` can be called multiple times with different lists of
+variables.
+
+##### Args:
+
+
+* <b>var_list</b>: A list of Variable or Tensor objects. The variables
+  and Tensors must be of types float32 or float64.
+
+##### Returns:
+
+  An Operation that updates the moving averages.
+
+##### Raises:
+
+
+* <b>TypeError</b>: If the arguments are not all float32 or float64.
+* <b>ValueError</b>: If the moving average of one of the variables is already
+  being computed.
+
+
+- - -
+
+#### tf.train.ExponentialMovingAverage.average_name(var) {#ExponentialMovingAverage.average_name}
+
+Returns the name of the `Variable` holding the average for `var`.
+
+The typical scenario for `ExponentialMovingAverage` is to compute moving
+averages of variables during training, and restore the variables from the
+computed moving averages during evaluations.
+
+To restore variables, you have to know the name of the shadow variables.
+That name and the original variable can then be passed to a `Saver()` object
+to restore the variable from the moving average value with:
+  `saver = tf.train.Saver({ema.average_name(var): var})`
+
+`average_name()` can be called whether or not `apply()` has been called.
+
+##### Args:
+
+
+* <b>var</b>: A `Variable` object.
+
+##### Returns:
+
+  A string: the name of the variable that will be used or was used
+  by the `ExponentialMovingAverage` class to hold the moving average of
+  `var`.
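As a plain-Python sanity check (no TensorFlow involved; the `decay`, `shadow`, and `variable` values below are made up for illustration), the two shadow-variable update formulas given earlier in this section are equivalent:

```python
# Plain-Python check that the assign_sub form of the update,
#   shadow -= (1 - decay) * (shadow - variable),
# matches the classic moving-average formula,
#   shadow = decay * shadow + (1 - decay) * variable.
decay = 0.9999
shadow, variable = 1.0, 5.0  # made-up values

# assign_sub form used by the maintenance ops
shadow_sub = shadow - (1 - decay) * (shadow - variable)
# classic exponential-moving-average form
shadow_classic = decay * shadow + (1 - decay) * variable

assert abs(shadow_sub - shadow_classic) < 1e-12
```

With `decay` close to 1.0, each update moves the shadow variable only a tiny fraction of the way toward the current variable value, which is why the averages change slowly.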
+
+
+- - -
+
+#### tf.train.ExponentialMovingAverage.average(var) {#ExponentialMovingAverage.average}
+
+Returns the `Variable` holding the average of `var`.
+
+##### Args:
+
+
+* <b>var</b>: A `Variable` object.
+
+##### Returns:
+
+  A `Variable` object or `None` if the moving average of `var`
+  is not maintained.
+
+
+
+
+## Coordinator and QueueRunner. <div class="md-anchor" id="AUTOGENERATED-coordinator-and-queuerunner.">{#AUTOGENERATED-coordinator-and-queuerunner.}</div>
+
+See [Threading and Queues](../../how_tos/threading_and_queues/index.md)
+for how to use threads and queues. For documentation on the Queue API,
+see [Queues](../../api_docs/python/io_ops.md#queues).
+
+- - -
+
+### class tf.train.Coordinator <div class="md-anchor" id="Coordinator">{#Coordinator}</div>
+
+A coordinator for threads.
+
+This class implements a simple mechanism to coordinate the termination of a
+set of threads.
+
+#### Usage:
+
+```python
+# Create a coordinator.
+coord = Coordinator()
+# Start a number of threads, passing the coordinator to each of them.
+...start thread 1...(coord, ...)
+...start thread N...(coord, ...)
+# Wait for all the threads to terminate.
+coord.join(threads)
+```
+
+Any of the threads can call `coord.request_stop()` to ask for all the threads
+to stop. To cooperate with the requests, each thread must check for
+`coord.should_stop()` on a regular basis. `coord.should_stop()` returns
+`True` as soon as `coord.request_stop()` has been called.
+
+A typical thread running with a Coordinator will do something like:
+
+```python
+while not coord.should_stop():
+  ...do some work...
+```
+
+#### Exception handling:
+
+A thread can report an exception to the Coordinator by passing it to
+`request_stop()`. The exception will be re-raised from the
+`coord.join()` call.
+
+Thread code:
+
+```python
+try:
+  while not coord.should_stop():
+    ...do some work...
+except Exception, e:
+  coord.request_stop(e)
+```
+
+Main code:
+
+```python
+try:
+  ...
+  coord = Coordinator()
+  # Start a number of threads, passing the coordinator to each of them.
+  ...start thread 1...(coord, ...)
+  ...start thread N...(coord, ...)
+  # Wait for all the threads to terminate.
+  coord.join(threads)
+except Exception, e:
+  ...exception that was passed to coord.request_stop()
+```
+
+#### Grace period for stopping:
+
+After a thread has called `coord.request_stop()` the other threads have a
+fixed time to stop. This is called the 'stop grace period' and defaults to 2
+minutes. If any of the threads is still alive after the grace period expires,
+`coord.join()` raises a RuntimeError reporting the laggards.
+
+```
+try:
+  ...
+  coord = Coordinator()
+  # Start a number of threads, passing the coordinator to each of them.
+  ...start thread 1...(coord, ...)
+  ...start thread N...(coord, ...)
+  # Wait for all the threads to terminate, give them 10s grace period
+  coord.join(threads, stop_grace_period_secs=10)
+except RuntimeError:
+  ...one of the threads took more than 10s to stop after request_stop()
+  ...was called.
+except Exception:
+  ...exception that was passed to coord.request_stop()
+```
+- - -
+
+#### tf.train.Coordinator.__init__() {#Coordinator.__init__}
+
+Create a new Coordinator.
+
+
+- - -
+
+#### tf.train.Coordinator.join(threads, stop_grace_period_secs=120) {#Coordinator.join}
+
+Wait for threads to terminate.
+
+Blocks until all 'threads' have terminated or request_stop() is called.
+
+After the threads stop, if an 'exc_info' was passed to request_stop, that
+exception is re-raised.
+
+Grace period handling: When request_stop() is called, threads are given
+'stop_grace_period_secs' seconds to terminate. If any of them is still
+alive after that period expires, a RuntimeError is raised. Note that if
+an 'exc_info' was passed to request_stop() then it is raised instead of
+that RuntimeError.
+
+##### Args:
+
+
+* <b>threads</b>: List of `threading.Thread` objects. The started threads to join.
+* <b>stop_grace_period_secs</b>: Number of seconds given to threads to stop after
+  request_stop() has been called.
+
+##### Raises:
+
+
+* <b>RuntimeError</b>: If any thread is still alive after request_stop()
+  is called and the grace period expires.
+
+
+- - -
+
+#### tf.train.Coordinator.request_stop(ex=None) {#Coordinator.request_stop}
+
+Request that the threads stop.
+
+After this is called, calls to should_stop() will return True.
+
+##### Args:
+
+
+* <b>ex</b>: Optional Exception, or Python 'exc_info' tuple as returned by
+  sys.exc_info(). If this is the first call to request_stop() the
+  corresponding exception is recorded and re-raised from join().
+
+
+- - -
+
+#### tf.train.Coordinator.should_stop() {#Coordinator.should_stop}
+
+Check if stop was requested.
+
+##### Returns:
+
+  True if a stop was requested.
+
+
+- - -
+
+#### tf.train.Coordinator.wait_for_stop(timeout=None) {#Coordinator.wait_for_stop}
+
+Wait till the Coordinator is told to stop.
+
+##### Args:
+
+
+* <b>timeout</b>: float. Sleep for up to that many seconds waiting for
+  should_stop() to become True.
+
+##### Returns:
+
+  True if the Coordinator is told to stop, False if the timeout expired.
+
+
+
+- - -
+
+### class tf.train.QueueRunner <div class="md-anchor" id="QueueRunner">{#QueueRunner}</div>
+
+Holds a list of enqueue operations for a queue, each to be run in a thread.
+
+Queues are a convenient TensorFlow mechanism to compute tensors
+asynchronously using multiple threads. For example, in the canonical 'Input
+Reader' setup one set of threads generates filenames in a queue; a second set
+of threads reads records from the files, processes them, and enqueues tensors
+on a second queue; a third set of threads dequeues these input records to
+construct batches and runs them through training operations.
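The multi-stage producer/consumer structure described above can be sketched with Python's standard `queue` and `threading` modules. This is a simplified, TensorFlow-free stand-in for the queue/thread setup that `QueueRunner` manages; all filenames and record contents here are illustrative:

```python
import queue
import threading

filename_q = queue.Queue()  # first queue: filenames
record_q = queue.Queue()    # second queue: parsed records

def filename_producer():
    # One set of threads generates filenames in a queue.
    for name in ['file-0', 'file-1', 'file-2']:
        filename_q.put(name)
    filename_q.put(None)  # sentinel: input exhausted

def record_reader():
    # A second set of threads reads records and enqueues them.
    while True:
        name = filename_q.get()
        if name is None:
            record_q.put(None)  # propagate exhaustion downstream
            break
        record_q.put('record-from-' + name)

threads = [threading.Thread(target=filename_producer),
           threading.Thread(target=record_reader)]
for t in threads:
    t.start()

# The "training" end dequeues records to construct a batch.
batch = []
while True:
    rec = record_q.get()
    if rec is None:
        break
    batch.append(rec)
for t in threads:
    t.join()
# batch == ['record-from-file-0', 'record-from-file-1', 'record-from-file-2']
```

Even this toy version has to propagate a "no more input" signal from queue to queue, which hints at the coordination problems the next paragraph describes.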
+
+There are several delicate issues when running multiple threads that way:
+closing the queues in sequence as the input is exhausted, correctly catching
+and reporting exceptions, etc.
+
+The `QueueRunner`, combined with the `Coordinator`, helps handle these issues.
+- - -
+
+#### tf.train.QueueRunner.__init__(queue, enqueue_ops) {#QueueRunner.__init__}
+
+Create a QueueRunner.
+
+On construction the `QueueRunner` adds an op to close the queue. That op
+will be run if the enqueue ops raise exceptions.
+
+When you later call the `create_threads()` method, the `QueueRunner` will
+create one thread for each op in `enqueue_ops`. Each thread will run its
+enqueue op in parallel with the other threads. The enqueue ops do not have
+to all be the same op, but it is expected that they all enqueue tensors in
+`queue`.
+
+##### Args:
+
+
+* <b>queue</b>: A `Queue`.
+* <b>enqueue_ops</b>: List of enqueue ops to run in threads later.
+
+
+- - -
+
+#### tf.train.QueueRunner.create_threads(sess, coord=None, daemon=False, start=False) {#QueueRunner.create_threads}
+
+Create threads to run the enqueue ops.
+
+This method requires a session in which the graph was launched. It creates
+a list of threads, optionally starting them. There is one thread for each
+op passed in `enqueue_ops`.
+
+The `coord` argument is an optional coordinator that the threads will use
+to terminate together and report exceptions. If a coordinator is given,
+this method starts an additional thread to close the queue when the
+coordinator requests a stop.
+
+This method may be called again as long as all threads from a previous call
+have stopped.
+
+##### Args:
+
+
+* <b>sess</b>: A `Session`.
+* <b>coord</b>: Optional `Coordinator` object for reporting errors and checking
+  stop conditions.
+* <b>daemon</b>: Boolean. If `True`, make the threads daemon threads.
+* <b>start</b>: Boolean. If `True`, starts the threads. If `False` the
+  caller must call the `start()` method of the returned threads.
+ +##### Returns: + + A list of threads. + +##### Raises: + + +* <b>RuntimeError</b>: If threads from a previous call to `create_threads()` are + still running. + + +- - - + +#### tf.train.QueueRunner.exceptions_raised {#QueueRunner.exceptions_raised} + +Exceptions raised but not handled by the `QueueRunner` threads. + +Exceptions raised in queue runner threads are handled in one of two ways +depending on whether or not a `Coordinator` was passed to +`create_threads()`: + +* With a `Coordinator`, exceptions are reported to the coordinator and + forgotten by the `QueueRunner`. +* Without a `Coordinator`, exceptions are captured by the `QueueRunner` and + made available in this `exceptions_raised` property. + +##### Returns: + + A list of Python `Exception` objects. The list is empty if no exception + was captured. (No exceptions are captured when using a Coordinator.) + + +- - - + +### tf.train.add_queue_runner(qr, collection='queue_runners') <div class="md-anchor" id="add_queue_runner">{#add_queue_runner}</div> + +Adds a `QueueRunner` to a collection in the graph. + +When building a complex model that uses many queues it is often difficult to +gather all the queue runners that need to be run. This convenience function +allows you to add a queue runner to a well known collection in the graph. + +The companion method `start_queue_runners()` can be used to start threads for +all the collected queue runners. + +##### Args: + + +* <b>qr</b>: A `QueueRunner`. +* <b>collection</b>: A `GraphKey` specifying the graph collection to add + the queue runner to. Defaults to `GraphKeys.QUEUE_RUNNERS`. + + +- - - + +### tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners') <div class="md-anchor" id="start_queue_runners">{#start_queue_runners}</div> + +Starts all queue runners collected in the graph. + +This is a companion method to `add_queue_runner()`. It just starts +threads for all queue runners collected in the graph. 
It returns +the list of all threads. + +##### Args: + + +* <b>sess</b>: `Session` used to run the queue ops. Defaults to the + default session. +* <b>coord</b>: Optional `Coordinator` for coordinating the started threads. +* <b>daemon</b>: Whether the threads should be marked as `daemons`, meaning + they don't block program exit. +* <b>start</b>: Set to `False` to only create the threads, not start them. +* <b>collection</b>: A `GraphKey` specifying the graph collection to + get the queue runners from. Defaults to `GraphKeys.QUEUE_RUNNERS`. + +##### Returns: + + A list of threads. + + + +## Summary Operations. <div class="md-anchor" id="AUTOGENERATED-summary-operations.">{#AUTOGENERATED-summary-operations.}</div> + +The following ops output +[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto) +protocol buffers as serialized string tensors. + +You can fetch the output of a summary op in a session, and pass it to a +[SummaryWriter](train.md#SummaryWriter) to append it to an event file. You can +then use TensorBoard to visualize the contents of the event files. See +[TensorBoard and Summaries](../../how_tos/summaries_and_tensorboard/index.md) +for more details. + +- - - + +### tf.scalar_summary(tags, values, collections=None, name=None) <div class="md-anchor" id="scalar_summary">{#scalar_summary}</div> + +Outputs a `Summary` protocol buffer with scalar values. + +The input `tags` and `values` must have the same shape. The generated +summary has a summary value for each tag-value pair in `tags` and `values`. + +##### Args: + + +* <b>tags</b>: A 1-D `string` `Tensor`. Tags for the summaries. +* <b>values</b>: A 1-D `float32` or `float64` Tensor. Values for the summaries. +* <b>collections</b>: Optional list of graph collections keys. The new summary op is + added to these collections. Defaults to `[GraphKeys.SUMMARIES]`. +* <b>name</b>: A name for the operation (optional). 
+ +##### Returns: + + A scalar `Tensor` of type `string`. The serialized `Summary` protocol + buffer. + + +- - - + +### tf.image_summary(tag, tensor, max_images=None, collections=None, name=None) <div class="md-anchor" id="image_summary">{#image_summary}</div> + +Outputs a `Summary` protocol buffer with images. + +The summary has up to `max_images` summary values containing images. The +images are built from `tensor` which must be 4-D with shape `[batch_size, +height, width, channels]` and where `channels` can be: + +* 1: `tensor` is interpreted as Grayscale. +* 3: `tensor` is interpreted as RGB. +* 4: `tensor` is interpreted as RGBA. + +The images have the same number of channels as the input tensor. Their values +are normalized, one image at a time, to fit in the range `[0, 255]`. The +op uses two different normalization algorithms: + +* If the input values are all positive, they are rescaled so the largest one + is 255. + +* If any input value is negative, the values are shifted so input value 0.0 + is at 127. They are then rescaled so that either the smallest value is 0, + or the largest one is 255. + +The `tag` argument is a scalar `Tensor` of type `string`. It is used to +build the `tag` of the summary values: + +* If `max_images` is 1, the summary value tag is '*tag*/image'. +* If `max_images` is greater than 1, the summary value tags are + generated sequentially as '*tag*/image/0', '*tag*/image/1', etc. + +##### Args: + + +* <b>tag</b>: A scalar `Tensor` of type `string`. Used to build the `tag` + of the summary values. +* <b>tensor</b>: A 4-D `float32` `Tensor` of shape `[batch_size, height, width, + channels]` where `channels` is 1, 3, or 4. +* <b>max_images</b>: Max number of batch elements to generate images for. +* <b>collections</b>: Optional list of ops.GraphKeys. The collections to add the + summary to. Defaults to [ops.GraphKeys.SUMMARIES] +* <b>name</b>: A name for the operation (optional). + +##### Returns: + + A scalar `Tensor` of type `string`. 
The serialized `Summary` protocol
+  buffer.
+
+
+- - -
+
+### tf.histogram_summary(tag, values, collections=None, name=None) <div class="md-anchor" id="histogram_summary">{#histogram_summary}</div>
+
+Outputs a `Summary` protocol buffer with a histogram.
+
+The generated
+[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
+has one summary value containing a histogram for `values`.
+
+This op reports an `OutOfRange` error if any value is not finite.
+
+##### Args:
+
+
+* <b>tag</b>: A `string` `Tensor`. 0-D. Tag to use for the summary value.
+* <b>values</b>: A `float32` `Tensor`. Any shape. Values to use to build the
+  histogram.
+* <b>collections</b>: Optional list of graph collections keys. The new summary op is
+  added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
+* <b>name</b>: A name for the operation (optional).
+
+##### Returns:
+
+  A scalar `Tensor` of type `string`. The serialized `Summary` protocol
+  buffer.
+
+
+- - -
+
+### tf.nn.zero_fraction(value, name=None) <div class="md-anchor" id="zero_fraction">{#zero_fraction}</div>
+
+Returns the fraction of zeros in `value`.
+
+If `value` is empty, the result is `nan`.
+
+This is useful in summaries to measure and report sparsity. For example,
+
+    z = tf.nn.relu(...)
+    summ = tf.scalar_summary('sparsity', tf.nn.zero_fraction(z))
+
+##### Args:
+
+
+* <b>value</b>: A tensor of numeric type.
+* <b>name</b>: A name for the operation (optional).
+
+##### Returns:
+
+  The fraction of zeros in `value`, with type `float32`.
+
+
+
+- - -
+
+### tf.merge_summary(inputs, collections=None, name=None) <div class="md-anchor" id="merge_summary">{#merge_summary}</div>
+
+Merges summaries.
+
+This op creates a
+[`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
+protocol buffer that contains the union of all the values in the input
+summaries.
+
+When the Op is run, it reports an `InvalidArgument` error if multiple values
+in the summaries to merge use the same tag.
+
+##### Args:
+
+
+* <b>inputs</b>: A list of `string` `Tensor` objects containing serialized `Summary`
+  protocol buffers.
+* <b>collections</b>: Optional list of graph collections keys. The new summary op is
+  added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
+* <b>name</b>: A name for the operation (optional).
+
+##### Returns:
+
+  A scalar `Tensor` of type `string`. The serialized `Summary` protocol
+  buffer resulting from the merging.
+
+
+- - -
+
+### tf.merge_all_summaries(key='summaries') <div class="md-anchor" id="merge_all_summaries">{#merge_all_summaries}</div>
+
+Merges all summaries collected in the default graph.
+
+##### Args:
+
+
+* <b>key</b>: `GraphKey` used to collect the summaries. Defaults to
+  `GraphKeys.SUMMARIES`.
+
+##### Returns:
+
+  If no summaries were collected, returns None. Otherwise returns a scalar
+  `Tensor` of type `string` containing the serialized `Summary` protocol
+  buffer resulting from the merging.
+
+
+
+## Adding Summaries to Event Files. <div class="md-anchor" id="AUTOGENERATED-adding-summaries-to-event-files.">{#AUTOGENERATED-adding-summaries-to-event-files.}</div>
+
+See [Summaries and
+TensorBoard](../../how_tos/summaries_and_tensorboard/index.md) for an
+overview of summaries, event files, and visualization in TensorBoard.
+
+- - -
+
+### class tf.train.SummaryWriter <div class="md-anchor" id="SummaryWriter">{#SummaryWriter}</div>
+
+Writes `Summary` protocol buffers to event files.
+
+The `SummaryWriter` class provides a mechanism to create an event file in a
+given directory and add summaries and events to it. The class updates the
+file contents asynchronously. This allows a training program to call methods
+to add data to the file directly from the training loop, without slowing down
+training.
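The asynchronous-update pattern described above, a bounded queue drained by a background thread, can be sketched in plain Python. This is an illustration of the general design only, not the actual `SummaryWriter` implementation; the `AsyncWriter` class and its in-memory "file" are hypothetical:

```python
import queue
import threading

class AsyncWriter(object):
    """Toy asynchronous writer: add() is cheap, and a background
    thread drains the bounded queue into the 'file' (a list here)."""

    def __init__(self, max_queue=10):
        self._queue = queue.Queue(maxsize=max_queue)  # add() blocks when full
        self._events = []  # stands in for the event file on disk
        self._worker = threading.Thread(target=self._drain)
        self._worker.start()

    def add(self, event):
        self._queue.put(event)  # returns quickly unless the queue is full

    def _drain(self):
        while True:
            event = self._queue.get()
            if event is None:  # sentinel sent by close()
                return
            self._events.append(event)

    def close(self):
        self._queue.put(None)
        self._worker.join()
        return self._events

w = AsyncWriter(max_queue=10)
for step in range(3):
    w.add('event-%d' % step)  # the training loop is never blocked on disk I/O
events = w.close()
# events == ['event-0', 'event-1', 'event-2']
```

The bounded queue is the design choice to note: it keeps the training loop fast in the common case while capping memory use, at the cost of an `add` call blocking if the writer falls behind.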
+
+- - -
+
+#### tf.train.SummaryWriter.__init__(logdir, graph_def=None, max_queue=10, flush_secs=120) {#SummaryWriter.__init__}
+
+Creates a `SummaryWriter` and an event file.
+
+On construction the summary writer creates a new event file in `logdir`.
+This event file will contain `Event` protocol buffers constructed when you
+call one of the following functions: `add_summary()`, `add_event()`, or
+`add_graph()`.
+
+If you pass a `graph_def` protocol buffer to the constructor it is added to
+the event file. (This is equivalent to calling `add_graph()` later).
+
+TensorBoard will pick the graph from the file and display it graphically so
+you can interactively explore the graph you built. You will usually pass
+the graph from the session in which you launched it:
+
+```python
+...create a graph...
+# Launch the graph in a session.
+sess = tf.Session()
+# Create a summary writer, add the 'graph_def' to the event file.
+writer = tf.train.SummaryWriter(<some-directory>, sess.graph_def)
+```
+
+The other arguments to the constructor control the asynchronous writes to
+the event file:
+
+* `flush_secs`: How often, in seconds, to flush the added summaries
+  and events to disk.
+* `max_queue`: Maximum number of summaries or events pending to be
+  written to disk before one of the 'add' calls blocks.
+
+##### Args:
+
+
+* <b>logdir</b>: A string. Directory where event file will be written.
+* <b>graph_def</b>: A `GraphDef` protocol buffer.
+* <b>max_queue</b>: Integer. Size of the queue for pending events and summaries.
+* <b>flush_secs</b>: Number. How often, in seconds, to flush the
+  pending events and summaries to disk.
+
+
+
+- - -
+
+#### tf.train.SummaryWriter.add_summary(summary, global_step=None) {#SummaryWriter.add_summary}
+
+Adds a `Summary` protocol buffer to the event file.
+
+This method wraps the provided summary in an `Event` protocol buffer
+and adds it to the event file.
+
+You can pass the output of any summary op, as-is, to this function.
You
+can also pass a `Summary` protocol buffer that you manufacture with your
+own data. This is commonly done to report evaluation results in event
+files.
+
+##### Args:
+
+
+* <b>summary</b>: A `Summary` protocol buffer, optionally serialized as a string.
+* <b>global_step</b>: Number. Optional global step value to record with the
+  summary.
+
+
+- - -
+
+#### tf.train.SummaryWriter.add_event(event) {#SummaryWriter.add_event}
+
+Adds an event to the event file.
+
+##### Args:
+
+
+* <b>event</b>: An `Event` protocol buffer.
+
+
+- - -
+
+#### tf.train.SummaryWriter.add_graph(graph_def, global_step=None) {#SummaryWriter.add_graph}
+
+Adds a `GraphDef` protocol buffer to the event file.
+
+The graph described by the protocol buffer will be displayed by
+TensorBoard. Most users pass a graph in the constructor instead.
+
+##### Args:
+
+
+* <b>graph_def</b>: A `GraphDef` protocol buffer.
+* <b>global_step</b>: Number. Optional global step counter to record with the
+  graph.
+
+
+
+- - -
+
+#### tf.train.SummaryWriter.flush() {#SummaryWriter.flush}
+
+Flushes the event file to disk.
+
+Call this method to make sure that all pending events have been written to
+disk.
+
+
+- - -
+
+#### tf.train.SummaryWriter.close() {#SummaryWriter.close}
+
+Flushes the event file to disk and closes the file.
+
+Call this method when you do not need the summary writer anymore.
+
+
+
+- - -
+
+### tf.train.summary_iterator(path) <div class="md-anchor" id="summary_iterator">{#summary_iterator}</div>
+
+An iterator for reading `Event` protocol buffers from an event file.
+
+You can use this function to read events written to an event file. It returns
+a Python iterator that yields `Event` protocol buffers.
+
+Example: Print the contents of an events file.
+
+```python
+for e in tf.train.summary_iterator(path to events file):
+    print e
+```
+
+Example: Print selected summary values.
+
+```python
+# This example supposes that the events file contains summaries with a
+# summary value tag 'loss'.
These could have been added by calling
+# `add_summary()`, passing the output of a scalar summary op created with:
+# `tf.scalar_summary(['loss'], loss_tensor)`.
+for e in tf.train.summary_iterator(path to events file):
+    for v in e.summary.value:
+        if v.tag == 'loss':
+            print v.simple_value
+```
+
+See the protocol buffer definitions of
+[Event](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto)
+and
+[Summary](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
+for more information about their attributes.
+
+##### Args:
+
+
+* <b>path</b>: The path to an event file created by a `SummaryWriter`.
+
+##### Yields:
+
+  `Event` protocol buffers.
+
+
+
+## Training utilities. <div class="md-anchor" id="AUTOGENERATED-training-utilities.">{#AUTOGENERATED-training-utilities.}</div>
+
+- - -
+
+### tf.train.global_step(sess, global_step_tensor) <div class="md-anchor" id="global_step">{#global_step}</div>
+
+Small helper to get the global step.
+
+```python
+# Creates a variable to hold the global_step.
+global_step_tensor = tf.Variable(10, trainable=False, name='global_step')
+# Creates a session.
+sess = tf.Session()
+# Initializes the variable.
+sess.run(global_step_tensor.initializer)
+print 'global_step:', tf.train.global_step(sess, global_step_tensor)
+
+global_step: 10
+```
+
+##### Args:
+
+
+* <b>sess</b>: A TensorFlow `Session` object.
+* <b>global_step_tensor</b>: `Tensor` or the `name` of the operation that contains
+  the global step.
+
+##### Returns:
+
+  The global step value.
+
+
+- - -
+
+### tf.train.write_graph(graph_def, logdir, name, as_text=True) <div class="md-anchor" id="write_graph">{#write_graph}</div>
+
+Writes a graph proto on disk.
+
+The graph is written as a binary proto unless `as_text` is `True`.
+
+```python
+v = tf.Variable(0, name='my_variable')
+sess = tf.Session()
+tf.train.write_graph(sess.graph_def, '/tmp/my-model', 'train.pbtxt')
+```
+
+##### Args:
+
+
+* <b>graph_def</b>: A `GraphDef` protocol buffer.
+* <b>logdir</b>: Directory where to write the graph.
+* <b>name</b>: Filename for the graph.
+* <b>as_text</b>: If `True`, writes the graph as an ASCII proto.