author    A. Unique TensorFlower <gardener@tensorflow.org>  2016-12-04 21:05:14 -0800
committer TensorFlower Gardener <gardener@tensorflow.org>  2016-12-04 21:23:37 -0800
commit    70a517936a16e95b5521ef2458fd35b23658f6bc (patch)
tree      2e94944bbba8d0331f6079bc98f6e0d9fcca9e8d /tensorflow/g3doc
parent    c5e21b06986d717f3fde68e3664ebc650d934377 (diff)
Update generated Python Op docs.
Change: 141011628
Diffstat (limited to 'tensorflow/g3doc')
-rw-r--r--  tensorflow/g3doc/api_docs/python/contrib.legacy_seq2seq.md | 578
-rw-r--r--  tensorflow/g3doc/api_docs/python/contrib.rnn.md | 2
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.attention_decoder.md | 60
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_decoder.md | 48
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq.md | 46
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.sequence_loss.md | 26
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_decoder.md | 52
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_seq2seq.md | 50
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.rnn_decoder.md | 31
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.basic_rnn_seq2seq.md | 26
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.tied_rnn_seq2seq.md | 30
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq.md | 43
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.contrib.legacy_seq2seq.sequence_loss_by_example.md | 25
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard7/tf.contrib.legacy_seq2seq.model_with_buckets.md | 42
-rw-r--r--  tensorflow/g3doc/api_docs/python/functions_and_classes/shard9/tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq.md | 53
-rw-r--r--  tensorflow/g3doc/api_docs/python/index.md | 30
-rw-r--r--  tensorflow/g3doc/api_docs/python/rnn_cell.md | 851
17 files changed, 1127 insertions, 866 deletions
diff --git a/tensorflow/g3doc/api_docs/python/contrib.legacy_seq2seq.md b/tensorflow/g3doc/api_docs/python/contrib.legacy_seq2seq.md
new file mode 100644
index 0000000000..43c051d9ec
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/contrib.legacy_seq2seq.md
@@ -0,0 +1,578 @@
+<!-- This file is machine generated: DO NOT EDIT! -->
+
+# Sequence to Sequence (contrib)
+[TOC]
+
+Deprecated library for creating sequence-to-sequence models in TensorFlow.
+
+- - -
+
+### `tf.contrib.legacy_seq2seq.attention_decoder(decoder_inputs, initial_state, attention_states, cell, output_size=None, num_heads=1, loop_function=None, dtype=None, scope=None, initial_state_attention=False)` {#attention_decoder}
+
+RNN decoder with attention for the sequence-to-sequence model.
+
+In this context "attention" means that, during decoding, the RNN can look up
+information in the additional tensor attention_states, and it does this by
+focusing on a few entries from the tensor. This model has proven to yield
+especially good results in a number of sequence-to-sequence tasks. This
+implementation is based on http://arxiv.org/abs/1412.7449 (see below for
+details). It is recommended for complex sequence-to-sequence tasks.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`attention_states`</b>: 3D Tensor [batch_size x attn_length x attn_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`output_size`</b>: Size of the output vectors; if None, we use cell.output_size.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`loop_function`</b>: If not None, this function will be applied to the i-th output
+ in order to generate the (i+1)-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol). This can be used for decoding,
+ but also for training to emulate http://arxiv.org/abs/1506.03099.
+ Signature -- loop_function(prev, i) = next
+ * prev is a 2D Tensor of shape [batch_size x output_size],
+ * i is an integer, the step number (when advanced control is needed),
+ * next is a 2D Tensor of shape [batch_size x input_size].
+* <b>`dtype`</b>: The dtype to use for the RNN initial state (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "attention_decoder".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states -- useful when we wish to resume decoding from a previously
+ stored decoder state and attention states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors of
+ shape [batch_size x output_size]. These represent the generated outputs.
+ Output i is computed from input i (which is either the i-th element
+ of decoder_inputs or loop_function(output_{i-1}, i)) as follows.
+ First, we run the cell on a combination of the input and previous
+ attention masks:
+ cell_output, new_state = cell(linear(input, prev_attn), prev_state).
+ Then, we calculate new attention masks:
+ new_attn = softmax(V^T * tanh(W * attention_states + U * new_state))
+ and then we calculate the output:
+ output = linear(cell_output, new_attn).
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: when num_heads is not positive, there are no inputs, shapes
+ of attention_states are not set, or input size cannot be inferred
+ from the input.
+
+
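+The loop_function contract above can be satisfied by a simple learned
+projection. A minimal sketch (an illustration with hypothetical variable
+names, assuming the TF 1.x-era graph API):
+
+```python
+import tensorflow as tf
+
+output_size, input_size = 24, 16
+# Hypothetical projection parameters mapping outputs back to inputs.
+W_proj = tf.get_variable("W_proj", [output_size, input_size])
+b_proj = tf.get_variable("b_proj", [input_size])
+
+def loop_function(prev, i):
+  # prev: 2D Tensor [batch_size x output_size], the step-(i-1) output.
+  # Returns a 2D Tensor [batch_size x input_size], fed as the step-i input.
+  return tf.matmul(prev, W_proj) + b_proj
+```
+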
+- - -
+
+### `tf.contrib.legacy_seq2seq.basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, dtype=tf.float32, scope=None)` {#basic_rnn_seq2seq}
+
+Basic RNN sequence-to-sequence model.
+
+This model first runs an RNN to encode encoder_inputs into a state vector,
+then runs decoder, initialized with the last encoder state, on decoder_inputs.
+Encoder and decoder use the same RNN cell type, but don't share parameters.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`dtype`</b>: The dtype of the initial state of the RNN cell (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "basic_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell in the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+
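+A minimal usage sketch, assuming GRUCell is available under tf.contrib.rnn
+(as in contemporary releases) and illustrative sizes:
+
+```python
+import tensorflow as tf
+
+batch_size, input_size, steps = 32, 16, 5
+cell = tf.contrib.rnn.GRUCell(24)
+encoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
+                  for _ in range(steps)]
+decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
+                  for _ in range(steps)]
+# outputs: list of `steps` Tensors [batch_size x 24];
+# state: 2D Tensor [batch_size x 24].
+outputs, state = tf.contrib.legacy_seq2seq.basic_rnn_seq2seq(
+    encoder_inputs, decoder_inputs, cell)
+```
+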
+- - -
+
+### `tf.contrib.legacy_seq2seq.embedding_attention_decoder(decoder_inputs, initial_state, attention_states, cell, num_symbols, embedding_size, num_heads=1, output_size=None, output_projection=None, feed_previous=False, update_embedding_for_previous=True, dtype=None, scope=None, initial_state_attention=False)` {#embedding_attention_decoder}
+
+RNN decoder with embedding and attention and a pure-decoding option.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 1D batch-sized int32 Tensors (decoder inputs).
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`attention_states`</b>: 3D Tensor [batch_size x attn_length x attn_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function.
+* <b>`num_symbols`</b>: Integer, how many symbols come into the embedding.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`output_size`</b>: Size of the output vectors; if None, use cell.output_size.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has shape
+ [num_symbols]; if provided and feed_previous=True, each fed previous
+ output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean; if True, only the first of decoder_inputs will be
+ used (the "GO" symbol), and all other decoder inputs will be generated by:
+ next = embedding_lookup(embedding, argmax(previous_output)).
+ In effect, this implements a greedy decoder. It can also be used
+ during training to emulate http://arxiv.org/abs/1506.03099.
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`update_embedding_for_previous`</b>: Boolean; if False and feed_previous=True,
+ only the embedding for the first symbol of decoder_inputs (the "GO"
+ symbol) will be updated by back propagation. Embeddings for the symbols
+ generated from the decoder itself remain unchanged. This parameter has
+ no effect if feed_previous=False.
+* <b>`dtype`</b>: The dtype to use for the RNN initial states (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_attention_decoder".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states -- useful when we wish to resume decoding from a previously
+ stored decoder state and attention states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
+
+- - -
+
+### `tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(encoder_inputs, decoder_inputs, cell, num_encoder_symbols, num_decoder_symbols, embedding_size, num_heads=1, output_projection=None, feed_previous=False, dtype=None, scope=None, initial_state_attention=False)` {#embedding_attention_seq2seq}
+
+Embedding sequence-to-sequence model with attention.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_encoder_symbols x input_size]). Then it runs an RNN to encode
+embedded encoder_inputs into a state vector. It keeps the outputs of this
+RNN at every step to use for attention later. Next, it embeds decoder_inputs
+by another newly created embedding (of shape [num_decoder_symbols x
+input_size]). Then it runs attention decoder, initialized with the last
+encoder state, on embedded decoder_inputs and attending to encoder outputs.
+
+Warning: when output_projection is None, the size of the attention vectors
+and variables will be made proportional to num_decoder_symbols, which can be
+large.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols`</b>: Integer; number of symbols on the decoder side.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_decoder_symbols] and B has
+ shape [num_decoder_symbols]; if provided and feed_previous=True, each
+ fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial RNN state (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_attention_seq2seq".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x num_decoder_symbols] containing the generated
+ outputs.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+
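+A hedged usage sketch with illustrative vocabulary sizes (the cell class
+location under tf.contrib.rnn is an assumption):
+
+```python
+import tensorflow as tf
+
+steps = 10
+cell = tf.contrib.rnn.GRUCell(128)
+encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+# With output_projection=None, each output is a [batch_size x 8000]
+# Tensor over the decoder vocabulary.
+outputs, state = tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
+    encoder_inputs, decoder_inputs, cell,
+    num_encoder_symbols=10000, num_decoder_symbols=8000,
+    embedding_size=64)
+```
+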
+- - -
+
+### `tf.contrib.legacy_seq2seq.embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols, embedding_size, output_projection=None, feed_previous=False, update_embedding_for_previous=True, scope=None)` {#embedding_rnn_decoder}
+
+RNN decoder with embedding and a pure-decoding option.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 1D batch-sized int32 Tensors (decoder inputs).
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function.
+* <b>`num_symbols`</b>: Integer, how many symbols come into the embedding.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has
+ shape [num_symbols]; if provided and feed_previous=True, each fed
+ previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean; if True, only the first of decoder_inputs will be
+ used (the "GO" symbol), and all other decoder inputs will be generated by:
+ next = embedding_lookup(embedding, argmax(previous_output)).
+ In effect, this implements a greedy decoder. It can also be used
+ during training to emulate http://arxiv.org/abs/1506.03099.
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`update_embedding_for_previous`</b>: Boolean; if False and feed_previous=True,
+ only the embedding for the first symbol of decoder_inputs (the "GO"
+ symbol) will be updated by back propagation. Embeddings for the symbols
+ generated from the decoder itself remain unchanged. This parameter has
+ no effect if feed_previous=False.
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_rnn_decoder".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors. The
+ output is of shape [batch_size x cell.output_size] when
+ output_projection is not None (and represents the dense representation
+ of predicted tokens). It is of shape [batch_size x num_decoder_symbols]
+ when output_projection is None.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
+
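+A sketch of greedy decoding with an output_projection pair (W, B) shaped as
+described in Args (sizes are illustrative; cell class location is an
+assumption):
+
+```python
+import tensorflow as tf
+
+batch_size, num_symbols, embedding_size, steps = 32, 8000, 64, 5
+cell = tf.contrib.rnn.GRUCell(128)
+W = tf.get_variable("proj_w", [cell.output_size, num_symbols])
+B = tf.get_variable("proj_b", [num_symbols])
+initial_state = cell.zero_state(batch_size, tf.float32)
+decoder_inputs = [tf.placeholder(tf.int32, [batch_size])
+                  for _ in range(steps)]
+# feed_previous=True: greedy decoding from the "GO" symbol onward;
+# each previous output is projected by (W, B) before the argmax.
+outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_decoder(
+    decoder_inputs, initial_state, cell, num_symbols, embedding_size,
+    output_projection=(W, B), feed_previous=True)
+```
+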
+- - -
+
+### `tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, num_encoder_symbols, num_decoder_symbols, embedding_size, output_projection=None, feed_previous=False, dtype=None, scope=None)` {#embedding_rnn_seq2seq}
+
+Embedding RNN sequence-to-sequence model.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_encoder_symbols x input_size]). Then it runs an RNN to encode
+embedded encoder_inputs into a state vector. Next, it embeds decoder_inputs
+by another newly created embedding (of shape [num_decoder_symbols x
+input_size]). Then it runs RNN decoder, initialized with the last
+encoder state, on embedded decoder_inputs.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols`</b>: Integer; number of symbols on the decoder side.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_decoder_symbols] and B has
+ shape [num_decoder_symbols]; if provided and feed_previous=True, each
+ fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial state for both the encoder and decoder
+ rnn cells (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors. The
+ output is of shape [batch_size x cell.output_size] when
+ output_projection is not None (and represents the dense representation
+ of predicted tokens). It is of shape [batch_size x num_decoder_symbols]
+ when output_projection is None.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
+
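+Because feed_previous may be a scalar Boolean Tensor, one graph can serve
+both teacher-forced training and greedy decoding. A hedged sketch with
+illustrative sizes:
+
+```python
+import tensorflow as tf
+
+steps = 10
+feed_previous = tf.placeholder(tf.bool, [])  # flip per session.run call
+cell = tf.contrib.rnn.GRUCell(128)
+encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
+    encoder_inputs, decoder_inputs, cell,
+    num_encoder_symbols=10000, num_decoder_symbols=10000,
+    embedding_size=64, feed_previous=feed_previous)
+```
+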
+- - -
+
+### `tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, num_symbols, embedding_size, num_decoder_symbols=None, output_projection=None, feed_previous=False, dtype=None, scope=None)` {#embedding_tied_rnn_seq2seq}
+
+Embedding RNN sequence-to-sequence model with tied (shared) parameters.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_symbols x input_size]). Then it runs an RNN to encode embedded
+encoder_inputs into a state vector. Next, it embeds decoder_inputs using
+the same embedding. Then it runs RNN decoder, initialized with the last
+encoder state, on embedded decoder_inputs. The decoder output is over symbols
+from 0 to num_decoder_symbols - 1 if num_decoder_symbols is provided;
+otherwise it is over symbols 0 to num_symbols - 1.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_symbols`</b>: Integer; number of symbols for both encoder and decoder.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_decoder_symbols`</b>: Integer; number of output symbols for decoder. If
+ provided, the decoder output is over symbols 0 to num_decoder_symbols - 1.
+ Otherwise, decoder output is over symbols 0 to num_symbols - 1. Note that
+ this assumes that the vocabulary is set up such that the first
+ num_decoder_symbols of num_symbols are part of decoding.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has
+ shape [num_symbols]; if provided and feed_previous=True, each
+ fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype to use for the initial RNN states (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_tied_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_symbols] containing the generated outputs,
+ where output_symbols = num_decoder_symbols if num_decoder_symbols is
+ not None, and output_symbols = num_symbols otherwise.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
+
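+A minimal sketch of the tied-embedding variant; a single vocabulary of
+num_symbols serves both sides (sizes are illustrative):
+
+```python
+import tensorflow as tf
+
+steps, num_symbols = 8, 5000
+cell = tf.contrib.rnn.GRUCell(96)
+encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+# One shared [num_symbols x embedding_size] embedding is used for both
+# encoder and decoder inputs.
+outputs, state = tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq(
+    encoder_inputs, decoder_inputs, cell, num_symbols, embedding_size=48)
+```
+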
+- - -
+
+### `tf.contrib.legacy_seq2seq.model_with_buckets(encoder_inputs, decoder_inputs, targets, weights, buckets, seq2seq, softmax_loss_function=None, per_example_loss=False, name=None)` {#model_with_buckets}
+
+Create a sequence-to-sequence model with support for bucketing.
+
+The seq2seq argument is a function that defines a sequence-to-sequence model,
+e.g., seq2seq = lambda x, y: basic_rnn_seq2seq(x, y, rnn_cell.GRUCell(24))
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of Tensors to feed the encoder; first seq2seq input.
+* <b>`decoder_inputs`</b>: A list of Tensors to feed the decoder; second seq2seq input.
+* <b>`targets`</b>: A list of 1D batch-sized int32 Tensors (desired output sequence).
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors to weight the targets.
+* <b>`buckets`</b>: A list of pairs of (input size, output size) for each bucket.
+* <b>`seq2seq`</b>: A sequence-to-sequence model function; it takes two inputs that
+ agree with encoder_inputs and decoder_inputs, and returns a pair
+ consisting of outputs and states (as, e.g., basic_rnn_seq2seq).
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`per_example_loss`</b>: Boolean. If set, the returned loss will be a batch-sized
+ tensor of losses for each sequence in the batch. If unset, it will be
+ a scalar with the averaged loss from all examples.
+* <b>`name`</b>: Optional name for this operation, defaults to "model_with_buckets".
+
+##### Returns:
+
+ A tuple of the form (outputs, losses), where:
+
+* <b>`outputs`</b>: The outputs for each bucket. Its j'th element consists of a list
+ of 2D Tensors. The shape of output tensors can be either
+ [batch_size x output_size] or [batch_size x num_decoder_symbols]
+ depending on the seq2seq model used.
+* <b>`losses`</b>: List of scalar Tensors, representing losses for each bucket, or,
+ if per_example_loss is set, a list of 1D batch-sized float Tensors.
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If the length of encoder_inputs, targets, or weights is
+ smaller than the largest (last) bucket.
+
+
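+A hedged sketch of bucketed training: the input lists must be at least as
+long as the largest (last) bucket, and the seq2seq argument follows the
+lambda form above (sizes are illustrative):
+
+```python
+import tensorflow as tf
+
+buckets = [(5, 10), (10, 15)]  # (encoder length, decoder length) pairs
+max_in, max_out = buckets[-1]
+encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_in)]
+decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(max_out)]
+targets = [tf.placeholder(tf.int32, [None]) for _ in range(max_out)]
+weights = [tf.placeholder(tf.float32, [None]) for _ in range(max_out)]
+cell = tf.contrib.rnn.GRUCell(64)
+seq2seq_f = lambda x, y: tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
+    x, y, cell, num_encoder_symbols=100, num_decoder_symbols=100,
+    embedding_size=16)
+# outputs[j] and losses[j] correspond to buckets[j].
+outputs, losses = tf.contrib.legacy_seq2seq.model_with_buckets(
+    encoder_inputs, decoder_inputs, targets, weights, buckets, seq2seq_f)
+```
+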
+- - -
+
+### `tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq(encoder_inputs, decoder_inputs_dict, cell, num_encoder_symbols, num_decoder_symbols_dict, embedding_size, feed_previous=False, dtype=None, scope=None)` {#one2many_rnn_seq2seq}
+
+One-to-many RNN sequence-to-sequence model (multi-task).
+
+This is a multi-task sequence-to-sequence model with one encoder and multiple
+decoders. A reference for multi-task sequence-to-sequence learning can be
+found at: http://arxiv.org/abs/1511.06114
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs_dict`</b>: A dictionary mapping decoder name (string) to
+ the corresponding decoder_inputs; each decoder_inputs is a list of 1D
+ Tensors of shape [batch_size]; num_decoders is defined as
+ len(decoder_inputs_dict).
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols_dict`</b>: A dictionary mapping decoder name (string) to an
+ integer specifying number of symbols for the corresponding decoder;
+ len(num_decoder_symbols_dict) must be equal to num_decoders.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first of
+ decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial state for both the encoder and decoder
+ rnn cells (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "one2many_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs_dict, state_dict), where:
+
+* <b>`outputs_dict`</b>: A mapping from decoder name (string) to a list of the same
+ length as decoder_inputs_dict[name]; each element in the list is a 2D
+ Tensor with shape [batch_size x num_decoder_symbols_dict[name]]
+ containing the generated outputs.
+* <b>`state_dict`</b>: A mapping from decoder name (string) to the final state of the
+ corresponding decoder RNN; it is a 2D Tensor of shape
+ [batch_size x cell.state_size].
+
+
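+A hedged sketch of the dict-based multi-task setup ("task_a" and "task_b"
+are hypothetical decoder names; sizes are illustrative):
+
+```python
+import tensorflow as tf
+
+steps = 8
+cell = tf.contrib.rnn.GRUCell(64)
+encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(steps)]
+decoder_inputs_dict = {
+    "task_a": [tf.placeholder(tf.int32, [None]) for _ in range(steps)],
+    "task_b": [tf.placeholder(tf.int32, [None]) for _ in range(steps)],
+}
+num_decoder_symbols_dict = {"task_a": 500, "task_b": 300}
+outputs_dict, state_dict = tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq(
+    encoder_inputs, decoder_inputs_dict, cell,
+    num_encoder_symbols=1000,
+    num_decoder_symbols_dict=num_decoder_symbols_dict,
+    embedding_size=32)
+# outputs_dict["task_a"]: list of [batch_size x 500] Tensors.
+```
+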
+- - -
+
+### `tf.contrib.legacy_seq2seq.rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None, scope=None)` {#rnn_decoder}
+
+RNN decoder for the sequence-to-sequence model.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`initial_state`</b>: 2D Tensor with shape [batch_size x cell.state_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`loop_function`</b>: If not None, this function will be applied to the i-th output
+ in order to generate the i+1-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol). This can be used for decoding,
+ but also for training to emulate http://arxiv.org/abs/1506.03099.
+ Signature -- loop_function(prev, i) = next
+ * prev is a 2D Tensor of shape [batch_size x output_size],
+ * i is an integer, the step number (when advanced control is needed),
+ * next is a 2D Tensor of shape [batch_size x input_size].
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to "rnn_decoder".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing generated outputs.
+* <b>`state`</b>: The state of each cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+ (Note that in some cases, like basic RNN cell or GRU cell, outputs and
+ states can be the same. They are different for LSTM cells though.)
+
+
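+A minimal usage sketch (illustrative sizes; the cell class location under
+tf.contrib.rnn is an assumption):
+
+```python
+import tensorflow as tf
+
+batch_size, input_size, steps = 32, 16, 5
+cell = tf.contrib.rnn.GRUCell(24)
+initial_state = cell.zero_state(batch_size, tf.float32)
+decoder_inputs = [tf.placeholder(tf.float32, [batch_size, input_size])
+                  for _ in range(steps)]
+# Without a loop_function, decoder_inputs are consumed as given.
+outputs, state = tf.contrib.legacy_seq2seq.rnn_decoder(
+    decoder_inputs, initial_state, cell)
+```
+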
+- - -
+
+### `tf.contrib.legacy_seq2seq.sequence_loss(logits, targets, weights, average_across_timesteps=True, average_across_batch=True, softmax_loss_function=None, name=None)` {#sequence_loss}
+
+Weighted cross-entropy loss for a sequence of logits, batch-collapsed.
+
+##### Args:
+
+
+* <b>`logits`</b>: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
+* <b>`targets`</b>: List of 1D batch-sized int32 Tensors of the same length as logits.
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors of the same length as logits.
+* <b>`average_across_timesteps`</b>: If set, divide the returned cost by the total
+ label weight.
+* <b>`average_across_batch`</b>: If set, divide the returned cost by the batch size.
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`name`</b>: Optional name for this operation, defaults to "sequence_loss".
+
+##### Returns:
+
+ A scalar float Tensor: The average log-perplexity per symbol (weighted).
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If len(logits) is different from len(targets) or len(weights).
+
+
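+A minimal sketch; uniform weights of 1.0 yield the plain average
+log-perplexity per symbol (sizes are illustrative):
+
+```python
+import tensorflow as tf
+
+num_symbols, batch_size, steps = 100, 32, 5
+logits = [tf.placeholder(tf.float32, [batch_size, num_symbols])
+          for _ in range(steps)]
+targets = [tf.placeholder(tf.int32, [batch_size]) for _ in range(steps)]
+weights = [tf.ones([batch_size]) for _ in range(steps)]
+loss = tf.contrib.legacy_seq2seq.sequence_loss(logits, targets, weights)
+```
+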
+- - -
+
+### `tf.contrib.legacy_seq2seq.sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True, softmax_loss_function=None, name=None)` {#sequence_loss_by_example}
+
+Weighted cross-entropy loss for a sequence of logits (per example).
+
+##### Args:
+
+
+* <b>`logits`</b>: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
+* <b>`targets`</b>: List of 1D batch-sized int32 Tensors of the same length as logits.
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors of the same length as logits.
+* <b>`average_across_timesteps`</b>: If set, divide the returned cost by the total
+ label weight.
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`name`</b>: Optional name for this operation, default: "sequence_loss_by_example".
+
+##### Returns:
+
+ 1D batch-sized float Tensor: The log-perplexity for each sequence.
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If len(logits) is different from len(targets) or len(weights).
+
+
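+A minimal sketch; zero weights mask padding positions out of each
+sequence's loss (sizes are illustrative):
+
+```python
+import tensorflow as tf
+
+num_symbols, batch_size, steps = 100, 32, 5
+logits = [tf.placeholder(tf.float32, [batch_size, num_symbols])
+          for _ in range(steps)]
+targets = [tf.placeholder(tf.int32, [batch_size]) for _ in range(steps)]
+weights = [tf.placeholder(tf.float32, [batch_size]) for _ in range(steps)]
+# losses: 1D Tensor [batch_size], one log-perplexity per sequence.
+losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
+    logits, targets, weights)
+```
+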
+- - -
+
+### `tf.contrib.legacy_seq2seq.tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, loop_function=None, dtype=tf.float32, scope=None)` {#tied_rnn_seq2seq}
+
+RNN sequence-to-sequence model with tied encoder and decoder parameters.
+
+This model first runs an RNN to encode encoder_inputs into a state vector, and
+then runs decoder, initialized with the last encoder state, on decoder_inputs.
+Encoder and decoder use the same RNN cell and share parameters.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`loop_function`</b>: If not None, this function will be applied to the i-th output
+ in order to generate the (i+1)-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol), see rnn_decoder for details.
+* <b>`dtype`</b>: The dtype of the initial state of the rnn cell (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "tied_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
+
diff --git a/tensorflow/g3doc/api_docs/python/contrib.rnn.md b/tensorflow/g3doc/api_docs/python/contrib.rnn.md
index 0c445a3374..f7722b13f4 100644
--- a/tensorflow/g3doc/api_docs/python/contrib.rnn.md
+++ b/tensorflow/g3doc/api_docs/python/contrib.rnn.md
@@ -1,6 +1,6 @@
<!-- This file is machine generated: DO NOT EDIT! -->
-# RNN (contrib)
+# RNN and Cells (contrib)
[TOC]
Module for constructing RNN Cells and additional RNN operations.
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.attention_decoder.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.attention_decoder.md
new file mode 100644
index 0000000000..8043a4dad5
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.attention_decoder.md
@@ -0,0 +1,60 @@
+### `tf.contrib.legacy_seq2seq.attention_decoder(decoder_inputs, initial_state, attention_states, cell, output_size=None, num_heads=1, loop_function=None, dtype=None, scope=None, initial_state_attention=False)` {#attention_decoder}
+
+RNN decoder with attention for the sequence-to-sequence model.
+
+In this context "attention" means that, during decoding, the RNN can look up
+information in the additional tensor attention_states, and it does this by
+focusing on a few entries from the tensor. This model has proven to yield
+especially good results in a number of sequence-to-sequence tasks. This
+implementation is based on http://arxiv.org/abs/1412.7449 (see below for
+details). It is recommended for complex sequence-to-sequence tasks.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`attention_states`</b>: 3D Tensor [batch_size x attn_length x attn_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`output_size`</b>: Size of the output vectors; if None, we use cell.output_size.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`loop_function`</b>: If not None, this function will be applied to i-th output
+ in order to generate the (i+1)-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol). This can be used for decoding,
+ but also for training to emulate http://arxiv.org/abs/1506.03099.
+ Signature -- loop_function(prev, i) = next
+ * prev is a 2D Tensor of shape [batch_size x output_size],
+ * i is an integer, the step number (when advanced control is needed),
+ * next is a 2D Tensor of shape [batch_size x input_size].
+* <b>`dtype`</b>: The dtype to use for the RNN initial state (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "attention_decoder".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states -- useful when we wish to resume decoding from a previously
+ stored decoder state and attention states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors of
+ shape [batch_size x output_size]. These represent the generated outputs.
+ Output i is computed from input i (which is either the i-th element
+ of decoder_inputs or loop_function(output_{i-1}, i)) as follows.
+ First, we run the cell on a combination of the input and previous
+ attention masks:
+ cell_output, new_state = cell(linear(input, prev_attn), prev_state).
+ Then, we calculate new attention masks:
+ new_attn = softmax(V^T * tanh(W * attention_states + U * new_state))
+ and then we calculate the output:
+ output = linear(cell_output, new_attn).
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: when num_heads is not positive, there are no inputs, shapes
+ of attention_states are not set, or input size cannot be inferred
+ from the input.
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_decoder.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_decoder.md
new file mode 100644
index 0000000000..3ba228523f
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_decoder.md
@@ -0,0 +1,48 @@
+### `tf.contrib.legacy_seq2seq.embedding_rnn_decoder(decoder_inputs, initial_state, cell, num_symbols, embedding_size, output_projection=None, feed_previous=False, update_embedding_for_previous=True, scope=None)` {#embedding_rnn_decoder}
+
+RNN decoder with embedding and a pure-decoding option.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 1D batch-sized int32 Tensors (decoder inputs).
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function.
+* <b>`num_symbols`</b>: Integer, how many symbols come into the embedding.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has
+ shape [num_symbols]; if provided and feed_previous=True, each fed
+ previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean; if True, only the first of decoder_inputs will be
+ used (the "GO" symbol), and all other decoder inputs will be generated by:
+ next = embedding_lookup(embedding, argmax(previous_output)).
+ In effect, this implements a greedy decoder. It can also be used
+ during training to emulate http://arxiv.org/abs/1506.03099.
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`update_embedding_for_previous`</b>: Boolean; if False and feed_previous=True,
+ only the embedding for the first symbol of decoder_inputs (the "GO"
+ symbol) will be updated by back propagation. Embeddings for the symbols
+ generated from the decoder itself remain unchanged. This parameter has
+ no effect if feed_previous=False.
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_rnn_decoder".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors. The
+ output is of shape [batch_size x cell.output_size] when
+ output_projection is not None (and represents the dense representation
+ of predicted tokens). It is of shape [batch_size x num_decoder_symbols]
+ when output_projection is None.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq.md
new file mode 100644
index 0000000000..2eafffe765
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq.md
@@ -0,0 +1,46 @@
+### `tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, num_encoder_symbols, num_decoder_symbols, embedding_size, output_projection=None, feed_previous=False, dtype=None, scope=None)` {#embedding_rnn_seq2seq}
+
+Embedding RNN sequence-to-sequence model.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_encoder_symbols x input_size]). Then it runs an RNN to encode
+embedded encoder_inputs into a state vector. Next, it embeds decoder_inputs
+by another newly created embedding (of shape [num_decoder_symbols x
+input_size]). Then it runs RNN decoder, initialized with the last
+encoder state, on embedded decoder_inputs.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols`</b>: Integer; number of symbols on the decoder side.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_decoder_symbols] and B has
+ shape [num_decoder_symbols]; if provided and feed_previous=True, each
+ fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial state for both the encoder and decoder
+ rnn cells (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors. The
+ output is of shape [batch_size x cell.output_size] when
+ output_projection is not None (and represents the dense representation
+ of predicted tokens). It is of shape [batch_size x num_decoder_symbols]
+ when output_projection is None.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.sequence_loss.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.sequence_loss.md
new file mode 100644
index 0000000000..c0beb1541e
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard0/tf.contrib.legacy_seq2seq.sequence_loss.md
@@ -0,0 +1,26 @@
+### `tf.contrib.legacy_seq2seq.sequence_loss(logits, targets, weights, average_across_timesteps=True, average_across_batch=True, softmax_loss_function=None, name=None)` {#sequence_loss}
+
+Weighted cross-entropy loss for a sequence of logits, batch-collapsed.
+
+##### Args:
+
+
+* <b>`logits`</b>: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
+* <b>`targets`</b>: List of 1D batch-sized int32 Tensors of the same length as logits.
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors of the same length as logits.
+* <b>`average_across_timesteps`</b>: If set, divide the returned cost by the total
+ label weight.
+* <b>`average_across_batch`</b>: If set, divide the returned cost by the batch size.
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`name`</b>: Optional name for this operation, defaults to "sequence_loss".
+
+##### Returns:
+
+ A scalar float Tensor: The average log-perplexity per symbol (weighted).
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If len(logits) is different from len(targets) or len(weights).
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_decoder.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_decoder.md
new file mode 100644
index 0000000000..a3e6aa804f
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_decoder.md
@@ -0,0 +1,52 @@
+### `tf.contrib.legacy_seq2seq.embedding_attention_decoder(decoder_inputs, initial_state, attention_states, cell, num_symbols, embedding_size, num_heads=1, output_size=None, output_projection=None, feed_previous=False, update_embedding_for_previous=True, dtype=None, scope=None, initial_state_attention=False)` {#embedding_attention_decoder}
+
+RNN decoder with embedding and attention and a pure-decoding option.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 1D batch-sized int32 Tensors (decoder inputs).
+* <b>`initial_state`</b>: 2D Tensor [batch_size x cell.state_size].
+* <b>`attention_states`</b>: 3D Tensor [batch_size x attn_length x attn_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function.
+* <b>`num_symbols`</b>: Integer, how many symbols come into the embedding.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`output_size`</b>: Size of the output vectors; if None, use cell.output_size.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has shape
+ [num_symbols]; if provided and feed_previous=True, each fed previous
+ output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean; if True, only the first of decoder_inputs will be
+ used (the "GO" symbol), and all other decoder inputs will be generated by:
+ next = embedding_lookup(embedding, argmax(previous_output)).
+ In effect, this implements a greedy decoder. It can also be used
+ during training to emulate http://arxiv.org/abs/1506.03099.
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`update_embedding_for_previous`</b>: Boolean; if False and feed_previous=True,
+ only the embedding for the first symbol of decoder_inputs (the "GO"
+ symbol) will be updated by back propagation. Embeddings for the symbols
+ generated from the decoder itself remain unchanged. This parameter has
+ no effect if feed_previous=False.
+* <b>`dtype`</b>: The dtype to use for the RNN initial states (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_attention_decoder".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states -- useful when we wish to resume decoding from a previously
+ stored decoder state and attention states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_seq2seq.md
new file mode 100644
index 0000000000..5de66ad5e9
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.embedding_attention_seq2seq.md
@@ -0,0 +1,50 @@
+### `tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(encoder_inputs, decoder_inputs, cell, num_encoder_symbols, num_decoder_symbols, embedding_size, num_heads=1, output_projection=None, feed_previous=False, dtype=None, scope=None, initial_state_attention=False)` {#embedding_attention_seq2seq}
+
+Embedding sequence-to-sequence model with attention.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_encoder_symbols x input_size]). Then it runs an RNN to encode
+embedded encoder_inputs into a state vector. It keeps the outputs of this
+RNN at every step to use for attention later. Next, it embeds decoder_inputs
+by another newly created embedding (of shape [num_decoder_symbols x
+input_size]). Then it runs attention decoder, initialized with the last
+encoder state, on embedded decoder_inputs and attending to encoder outputs.
+
+Warning: when output_projection is None, the size of the attention vectors
+and variables will be made proportional to num_decoder_symbols, which can be
+large.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols`</b>: Integer; number of symbols on the decoder side.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_heads`</b>: Number of attention heads that read from attention_states.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_decoder_symbols] and B has
+ shape [num_decoder_symbols]; if provided and feed_previous=True, each
+ fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial RNN state (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_attention_seq2seq".
+* <b>`initial_state_attention`</b>: If False (default), initial attentions are zero.
+ If True, initialize the attentions from the initial state and attention
+ states.
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x num_decoder_symbols] containing the generated
+ outputs.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.rnn_decoder.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.rnn_decoder.md
new file mode 100644
index 0000000000..00cafc27bc
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard2/tf.contrib.legacy_seq2seq.rnn_decoder.md
@@ -0,0 +1,31 @@
+### `tf.contrib.legacy_seq2seq.rnn_decoder(decoder_inputs, initial_state, cell, loop_function=None, scope=None)` {#rnn_decoder}
+
+RNN decoder for the sequence-to-sequence model.
+
+##### Args:
+
+
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`initial_state`</b>: 2D Tensor with shape [batch_size x cell.state_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`loop_function`</b>: If not None, this function will be applied to the i-th output
+ in order to generate the i+1-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol). This can be used for decoding,
+ but also for training to emulate http://arxiv.org/abs/1506.03099.
+ Signature -- loop_function(prev, i) = next
+ * prev is a 2D Tensor of shape [batch_size x output_size],
+ * i is an integer, the step number (when advanced control is needed),
+ * next is a 2D Tensor of shape [batch_size x input_size].
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to "rnn_decoder".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing generated outputs.
+* <b>`state`</b>: The state of each cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+ (Note that in some cases, like basic RNN cell or GRU cell, outputs and
+ states can be the same. They are different for LSTM cells though.)
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.basic_rnn_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.basic_rnn_seq2seq.md
new file mode 100644
index 0000000000..12e5851497
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.basic_rnn_seq2seq.md
@@ -0,0 +1,26 @@
+### `tf.contrib.legacy_seq2seq.basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, dtype=tf.float32, scope=None)` {#basic_rnn_seq2seq}
+
+Basic RNN sequence-to-sequence model.
+
+This model first runs an RNN to encode encoder_inputs into a state vector,
+then runs decoder, initialized with the last encoder state, on decoder_inputs.
+Encoder and decoder use the same RNN cell type, but don't share parameters.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`dtype`</b>: The dtype of the initial state of the RNN cell (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "basic_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell in the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.tied_rnn_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.tied_rnn_seq2seq.md
new file mode 100644
index 0000000000..5455cefa2d
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard3/tf.contrib.legacy_seq2seq.tied_rnn_seq2seq.md
@@ -0,0 +1,30 @@
+### `tf.contrib.legacy_seq2seq.tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, loop_function=None, dtype=tf.float32, scope=None)` {#tied_rnn_seq2seq}
+
+RNN sequence-to-sequence model with tied encoder and decoder parameters.
+
+This model first runs an RNN to encode encoder_inputs into a state vector, and
+then runs decoder, initialized with the last encoder state, on decoder_inputs.
+Encoder and decoder use the same RNN cell and share parameters.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`decoder_inputs`</b>: A list of 2D Tensors [batch_size x input_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`loop_function`</b>: If not None, this function will be applied to the i-th output
+ in order to generate the (i+1)-st input, and decoder_inputs will be ignored,
+ except for the first element ("GO" symbol), see rnn_decoder for details.
+* <b>`dtype`</b>: The dtype of the initial state of the rnn cell (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; default: "tied_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_size] containing the generated outputs.
+* <b>`state`</b>: The state of each decoder cell in each time-step. This is a list
+ with length len(decoder_inputs) -- one item for each time-step.
+ Each item is a 2D Tensor of shape [batch_size x cell.state_size].
+
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq.md
new file mode 100644
index 0000000000..fd297eae06
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard4/tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq.md
@@ -0,0 +1,43 @@
+### `tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq(encoder_inputs, decoder_inputs_dict, cell, num_encoder_symbols, num_decoder_symbols_dict, embedding_size, feed_previous=False, dtype=None, scope=None)` {#one2many_rnn_seq2seq}
+
+One-to-many RNN sequence-to-sequence model (multi-task).
+
+This is a multi-task sequence-to-sequence model with one encoder and multiple
+decoders. A reference for multi-task sequence-to-sequence learning can be
+found at: http://arxiv.org/abs/1511.06114
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs_dict`</b>: A dictionary mapping decoder name (string) to
+ the corresponding decoder_inputs; each decoder_inputs is a list of 1D
+ Tensors of shape [batch_size]; num_decoders is defined as
+ len(decoder_inputs_dict).
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_encoder_symbols`</b>: Integer; number of symbols on the encoder side.
+* <b>`num_decoder_symbols_dict`</b>: A dictionary mapping decoder name (string) to an
+ integer specifying number of symbols for the corresponding decoder;
+ len(num_decoder_symbols_dict) must be equal to num_decoders.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first of
+ decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype of the initial state for both the encoder and decoder
+ rnn cells (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "one2many_rnn_seq2seq"
+
+##### Returns:
+
+ A tuple of the form (outputs_dict, state_dict), where:
+
+* <b>`outputs_dict`</b>: A mapping from decoder name (string) to a list of the same
+  length as decoder_inputs_dict[name]; each element in the list is a 2D
+  Tensor with shape [batch_size x num_decoder_symbols_dict[name]]
+ containing the generated outputs.
+* <b>`state_dict`</b>: A mapping from decoder name (string) to the final state of the
+ corresponding decoder RNN; it is a 2D Tensor of shape
+ [batch_size x cell.state_size].
+
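+A minimal usage sketch, assuming the TF 0.12-era contrib API; the decoder
+names "fr" and "de", the vocabulary sizes, and all shapes are made up:
+
+```python
+import tensorflow as tf
+
+enc = [tf.placeholder(tf.int32, [32]) for _ in range(5)]
+dec = {"fr": [tf.placeholder(tf.int32, [32]) for _ in range(6)],
+       "de": [tf.placeholder(tf.int32, [32]) for _ in range(6)]}
+cell = tf.contrib.rnn.GRUCell(64)
+outputs_dict, state_dict = tf.contrib.legacy_seq2seq.one2many_rnn_seq2seq(
+    enc, dec, cell, num_encoder_symbols=1000,
+    num_decoder_symbols_dict={"fr": 1200, "de": 1100}, embedding_size=64)
+# outputs_dict["fr"]: 6 Tensors of shape [32 x 1200].
+```
+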
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.contrib.legacy_seq2seq.sequence_loss_by_example.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.contrib.legacy_seq2seq.sequence_loss_by_example.md
new file mode 100644
index 0000000000..fdb38a45ee
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard5/tf.contrib.legacy_seq2seq.sequence_loss_by_example.md
@@ -0,0 +1,25 @@
+### `tf.contrib.legacy_seq2seq.sequence_loss_by_example(logits, targets, weights, average_across_timesteps=True, softmax_loss_function=None, name=None)` {#sequence_loss_by_example}
+
+Weighted cross-entropy loss for a sequence of logits (per example).
+
+##### Args:
+
+
+* <b>`logits`</b>: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
+* <b>`targets`</b>: List of 1D batch-sized int32 Tensors of the same length as logits.
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors of the same length as logits.
+* <b>`average_across_timesteps`</b>: If set, divide the returned cost by the total
+ label weight.
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`name`</b>: Optional name for this operation, default: "sequence_loss_by_example".
+
+##### Returns:
+
+ 1D batch-sized float Tensor: The log-perplexity for each sequence.
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If len(logits) is different from len(targets) or len(weights).
+
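+A minimal usage sketch, assuming the TF 0.12-era contrib API (vocabulary
+size, batch size, and sequence length are illustrative):
+
+```python
+import tensorflow as tf
+
+steps, batch, vocab = 3, 32, 1000
+logits = [tf.placeholder(tf.float32, [batch, vocab]) for _ in range(steps)]
+targets = [tf.placeholder(tf.int32, [batch]) for _ in range(steps)]
+# A weight of 1.0 keeps a position; 0.0 would mask out padding positions.
+weights = [tf.ones([batch]) for _ in range(steps)]
+losses = tf.contrib.legacy_seq2seq.sequence_loss_by_example(
+    logits, targets, weights)  # shape [batch]: one value per example
+```
+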
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard7/tf.contrib.legacy_seq2seq.model_with_buckets.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard7/tf.contrib.legacy_seq2seq.model_with_buckets.md
new file mode 100644
index 0000000000..37e2b9a076
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard7/tf.contrib.legacy_seq2seq.model_with_buckets.md
@@ -0,0 +1,42 @@
+### `tf.contrib.legacy_seq2seq.model_with_buckets(encoder_inputs, decoder_inputs, targets, weights, buckets, seq2seq, softmax_loss_function=None, per_example_loss=False, name=None)` {#model_with_buckets}
+
+Create a sequence-to-sequence model with support for bucketing.
+
+The seq2seq argument is a function that defines a sequence-to-sequence model,
+e.g., seq2seq = lambda x, y: basic_rnn_seq2seq(x, y, rnn_cell.GRUCell(24))
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of Tensors to feed the encoder; first seq2seq input.
+* <b>`decoder_inputs`</b>: A list of Tensors to feed the decoder; second seq2seq input.
+* <b>`targets`</b>: A list of 1D batch-sized int32 Tensors (desired output sequence).
+* <b>`weights`</b>: List of 1D batch-sized float-Tensors to weight the targets.
+* <b>`buckets`</b>: A list of pairs of (input size, output size) for each bucket.
+* <b>`seq2seq`</b>: A sequence-to-sequence model function; it takes two inputs that
+ agree with encoder_inputs and decoder_inputs, and returns a pair
+ consisting of outputs and states (as, e.g., basic_rnn_seq2seq).
+* <b>`softmax_loss_function`</b>: Function (inputs-batch, labels-batch) -> loss-batch
+ to be used instead of the standard softmax (the default if this is None).
+* <b>`per_example_loss`</b>: Boolean. If set, the returned loss will be a batch-sized
+ tensor of losses for each sequence in the batch. If unset, it will be
+ a scalar with the averaged loss from all examples.
+* <b>`name`</b>: Optional name for this operation, defaults to "model_with_buckets".
+
+##### Returns:
+
+ A tuple of the form (outputs, losses), where:
+
+* <b>`outputs`</b>: The outputs for each bucket. Its j'th element consists of a list
+ of 2D Tensors. The shape of output tensors can be either
+ [batch_size x output_size] or [batch_size x num_decoder_symbols]
+ depending on the seq2seq model used.
+* <b>`losses`</b>: List of scalar Tensors, representing losses for each bucket, or,
+ if per_example_loss is set, a list of 1D batch-sized float Tensors.
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If the length of encoder_inputs, targets, or weights is smaller
+ than the largest (last) bucket.
+
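+A sketch expanding the lambda example above into runnable form, assuming the
+TF 0.12-era contrib API; the bucket sizes, vocabulary size, and the choice of
+embedding_rnn_seq2seq are illustrative, not requirements:
+
+```python
+import tensorflow as tf
+
+buckets = [(5, 10), (10, 15)]  # (encoder length, decoder length) per bucket
+enc = [tf.placeholder(tf.int32, [32]) for _ in range(10)]   # >= largest input
+dec = [tf.placeholder(tf.int32, [32]) for _ in range(15)]   # >= largest output
+targets = [tf.placeholder(tf.int32, [32]) for _ in range(15)]
+weights = [tf.ones([32]) for _ in range(15)]
+
+def seq2seq_f(x, y):
+  cell = tf.contrib.rnn.GRUCell(24)
+  return tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
+      x, y, cell, num_encoder_symbols=1000, num_decoder_symbols=1000,
+      embedding_size=24)
+
+outputs, losses = tf.contrib.legacy_seq2seq.model_with_buckets(
+    enc, dec, targets, weights, buckets, seq2seq_f)
+# outputs[j] and losses[j] correspond to bucket j; variables are shared.
+```
+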
diff --git a/tensorflow/g3doc/api_docs/python/functions_and_classes/shard9/tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq.md b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard9/tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq.md
new file mode 100644
index 0000000000..4107deb134
--- /dev/null
+++ b/tensorflow/g3doc/api_docs/python/functions_and_classes/shard9/tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq.md
@@ -0,0 +1,53 @@
+### `tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq(encoder_inputs, decoder_inputs, cell, num_symbols, embedding_size, num_decoder_symbols=None, output_projection=None, feed_previous=False, dtype=None, scope=None)` {#embedding_tied_rnn_seq2seq}
+
+Embedding RNN sequence-to-sequence model with tied (shared) parameters.
+
+This model first embeds encoder_inputs by a newly created embedding (of shape
+[num_symbols x input_size]). Then it runs an RNN to encode embedded
+encoder_inputs into a state vector. Next, it embeds decoder_inputs using
+the same embedding. Then it runs RNN decoder, initialized with the last
+encoder state, on embedded decoder_inputs. The decoder output is over symbols
+from 0 to num_decoder_symbols - 1 if num_decoder_symbols is provided;
+otherwise it is over symbols 0 to num_symbols - 1.
+
+##### Args:
+
+
+* <b>`encoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`decoder_inputs`</b>: A list of 1D int32 Tensors of shape [batch_size].
+* <b>`cell`</b>: rnn_cell.RNNCell defining the cell function and size.
+* <b>`num_symbols`</b>: Integer; number of symbols for both encoder and decoder.
+* <b>`embedding_size`</b>: Integer, the length of the embedding vector for each symbol.
+* <b>`num_decoder_symbols`</b>: Integer; number of output symbols for decoder. If
+ provided, the decoder output is over symbols 0 to num_decoder_symbols - 1.
+ Otherwise, decoder output is over symbols 0 to num_symbols - 1. Note that
+ this assumes that the vocabulary is set up such that the first
+ num_decoder_symbols of num_symbols are part of decoding.
+* <b>`output_projection`</b>: None or a pair (W, B) of output projection weights and
+ biases; W has shape [output_size x num_symbols] and B has
+ shape [num_symbols]; if provided and feed_previous=True, each
+  fed previous output will first be multiplied by W and have B added.
+* <b>`feed_previous`</b>: Boolean or scalar Boolean Tensor; if True, only the first
+ of decoder_inputs will be used (the "GO" symbol), and all other decoder
+ inputs will be taken from previous outputs (as in embedding_rnn_decoder).
+ If False, decoder_inputs are used as given (the standard decoder case).
+* <b>`dtype`</b>: The dtype to use for the initial RNN states (default: tf.float32).
+* <b>`scope`</b>: VariableScope for the created subgraph; defaults to
+ "embedding_tied_rnn_seq2seq".
+
+##### Returns:
+
+ A tuple of the form (outputs, state), where:
+
+* <b>`outputs`</b>: A list of the same length as decoder_inputs of 2D Tensors with
+ shape [batch_size x output_symbols] containing the generated
+  outputs, where output_symbols = num_decoder_symbols if
+  num_decoder_symbols is not None, and output_symbols = num_symbols otherwise.
+* <b>`state`</b>: The state of each decoder cell at the final time-step.
+ It is a 2D Tensor of shape [batch_size x cell.state_size].
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: When output_projection has the wrong shape.
+
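+A minimal usage sketch, assuming the TF 0.12-era contrib API (the shared
+vocabulary size, embedding size, and shapes are illustrative):
+
+```python
+import tensorflow as tf
+
+enc = [tf.placeholder(tf.int32, [32]) for _ in range(5)]
+dec = [tf.placeholder(tf.int32, [32]) for _ in range(6)]
+cell = tf.contrib.rnn.GRUCell(64)
+# One embedding of shape [1000 x 64] is shared by the encoder and decoder.
+outputs, state = tf.contrib.legacy_seq2seq.embedding_tied_rnn_seq2seq(
+    enc, dec, cell, num_symbols=1000, embedding_size=64)
+# outputs: 6 Tensors of shape [32 x 1000] (num_decoder_symbols not given).
+```
+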
diff --git a/tensorflow/g3doc/api_docs/python/index.md b/tensorflow/g3doc/api_docs/python/index.md
index e9b0e7b350..6172e005c1 100644
--- a/tensorflow/g3doc/api_docs/python/index.md
+++ b/tensorflow/g3doc/api_docs/python/index.md
@@ -571,19 +571,6 @@
* [`with_space_to_batch`](../../api_docs/python/nn.md#with_space_to_batch)
* [`zero_fraction`](../../api_docs/python/nn.md#zero_fraction)
-* **[Neural Network RNN Cells](../../api_docs/python/rnn_cell.md)**:
- * [`BasicLSTMCell`](../../api_docs/python/rnn_cell.md#BasicLSTMCell)
- * [`BasicRNNCell`](../../api_docs/python/rnn_cell.md#BasicRNNCell)
- * [`DropoutWrapper`](../../api_docs/python/rnn_cell.md#DropoutWrapper)
- * [`EmbeddingWrapper`](../../api_docs/python/rnn_cell.md#EmbeddingWrapper)
- * [`GRUCell`](../../api_docs/python/rnn_cell.md#GRUCell)
- * [`InputProjectionWrapper`](../../api_docs/python/rnn_cell.md#InputProjectionWrapper)
- * [`LSTMCell`](../../api_docs/python/rnn_cell.md#LSTMCell)
- * [`LSTMStateTuple`](../../api_docs/python/rnn_cell.md#LSTMStateTuple)
- * [`MultiRNNCell`](../../api_docs/python/rnn_cell.md#MultiRNNCell)
- * [`OutputProjectionWrapper`](../../api_docs/python/rnn_cell.md#OutputProjectionWrapper)
- * [`RNNCell`](../../api_docs/python/rnn_cell.md#RNNCell)
-
* **[Running Graphs](../../api_docs/python/client.md)**:
* [`AbortedError`](../../api_docs/python/client.md#AbortedError)
* [`AlreadyExistsError`](../../api_docs/python/client.md#AlreadyExistsError)
@@ -1014,6 +1001,21 @@
* [`SummaryWriterCache`](../../api_docs/python/contrib.learn.monitors.md#SummaryWriterCache)
* [`ValidationMonitor`](../../api_docs/python/contrib.learn.monitors.md#ValidationMonitor)
+* **[Sequence to Sequence (contrib)](../../api_docs/python/contrib.legacy_seq2seq.md)**:
+ * [`attention_decoder`](../../api_docs/python/contrib.legacy_seq2seq.md#attention_decoder)
+ * [`basic_rnn_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#basic_rnn_seq2seq)
+ * [`embedding_attention_decoder`](../../api_docs/python/contrib.legacy_seq2seq.md#embedding_attention_decoder)
+ * [`embedding_attention_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#embedding_attention_seq2seq)
+ * [`embedding_rnn_decoder`](../../api_docs/python/contrib.legacy_seq2seq.md#embedding_rnn_decoder)
+ * [`embedding_rnn_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#embedding_rnn_seq2seq)
+ * [`embedding_tied_rnn_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#embedding_tied_rnn_seq2seq)
+ * [`model_with_buckets`](../../api_docs/python/contrib.legacy_seq2seq.md#model_with_buckets)
+ * [`one2many_rnn_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#one2many_rnn_seq2seq)
+ * [`rnn_decoder`](../../api_docs/python/contrib.legacy_seq2seq.md#rnn_decoder)
+ * [`sequence_loss`](../../api_docs/python/contrib.legacy_seq2seq.md#sequence_loss)
+ * [`sequence_loss_by_example`](../../api_docs/python/contrib.legacy_seq2seq.md#sequence_loss_by_example)
+ * [`tied_rnn_seq2seq`](../../api_docs/python/contrib.legacy_seq2seq.md#tied_rnn_seq2seq)
+
* **[Linear Algebra (contrib)](../../api_docs/python/contrib.linalg.md)**:
* [`LinearOperator`](../../api_docs/python/contrib.linalg.md#LinearOperator)
* [`LinearOperatorDiag`](../../api_docs/python/contrib.linalg.md#LinearOperatorDiag)
@@ -1035,7 +1037,7 @@
* [`softmax_cross_entropy`](../../api_docs/python/contrib.losses.md#softmax_cross_entropy)
* [`sparse_softmax_cross_entropy`](../../api_docs/python/contrib.losses.md#sparse_softmax_cross_entropy)
-* **[RNN (contrib)](../../api_docs/python/contrib.rnn.md)**:
+* **[RNN and Cells (contrib)](../../api_docs/python/contrib.rnn.md)**:
* [`AttentionCellWrapper`](../../api_docs/python/contrib.rnn.md#AttentionCellWrapper)
* [`BasicLSTMCell`](../../api_docs/python/contrib.rnn.md#BasicLSTMCell)
* [`BasicRNNCell`](../../api_docs/python/contrib.rnn.md#BasicRNNCell)
diff --git a/tensorflow/g3doc/api_docs/python/rnn_cell.md b/tensorflow/g3doc/api_docs/python/rnn_cell.md
deleted file mode 100644
index 3e495fe569..0000000000
--- a/tensorflow/g3doc/api_docs/python/rnn_cell.md
+++ /dev/null
@@ -1,851 +0,0 @@
-<!-- This file is machine generated: DO NOT EDIT! -->
-
-# Neural Network RNN Cells
-[TOC]
-
-Module for constructing RNN Cells.
-
-## Base interface for all RNN Cells
-
-- - -
-
-### `class tf.contrib.rnn.RNNCell` {#RNNCell}
-
-Abstract object representing an RNN cell.
-
-The definition of cell in this package differs from the definition used in the
-literature. In the literature, cell refers to an object with a single scalar
-output. The definition in this package refers to a horizontal array of such
-units.
-
-An RNN cell, in the most abstract setting, is anything that has
-a state and performs some operation that takes a matrix of inputs.
-This operation results in an output matrix with `self.output_size` columns.
-If `self.state_size` is an integer, this operation also results in a new
-state matrix with `self.state_size` columns. If `self.state_size` is a
-tuple of integers, then it results in a tuple of `len(state_size)` state
-matrices, each with a column size corresponding to values in `state_size`.
-
-This module provides a number of basic commonly used RNN cells, such as
-LSTM (Long Short Term Memory) or GRU (Gated Recurrent Unit), and a number
-of operators that allow adding dropouts, projections, or embeddings for inputs.
-Constructing multi-layer cells is supported by the class `MultiRNNCell`,
-or by calling the `rnn` ops several times. Every `RNNCell` must have the
-properties below and implement `__call__` with the following signature.
-- - -
-
-#### `tf.contrib.rnn.RNNCell.__call__(inputs, state, scope=None)` {#RNNCell.__call__}
-
-Run this RNN cell on inputs, starting from the given state.
-
-##### Args:
-
-
-* <b>`inputs`</b>: `2-D` tensor with shape `[batch_size x input_size]`.
-* <b>`state`</b>: if `self.state_size` is an integer, this should be a `2-D Tensor`
- with shape `[batch_size x self.state_size]`. Otherwise, if
- `self.state_size` is a tuple of integers, this should be a tuple
- with shapes `[batch_size x s] for s in self.state_size`.
-* <b>`scope`</b>: VariableScope for the created subgraph; defaults to class name.
-
-##### Returns:
-
- A pair containing:
-
- - Output: A `2-D` tensor with shape `[batch_size x self.output_size]`.
- - New state: Either a single `2-D` tensor, or a tuple of tensors matching
- the arity and shapes of `state`.
-
-
-- - -
-
-#### `tf.contrib.rnn.RNNCell.output_size` {#RNNCell.output_size}
-
-Integer or TensorShape: size of outputs produced by this cell.
-
-
-- - -
-
-#### `tf.contrib.rnn.RNNCell.state_size` {#RNNCell.state_size}
-
-size(s) of state(s) used by this cell.
-
-It can be represented by an Integer, a TensorShape or a tuple of Integers
-or TensorShapes.
-
-
-- - -
-
-#### `tf.contrib.rnn.RNNCell.zero_state(batch_size, dtype)` {#RNNCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-
-## RNN Cells for use with TensorFlow's core RNN methods
-
-- - -
-
-### `class tf.contrib.rnn.BasicRNNCell` {#BasicRNNCell}
-
-The most basic RNN cell.
-- - -
-
-#### `tf.contrib.rnn.BasicRNNCell.__call__(inputs, state, scope=None)` {#BasicRNNCell.__call__}
-
-Most basic RNN: output = new_state = act(W * input + U * state + B).
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicRNNCell.__init__(num_units, input_size=None, activation=tanh)` {#BasicRNNCell.__init__}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicRNNCell.output_size` {#BasicRNNCell.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicRNNCell.state_size` {#BasicRNNCell.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicRNNCell.zero_state(batch_size, dtype)` {#BasicRNNCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.BasicLSTMCell` {#BasicLSTMCell}
-
-Basic LSTM recurrent network cell.
-
-The implementation is based on: http://arxiv.org/abs/1409.2329.
-
-We add forget_bias (default: 1) to the biases of the forget gate in order to
-reduce the scale of forgetting at the beginning of training.
-
-It does not allow cell clipping or a projection layer, and it does not
-use peep-hole connections: it is the basic baseline.
-
-For advanced models, please use the full LSTMCell that follows.
-- - -
-
-#### `tf.contrib.rnn.BasicLSTMCell.__call__(inputs, state, scope=None)` {#BasicLSTMCell.__call__}
-
-Long short-term memory cell (LSTM).
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicLSTMCell.__init__(num_units, forget_bias=1.0, input_size=None, state_is_tuple=True, activation=tanh)` {#BasicLSTMCell.__init__}
-
-Initialize the basic LSTM cell.
-
-##### Args:
-
-
-* <b>`num_units`</b>: int, The number of units in the LSTM cell.
-* <b>`forget_bias`</b>: float, The bias added to forget gates (see above).
-* <b>`input_size`</b>: Deprecated and unused.
-* <b>`state_is_tuple`</b>: If True, accepted and returned states are 2-tuples of
- the `c_state` and `m_state`. If False, they are concatenated
- along the column axis. The latter behavior will soon be deprecated.
-* <b>`activation`</b>: Activation function of the inner states.
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicLSTMCell.output_size` {#BasicLSTMCell.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicLSTMCell.state_size` {#BasicLSTMCell.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.BasicLSTMCell.zero_state(batch_size, dtype)` {#BasicLSTMCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.GRUCell` {#GRUCell}
-
-Gated Recurrent Unit cell (cf. http://arxiv.org/abs/1406.1078).
-- - -
-
-#### `tf.contrib.rnn.GRUCell.__call__(inputs, state, scope=None)` {#GRUCell.__call__}
-
-Gated recurrent unit (GRU) with num_units cells.
-
-
-- - -
-
-#### `tf.contrib.rnn.GRUCell.__init__(num_units, input_size=None, activation=tanh)` {#GRUCell.__init__}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.GRUCell.output_size` {#GRUCell.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.GRUCell.state_size` {#GRUCell.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.GRUCell.zero_state(batch_size, dtype)` {#GRUCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.LSTMCell` {#LSTMCell}
-
-Long short-term memory unit (LSTM) recurrent network cell.
-
-The default non-peephole implementation is based on:
-
- http://deeplearning.cs.cmu.edu/pdfs/Hochreiter97_lstm.pdf
-
-S. Hochreiter and J. Schmidhuber.
-"Long Short-Term Memory". Neural Computation, 9(8):1735-1780, 1997.
-
-The peephole implementation is based on:
-
- https://research.google.com/pubs/archive/43905.pdf
-
-Hasim Sak, Andrew Senior, and Francoise Beaufays.
-"Long short-term memory recurrent neural network architectures for
- large scale acoustic modeling." INTERSPEECH, 2014.
-
-The class uses optional peep-hole connections, optional cell clipping, and
-an optional projection layer.
-- - -
-
-#### `tf.contrib.rnn.LSTMCell.__call__(inputs, state, scope=None)` {#LSTMCell.__call__}
-
-Run one step of LSTM.
-
-##### Args:
-
-
-* <b>`inputs`</b>: input Tensor, 2D, batch x num_units.
-* <b>`state`</b>: if `state_is_tuple` is False, this must be a state Tensor,
- `2-D, batch x state_size`. If `state_is_tuple` is True, this must be a
- tuple of state Tensors, both `2-D`, with column sizes `c_state` and
- `m_state`.
-* <b>`scope`</b>: VariableScope for the created subgraph; defaults to "lstm_cell".
-
-##### Returns:
-
- A tuple containing:
-
- - A `2-D, [batch x output_dim]`, Tensor representing the output of the
- LSTM after reading `inputs` when previous state was `state`.
- Here output_dim is:
- num_proj if num_proj was set,
- num_units otherwise.
- - Tensor(s) representing the new state of LSTM after reading `inputs` when
- the previous state was `state`. Same type and shape(s) as `state`.
-
-##### Raises:
-
-
-* <b>`ValueError`</b>: If input size cannot be inferred from inputs via
- static shape inference.
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMCell.__init__(num_units, input_size=None, use_peepholes=False, cell_clip=None, initializer=None, num_proj=None, proj_clip=None, num_unit_shards=None, num_proj_shards=None, forget_bias=1.0, state_is_tuple=True, activation=tanh)` {#LSTMCell.__init__}
-
-Initialize the parameters for an LSTM cell.
-
-##### Args:
-
-
-* <b>`num_units`</b>: int, The number of units in the LSTM cell
-* <b>`input_size`</b>: Deprecated and unused.
-* <b>`use_peepholes`</b>: bool, set True to enable diagonal/peephole connections.
-* <b>`cell_clip`</b>: (optional) A float value, if provided the cell state is clipped
- by this value prior to the cell output activation.
-* <b>`initializer`</b>: (optional) The initializer to use for the weight and
- projection matrices.
-* <b>`num_proj`</b>: (optional) int, The output dimensionality for the projection
- matrices. If None, no projection is performed.
-* <b>`proj_clip`</b>: (optional) A float value. If `num_proj > 0` and `proj_clip` is
- provided, then the projected values are clipped elementwise to within
- `[-proj_clip, proj_clip]`.
-* <b>`num_unit_shards`</b>: Deprecated, will be removed by Jan. 2017.
- Use a variable_scope partitioner instead.
-* <b>`num_proj_shards`</b>: Deprecated, will be removed by Jan. 2017.
- Use a variable_scope partitioner instead.
-* <b>`forget_bias`</b>: Biases of the forget gate are initialized by default to 1
- in order to reduce the scale of forgetting at the beginning of
- the training.
-* <b>`state_is_tuple`</b>: If True, accepted and returned states are 2-tuples of
- the `c_state` and `m_state`. If False, they are concatenated
- along the column axis. This latter behavior will soon be deprecated.
-* <b>`activation`</b>: Activation function of the inner states.
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMCell.output_size` {#LSTMCell.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMCell.state_size` {#LSTMCell.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMCell.zero_state(batch_size, dtype)` {#LSTMCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-
-## Classes storing split `RNNCell` state
-
-- - -
-
-### `class tf.contrib.rnn.LSTMStateTuple` {#LSTMStateTuple}
-
-Tuple used by LSTM Cells for `state_size`, `zero_state`, and output state.
-
-Stores two elements: `(c, h)`, in that order.
-
-Only used when `state_is_tuple=True`.
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.__getnewargs__()` {#LSTMStateTuple.__getnewargs__}
-
-Return self as a plain tuple. Used by copy and pickle.
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.__getstate__()` {#LSTMStateTuple.__getstate__}
-
-Exclude the OrderedDict from pickling
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.__new__(_cls, c, h)` {#LSTMStateTuple.__new__}
-
-Create new instance of LSTMStateTuple(c, h)
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.__repr__()` {#LSTMStateTuple.__repr__}
-
-Return a nicely formatted representation string
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.c` {#LSTMStateTuple.c}
-
-Alias for field number 0
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.dtype` {#LSTMStateTuple.dtype}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.LSTMStateTuple.h` {#LSTMStateTuple.h}
-
-Alias for field number 1
-
-
-
-
-## RNN Cell wrappers (RNNCells that wrap other RNNCells)
-
-- - -
-
-### `class tf.contrib.rnn.MultiRNNCell` {#MultiRNNCell}
-
-RNN cell composed sequentially of multiple simple cells.
-- - -
-
-#### `tf.contrib.rnn.MultiRNNCell.__call__(inputs, state, scope=None)` {#MultiRNNCell.__call__}
-
-Run this multi-layer cell on inputs, starting from state.
-
-
-- - -
-
-#### `tf.contrib.rnn.MultiRNNCell.__init__(cells, state_is_tuple=True)` {#MultiRNNCell.__init__}
-
-Create a RNN cell composed sequentially of a number of RNNCells.
-
-##### Args:
-
-
-* <b>`cells`</b>: list of RNNCells that will be composed in this order.
-* <b>`state_is_tuple`</b>: If True, accepted and returned states are n-tuples, where
- `n = len(cells)`. If False, the states are all
- concatenated along the column axis. This latter behavior will soon be
- deprecated.
-
-##### Raises:
-
-
-* <b>`ValueError`</b>: if cells is empty (not allowed), or at least one of the cells
- returns a state tuple but the flag `state_is_tuple` is `False`.
-
-
-- - -
-
-#### `tf.contrib.rnn.MultiRNNCell.output_size` {#MultiRNNCell.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.MultiRNNCell.state_size` {#MultiRNNCell.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.MultiRNNCell.zero_state(batch_size, dtype)` {#MultiRNNCell.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.DropoutWrapper` {#DropoutWrapper}
-
-Operator adding dropout to inputs and outputs of the given cell.
-- - -
-
-#### `tf.contrib.rnn.DropoutWrapper.__call__(inputs, state, scope=None)` {#DropoutWrapper.__call__}
-
-Run the cell with the declared dropouts.
-
-
-- - -
-
-#### `tf.contrib.rnn.DropoutWrapper.__init__(cell, input_keep_prob=1.0, output_keep_prob=1.0, seed=None)` {#DropoutWrapper.__init__}
-
-Create a cell with added input and/or output dropout.
-
-Dropout is never used on the state.
-
-##### Args:
-
-
-* <b>`cell`</b>: an RNNCell; dropout will be added to its inputs and/or outputs.
-* <b>`input_keep_prob`</b>: unit Tensor or float between 0 and 1, input keep
- probability; if it is float and 1, no input dropout will be added.
-* <b>`output_keep_prob`</b>: unit Tensor or float between 0 and 1, output keep
- probability; if it is float and 1, no output dropout will be added.
-* <b>`seed`</b>: (optional) integer, the randomness seed.
-
-##### Raises:
-
-
-* <b>`TypeError`</b>: if cell is not an RNNCell.
-* <b>`ValueError`</b>: if keep_prob is not between 0 and 1.
-
-
-- - -
-
-#### `tf.contrib.rnn.DropoutWrapper.output_size` {#DropoutWrapper.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.DropoutWrapper.state_size` {#DropoutWrapper.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.DropoutWrapper.zero_state(batch_size, dtype)` {#DropoutWrapper.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.EmbeddingWrapper` {#EmbeddingWrapper}
-
-Operator adding input embedding to the given cell.
-
-Note: in many cases it may be more efficient to not use this wrapper,
-but instead concatenate the whole sequence of your inputs in time,
-do the embedding on this batch-concatenated sequence, then split it and
-feed it into your RNN.
-- - -
-
-#### `tf.contrib.rnn.EmbeddingWrapper.__call__(inputs, state, scope=None)` {#EmbeddingWrapper.__call__}
-
-Run the cell on embedded inputs.
-
-
-- - -
-
-#### `tf.contrib.rnn.EmbeddingWrapper.__init__(cell, embedding_classes, embedding_size, initializer=None)` {#EmbeddingWrapper.__init__}
-
-Create a cell with an added input embedding.
-
-##### Args:
-
-
-* <b>`cell`</b>: an RNNCell, an embedding will be put before its inputs.
-* <b>`embedding_classes`</b>: integer, how many symbols will be embedded.
-* <b>`embedding_size`</b>: integer, the size of the vectors we embed into.
-* <b>`initializer`</b>: an initializer to use when creating the embedding;
- if None, the initializer from variable scope or a default one is used.
-
-##### Raises:
-
-
-* <b>`TypeError`</b>: if cell is not an RNNCell.
-* <b>`ValueError`</b>: if embedding_classes is not positive.
-
-
-- - -
-
-#### `tf.contrib.rnn.EmbeddingWrapper.output_size` {#EmbeddingWrapper.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.EmbeddingWrapper.state_size` {#EmbeddingWrapper.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.EmbeddingWrapper.zero_state(batch_size, dtype)` {#EmbeddingWrapper.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.InputProjectionWrapper` {#InputProjectionWrapper}
-
-Operator adding an input projection to the given cell.
-
-Note: in many cases it may be more efficient to not use this wrapper,
-but instead concatenate the whole sequence of your inputs in time,
-do the projection on this batch-concatenated sequence, then split it.
-- - -
-
-#### `tf.contrib.rnn.InputProjectionWrapper.__call__(inputs, state, scope=None)` {#InputProjectionWrapper.__call__}
-
-Run the input projection and then the cell.
-
-
-- - -
-
-#### `tf.contrib.rnn.InputProjectionWrapper.__init__(cell, num_proj, input_size=None)` {#InputProjectionWrapper.__init__}
-
-Create a cell with input projection.
-
-##### Args:
-
-
-* <b>`cell`</b>: an RNNCell, a projection of inputs is added before it.
-* <b>`num_proj`</b>: Python integer. The dimension to project to.
-* <b>`input_size`</b>: Deprecated and unused.
-
-##### Raises:
-
-
-* <b>`TypeError`</b>: if cell is not an RNNCell.
-
-
-- - -
-
-#### `tf.contrib.rnn.InputProjectionWrapper.output_size` {#InputProjectionWrapper.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.InputProjectionWrapper.state_size` {#InputProjectionWrapper.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.InputProjectionWrapper.zero_state(batch_size, dtype)` {#InputProjectionWrapper.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-
-- - -
-
-### `class tf.contrib.rnn.OutputProjectionWrapper` {#OutputProjectionWrapper}
-
-Operator adding an output projection to the given cell.
-
-Note: in many cases it may be more efficient to not use this wrapper,
-but instead concatenate the whole sequence of your outputs in time,
-do the projection on this batch-concatenated sequence, then split it
-if needed, or feed it directly into a softmax.
-- - -
-
-#### `tf.contrib.rnn.OutputProjectionWrapper.__call__(inputs, state, scope=None)` {#OutputProjectionWrapper.__call__}
-
-Run the cell and output projection on inputs, starting from state.
-
-
-- - -
-
-#### `tf.contrib.rnn.OutputProjectionWrapper.__init__(cell, output_size)` {#OutputProjectionWrapper.__init__}
-
-Create a cell with output projection.
-
-##### Args:
-
-
-* <b>`cell`</b>: an RNNCell, a projection to output_size is added to it.
-* <b>`output_size`</b>: integer, the size of the output after projection.
-
-##### Raises:
-
-
-* <b>`TypeError`</b>: if cell is not an RNNCell.
-* <b>`ValueError`</b>: if output_size is not positive.
-
-
-- - -
-
-#### `tf.contrib.rnn.OutputProjectionWrapper.output_size` {#OutputProjectionWrapper.output_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.OutputProjectionWrapper.state_size` {#OutputProjectionWrapper.state_size}
-
-
-
-
-- - -
-
-#### `tf.contrib.rnn.OutputProjectionWrapper.zero_state(batch_size, dtype)` {#OutputProjectionWrapper.zero_state}
-
-Return zero-filled state tensor(s).
-
-##### Args:
-
-
-* <b>`batch_size`</b>: int, float, or unit Tensor representing the batch size.
-* <b>`dtype`</b>: the data type to use for the state.
-
-##### Returns:
-
- If `state_size` is an int or TensorShape, then the return value is a
- `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
-
- If `state_size` is a nested list or tuple, then the return value is
- a nested list or tuple (of the same structure) of `2-D` tensors with
-the shapes `[batch_size x s]` for each s in `state_size`.
-
-
-