path: root/tensorflow/g3doc/api_docs/python/nn.md
Diffstat (limited to 'tensorflow/g3doc/api_docs/python/nn.md')
-rw-r--r--  tensorflow/g3doc/api_docs/python/nn.md | 120
 1 file changed, 30 insertions(+), 90 deletions(-)
diff --git a/tensorflow/g3doc/api_docs/python/nn.md b/tensorflow/g3doc/api_docs/python/nn.md
index 50c460b68c..b129506107 100644
--- a/tensorflow/g3doc/api_docs/python/nn.md
+++ b/tensorflow/g3doc/api_docs/python/nn.md
@@ -35,7 +35,6 @@ accepted by [`tf.convert_to_tensor`](framework.md#convert_to_tensor).
* [tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)](#softmax_cross_entropy_with_logits)
* [Embeddings](#AUTOGENERATED-embeddings)
* [tf.nn.embedding_lookup(params, ids, name=None)](#embedding_lookup)
- * [tf.nn.embedding_lookup_sparse(params, sp_ids, sp_weights, name=None, combiner='mean')](#embedding_lookup_sparse)
* [Evaluation](#AUTOGENERATED-evaluation)
* [tf.nn.top_k(input, k, name=None)](#top_k)
* [tf.nn.in_top_k(predictions, targets, k, name=None)](#in_top_k)
@@ -130,17 +129,18 @@ sum is unchanged.
By default, each element is kept or dropped independently. If `noise_shape`
is specified, it must be
[broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html)
-to the shape of `x`, and only dimensions with `noise_shape[i] == x.shape[i]`
-will make independent decisions. For example, if `x.shape = [b, x, y, c]` and
-`noise_shape = [b, 1, 1, c]`, each batch and channel component will be
+to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]`
+will make independent decisions. For example, if `shape(x) = [k, l, m, n]`
+and `noise_shape = [k, 1, 1, n]`, each batch and channel component will be
kept independently and each row and column will be kept or not kept together.
##### Args:
* <b>x</b>: A tensor.
-* <b>keep_prob</b>: Float probability that each element is kept.
-* <b>noise_shape</b>: Shape for randomly generated keep/drop flags.
+* <b>keep_prob</b>: A Python float. The probability that each element is kept.
+* <b>noise_shape</b>: A 1-D `Tensor` of type `int32`, representing the
+ shape for randomly generated keep/drop flags.
* <b>seed</b>: A Python integer. Used to create a random seed.
See [`set_random_seed`](constant_op.md#set_random_seed) for behavior.
* <b>name</b>: A name for this operation (optional).
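To make the `noise_shape` broadcasting behavior concrete, here is a minimal
sketch (hypothetical shapes, written against the `tf.nn.dropout` signature
documented above):

    import tensorflow as tf

    # A batch of 4 images, 8x8 pixels, 3 channels: shape(x) = [4, 8, 8, 3].
    x = tf.ones([4, 8, 8, 3])

    # Default: an independent keep/drop decision for every element.
    y_elementwise = tf.nn.dropout(x, keep_prob=0.5)

    # noise_shape = [4, 1, 1, 3]: one decision per (batch, channel) pair,
    # so each row and column is kept or dropped together.
    y_rowcol = tf.nn.dropout(x, keep_prob=0.5, noise_shape=[4, 1, 1, 3])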
@@ -247,10 +247,10 @@ are as follows. If the 4-D `input` has shape
`[batch, in_height, in_width, ...]` and the 4-D `filter` has shape
`[filter_height, filter_width, ...]`, then
- output.shape = [batch,
- (in_height - filter_height + 1) / strides[1],
- (in_width - filter_width + 1) / strides[2],
- ...]
+ shape(output) = [batch,
+ (in_height - filter_height + 1) / strides[1],
+ (in_width - filter_width + 1) / strides[2],
+ ...]
output[b, i, j, :] =
sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, ...] *
@@ -262,7 +262,7 @@ vectors. For `depthwise_conv_2d`, each scalar component `input[b, i, j, k]`
is multiplied by a vector `filter[di, dj, k]`, and all the vectors are
concatenated.
-In the formula for `output.shape`, the rounding direction depends on padding:
+In the formula for `shape(output)`, the division rounds up, and the
+numerator depends on the padding scheme:

* `padding = 'VALID'`: Only full-size windows are considered, giving
  `ceil((in_size - filter_size + 1) / stride)` per spatial dimension.
* `padding = 'SAME'`: The input is zero-padded so that partial windows are
  included, giving `ceil(in_size / stride)` per spatial dimension.
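As a quick sanity check of the two formulas, a minimal sketch (hypothetical
sizes, using the `tf.nn.conv2d` op documented in this section):

    import tensorflow as tf

    x = tf.ones([1, 5, 5, 1])      # in_height = in_width = 5
    w = tf.ones([3, 3, 1, 1])      # 3x3 filter, 1 input and 1 output channel
    strides = [1, 2, 2, 1]

    # 'VALID': ceil((5 - 3 + 1) / 2) = 2 per spatial dimension.
    valid = tf.nn.conv2d(x, w, strides=strides, padding='VALID')  # [1, 2, 2, 1]

    # 'SAME': ceil(5 / 2) = 3 per spatial dimension.
    same = tf.nn.conv2d(x, w, strides=strides, padding='SAME')    # [1, 3, 3, 1]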
@@ -411,7 +411,7 @@ In detail, the output is
for each tuple of indices `i`. The output shape is
- output.shape = (value.shape - ksize + 1) / strides
+ shape(output) = (shape(value) - ksize + 1) / strides
where the rounding direction depends on padding:
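For example, a minimal `tf.nn.max_pool` sketch (hypothetical sizes) showing
the formula under `'VALID'` padding:

    import tensorflow as tf

    value = tf.ones([1, 4, 4, 1])

    # 2x2 max pooling, stride 2, 'VALID' padding:
    # (shape(value) - ksize + 1) / strides = (4 - 2 + 1) / 2, rounded up to 2.
    pooled = tf.nn.max_pool(value, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='VALID')
    # pooled has shape [1, 2, 2, 1]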
@@ -722,103 +722,43 @@ and the same dtype (either `float32` or `float64`).
## Embeddings <div class="md-anchor" id="AUTOGENERATED-embeddings">{#AUTOGENERATED-embeddings}</div>
-TensorFlow provides several operations that help you compute embeddings.
+TensorFlow provides library support for looking up values in embedding
+tensors.
- - -
### tf.nn.embedding_lookup(params, ids, name=None) <div class="md-anchor" id="embedding_lookup">{#embedding_lookup}</div>
-Return a tensor of embedding values by looking up "ids" in "params".
+Looks up `ids` in a list of embedding tensors.
-##### Args:
-
-
-* <b>params</b>: List of tensors of the same shape. A single tensor is
- treated as a singleton list.
-* <b>ids</b>: Tensor of integers containing the ids to be looked up in
- 'params'. Let P be len(params). If P > 1, then the ids are
- partitioned by id % P, and we do separate lookups in params[p]
- for 0 <= p < P, and then stitch the results back together into
- a single result tensor.
-* <b>name</b>: Optional name for the op.
-
-##### Returns:
-
- A tensor of shape ids.shape + params[0].shape[1:] containing the
- values params[i % P][i] for each i in ids.
-
-##### Raises:
-
-
-* <b>ValueError</b>: if some parameters are invalid.
+This function is used to perform parallel lookups on the list of
+tensors in `params`. It is a generalization of
+[`tf.gather()`](array_ops.md#gather), where `params` is interpreted
+as a partition of a larger embedding tensor.
+If `len(params) > 1`, each element `id` of `ids` is partitioned between
+the elements of `params` by computing `p = id % len(params)`, and is
+then used to look up the slice `params[p][id // len(params), ...]`.
-- - -
-
-### tf.nn.embedding_lookup_sparse(params, sp_ids, sp_weights, name=None, combiner='mean') <div class="md-anchor" id="embedding_lookup_sparse">{#embedding_lookup_sparse}</div>
-
-Computes embeddings for the given ids and weights.
-
-This op assumes that there is at least one id for each row in the dense tensor
-represented by sp_ids (i.e. there are no rows with empty features), and that
-all the indices of sp_ids are in canonical row-major order.
-
-It also assumes that all id values lie in the range [0, p0), where p0
-is the sum of the size of params along dimension 0.
+The results of the lookup are then concatenated into a dense
+tensor. The returned tensor has shape `shape(ids) + shape(params[0])[1:]`.
##### Args:
-* <b>params</b>: A single tensor representing the complete embedding tensor,
- or a list of P tensors all of same shape except for the first dimension,
- representing sharded embedding tensors. In the latter case, the ids are
- partitioned by id % P, and we do separate lookups in params[p] for
- 0 <= p < P, and then stitch the results back together into a single
- result tensor. The first dimension is allowed to vary as the vocab
- size is not necessarily a multiple of P.
-* <b>sp_ids</b>: N x M SparseTensor of int64 ids (typically from FeatureValueToId),
- where N is typically batch size and M is arbitrary.
-* <b>sp_weights</b>: either a SparseTensor of float / double weights, or None to
- indicate all weights should be taken to be 1. If specified, sp_weights
- must have exactly the same shape and indices as sp_ids.
-* <b>name</b>: Optional name for the op.
-* <b>combiner</b>: A string specifying the reduction op. Currently "mean" and "sum"
- are supported.
- "sum" computes the weighted sum of the embedding results for each row.
- "mean" is the weighted sum divided by the total weight.
+* <b>params</b>: A list of tensors with the same shape and type.
+* <b>ids</b>: A `Tensor` with type `int32` containing the ids to be looked
+ up in `params`.
+* <b>name</b>: A name for the operation (optional).
##### Returns:
- A dense tensor representing the combined embeddings for the
- sparse ids. For each row in the dense tensor represented by sp_ids, the op
- looks up the embeddings for all ids in that row, multiplies them by the
- corresponding weight, and combines these embeddings as specified.
-
- In other words, if
- shape(combined params) = [p0, p1, ..., pm]
- and
- shape(sp_ids) = shape(sp_weights) = [d0, d1, ..., dn]
- then
- shape(output) = [d0, d1, ..., dn-1, p1, ..., pm].
-
- For instance, if params is a 10x20 matrix, and sp_ids / sp_weights are
-
- [0, 0]: id 1, weight 2.0
- [0, 1]: id 3, weight 0.5
- [1, 0]: id 0, weight 1.0
- [2, 3]: id 1, weight 3.0
-
- with combiner="mean", then the output will be a 3x20 matrix where
- output[0, :] = (params[1, :] * 2.0 + params[3, :] * 0.5) / (2.0 + 0.5)
- output[1, :] = params[0, :] * 1.0
- output[2, :] = params[1, :] * 3.0
+ A `Tensor` with the same type as the tensors in `params`.
##### Raises:
-* <b>TypeError</b>: If sp_ids is not a SparseTensor, or if sp_weights is neither
- None nor SparseTensor.
-* <b>ValueError</b>: If combiner is not one of {"mean", "sum"}.
+* <b>ValueError</b>: If `params` is empty.
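To make the `id % len(params)` partitioning concrete, here is a minimal
sketch (hypothetical values, written against the `tf.nn.embedding_lookup`
signature documented above):

    import tensorflow as tf

    # A 6 x 2 embedding table stored as two shards of 3 rows each.
    # Row id i lives in shard i % 2, at offset i // 2 within that shard.
    shard0 = tf.constant([[0., 0.], [2., 2.], [4., 4.]])  # ids 0, 2, 4
    shard1 = tf.constant([[1., 1.], [3., 3.], [5., 5.]])  # ids 1, 3, 5

    ids = tf.constant([0, 3, 4])
    emb = tf.nn.embedding_lookup([shard0, shard1], ids)
    # shape(emb) = shape(ids) + shape(params[0])[1:] = [3, 2]
    # emb == [[0., 0.], [3., 3.], [4., 4.]]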