-rw-r--r--  tensorflow/g3doc/api_docs/python/index.md      |   2
-rw-r--r--  tensorflow/g3doc/api_docs/python/state_ops.md  | 131
2 files changed, 121 insertions, 12 deletions
diff --git a/tensorflow/g3doc/api_docs/python/index.md b/tensorflow/g3doc/api_docs/python/index.md
index 8e5fd82dc9..ec080cda9c 100644
--- a/tensorflow/g3doc/api_docs/python/index.md
+++ b/tensorflow/g3doc/api_docs/python/index.md
@@ -90,12 +90,14 @@
* [`scatter_add`](../../api_docs/python/state_ops.md#scatter_add)
* [`scatter_sub`](../../api_docs/python/state_ops.md#scatter_sub)
* [`scatter_update`](../../api_docs/python/state_ops.md#scatter_update)
+ * [`sharded_uniform_unit_scaling_initializer`](../../api_docs/python/state_ops.md#sharded_uniform_unit_scaling_initializer)
* [`sparse_mask`](../../api_docs/python/state_ops.md#sparse_mask)
* [`trainable_variables`](../../api_docs/python/state_ops.md#trainable_variables)
* [`truncated_normal_initializer`](../../api_docs/python/state_ops.md#truncated_normal_initializer)
* [`uniform_unit_scaling_initializer`](../../api_docs/python/state_ops.md#uniform_unit_scaling_initializer)
* [`update_checkpoint_state`](../../api_docs/python/state_ops.md#update_checkpoint_state)
* [`Variable`](../../api_docs/python/state_ops.md#Variable)
+ * [`variable_axis_size_partitioner`](../../api_docs/python/state_ops.md#variable_axis_size_partitioner)
* [`variable_op_scope`](../../api_docs/python/state_ops.md#variable_op_scope)
* [`variable_scope`](../../api_docs/python/state_ops.md#variable_scope)
* [`VariableScope`](../../api_docs/python/state_ops.md#VariableScope)
diff --git a/tensorflow/g3doc/api_docs/python/state_ops.md b/tensorflow/g3doc/api_docs/python/state_ops.md
index 180dc07a77..6d3b039d00 100644
--- a/tensorflow/g3doc/api_docs/python/state_ops.md
+++ b/tensorflow/g3doc/api_docs/python/state_ops.md
@@ -1033,7 +1033,7 @@ create variables contingent on certain conditions.
- - -
-### `tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None, validate_shape=True)` {#get_variable}
+### `tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True)` {#get_variable}
Gets an existing variable with these parameters or creates a new one.
@@ -1059,20 +1059,37 @@ Similarly, if the regularizer is `None` (the default), the default regularizer
passed in the variable scope will be used (if that is `None` too,
then by default no regularization is performed).
+If a partitioner is provided, first a sharded `Variable` is created
+via `_get_partitioned_variable_list`, and the return value is a
+`Tensor` composed of the shards concatenated along the partition axis.
+
+Some useful partitioners are available. See, e.g.,
+`variable_axis_size_partitioner`.
+
##### Args:
-* <b>`name`</b>: the name of the new or existing variable.
-* <b>`shape`</b>: shape of the new or existing variable.
-* <b>`dtype`</b>: type of the new or existing variable (defaults to `DT_FLOAT`).
-* <b>`initializer`</b>: initializer for the variable if one is created.
-* <b>`regularizer`</b>: a (Tensor -> Tensor or None) function; the result of
+* <b>`name`</b>: The name of the new or existing variable.
+* <b>`shape`</b>: Shape of the new or existing variable.
+* <b>`dtype`</b>: Type of the new or existing variable (defaults to `DT_FLOAT`).
+* <b>`initializer`</b>: Initializer for the variable if one is created.
+* <b>`regularizer`</b>: A (Tensor -> Tensor or None) function; the result of
applying it on a newly created variable will be added to the collection
GraphKeys.REGULARIZATION_LOSSES and can be used for regularization.
* <b>`trainable`</b>: If `True` also add the variable to the graph collection
`GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).
* <b>`collections`</b>: List of graph collections keys to add the Variable to.
Defaults to `[GraphKeys.VARIABLES]` (see tf.Variable).
+ If partitioning is enabled and used, the concatenated return value
+ is also added to collection `GraphKeys.CONCATENATED_VARIABLES`.
+* <b>`caching_device`</b>: Optional device string or function describing where the
+ Variable should be cached for reading. Defaults to the Variable's
+ device. If not `None`, caches on another device. Typical use is to
+ cache on the device where the Ops using the Variable reside, to
+ deduplicate copying through `Switch` and other conditional statements.
+* <b>`partitioner`</b>: Optional callable that accepts a fully defined `TensorShape`
+ and `dtype` of the Variable to be created, and returns a list of
+ partitions for each axis (currently only one axis can be partitioned).
* <b>`validate_shape`</b>: If False, allows the variable to be initialized with a
value of unknown shape. If True, the default, the shape of initial_value
must be known.
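
To make the new `partitioner` contract concrete, here is a minimal sketch (not part of the commit) of passing a hand-written partitioner straight to `tf.get_variable`, using only the signature and `Args` described in the hunk above. The function name, scope name, and shapes are illustrative; the return convention, one partition count per axis with only one axis partitioned, follows the `partitioner` description above.

```python
import tensorflow as tf

def split_rows_in_two(shape, dtype):
    # Partitioner contract (per the Args above): given the fully defined
    # TensorShape and dtype of the variable, return a list with one
    # partition count per axis.  Only one axis may currently be
    # partitioned, so every other axis stays at 1.
    return [2] + [1] * (shape.ndims - 1)

with tf.variable_scope("embedding_layer"):
    # Two shards are created internally; the value returned here is their
    # concatenation along axis 0, so downstream code sees shape [10000, 128].
    embeddings = tf.get_variable(
        "embeddings",
        shape=[10000, 128],
        initializer=tf.uniform_unit_scaling_initializer(),
        partitioner=split_rows_in_two)
```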
@@ -1105,10 +1122,11 @@ Attributes:
reuse: Boolean or None, setting the reuse in get_variable.
caching_device: string, callable, or None: the caching device passed to
get_variable.
- name_scope: The name passed to tf.name_scope.
+ partitioner: callable or `None`: the partitioner passed to `get_variable`.
+ name_scope: The name passed to `tf.name_scope`.
- - -
-#### `tf.VariableScope.__init__(reuse, name='', initializer=None, regularizer=None, caching_device=None, name_scope='')` {#VariableScope.__init__}
+#### `tf.VariableScope.__init__(reuse, name='', initializer=None, regularizer=None, caching_device=None, partitioner=None, name_scope='')` {#VariableScope.__init__}
Creates a new VariableScope with the given properties.
@@ -1122,7 +1140,7 @@ Creates a new VariableScope with the given properties.
- - -
-#### `tf.VariableScope.get_variable(var_store, name, shape=None, dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, validate_shape=True)` {#VariableScope.get_variable}
+#### `tf.VariableScope.get_variable(var_store, name, shape=None, dtype=tf.float32, initializer=None, regularizer=None, trainable=True, collections=None, caching_device=None, partitioner=None, validate_shape=True)` {#VariableScope.get_variable}
Gets an existing variable with this name or creates a new one.
@@ -1143,6 +1161,13 @@ Gets an existing variable with this name or create a new one.
- - -
+#### `tf.VariableScope.partitioner` {#VariableScope.partitioner}
+
+
+
+
+- - -
+
#### `tf.VariableScope.regularizer` {#VariableScope.regularizer}
@@ -1178,6 +1203,13 @@ Set initializer for this scope.
- - -
+#### `tf.VariableScope.set_partitioner(partitioner)` {#VariableScope.set_partitioner}
+
+Set partitioner for this scope.
+
+
+- - -
+
#### `tf.VariableScope.set_regularizer(regularizer)` {#VariableScope.set_regularizer}
Set regularizer for this scope.
@@ -1186,7 +1218,7 @@ Set regularizer for this scope.
- - -
-### `tf.variable_scope(name_or_scope, reuse=None, initializer=None, regularizer=None, caching_device=None)` {#variable_scope}
+### `tf.variable_scope(name_or_scope, reuse=None, initializer=None, regularizer=None, caching_device=None, partitioner=None)` {#variable_scope}
Returns a context for variable scope.
@@ -1255,6 +1287,7 @@ then all its sub-scopes become reusing as well.
* <b>`initializer`</b>: default initializer for variables within this scope.
* <b>`regularizer`</b>: default regularizer for variables within this scope.
* <b>`caching_device`</b>: default caching device for variables within this scope.
+* <b>`partitioner`</b>: default partitioner for variables within this scope.
##### Returns:
@@ -1270,7 +1303,7 @@ then all its sub-scopes become reusing as well.
- - -
-### `tf.variable_op_scope(values, name_or_scope, default_name=None, initializer=None, regularizer=None, caching_device=None, reuse=None)` {#variable_op_scope}
+### `tf.variable_op_scope(values, name_or_scope, default_name=None, initializer=None, regularizer=None, caching_device=None, partitioner=None, reuse=None)` {#variable_op_scope}
Returns a context manager for defining an op that creates variables.
@@ -1307,9 +1340,10 @@ def my_op_with_vars(a, b, scope=None):
* <b>`default_name`</b>: The default name to use if the `name_or_scope` argument is
  `None`; in that case this name will be uniquified. If `name_or_scope` is
  provided, `default_name` is not used, so it is not required and can be `None`.
-* <b>`initializer`</b>: The default initializer to pass to variable scope.
+* <b>`initializer`</b>: The default initializer to pass to variable scope.
* <b>`regularizer`</b>: The default regularizer for variables within this scope.
* <b>`caching_device`</b>: The default caching device for variables within this scope.
+* <b>`partitioner`</b>: The default partitioner for variables within this scope.
* <b>`reuse`</b>: `True` or `None`; if `True`, we go into reuse mode for this scope as
well as all sub-scopes; if `None`, we just inherit the parent scope reuse.
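
The same `partitioner` argument can also be supplied as a scope-wide default through `tf.variable_scope` (or `tf.variable_op_scope`), per the signatures shown above. A hedged sketch, assuming the usual scope-default behavior (an explicit `partitioner` passed to `get_variable` would take precedence); the names and shapes are hypothetical:

```python
import tensorflow as tf

# Hypothetical scope-wide default: split the outermost axis of each
# variable into four shards.  The callable follows get_variable's
# partitioner contract: (fully defined TensorShape, dtype) -> one
# partition count per axis.
four_way_split = lambda shape, dtype: [4] + [1] * (shape.ndims - 1)

with tf.variable_scope("softmax", partitioner=four_way_split):
    # Both variables inherit the scope's default partitioner.
    weights = tf.get_variable("weights", shape=[2048, 50000])
    biases = tf.get_variable("biases", shape=[50000])
```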
@@ -1596,6 +1630,42 @@ numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15.
- - -
+### `tf.sharded_uniform_unit_scaling_initializer(shape, factor=1.0, seed=None, dtype=tf.float32)` {#sharded_uniform_unit_scaling_initializer}
+
+Returns a uniform_unit_scaling_initializer with scale based on `shape`.
+
+This initializer works identically to `uniform_unit_scaling_initializer`,
+but the scale is based on a predefined shape. This is useful when a
+`Variable` is being partitioned across several shards, and each shard
+has a smaller shape than the whole. Since the shards are usually
+concatenated when used, the scale should be based on the shape of the whole.
+
+See the documentation of `uniform_unit_scaling_initializer` for details
+on how the scale of the input variance is kept constant.
+
+##### Args:
+
+
+* <b>`shape`</b>: A list or tuple of ints to use for the scaling calculation.
+* <b>`factor`</b>: Float. A multiplicative factor by which the values will be scaled.
+* <b>`seed`</b>: A Python integer. Used to create random seeds. See
+ [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed)
+ for behavior.
+* <b>`dtype`</b>: The data type. Only floating point types are supported.
+
+##### Returns:
+
+ An initializer that generates sub-tensors of a unit variance tensor
+ (so long as that tensor is not being sharded along its innermost dimension).
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: if `dtype` is not a floating point type.
+
+
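
A minimal sketch of the intended use, based on the description above: when one logical variable is kept as manually created shards, each shard's initializer is scaled by the full shape rather than by the shard's own shape. The shapes and variable names are illustrative, and the old-style `tf.concat(concat_dim, values)` argument order of this TensorFlow version is assumed.

```python
import tensorflow as tf

full_shape = [1024, 256]  # shape of the whole (logical) variable

# Scale as if initializing a [1024, 256] tensor, even though each shard
# below only holds half of the rows.
init = tf.sharded_uniform_unit_scaling_initializer(shape=full_shape)

shard_0 = tf.get_variable("w_0", shape=[512, 256], initializer=init)
shard_1 = tf.get_variable("w_1", shape=[512, 256], initializer=init)

# Concatenated along axis 0, the shards act as one [1024, 256] weight
# matrix whose initialization matches a single unsharded variable.
w = tf.concat(0, [shard_0, shard_1])
```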
+- - -
+
### `tf.zeros_initializer(shape, dtype=tf.float32)` {#zeros_initializer}
An adaptor for zeros() to match the Initializer spec.
@@ -1609,6 +1679,43 @@ An adaptor for ones() to match the Initializer spec.
+## Variable Partitioners for Sharding
+
+- - -
+
+### `tf.variable_axis_size_partitioner(max_shard_bytes, axis=0, bytes_per_string_element=16)` {#variable_axis_size_partitioner}
+
+Get a partitioner for VariableScope to keep shards below `max_shard_bytes`.
+
+This partitioner will shard a Variable along one axis, attempting to keep
+the maximum shard size below `max_shard_bytes`. In practice, this is not
+always possible when sharding along only one axis. When this happens,
+this axis is sharded as much as possible (i.e., each slice along that
+axis becomes its own shard).
+
+One reasonable value for `max_shard_bytes` is `(64 << 20) - 1`, or almost
+`64MB`, to keep below the protobuf byte limit.
+
+##### Args:
+
+
+* <b>`max_shard_bytes`</b>: The maximum size any given shard is allowed to be.
+* <b>`axis`</b>: The axis to partition along. Default: outermost axis.
+* <b>`bytes_per_string_element`</b>: If the `Variable` is of type string, this provides
+ an estimate of how large each scalar in the `Variable` is.
+
+##### Returns:
+
+ A partition function usable as the `partitioner` argument to
+ `variable_scope`, `get_variable`, and `get_partitioned_variable_list`.
+
+##### Raises:
+
+
+* <b>`ValueError`</b>: If any of the byte counts are non-positive.
+
+
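
To illustrate the sizing arithmetic (a sketch under the assumptions above, with hypothetical names and shapes): a `[2048, 1024]` float32 variable occupies 2048 * 1024 * 4 bytes = 8MiB, so with a 2MiB shard limit the partitioner should request four shards along axis 0.

```python
import tensorflow as tf

# Keep each shard at or below 2MiB, sharding along the outermost axis.
partitioner = tf.variable_axis_size_partitioner(max_shard_bytes=2 << 20,
                                                axis=0)

# Typical use: as the default partitioner for a scope (or passed directly
# to get_variable).  A [2048, 1024] float32 variable is 8MiB, so it should
# come back as the concatenation of four [512, 1024] shards along axis 0.
with tf.variable_scope("logits", partitioner=partitioner):
    weights = tf.get_variable("weights", shape=[2048, 1024],
                              dtype=tf.float32)
```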
+
## Sparse Variable Updates
The sparse update ops modify a subset of the entries in a dense `Variable`,