| Commit message | Author | Age |
The core of the change is to have the gradient tape capture
distributed variables instead of plain ResourceVariables.
In other words, we move the distribution awareness from defun
down to the tape and rely on distributed-variable magic to provide
the right variable at runtime.
In tower context, we always watch the container (e.g. MirroredVariable).
In cross tower context, we always watch all the components.
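The watch behavior above can be sketched with today's public tf.distribute API (a minimal single-CPU setup for illustration; the tower-era names in this log later became "replica"):

```python
import tensorflow as tf

# Single-device MirroredStrategy purely for illustration.
strategy = tf.distribute.MirroredStrategy(devices=["/cpu:0"])

with strategy.scope():
    v = tf.Variable(2.0)  # a distributed (mirrored) variable

def replica_fn(x):
    # In replica (tower) context the tape watches the distributed
    # variable itself, not a plain ResourceVariable component.
    with tf.GradientTape() as tape:
        y = v * x
    return tape.gradient(y, v)  # dy/dv == x

grad = strategy.run(replica_fn, args=(tf.constant(3.0),))
local_grad = strategy.experimental_local_results(grad)[0]
```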
PiperOrigin-RevId: 216430530
PiperOrigin-RevId: 215639962
output depends on the updates across all mirrors. Before this change,
update() would return a Mirrored value where each component was
an update to a single mirror. This caused a problem: for reading
purposes, other DistributionStrategy methods would consider it okay
to read any single component, so something like
session.run(strategy.update(...)) would only perform the
update on one replica. The fix is to have the output be a Mirrored
value that is actually the identity operation returning the output on
that device, but with a control dependency making sure that the
update actually happens on all the replicas. This fix was already
present in MirroredVariable._assign_func; this CL moves the fix into
update() and generalizes it to multiple return values.
To disable this new grouping behavior, you may now pass
"grouped=False" to update(). For example, some callers (like Optimizer)
perform a lot of updates and prefer to group all of them
together at once for performance reasons. In this case, we still want
to make sure the caller executes the update on all replicas, so we
return an unwrapped value instead of a Mirrored value. This has the
happy side effect of removing a bunch of unwrap calls in client code,
since unwrapping was the only safe way to use the Mirrored value we
used to return.
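The control-dependency trick described above (returning an identity of the output that forces the update to run first) can be sketched in plain v1-style graph code; this is a generic illustration, not the DistributionStrategy internals:

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

v = tf.Variable(0.0)
update = v.assign_add(1.0)

# Reading `out` carries a control dependency on `update`, so the update
# cannot be skipped the way reading one component of a Mirrored value could.
with tf.control_dependencies([update]):
    out = tf.identity(v)

with tf.compat.v1.Session() as sess:
    sess.run(tf.compat.v1.global_variables_initializer())
    result = sess.run(out)  # runs the update, then reads the variable
```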
PiperOrigin-RevId: 215301909
PiperOrigin-RevId: 214989908
We will re-enable it when it is more robust.
PiperOrigin-RevId: 214956066
distribution strategies. That is always the appropriate option.
In the existing code, we would set it to a partially specified "worker" name that was ambiguous and ended up on the GPU.
PiperOrigin-RevId: 214882658
supported in Graph mode using initializable iterators. In a subsequent change, we'll add support for Eager mode as well.
This removes prefetching_ops_v2 code.
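Initializable iterators, the Graph-mode mechanism this change builds on, look like this in the v1 API (a generic sketch, not the internal prefetching code):

```python
import tensorflow as tf

tf.compat.v1.disable_eager_execution()

dataset = tf.data.Dataset.range(3)
iterator = tf.compat.v1.data.make_initializable_iterator(dataset)
next_elem = iterator.get_next()

with tf.compat.v1.Session() as sess:
    # The explicit initializer op is what allows re-initialization per device.
    sess.run(iterator.initializer)
    values = [int(sess.run(next_elem)) for _ in range(3)]
```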
PiperOrigin-RevId: 214546754
Parameter server strategy where variables are shared across sessions.
PiperOrigin-RevId: 211573447
PiperOrigin-RevId: 211008923
step counter. This allows us to get rid of the increment_var()
function and just use a standard assign_add().
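With a real variable underneath, the standard `assign_add` suffices (minimal sketch):

```python
import tensorflow as tf

# A step counter as a plain variable; no special increment_var() helper needed.
step = tf.Variable(0, dtype=tf.int64)
step.assign_add(1)  # standard in-place increment
```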
PiperOrigin-RevId: 210743165
PiperOrigin-RevId: 210669284
wrapper for variables in collections instead of what it wraps.
PiperOrigin-RevId: 210107528
coordinator in MirroredStrategy.
PiperOrigin-RevId: 209848375
PiperOrigin-RevId: 209099475
PiperOrigin-RevId: 209064406
PiperOrigin-RevId: 208929959
this for MirroredStrategy and OneDeviceStrategy. Implemented in TPUStrategy earlier.
PiperOrigin-RevId: 207961939
PiperOrigin-RevId: 207649000
Before this change, when a function was called in a distribution
strategy context, it would capture the component variables from some
device and always use those variables, even when the function was
executed on a different device.
This CL "reevaluates" distributed variables to get the correct variable
at call time. These correct variables are then passed to the function.
We don't handle distributed tensors. First, because the mechanics for handling
distributed tensors are different from handling distributed variables,
supporting them would add significant complexity to already complex defuns.
Second, there is no easy way for users to have a function capture a distributed
tensor or feed a distributed tensor explicitly. If this changes, we can
support them (the code exists in this CL's history).
We also don't handle distributed variables explicitly passed into the
function, for similar reasons.
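The call-time resolution this describes is visible in today's tf.function, which re-reads captured variables on every call (a minimal sketch using the modern API rather than the contrib-era defun):

```python
import tensorflow as tf

v = tf.Variable(1.0)

@tf.function
def f():
    # `v` is captured by reference and resolved when f is called,
    # not frozen to the value seen at trace time.
    return v + 1.0

first = float(f())
v.assign(5.0)
second = float(f())  # sees the updated variable
```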
PiperOrigin-RevId: 207640908
PiperOrigin-RevId: 207348195
PiperOrigin-RevId: 207294037
PiperOrigin-RevId: 207215423
PiperOrigin-RevId: 206289143
PiperOrigin-RevId: 202983273
will be used for distributed variables.
Add Enum `VariableSynchronization` with values for `synchronization`: AUTO, UNREPLICATED, ON_WRITE, ON_READ.
Add Enum `VariableAggregation` with values for `aggregation`: NONE, SUM, MEAN. Replace all the aggregation-method strings in distribution strategy with the enum values.
Update MirroredStrategy to use these parameters to decide whether a variable should be Mirrored or TowerLocal.
Update the different distribution strategy value types to use the `VariableAggregation` enum.
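These enums became part of the public API; a variable can declare its distribution behavior directly (hypothetical values chosen for illustration):

```python
import tensorflow as tf

# ON_READ variables must be non-trainable; `aggregation` controls how
# per-replica values are combined when read in cross-replica context.
v = tf.Variable(
    0.0,
    trainable=False,
    synchronization=tf.VariableSynchronization.ON_READ,
    aggregation=tf.VariableAggregation.MEAN,
)
```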
PiperOrigin-RevId: 202736077
in cross tower and tower context.
PiperOrigin-RevId: 202162272
PiperOrigin-RevId: 201582230
so we can delete it. Frequently we can now delete the call entirely,
but in other cases we switch to read_var().
This revealed some bugs also fixed in this CL:
* For MirroredStrategy: fix read_var(mean_tower_local) bug.
* Support get() for Mirrored values that are not MirroredVariables,
and make them DistributedDelegates so we can operate on them in
cross-tower mode.
* Actually iterate through the available devices in MirroredStrategy.get().
With this and already-submitted 201390698, we can pass mirrored
variables and other mirrored values directly to self.evaluate() in
tests.
PiperOrigin-RevId: 201435436
PiperOrigin-RevId: 201379124
PiperOrigin-RevId: 201232817
sometimes a variable.
PiperOrigin-RevId: 200231463
PiperOrigin-RevId: 200209039
PiperOrigin-RevId: 199319758
PiperOrigin-RevId: 199241723
Now, after tf.enable_eager_execution() has been executed, entering the context
manager of a tf.Graph will enable graph mode. So, for example
```
tf.enable_eager_execution()
with tf.Graph().as_default():
  c = tf.constant(1.0)  # this is a graph tensor
c2 = tf.constant(1.0)  # this is an eager tensor
```
The main use-case of this is allowing documentation writers to make a single
notebook which starts with eager execution and seamlessly transitions to
building graphs.
This also makes many explicit enablings of graph mode in the code redundant
(a cleanup CL will follow).
PiperOrigin-RevId: 198092991
Bug reported and solution suggested in #19069
PiperOrigin-RevId: 196718454
PiperOrigin-RevId: 195368876
PiperOrigin-RevId: 194976633
dataset.
PiperOrigin-RevId: 193437651
in estimator.
PiperOrigin-RevId: 193394603
PiperOrigin-RevId: 191024677
and MirroredStrategy, and related functionality.
Also add tf.contrib.optimizer_v2, an update to the Optimizer API.
RELNOTES: Can now pass tf.contrib.distribute.MirroredStrategy() to
tf.estimator.RunConfig() to run an Estimator model on multiple GPUs
on one machine.
PiperOrigin-RevId: 190996247