| Commit message | Author | Age |
PiperOrigin-RevId: 211387503
PiperOrigin-RevId: 211378182
PiperOrigin-RevId: 211378028
PiperOrigin-RevId: 211377977
The autotune code assumes a clean slate, but there might be work from
previous program executions still pending on the streams owned by the executor.
Do a full host-device sync before autotuning to flush out any pending work.
I'm still somewhat confused about how autotune can interfere with other buffers;
there might be more things going wrong.
PiperOrigin-RevId: 211369162
We use the same trick that is used in the TPU backend.
PiperOrigin-RevId: 211344106
cuDNN supports grouped convolutions, so we don't need the
ConvolutionFeatureGroupConverter pass and can instead set the group_count
parameter on the cuDNN custom calls.
PiperOrigin-RevId: 211339551
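As background (not part of this commit), the group_count semantics can be illustrated with a minimal pure-Python sketch of a grouped pointwise (1x1) convolution: each group of input channels feeds only its own slice of output channels. The function name and list-based shapes here are hypothetical, purely for illustration.

```python
def grouped_pointwise_conv(x, w, groups):
    """Grouped 1x1 convolution at a single spatial position.

    x: list of C_in input channel values.
    w: list of C_out weight rows, each of length C_in // groups.
    Group g maps input channels [g*C_in/groups, (g+1)*C_in/groups)
    to output channels [g*C_out/groups, (g+1)*C_out/groups).
    """
    c_in, c_out = len(x), len(w)
    in_per, out_per = c_in // groups, c_out // groups
    out = []
    for g in range(groups):
        xs = x[g * in_per:(g + 1) * in_per]  # this group's input slice
        for o in range(out_per):
            row = w[g * out_per + o]  # weights for one output channel
            out.append(sum(a * b for a, b in zip(xs, row)))
    return out
```

With groups=1 this degenerates to an ordinary dense channel mixing; a higher group_count simply partitions the channels into independent convolutions.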
PiperOrigin-RevId: 211339000
PiperOrigin-RevId: 211323840
Reinstate the use of the integral-exponent power function MathUtil::IPow, but make sure to use a floating-point base so that the result is computed using floating-point arithmetic. This behaviour is equivalent to, but faster than, std::pow.
Note that care must be taken to convert the base to double, which we effect by providing an explicit template type argument for MathUtil::IPow.
PiperOrigin-RevId: 211290304
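The integral-exponent trick behind MathUtil::IPow is exponentiation by squaring; a minimal Python sketch (hypothetical name, not the actual MathUtil implementation) shows why the base must be converted to floating point before the multiplications:

```python
def ipow(base, exp):
    """Raise a base to a non-negative integer exponent by repeated squaring.

    Uses O(log exp) multiplications, all performed in floating point,
    mirroring the idea of calling an integer-exponent power function
    with an explicitly double-typed base.
    """
    if exp < 0:
        raise ValueError("exponent must be non-negative")
    base = float(base)  # the crucial conversion: compute in floating point
    result = 1.0
    while exp:
        if exp & 1:       # multiply in the current squared base when the bit is set
            result *= base
        base *= base
        exp >>= 1
    return result
```

Without the float conversion, an integer base would accumulate in exact integer arithmetic, which is both slower for large exponents and not what std::pow-equivalent behaviour requires.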
Happened to observe this come up in a linear algebra workload.
PiperOrigin-RevId: 211290278
PiperOrigin-RevId: 211257009
PiperOrigin-RevId: 211226585
This allows fine-grained control over recording in some cases, for example the
following, where we want d2y but not d2z:
x1 = tf.Variable(2.0, trainable=False)
x2 = tf.Variable(2.0, trainable=False)
with tf.GradientTape() as tape1:
  with tf.GradientTape() as tape2:
    tape1.watch(x1)
    tape2.watch([x1, x2])
    y = x1 ** 3
    z = x2 ** 2
  dy, dz = tape2.gradient([y, z], [x1, x2])
d2y, d2z = tape1.gradient([dy, dz], [x1, x2])
assert d2z is None
PiperOrigin-RevId: 211206506
PiperOrigin-RevId: 211204708
StringPiece and string_view are the same now; there is no need to convert between them.
PiperOrigin-RevId: 211195959
PiperOrigin-RevId: 211195689
This required an absl version bump past 5e7d459eeca7bc53deab0ee9634601386b53d7c0
PiperOrigin-RevId: 211195261
PiperOrigin-RevId: 211188683
The current implementation queries the global collection for the ready op, so there is no need for a per-tower ready op.
PiperOrigin-RevId: 211187544
PiperOrigin-RevId: 211180182
optimization.
PiperOrigin-RevId: 211179990
PiperOrigin-RevId: 211178634
PiperOrigin-RevId: 211175130
Instead, call it the "buffer table"; it now contains both entry computation
parameters and temporaries.
PiperOrigin-RevId: 211171651
PiperOrigin-RevId: 211169413
Fixes #21756
PiperOrigin-RevId: 211168797
PiperOrigin-RevId: 211167699
PiperOrigin-RevId: 211167333
Fixed #17850
PiperOrigin-RevId: 211166112
PiperOrigin-RevId: 211165943
clusters with std servers.
PiperOrigin-RevId: 211165860
PiperOrigin-RevId: 211165811
Error if fn closes over Tensor or Variable (not always detectable)
Allow None gradients to some inputs (filter out Nones before control_deps)
PiperOrigin-RevId: 211162615
PiperOrigin-RevId: 211162510
PiperOrigin-RevId: 211162384
Add M^1/2 to reduce condition numbers, before computing inverse pth root.
PiperOrigin-RevId: 211162032
Rollback of rollback. Fix: make access to collective_graph_key thread-safe.
The original change introduced a collective_graph_key_ integer to DirectSession but did not protect accesses to it. This change protects access with a mutex.
Automated rollback of commit cb9443831283c2366e3dd91001db6362d6594f66
PiperOrigin-RevId: 211161961
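The guarding pattern described above, transposed to Python for illustration (the class and member names here are hypothetical, not DirectSession's actual C++ members):

```python
import threading

class Session:
    """Sketch of guarding a lazily assigned integer with a mutex."""

    def __init__(self):
        self._graph_key_lock = threading.Lock()
        self._collective_graph_key = None  # unset until first use

    def get_or_create_graph_key(self, new_key):
        # All reads and writes of the shared integer happen under the lock,
        # so concurrent callers cannot observe a half-initialized value and
        # only the first caller's key is ever stored.
        with self._graph_key_lock:
            if self._collective_graph_key is None:
                self._collective_graph_key = new_key
            return self._collective_graph_key
```

The original bug was exactly the absence of such a lock: unsynchronized reads of the key could race with the first write.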
PiperOrigin-RevId: 211161790
PiperOrigin-RevId: 211161172
PiperOrigin-RevId: 211160708
PiperOrigin-RevId: 211159438
of ParameterServerStrategy. We enable multi-worker training only through distribute coordinator.
PiperOrigin-RevId: 211159386
PiperOrigin-RevId: 211158585
With this pipelining change (and a little bit of re-tuning of input pipelines to have _less_ parallelism to avoid thread starvation), we are able to significantly reduce the overheads of supporting dynamic shapes with Keras.
PiperOrigin-RevId: 211157531
These deps are unnecessary and were causing unexpected breakage.
Remove them.
Fixes #20828
PiperOrigin-RevId: 211156706
PiperOrigin-RevId: 211156403
PiperOrigin-RevId: 211154236
PiperOrigin-RevId: 211153502
If you want all-reduce, supply the `value` to the `destinations` argument.
PiperOrigin-RevId: 211148002