Commit messages
PiperOrigin-RevId: 195734246
PiperOrigin-RevId: 195731675
PiperOrigin-RevId: 195731183
eager-mode
PiperOrigin-RevId: 195730534
PiperOrigin-RevId: 195730139
tensorflow/compiler/tests:extract_image_patches_op_test_cpu_ondemand
A recent change has made this test flaky.
PiperOrigin-RevId: 195726647
PiperOrigin-RevId: 195723288
PiperOrigin-RevId: 195722449
PiperOrigin-RevId: 195721404
PiperOrigin-RevId: 195720133
numerically unstable for large minibatches.
PiperOrigin-RevId: 195719795
PiperOrigin-RevId: 195718061
PiperOrigin-RevId: 195717497
cross-tower context:
* only provide read-only access to variables via get()
* don't fail if the variable isn't copied to the current device in get()
* make _as_graph_element() return the aggregate value for tower-local
  variables (instead of the incorrect previous behavior of returning
  the primary)
PiperOrigin-RevId: 195711474
PiperOrigin-RevId: 195710562
This fixes a typo I introduced in cr/195514907.
PiperOrigin-RevId: 195706006
This test sometimes runs longer than 60s, and has been getting flaky timeouts
as a result. With a longer timeout, it succeeds reliably.
PiperOrigin-RevId: 195704998
PiperOrigin-RevId: 195704492
PiperOrigin-RevId: 195701399
in a tensor.
The current implementation of EmitArrayElementAddress incorrectly concludes
that a size-one dimension in a tensor indicates broadcasting is needed
and that the linear address can't be used to access the tensor. We fix this by
letting LinearValidOnShape decide whether the linear address can be used to
access the tensor. This enables the vectorization of loads/stores in unrolled
elementwise op kernels when other criteria are met.
Add a test case.
PiperOrigin-RevId: 195701194
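
As a standalone illustration of the point above (plain C++, not XLA code): in a dense row-major layout, a size-one dimension contributes nothing to element addressing, so a tensor of shape [4, 1, 3] can be read through exactly the same linear index as one of shape [4, 3] whenever no broadcast is actually involved.

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Row-major multi-dimensional index -> linear offset.
    size_t LinearIndex(const std::vector<size_t>& dims,
                       const std::vector<size_t>& idx) {
      size_t linear = 0;
      for (size_t d = 0; d < dims.size(); ++d) linear = linear * dims[d] + idx[d];
      return linear;
    }

    int main() {
      // A dense [4, 1, 3] tensor is laid out exactly like a dense [4, 3]
      // tensor: the size-one dimension never changes the element address.
      std::vector<size_t> with_one = {4, 1, 3};
      std::vector<size_t> without = {4, 3};
      for (size_t i = 0; i < 4; ++i)
        for (size_t k = 0; k < 3; ++k)
          assert(LinearIndex(with_one, {i, 0, k}) == LinearIndex(without, {i, k}));
      return 0;
    }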
PiperOrigin-RevId: 195700319
PiperOrigin-RevId: 195698980
PiperOrigin-RevId: 195697446
Fixes #17784
PiperOrigin-RevId: 195696915
TPUStrategy now passes the tests in minimize_loss_test. This required adding the capability to have `iterations x cores` inputs of any structure. I also resolved a large number of small issues and uncovered more things to resolve, which are documented as TODOs.
PiperOrigin-RevId: 195696833
PiperOrigin-RevId: 195693362
PiperOrigin-RevId: 195690333
PiperOrigin-RevId: 195690035
conversions.
That is, instances of sp.ToString() are replaced with std::string(sp).
This will allow tensorflow::StringPiece::ToString to be removed, which is necessary before StringPiece can be replaced with absl::string_view.
PiperOrigin-RevId: 195689392
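
A minimal sketch of the pattern this migration uses, with std::string_view standing in for tensorflow::StringPiece (the variable names here are illustrative):

    #include <iostream>
    #include <string>
    #include <string_view>  // stand-in here for tensorflow::StringPiece

    int main() {
      std::string_view sp = "some_tensor_name";

      // Before: std::string s = sp.ToString();
      // ToString() is specific to StringPiece; absl::string_view has no such member.

      // After: the explicit std::string constructor, which both types support,
      // so call sites keep compiling once the underlying type is swapped.
      std::string s(sp);

      std::cout << s << "\n";
      return 0;
    }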
PiperOrigin-RevId: 195685740
PiperOrigin-RevId: 195685340
https://www.tensorflow.org/programmers_guide/embedding
The link in the sentence "Follow this link to see a fun example of thumbnail images in the Embedding Projector." should go to https://www.tensorflow.org/images/embedding-mnist.mp4, but instead it goes to the TF index page.
PiperOrigin-RevId: 195684456
PiperOrigin-RevId: 195681946
PiperOrigin-RevId: 195654450
PiperOrigin-RevId: 195645734
The previous approach didn't work because a multiplication by a scalar value
is changed into an explicit broadcast.
Another issue fixed in this CL is retrieving the constant value from the
literal: this depends on the PrimitiveType, whereas before we always assumed
it to be double.
Also, when checking ImplementedAsGemm() we should not call it recursively, but
instead perform just the check related to kDot.
Finally, add an execution test and adjust the fusion logic test.
PiperOrigin-RevId: 195638795
PiperOrigin-RevId: 195632175
PiperOrigin-RevId: 195560525
We don't need a corresponding change in gemm_thunk.cc because for gemms,
we do our autotune at runtime, at which point we have some real data in
our input/output buffers.
PiperOrigin-RevId: 195548896
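
A standalone sketch of runtime autotuning in the sense described above (generic C++, not the actual gemm_thunk.cc logic; PickFastest and the candidate kernels are hypothetical names): each candidate is timed against the real buffer contents, and the fastest is selected.

    #include <chrono>
    #include <cstddef>
    #include <functional>
    #include <vector>

    using Kernel = std::function<void(const std::vector<float>&)>;

    // Time each candidate kernel on the real input and return the index of the
    // fastest. Because this runs at execution time, the buffer holds real data
    // rather than whatever an ahead-of-time autotuner would have filled in.
    std::size_t PickFastest(const std::vector<Kernel>& candidates,
                            const std::vector<float>& real_input) {
      std::size_t best = 0;
      auto best_time = std::chrono::steady_clock::duration::max();
      for (std::size_t i = 0; i < candidates.size(); ++i) {
        auto start = std::chrono::steady_clock::now();
        candidates[i](real_input);
        auto elapsed = std::chrono::steady_clock::now() - start;
        if (elapsed < best_time) { best_time = elapsed; best = i; }
      }
      return best;
    }

    int main() {
      std::vector<float> input(1 << 16, 2.0f);  // "real" runtime data
      float sink = 0.0f;
      std::vector<Kernel> candidates = {
          [&](const std::vector<float>& v) { for (float x : v) sink += x; },
          [&](const std::vector<float>& v) { for (float x : v) sink += x * 1.0f; },
      };
      return static_cast<int>(PickFastest(candidates, input));
    }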
Useful when you want to compile a computation but not run it. Returns a serialized CompilationResult string with the error message.
PiperOrigin-RevId: 195547847
PiperOrigin-RevId: 195547670
fusion.
This has the effect of pushing widening kConvert HLOs into consumers.
This is what we want, because it means that the producer writes the
narrower type (e.g. f16) and the consumer reads it and internally
upcasts to the wider type (e.g. f32). This lets the producer and
consumer both run faster, because they have to touch less memory.
PiperOrigin-RevId: 195546910
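
A standalone sketch of the memory-traffic argument, using float as the narrow type and double as the wide type in place of f16/f32, and plain C++ functions in place of fused GPU kernels:

    #include <vector>

    // Unfused: the producer widens before writing, so the intermediate buffer
    // holds the wide type and 8 bytes/element cross memory between the two ops.
    std::vector<double> ProducerWidens(const std::vector<float>& in) {
      return std::vector<double>(in.begin(), in.end());
    }

    // Fused (the behavior after this change): the producer writes the narrow
    // type, and the consumer upcasts in registers as it reads, so only
    // 4 bytes/element cross memory between the two ops.
    double ConsumerUpcastsAndSums(const std::vector<float>& narrow) {
      double sum = 0.0;
      for (float v : narrow) sum += static_cast<double>(v);  // widen on read
      return sum;
    }

    int main() {
      std::vector<float> produced(1 << 20, 1.0f);  // narrow intermediate buffer
      double total = ConsumerUpcastsAndSums(produced);
      return total == (1 << 20) ? 0 : 1;
    }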
Swift via direct session.
The changes are:
1. Added an experimental TF C API for the Swift host to enqueue a tensor for
sending to TF. Again, the C APIs can be removed once the FIFO-queue-based
design proves stable.
2. TFLowerGraph is extended to generate FIFO-related nodes for TF to receive
tensors. This is similar to the extension for TF to send tensors.
3. TFPartition is extended to support host send (createHostSend()), which does
the tensor send via a new protocol method TensorSendableReceivable.sendToDevice().
The main complexity is in sending a scalar, where a new protocol method
TensorSendableReceivable.createScalarTensor() is called to first create a tensor
from the scalar and then send it over to TF.
Also removed code for protocol conformance on AccelerableByTensorFlow; instead,
the compiler looks up that conformance from the SILFunction when sending/receiving
tensors.
AccelerableByTensorFlow could be removed from the compiler-known protocol list
now, but we'll defer that until things have stabilized more (in the past this
protocol has been added to and removed from the list at different times).
PiperOrigin-RevId: 195539436
I didn't remove the enum itself, but after this change removing the enum should
be a simple NFC change (famous last words!).
This will make it easier to implement BatchDot on CPU.
The change removes usages of kTransposeDot by:
- Teaching TransposeFolding to "fuse" transposes into dots by flipping the
lhs_contracting_dims/rhs_contracting_dims fields.
- Replacing the notion of transpose_lhs/transpose_rhs in the IR emitters with
"has a non-canonical LHS contraction dimension"/"has a non-canonical RHS
contraction dimension" where the canonical LHS and RHS contraction dims [0]
are 1 and 0.
Some tests were getting away with creating Dot instructions with their
dimension numbers unset. I've fixed these to create canonical dot operations
instead.
It is possible (but hard to tell without trying) that some of the IR emission
logic and Eigen runtime calls can now be simplified further. For instance,
instead of passing in a `transpose_lhs` and `transpose_rhs` to the Eigen GEMM
routines, we could instead pass in the LHS and RHS contraction dimensions
directly.
[0] See HloInstruction::CreateCanonicalDot.
PiperOrigin-RevId: 195514907
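
A standalone sketch (plain C++, not XLA) of the folding idea: rather than materializing transpose(lhs) and contracting over the canonical dimension, keep lhs in place and record that its contraction dimension is non-canonical. The Matrix and Dot names here are illustrative.

    #include <cassert>
    #include <cstddef>
    #include <vector>

    // A 2-D matrix in row-major storage.
    struct Matrix {
      size_t rows, cols;
      std::vector<double> data;
      double at(size_t r, size_t c) const { return data[r * cols + c]; }
    };

    // Contract lhs dimension `lhs_contract` (0 = rows, 1 = cols) against rhs
    // dimension 0. With lhs_contract == 1 this is a plain matmul; with
    // lhs_contract == 0 it computes transpose(lhs) * rhs without ever
    // materializing the transpose -- the "folded" form described above.
    Matrix Dot(const Matrix& lhs, const Matrix& rhs, int lhs_contract) {
      size_t m = lhs_contract == 1 ? lhs.rows : lhs.cols;
      size_t k = lhs_contract == 1 ? lhs.cols : lhs.rows;
      assert(k == rhs.rows);
      Matrix out{m, rhs.cols, std::vector<double>(m * rhs.cols, 0.0)};
      for (size_t i = 0; i < m; ++i)
        for (size_t j = 0; j < rhs.cols; ++j)
          for (size_t kk = 0; kk < k; ++kk) {
            double l = lhs_contract == 1 ? lhs.at(i, kk) : lhs.at(kk, i);
            out.data[i * out.cols + j] += l * rhs.at(kk, j);
          }
      return out;
    }

    int main() {
      Matrix a{2, 2, {1, 2, 3, 4}};
      Matrix b{2, 2, {5, 6, 7, 8}};
      // transpose(a) * b, expressed purely through the contraction dimension.
      Matrix c = Dot(a, b, /*lhs_contract=*/0);
      // transpose(a) = [[1,3],[2,4]]; transpose(a)*b = [[26,30],[38,44]].
      assert(c.at(0, 0) == 26 && c.at(1, 1) == 44);
      return 0;
    }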
PiperOrigin-RevId: 195506194
PiperOrigin-RevId: 195503895
PiperOrigin-RevId: 195503894
Exposes it as tf.contrib.checkpoint.NoDependency. Objects wrapped in a
NoDependency object get unwrapped in __setattr__ and not tracked.
Removes the _save_counter dependency from tf.train.Checkpoint (the save counter
is still tracked as "save_counter" and always has been, so this is a
backwards-compatible dependency removal).
PiperOrigin-RevId: 195502562
PiperOrigin-RevId: 195501990
PiperOrigin-RevId: 195501342