--copts are passed to both C++ and C (so they are redundant with --cxxopts).
Configs passed to "bazel build" are inherited by "bazel run" and "bazel test".
Also removed some unused configs.
PiperOrigin-RevId: 175326697
absl::string_view equivalents.
This will allow for a more convenient transition to absl::string_view.
Calls to StringPiece::set and StringPiece::clear were replaced with the StringPiece constructor as follows:
string_piece_foo.set(data, size) => string_piece_foo = StringPiece(data, size)
string_piece_foo.clear() => string_piece_foo = StringPiece()
PiperOrigin-RevId: 175326576
PiperOrigin-RevId: 175324895
This was occasionally causing flaky failures due to the module being compiled
before the proto_text headers had been generated.
PiperOrigin-RevId: 175322168
I know, I know, PTX isn't really IR. (Or is it?? It's not machine
code...)
In any case, if you pass this flag and get PTX, you're unlikely to be
disappointed.
PiperOrigin-RevId: 175320353
PiperOrigin-RevId: 175319401
Also fixes previously-introduced memory management bugs in graph_def_versions and op_def.
PiperOrigin-RevId: 175314829
are not created from VirtualScheduler.
PiperOrigin-RevId: 175314193
PiperOrigin-RevId: 175311543
PiperOrigin-RevId: 175307853
PiperOrigin-RevId: 175307445
Currently, information about separate minibatches registered by `LossFunction`s
is private, and only the concatenation of all minibatch inputs is exposed
through the `inputs` property. This change adds `input_minibatches` and
`num_registered_minibatches` to `LossFunction` to expose this information.
PiperOrigin-RevId: 175306297
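
A minimal plain-Python sketch of the accessors described above (hypothetical class and method names modeled on the description, not the actual tf.contrib.kfac API):

```python
class LossFunctionSketch:
    """Hypothetical stand-in for a LossFunction that tracks minibatches."""

    def __init__(self):
        self._minibatches = []  # one entry per registered minibatch

    def register_minibatch(self, inputs):
        self._minibatches.append(list(inputs))

    @property
    def inputs(self):
        # Previously the only accessor: the concatenation of all minibatches.
        return [x for mb in self._minibatches for x in mb]

    @property
    def input_minibatches(self):
        # New accessor: the separate minibatches, no longer private.
        return [list(mb) for mb in self._minibatches]

    @property
    def num_registered_minibatches(self):
        # New accessor: how many minibatches have been registered.
        return len(self._minibatches)
```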
when two Conv2D objects depend on the same Const.
PiperOrigin-RevId: 175305425
PiperOrigin-RevId: 175304705
PiperOrigin-RevId: 175304150
I attempted to exercise compute_sum_on_device(IndexedSlices) via the DNNClassifier test, per a reviewer's suggestion. This changelist is the way to do it correctly. I verified that it indeed triggers the required codepath by adding logging and by removing compute_sum_on_device(IndexedSlices) support.
PiperOrigin-RevId: 175303333
PiperOrigin-RevId: 175302425
PiperOrigin-RevId: 175297329
fusion instructions.
PiperOrigin-RevId: 175295981
PiperOrigin-RevId: 175277161
PiperOrigin-RevId: 175275184
the release process.
See #13872
PiperOrigin-RevId: 175261983
The tiling dimension corresponding to the number of vector registers in the tile
can be changed easily. Expose this value as a backend specific flag so that we
can experiment with it to find a good default value.
This CL also fixes a bug exposed by a variable tiling factor in the row-major
GEMV implementation. This wasn't caught before because having tile_rows ==
tile_cols hid the bug.
PiperOrigin-RevId: 175258553
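
A toy row-major tiled GEMV in plain Python (a sketch standing in for the LLVM IR emitter, not the actual XLA code): tile_rows and tile_cols are independent knobs, so a bug that only appears when they differ is actually exercised.

```python
def tiled_gemv(A, x, tile_rows, tile_cols):
    """Compute y = A @ x by iterating over tile_rows x tile_cols tiles."""
    m, n = len(A), len(x)
    y = [0.0] * m
    for r0 in range(0, m, tile_rows):
        for c0 in range(0, n, tile_cols):
            # Accumulate the partial products contributed by this tile.
            for r in range(r0, min(r0 + tile_rows, m)):
                for c in range(c0, min(c0 + tile_cols, n)):
                    y[r] += A[r][c] * x[c]
    return y
```

Any (tile_rows, tile_cols) pair must produce the same result; comparing a square tiling against a rectangular one is the kind of check that would have caught the bug earlier.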
on it unless we lock each call to GetNext which is not preferable.
Each iterator now handles saving/restoring exhausted state.
As a guideline, we always reset the input_impl(s) once they are exhausted; this serves as an indicator of exhaustion for non-terminal iterators and also reduces memory overhead.
Each iterator should also handle calls to GetNextInternal when it is exhausted. Fixed this for some datasets.
Also fixes a bug in dataset_serialization_test_base: we were not saving
a checkpoint after exhausting the iterator, so verify_exhausted_iterator
was not really testing restoring an exhausted iterator.
PiperOrigin-RevId: 175253023
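
An illustrative sketch of the guideline above in plain Python (hypothetical names; the real iterators are C++ tf.data internals): the input is reset to None on exhaustion, None doubles as the saved "exhausted" indicator, and repeated GetNext calls after exhaustion are handled.

```python
class MapIteratorSketch:
    """Non-terminal iterator that maps fn over an input iterator."""

    def __init__(self, input_iter, fn):
        self._input_impl = input_iter  # becomes None once exhausted
        self._fn = fn

    def get_next(self):
        if self._input_impl is None:
            # Already exhausted: repeated calls must keep signaling the end.
            raise StopIteration
        try:
            return self._fn(next(self._input_impl))
        except StopIteration:
            # Reset the input on exhaustion: frees memory and marks state.
            self._input_impl = None
            raise

    def save(self):
        # Exhaustion is recoverable from whether input_impl was reset.
        return {"input_exhausted": self._input_impl is None}
```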
This is necessary for providing bfloat support in the GPU backend.
RELNOTES: bfloat support is now added to XLA infra.
PiperOrigin-RevId: 175252067
Nodes inside of subcomputations (e.g. fusion computations) are always
printed by the HLO graph dumper. Before this change, the dumper was not
fully aware of this fact, leading it to mark as "deemphasized" (i.e.
draw as gray with a dashed outline) nodes that had no business being
deemphasized.
PiperOrigin-RevId: 175247474
Also, give PaddingConfig its own ToString format.
PiperOrigin-RevId: 175239832
Previously we LOG(INFO)'ed the driver version, which meant it wouldn't
be printed unless you passed --logtostderr. But this information is
pretty important, especially since cudnnCreate failing is likely to be a
fatal error.
PiperOrigin-RevId: 175235628
PiperOrigin-RevId: 175232587
"Reduce metric variables" operation is a single operation across all metric variables, which means it is across all eval metrics. Previously, an update op for every eval metric was conditioned on a copy of overall "reduce metric variables" op. The latter was meant to be idempotent and thus the end result was supposed to be correct.
However, "reduce metric variables" op consists of a number of variable assignments and thus is not atomic. If execution of two "reduce metric variables" ops interleaves, then the end result might come out to be incorrect. This caused flakiness in replicate_model_fn_test.py. To fix the problem, there is now a single copy of the "reduce metric variables" and every eval metric is associated with that single instance.
PiperOrigin-RevId: 175232016
Reduce, SelectAndScatter, Reverse, Slice, DynamicSlice, DynamicUpdateSlice, Transpose, BatchNormTraining, BatchNormInference, BatchNormGrad.
PiperOrigin-RevId: 175231463
PiperOrigin-RevId: 175230217
PiperOrigin-RevId: 175229944
PiperOrigin-RevId: 175228315
PiperOrigin-RevId: 175228264
PiperOrigin-RevId: 175225805
PiperOrigin-RevId: 175219920
Instead of assigning the pre and post optimization to a singleton xla::Compiler
object, prefer creating a short-lived CpuCompiler or a GpuCompiler instance on
the stack. Without this change, adding a second test case on the
(Cpu|Gpu)Compiler in the same process triggers a use-after-free.
(Btw, LLVMCompiler should really be spelled LlvmCompiler per Google C++ style;
I'll do that rename shortly.)
PiperOrigin-RevId: 175218617
PiperOrigin-RevId: 175217850
differs from streaming_auc because it uses every prediction as a threshold rather than linearly spaced fixed thresholds.
PiperOrigin-RevId: 175217002
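
A minimal sketch of the distinction described above (plain Python with hypothetical helper names, not the actual tf.metrics implementation): dynamic thresholds are the observed prediction values themselves, while the fixed variant uses a linear grid.

```python
def dynamic_thresholds(predictions):
    # Every distinct observed prediction value becomes a threshold.
    return sorted(set(predictions))

def linear_thresholds(num_thresholds):
    # Fixed, linearly spaced thresholds in [0, 1].
    return [i / (num_thresholds - 1) for i in range(num_thresholds)]

def roc_points(labels, predictions, thresholds):
    # One (false-positive rate, true-positive rate) point per threshold;
    # a prediction >= threshold counts as a positive call.
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in thresholds:
        tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p >= t)
        fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p >= t)
        points.append((fp / neg, tp / pos))
    return points
```

Using the predictions as thresholds places an ROC point exactly at every operating point the data supports, instead of snapping to the nearest grid threshold.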
and Recv into {Recv, RecvDone}. See operation_semantics.md for the updated
semantics.
PiperOrigin-RevId: 175216012
PiperOrigin-RevId: 175213336
the labels.
Clarify current backprop behavior.
Original bugfix by Alexandre Passos.
PiperOrigin-RevId: 175211803
PiperOrigin-RevId: 175210678
Previously, if you had a very large allocation, it would round up to the
next power of 2, and then, if this didn't fit in your GPU's available
memory, eat all remaining memory in the device.
Now we waste at most 128 MB of memory on a large allocation.
PiperOrigin-RevId: 175209995
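
A hypothetical sketch of the capped rounding described above (the real logic lives in TensorFlow's C++ GPU allocator; names and structure here are illustrative only):

```python
MAX_WASTE = 128 * 1024 * 1024  # cap rounding waste at 128 MB

def rounded_alloc_size(nbytes):
    """Round a request up, wasting at most MAX_WASTE bytes."""
    pow2 = 1
    while pow2 < nbytes:
        pow2 *= 2
    if pow2 - nbytes <= MAX_WASTE:
        # Small allocations: round up to the next power of 2 as before.
        return pow2
    # Large allocations: round up only to the next 128 MB boundary,
    # instead of potentially doubling the request.
    return ((nbytes + MAX_WASTE - 1) // MAX_WASTE) * MAX_WASTE
```

For example, a request just over 3 GB previously rounded to 4 GB (wasting nearly 1 GB, or failing outright); under the capped scheme it rounds to the next 128 MB multiple.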
PiperOrigin-RevId: 175207829
have already been applied.
Make sure rewrites are idempotent by running the optimizer twice in unit tests.
PiperOrigin-RevId: 175206742
PiperOrigin-RevId: 175205782
PiperOrigin-RevId: 175204075
types, not just Conv2D.
PiperOrigin-RevId: 175204002