| Commit message (Collapse) | Author | Age |
PiperOrigin-RevId: 207306967
|
ops only).
PiperOrigin-RevId: 207306198
|
Do not add placeholders to the function body as XLA cannot compile them.
PiperOrigin-RevId: 207299427
|
implementation
Also increase test coverage for C64 a bit.
PiperOrigin-RevId: 207297946
|
--xla_hlo_profile is enabled.
I often find myself searching for the last profiling run in a
ReplayComputation log. This makes it much easier to find.
PiperOrigin-RevId: 207294644
|
PiperOrigin-RevId: 207294037
|
I don't think adding whitebox tests is necessary for the code that's checked in
today, but I'm working on a CL for which I'd prefer writing whitebox tests.
Also fix a minor issue with SymbolPredicate::ToString() where we were dropping
the must_be_true() bit.
PiperOrigin-RevId: 207289695
|
PiperOrigin-RevId: 207289283
|
It's unfortunate that this was only added in 9.1, but I haven't found a good
way of emulating the behavior on 9.0 without falling back to non-batched gemms.
PiperOrigin-RevId: 207286575
|
This is mostly a huge amount of plumbing just to call into the cublas functions.
blasGemmStridedBatched has been available since CUDA 8.0.
For autotuning we'd need cublasGemmStridedBatchedEx, which is new in CUDA 9.2
so I didn't wire that up yet.
PiperOrigin-RevId: 207285707
|
PiperOrigin-RevId: 207284323
|
PiperOrigin-RevId: 207283527
|
PiperOrigin-RevId: 207282495
|
PiperOrigin-RevId: 207278109
|
PiperOrigin-RevId: 207268708
|
addition to loop fusions.
PiperOrigin-RevId: 207253181
|
This gives a huge speedup for users of batchdot. This is a minimal implementation without autotuning and without support for strided batch gemm.
PiperOrigin-RevId: 207247740
|
PiperOrigin-RevId: 207238096
|
PiperOrigin-RevId: 207215423
|
PiperOrigin-RevId: 207215039
|
PiperOrigin-RevId: 207213865
|
PiperOrigin-RevId: 207210333
|
* Add link to updating scope on a running VM
* Add code formatting and Python syntax highlighting
* Clarify kwargs argument formatting
* Fix method name in docstring
PiperOrigin-RevId: 207204628
|
This became unnecessary with cl/206243319 "Implement constant buffer allocation
for XLA:GPU".
PiperOrigin-RevId: 207204478
|
non-public TF APIs
PiperOrigin-RevId: 207197647
|
PiperOrigin-RevId: 207195679
|
to record latency on each edge of dataset input pipeline.
PiperOrigin-RevId: 207190025
|
For logging copies, we can set the device_policy to DEVICE_PLACEMENT_WARN
PiperOrigin-RevId: 207186848
|
signal group_size_tensor_ready_ immediately, without initialization.
PiperOrigin-RevId: 207184621
|
PiperOrigin-RevId: 207183550
|
PiperOrigin-RevId: 207183398
|
PiperOrigin-RevId: 207183038
|
PiperOrigin-RevId: 207179803
|
Pure refactor, in preparation for adding a higher level checkpoint management utility. This utility will also need to work with the Checkpoint proto, and globbing it on to saver.py seems dirty.
PiperOrigin-RevId: 207179646
|
PiperOrigin-RevId: 207176749
|
PiperOrigin-RevId: 207176261
|
PiperOrigin-RevId: 207176253
|
PiperOrigin-RevId: 207176147
|
grappler functions.
PiperOrigin-RevId: 207171072
|
PiperOrigin-RevId: 207170573
|
PiperOrigin-RevId: 207165297
|
binary predicates.
PiperOrigin-RevId: 207162491
|
PiperOrigin-RevId: 207152562
|
issue where the compiler isn't sure of the type when building for arm64 computers.
PiperOrigin-RevId: 207151595
|
This defines the semantics, and adds parser and shape inference support. Since support is not plumbed through the rest of the compiler here, multi-output reduce is still rejected by the HLO verifier, and is not exposed through XlaBuilder.
PiperOrigin-RevId: 207148035
|
PiperOrigin-RevId: 207147507
|
abb903df7a5998b33547c02e95f9fa47c00f31f4
PiperOrigin-RevId: 207145802
|
As far as I can tell, `executable` is never nullptr.
PiperOrigin-RevId: 207141878
|