| Commit message (Collapse) | Author | Age |
|
|
|
| |
PiperOrigin-RevId: 215783391
|
|
|
|
| |
PiperOrigin-RevId: 215272497
|
|
|
|
| |
PiperOrigin-RevId: 214848216
|
|
|
|
| |
PiperOrigin-RevId: 212674212
|
|
|
|
| |
PiperOrigin-RevId: 210998142
|
|
|
|
|
|
| |
It should behave like kBroadcast with respect to it being "effectively unary."
PiperOrigin-RevId: 210795483
|
|
|
|
| |
PiperOrigin-RevId: 210576458
|
|
|
|
|
|
| |
Of {fusable, fusile, fusible} my dictionary only knows about fusible.
PiperOrigin-RevId: 210373347
|
|
|
|
|
|
| |
Now for host compute, we just emit SendToHost & RecvFromHost pairs, and use a token to enforce the dependency between them.
PiperOrigin-RevId: 209671416
|
|
|
|
| |
PiperOrigin-RevId: 209502513
|
|
|
|
| |
PiperOrigin-RevId: 209248552
|
|
|
|
| |
PiperOrigin-RevId: 209247783
|
|
|
|
| |
PiperOrigin-RevId: 207971529
|
|
|
|
| |
PiperOrigin-RevId: 207045468
|
|
|
|
| |
PiperOrigin-RevId: 205447892
|
|
|
|
|
|
|
|
|
|
|
|
| |
Array select and tuple select are already handled separately in all backends and HLO passes: Array select is an elementwise operation. The shapes of the two operands have the same dimensions. Tuple select does not define its own output, but instead forwards the true or false operand based on a scalar predicate operand.
This CL reflects this by adding a new kTupleSelect HLO. The XLA builder interface stays the same and dispatches based on the operand shapes.
No change in the operation semantics. This CL just splits the existing select operation into two opcodes and preserves the existing semantics.
HLO cost analysis is fixed to handle the two ops appropriately.
PiperOrigin-RevId: 203180342
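The split described above can be illustrated with a small Python model. This is only a sketch of the semantics as stated in the message, not XLA code: the function names are hypothetical, arrays are modeled as flat lists, tuples as Python tuples, and the array predicate is assumed to be per-element.

```python
def array_select(pred, on_true, on_false):
    """Elementwise select over operands with the same dimensions.

    Assumption: arrays are flat lists and pred has one entry per element.
    """
    if not (len(pred) == len(on_true) == len(on_false)):
        raise ValueError("operands must have the same dimensions")
    return [t if p else f for p, t, f in zip(pred, on_true, on_false)]


def tuple_select(pred, on_true, on_false):
    """Forward one whole operand based on a scalar predicate.

    No new output is defined: the result is the chosen operand itself.
    """
    return on_true if pred else on_false


def select(pred, on_true, on_false):
    """Single builder-style entry point that dispatches on operand shape."""
    if isinstance(on_true, tuple):
        return tuple_select(pred, on_true, on_false)
    return array_select(pred, on_true, on_false)
```

The caller keeps using one `select` entry point, while the two cases are separate operations underneath, which mirrors the unchanged builder interface with two opcodes.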
|
|
|
|
|
|
|
|
| |
Long term I think we want to require kAfterAll to take at least one token as operand so it cannot generate a token out of thin air, so kGenerateToken is no longer an appropriate name. Instead, a primordial token would be supplied somehow in the entry computation, perhaps as a parameter, and then threaded to any side-effecting ops.
NFC.
PiperOrigin-RevId: 202079040
|
|
|
|
| |
PiperOrigin-RevId: 201669139
|
|
|
|
|
|
| |
This avoids lowering xor in terms of other bitwise ops and all backends have instructions for it anyway.
PiperOrigin-RevId: 201597493
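For reference, one way xor can be expressed through other bitwise ops is `(a | b) & ~(a & b)`. The sketch below checks that identity on Python integers. The helper name is hypothetical, and this particular decomposition is an assumption about the kind of lowering a dedicated op avoids, not a claim about what any backend did.

```python
def xor_via_other_ops(a, b):
    """Compute a ^ b using only or, and, and not."""
    return (a | b) & ~(a & b)
```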
|
|
|
|
|
|
|
| |
std::list is just hilariously inefficient and the postorder list creation has
been rewritten to not depend on splicing anymore, so there's no need for the
list. While there, remove the old unused postorder list creation code.
PiperOrigin-RevId: 200743677
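A postorder can be built by appending to a flat sequence, with no splicing, using an explicit DFS stack. The following is a minimal Python sketch of that general technique, assuming an acyclic graph given as a dict from node to operand list. The function and parameter names are hypothetical and this is not the code from the CL.

```python
def postorder(roots, operands):
    """Return nodes so that every operand appears before its users.

    Assumption: `operands` maps each node to its operands and is acyclic.
    """
    order = []       # flat, append-only result (no list splicing needed)
    visited = set()
    for root in roots:
        stack = [(root, False)]
        while stack:
            node, expanded = stack.pop()
            if expanded:
                # All operands of this node have already been emitted.
                order.append(node)
                continue
            if node in visited:
                continue
            visited.add(node)
            # Revisit the node after its operands have been handled.
            stack.append((node, True))
            for op in operands.get(node, ()):
                if op not in visited:
                    stack.append((op, False))
    return order
```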
|
|
|
|
|
|
|
|
|
|
| |
The new HLO instruction serves two purposes. (1) It generates a new token value. This is the only way to create tokens. (2) The operation is variadic, taking zero or more token operands. The operation acts as a join of its operands.
I considered initially using a kConstant constant as a method to create new tokens, but this ran into problems because of expectations in backends regarding constants and their materialization.
This CL enables creation of generate-token instructions, but the new instruction is not supported yet in any backend.
PiperOrigin-RevId: 199836205
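The two roles of the instruction (create a token, join zero or more tokens) can be modeled in a few lines of Python. This is a toy sketch: the `Token` class and `generate_token` function are hypothetical names, and nothing here reflects how a backend would implement tokens.

```python
class Token:
    """Opaque ordering value that remembers the tokens it joins."""

    def __init__(self, predecessors=()):
        self.predecessors = tuple(predecessors)


def generate_token(*operands):
    """Create a new token. With operands, the result acts as their join."""
    return Token(operands)


def all_predecessors(token):
    """Collect every token that the given token transitively depends on."""
    seen = []
    stack = list(token.predecessors)
    while stack:
        t = stack.pop()
        if t not in seen:
            seen.append(t)
            stack.extend(t.predecessors)
    return seen
```

With zero operands the call produces a fresh token. With several, anything ordered after the result is ordered after all of the operands.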
|
|
|
|
|
|
|
|
| |
instructions with similar attributes (i.e., sharding).
This CL simply adds the infrastructure, but leaves the wire-on to a separate CL.
PiperOrigin-RevId: 198503625
|
|
|
|
|
|
| |
Take a conservative approach and attempt multi-output fusion in cases where "regular" fusion is not an option.
PiperOrigin-RevId: 197852598
|
|
|
|
| |
PiperOrigin-RevId: 196813042
|
|
|
|
|
|
|
| |
A new HLO seems prudent as it allows implementations to use fancy techniques to
compute accurate results for small inputs.
PiperOrigin-RevId: 196078115
|
|
|
|
|
|
|
|
|
|
|
|
| |
fusion.
This has the effect of pushing widening kConvert HLOs into consumers.
This is what we want, because it means that the producer writes the
narrower type (e.g. f16) and the consumer reads it and internally
upcasts to the wider type (e.g. f32). This lets the producer and
consumer both run faster, because they have to touch less memory.
PiperOrigin-RevId: 195546910
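The memory argument above is simple arithmetic, sketched here in Python. The element sizes are the standard 2 bytes for f16 and 4 bytes for f32. The function name and the one-write, one-read traffic model are simplifying assumptions for illustration.

```python
BYTES_PER_ELEMENT = {"f16": 2, "f32": 4}


def bytes_touched(num_elements, stored_type):
    """Bytes written by the producer plus bytes read by the consumer."""
    size = BYTES_PER_ELEMENT[stored_type]
    written = num_elements * size  # producer writes the intermediate
    read = num_elements * size     # consumer reads it back
    return written + read
```

If the widening convert stays in the producer, the intermediate is stored as f32. If it is fused into the consumer, the intermediate is stored as f16 and the traffic under this model is halved.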
|
|
|
|
|
|
| |
allow fusion when an operation needs to be duplicated.
PiperOrigin-RevId: 194429279
|
|
|
|
|
|
|
| |
* Adds the HLO op and lowering on CPU/GPU/evaluator;
* This does not update the operation semantics;
PiperOrigin-RevId: 193461989
|
|
|
|
| |
PiperOrigin-RevId: 193427566
|
|
|
|
|
|
|
| |
Implicit broadcasts can be translated to the new instruction instead of a reshape-and-broadcast.
Follow-up CLs will add support in UserComputation and the various backends.
PiperOrigin-RevId: 192180356
|
|
|
|
|
|
| |
No functional change.
PiperOrigin-RevId: 187852483
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gather
Pretty much everything other than HLO verification and shape inference will fail
for Gather with Unimplemented.
Note that this CL is intentionally incomplete -- I figured it would be nicer to
get some of the boiler-platey stuff out of the way early. Let me know if you
want me to send in a larger but more complete CL instead.
PiperOrigin-RevId: 186055521
|
|
|
|
| |
PiperOrigin-RevId: 186047964
|
|
|
|
| |
PiperOrigin-RevId: 180581912
|
|
|
|
| |
PiperOrigin-RevId: 180000981
|
|
|
|
|
|
| |
GPU support includes plan reuse with a new scratch allocator per execution in fft_thunk.
PiperOrigin-RevId: 179983419
|
|
|
|
|
|
| |
floating point types.
PiperOrigin-RevId: 176610007
|
|
|
|
| |
PiperOrigin-RevId: 175919301
|
|
|
|
|
|
|
| |
and Recv into {Recv, RecvDone}. See operation_semantics.md for the updated
semantics.
PiperOrigin-RevId: 175216012
|
|
|
|
| |
PiperOrigin-RevId: 173987428
|
|
|
|
|
|
|
|
|
|
| |
relevant elementwise unary and binary op lowering for CPU and GPU.
We use a named LLVM struct "complex64", laid out the same as std::complex<float>. This named struct is accessed via the llvm::Module, which required changes to accessors of PrimitiveTypeToIrType & friends.
Ops that require atan2 (in particular, angle and log) are only supported on GPU at this point. LLVM lacks a CPU intrinsic for atan or atan2, whereas libdevice provides this for GPU.
PiperOrigin-RevId: 173676849
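As an illustration of the layout and of why angle and log need atan2, here is a small Python sketch. std::complex<float> stores the real part followed by the imaginary part as two 32-bit floats. The helper names are hypothetical and the little-endian byte order in `pack_complex64` is an assumption made for the example.

```python
import cmath
import math
import struct


def pack_complex64(z):
    """Pack as two 32-bit floats, real then imaginary (little-endian assumed)."""
    return struct.pack("<ff", z.real, z.imag)


def angle(z):
    """Argument of z, which is where atan2 is required."""
    return math.atan2(z.imag, z.real)


def complex_log(z):
    """log(z) = log|z| + i * angle(z), so log depends on atan2 as well."""
    return complex(math.log(math.hypot(z.real, z.imag)), angle(z))
```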
|
|
|
|
| |
PiperOrigin-RevId: 173462881
|
|
|
|
| |
PiperOrigin-RevId: 172091595
|
|
|
|
| |
PiperOrigin-RevId: 171536686
|
|
|
|
| |
PiperOrigin-RevId: 171029054
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like HloComputation::instructions(), HloModule::computations() used to
return a list of unique_ptrs. But this is an implementation detail that
shouldn't be leaked into the public API.
This patch also adds HloModule::MakeNonFusionComputations(), because
many of the callers of computations() went on to filter out all the
fusion computations.
It would be possible to implement MakeNonFusionComputations() "in place"
using a filtering iterator, but I don't think it's necessary -- we never
have *that* many computations, and since many callers go on to copy the
list of non-fusion computations, making it unconditionally a copy is
simpler and avoids a footgun.
PiperOrigin-RevId: 170529051
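The design choice of returning an unconditional copy, rather than a filtering iterator, can be sketched in Python as follows. The classes and method names are hypothetical stand-ins modeled on the names in the message, not the real HloModule API.

```python
class Computation:
    def __init__(self, name, is_fusion=False):
        self.name = name
        self.is_fusion = is_fusion


class Module:
    def __init__(self, computations):
        # How computations are stored is an implementation detail.
        self._computations = list(computations)

    def computations(self):
        """Iterate over computations without exposing the storage."""
        return iter(self._computations)

    def make_non_fusion_computations(self):
        """Return a fresh list, so callers may modify it freely."""
        return [c for c in self._computations if not c.is_fusion]
```

Because the result is a copy, a caller can reorder or trim it without affecting the module, which avoids the footgun of holding a lazy view over storage that may change.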
|
|
|
|
| |
PiperOrigin-RevId: 169955636
|
|
|
|
|
|
| |
This heuristic assumes an implementation of fusion that requires recomputing the producer, which is specific to those backends, rather than inherent to fusion.
PiperOrigin-RevId: 168592936
|
|
|
|
| |
PiperOrigin-RevId: 167025880
|
|
|
|
|
|
| |
InstructionFusion::Fuse to be overridden by derived classes.
PiperOrigin-RevId: 166631382
|