path: root/tensorflow/compiler/xla/service/instruction_fusion.cc
Commit message (Author, Date)
* [XLA] Move FusionQueue class declaration into separate header (A. Unique TensorFlower, 2018-10-04)
  PiperOrigin-RevId: 215783391
* [XLA] Migrate from gtl::FlatMap to absl::flat_hash_map (Benjamin Kramer, 2018-10-01)
  PiperOrigin-RevId: 215272497
* [XLA] Use a result cache to speed up InstructionFusion::CanFuseOnAllPaths() (Yuanzhong Xu, 2018-09-27)
  PiperOrigin-RevId: 214848216
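  The cache in question memoizes the answer for each (producer, consumer) pair so that shared paths in the graph are not re-traversed. A minimal sketch of the idea; the key layout and the uncached helper are assumptions, not the upstream signature:

```cpp
#include <utility>

#include "absl/container/flat_hash_map.h"

// Hypothetical stand-ins for the real XLA types and the uncached recursion.
class HloInstruction;
bool CanFuseOnAllPathsUncached(HloInstruction* producer, HloInstruction* consumer);

using FusionCacheKey = std::pair<const HloInstruction*, const HloInstruction*>;

bool CanFuseOnAllPathsCached(
    HloInstruction* producer, HloInstruction* consumer,
    absl::flat_hash_map<FusionCacheKey, bool>* cache) {
  auto it = cache->find(FusionCacheKey{producer, consumer});
  if (it != cache->end()) return it->second;  // answered before; reuse it
  bool result = CanFuseOnAllPathsUncached(producer, consumer);
  cache->emplace(FusionCacheKey{producer, consumer}, result);
  return result;
}
```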
* [XLA] A queue interface to allow fusion in different orders. (Yuanzhong Xu, 2018-09-12)
  PiperOrigin-RevId: 212674212
* [XLA] Rename all (Mutable)ArraySlice to absl::Span. (Tim Shen, 2018-08-30)
  PiperOrigin-RevId: 210998142
* [XLA] Update InstructionFusion::EffectivelyAtMostUnary for kIota (David Majnemer, 2018-08-29)
  It should behave like kBroadcast with respect to being "effectively unary".
  PiperOrigin-RevId: 210795483
* [XLA] Add the xla interface for CollectivePermute. (A. Unique TensorFlower, 2018-08-28)
  PiperOrigin-RevId: 210576458
* [XLA] Unify spelling of 'fusible' (Benjamin Kramer, 2018-08-27)
  Of {fusable, fusile, fusible}, my dictionary only knows about fusible.
  PiperOrigin-RevId: 210373347
* Remove HostCompute HLO. (Tong Shen, 2018-08-21)
  Now, for host compute, we just emit SendToHost & RecvFromHost pairs and use tokens to
  ensure the dependencies.
  PiperOrigin-RevId: 209671416
* [XLA] Switch to absl versions of the c_foo functions. (Justin Lebar, 2018-08-20)
  PiperOrigin-RevId: 209502513
* Automated rollback of commit 4a41f50648929197954d892559587cb76458d306 (A. Unique TensorFlower, 2018-08-17)
  PiperOrigin-RevId: 209248552
* [XLA] Switch to absl versions of the c_foo functions. (Justin Lebar, 2018-08-17)
  PiperOrigin-RevId: 209247783
* [XLA] Add the xla interface for AllToAll. (A. Unique TensorFlower, 2018-08-08)
  PiperOrigin-RevId: 207971529
* [XLA] Add Scatter HLO. (A. Unique TensorFlower, 2018-08-01)
  PiperOrigin-RevId: 207045468
* Start implementation of Iota HLO. (Nick Desaulniers, 2018-07-20)
  PiperOrigin-RevId: 205447892
* [TF:XLA] Split select HLO into array- and tuple-select. (A. Unique TensorFlower, 2018-07-03)
  Array select and tuple select are already handled separately in all backends and HLO
  passes: array select is an elementwise operation whose two operands have shapes with the
  same dimensions, while tuple select does not define its own output, but instead forwards
  the true or false operand based on a scalar predicate operand.
  This CL reflects this by adding a new kTupleSelect HLO. The XLA builder interface stays
  the same and dispatches based on the operand shapes. There is no change in operation
  semantics; this CL just splits the existing select operation into two opcodes and
  preserves the existing semantics. HLO cost analysis is fixed to handle the two ops
  appropriately.
  PiperOrigin-RevId: 203180342
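  Since the builder dispatches on operand shapes, the choice between the two opcodes can be pictured roughly as below. This is a sketch in XLA style, not the upstream code; the helper name is an assumption:

```cpp
#include "tensorflow/compiler/xla/service/hlo_opcode.h"
#include "tensorflow/compiler/xla/shape_util.h"

namespace xla {

// Hypothetical helper: picks the opcode the builder would emit for Select,
// based on the shape of the branch operands.
HloOpcode SelectOpcodeForShape(const Shape& on_true_shape) {
  // Tuple select forwards one whole tuple operand based on a scalar
  // predicate; array select picks elementwise between two arrays.
  return ShapeUtil::IsTuple(on_true_shape) ? HloOpcode::kTupleSelect
                                           : HloOpcode::kSelect;
}

}  // namespace xla
```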
* Rename HLO opcode kGenerateToken to kAfterAll. (Mark Heffernan, 2018-06-25)
  Long term, I think we want to require kAfterAll to take at least one token as operand so
  it cannot generate a token out of thin air, so kGenerateToken is no longer an appropriate
  name. Instead, a primordial token would be supplied somehow in the entry computation,
  perhaps as a parameter, and then threaded to any side-effecting ops. NFC.
  PiperOrigin-RevId: 202079040
* Enable duplicating instructions with same input and output size in fusion (A. Unique TensorFlower, 2018-06-22)
  PiperOrigin-RevId: 201669139
* [TF:XLA] Add Xor HLO operation. (A. Unique TensorFlower, 2018-06-21)
  This avoids lowering xor in terms of other bitwise ops, and all backends have
  instructions for it anyway.
  PiperOrigin-RevId: 201597493
* [XLA] Switch PostOrder accessors to use std::vector instead of std::list. (Benjamin Kramer, 2018-06-15)
  std::list is just hilariously inefficient, and the postorder list creation has been
  rewritten to not depend on splicing anymore, so there's no need for the list. While
  there, remove the old unused postorder list creation code.
  PiperOrigin-RevId: 200743677
* Add kGenerateToken HLO instruction. (Mark Heffernan, 2018-06-08)
  The new HLO instruction serves two purposes. (1) It generates a new token value; this is
  the only way to create tokens. (2) The operation is variadic, taking zero or more token
  operands; the operation acts as a join of its operands.
  I initially considered using a kConstant as the method to create new tokens, but this
  ran into problems because of expectations in backends regarding constants and their
  materialization.
  This CL enables creation of generate-token instructions, but the new instruction is not
  yet supported in any backend.
  PiperOrigin-RevId: 199836205
* Introduced kDomain HLO instruction set isolation to bound connected sets of instructions with similar attributes (i.e., sharding). (A. Unique TensorFlower, 2018-05-29)
  This CL simply adds the infrastructure, but leaves the wire-on to a separate CL.
  PiperOrigin-RevId: 198503625
* [XLA:GPU] Basic multi-output fusion for GPU. (A. Unique TensorFlower, 2018-05-24)
  Take a conservative approach and attempt multi-output fusion in cases where "regular"
  fusion is not an option.
  PiperOrigin-RevId: 197852598
* Refactor HloInstruction::Fuse and add a method for multi-output fusion. (A. Unique TensorFlower, 2018-05-16)
  PiperOrigin-RevId: 196813042
* [XLA] Add log1p/expm1 (David Majnemer, 2018-05-09)
  A new HLO seems prudent as it allows implementations to use fancy techniques to compute
  accurate results for small inputs.
  PiperOrigin-RevId: 196078115
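  The accuracy concern is concrete: for |x| below machine epsilon, 1 + x rounds back to 1 before the log is taken, so log(1 + x) returns exactly 0 while log1p(x) returns roughly x. A small standalone C++ illustration (not XLA code):

```cpp
#include <cmath>
#include <cstdio>

int main() {
  double x = 1e-16;
  // 1.0 + 1e-16 rounds to 1.0 in double precision, so log() loses x entirely.
  std::printf("log(1+x) = %.17g\n", std::log(1.0 + x));  // prints 0
  // log1p avoids forming 1 + x and keeps the small input's precision.
  std::printf("log1p(x) = %.17g\n", std::log1p(x));      // prints ~1e-16
  return 0;
}
```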
* [XLA] Always be willing to duplicate widening kConvert instructions during fusion. (Justin Lebar, 2018-05-07)
  This has the effect of pushing widening kConvert HLOs into consumers. This is what we
  want, because it means that the producer writes the narrower type (e.g. f16) and the
  consumer reads it and internally upcasts to the wider type (e.g. f32). This lets the
  producer and consumer both run faster, because they have to touch less memory.
  PiperOrigin-RevId: 195546910
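  A "widening" convert here is one whose result element type is wider than its operand's. A sketch of such a test in XLA style; the helper name is an assumption, not the code this change landed:

```cpp
#include "tensorflow/compiler/xla/primitive_util.h"
#include "tensorflow/compiler/xla/service/hlo_instruction.h"

namespace xla {

// Hypothetical predicate: does this instruction convert to a wider type?
// Duplicating such converts into each consumer keeps the narrow type in
// memory and performs the upcast inside the consumer.
bool IsWideningConvert(const HloInstruction& hlo) {
  if (hlo.opcode() != HloOpcode::kConvert) return false;
  return primitive_util::BitWidth(hlo.operand(0)->shape().element_type()) <
         primitive_util::BitWidth(hlo.shape().element_type());
}

}  // namespace xla
```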
* Simplify, test and document logic in instruction fusion that decides whether we allow fusion when an operation needs to be duplicated. (A. Unique TensorFlower, 2018-04-26)
  PiperOrigin-RevId: 194429279
* Initial addition of CLZ HLO (Jacques Pienaar, 2018-04-18)
  * Adds the HLO op and lowering on CPU/GPU/evaluator;
  * This does not update the operation semantics;
  PiperOrigin-RevId: 193461989
* Automated g4 rollback of changelist 192180356 (Dimitris Vardoulakis, 2018-04-18)
  PiperOrigin-RevId: 193427566
* Add opcode for new instruction that broadcasts degenerate dimensions. (Dimitris Vardoulakis, 2018-04-09)
  Implicit broadcasts can be translated to the new instruction instead of a
  reshape-and-broadcast. Follow-up CLs will add support in UserComputation and the
  various backends.
  PiperOrigin-RevId: 192180356
* [XLA] Minor comment fixes in instruction_fusion.cc. (Justin Lebar, 2018-03-05)
  No functional change.
  PiperOrigin-RevId: 187852483
* [XLA] Add some plumbing, documentation, verification and shape inference for Gather (Sanjoy Das, 2018-02-16)
  Pretty much everything other than HLO verification and shape inference will fail for
  Gather with Unimplemented. Note that this CL is intentionally incomplete -- I figured it
  would be nicer to get some of the boiler-platey stuff out of the way early. Let me know
  if you want me to send in a larger but more complete CL instead.
  PiperOrigin-RevId: 186055521
* [TF:XLA] Adds HostCompute HLO - a pseudo-op to represent host-side computation. (A. Unique TensorFlower, 2018-02-16)
  PiperOrigin-RevId: 186047964
* Automated g4 rollback of changelist 180000981 (A. Unique TensorFlower, 2018-01-02)
  PiperOrigin-RevId: 180581912
* Automated g4 rollback of changelist 179983419 (A. Unique TensorFlower, 2017-12-23)
  PiperOrigin-RevId: 180000981
* Adds FFT for XLA: CPU via Eigen, GPU via cuFFT. (A. Unique TensorFlower, 2017-12-22)
  GPU support includes plan reuse, with a new scratch allocator per execution in fft_thunk.
  PiperOrigin-RevId: 179983419
* [XLA] Add BitcastConvert HLO op to enable bitwise operations on floating point types. (A. Unique TensorFlower, 2017-11-21)
  PiperOrigin-RevId: 176610007
* [XLA] Adding kConditional opcode that represents a conditional HLO instruction. (A. Unique TensorFlower, 2017-11-15)
  PiperOrigin-RevId: 175919301
* Change for asynchronous Send and Recv by splitting Send into {Send, SendDone} and Recv into {Recv, RecvDone}. (HyoukJoong Lee, 2017-11-10)
  See operation_semantics.md for the updated semantics.
  PiperOrigin-RevId: 175216012
* [XLA] Remove dead opcode kIndex. (Justin Lebar, 2017-10-30)
  PiperOrigin-RevId: 173987428
* [XLA:CPU] [XLA:GPU] Adds compiler support for C64 primitive type, including relevant elementwise unary and binary op lowering for CPU and GPU. (A. Unique TensorFlower, 2017-10-27)
  We use a named LLVM struct "complex64", laid out the same as std::complex<float>. This
  named struct is accessed via the llvm::Module, which required changes to accessors of
  PrimitiveTypeToIrType & friends.
  Ops that require atan2 (in particular, angle and log) are only supported on GPU at this
  point. LLVM lacks a CPU intrinsic for atan or atan2, whereas libdevice provides this
  for GPU.
  PiperOrigin-RevId: 173676849
* [XLA] Remove dead kUpdate opcode. (Justin Lebar, 2017-10-25)
  PiperOrigin-RevId: 173462881
* [XLA] Add ShiftLeft, ShiftRightArithmetic, and ShiftRightLogical operators. (Peter Hawkins, 2017-10-13)
  PiperOrigin-RevId: 172091595
* [TF:XLA] Rename HloOpcode::kLogicalX to kX (A. Unique TensorFlower, 2017-10-09)
  PiperOrigin-RevId: 171536686
* Add vlogging of HloModule before and after fusion. (Mark Heffernan, 2017-10-04)
  PiperOrigin-RevId: 171029054
* [XLA] Make HloModule::computations() return raw pointers. (Justin Lebar, 2017-09-29)
  Like HloComputation::instructions(), HloModule::computations() used to return a list of
  unique_ptrs. But this is an implementation detail that shouldn't be leaked into the
  public API.
  This patch also adds HloModule::MakeNonFusionComputations(), because many of the callers
  of computations() went on to filter out all the fusion computations. It would be
  possible to implement MakeNonFusionComputations() "in place" using a filtering iterator,
  but I don't think it's necessary -- we never have *that* many computations, and since
  many callers go on to copy the list of non-fusion computations, making it
  unconditionally a copy is simpler and avoids a footgun.
  PiperOrigin-RevId: 170529051
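  The copy-based design described above amounts to a simple filter. A minimal sketch, assuming the filter uses HloComputation::IsFusionComputation(); the real member function may differ in detail:

```cpp
#include <vector>

#include "tensorflow/compiler/xla/service/hlo_computation.h"
#include "tensorflow/compiler/xla/service/hlo_module.h"

namespace xla {

// Sketch of a copy-based filter over a module's computations: collect every
// computation that is not a fusion computation into a fresh vector.
std::vector<HloComputation*> MakeNonFusionComputationsSketch(HloModule* module) {
  std::vector<HloComputation*> result;
  for (HloComputation* computation : module->computations()) {
    if (!computation->IsFusionComputation()) {
      result.push_back(computation);  // unconditional copy; cheap for small counts
    }
  }
  return result;
}

}  // namespace xla
```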
* [XLA] Add support for QuantizeAndDequantizeV2. (Chris Leary, 2017-09-25)
  PiperOrigin-RevId: 169955636
* [XLA] Move ReusesOperandElements() fusion check into the CPU/GPU subclasses. (A. Unique TensorFlower, 2017-09-13)
  This heuristic assumes an implementation of fusion that requires recomputing the
  producer, which is specific to those backends, rather than inherent to fusion.
  PiperOrigin-RevId: 168592936
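  A backend-specific override of the kind this change enables could look roughly like the following. The subclass name and exact signature are illustrative assumptions; ReusesOperandElements() is the check named in the subject:

```cpp
#include "tensorflow/compiler/xla/service/hlo_instruction.h"
#include "tensorflow/compiler/xla/service/instruction_fusion.h"

namespace xla {

// Hypothetical backend subclass: decline to fuse when the consumer re-reads
// the producer's elements, since CPU/GPU fusion recomputes the producer once
// per read.
class RecomputingBackendFusion : public InstructionFusion {
 public:
  using InstructionFusion::InstructionFusion;

  bool ShouldFuse(HloInstruction* consumer, int64_t operand_index) override {
    if (consumer->ReusesOperandElements(operand_index)) {
      return false;  // fusing would multiply the producer's cost per reuse
    }
    return InstructionFusion::ShouldFuse(consumer, operand_index);
  }
};

}  // namespace xla
```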
* [XLA] Make some static functions in InstructionFusion members. (A. Unique TensorFlower, 2017-08-30)
  PiperOrigin-RevId: 167025880
* Minor bugfix to HloInstruction::MergeFusionInstruction, and allow InstructionFusion::Fuse to be overridden by derived classes. (A. Unique TensorFlower, 2017-08-27)
  PiperOrigin-RevId: 166631382