path: root/tensorflow/compiler/xla
Commit message | Author | Age
* Support kDomain instructions in the HloMatcher frameworkGravatar A. Unique TensorFlower2018-10-10
| | | | PiperOrigin-RevId: 216525613
* Support removing side effecting instructions with Gravatar A. Unique TensorFlower2018-10-10
| | | | | | | | | | RemoveInstructionAndUnusedOperands If the caller explicitly asks to remove a side effecting instruction (e.g. all-reduce), then we should respect it instead of silently ignoring the request. PiperOrigin-RevId: 216505133
* Change user_set to an absl::flat_hash_set in HloInstruction.Gravatar A. Unique TensorFlower2018-10-10
| | | | | absl::flat_hash_set has better performance than std::unordered_set, which can improve overall compile time. PiperOrigin-RevId: 216498767
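The change above is essentially a container swap. A minimal stand-alone sketch of the pattern follows; the `HloInstruction` stand-in and `AddUser` helper are hypothetical, and `std::unordered_set` is used here only so the snippet compiles without Abseil (the real change uses `absl::flat_hash_set`, which is an API-compatible drop-in for this usage):

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for the real XLA class; only its address matters here.
struct HloInstruction {};

// Before: std::unordered_set<HloInstruction*> user_set_;
// After:  absl::flat_hash_set<HloInstruction*> user_set_;
// flat_hash_set uses open addressing with inline storage, avoiding the
// per-node allocation and pointer chasing of std::unordered_set. The insert
// and lookup call patterns are unchanged, so a type swap is usually enough.
template <typename T>
using UserSet = std::unordered_set<T*>;

// Returns true if `user` was newly added (same semantics with either set).
inline bool AddUser(UserSet<HloInstruction>& users, HloInstruction* user) {
  return users.insert(user).second;
}
```

Because the call sites are identical, such a swap is typically a pure build-and-benchmark change with no behavioral difference.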
* [XLA] Add documentation and HLO-level support for multi-value sort.Gravatar Michael Kuperstein2018-10-09
| | | | | | No support in any of the backends, and not yet exposed through XlaBuilder. PiperOrigin-RevId: 216465753
* [XLA:GPU] Use CudnnConvKind in more places.Gravatar Justin Lebar2018-10-09
| | | | | | No functional change. PiperOrigin-RevId: 216451881
* [XLA] Cleanup: Make AllocationTracker::Resolve const.Gravatar A. Unique TensorFlower2018-10-09
| | | | | | | So that when resolving some global data, we don't have to worry whether "Resolve" is going to mutate the real data. PiperOrigin-RevId: 216448145
* [XLA:GPU] Elide the SequentialThunk when emitting scatter with no copyGravatar Benjamin Kramer2018-10-09
| | | | | | | | We have a 1-element thunk sequence if we're not copying. That's still two thunks and hlo profiling gets confused if it sees two thunks for the same instruction and one of them claims to be the whole instruction. PiperOrigin-RevId: 216448063
* [XLA] Added xla::CreateModuleFromProto(...) combining loading moduleGravatar A. Unique TensorFlower2018-10-09
| | | | | | from proto and verifying it with HloVerifier. PiperOrigin-RevId: 216447947
* [XLA] Allow scatter to share the operand buffer with the outputGravatar Benjamin Kramer2018-10-09
| | | | | | This avoids a copy. PiperOrigin-RevId: 216437329
* [XLA:GPU] Pattern match atomic "apply" into an atomic storeGravatar Benjamin Kramer2018-10-09
| | | | | | Otherwise we'd emit a CAS loop. PiperOrigin-RevId: 216421161
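The distinction the commit above draws can be sketched with `std::atomic`; this is an illustrative host-side analogue, not the XLA GPU emitter (function names are invented). When the applied computation ignores the old value, the generic compare-and-swap loop degenerates to a plain overwrite, so a single atomic store suffices:

```cpp
#include <atomic>
#include <cassert>

// Generic lowering: a CAS loop, needed when the new value depends on the old
// one. Wasteful when it does not.
inline void AtomicApplyViaCas(std::atomic<float>& slot, float value) {
  float old = slot.load();
  // compare_exchange_weak may fail spuriously; on failure `old` is refreshed
  // with the current value and we retry until the exchange lands.
  while (!slot.compare_exchange_weak(old, value)) {
  }
}

// Pattern-matched lowering: the "apply" is a plain overwrite, so one atomic
// store has the same effect with no retry loop.
inline void AtomicApplyViaStore(std::atomic<float>& slot, float value) {
  slot.store(value);
}
```

On GPUs the same observation replaces a CAS retry loop with a single atomic store instruction, which matters under contention.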
* [XLA:GPU] Add an implementation of scatter for GPUGravatar Benjamin Kramer2018-10-09
| | | | | | | | | | | | This simple implementation has a kernel that runs on every element of the updates tensor, figures out the right indices to perform the update, and applies it with an atomic operation. Currently we emit a CAS for plain (i.e. non-add) updates, which is inefficient. Also TuplePointsToAnalysis doesn't know that it should alias the operand and output buffers of a scatter, which would avoid a copy. PiperOrigin-RevId: 216412467
* Fixes typo in Sort description.Gravatar Tayo Oguntebi2018-10-09
| | | | PiperOrigin-RevId: 216375421
* Correctly pre-reserve visit state in HloInstruction::PostOrderDFSGravatar A. Unique TensorFlower2018-10-09
| | | | | | | | | | Previously we pre-reserved the visit state based on the number of instructions but then indexed it with the instruction unique ID, which can be larger than the instruction count. This resulted in some very expensive re-allocations, which can be eliminated by reserving a correctly sized buffer. PiperOrigin-RevId: 216369849
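The fix above comes down to sizing the buffer by the largest index it will see rather than by the element count. A small sketch with invented names (the real visit state lives in HloInstruction::PostOrderDFS):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

enum class VisitState : uint8_t { kNew, kVisiting, kVisited };

// Builds the visit-state buffer for a set of instruction unique IDs.
// Bug being fixed: reserving unique_ids.size() slots and then indexing with
// an ID that can exceed that count forces repeated re-allocations mid-DFS.
// Sizing to (max ID + 1) up front allocates exactly once.
inline std::vector<VisitState> MakeVisitState(
    const std::vector<int64_t>& unique_ids) {
  int64_t max_id = 0;
  for (int64_t id : unique_ids) max_id = std::max(max_id, id);
  return std::vector<VisitState>(max_id + 1, VisitState::kNew);
}
```

IDs are sparse when instructions have been deleted, so the buffer can be larger than the instruction count; trading that slack memory for a single allocation is the point of the change.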
* Automated rollback of commit 375c109659d2d0e6265447dffdeb460693b3cccfGravatar A. Unique TensorFlower2018-10-09
| | | | PiperOrigin-RevId: 216350134
* Enable support for PRED values in KeyValueSort for the HloEvaluator.Gravatar Adrian Kuegel2018-10-09
| | | | PiperOrigin-RevId: 216315110
* [XLA] Introduce input/output alias config.Gravatar Yunxing Dai2018-10-08
| | | | | | | | - This CL introduces input/output alias config in HLO module that allows any HLO pass to configure it. Once the alias_config is set, each backend needs to follow the contract during execution time to make sure the input and output are indeed aliased. - Copy insertion / buffer assignment and alias analysis have been updated to correctly honor the config and avoid any possible liveness interference. PiperOrigin-RevId: 216299501
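The shape of such a config can be sketched as below. This is a hypothetical simplification: XLA's real HloInputOutputAliasConfig addresses positions with ShapeIndex paths into tuple shapes, while plain integers stand in for them here:

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Sketch of an input/output alias config: a map from an output (tuple) index
// to the parameter number and index inside that parameter whose buffer the
// output must reuse. Passes write entries; backends read them and must honor
// the aliasing contract at execution time.
struct AliasConfig {
  // output index -> {parameter number, index within the parameter}
  std::map<int64_t, std::pair<int64_t, int64_t>> aliases;

  void SetUpAlias(int64_t output, int64_t param, int64_t param_index) {
    aliases[output] = {param, param_index};
  }
  bool OutputHasAlias(int64_t output) const {
    return aliases.count(output) > 0;
  }
};
```

Keeping the config on the module, rather than inferring aliasing per backend, is what lets copy insertion and buffer assignment reason about liveness once, centrally.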
* [XLA] Make overly-specific ShapeUtil predicate a little more general.Gravatar Chris Leary2018-10-08
| | | | PiperOrigin-RevId: 216263039
* [XLA] Simplify loop nesting in HandleConvolutionGravatar David Majnemer2018-10-08
| | | | | | | | | | | | | | | The calculation of a spatial coordinate in the kernel and activations is not dependent on which part of the contracted dimension (input feature) we are in. Rather than nesting the loops, the loops can be siblings: - One loop over spatial dimensions - One loop over the input feature group This reduces the nesting depth, which makes the code a little more readable and might be slightly faster due to work that is invariant in the spatial loop getting hoisted out. PiperOrigin-RevId: 216255839
* Add more logging to the convolution transformations.Gravatar Tim Shen2018-10-08
| | | | PiperOrigin-RevId: 216252980
* Add custom call with layout constraints.Gravatar Mark Heffernan2018-10-08
| | | | | | Add a variant of CustomCall which specifies arbitrary layout constraints on the operands and result. The existing non-layout-constrained CustomCall is changed to have no layout preference and can now be assigned arbitrary layouts by layout assignment. PiperOrigin-RevId: 216249615
* Merge pull request #22303 from JuliaComputing:kf/broadcastshapevalGravatar TensorFlower Gardener2018-10-08
|\ | | | | | | PiperOrigin-RevId: 216228494
* | Improve const correctness of HloDomainMapGravatar A. Unique TensorFlower2018-10-08
| | | | | | | | PiperOrigin-RevId: 216189458
* | [XLA] Add base and window dilation support to ReduceWindowGravatar David Majnemer2018-10-06
| | | | | | | | PiperOrigin-RevId: 216041507
* | [XLA:GPU] Remove hidden flag for disabling heuristic layout assignment.Gravatar Justin Lebar2018-10-05
| | | | | | | | | | | | | | Heuristic NCHW/NHWC layout assignment works great; we've never had to flip this flag. Might as well remove it and simplify things a bit. PiperOrigin-RevId: 215989807
* | [XLA:GPU] Use a struct for the return value of Gravatar Justin Lebar2018-10-05
| | | | | | | | | | | | | | | | | | | | | | | | CudnnConvolutionAlgorithmPicker::PickBestAlgorithm. Using a struct lets us return additional data -- namely, the elapsed time to run the best algo -- without adding a fourth entry to the tuple, which would be confusing. No functional change. PiperOrigin-RevId: 215987795
* | [XLA] Use the highest possible precision for large Iota inputs.Gravatar Blake Hechtman2018-10-05
| | | | | | | | PiperOrigin-RevId: 215957327
* | [XLA] Extend the HLO verifier to check that non-layout-changing instructionsGravatar Bixia Zheng2018-10-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | preserve operand layouts. Add a std::function member to the HloVerifier for a backend to specify the function object used to determine whether an instruction can change layouts. Use the function object to find out the non-layout-changing instructions and check that such instructions produce results with the same layouts as their operands. Add test cases. PiperOrigin-RevId: 215941282
* | Update XlaSort to match the underlying HLO.Gravatar A. Unique TensorFlower2018-10-05
| | | | | | | | PiperOrigin-RevId: 215917470
* | Use absl::Span for HloModuleGroupMetadataGravatar HyoukJoong Lee2018-10-05
| | | | | | | | PiperOrigin-RevId: 215905026
* | [XLA:GPU] Fix old-ptxas-version detection logic.Gravatar Justin Lebar2018-10-04
| | | | | | | | | | | | | | | | This was completely broken for CUDA versions > 9 and resulted in spurious warnings. Reported in #22706#issuecomment-426861394 -- thank you! PiperOrigin-RevId: 215841354
* | Rename "Inliner" to "MapInliner".Gravatar Mark Heffernan2018-10-04
| | | | | | | | PiperOrigin-RevId: 215801897
* | [TF:XLA] Improve the accounting for subcomputations in the List scheduler to Gravatar Dimitris Vardoulakis2018-10-04
| | | | | | | | | | | | avoid double-counting. PiperOrigin-RevId: 215795640
* | A few more fixes for issues in parsing an invalid HLO module proto.Gravatar A. Unique TensorFlower2018-10-04
| | | | | | | | PiperOrigin-RevId: 215794086
* | [XLA] Move FusionQueue class declaration into separate headerGravatar A. Unique TensorFlower2018-10-04
| | | | | | | | PiperOrigin-RevId: 215783391
* | Implement LiteralBase::Slice for all primitive typesGravatar A. Unique TensorFlower2018-10-04
| | | | | | | | PiperOrigin-RevId: 215764305
* | Automated rollback of commit f22037abf5a6f4581f5fb6013f72f91747f22965Gravatar A. Unique TensorFlower2018-10-04
| | | | | | | | PiperOrigin-RevId: 215757701
* | Remove CHECKs from HloInstruction constructors.Gravatar Mark Heffernan2018-10-04
| | | | | | | | | | | | | | | | Move these checks to RET_CHECKs in the HloVerifier. Added a new visitor class InstructionVerifier inside of hlo_verifier.cc for handling these random non-result-shape verifications. PiperOrigin-RevId: 215745043
* | Improve the performance of the ListMemorySchedulerGravatar A. Unique TensorFlower2018-10-04
| | | | | | | | | | | | | | | | This CL replaces a std::unordered_map with an absl::flat_hash_map and removes an unnecessary map lookup. These two changes can improve the performance of the scheduler on large graphs by up to 2x. PiperOrigin-RevId: 215707921
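The "unnecessary map lookup" half of the change above is a common pattern worth spelling out. A hypothetical sketch (invented names; `std::unordered_map` stands in for `absl::flat_hash_map`, for which the pattern is identical):

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// A contains-then-access pair hashes and probes the map twice; a single
// find() does the same work once, which adds up on hot scheduler paths.
inline int PriorityOrDefault(
    const std::unordered_map<std::string, int>& priorities,
    const std::string& name) {
  // Before: if (priorities.count(name)) return priorities.at(name);  // 2 lookups
  auto it = priorities.find(name);                                    // 1 lookup
  return it == priorities.end() ? 0 : it->second;
}
```

Combined with the flat-map's cache-friendly open addressing, eliminating the duplicate probe is how a container swap plus a one-line cleanup can move a large-graph workload measurably.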
* | [XLA] Update Tf2Xla bridge to use Scatter HLO.Gravatar A. Unique TensorFlower2018-10-03
| | | | | | | | PiperOrigin-RevId: 215687800
* | [XLA] Delete IsInplaceSlice.Gravatar Yuanzhong Xu2018-10-03
| | | | | | | | PiperOrigin-RevId: 215681153
* | [XLA] Fix handling of tuple constants in HLO constant folding.Gravatar Peter Hawkins2018-10-03
| | | | | | | | PiperOrigin-RevId: 215676675
* | [XLA] Add a size limit to the constant folder to avoid forming giant Gravatar Peter Hawkins2018-10-03
| | | | | | | | | | | | constants during compilation. PiperOrigin-RevId: 215663002
* | [TF:XLA] Improve the accounting for subcomputations in the heap simulator.Gravatar Dimitris Vardoulakis2018-10-03
| | | | | | | | | | | | | | | | Subtract the size of the aliased buffers from the subcomputation estimate instead of from the current computation. This way, the memory estimate for the current computation is more accurate. For the newly added test, the heap simulation calculates 48 bytes at head instead of the correct 64 bytes. PiperOrigin-RevId: 215653047
* | [XLA] Revise the way to express a CPU specific test.Gravatar Bixia Zheng2018-10-03
| | | | | | | | | | | | | | Use #ifdef XLA_TEST_BACKEND_CPU to protect the test instead of disabling it for all the other backends except for the CPU backend. PiperOrigin-RevId: 215651036
* | [XLA] Disable a test for layout changing elementwise operations.Gravatar Bixia Zheng2018-10-03
| | | | | | | | | | | | | | | | | | Rename the test to make it obvious that it is for testing the codegen correctness in handling layout changing elementwise operations. Keep the test only for the CPU backend. PiperOrigin-RevId: 215630611
* | Fix handling of tuples in CreateCopyWithNewLayout.Gravatar A. Unique TensorFlower2018-10-03
| | | | | | | | | | | | | | | | | | | | | | If the layout of a single tensor in a tuple is different from its use, then CreateCopyWithNewLayout will do a deep copy of the entire tuple. Not only does this operation create unnecessary copies of elements where the layout is the same, it will throw an error if the tuple contains elements like token[] that cannot be copied. As a result, layout assignment on TPU occasionally causes mysterious compilation failures for code that runs correctly on CPU and GPU. PiperOrigin-RevId: 215615731
* | [XLA] In the HLO parser, give the module a non-empty default name.Gravatar A. Unique TensorFlower2018-10-02
| | | | | | | | | | | | Otherwise, when parsing a single instruction, the parsed module doesn't have a name, which won't pass the hlo verifier check. PiperOrigin-RevId: 215519412
* | [XLA:CPU] Re-enable the inliner pass in the cpu compiler.Gravatar A. Unique TensorFlower2018-10-02
| | | | | | | | PiperOrigin-RevId: 215517752
* | [XLA] Modify the function that determines whether an instruction can changeGravatar Bixia Zheng2018-10-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | layout so that it can be used by the HLO verifier. Change the function to a static member function of the LayoutAssignment class. Add an std::function member to LayoutAssignment to store the function object passed down from the backend compiler class and use it to decide whether an instruction can change layouts. Fix affected test cases. PiperOrigin-RevId: 215515611
* | [XLA] Merge the single instruction parsing and the full module parsing in Gravatar A. Unique TensorFlower2018-10-02
| | | | | | | | | | | | one function. PiperOrigin-RevId: 215501702