path: root/tensorflow/compiler/xla/service/hlo_computation.h
author: Mark Heffernan <meheff@google.com> 2017-11-02 22:12:33 -0700
committer: TensorFlower Gardener <gardener@tensorflow.org> 2017-11-02 22:16:19 -0700
commit: 7bb2d57b0b051d1cf8dd74d3276bf5a452774172
tree: d5b07beacebcc425454978eb87ffecfe728d4281
parent: 8a7f5c47dcb71deb71df4a72f3cf829904c5a28e
Rewrite CopyInsertion to use module-scoped HloAliasAnalysis.

The net effect (number of copies inserted) is roughly similar to the existing implementation, but the new implementation is much more general. For example, the new implementation can handle entry argument buffer reuse with minimal modification.

Some unnecessary copies are still added due to deficiencies in buffer assignment (b/62548313), but these can be removed when buffer assignment also uses HloAliasAnalysis.

Also address a few issues uncovered with this CL:
(1) For in-place dynamic slice in the LLVM backends, truncate the slice rather than wrapping it. This matches the behavior of the non-in-place variant.
(2) Disable the SelectBetweenPredTuples test on GPU. The test introduces top-level buffer ambiguity, which the GPU backend does not tolerate.
(3) When deserializing HLO from a proto, do not uniquify instruction names in fused computations.
(4) In dataflow analysis, don't deallocate deleted HloValues during propagation.
(5) In dataflow analysis, fix an issue with the live_out_of_computation property.

PiperOrigin-RevId: 174423881
Diffstat (limited to 'tensorflow/compiler/xla/service/hlo_computation.h')
-rw-r--r-- tensorflow/compiler/xla/service/hlo_computation.h 10
1 files changed, 8 insertions, 2 deletions
diff --git a/tensorflow/compiler/xla/service/hlo_computation.h b/tensorflow/compiler/xla/service/hlo_computation.h
index 0754a9024c..f72a6e13c1 100644
--- a/tensorflow/compiler/xla/service/hlo_computation.h
+++ b/tensorflow/compiler/xla/service/hlo_computation.h
@@ -152,12 +152,18 @@ class HloComputation {
// computation_map: a map from computation name to HloComputation*. This map
// must contain all computations which the newly constructed computation
// calls.
- // fusion_instruction: if non-null then the newly created computation will be
+ // add_fused_computation: A function to call to add a fused
+ // computation. Used (clearly) when the instruction is a fusion
+ // instruction.
+ // fusion_instruction: if non-null then the newly created computation will
+ // be
// constructed as a fused computation with this instruction as its fusion
// parent.
static StatusOr<std::unique_ptr<HloComputation>> CreateFromProto(
HloModule* module, const HloComputationProto& proto,
- tensorflow::gtl::FlatMap<string, HloComputation*>* computation_map,
+ const tensorflow::gtl::FlatMap<string, HloComputation*>& computation_map,
+ const std::function<void(std::unique_ptr<HloComputation>)>&
+ add_fused_computation,
HloInstruction* fusion_instruction = nullptr);
// Gets the instructions in this computation.
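
The signature change in the hunk above makes computation_map read-only and hands any fused computation to the caller through the add_fused_computation callback. The following is a minimal, self-contained sketch of that calling pattern, not the XLA implementation: Computation, ComputationProto, ModuleSketch and CreateFromProtoSketch are hypothetical stand-ins invented for illustration, and the assumption that fused computations are discovered while deserializing the parent is inferred from the header comment only.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for HloComputation (not the real XLA class).
struct Computation {
  std::string name;
};

// Hypothetical stand-in for HloComputationProto: the computation's name,
// the names of fused computations nested in it, and the names of the
// computations it calls.
struct ComputationProto {
  std::string name;
  std::vector<std::string> fused_names;
  std::vector<std::string> callee_names;
};

using ComputationMap = std::unordered_map<std::string, Computation*>;
using AddFusedFn = std::function<void(std::unique_ptr<Computation>)>;

// Mirrors the shape of the new signature: the map is taken by const
// reference, so this function cannot register anything in it; fused
// computations are passed to the caller through the callback instead.
std::unique_ptr<Computation> CreateFromProtoSketch(
    const ComputationProto& proto, const ComputationMap& computation_map,
    const AddFusedFn& add_fused_computation) {
  // The map must already contain every computation this one calls.
  for (const std::string& callee : proto.callee_names) {
    if (computation_map.count(callee) == 0) return nullptr;
  }
  // Ownership of each fused computation moves to whoever supplied the
  // callback (in this sketch, the module-like container below).
  for (const std::string& fused : proto.fused_names) {
    add_fused_computation(
        std::unique_ptr<Computation>(new Computation{fused}));
  }
  return std::unique_ptr<Computation>(new Computation{proto.name});
}

// Hypothetical owner of the fused computations.
struct ModuleSketch {
  std::vector<std::unique_ptr<Computation>> computations;
};
```

A caller would typically pass a lambda that captures the owning container, for example `[&module](std::unique_ptr<Computation> c) { module.computations.push_back(std::move(c)); }`. Under these assumptions, one benefit of the callback is that the deserializer does not need mutable access to the caller's bookkeeping.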