path: root/tensorflow/compiler/xla/service/buffer_assignment.h
author Justin Lebar <jlebar@google.com> 2017-11-14 20:12:00 -0800
committer TensorFlower Gardener <gardener@tensorflow.org> 2017-11-14 20:16:38 -0800
commit 24a6162d2d5fad078157e2ec514f2fbb7ee0c676 (patch)
tree 602b458f606e8e374f9d27862a5aa6ca4b67a381 /tensorflow/compiler/xla/service/buffer_assignment.h
parent c798e04fbd24809d0bc52d4b80f30e74418b8f4d (diff)
[XLA:GPU] Mark loads as invariant where appropriate.
If we read a value within an HLO that isn't modified by that HLO, mark the load as invariant in LLVM IR.

LLVM can perform more aggressive optimizations on invariant loads, but I don't expect this will help much in our case, because XLA already emits pretty aggressive noalias information on loads and stores.

However, on NVIDIA GPUs, marking loads as invariant has the additional benefit of allowing LLVM to lower the load as ld.global.nc (equivalent to CUDA's __ldg). This instruction uses a special cache on the GPU, and it's usually faster than a regular load.

PiperOrigin-RevId: 175774979
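The disjointness check that this patch declares in buffer_assignment.h (HaveDisjointSlices) can be illustrated with a small standalone sketch. It follows the wording of the header comment only: both instructions must have at least one slice, and no slice of one may equal a slice of the other. The `Slice` struct and `HaveDisjointSlicesSketch` are hypothetical stand-ins invented for this example; they are not XLA types and this is not the committed implementation.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical stand-in for a buffer slice: a byte range inside a numbered
// allocation. Not the real XLA BufferAllocation::Slice.
struct Slice {
  int allocation_index;
  std::int64_t offset;
  std::int64_t size;

  // Ordering so that slices can be stored in a std::set.
  bool operator<(const Slice& other) const {
    return std::tie(allocation_index, offset, size) <
           std::tie(other.allocation_index, other.offset, other.size);
  }
};

// Returns true only if both sets are non-empty and share no slice.
// An empty set models "no buffer assigned", which is treated conservatively
// as "not provably disjoint".
bool HaveDisjointSlicesSketch(const std::set<Slice>& slices_a,
                              const std::set<Slice>& slices_b) {
  if (slices_a.empty() || slices_b.empty()) return false;
  for (const Slice& slice : slices_a) {
    if (slices_b.count(slice) > 0) return false;
  }
  return true;
}
```

Under these assumptions, a load from an operand would only be a candidate for the invariant marking when the check returns true for the operand and the instruction that writes the output.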
Diffstat (limited to 'tensorflow/compiler/xla/service/buffer_assignment.h')
-rw-r--r-- tensorflow/compiler/xla/service/buffer_assignment.h | 6
1 files changed, 6 insertions, 0 deletions
diff --git a/tensorflow/compiler/xla/service/buffer_assignment.h b/tensorflow/compiler/xla/service/buffer_assignment.h
index 08a53af8ba..08a40bfeb2 100644
--- a/tensorflow/compiler/xla/service/buffer_assignment.h
+++ b/tensorflow/compiler/xla/service/buffer_assignment.h
@@ -327,6 +327,12 @@ class BufferAssignment {
     return SharesSliceAtIndex(hlo_a, {}, hlo_b, {});
   }
 
+  // Returns true if hlo_a and hlo_b both have at least one buffer assigned for
+  // their top-level and each of their nested shape indices, and if hlo_a's
+  // buffers are all different from hlo_b's buffers.
+  bool HaveDisjointSlices(const HloInstruction* hlo_a,
+                          const HloInstruction* hlo_b) const;
+
   // Returns the underlying points-to analysis used for this assignment.
   const TuplePointsToAnalysis& points_to_analysis() const {
     return liveness_->points_to_analysis();