author | Justin Lebar <jlebar@google.com> | 2017-11-14 20:12:00 -0800 |
---|---|---|
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2017-11-14 20:16:38 -0800 |
commit | 24a6162d2d5fad078157e2ec514f2fbb7ee0c676 (patch) | |
tree | 602b458f606e8e374f9d27862a5aa6ca4b67a381 /tensorflow/compiler/xla/service/buffer_assignment.h | |
parent | c798e04fbd24809d0bc52d4b80f30e74418b8f4d (diff) |
[XLA:GPU] Mark loads as invariant where appropriate.

If we read a value within an HLO that isn't modified by that HLO, mark
it as invariant in LLVM IR.

LLVM can perform more aggressive optimizations on invariant loads, but I
don't expect this will help much in our case, because XLA already emits
pretty aggressive noalias information on loads and stores.

However, on NVIDIA GPUs, marking loads as invariant has the additional
benefit of allowing LLVM to lower the load as ld.global.nc (equivalent
to CUDA's __ldg). This instruction uses a special cache on the GPU, and
it's usually faster than a regular load.

PiperOrigin-RevId: 175774979
Diffstat (limited to 'tensorflow/compiler/xla/service/buffer_assignment.h')
-rw-r--r-- | tensorflow/compiler/xla/service/buffer_assignment.h | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/tensorflow/compiler/xla/service/buffer_assignment.h b/tensorflow/compiler/xla/service/buffer_assignment.h
index 08a53af8ba..08a40bfeb2 100644
--- a/tensorflow/compiler/xla/service/buffer_assignment.h
+++ b/tensorflow/compiler/xla/service/buffer_assignment.h
@@ -327,6 +327,12 @@ class BufferAssignment {
     return SharesSliceAtIndex(hlo_a, {}, hlo_b, {});
   }
 
+  // Returns true if hlo_a and hlo_b both have at least one buffer assigned for
+  // their top-level and each of their nested shape indices, and if hlo_a's
+  // buffers are all different from hlo_b's buffers.
+  bool HaveDisjointSlices(const HloInstruction* hlo_a,
+                          const HloInstruction* hlo_b) const;
+
   // Returns the underlying points-to analysis used for this assignment.
   const TuplePointsToAnalysis& points_to_analysis() const {
     return liveness_->points_to_analysis();
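
The contract documented on HaveDisjointSlices can be illustrated with a small standalone model. This is only a sketch of the comment's wording, not the implementation from this commit: ShapeIndex, SliceId, SliceMap and HaveDisjointSlicesSketch are hypothetical stand-ins, and the real method works on HloInstruction and the buffer assignment's own slice types.

```cpp
#include <map>
#include <set>
#include <vector>

// Hypothetical stand-ins for XLA's types (assumptions for illustration only).
using ShapeIndex = std::vector<int>;  // the empty index {} is the top level
using SliceId = int;                  // identifies one assigned buffer slice
using SliceMap = std::map<ShapeIndex, std::set<SliceId>>;  // slices per index

// Toy model of the documented contract: true only if both instructions have
// at least one slice at the top level and at every listed nested index, and
// no slice of `a` also appears among the slices of `b`.
bool HaveDisjointSlicesSketch(const SliceMap& a, const SliceMap& b) {
  // Both sides need an entry for the top-level shape index.
  if (a.find(ShapeIndex{}) == a.end() || b.find(ShapeIndex{}) == b.end()) {
    return false;
  }

  // Collect every slice used by `a`, rejecting indices with no buffer.
  std::set<SliceId> a_slices;
  for (const auto& entry : a) {
    if (entry.second.empty()) return false;
    a_slices.insert(entry.second.begin(), entry.second.end());
  }

  // Every index of `b` needs a buffer, and none may be shared with `a`.
  for (const auto& entry : b) {
    if (entry.second.empty()) return false;
    for (SliceId slice : entry.second) {
      if (a_slices.count(slice) != 0) return false;
    }
  }
  return true;
}
```

In this model a missing buffer yields false rather than true, which matches the conservative reading of the comment: a caller deciding whether a load may be marked invariant would treat "unknown" the same as "may overlap".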