| author | Justin Lebar <jlebar@google.com> | 2018-03-27 17:16:31 -0700 |
|---|---|---|
| committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-03-27 17:23:12 -0700 |
| commit | 50e1888fa89bce621e988a92ede3dc362e37b248 | |
| tree | 8f845fa4d3bb07be27a8eea6473f4995550a39d1 | |
| parent | e4b367cef5f89d6741dac223c91a11fda9ca63ae | |
[XLA] Assert that all buffers and sub-buffers passed to XLA have an explicit pointer.
In the past, we allowed sub-buffers to be null if the top-level tuple
was non-null.
This doesn't actually work well on the GPU: for ops implemented via cudnn
or cublas, we must have a host-side pointer to the sub-buffer in order to
make the call. Retrieving that pointer from the GPU efficiently is
complicated, and the best scheme we can come up with isn't all that
efficient (fundamentally, having to pull data down from the GPU blocks the
CPU's ability to "run ahead" of the GPU).
Since TF wasn't making use of our flexibility *anyway*, we add the
requirement that XLA be given non-null pointers to all sub-buffers.
Changes to the XLA:GPU backend to take advantage of this will come
separately.
PiperOrigin-RevId: 190700021
-rw-r--r-- | tensorflow/compiler/xla/service/gpu/gpu_executable.cc | 24 |
1 file changed, 15 insertions(+), 9 deletions(-)
```diff
diff --git a/tensorflow/compiler/xla/service/gpu/gpu_executable.cc b/tensorflow/compiler/xla/service/gpu/gpu_executable.cc
index 04b37d913e..28f9344795 100644
--- a/tensorflow/compiler/xla/service/gpu/gpu_executable.cc
+++ b/tensorflow/compiler/xla/service/gpu/gpu_executable.cc
@@ -267,16 +267,22 @@ StatusOr<std::unique_ptr<ShapedBuffer>> GpuExecutable::ExecuteOnStream(
        ++i) {
     const BufferAllocation& allocation = assignment_->GetAllocation(i);
     if (allocation.is_entry_computation_parameter()) {
-      // The caller must give us a buffer for ShapeIndex {} of every parameter.
-      // It can optionally give us a buffer for other ShapeIndices, but we
-      // ignore them: Because we can't rely on these sub-buffers' addresses
-      // being available, our generated code can't use them. Instead, it must
-      // chase pointers starting at the tuple root.
-      if (allocation.param_shape_index().empty()) {
-        auto param_no = allocation.parameter_number();
-        buffer_allocations_builder.RegisterBuffer(
-            i, arguments[param_no]->root_buffer());
+      auto param_no = allocation.parameter_number();
+      se::DeviceMemoryBase buffer =
+          arguments[param_no]->buffer(allocation.param_shape_index());
+
+      // All top-level buffers and sub-buffers must have an explicit, non-null
+      // pointer, except for zero-sized buffers, which may be null.
+      if (buffer.is_null() && buffer.size() > 0) {
+        return FailedPrecondition(
+            "Cannot run XLA computation because pointer to (sub-)buffer at "
+            "index %s of parameter %lld was null. All pointers to "
+            "(sub-)buffers must not be null, unless the (sub-)buffer has zero "
+            "elements.",
+            allocation.param_shape_index().ToString().c_str(), param_no);
       }
+
+      buffer_allocations_builder.RegisterBuffer(i, buffer);
     }
   }
   se::StreamExecutor* executor = run_options->stream()->parent();
```