Add ScopedAllocatorOptimizer in support of CollectiveReduce. - tensorflow


author	A. Unique TensorFlower <gardener@tensorflow.org>	2018-05-25 12:54:49 -0700
committer	TensorFlower Gardener <gardener@tensorflow.org>	2018-05-25 12:57:18 -0700
commit	0b522fd22b986704d1056254961cc7988ae182eb (patch)
tree	472c18f77c5e6b2c1dae0f1aacd6234f5e53436b /third_party/nanopb.BUILD
parent	ae0eb1b7f81f6d98e0503b9568c72feaa805e655 (diff)

Add ScopedAllocatorOptimizer in support of CollectiveReduce.

The efficiency of CollectiveReduce is greatly improved by merging multiple parallel reductions over smaller tensors into a single reduction over a larger tensor that is the concatenation of the smaller tensors. Because CollectiveReduce is essentially an element-wise array operation that operates on a 1-D reshape of the input tensor, it is eligible for a ScopedAllocation optimization.

The optimization works by looking for serially independent instances of CollectiveReduce that lie within the same name-scope tier and have the same control-flow (e.g. loop) embedding structure. Where two or more such nodes are found, the upstream nodes that generate their inputs are modified to write their outputs into consecutive regions of a single tensor buffer maintained by a ScopedAllocator. The multiple CollectiveReduce nodes are then replaced by a single CollectiveReduce that operates in-place on the backing buffer.

The effectiveness of the optimization depends on there being candidate CollectiveReduce nodes with these characteristics that become eligible for execution at close to the same time. If the name scope is too large and includes nodes that become eligible for execution at very different times, this graph rewrite could result in a slowdown.

Note that this optimization is experimental: it is not guaranteed to work, especially for ops other than CollectiveReduce.

PiperOrigin-RevId: 198089642
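The merging idea described in the message can be sketched in plain Python. This is an illustration only, not the TensorFlow implementation: `all_reduce_sum` and `fused_all_reduce` are hypothetical helper names, lists stand in for tensors, and an in-process sum stands in for the collective. It shows why an element-wise reduction over a concatenated 1-D buffer gives the same result as separate reductions over each small tensor.

```python
# Illustrative sketch only: plain-Python stand-ins, not TensorFlow APIs.

def all_reduce_sum(per_worker_buffers):
    """Element-wise sum across workers (hypothetical stand-in for a collective)."""
    return [sum(vals) for vals in zip(*per_worker_buffers)]


def fused_all_reduce(per_worker_tensors):
    """Reduce several small 1-D tensors with a single collective call.

    per_worker_tensors[w] is the list of 1-D tensors (lists) held by worker w.
    All workers are assumed to hold tensors of matching sizes.
    """
    sizes = [len(t) for t in per_worker_tensors[0]]
    # Each worker lays its tensors out in consecutive regions of one buffer,
    # mimicking the role the message assigns to the ScopedAllocator.
    backing = [[x for t in tensors for x in t] for tensors in per_worker_tensors]
    # One reduction over the large buffer replaces one reduction per tensor.
    reduced = all_reduce_sum(backing)
    # Slice the reduced buffer back into the original tensor shapes.
    out, offset = [], 0
    for n in sizes:
        out.append(reduced[offset:offset + n])
        offset += n
    return out


workers = [
    [[1.0, 2.0], [10.0], [100.0, 200.0, 300.0]],
    [[3.0, 4.0], [20.0], [400.0, 500.0, 600.0]],
]
# Baseline: one reduction per small tensor.
separate = [all_reduce_sum([w[i] for w in workers]) for i in range(3)]
# Fused: a single reduction over the concatenated buffer.
fused = fused_all_reduce(workers)
assert fused == separate
```

The sketch copies data into the backing buffer, whereas the commit describes upstream nodes writing directly into the shared buffer so that no extra copy is needed.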

Diffstat (limited to 'third_party/nanopb.BUILD')

0 files changed, 0 insertions, 0 deletions

