Speed up TuplePointsToAnalysis.

This analysis is one of the most expensive parts of the HLO optimization pipeline.

- Avoid one or two unnecessary hashtable lookups in PopulateDefinedBuffersAndAliases.
- Add a mode to ShapeTree wherein we avoid copying Shapes.
- Use templated functors rather than std::function in ShapeTree's iterators, thus avoiding the overhead of std::function.

PiperOrigin-RevId: 160487485
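
The functor change described in the message can be illustrated with a minimal sketch. `TinyTree`, `ForEachErased`, `ForEach`, `SumErased`, and `SumTemplated` are hypothetical names invented for this example; they are not ShapeTree's actual API. The sketch only contrasts a type-erased `std::function` visitor with a templated one.

```cpp
#include <functional>
#include <vector>

// Hypothetical stand-in for a tree-like container; this is NOT xla::ShapeTree.
struct TinyTree {
  std::vector<int> nodes;

  // Type-erased visitor: every call goes through std::function's indirection,
  // and constructing the std::function may allocate for large callables.
  void ForEachErased(const std::function<void(int)>& fn) const {
    for (int n : nodes) fn(n);
  }

  // Templated visitor: the callable's concrete type is known at compile time,
  // so the compiler is free to inline the call into the loop.
  template <typename Fn>
  void ForEach(const Fn& fn) const {
    for (int n : nodes) fn(n);
  }
};

// Both helpers compute the same sum; only the dispatch mechanism differs.
int SumErased(const TinyTree& tree) {
  int sum = 0;
  tree.ForEachErased([&sum](int n) { sum += n; });
  return sum;
}

int SumTemplated(const TinyTree& tree) {
  int sum = 0;
  tree.ForEach([&sum](int n) { sum += n; });
  return sum;
}
```

Whether the templated form is measurably faster depends on the compiler and the call site; the commit message reports it as a win for this hot path.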
author: Justin Lebar <jlebar@google.com> 2017-06-28 21:50:44 -0700
committer: TensorFlower Gardener <gardener@tensorflow.org> 2017-06-28 21:54:41 -0700
commit: 181816fe27684585bface6e2260a0ff1c890e3e9 (patch)
tree: 36dfeab13e57a6a5f37f34afabae7f8aafe37108 /tensorflow/compiler/xla/service/tuple_points_to_analysis.h
parent: e6a45475735ee8a31c7d6c8e28e9164cda7d1853 (diff)
1 files changed, 4 insertions, 1 deletions
diff --git a/tensorflow/compiler/xla/service/tuple_points_to_analysis.h b/tensorflow/compiler/xla/service/tuple_points_to_analysis.h
index a05b4c8ebc..be821d5154 100644
--- a/tensorflow/compiler/xla/service/tuple_points_to_analysis.h
+++ b/tensorflow/compiler/xla/service/tuple_points_to_analysis.h
@@ -48,7 +48,10 @@ namespace xla {
 // the corresponding buffer.
 class PointsToSet : public ShapeTree<std::vector<const LogicalBuffer*>> {
  public:
-  explicit PointsToSet(const Shape& shape)
+  // Construct our ShapeTree with a pointer rather than a reference to a Shape
+  // because this is very hot code, and copying (and then destroying) all these
+  // Shapes is slow.
+  explicit PointsToSet(const Shape* shape)
       : ShapeTree<std::vector<const LogicalBuffer*>>(shape),
         tuple_sources_(shape) {}
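
The trade-off behind the constructor change above can be sketched in isolation. `FakeShape`, `OwningTree`, and `BorrowingTree` are hypothetical types made up for illustration; the real `Shape` and `ShapeTree` are considerably more involved. The point is only that taking a pointer lets the tree borrow the caller's shape instead of copying it, at the cost of a lifetime requirement on the caller.

```cpp
#include <vector>

// Hypothetical stand-in for xla::Shape (assumption: the real type is heavier
// to copy and destroy than this toy struct).
struct FakeShape {
  std::vector<long long> dimensions;
};

// Copying variant: the constructor takes a reference and stores its own copy,
// so each construction pays for a copy and each destruction for a teardown.
class OwningTree {
 public:
  explicit OwningTree(const FakeShape& shape) : shape_(shape) {}
  const FakeShape& shape() const { return shape_; }

 private:
  FakeShape shape_;
};

// Borrowing variant: the constructor takes a pointer and stores only that.
// The caller must keep *shape alive for as long as the tree is in use.
class BorrowingTree {
 public:
  explicit BorrowingTree(const FakeShape* shape) : shape_(shape) {}
  const FakeShape& shape() const { return *shape_; }

 private:
  const FakeShape* shape_;  // Not owned.
};
```

With the borrowing variant, a caller would write `BorrowingTree tree(&shape);` and must ensure `shape` outlives `tree`.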