path: root/tensorflow/compiler/xla/shape_tree.h
Commit history (newest first): message, author, date
* Avoid directly constructing vector iterators from pointers; that isn't
  part of their public API. (A. Unique TensorFlower, 2018-09-08)
  PiperOrigin-RevId: 212123326
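The pattern behind this fix generalizes: std::vector<T>::iterator is an implementation-defined type, and constructing one directly from a raw pointer only compiles on standard libraries where the iterator happens to be a pointer. A minimal sketch of the portable alternative (the variables are illustrative, not from the XLA change):

```cpp
#include <vector>

int main() {
  std::vector<int> v = {1, 2, 3, 4};
  int* p = v.data() + 2;  // a raw pointer into the vector's storage

  // Non-portable: std::vector<int>::iterator it(p);
  // That only compiles where the iterator type happens to be a pointer.

  // Portable: recover an iterator by offset from begin().
  std::vector<int>::iterator it = v.begin() + (p - v.data());
  return *it == 3 ? 0 : 1;
}
```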
* Change headers to directly include absl::Span, and clean up the build
  dependencies as well. (Tim Shen, 2018-08-30)
  PiperOrigin-RevId: 211038094
* New XLA API to launch a program. (Yunxing Dai, 2018-08-29)
  1. Proposes a new API with the ability to do input/output.
  2. Starts to enable ABSL in TF's codebase.
  PiperOrigin-RevId: 210783617
* [XLA] gtl::optional -> absl::optional (Yunxing Dai, 2018-08-21)
  PiperOrigin-RevId: 209686671
* [XLA] Use absl::make_unique instead of xla::MakeUnique; same for
  WrapUnique. (Justin Lebar, 2018-08-20)
  PiperOrigin-RevId: 209531124
* Use InlinedVector for ShapeIndex. (Yunxing Dai, 2018-07-18)
  - Use a separate IndexTable for lookups. This reduces the number of
    cachelines needed for storing ShapeTreeNodes.
  - Set up benchmark for flat tuples, useful as a benchmark for future
    optimizations.

  name                          old time/op   new time/op   delta
  BM_Construct/2/8              8.34µs ± 1%   7.57µs ± 2%      -9.26%  (p=0.008 n=5+5)
  BM_Construct/1/1000            143µs ± 1%    132µs ± 1%      -7.29%  (p=0.008 n=5+5)
  BM_ConstructUnowned/2/8       2.18µs ± 4%   1.31µs ± 1%     -39.99%  (p=0.008 n=5+5)
  BM_ConstructUnowned/1/1000    23.0µs ± 7%   15.1µs ± 1%     -34.47%  (p=0.008 n=5+5)
  BM_Copy/2/8                   1.52µs ± 5%   0.37µs ± 1%     -76.01%  (p=0.008 n=5+5)
  BM_Copy/1/1000                18.7µs ± 3%    4.9µs ± 2%     -73.85%  (p=0.008 n=5+5)
  BM_Move/2/8                   0.03ns ± 2%  13.42ns ± 1%  +40877.10%  (p=0.016 n=4+5)
  BM_Move/1/1000                0.03ns ± 0%  13.54ns ± 3%  +40930.30%  (p=0.016 n=4+5)
  BM_ForEach/2/8                26.4ns ± 1%   27.9ns ± 2%      +5.77%  (p=0.008 n=5+5)
  BM_ForEach/1/1000              271ns ± 1%    273ns ± 0%      +0.81%  (p=0.016 n=5+4)
  BM_Iterate/2/8                25.5ns ± 3%   23.9ns ± 8%           ~  (p=0.151 n=5+5)
  BM_Iterate/1/1000              272ns ± 2%    271ns ± 1%           ~  (p=0.984 n=5+5)

  name                          old allocs/op  new allocs/op  delta
  BM_Construct/2/8                 373 ± 0%       276 ± 0%    -26.01%  (p=0.008 n=5+5)
  BM_Construct/1/1000            5.00k ± 0%     4.00k ± 0%    -20.00%  (p=0.008 n=5+5)
  BM_ConstructUnowned/2/8         99.0 ± 0%       2.0 ± 0%    -97.98%  (p=0.008 n=5+5)
  BM_ConstructUnowned/1/1000     1.00k ± 0%     0.00k ± 0%    -99.80%  (p=0.008 n=5+5)
  BM_Copy/2/8                      105 ± 0%        19 ± 0%    -81.90%  (p=0.008 n=5+5)
  BM_Copy/1/1000                 1.31k ± 0%     0.25k ± 0%    -80.84%  (p=0.008 n=5+5)
  BM_Move/2/8                     23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_Move/1/1000                   313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)
  BM_ForEach/2/8                  23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_ForEach/1/1000                313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)
  BM_Iterate/2/8                  23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_Iterate/1/1000                313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)

  PiperOrigin-RevId: 205154861
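The idea of the InlinedVector change can be sketched as follows. This is a toy stand-in, not XLA's actual ShapeIndex (which lives in shape_util.h); the inline capacity of 2 is an assumption for illustration only:

```cpp
#include <cstddef>
#include <cstdint>
#include "absl/container/inlined_vector.h"

// Minimal sketch: indices of depth <= 2 live entirely inline, so
// pushing/popping while walking a shape tree usually allocates nothing.
class ShapeIndexSketch {
 public:
  void push_back(int64_t i) { indices_.push_back(i); }
  void pop_back() { indices_.pop_back(); }
  int64_t operator[](std::size_t i) const { return indices_[i]; }
  std::size_t size() const { return indices_.size(); }

 private:
  absl::InlinedVector<int64_t, 2> indices_;  // inline capacity: assumption
};
```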
* BEGIN_PUBLIC (A. Unique TensorFlower, 2018-07-18)
  Rollback of 81161f9d9987a8eb70793d95048c20be34292859 due to internal
  breakage.
  END_PUBLIC
  Automated rollback of commit 81161f9d9987a8eb70793d95048c20be34292859
  PiperOrigin-RevId: 205107655
* Use InlinedVector for ShapeIndex. (Yunxing Dai, 2018-07-17)
  - Use a separate IndexTable for lookups. This reduces the number of
    cachelines needed for storing ShapeTreeNodes.
  - Set up benchmark for flat tuples, useful as a benchmark for future
    optimizations.

  name                          old time/op   new time/op   delta
  BM_Construct/2/8              8.34µs ± 1%   7.57µs ± 2%      -9.26%  (p=0.008 n=5+5)
  BM_Construct/1/1000            143µs ± 1%    132µs ± 1%      -7.29%  (p=0.008 n=5+5)
  BM_ConstructUnowned/2/8       2.18µs ± 4%   1.31µs ± 1%     -39.99%  (p=0.008 n=5+5)
  BM_ConstructUnowned/1/1000    23.0µs ± 7%   15.1µs ± 1%     -34.47%  (p=0.008 n=5+5)
  BM_Copy/2/8                   1.52µs ± 5%   0.37µs ± 1%     -76.01%  (p=0.008 n=5+5)
  BM_Copy/1/1000                18.7µs ± 3%    4.9µs ± 2%     -73.85%  (p=0.008 n=5+5)
  BM_Move/2/8                   0.03ns ± 2%  13.42ns ± 1%  +40877.10%  (p=0.016 n=4+5)
  BM_Move/1/1000                0.03ns ± 0%  13.54ns ± 3%  +40930.30%  (p=0.016 n=4+5)
  BM_ForEach/2/8                26.4ns ± 1%   27.9ns ± 2%      +5.77%  (p=0.008 n=5+5)
  BM_ForEach/1/1000              271ns ± 1%    273ns ± 0%      +0.81%  (p=0.016 n=5+4)
  BM_Iterate/2/8                25.5ns ± 3%   23.9ns ± 8%           ~  (p=0.151 n=5+5)
  BM_Iterate/1/1000              272ns ± 2%    271ns ± 1%           ~  (p=0.984 n=5+5)

  name                          old allocs/op  new allocs/op  delta
  BM_Construct/2/8                 373 ± 0%       276 ± 0%    -26.01%  (p=0.008 n=5+5)
  BM_Construct/1/1000            5.00k ± 0%     4.00k ± 0%    -20.00%  (p=0.008 n=5+5)
  BM_ConstructUnowned/2/8         99.0 ± 0%       2.0 ± 0%    -97.98%  (p=0.008 n=5+5)
  BM_ConstructUnowned/1/1000     1.00k ± 0%     0.00k ± 0%    -99.80%  (p=0.008 n=5+5)
  BM_Copy/2/8                      105 ± 0%        19 ± 0%    -81.90%  (p=0.008 n=5+5)
  BM_Copy/1/1000                 1.31k ± 0%     0.25k ± 0%    -80.84%  (p=0.008 n=5+5)
  BM_Move/2/8                     23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_Move/1/1000                   313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)
  BM_ForEach/2/8                  23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_ForEach/1/1000                313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)
  BM_Iterate/2/8                  23.0 ± 0%      17.0 ± 0%    -26.09%  (p=0.008 n=5+5)
  BM_Iterate/1/1000                313 ± 0%       250 ± 0%    -20.13%  (p=0.008 n=5+5)

  PiperOrigin-RevId: 205024687
* [XLA] Make ShapeTree use ShapeIndexViews. (Benjamin Kramer, 2018-06-21)
  Avoids creating temporary std::vectors on the consumer side. Also push
  ShapeIndexViews through the GPU backend a bit.
  PiperOrigin-RevId: 201529722
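The consumer-side win can be sketched with absl::Span, the view type these headers converged on; the function below is hypothetical, not an XLA API:

```cpp
#include <cstdint>
#include <vector>
#include "absl/types/span.h"

// Hypothetical consumer: taking a Span instead of std::vector<int64_t>
// lets callers pass vectors, arrays, or brace lists without building a
// temporary vector for the call.
int64_t LastIndex(absl::Span<const int64_t> index) {
  return index.empty() ? -1 : index.back();
}

int main() {
  std::vector<int64_t> v = {0, 1, 2};
  // Both calls bind without materializing a temporary std::vector copy.
  return (LastIndex(v) == 2 && LastIndex({3, 4}) == 4) ? 0 : 1;
}
```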
* Do not count empty tuples as having one leaf node.
  (A. Unique TensorFlower, 2018-06-12)
  PiperOrigin-RevId: 200327849
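A toy illustration of the fixed counting rule, using a hypothetical mini shape type rather than xla::Shape: an empty tuple recurses over zero children and therefore contributes zero leaves.

```cpp
#include <vector>

// Toy shape: a node is either an array leaf or a tuple of children.
struct ToyShape {
  bool is_tuple = false;
  std::vector<ToyShape> children;
};

int LeafCount(const ToyShape& s) {
  if (!s.is_tuple) return 1;          // array shapes are leaves
  int n = 0;                          // empty tuple: loop body never runs,
  for (const auto& c : s.children) {  // so the count stays 0
    n += LeafCount(c);
  }
  return n;
}

int main() {
  ToyShape empty_tuple;
  empty_tuple.is_tuple = true;
  return LeafCount(empty_tuple) == 0 ? 0 : 1;
}
```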
* Introduced kDomain HLO instruction set isolation to bound connected sets
  of instructions with similar attributes (i.e., sharding).
  (A. Unique TensorFlower, 2018-05-29)
  This CL simply adds the infrastructure, but leaves the wire-on to a
  separate CL.
  PiperOrigin-RevId: 198503625
* [XLA] Optimize ShapeTree<T>. (A. Unique TensorFlower, 2018-05-22)
  This optimizes ShapeTree quite significantly. In particular this
  optimizes for the common case of querying/iterating, copying and moving
  ShapeTrees.
  * Allocate all ShapeTreeNodes inside a single, owned, vector. This
    reduces the number of memory allocations and improves cache
    performance.
  * Instead of storing children nodes as unique_ptrs, store them as
    indices into the owning container's vector. This allows cheap
    copy-construction (a std::vector POD copy) and doesn't change the
    fast path (dereferencing a pointer is just as fast as dereferencing
    a base + offset).
  * Instead of a unique_ptr<Shape>, use a shared_ptr<Shape>. This removes
    a load of copy-construction overhead at the cost of a shared_ptr over
    a unique_ptr (one extra allocation).
  * Instead of computing ShapeIndexes on-demand in the iterators/ForEach*,
    precompute them during construction time. This adds a few more bytes
    per ShapeTree, but now we can...
  * ... store a std::pair<ShapeIndex, T> as the ShapeTreeNode's data
    element. This allows us to provide a std::pair<K,V>&, STL-like
    interface from iterators without going through any of the previous
    unique_ptr hacks around storage lifetimes.
  * Because we no longer need to iterate from the beginning to build up
    the ShapeIndex, we can now offer a ::find() function to return an
    iterator for a ShapeIndex in O(K) time. As the iteration order is
    guaranteed to be pre-order, this can be used (and will be, later) to
    speed up the fast-path of mutating a subtree of a ShapeTree from
    tf2xla::ExtractSubBuffers.
  * Similarly because we now have a very standard, cheap STL interface
    with no performance cliffs, we can hopefully improve ShapedBuffer's
    copy and move constructors to be cheaper.
  PiperOrigin-RevId: 197548717
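The storage scheme described in the first two bullets can be sketched as follows; the names are illustrative, and the real ShapeTree differs in detail:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Sketch: all nodes live in one owning vector, children are indices into
// that vector rather than unique_ptrs, and each node carries its
// precomputed (index, payload) pair. Copying the whole tree is then a
// plain vector copy.
template <typename T>
struct SketchShapeTree {
  using Index = std::vector<int64_t>;  // stand-in for xla::ShapeIndex
  struct Node {
    std::pair<Index, T> data;           // precomputed index + payload
    std::vector<std::size_t> children;  // indices, not pointers
  };
  std::vector<Node> nodes;  // nodes[0] is the root; preorder layout
};
```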
* [XLA] Fully qualify xla::MakeUnique uses in shape_tree.h. No functional
  changes. (Peter Hawkins, 2018-03-01)
  PiperOrigin-RevId: 187520283
* Make conversions from ShapedBuffer <-> ScopedShapedBuffer efficient by
  moving memory ownership instead of copying.
  (A. Unique TensorFlower, 2018-02-15)
  PiperOrigin-RevId: 185871648
* Merge changes from github. (A. Unique TensorFlower, 2017-12-22)
  PiperOrigin-RevId: 179953488
* Automated g4 rollback of changelist 179258973.
  (A. Unique TensorFlower, 2017-12-15)
  PiperOrigin-RevId: 179260538
* Merge changes from github. (Dandelion Mané, 2017-12-15)
  PiperOrigin-RevId: 179258973
* Change HloSharding to allow getting a ShapeTree for non-tuple types.
  Add reverse iteration to ShapeTree. (A. Unique TensorFlower, 2017-11-10)
  PiperOrigin-RevId: 175341255
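Reverse iteration over a container whose nodes already live in a vector falls out of std::reverse_iterator; a minimal sketch with a toy container, not ShapeTree's actual implementation:

```cpp
#include <iostream>
#include <iterator>
#include <vector>

// Toy container with vector-backed storage: once bidirectional
// begin()/end() iterators exist, rbegin()/rend() are just adapters.
struct ToyContainer {
  std::vector<int> items;
  auto rbegin() { return std::make_reverse_iterator(items.end()); }
  auto rend() { return std::make_reverse_iterator(items.begin()); }
};

int main() {
  ToyContainer c{{1, 2, 3}};
  for (auto it = c.rbegin(); it != c.rend(); ++it) {
    std::cout << *it << ' ';  // prints: 3 2 1
  }
}
```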
* When sharding a tuple, we typically want to describe the data sharding
  of each individual subtensor individually.
  (A. Unique TensorFlower, 2017-11-10)

  Tuples are essentially just containers - the tensors they contain
  should be able to be sharded differently. Tuples are hierarchically
  structured, but shardings were designed to not contain the sharded type
  (the sharded type is inferred from the output type of the instruction
  the sharding is applied to). Therefore, shardings for tuples contain
  shardings for each subtensor as a non-structured list. This list is
  ordered as a preorder walk of the tuple shape, and of course only the
  leaf nodes of the tuple shape are stored. The structure is reapplied
  when the sharded instruction's shape is known.
  PiperOrigin-RevId: 175132692
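A toy sketch of the preorder-leaf layout described above, using a hypothetical node type rather than xla::Shape or HloSharding: interior tuple nodes emit nothing, and leaves appear in preorder, so the nested tuple (a, (b, c), d) flattens to a, b, c, d.

```cpp
#include <vector>

struct Node {
  bool is_tuple = false;
  std::vector<Node> children;
  int leaf_id = -1;  // stand-in for a per-leaf sharding
};

// Preorder walk: only leaves contribute an entry to the flat list,
// matching the non-structured sharding list described above.
void FlattenLeaves(const Node& n, std::vector<int>* out) {
  if (!n.is_tuple) {
    out->push_back(n.leaf_id);
    return;
  }
  for (const auto& c : n.children) FlattenLeaves(c, out);
}
```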
* Reduce XLA compile time by ~7% for a convolutional image model.
  (A. Unique TensorFlower, 2017-08-18)
  * Added CompactPointerSet<T>, which is optimized for set size <= 1.
  * Changed expensive CHECKs to DCHECKs in buffer_assignment.cc.
  * Reserve space in DFS state array before starting DFS.
  * Use unsigned arithmetic in DFS state maintenance.
  * HloInstruction:
    - Moved frequently used fields to start for better cache locality.
    - Use InlinedVector instead of vector for operand array.
    - Use InlinedVector instead of vector for DFS stack.
  * Pre-compute "is array" and "is tuple" for LogicalBuffer.
  * PointsToSet:
    - Combine two ShapeTrees into one.
    - Use CompactPointerSet instead of std::set to hold sources.
    - Use CompactPointerSet instead of std::set to hold flattened
      buffers.
  * ShapeTree: use unique_ptr instead of optional for shape storage
    (reduces size and destruction overhead).
  * Add proper const qualifiers to some FlatSet iterator methods.
  Co-author=jeff
  PiperOrigin-RevId: 165759117
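The first bullet's size-<=-1 optimization can be sketched like this; the real CompactPointerSet<T> is more elaborate, and this toy class only shows the inline fast path:

```cpp
#include <cstddef>
#include <set>

// Toy sketch: one inline slot, with a heap std::set materialized only
// when a second distinct pointer arrives (the rare slow path).
template <typename T>
class TinyPointerSet {
 public:
  TinyPointerSet() = default;
  TinyPointerSet(const TinyPointerSet&) = delete;  // keep the sketch simple
  TinyPointerSet& operator=(const TinyPointerSet&) = delete;
  ~TinyPointerSet() { delete overflow_; }

  void insert(T* p) {
    if (overflow_ != nullptr) { overflow_->insert(p); return; }
    if (single_ == nullptr || single_ == p) { single_ = p; return; }
    overflow_ = new std::set<T*>{single_, p};  // spill to a real set
  }
  std::size_t size() const {
    if (overflow_ != nullptr) return overflow_->size();
    return single_ != nullptr ? 1 : 0;
  }

 private:
  T* single_ = nullptr;             // the common, allocation-free case
  std::set<T*>* overflow_ = nullptr;
};
```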
* Fix const_iterator in ShapeTree. (Mark Heffernan, 2017-08-04)
  The existing implementation didn't work because const Foo<T>* and
  Foo<const T>* are not convertible in C++, which broke the internal
  machinery of the iterator in the const case.
  PiperOrigin-RevId: 164276236
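The C++ rule at fault is easy to demonstrate: Foo<T> and Foo<const T> are unrelated class types, so no pointer conversion exists between them, even though one "looks" like a const view of the other. A const_iterator therefore has to wrap const Node<T>*:

```cpp
template <typename T>
struct Node {
  T value;
};

int main() {
  Node<int> n{42};
  const Node<int>* cp = &n;  // fine: pointer to const Node<int>

  // Node<const int>* wrong = cp;  // error: Node<int> and Node<const int>
  //                               // are distinct, unrelated types, so no
  //                               // conversion exists. A const_iterator
  //                               // must wrap const Node<T>*, not
  //                               // Node<const T>*.
  (void)cp;
  return 0;
}
```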
* Add an iterator implementing std::forward_iterator_tag to ShapeTree.
  (A. Unique TensorFlower, 2017-08-02)
  This allows it to be iterated over by a for-loop.
  PiperOrigin-RevId: 163950729
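A minimal sketch of what such an iterator needs, over a toy node chain rather than ShapeTree's node vector: the five member typedefs plus *, ++, and ==/!= cover what range-based for-loops and most <algorithm> uses require.

```cpp
#include <cstddef>
#include <iterator>

struct Node { int value; Node* next; };

// Toy forward iterator: advertises std::forward_iterator_tag and
// provides dereference, pre-increment, and equality comparison.
class NodeIterator {
 public:
  using iterator_category = std::forward_iterator_tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = int*;
  using reference = int&;

  explicit NodeIterator(Node* n) : n_(n) {}
  reference operator*() const { return n_->value; }
  NodeIterator& operator++() { n_ = n_->next; return *this; }
  bool operator==(const NodeIterator& o) const { return n_ == o.n_; }
  bool operator!=(const NodeIterator& o) const { return n_ != o.n_; }

 private:
  Node* n_;
};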
* Speed up TuplePointsToAnalysis. (Justin Lebar, 2017-06-28)
  This analysis is one of the most expensive parts of the HLO
  optimization pipeline.
  - Avoid one or two unnecessary hashtable lookups in
    PopulateDefinedBuffersAndAliases.
  - Add a mode to ShapeTree wherein we avoid copying Shapes.
  - Use templated functors rather than std::function in ShapeTree's
    iterators, thus avoiding the overhead of std::function.
  PiperOrigin-RevId: 160487485
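The std::function-to-template change in the last bullet trades type erasure for inlining; a self-contained sketch, with hypothetical function names:

```cpp
#include <functional>
#include <vector>

// Template version: fn's concrete type is known at the call site, so the
// compiler can inline the lambda body into the loop.
template <typename Fn>
void ForEachTemplated(const std::vector<int>& v, Fn&& fn) {
  for (int x : v) fn(x);
}

// Erased version: every call goes through std::function's type-erased
// indirect dispatch, and storing captures may allocate.
void ForEachErased(const std::vector<int>& v,
                   const std::function<void(int)>& fn) {
  for (int x : v) fn(x);
}

int main() {
  std::vector<int> v = {1, 2, 3};
  int sum = 0;
  ForEachTemplated(v, [&](int x) { sum += x; });
  ForEachErased(v, [&](int x) { sum += x; });
  return sum == 12 ? 0 : 1;
}
```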
* [XLA] Simplify Shape traversal visitors. (Mark Heffernan, 2017-06-06)
  Simplify shape traversal visitors in ShapeUtil and ShapeTree. Add a
  non-Status form because most uses of the traversal methods do not need
  the Status return value, and remove the is_leaf parameter from
  ShapeTree.ForEach* as it is not frequently used.
  PiperOrigin-RevId: 158201574
* [XLA] Various improvements to ShapeTree. (Mark Heffernan, 2017-06-01)
  Add support for holding non-copyable types, operator==, and a
  CopySubtreeFrom method for copying a subtree from one ShapeTree to
  another.
  PiperOrigin-RevId: 157777636
* Fix ShapeTree operator= to actually work. Also added some tests.
  (A. Unique TensorFlower, 2017-04-17)
  Change: 153385325
* [XLA] Improve ShapeTree implementation.
  (A. Unique TensorFlower, 2017-04-16)
  1) Ensure PODs are zero-initialized via the shape-constructor; i.e.
     ShapeTree<bool>(shape) has all data items initialized to false.
  2) Add a default constructor, which makes it possible to use operator[]
     in a FlatMap<_, ShapeTree<_>>. This creates a tree where
     !ShapeTree::IsValid.
  3) Change the representation so that the root node holds the shape by
     value, while non-root nodes don't hold a shape. This is "cleaner"
     than holding an unused shape in non-root nodes, and also simplifies
     the implementation. There is now a struct internal::ShapeTreeNode
     only used by ShapeTree.
  4) Simplify constructors to delegate to a single implementation.
  5) Fix bug in ShapeTree(Shape shape, T init_value) constructor, which
     caused the existing code to only work for T=bool.
  6) Fix documentation to indicate that a T data value is held for every
     node in the tree, not just leaf nodes. That was the original
     behavior, which is unchanged in this CL; it's just the documentation
     that was wrong.
  Change: 153305130
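Item 1) relies on value-initialization: writing T() produces a zeroed value for PODs. A sketch under the assumption that per-node payloads sit in a flat vector; the helper name is hypothetical:

```cpp
#include <cstddef>
#include <vector>

// Passing T() as the fill value value-initializes every element, so
// PODs like bool come out zeroed rather than holding indeterminate bits.
template <typename T>
std::vector<T> MakeNodePayloads(std::size_t node_count) {
  return std::vector<T>(node_count, T());  // T() == false for T = bool
}

int main() {
  auto flags = MakeNodePayloads<bool>(8);
  for (bool b : flags) {
    if (b) return 1;  // never taken: every element is false
  }
  return 0;
}
```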
* Initial open-source release of XLA: Accelerated Linear Algebra.
  (Peter Hawkins, 2017-01-09)
  XLA is a compiler-based linear algebra execution engine that targets
  CPUs, GPUs and custom accelerators. XLA is still experimental; we are
  releasing it early to get the community involved.
  Change: 143990941