diff options
author | Sanjoy Das <sanjoy@google.com> | 2018-05-17 15:47:30 -0700 |
---|---|---|
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-05-17 15:52:10 -0700 |
commit | b669510b115b5c726fd5e69b5062a1072c034a57 (patch) | |
tree | da4cbc996725ecc1dbc6d61477a8b43c933d1b61 /tensorflow/compiler/xla/service/cpu/BUILD | |
parent | 317f3e09109dcb6f4fc70718d1ad2be70e4d2bf8 (diff) |
Introduce an "indexed array" analysis
Context: we want to optimize computations hanging off of a embedding lookup from
a constant array. For instance, consider:
embedding = gather from a constant array using non-constant indices
embedding_reshaped = reshape embedding
embedding_reshaped_transposed = transpose embedding_reshaped
result = dot(embedding_reshaped_transposed, constant)
In the graph above, depending on how the details work out, we may be able to
fold `result` into a gather from a precomputed constant array. However, it is
inconvenient to get there by incremental rewrites -- it is probably not
profitable to rewrite embedding_reshaped or embedding_reshaped_transposed [0] as
embedding lookups but we get to "see" that the dot can be rewritten only after
rewriting the reshape and the transpose.
This analysis aims to make the optimization above more straightforward by
allowing a transformation pass (that uses this analysis) to query the analysis
to see if if `result` _can_ be represented as an embedding lookup. If yes it
can then apply some profitability heuristics to decide if it is worth it to
rewrite it as one. This suggested workflow gives us separation of concerns (the
legality of the rewrite is computed separately from its profitability) and, more
importantly, lets us "look ahead" and analyze the dot without rewriting its
operands.
The implementation is far from complete (most of the interesting bits are TODO)
but I wanted to get an early design review before I spent too much time on this.
[0] Under the assumption that transposing or reshaping are not expensive enough
to pay the price of keeping around a new potentially large constant (in
particular, some of these may have been equivalent to free bitcasts).
PiperOrigin-RevId: 197064648
Diffstat (limited to 'tensorflow/compiler/xla/service/cpu/BUILD')
-rw-r--r-- | tensorflow/compiler/xla/service/cpu/BUILD | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/tensorflow/compiler/xla/service/cpu/BUILD b/tensorflow/compiler/xla/service/cpu/BUILD index d718322ba0..a15e41fee0 100644 --- a/tensorflow/compiler/xla/service/cpu/BUILD +++ b/tensorflow/compiler/xla/service/cpu/BUILD @@ -126,6 +126,7 @@ cc_library( "//tensorflow/compiler/xla/service:hlo_scheduling", "//tensorflow/compiler/xla/service:hlo_subcomputation_unification", "//tensorflow/compiler/xla/service:hlo_verifier", + "//tensorflow/compiler/xla/service:indexed_array_analysis", "//tensorflow/compiler/xla/service:inliner", "//tensorflow/compiler/xla/service:llvm_compiler", "//tensorflow/compiler/xla/service:reduce_precision_insertion", |