author    Sanjoy Das <sanjoy@google.com>  2018-08-16 14:44:15 -0700
committer TensorFlower Gardener <gardener@tensorflow.org>  2018-08-16 14:47:49 -0700
commit    d43820b9eff0cc863de2bbfb142afe92bf5afd00 (patch)
tree      e019bdc0bb52496436e0ff92d76b728165e3a420 /tensorflow/docs_src
parent    8235e83c442744a1285ce97e5dfc2a6556f9f667 (diff)
Improve gather ergonomics by renaming fields.
This CL renames the various inputs to the Gather HLO to be more mnemonic by
making it more obviously a batch dynamic-slice.  The replacements I made are:

  s/elided_window_dims/collapsed_slice_dims/g
  s/window_bounds/slice_sizes/g
  s/gather_dims_to_operand_dims/start_index_map/g
  s/gather_indices/start_indices/g
  s/output_window_dims/offset_dims/g

PiperOrigin-RevId: 209051067
Diffstat (limited to 'tensorflow/docs_src')
-rw-r--r--  tensorflow/docs_src/performance/xla/operation_semantics.md | 267
1 file changed, 128 insertions, 139 deletions
diff --git a/tensorflow/docs_src/performance/xla/operation_semantics.md b/tensorflow/docs_src/performance/xla/operation_semantics.md
index 16dd3c5bf3..2de30d1b3d 100644
--- a/tensorflow/docs_src/performance/xla/operation_semantics.md
+++ b/tensorflow/docs_src/performance/xla/operation_semantics.md
@@ -1138,7 +1138,7 @@ array with the same shape. It is allowed for `operand` to be a scalar (rank 0).
## Gather
The XLA gather operation stitches together several slices (each slice at a
-potentially different runtime offset) of an input tensor into an output tensor.
+potentially different runtime offset) of an input array.
### General Semantics
@@ -1146,151 +1146,141 @@ See also
[`XlaBuilder::Gather`](https://www.tensorflow.org/code/tensorflow/compiler/xla/client/xla_builder.h).
For a more intuitive description, see the "Informal Description" section below.
-<b> `gather(operand, gather_indices, output_window_dims, elided_window_dims, window_bounds, gather_dims_to_operand_dims)` </b>
+<b> `gather(operand, start_indices, offset_dims, collapsed_slice_dims, slice_sizes, start_index_map)` </b>
|Arguments | Type | Semantics |
|----------------- | ----------------------- | --------------------------------|
-|`operand` | `XlaOp` | The tensor we’re gathering |
+|`operand` | `XlaOp` | The array we’re gathering |
: : : from. :
-|`gather_indices` | `XlaOp` | Tensor containing the starting |
-: : : indices of the slices we're :
-: : : stitching together into the :
-: : : output tensor. :
-|`index_vector_dim` | `int64` | The dimension in |
-: : : `gather_indices` that contains :
-: : : the starting indices. :
-|`output_window_dims` | `ArraySlice<int64>` | The set of dimensions in the |
-: : : output shape that are _window :
-: : : dimensions_ (defined below). :
-: : : Not all window dimensions may :
-: : : be present in the output shape. :
-|`elided_window_dims` | `ArraySlice<int64>` | The set of _window dimensions_ |
-: : : that are not present in the output shape. :
-: : : `window_bounds[i]` must be `1` for all `i` :
-: : : in `elided_window_dims`. :
-|`window_bounds` | `ArraySlice<int64>` | `window_bounds[i]` is the bounds |
-: : : for window dimension `i`. This includes :
-: : : both the window dimensions that are :
-: : : explicitly part of the output shape (via :
-: : : `output_window_dims`) and the window :
-: : : dimensions that are elided (via :
-: : : `elided_window_dims`). :
-|`gather_dims_to_operand_dims` | `ArraySlice<int64>` | A dimension map (the |
-: : : array is interpreted as mapping `i` to :
-: : : `gather_dims_to_operand_dims[i]`) from :
-: : : the gather indices in `gather_indices` to :
-: : : the operand index space. It has to be :
-: : : one-to-one and total. :
-
-For every index `Out` in the output tensor, we compute two things (more
-precisely described later):
-
- - An index into `gather_indices.rank` - `1` dimensions of `gather_indices`,
- which gives us a starting index of a slice, _operand slice_, in the operand
- tensor. These `gather_indices.rank` - `1` dimensions are all the dimensions
- in `gather_indices` except `index_vector_dim`.
-
- - A _window index_ that has the same rank as the operand. This index is
- composed of the values in `Out` at dimensions `output_window_dims`, embedded
- with zeroes according to `elided_window_dims`.
-
-The _window index_ is the relative index of the element in _operand slice_ that
-should be present in the output at index `Out`.
-
-The output is a tensor of rank `output_window_dims.size` + `gather_indices.rank`
-- `1`. Additionally, as a shorthand, we define `output_gather_dims` of type
-`ArraySlice<int64>` as the set of dimensions in the output shape but not in
-`output_window_dims`, in ascending order. E.g. if the output tensor has rank
-`5`, `output_window_dims` is {`2`, `4`} then `output_gather_dims` is {`0`, `1`,
-`3`}
-
-If `index_vector_dim` is equal to `gather_indices.rank` we implicitly
-consider `gather_indices` to have a trailing `1` dimension (i.e. if
-`gather_indices` was of shape `[6,7]` and `index_vector_dim` is `2` then
-we implicitly consider the shape of `gather_indices` to be `[6,7,1]`).
-
-The bounds for the output tensor along dimension `i` is computed as follows:
-
- 1. If `i` is present in `output_gather_dims` (i.e. is equal to
- `output_gather_dims[k]` for some `k`) then we pick the corresponding
- dimension bounds out of `gather_indices.shape`, skipping
- `index_vector_dim` (i.e. pick `gather_indices.shape.dims`[`k`] if `k`
- < `index_vector_dim` and `gather_indices.shape.dims`[`k`+`1`]
- otherwise).
- 2. If `i` is present in `output_window_dims` (i.e. equal to
- `output_window_dims`[`k`] for some `k`) then we pick the corresponding
- bound out of `window_bounds` after accounting for `elided_window_dims`
- (i.e. we pick `adjusted_window_bounds`[`k`] where `adjusted_window_bounds`
- is `window_bounds` with the bounds at indices `elided_window_dims`
- removed).
-
-The operand index `In` corresponding to an output index `Out` is computed as
-follows:
-
- 1. Let `G` = { `Out`[`k`] for `k` in `output_gather_dims` }. Use `G` to slice
- out vector `S` such that `S`[`i`] = `gather_indices`[Combine(`G`, `i`)]
- where Combine(A, b) inserts b at position `index_vector_dim` into A.
- Note that this is well defined even if `G` is empty -- if `G` is empty then
- `S` = `gather_indices`.
- 2. Create an index, `S`<sub>`in`</sub>, into `operand` using `S` by
- scattering `S` using the `gather_dims_to_operand_dims` map
- (`S`<sub>`in`</sub> is the starting indices for _operand slice_ mentioned
- above). More precisely:
- 1. `S`<sub>`in`</sub>[`gather_dims_to_operand_dims`[`k`]] = `S`[`k`] if `k` <
- `gather_dims_to_operand_dims.size`.
+|`start_indices` | `XlaOp` | Array containing the starting |
+: : : indices of the slices we gather.:
+|`index_vector_dim` | `int64` | The dimension in |
+: : : `start_indices` that "contains" :
+: : : the starting indices. See :
+: : : below for a detailed :
+: : : description. :
+|`offset_dims` | `ArraySlice<int64>` | The set of dimensions in the |
+: : : output shape that offset into an :
+: : : array sliced from `operand`. :
+|`slice_sizes` | `ArraySlice<int64>` | `slice_sizes[i]` is the bound |
+: : : for the slice on dimension `i`. :
+|`collapsed_slice_dims` | `ArraySlice<int64>` | The set of dimensions in each |
+: : : slice that are collapsed away. :
+: : : These dimensions must have size :
+: : : `1`. :
+|`start_index_map` | `ArraySlice<int64>` | A map that describes how to map |
+: : : indices in `start_indices` to :
+: : : legal indices into `operand`. :
+
+For convenience, we label dimensions in the output array not in `offset_dims`
+as `batch_dims`.
+
+The output is an array of rank `batch_dims.size` + `operand.rank` -
+`collapsed_slice_dims.size`.
+
+If `index_vector_dim` is equal to `start_indices.rank` we implicitly consider
+`start_indices` to have a trailing `1` dimension (i.e. if `start_indices` was of
+shape `[6,7]` and `index_vector_dim` is `2` then we implicitly consider the
+shape of `start_indices` to be `[6,7,1]`).
+
+The bound for the output array along dimension `i` is computed as follows (see
+the sketch after the list):
+
+ 1. If `i` is present in `batch_dims` (i.e. is equal to `batch_dims[k]` for
+ some `k`) then we pick the corresponding dimension bounds out of
+ `start_indices.shape`, skipping `index_vector_dim` (i.e. pick
+ `start_indices.shape.dims`[`k`] if `k` < `index_vector_dim` and
+ `start_indices.shape.dims`[`k`+`1`] otherwise).
+
+ 2. If `i` is present in `offset_dims` (i.e. equal to `offset_dims`[`k`] for
+ some `k`) then we pick the corresponding bound out of `slice_sizes` after
+ accounting for `collapsed_slice_dims` (i.e. we pick
+ `adjusted_slice_sizes`[`k`] where `adjusted_slice_sizes` is `slice_sizes`
+ with the bounds at indices `collapsed_slice_dims` removed).
+
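As a sanity check on these two rules, here is a minimal Python sketch of the
shape computation. It is purely illustrative: `gather_output_shape` is a
hypothetical helper, not part of the XLA API.

```python
def gather_output_shape(operand_shape, start_indices_shape, index_vector_dim,
                        offset_dims, collapsed_slice_dims, slice_sizes):
  """Derives the gather output shape using the two rules above."""
  assert len(slice_sizes) == len(operand_shape)
  # Rule 1: batch dimension bounds come from start_indices.shape, skipping
  # index_vector_dim.
  batch_bounds = [d for i, d in enumerate(start_indices_shape)
                  if i != index_vector_dim]
  # Rule 2: offset dimension bounds come from slice_sizes with the collapsed
  # dimensions removed.
  adjusted_slice_sizes = [s for i, s in enumerate(slice_sizes)
                          if i not in collapsed_slice_dims]
  output_rank = len(batch_bounds) + len(adjusted_slice_sizes)
  output_shape = [None] * output_rank
  for k, d in enumerate(offset_dims):
    output_shape[d] = adjusted_slice_sizes[k]
  batch_dims = [d for d in range(output_rank) if d not in offset_dims]
  for k, d in enumerate(batch_dims):
    output_shape[d] = batch_bounds[k]
  return output_shape

# For the [16,11] example discussed later (5 slices of shape [8,6]):
# gather_output_shape([16, 11], [5, 2], index_vector_dim=1,
#                     offset_dims=[1, 2], collapsed_slice_dims=[],
#                     slice_sizes=[8, 6])  ==>  [5, 8, 6]
```
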
+Formally, the operand index `In` corresponding to an output index `Out` is
+computed as follows:
+
+ 1. Let `G` = { `Out`[`k`] for `k` in `batch_dims` }. Use `G` to slice out
+ vector `S` such that `S`[`i`] = `start_indices`[Combine(`G`, `i`)] where
+ Combine(A, b) inserts b at position `index_vector_dim` into A. Note that
+ this is well defined even if `G` is empty -- if `G` is empty then `S` =
+ `start_indices`.
+
+ 2. Create a starting index, `S`<sub>`in`</sub>, into `operand` using `S` by
+ scattering `S` using `start_index_map`. More precisely:
+ 1. `S`<sub>`in`</sub>[`start_index_map`[`k`]] = `S`[`k`] if `k` <
+ `start_index_map.size`.
2. `S`<sub>`in`</sub>[`_`] = `0` otherwise.
- 3. Create an index `W`<sub>`in`</sub> into `operand` by scattering the indices
- at the output window dimensions in `Out` according to
- the `elided_window_dims` set (`W`<sub>`in`</sub> is the _window index_
- mentioned above). More precisely:
- 1. `W`<sub>`in`</sub>[`window_dims_to_operand_dims`(`k`)] = `Out`[`k`] if
- `k` < `output_window_dims.size` (`window_dims_to_operand_dims` is
- defined below).
- 2. `W`<sub>`in`</sub>[`_`] = `0` otherwise.
- 4. `In` is `W`<sub>`in`</sub> + `S`<sub>`in`</sub> where + is element-wise
+
+ 3. Create an index `O`<sub>`in`</sub> into `operand` by scattering the indices
+ at the offset dimensions in `Out` according to the `collapsed_slice_dims`
+ set. More precisely:
+ 1. `O`<sub>`in`</sub>[`expand_offset_dims`(`k`)] =
+ `Out`[`offset_dims`[`k`]] if `k` < `offset_dims.size`
+ (`expand_offset_dims` is defined below).
+ 2. `O`<sub>`in`</sub>[`_`] = `0` otherwise.
+ 4. `In` is `O`<sub>`in`</sub> + `S`<sub>`in`</sub> where + is element-wise
addition.
-`window_dims_to_operand_dims` is the monotonic function with domain [`0`,
-`output_window_dims.size`) and range [`0`, `operand.rank`) \
-`elided_window_dims`. So if, e.g., `output_window_dims.size` is `4`,
-`operand.rank` is `6` and `elided_window_dims` is {`0`, `2`} then
-`window_dims_to_operand_dims` is {`0`→`1`, `1`→`3`, `2`→`4`, `3`→`5`}.
+`expand_offset_dims` is the monotonic function with domain [`0`,
+`offset_dims.size`) and range [`0`, `operand.rank`) \ `collapsed_slice_dims`.
+So if, e.g., `offset_dims.size` is `4`, `operand.rank` is `6` and
+`collapsed_slice_dims` is {`0`, `2`} then `expand_offset_dims` is {`0`→`1`,
+`1`→`3`, `2`→`4`, `3`→`5`}.
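Putting these steps together, the whole mapping can be written as a short
reference sketch in Python/NumPy. This is illustrative only, using a made-up
helper name `gather_reference`; it mirrors the four steps above, does not
handle out-of-bounds start indices, and is not how XLA implements the
operation.

```python
import numpy as np

def gather_reference(operand, start_indices, index_vector_dim, offset_dims,
                     collapsed_slice_dims, slice_sizes, start_index_map):
  """Maps every output index Out to an operand index In, per the steps above."""
  # Implicit trailing dimension of size 1 when index_vector_dim == rank.
  if index_vector_dim == start_indices.ndim:
    start_indices = start_indices[..., np.newaxis]

  batch_bounds = [d for i, d in enumerate(start_indices.shape)
                  if i != index_vector_dim]
  adjusted_slice_sizes = [s for i, s in enumerate(slice_sizes)
                          if i not in collapsed_slice_dims]
  output_rank = len(batch_bounds) + len(adjusted_slice_sizes)
  batch_dims = [d for d in range(output_rank) if d not in offset_dims]

  output_shape = [0] * output_rank
  for k, d in enumerate(batch_dims):
    output_shape[d] = batch_bounds[k]
  for k, d in enumerate(offset_dims):
    output_shape[d] = adjusted_slice_sizes[k]

  # expand_offset_dims: the k-th offset dimension maps to the k-th operand
  # dimension that is not collapsed.
  expand_offset_dims = [d for d in range(operand.ndim)
                        if d not in collapsed_slice_dims]

  output = np.empty(output_shape, dtype=operand.dtype)
  for out in np.ndindex(*output_shape):
    # Step 1: G, and the start index vector S, S[i] = start_indices[Combine(G, i)].
    g = [out[d] for d in batch_dims]
    s = [start_indices[tuple(g[:index_vector_dim] + [i] + g[index_vector_dim:])]
         for i in range(start_indices.shape[index_vector_dim])]
    # Step 2: scatter S into S_in using start_index_map.
    s_in = [0] * operand.ndim
    for k, d in enumerate(start_index_map):
      s_in[d] = s[k]
    # Step 3: scatter the offset components of Out into O_in using
    # expand_offset_dims.
    o_in = [0] * operand.ndim
    for k, d in enumerate(offset_dims):
      o_in[expand_offset_dims[k]] = out[d]
    # Step 4: In = O_in + S_in, element-wise.
    output[out] = operand[tuple(a + b for a, b in zip(o_in, s_in))]
  return output
```
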
### Informal Description and Examples
-`index_vector_dim` is set to `gather_indices.rank` - `1` in all of the
-examples that follow. More interesting values for `index_vector_dim`
-does not change the operation fundamentally, but makes the visual representation
-more cumbersome.
+Informally, every index `Out` in the output array corresponds to an element `E`
+in the operand array, computed as follows:
+
+ - We use the batch dimensions in `Out` to look up a starting index from
+ `start_indices`.
+
+ - We use `start_index_map` to map the starting index (which may have fewer
+   elements than `operand.rank`) to a "full" starting index into `operand`.
+
+ - We dynamic-slice out a slice with size `slice_sizes` using the full starting
+ index.
+
+ - We reshape the slice by collapsing the `collapsed_slice_dims` dimensions.
+   Since all collapsed slice dimensions must have bound `1`, this reshape is
+   always legal.
+
+ - We use the offset dimensions in `Out` to index into this slice to get the
+ input element, `E`, corresponding to output index `Out`.
+
+`index_vector_dim` is set to `start_indices.rank` - `1` in all of the
+examples that follow. More interesting values for `index_vector_dim` do not
+change the operation fundamentally, but make the visual representation more
+cumbersome.
To get an intuition on how all of the above fits together, let's look at an
-example that gathers 5 slices of shape `[8,6]` from a `[16,11]` tensor. The
-position of a slice into the `[16,11]` tensor can be represented as an index
+example that gathers 5 slices of shape `[8,6]` from a `[16,11]` array. The
+position of a slice into the `[16,11]` array can be represented as an index
vector of shape `S64[2]`, so the set of 5 positions can be represented as a
-`S64[5,2]` tensor.
+`S64[5,2]` array.
The behavior of the gather operation can then be depicted as an index
-transformation that takes [`G`,`W`<sub>`0`</sub>,`W`<sub>`1`</sub>], an index in
-the output shape, and maps it to an element in the input tensor in the following
+transformation that takes [`G`,`O`<sub>`0`</sub>,`O`<sub>`1`</sub>], an index in
+the output shape, and maps it to an element in the input array in the following
way:
<div style="width:95%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="../../images/ops_xla_gather_0.svg">
</div>
-We first select an (`X`,`Y`) vector from the gather indices tensor using `G`.
-The element in the output tensor at index
-[`G`,`W`<sub>`0`</sub>,`W`<sub>`1`</sub>] is then the element in the input
-tensor at index [`X`+`W`<sub>`0`</sub>,`Y`+`W`<sub>`1`</sub>].
+We first select an (`X`,`Y`) vector from the gather indices array using `G`.
+The element in the output array at index
+[`G`,`O`<sub>`0`</sub>,`O`<sub>`1`</sub>] is then the element in the input
+array at index [`X`+`O`<sub>`0`</sub>,`Y`+`O`<sub>`1`</sub>].
-`window_bounds` is `[8,6]`, which decides the range of W<sub>`0`</sub> and
-W<sub>`1`</sub>, and this in turn decides the bounds of the slice.
+`slice_sizes` is `[8,6]`, which decides the range of `O`<sub>`0`</sub> and
+`O`<sub>`1`</sub>, and this in turn decides the bounds of the slice.
This gather operation acts as a batch dynamic slice with `G` as the batch
dimension.
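
Concretely, under the (hypothetical) `gather_reference` sketch from the formal
section above, this example could look as follows; the particular start
indices are made up for illustration:

```python
import numpy as np

operand = np.arange(16 * 11).reshape(16, 11)                         # [16,11]
start_indices = np.array([[0, 0], [2, 1], [5, 3], [8, 4], [7, 5]])   # S64[5,2]

out = gather_reference(operand, start_indices,
                       index_vector_dim=1,
                       offset_dims=[1, 2],
                       collapsed_slice_dims=[],
                       slice_sizes=[8, 6],
                       start_index_map=[0, 1])

# out has shape [5, 8, 6]: entry G is the [8,6] slice of operand starting at
# (X, Y) = start_indices[G], i.e. a batch of dynamic slices.
for g, (x, y) in enumerate(start_indices):
  assert np.array_equal(out[g], operand[x:x + 8, y:y + 6])
```
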
The gather indices may be multidimensional. For instance, a more general
-version of the example above using a "gather indices" tensor of shape `[4,5,2]`
+version of the example above using a "gather indices" array of shape `[4,5,2]`
would translate indices like this:
<div style="width:95%; margin:auto; margin-bottom:10px; margin-top:20px;">
@@ -1298,25 +1288,25 @@ would translate indices like this:
</div>
-Again, this acts as a batch dynamic slice `G`<sub>`0`</sub> and
-`G`<sub>`1`</sub> as the batch dimensions. The window bounds are still `[8,6]`.
+Again, this acts as a batch dynamic slice with `G`<sub>`0`</sub> and
+`G`<sub>`1`</sub> as the batch dimensions. The slice sizes are still `[8,6]`.
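
In terms of the parameters above, the depicted layout corresponds to
`start_indices` of shape `[4,5,2]` with `index_vector_dim` = `2` and
`offset_dims` = {`2`, `3`}, which gives an output of shape `[4,5,8,6]`.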
The gather operation in XLA generalizes the informal semantics outlined above in
the following ways:
- 1. We can configure which dimensions in the output shape are the window
- dimensions (dimensions containing `W`<sub>`0`</sub>, `W`<sub>`1`</sub> in
- the last example). The output gather dimensions (dimensions containing
+ 1. We can configure which dimensions in the output shape are the offset
+ dimensions (dimensions containing `O`<sub>`0`</sub>, `O`<sub>`1`</sub> in
+ the last example). The output batch dimensions (dimensions containing
`G`<sub>`0`</sub>, `G`<sub>`1`</sub> in the last example) are defined to be
- the output dimensions that are not window dimensions.
+ the output dimensions that are not offset dimensions.
- 2. The number of output window dimensions explicitly present in the output
+ 2. The number of output offset dimensions explicitly present in the output
shape may be smaller than the input rank. These "missing" dimensions, which
- are listed explicitly as `elided_window_dims`, must have a window bound of
- `1`. Since they have a window bound of `1` the only valid index for them is
+ are listed explicitly as `collapsed_slice_dims`, must have a slice size of
+    `1`. Since they have a slice size of `1`, the only valid index for them is
`0` and eliding them does not introduce ambiguity.
- 3. The slice extracted from the "Gather Indices" tensor ((`X`, `Y`) in the last
- example) may have fewer elements than the input tensor rank, and an explicit
+ 3. The slice extracted from the "Gather Indices" array ((`X`, `Y`) in the last
+ example) may have fewer elements than the input array rank, and an explicit
mapping dictates how the index should be expanded to have the same rank as
the input.
@@ -1327,20 +1317,19 @@ As a final example, we use (2) and (3) to implement `tf.gather_nd`:
</div>
`G`<sub>`0`</sub> and `G`<sub>`1`</sub> are used to slice out a starting index
-from the gather indices tensor as usual, except the starting index has only one
-element, `X`. Similarly, there is only one output window index with the value
-`W`<sub>`0`</sub>. However, before being used as indices into the input tensor,
-these are expanded in accordance to "Gather Index Mapping"
-(`gather_dims_to_operand_dims` in the formal description) and "Window Mapping"
-(`window_dims_to_operand_dims` in the formal description) into
-[`0`,`W`<sub>`0`</sub>] and [`X`,`0`] respectively, adding up to
-[`X`,`W`<sub>`0`</sub>]. In other words, the output index
-[`G`<sub>`0`</sub>,`G`<sub>`1`</sub>,`W`<sub>`0`</sub>] maps to the input index
+from the gather indices array as usual, except the starting index has only one
+element, `X`. Similarly, there is only one output offset index with the value
+`O`<sub>`0`</sub>. However, before being used as indices into the input array,
+these are expanded in accordance with "Gather Index Mapping" (`start_index_map` in
+the formal description) and "Offset Mapping" (`expand_offset_dims` in the formal
+description) into [`X`,`0`] and [`0`,`O`<sub>`0`</sub>] respectively, adding up
+to [`X`,`O`<sub>`0`</sub>]. In other words, the output index
+[`G`<sub>`0`</sub>,`G`<sub>`1`</sub>,`O`<sub>`0`</sub>] maps to the input index
-[`GatherIndices`[`G`<sub>`0`</sub>,`G`<sub>`1`</sub>,`0`],`X`] which gives us
+[`GatherIndices`[`G`<sub>`0`</sub>,`G`<sub>`1`</sub>,`0`],`O`<sub>`0`</sub>] which gives us
the semantics for `tf.gather_nd`.
-`window_bounds` for this case is `[1,11]`. Intuitively this means that every
-index `X` in the gather indices tensor picks an entire row and the result is the
+`slice_sizes` for this case is `[1,11]`. Intuitively, this means that every
+index `X` in the gather indices array picks an entire row, and the result is the
concatenation of all these rows.
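
Assuming the hypothetical `gather_reference` sketch from above, this
`tf.gather_nd`-like gather could be exercised as follows (the row indices are
made up for illustration):

```python
import numpy as np

operand = np.arange(16 * 11).reshape(16, 11)       # the [16,11] array
# Each (G0, G1) position holds a single row index X; as in the other examples,
# index_vector_dim is the trailing dimension of start_indices.
start_indices = np.array([[[1], [4], [0]],
                          [[9], [2], [7]]])         # shape [2,3,1]

out = gather_reference(operand, start_indices,
                       index_vector_dim=2,
                       offset_dims=[2],
                       collapsed_slice_dims=[0],
                       slice_sizes=[1, 11],
                       start_index_map=[0])

# out has shape [2, 3, 11]: out[G0, G1] is row start_indices[G0, G1, 0] of
# operand, i.e. every X picks out an entire row.
assert np.array_equal(out, operand[start_indices[..., 0]])
```
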
## GetTupleElement