From de4c12857782f65dc4a941776d506ecac50a5934 Mon Sep 17 00:00:00 2001 From: Michael Kuperstein Date: Thu, 2 Aug 2018 12:46:13 -0700 Subject: [XLA] Introduce variadic version of reduce. This defines the semantics, and adds parser and shape inference support. Since support is not plumbed through the rest of the compiler here, multi-output reduce is still rejected by the HLO verifier, and is not exposed through XlaBuilder. PiperOrigin-RevId: 207148035 --- .../performance/xla/operation_semantics.md | 64 +++++++++++++++++----- 1 file changed, 51 insertions(+), 13 deletions(-) (limited to 'tensorflow/docs_src') diff --git a/tensorflow/docs_src/performance/xla/operation_semantics.md b/tensorflow/docs_src/performance/xla/operation_semantics.md index 3981aaaf75..edc777a3c7 100644 --- a/tensorflow/docs_src/performance/xla/operation_semantics.md +++ b/tensorflow/docs_src/performance/xla/operation_semantics.md @@ -1431,19 +1431,29 @@ complete and returns the received data. See also [`XlaBuilder::Reduce`](https://www.tensorflow.org/code/tensorflow/compiler/xla/client/xla_builder.h). -Applies a reduction function to an array. +Applies a reduction function to one or more arrays in parallel. - `Reduce(operand, init_value, computation, dimensions)` + `Reduce(operands..., init_values..., computation, dimensions)` -Arguments | Type | Semantics -------------- | ---------------- | --------------------------------------- -`operand` | `XlaOp` | array of type `T` -`init_value` | `XlaOp` | scalar of type `T` -`computation` | `XlaComputation` | computation of type `T, T -> T` -`dimensions` | `int64` array | unordered array of dimensions to reduce +Arguments | Type | Semantics +------------- | --------------------- | --------------------------------------- +`operands` | Sequence of N `XlaOp` | N arrays of types `T_0, ..., T_N`. +`init_values` | Sequence of N `XlaOp` | N scalars of types `T_0, ..., T_N`. +`computation` | `XlaComputation` | computation of type + : : `T_0, ..., T_N, T_0, ..., T_N -> Collate(T_0, ..., T_N)` +`dimensions` | `int64` array | unordered array of dimensions to reduce -This operation reduces one or more dimensions of the input array into scalars. -The rank of the returned array is `rank(operand) - len(dimensions)`. +Where: +* N is required to be greater or equal to 1. +* All input arrays must have the same dimensions. +* If `N = 1`, `Collate(T)` is `T`. +* If `N > 1`, `Collate(T_0, ..., T_N)` is a tuple of `N` elements of type `T`. + +The output of the op is `Collate(Q_0, ..., Q_N)` where `Q_i` is an array of type +`T_i`, the dimensions of which are described below. + +This operation reduces one or more dimensions of each input array into scalars. +The rank of each returned array is `rank(operand) - len(dimensions)`. `init_value` is the initial value used for every reduction and may be inserted anywhere during computation by the back-end. In most cases, `init_value` is an identity of the reduction function (for example, 0 for addition). The applied @@ -1459,9 +1469,9 @@ enough to being associative for most practical uses. It is possible to conceive of some completely non-associative reductions, however, and these will produce incorrect or unpredictable results in XLA reductions. -As an example, when reducing across the one dimension in a 1D array with values -[10, 11, 12, 13], with reduction function `f` (this is `computation`) then that -could be computed as +As an example, when reducing across one dimension in a single 1D array with +values [10, 11, 12, 13], with reduction function `f` (this is `computation`) +then that could be computed as `f(10, f(11, f(12, f(init_value, 13)))` @@ -1543,6 +1553,34 @@ the 1D array `| 20 28 36 |`. Reducing the 3D array over all its dimensions produces the scalar `84`. +When `N > 1`, reduce function application is slightly more complex, as it is +applied simultaneously to all inputs. For example, consider the following +reduction function, which can be used to compute the max and the argmax of a +a 1-D tensor in parallel: + +``` +f: (Float, Int, Float, Int) -> Float, Int +f(max, argmax, value, index): + if value >= argmax: + return (value, index) + else: + return (max, argmax) +``` + +For 1-D Input arrays `V = Float[N], K = Int[N]`, and init values +`I_V = Float, I_K = Int`, the result `f_(N-1)` of reducing across the only +input dimension is equivalent to the following recursive application: +``` +f_0 = f(I_V, I_K, V_0, K_0) +f_1 = f(f_0.first, f_0.second, V_1, K_1) +... +f_(N-1) = f(f_(N-2).first, f_(N-2).second, V_(N-1), K_(N-1)) +``` + +Applying this reduction to an array of values, and an array of sequential +indices (i.e. iota), will co-iterate over the arrays, and return a tuple +containing the maximal value and the matching index. + ## ReducePrecision See also -- cgit v1.2.3