## Imperative programming in TensorFlow

In the standard TensorFlow library, the specification of a computation is done statically in terms of a computation graph, separately from the execution of that graph. This model of programming is referred to as *lazy*, *deferred*, or *asynchronous*. This library brings imperative-style programming (à la [NumPy](http://www.numpy.org)) to TensorFlow. Using this library, you can:

* Write code in an imperative style: the results of a computation are available as soon as the line of code that produces them executes.
* Use TensorFlow operations on tensors, and get all the benefits of GPU acceleration.
* Include any Python control flow statements, such as `while` and `if`, when specifying the computation.
* Perform automatic differentiation on your code with the standard [`tf.gradients`](https://www.tensorflow.org/api_docs/python/train/gradient_computation#gradients) function.

### Getting started

This library is a thin wrapper over the standard TensorFlow Python library. The source code is available [here](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/imperative). You can get started on Linux by installing the nightly PIP package linked from [the main page](https://github.com/tensorflow/tensorflow). Please consult [this document](https://github.com/tensorflow/tensorflow#installation) for other platforms and for the PIP package with GPU support.

### Write your first imperative TensorFlow program

```shell
$ python
```

```python
>>> import tensorflow.contrib.imperative as tf
>>> x = tf.constant([[7.], [6]])
>>> y = tf.constant([[6., 7]])
>>> tf.matmul(x, y)
array([[ 42.,  49.],
       [ 36.,  42.]], dtype=float32)
```

In terms of the programmer's mental model, this code is identical to the following NumPy code:

```python
>>> import numpy as np
>>> x = np.array([[7.], [6]])
>>> y = np.array([[6., 7]])
>>> x.dot(y)
array([[ 42.,  49.],
       [ 36.,  42.]])
```

The library is imported as `import tensorflow.contrib.imperative as tf` (contrast this with importing standard TensorFlow, which is done as `import tensorflow as tf`). This import statement makes all of standard TensorFlow available in the `tf` symbol. However, it is not necessary to create a session object and use it to run and fetch tensors.

### Features

The library provides the following additional features on top of standard TensorFlow:

* Tensors are automatically fetched when used in contexts that expect their value.

  - Printing

    ```python
    x = tf.constant(10)
    y = tf.constant(32)
    print(x + y)
    42
    ```

  - Use in conditionals

    ```python
    x = tf.constant(30)
    if x > 4:
      print('Greater than 4')

    Greater than 4
    ```

  - Use in loops

    ```python
    x = tf.random_normal([3])
    y = x * 2
    while tf.global_norm([y]) < 1000:
      y = y * 2

    print(y)

    [ -213.2868042   -511.02456665  1026.66882324]
    ```

* Variables are automatically initialized; there is no need to run the [`tf.global_variables_initializer()`](https://www.tensorflow.org/api_docs/python/state_ops/variable_helper_functions#global_variables_initializer) operation.

  ```python
  x = tf.Variable(np.random.normal(size=[2, 2]), dtype=tf.float32)
  y = tf.constant([[1, 2.]])
  z = tf.matmul(y, x)
  print(z)
  array([[-1.231673 ,  3.14744973]], dtype=float32)
  ```

* Gradients work as expected using the standard `tf.gradients` function.

  ```python
  x = tf.Variable(np.random.rand(1, 3))
  y = tf.exp(x)
  dy = tf.gradients(y, x)
  # dy/dx should be equal to y itself, since the derivative of exp(x) is exp(x).
  print(y, dy)
  (array([[ 1.79997761,  2.00581881,  2.37302414]]), [array([[ 1.79997761,  2.00581881,  2.37302414]])])
  ```

### Caveats

This library is implemented on top of standard TensorFlow: it still constructs a graph in the background and defers op execution. However, when an op executes for the first time, its result is cached, and the cached value is returned for subsequent executions, thus providing imperative semantics.
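For example, because values are cached after the first execution, evaluating the same tensor twice returns the same result, even for random ops whose values would differ across `Session.run` calls in standard TensorFlow. The snippet below is a hypothetical illustration of this behavior, not an example from the library's documentation:

```python
x = tf.random_normal([3])
print(x)  # The op executes once and its result is cached.
print(x)  # The cached value is returned: the same three numbers again.
```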
Because of this implementation choice, the library comes with the following caveats:

* **Use inside Python loops:** A graph is constructed and kept around in the background, both for executing ops with the standard TensorFlow runtime and for enabling automatic differentiation via `tf.gradients`. This means that the graph keeps growing when TensorFlow functions are called inside a Python loop. This library provides a `tf.new_step` method that clears the graph as well as the cached tensors that have been kept around for gradient computation. `tf.new_step` can be used as a context manager around, say, a training loop to clear the graph after each training step (a fuller sketch of this pattern appears at the end of this document).

  ```python
  x = tf.Variable(tf.constant(1.0))
  for i in range(10):
    # Create a new training step.
    with tf.new_step() as step:
      # Perform computation and variable updates.
      step.run(tf.assign_sub(x, 0.1))
      print(tf.identity(x))  # Prints 1.0 - (i + 1) * 0.1
      # The graph within this context is cleared at this point.
  ```

* **Speed:** The redundant graph construction and the caching of tensor values add overheads that are not present in standard TensorFlow, where the graph is typically constructed once and executed many times. This library is intended as a vehicle for prototyping the imperative programming model in TensorFlow. The runtime overheads can be alleviated with various optimizations to the runtime that would benefit the deferred execution mode as well.
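Putting the pieces together, the following is a minimal, hypothetical sketch of such a training loop, built only from the calls shown above (`tf.gradients`, `tf.assign_sub`, `tf.new_step`, and `step.run`); it illustrates the intended pattern rather than reproducing code from the library's documentation:

```python
import numpy as np
import tensorflow.contrib.imperative as tf

# Hypothetical sketch: minimize (w - 3)^2 by gradient descent,
# clearing the background graph after every iteration.
w = tf.Variable(np.array([0.0]), dtype=tf.float32)
for i in range(50):
  with tf.new_step() as step:
    loss = tf.square(w - 3.0)
    grad = tf.gradients(loss, w)[0]  # d(loss)/dw = 2 * (w - 3)
    # step.run executes the update op; the graph and cached tensors
    # are cleared when the step exits.
    step.run(tf.assign_sub(w, 0.1 * grad))
print(tf.identity(w))  # Assumes variables persist across steps; w approaches [ 3.]
```

Because each iteration runs inside `tf.new_step`, the graph does not grow with the number of iterations, which addresses the loop caveat above.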