diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md
new file mode 100644
index 0000000000..d88acfae80
--- /dev/null
+++ b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md
@@ -0,0 +1,360 @@
+# TensorFlow Lite Converter command-line examples
+
+This page shows how to use the TensorFlow Lite Converter from the command line.
+
+[TOC]
+
+## Command-line tools <a name="tools"></a>
+
+There are two approaches to running the converter from the command line.
+
+* `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool
+ `tflite_convert` is installed as part of the Python package. All of the
+ examples below use `tflite_convert` for simplicity.
+ * Example: `tflite_convert --output_file=...`
+* `bazel`: In order to run the latest version of the TensorFlow Lite Converter
+ either install the nightly build using
+ [pip](https://www.tensorflow.org/install/pip) or
+ [clone the TensorFlow repository](https://www.tensorflow.org/install/source)
+ and use `bazel`.
+ * Example: `bazel run
+ //tensorflow/contrib/lite/python:tflite_convert --
+ --output_file=...`
+
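+Either way, it is worth confirming that the tool is available before running a
+conversion. A minimal sketch for the pip route (the `tf-nightly` package name is
+an assumption for a CPU-only nightly install; use the package that matches your
+setup):
+
+```
+pip install tf-nightly
+tflite_convert --help
+```
+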
+### Converting models prior to TensorFlow 1.9 <a name="pre-tensorflow-1.9"></a>
+
+The recommended approach for using the converter prior to TensorFlow 1.9 is the
+[Python API](python_api.md#pre-tensorflow-1.9). If a command-line tool is
+required, TensorFlow 1.7 shipped the `toco` command-line tool. Run `toco --help`
+in a terminal for details on the available command-line flags. TensorFlow 1.8
+did not include any command-line tool.
+
+## Basic examples <a name="basic"></a>
+
+The following sections show examples of how to convert a basic floating-point
+model from each of the supported data formats into a TensorFlow Lite FlatBuffer.
+
+### Convert a TensorFlow GraphDef <a name="graphdef"></a>
+
+The following example converts a basic TensorFlow GraphDef (frozen by
+[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py))
+into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen
+graphs contain the variables stored in Checkpoint files as Const ops.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --output_file=/tmp/foo.tflite \
+ --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+ --input_arrays=input \
+ --output_arrays=MobilenetV1/Predictions/Reshape_1
+```
+
+The value for `input_shapes` is automatically determined whenever possible.
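+
+If the shape cannot be inferred (for example, when the input placeholder has an
+undefined batch dimension), it can be passed explicitly. A sketch reusing the
+same model, with the MobileNet input shape spelled out:
+
+```
+tflite_convert \
+  --output_file=/tmp/foo.tflite \
+  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+  --input_shapes=1,128,128,3 \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1
+```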
+
+### Convert a TensorFlow SavedModel <a name="savedmodel"></a>
+
+The following example converts a basic TensorFlow SavedModel into a TensorFlow
+Lite FlatBuffer to perform floating-point inference.
+
+```
+tflite_convert \
+ --output_file=/tmp/foo.tflite \
+ --saved_model_dir=/tmp/saved_model
+```
+
+[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators)
+has fewer required flags than frozen graphs due to access to additional data
+contained within the SavedModel. The values for `--input_arrays` and
+`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs
+in the [SignatureDefs](https://www.tensorflow.org/serving/signature_defs) within
+the
+[MetaGraphDef](https://www.tensorflow.org/guide/saved_model#apis_to_build_and_load_a_savedmodel)
+specified by `--saved_model_tag_set`. As with the GraphDef, the value for
+`input_shapes` is automatically determined whenever possible.
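+
+To see which tag sets and SignatureDefs a SavedModel contains, and therefore
+which names will appear as inputs and outputs, the `saved_model_cli` tool
+bundled with TensorFlow can be used. A minimal sketch, assuming the SavedModel
+from the example above:
+
+```
+saved_model_cli show --dir /tmp/saved_model --all
+```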
+
+There is currently no support for MetaGraphDefs without a SignatureDef or for
+MetaGraphDefs that use the [`assets/`
+directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory).
+
+### Convert a tf.Keras model <a name="keras"></a>
+
+The following example converts a `tf.keras` model into a TensorFlow Lite
+FlatBuffer. The `tf.keras` file must contain both the model and the weights.
+
+```
+tflite_convert \
+ --output_file=/tmp/foo.tflite \
+ --keras_model_file=/tmp/keras_model.h5
+```
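+
+For reference, an HDF5 file like `/tmp/keras_model.h5` can be produced with
+`tf.keras.Model.save`, which writes both the architecture and the weights
+(unlike `save_weights`). A minimal sketch, assuming TensorFlow 1.9+ with the
+`h5py` package installed; the toy model below is purely illustrative:
+
+```
+python - <<'EOF'
+import tensorflow as tf
+
+# A tiny stand-in model; replace with your own tf.keras model.
+model = tf.keras.models.Sequential([
+    tf.keras.layers.Dense(4, activation='relu', input_shape=(8,)),
+    tf.keras.layers.Dense(1)
+])
+model.compile(optimizer='sgd', loss='mse')
+
+# Saves architecture + weights into a single HDF5 file.
+model.save('/tmp/keras_model.h5')
+EOF
+```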
+
+## Quantization
+
+### Convert a TensorFlow GraphDef for quantized inference <a name="graphdef-quant"></a>
+
+The TensorFlow Lite Converter is compatible with fixed-point quantization models
+described [here](https://www.tensorflow.org/performance/quantization). These are
+float models with
+[`FakeQuant*`](https://www.tensorflow.org/api_guides/python/array_ops#Fake_quantization)
+ops inserted at the boundaries of fused layers to record min-max range
+information. This generates a quantized inference workload that reproduces the
+quantization behavior that was used during training.
+
+The following command generates a quantized TensorFlow Lite FlatBuffer from a
+"quantized" TensorFlow GraphDef.
+
+```
+tflite_convert \
+ --output_file=/tmp/foo.tflite \
+ --graph_def_file=/tmp/some_quantized_graph.pb \
+ --inference_type=QUANTIZED_UINT8 \
+ --input_arrays=input \
+ --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+ --mean_values=128 \
+ --std_dev_values=127
+```
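+
+The `--mean_values` and `--std_dev_values` flags describe how the real-valued
+input was quantized to uint8, following (stated here as an assumption about the
+converter's convention) `real_value = (quantized_value - mean_value) /
+std_dev_value`. A quick arithmetic check of the values used above:
+
+```
+# With mean=128 and std_dev=127, uint8 inputs 0..255 map to roughly -1.0..1.0,
+# matching the usual MobileNet preprocessing.
+python -c "print((0 - 128) / 127.0, (255 - 128) / 127.0)"
+```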
+
+### Use "dummy-quantization" to try out quantized inference on a float graph <a name="dummy-quant"></a>
+
+In order to evaluate the possible benefit of generating a quantized graph, the
+converter allows "dummy-quantization" on float graphs. The flags
+`--default_ranges_min` and `--default_ranges_max` accept plausible values for
+the min-max ranges of the values in all arrays that do not have min-max
+information. "Dummy-quantization" will produce lower accuracy but will emulate
+the performance of a correctly quantized model.
+
+The example below contains a model using Relu6 activation functions. Therefore,
+a reasonable guess is that most activation ranges should be contained in [0, 6].
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+ | tar xzv -C /tmp
+tflite_convert \
+  --output_file=/tmp/foo.tflite \
+ --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+ --inference_type=QUANTIZED_UINT8 \
+ --input_arrays=input \
+ --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+ --default_ranges_min=0 \
+ --default_ranges_max=6 \
+ --mean_values=128 \
+ --std_dev_values=127
+```
+
+## Specifying input and output arrays
+
+### Multiple input arrays
+
+The flag `input_arrays` takes in a comma-separated list of input arrays as seen
+in the example below. This is useful for models or subgraphs with multiple
+inputs.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+ --output_file=/tmp/foo.tflite \
+ --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
+ --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+ --output_arrays=InceptionV1/Logits/Predictions/Reshape_1
+```
+
+Note that `input_shapes` is provided as a colon-separated list. Each input shape
+corresponds to the input array at the same position in the respective list.
+
+### Multiple output arrays
+
+The flag `output_arrays` takes in a comma-separated list of output arrays as
+seen in the example below. This is useful for models or subgraphs with multiple
+outputs.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+ --output_file=/tmp/foo.tflite \
+ --input_arrays=input \
+ --output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu
+```
+
+### Specifying subgraphs
+
+Any array in the input file can be specified as an input or output array in
+order to extract subgraphs out of an input graph file. The TensorFlow Lite
+Converter discards the parts of the graph outside of the specific subgraph. Use
+[graph visualizations](#graph-visualizations) to identify the input and output
+arrays that make up the desired subgraph.
+
+The following command shows how to extract a single fused layer out of a TensorFlow
+GraphDef.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+  --output_file=/tmp/foo.tflite \
+ --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
+ --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+ --output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2
+```
+
+Note that the final representation in TensorFlow Lite FlatBuffers tends to have
+coarser granularity than the very fine granularity of the TensorFlow GraphDef
+representation. For example, while a fully-connected layer is typically
+represented as at least four separate ops in TensorFlow GraphDef (Reshape,
+MatMul, BiasAdd, Relu...), it is typically represented as a single "fused" op
+(FullyConnected) in the converter's optimized representation and in the final
+on-device representation. As the level of granularity gets coarser, some
+intermediate arrays (say, the array between the MatMul and the BiasAdd in the
+TensorFlow GraphDef) are dropped.
+
+When specifying intermediate arrays as `--input_arrays` and `--output_arrays`,
+it is desirable (and often required) to specify arrays that are meant to survive
+in the final form of the graph, after fusing. These are typically the outputs of
+activation functions (since everything in each layer until the activation
+function tends to get fused).
+
+## Logging
+
+
+## Graph visualizations
+
+The converter can export a graph to the Graphviz Dot format for easy
+visualization using either the `--output_format` flag or the
+`--dump_graphviz_dir` flag. The subsections below outline the use cases for
+each.
+
+### Using `--output_format=GRAPHVIZ_DOT` <a name="using-output-format-graphviz-dot"></a>
+
+The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into
+`--output_format`. This produces a plausible visualization of the graph. Because
+exporting to GRAPHVIZ_DOT has fewer requirements than a full conversion to a
+TensorFlow Lite FlatBuffer, it can be useful when the conversion to TFLite is
+failing.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+ --output_file=/tmp/foo.dot \
+ --output_format=GRAPHVIZ_DOT \
+  --input_shapes=1,128,128,3 \
+ --input_arrays=input \
+ --output_arrays=MobilenetV1/Predictions/Reshape_1
+```
+
+The resulting `.dot` file can be rendered into a PDF as follows:
+
+```
+dot -Tpdf -O /tmp/foo.dot
+```
+
+The resulting `.dot.pdf` can be viewed in any PDF viewer, preferably one that
+pans and zooms well across a very large page. Google Chrome does well in that
+respect.
+
+```
+google-chrome /tmp/foo.dot.pdf
+```
+
+Example PDF files are viewable online in the next section.
+
+### Using `--dump_graphviz_dir`
+
+The second way to get a Graphviz rendering is to pass the `--dump_graphviz_dir`
+flag, specifying a destination directory to dump Graphviz renderings to. Unlike
+the previous approach, this one retains the original output format and provides
+a visualization of the actual graph resulting from a specific conversion
+process.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+ | tar xzv -C /tmp
+tflite_convert \
+ --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+ --output_file=/tmp/foo.tflite \
+ --input_arrays=input \
+ --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+ --dump_graphviz_dir=/tmp
+```
+
+This generates a few files in the destination directory. The two most important
+files are `toco_AT_IMPORT.dot` and `toco_AFTER_TRANSFORMATIONS.dot`.
+`toco_AT_IMPORT.dot` represents the original graph, with only the
+transformations that were applied at import time. This tends to be a complex
+visualization
+with limited information about each node. It is useful in situations where a
+conversion command fails.
+
+`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all transformations
+were applied to it, just before it is exported. Typically, this is a much
+smaller graph with more information about each node.
+
+As before, these can be rendered to PDFs:
+
+```
+dot -Tpdf -O /tmp/toco_*.dot
+```
+
+Sample output files are shown below. Note that it is the same `AveragePool`
+node in the top right of each image.
+
+<table><tr>
+ <td>
+ <a target="_blank" href="https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AT_IMPORT.dot.pdf">
+ <img src="https://www.tensorflow.org/images/tflite_convert/tflite_convert_before.png"/>
+ </a>
+ </td>
+ <td>
+ <a target="_blank" href="https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AFTER_TRANSFORMATIONS.dot.pdf">
+ <img src="https://www.tensorflow.org/images/tflite_convert/tflite_convert_after.png"/>
+ </a>
+ </td>
+</tr>
+<tr><td>before</td><td>after</td></tr>
+</table>
+
+### Graph "video" logging
+
+When `--dump_graphviz_dir` is used, one may additionally pass
+`--dump_graphviz_video`. This causes a graph visualization to be dumped after
+each individual graph transformation, resulting in thousands of files.
+Typically, one would then bisect into these files to understand when a given
+change was introduced in the graph.
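+
+A hedged sketch of such an invocation, reusing the MobileNet example (the
+`/tmp/graphviz` directory name is arbitrary; it is created first to be safe):
+
+```
+mkdir -p /tmp/graphviz
+tflite_convert \
+  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+  --output_file=/tmp/foo.tflite \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+  --dump_graphviz_dir=/tmp/graphviz \
+  --dump_graphviz_video
+```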
+
+### Legend for the graph visualizations <a name="graphviz-legend"></a>
+
+* Operators are red square boxes with the following hues of red:
+ * Most operators are
+ <span style="background-color:#db4437;color:white;border:1px;border-style:solid;border-color:black;padding:1px">bright
+ red</span>.
+ * Some typically heavy operators (e.g. Conv) are rendered in a
+ <span style="background-color:#c53929;color:white;border:1px;border-style:solid;border-color:black;padding:1px">darker
+ red</span>.
+* Arrays are octagons with the following colors:
+ * Constant arrays are
+ <span style="background-color:#4285f4;color:white;border:1px;border-style:solid;border-color:black;padding:1px">blue</span>.
+ * Activation arrays are gray:
+ * Internal (intermediate) activation arrays are
+      <span style="background-color:#f5f5f5;border:1px;border-style:solid;border-color:black;padding:1px">light
+ gray</span>.
+ * Those activation arrays that are designated as `--input_arrays` or
+ `--output_arrays` are
+ <span style="background-color:#9e9e9e;border:1px;border-style:solid;border-color:black;padding:1px">dark
+ gray</span>.
+ * RNN state arrays are green. Because of the way that the converter
+ represents RNN back-edges explicitly, each RNN state is represented by a
+ pair of green arrays:
+ * The activation array that is the source of the RNN back-edge (i.e.
+ whose contents are copied into the RNN state array after having been
+ computed) is
+ <span style="background-color:#b7e1cd;border:1px;border-style:solid;border-color:black;padding:1px">light
+ green</span>.
+ * The actual RNN state array is
+ <span style="background-color:#0f9d58;color:white;border:1px;border-style:solid;border-color:black;padding:1px">dark
+ green</span>. It is the destination of the RNN back-edge updating
+ it.