From ee1cb110360b12d752c9cb4ebbb76d33930f67d7 Mon Sep 17 00:00:00 2001
From: Mark Daoust
Date: Tue, 9 Oct 2018 17:23:45 -0700
Subject: Move tflite_convert g3docs, so they will be pulled into the site.

PiperOrigin-RevId: 216452447
---
 tensorflow/contrib/lite/g3doc/_book.yaml           |   9 +
 .../lite/g3doc/tflite_convert/cmdline_examples.md  | 360 ++++++++++++++++++++
 .../lite/g3doc/tflite_convert/cmdline_reference.md | 159 +++++++++
 .../contrib/lite/g3doc/tflite_convert/index.md     |  22 ++
 .../lite/g3doc/tflite_convert/python_api.md        | 258 ++++++++++++++
 .../lite/g3doc/tflite_convert/toco_landscape.svg   |   1 +
 tensorflow/contrib/lite/toco/g3doc/README.md       |   3 +
 .../contrib/lite/toco/g3doc/cmdline_examples.md    | 372 ---------------------
 .../contrib/lite/toco/g3doc/cmdline_reference.md   | 168 ----------
 tensorflow/contrib/lite/toco/g3doc/python_api.md   | 279 ----------------
 .../contrib/lite/toco/g3doc/toco_landscape.svg     |   1 -
 11 files changed, 812 insertions(+), 820 deletions(-)
 create mode 100644 tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md
 create mode 100644 tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_reference.md
 create mode 100644 tensorflow/contrib/lite/g3doc/tflite_convert/index.md
 create mode 100644 tensorflow/contrib/lite/g3doc/tflite_convert/python_api.md
 create mode 100644 tensorflow/contrib/lite/g3doc/tflite_convert/toco_landscape.svg
 create mode 100644 tensorflow/contrib/lite/toco/g3doc/README.md
 delete mode 100644 tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md
 delete mode 100644 tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md
 delete mode 100644 tensorflow/contrib/lite/toco/g3doc/python_api.md
 delete mode 100644 tensorflow/contrib/lite/toco/g3doc/toco_landscape.svg

diff --git a/tensorflow/contrib/lite/g3doc/_book.yaml b/tensorflow/contrib/lite/g3doc/_book.yaml
index de6914e536..f6ec387ad2 100644
--- a/tensorflow/contrib/lite/g3doc/_book.yaml
+++ b/tensorflow/contrib/lite/g3doc/_book.yaml
@@ -38,6 +38,15 @@ upper_tabs:
         path: /lite/ios
       - title: TensorFlow Lite for Raspberry Pi
         path: /lite/rpi
+      - heading: TFLite Converter
+      - title: Overview
+        path: /lite/tflite_convert/
+      - title: Python API
+        path: /lite/tflite_convert/python_api
+      - title: Command Line Examples
+        path: /lite/tflite_convert/cmdline_examples
+      - title: Command Line Reference
+        path: /lite/tflite_convert/cmdline_reference
 
     - title: TF Mobile
       style: accordion
diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md
new file mode 100644
index 0000000000..d88acfae80
--- /dev/null
+++ b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_examples.md
@@ -0,0 +1,360 @@
+# TensorFlow Lite Converter command-line examples
+
+This page shows how to use the TensorFlow Lite Converter from the command
+line.
+
+[TOC]
+
+## Command-line tools
+
+There are two approaches to running the converter from the command line.
+
+*   `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool
+    `tflite_convert` is installed as part of the Python package. All of the
+    examples below use `tflite_convert` for simplicity.
+    *   Example: `tflite_convert --output_file=...`
+*   `bazel`: In order to run the latest version of the TensorFlow Lite
+    Converter either install the nightly build using
+    [pip](https://www.tensorflow.org/install/pip) or
+    [clone the TensorFlow repository](https://www.tensorflow.org/install/source)
+    and use `bazel`.
+ * Example: `bazel run + //tensorflow/contrib/lite/python:tflite_convert -- + --output_file=...` + +### Converting models prior to TensorFlow 1.9 + +The recommended approach for using the converter prior to TensorFlow 1.9 is the +[Python API](python_api.md#pre-tensorflow-1.9). If a command line tool is +desired, the `toco` command line tool was available in TensorFlow 1.7. Enter +`toco --help` in Terminal for additional details on the command-line flags +available. There were no command line tools in TensorFlow 1.8. + +## Basic examples + +The following section shows examples of how to convert a basic float-point model +from each of the supported data formats into a TensorFlow Lite FlatBuffers. + +### Convert a TensorFlow GraphDef + +The follow example converts a basic TensorFlow GraphDef (frozen by +[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)) +into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen +graphs contain the variables stored in Checkpoint files as Const ops. + +``` +curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ + | tar xzv -C /tmp +tflite_convert \ + --output_file=/tmp/foo.tflite \ + --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ + --input_arrays=input \ + --output_arrays=MobilenetV1/Predictions/Reshape_1 +``` + +The value for `input_shapes` is automatically determined whenever possible. + +### Convert a TensorFlow SavedModel + +The follow example converts a basic TensorFlow SavedModel into a Tensorflow Lite +FlatBuffer to perform floating-point inference. + +``` +tflite_convert \ + --output_file=/tmp/foo.tflite \ + --saved_model_dir=/tmp/saved_model +``` + +[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators) +has fewer required flags than frozen graphs due to access to additional data +contained within the SavedModel. The values for `--input_arrays` and +`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs +in the [SignatureDefs](https://www.tensorflow.org/serving/signature_defs) within +the +[MetaGraphDef](https://www.tensorflow.org/guide/saved_model#apis_to_build_and_load_a_savedmodel) +specified by `--saved_model_tag_set`. As with the GraphDef, the value for +`input_shapes` is automatically determined whenever possible. + +There is currently no support for MetaGraphDefs without a SignatureDef or for +MetaGraphDefs that use the [`assets/` +directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory). + +### Convert a tf.Keras model + +The following example converts a `tf.keras` model into a TensorFlow Lite +Flatbuffer. The `tf.keras` file must contain both the model and the weights. + +``` +tflite_convert \ + --output_file=/tmp/foo.tflite \ + --keras_model_file=/tmp/keras_model.h5 +``` + +## Quantization + +### Convert a TensorFlow GraphDef for quantized inference + +The TensorFlow Lite Converter is compatible with fixed point quantization models +described [here](https://www.tensorflow.org/performance/quantization). These are +float models with +[`FakeQuant*`](https://www.tensorflow.org/api_guides/python/array_ops#Fake_quantization) +ops inserted at the boundaries of fused layers to record min-max range +information. This generates a quantized inference workload that reproduces the +quantization behavior that was used during training. 
+
+The following command generates a quantized TensorFlow Lite FlatBuffer from a
+"quantized" TensorFlow GraphDef.
+
+```
+tflite_convert \
+  --output_file=/tmp/foo.tflite \
+  --graph_def_file=/tmp/some_quantized_graph.pb \
+  --inference_type=QUANTIZED_UINT8 \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+  --mean_values=128 \
+  --std_dev_values=127
+```
+
+### Use "dummy-quantization" to try out quantized inference on a float graph
+
+In order to evaluate the possible benefit of generating a quantized graph,
+the converter allows "dummy-quantization" on float graphs. The flags
+`--default_ranges_min` and `--default_ranges_max` accept plausible values for
+the min-max ranges of the values in all arrays that do not have min-max
+information. "Dummy-quantization" will produce lower accuracy but will
+emulate the performance of a correctly quantized model.
+
+The example below contains a model using Relu6 activation functions.
+Therefore, a reasonable guess is that most activation ranges should be
+contained in [0, 6].
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --output_file=/tmp/foo.cc \
+  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+  --inference_type=QUANTIZED_UINT8 \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+  --default_ranges_min=0 \
+  --default_ranges_max=6 \
+  --mean_values=128 \
+  --std_dev_values=127
+```
+
+## Specifying input and output arrays
+
+### Multiple input arrays
+
+The flag `input_arrays` takes in a comma-separated list of input arrays as
+seen in the example below. This is useful for models or subgraphs with
+multiple inputs.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+  --output_file=/tmp/foo.tflite \
+  --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
+  --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+  --output_arrays=InceptionV1/Logits/Predictions/Reshape_1
+```
+
+Note that `input_shapes` is provided as a colon-separated list. Each input
+shape corresponds to the input array at the same position in the respective
+list.
+
+### Multiple output arrays
+
+The flag `output_arrays` takes in a comma-separated list of output arrays as
+seen in the example below. This is useful for models or subgraphs with
+multiple outputs.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+  --output_file=/tmp/foo.tflite \
+  --input_arrays=input \
+  --output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu
+```
+
+### Specifying subgraphs
+
+Any array in the input file can be specified as an input or output array in
+order to extract subgraphs out of an input graph file. The TensorFlow Lite
+Converter discards the parts of the graph outside of the specified subgraph.
+Use [graph visualizations](#graph-visualizations) to identify the input and
+output arrays that make up the desired subgraph.
+
+The following command shows how to extract a single fused layer out of a
+TensorFlow GraphDef.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \
+  --output_file=/tmp/foo.pb \
+  --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \
+  --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \
+  --output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2
+```
+
+Note that the final representation in TensorFlow Lite FlatBuffers tends to
+have coarser granularity than the very fine granularity of the TensorFlow
+GraphDef representation. For example, while a fully-connected layer is
+typically represented as at least four separate ops in TensorFlow GraphDef
+(Reshape, MatMul, BiasAdd, Relu...), it is typically represented as a single
+"fused" op (FullyConnected) in the converter's optimized representation and
+in the final on-device representation. As the level of granularity gets
+coarser, some intermediate arrays (say, the array between the MatMul and the
+BiasAdd in the TensorFlow GraphDef) are dropped.
+
+When specifying intermediate arrays as `--input_arrays` and
+`--output_arrays`, it is desirable (and often required) to specify arrays
+that are meant to survive in the final form of the graph, after fusing. These
+are typically the outputs of activation functions (since everything in each
+layer until the activation function tends to get fused).
+
+## Graph visualizations
+
+The converter can export a graph to the Graphviz Dot format for easy
+visualization using either the `--output_format` flag or the
+`--dump_graphviz_dir` flag. The subsections below outline the use cases for
+each.
+
+### Using `--output_format=GRAPHVIZ_DOT`
+
+The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into
+`--output_format`. This results in a plausible visualization of the graph.
+This reduces the requirements that exist during conversion from a TensorFlow
+GraphDef to a TensorFlow Lite FlatBuffer. This may be useful if the
+conversion to TFLite is failing.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+  --output_file=/tmp/foo.dot \
+  --output_format=GRAPHVIZ_DOT \
+  --input_shape=1,128,128,3 \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1
+```
+
+The resulting `.dot` file can be rendered into a PDF as follows:
+
+```
+dot -Tpdf -O /tmp/foo.dot
+```
+
+And the resulting `.dot.pdf` can be viewed in any PDF viewer, but we suggest
+one with a good ability to pan and zoom across a very large page. Google
+Chrome does well in that respect.
+
+```
+google-chrome /tmp/foo.dot.pdf
+```
+
+Example PDF files are viewable online in the next section.
+
+### Using `--dump_graphviz_dir`
+
+The second way to get a Graphviz rendering is to pass the
+`--dump_graphviz_dir` flag, specifying a destination directory to dump
+Graphviz renderings to.
+Unlike the previous approach, this one retains the original output format.
+This provides a visualization of the actual graph resulting from a specific
+conversion process.
+
+```
+curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \
+  | tar xzv -C /tmp
+tflite_convert \
+  --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \
+  --output_file=/tmp/foo.tflite \
+  --input_arrays=input \
+  --output_arrays=MobilenetV1/Predictions/Reshape_1 \
+  --dump_graphviz_dir=/tmp
+```
+
+This generates a few files in the destination directory. The two most
+important files are `toco_AT_IMPORT.dot` and `toco_AFTER_TRANSFORMATIONS.dot`.
+`toco_AT_IMPORT.dot` represents the original graph containing only the
+transformations done at import time. This tends to be a complex visualization
+with limited information about each node. It is useful in situations where a
+conversion command fails.
+
+`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all
+transformations were applied to it, just before it is exported. Typically,
+this is a much smaller graph with more information about each node.
+
+As before, these can be rendered to PDFs:
+
+```
+dot -Tpdf -O /tmp/toco_*.dot
+```
+
+Sample output files are shown below. Note that it is the same `AveragePool`
+node in the top right of each rendering.
+
+*   before: [toco_AT_IMPORT.dot.pdf](https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AT_IMPORT.dot.pdf)
+*   after: [toco_AFTER_TRANSFORMATIONS.dot.pdf](https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AFTER_TRANSFORMATIONS.dot.pdf)
+
+### Graph "video" logging
+
+When `--dump_graphviz_dir` is used, one may additionally pass
+`--dump_graphviz_video`. This causes a graph visualization to be dumped after
+each individual graph transformation, resulting in thousands of files.
+Typically, one would then bisect into these files to understand when a given
+change was introduced in the graph.
+
+### Legend for the graph visualizations
+
+*   Operators are red square boxes with the following hues of red:
+    *   Most operators are bright red.
+    *   Some typically heavy operators (e.g. Conv) are rendered in a darker
+        red.
+*   Arrays are octagons with the following colors:
+    *   Constant arrays are blue.
+    *   Activation arrays are gray:
+        *   Internal (intermediate) activation arrays are light gray.
+        *   Those activation arrays that are designated as `--input_arrays`
+            or `--output_arrays` are dark gray.
+    *   RNN state arrays are green. Because of the way that the converter
+        represents RNN back-edges explicitly, each RNN state is represented
+        by a pair of green arrays:
+        *   The activation array that is the source of the RNN back-edge
+            (i.e. whose contents are copied into the RNN state array after
+            having been computed) is light green.
+        *   The actual RNN state array is dark green. It is the destination
+            of the RNN back-edge updating it.
diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_reference.md b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_reference.md
new file mode 100644
index 0000000000..d65912fea6
--- /dev/null
+++ b/tensorflow/contrib/lite/g3doc/tflite_convert/cmdline_reference.md
@@ -0,0 +1,159 @@
+# TensorFlow Lite Converter command-line glossary
+
+This page is a complete reference of the command-line flags used by the
+TensorFlow Lite Converter's command-line tool, from TensorFlow 1.9 up to the
+most recent build of TensorFlow.
+
+[TOC]
+
+## High-level flags
+
+The following high-level flags specify the details of the input and output
+files. The flag `--output_file` is always required. Additionally, either
+`--graph_def_file`, `--saved_model_dir` or `--keras_model_file` is required.
+
+*   `--output_file`. Type: string. Specifies the full path of the output
+    file.
+*   `--graph_def_file`. Type: string. Specifies the full path of the input
+    GraphDef file frozen using
+    [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
+*   `--saved_model_dir`. Type: string. Specifies the full path to the
+    directory containing the SavedModel.
+*   `--keras_model_file`. Type: string. Specifies the full path of the HDF5
+    file containing the tf.keras model.
+*   `--output_format`. Type: string. Default: `TFLITE`. Specifies the format
+    of the output file. Allowed values:
+    *   `TFLITE`: TensorFlow Lite FlatBuffer format.
+    *   `GRAPHVIZ_DOT`: GraphViz `.dot` format containing a visualization of
+        the graph after graph transformations.
+        *   Note that passing `GRAPHVIZ_DOT` to `--output_format` leads to
+            loss of TFLite-specific transformations. Therefore, the resulting
+            visualization may not reflect the final set of graph
+            transformations. To get a final visualization with all graph
+            transformations use `--dump_graphviz_dir` instead.
+
+The following flags specify optional parameters when using SavedModels.
+
+*   `--saved_model_tag_set`. Type: string. Default:
+    [kSavedModelTagServe](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h).
+    Specifies a comma-separated set of tags identifying the MetaGraphDef
+    within the SavedModel to analyze. All tags in the tag set must be
+    specified.
+*   `--saved_model_signature_key`. Type: string. Default:
+    [DEFAULT_SERVING_SIGNATURE_DEF_KEY](https://www.tensorflow.org/api_docs/python/tf/saved_model/signature_constants).
+    Specifies the key identifying the SignatureDef containing inputs and
+    outputs.
+
+## Model flags
+
+*Model flags* provide additional information about the model stored in the
+input file.
+
+*   `--input_arrays`. Type: comma-separated list of strings. Specifies the
+    list of names of input activation tensors.
+*   `--output_arrays`. Type: comma-separated list of strings. Specifies the
+    list of names of output activation tensors.
+
+The following flags define properties of the input tensors. Each item in the
+`--input_arrays` flag corresponds, by index, to an item in each of the
+following flags.
+
+*   `--input_shapes`. Type: colon-separated list of comma-separated lists of
+    integers. Each comma-separated list of integers gives the shape of one of
+    the input arrays specified in
+    [TensorFlow convention](https://www.tensorflow.org/guide/dims_types#shape).
+    *   Example: `--input_shapes=1,60,80,3` for a typical vision model means
+        a batch size of 1, an input image height of 60, an input image width
+        of 80, and an input image depth of 3 (representing RGB channels).
+    *   Example: `--input_arrays=foo,bar --input_shapes=2,3:4,5,6` means
+        "foo" has a shape of [2, 3] and "bar" has a shape of [4, 5, 6].
+*   `--std_dev_values`, `--mean_values`. Type: comma-separated list of
+    floats. These specify the (de-)quantization parameters of the input
+    array, when it is quantized. This is only needed if
+    `inference_input_type` is `QUANTIZED_UINT8`.
+    *   The meaning of `mean_values` and `std_dev_values` is as follows: each
+        quantized value in the quantized input array will be interpreted as a
+        mathematical real number (i.e. as an input activation value)
+        according to the following formula:
+        *   `real_value = (quantized_input_value - mean_value) / std_dev_value`.
+        *   For example, with `--mean_values=128 --std_dev_values=127`, the
+            quantized value 128 is interpreted as 0.0 and the quantized value
+            255 is interpreted as 1.0.
+    *   When performing float inference (`--inference_type=FLOAT`) on a
+        quantized input, the quantized input would be immediately dequantized
+        by the inference code according to the above formula, before
+        proceeding with float inference.
+    *   When performing quantized inference
+        (`--inference_type=QUANTIZED_UINT8`), no dequantization is performed
+        by the inference code. However, the quantization parameters of all
+        arrays, including those of the input arrays as specified by
+        `mean_value` and `std_dev_value`, determine the fixed-point
+        multipliers used in the quantized inference code. `mean_value` must
+        be an integer when performing quantized inference.
+
+## Transformation flags
+
+*Transformation flags* specify options of the transformations to be applied
+to the graph, i.e. they specify requested properties that the output file
+should have.
+
+*   `--inference_type`. Type: string. Default: `FLOAT`. Data type of all
+    real-number arrays in the output file except for input arrays (defined by
+    `--inference_input_type`). Must be `{FLOAT, QUANTIZED_UINT8}`.
+
+    This flag only impacts real-number arrays including float and quantized
+    arrays. This excludes all other data types including plain integer arrays
+    and string arrays. Specifically:
+
+    *   If `FLOAT`, then real-number arrays will be of type float in the
+        output file. If they were quantized in the input file, then they get
+        dequantized.
+    *   If `QUANTIZED_UINT8`, then real-number arrays will be quantized as
+        uint8 in the output file. If they were float in the input file, then
+        they get quantized.
+
+*   `--inference_input_type`. Type: string. Data type of a real-number input
+    array in the output file. By default the `--inference_type` is used as
+    the type of all of the input arrays. This flag is primarily intended for
+    generating a floating-point graph with a quantized input array. A
+    Dequantize operator is added immediately after the input array. Must be
+    `{FLOAT, QUANTIZED_UINT8}`.
+
+    The flag is typically used for vision models taking a bitmap as input but
+    requiring floating-point inference. For such image models, the uint8
+    input is quantized and the quantization parameters used for such input
+    arrays are their `mean_value` and `std_dev_value` parameters.
+
+*   `--default_ranges_min`, `--default_ranges_max`. Type: floating-point.
+    Default value for the (min, max) range values used for all arrays without
+    a specified range. Allows the user to proceed with quantization of
+    non-quantized or incorrectly-quantized input files. These flags produce
+    models with low accuracy. They are intended for easy experimentation with
+    quantization via "dummy quantization".
+
+*   `--drop_control_dependency`. Type: boolean. Default: True. Indicates
+    whether to drop control dependencies silently. This is due to TensorFlow
+    Lite not supporting control dependencies.
+
+*   `--reorder_across_fake_quant`. Type: boolean. Default: False. Indicates
+    whether to reorder FakeQuant nodes in unexpected locations. Used when the
+    location of the FakeQuant nodes is preventing graph transformations
+    necessary to convert the graph. Results in a graph that differs from the
+    quantized training graph, potentially causing differing arithmetic
+    behavior.
+
+*   `--allow_custom_ops`. Type: boolean. Default: False. Indicates whether to
+    allow custom operations. When false, any unknown operation is an error.
+    When true, custom ops are created for any op that is unknown. The
+    developer will need to provide these to the TensorFlow Lite runtime with
+    a custom resolver.
+
+*   `--post_training_quantize`. Type: boolean. Default: False. Indicates
+    whether to quantize the weights of the converted float model. Model size
+    will be reduced and there will be latency improvements (at the cost of
+    accuracy).
+
+## Logging flags
+
+The following flags generate graph visualizations of the graph as
+[GraphViz](https://www.graphviz.org/) `.dot` files at various points during
+graph transformations:
+
+*   `--dump_graphviz_dir`. Type: string. Specifies the full path of the
+    directory to output GraphViz `.dot` files. Outputs the graph immediately
+    after reading in the graph and after all of the transformations have been
+    completed.
+*   `--dump_graphviz_video`. Type: boolean. Outputs GraphViz after every
+    graph transformation. Requires `--dump_graphviz_dir` to be specified.
diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/index.md b/tensorflow/contrib/lite/g3doc/tflite_convert/index.md
new file mode 100644
index 0000000000..12ba0225f6
--- /dev/null
+++ b/tensorflow/contrib/lite/g3doc/tflite_convert/index.md
@@ -0,0 +1,22 @@
+# TensorFlow Lite Converter
+
+The TensorFlow Lite Converter converts TensorFlow graphs into TensorFlow Lite
+graphs. Additional uses of the converter are detailed in the usage
+documentation.
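+
+For orientation, converting a SavedModel can be as short as the sketch below
+(assuming TensorFlow 1.9 or later, where the converter is exposed as
+`tf.contrib.lite.TFLiteConverter`; `/tmp/saved_model` is a placeholder path):
+
+```python
+import tensorflow as tf
+
+# Build a converter from a SavedModel directory and write out a FlatBuffer.
+converter = tf.contrib.lite.TFLiteConverter.from_saved_model("/tmp/saved_model")
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+The Python API and command-line pages in this section walk through this flow
+in detail.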
+
+## Where the converter fits in the TensorFlow landscape
+
+Once an application developer has a trained TensorFlow model, the TensorFlow
+Lite Converter will accept that model and generate a TensorFlow Lite
+[FlatBuffer](https://google.github.io/flatbuffers/) file. The converter
+currently supports
+[SavedModels](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators),
+frozen graphs (models generated via
+[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)),
+and `tf.keras` model files. The TensorFlow Lite FlatBuffer file can be
+shipped to client devices, generally mobile devices, where the TensorFlow
+Lite interpreter handles it on-device. This flow is represented in the
+diagram below.
+
+![drawing](toco_landscape.svg)
diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/python_api.md b/tensorflow/contrib/lite/g3doc/tflite_convert/python_api.md
new file mode 100644
index 0000000000..e1c0e0c240
--- /dev/null
+++ b/tensorflow/contrib/lite/g3doc/tflite_convert/python_api.md
@@ -0,0 +1,258 @@
+# TensorFlow Lite Converter & Interpreter Python API reference
+
+This page provides examples on how to use the TensorFlow Lite Converter and
+the TensorFlow Lite interpreter using the Python API.
+
+[TOC]
+
+## High-level overview
+
+While the TensorFlow Lite Converter can be used from the command line, it is
+often convenient to use in a Python script as part of the model development
+pipeline. This allows you to know early on that you are designing a model
+that can be targeted to mobile devices.
+
+## API
+
+The API for converting TensorFlow models to TensorFlow Lite as of TensorFlow
+1.9 is `tf.contrib.lite.TFLiteConverter`. The API for calling the Python
+interpreter is `tf.contrib.lite.Interpreter`.
+
+Note: See the "Additional instructions" sections for converting TensorFlow
+models to TensorFlow Lite
+[in TensorFlow 1.9 to TensorFlow 1.11](#pre-tensorflow-1.11) and
+[prior to TensorFlow 1.9](#pre-tensorflow-1.9).
+
+`TFLiteConverter` provides class methods based on the original format of the
+model. `TFLiteConverter.from_session()` is available for GraphDefs.
+`TFLiteConverter.from_saved_model()` is available for SavedModels.
+`TFLiteConverter.from_keras_model_file()` is available for `tf.keras` files.
+Example usages for simple floating-point models are shown in
+[Basic Examples](#basic). Example usages for more complex models are shown in
+[Complex Examples](#complex).
+
+## Basic examples
+
+The following section shows examples of how to convert a basic floating-point
+model from each of the supported data formats into a TensorFlow Lite
+FlatBuffer.
+
+### Exporting a GraphDef from tf.Session
+
+The following example shows how to convert a TensorFlow GraphDef into a
+TensorFlow Lite FlatBuffer from a `tf.Session` object.
+
+```python
+import tensorflow as tf
+
+img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
+var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
+val = img + var
+out = tf.identity(val, name="out")
+
+with tf.Session() as sess:
+  sess.run(tf.global_variables_initializer())
+  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
+  tflite_model = converter.convert()
+  open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+### Exporting a GraphDef from file
+
+The following example shows how to convert a TensorFlow GraphDef into a
+TensorFlow Lite FlatBuffer when the GraphDef is stored in a file.
+Both `.pb` and `.pbtxt` files are accepted.
+
+The example uses
+[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
+The function only supports GraphDefs frozen using
+[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
+
+```python
+import tensorflow as tf
+
+graph_def_file = "/path/to/Downloads/mobilenet_v1_1.0_224/frozen_graph.pb"
+input_arrays = ["input"]
+output_arrays = ["MobilenetV1/Predictions/Softmax"]
+
+converter = tf.contrib.lite.TFLiteConverter.from_frozen_graph(
+    graph_def_file, input_arrays, output_arrays)
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+### Exporting a SavedModel
+
+The following example shows how to convert a SavedModel into a TensorFlow
+Lite FlatBuffer.
+
+```python
+import tensorflow as tf
+
+converter = tf.contrib.lite.TFLiteConverter.from_saved_model(saved_model_dir)
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+For more complex SavedModels, the optional parameters that can be passed into
+`TFLiteConverter.from_saved_model()` are `input_arrays`, `input_shapes`,
+`output_arrays`, `tag_set` and `signature_key`. Details of each parameter are
+available by running `help(tf.contrib.lite.TFLiteConverter)`.
+
+### Exporting a tf.keras File
+
+The following example shows how to convert a `tf.keras` model into a
+TensorFlow Lite FlatBuffer. This example requires
+[`h5py`](http://docs.h5py.org/en/latest/build.html) to be installed.
+
+```python
+import tensorflow as tf
+
+converter = tf.contrib.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+The `tf.keras` file must contain both the model and the weights. A
+comprehensive example including model construction can be seen below.
+
+```python
+import numpy as np
+import tensorflow as tf
+
+# Generate tf.keras model.
+model = tf.keras.models.Sequential()
+model.add(tf.keras.layers.Dense(2, input_shape=(3,)))
+model.add(tf.keras.layers.RepeatVector(3))
+model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(3)))
+model.compile(loss=tf.keras.losses.MSE,
+              optimizer=tf.keras.optimizers.RMSprop(lr=0.0001),
+              metrics=[tf.keras.metrics.categorical_accuracy],
+              sample_weight_mode='temporal')
+
+x = np.random.random((1, 3))
+y = np.random.random((1, 3, 3))
+model.train_on_batch(x, y)
+model.predict(x)
+
+# Save tf.keras model in HDF5 format.
+keras_file = "keras_model.h5"
+tf.keras.models.save_model(model, keras_file)
+
+# Convert to TensorFlow Lite model.
+converter = tf.contrib.lite.TFLiteConverter.from_keras_model_file(keras_file)
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+## Complex examples
+
+For models where the default values of the attributes are not sufficient, the
+attributes' values should be set before calling `convert()`. To use any of
+the constants, reference them through `tf.contrib.lite.constants`, as seen
+below with `QUANTIZED_UINT8`. Run `help(tf.contrib.lite.TFLiteConverter)` in
+the Python terminal for detailed documentation on the attributes.
+
+Although the examples are demonstrated on GraphDefs containing only
+constants, the same logic can be applied irrespective of the input data
+format. A short sketch of this attribute-setting pattern is shown immediately
+below.
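+
+As an illustrative sketch (assuming a TensorFlow build, such as 1.11 or
+later, in which the converter exposes the `post_training_quantize` attribute
+documented for the command-line tool; `saved_model_dir` is a placeholder):
+
+```python
+import tensorflow as tf
+
+converter = tf.contrib.lite.TFLiteConverter.from_saved_model(saved_model_dir)
+# Set attributes on the converter before calling convert().
+converter.post_training_quantize = True  # Quantize weights after training.
+tflite_model = converter.convert()
+open("converted_model.tflite", "wb").write(tflite_model)
+```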
+
+### Exporting a quantized GraphDef
+
+The following example shows how to convert a quantized model into a
+TensorFlow Lite FlatBuffer.
+
+```python
+import tensorflow as tf
+
+img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
+const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
+val = img + const
+out = tf.fake_quant_with_min_max_args(val, min=0., max=1., name="output")
+
+with tf.Session() as sess:
+  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
+  converter.inference_type = tf.contrib.lite.constants.QUANTIZED_UINT8
+  input_arrays = converter.get_input_arrays()
+  converter.quantized_input_stats = {input_arrays[0] : (0., 1.)}  # mean, std_dev
+  tflite_model = converter.convert()
+  open("converted_model.tflite", "wb").write(tflite_model)
+```
+
+## TensorFlow Lite Python interpreter
+
+### Using the interpreter from a model file
+
+The following example shows how to use the TensorFlow Lite Python interpreter
+when provided a TensorFlow Lite FlatBuffer file. The example also
+demonstrates how to run inference on random input data. Run
+`help(tf.contrib.lite.Interpreter)` in the Python terminal to get detailed
+documentation on the interpreter.
+
+```python
+import numpy as np
+import tensorflow as tf
+
+# Load TFLite model and allocate tensors.
+interpreter = tf.contrib.lite.Interpreter(model_path="converted_model.tflite")
+interpreter.allocate_tensors()
+
+# Get input and output tensors.
+input_details = interpreter.get_input_details()
+output_details = interpreter.get_output_details()
+
+# Test model on random input data.
+input_shape = input_details[0]['shape']
+input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
+interpreter.set_tensor(input_details[0]['index'], input_data)
+
+interpreter.invoke()
+output_data = interpreter.get_tensor(output_details[0]['index'])
+print(output_data)
+```
+
+### Using the interpreter from model data
+
+The following example shows how to use the TensorFlow Lite Python interpreter
+when starting with a TensorFlow Lite FlatBuffer model already loaded in
+memory. This example shows an end-to-end use case, starting from building the
+TensorFlow model.
+
+```python
+import numpy as np
+import tensorflow as tf
+
+img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
+const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
+val = img + const
+out = tf.identity(val, name="out")
+
+with tf.Session() as sess:
+  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
+  tflite_model = converter.convert()
+
+# Load TFLite model and allocate tensors.
+interpreter = tf.contrib.lite.Interpreter(model_content=tflite_model)
+interpreter.allocate_tensors()
+```
+
+## Additional instructions
+
+### Build from source code
+
+In order to run the latest version of the TensorFlow Lite Converter Python
+API, either install the nightly build with
+[pip](https://www.tensorflow.org/install/pip) (recommended) or
+[Docker](https://www.tensorflow.org/install/docker), or
+[build the pip package from source](https://www.tensorflow.org/install/source).
+
+### Converting models in TensorFlow 1.9 to TensorFlow 1.11
+
+To convert TensorFlow models to TensorFlow Lite in TensorFlow 1.9 through
+TensorFlow 1.11, use `TocoConverter`. `TocoConverter` is semantically
+identical to `TFLiteConverter`.
+
+### Converting models prior to TensorFlow 1.9
+
+To convert TensorFlow models to TensorFlow Lite in TensorFlow 1.7 and
+TensorFlow 1.8, use the `toco_convert` function.
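+A minimal sketch of `toco_convert` usage is shown below (assuming the TF
+1.7/1.8 signature of a GraphDef plus input and output tensors; the tiny graph
+is illustrative):
+
+```python
+import tensorflow as tf
+
+img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
+out = tf.identity(img, name="out")
+
+with tf.Session() as sess:
+  # toco_convert takes the GraphDef plus input and output tensors.
+  tflite_model = tf.contrib.lite.toco_convert(sess.graph_def, [img], [out])
+open("converted_model.tflite", "wb").write(tflite_model)
+```
+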
Run `help(tf.contrib.lite.toco_convert)` +to get details about accepted parameters. diff --git a/tensorflow/contrib/lite/g3doc/tflite_convert/toco_landscape.svg b/tensorflow/contrib/lite/g3doc/tflite_convert/toco_landscape.svg new file mode 100644 index 0000000000..335debde57 --- /dev/null +++ b/tensorflow/contrib/lite/g3doc/tflite_convert/toco_landscape.svg @@ -0,0 +1 @@ + \ No newline at end of file diff --git a/tensorflow/contrib/lite/toco/g3doc/README.md b/tensorflow/contrib/lite/toco/g3doc/README.md new file mode 100644 index 0000000000..2153b6cc63 --- /dev/null +++ b/tensorflow/contrib/lite/toco/g3doc/README.md @@ -0,0 +1,3 @@ +# TOCO + +These files have moved to [../../g3doc/tflite_convert](../../g3doc/tflite_convert) diff --git a/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md b/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md deleted file mode 100644 index e3c46eb377..0000000000 --- a/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md +++ /dev/null @@ -1,372 +0,0 @@ -# TensorFlow Lite Converter command-line examples - -This page shows how to use the TensorFlow Lite Converter in the command line. It -is complemented by the following documents: - -* [README](../README.md) -* [Command-line glossary](cmdline_reference.md) -* [Python API examples](python_api.md) - -Table of contents: - -* [Command-line tools](#tools) - * [Converting models prior to TensorFlow 1.9](#pre-tensorflow-1.9) -* [Basic examples](#basic) - * [Convert a TensorFlow GraphDef](#graphdef) - * [Convert a TensorFlow SavedModel](#savedmodel) - * [Convert a tf.keras model](#keras) -* [Quantization](#quantization) - * [Convert a TensorFlow GraphDef for quantized inference](#graphdef-quant) - * [Use "dummy-quantization" to try out quantized inference on a float - graph](#dummy-quant) -* [Specifying input and output arrays](#specifying-input-and-output-arrays) - * [Multiple input arrays](#multiple-input-arrays) - * [Multiple output arrays](#multiple-output-arrays) - * [Specifying subgraphs](#specifying-subgraphs) -* [Graph visualizations](#graph-visualizations) - * [Using --output_format=GRAPHVIZ_DOT](#using-output-format-graphviz-dot) - * [Using --dump_graphviz_dir](#using-dump-graphviz-dir) - * [Graph "video" logging](#graph-video-logging) - * [Legend for the graph visualizations](#graphviz-legend) - -## Command-line tools - -There are two approaches to running the converter in the command line. - -* `tflite_convert`: Starting from TensorFlow 1.9, the command-line tool - `tflite_convert` is installed as part of the Python package. All of the - examples below use `tflite_convert` for simplicity. - * Example: `tflite_convert --output_file=...` -* `bazel`: In order to run the latest version of the TensorFlow Lite Converter - either install the nightly build using - [pip](https://www.tensorflow.org/install/pip) or - [clone the TensorFlow repository](https://www.tensorflow.org/install/source) - and use `bazel`. - * Example: `bazel run - //tensorflow/contrib/lite/python:tflite_convert -- - --output_file=...` - -### Converting models prior to TensorFlow 1.9 - -The recommended approach for using the converter prior to TensorFlow 1.9 is the -[Python API](python_api.md#pre-tensorflow-1.9). If a command line tool is -desired, the `toco` command line tool was available in TensorFlow 1.7. Enter -`toco --help` in Terminal for additional details on the command-line flags -available. There were no command line tools in TensorFlow 1.8. 
- -## Basic examples - -The following section shows examples of how to convert a basic float-point model -from each of the supported data formats into a TensorFlow Lite FlatBuffers. - -### Convert a TensorFlow GraphDef - -The follow example converts a basic TensorFlow GraphDef (frozen by -[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)) -into a TensorFlow Lite FlatBuffer to perform floating-point inference. Frozen -graphs contain the variables stored in Checkpoint files as Const ops. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ - | tar xzv -C /tmp -tflite_convert \ - --output_file=/tmp/foo.tflite \ - --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 -``` - -The value for `input_shapes` is automatically determined whenever possible. - -### Convert a TensorFlow SavedModel - -The follow example converts a basic TensorFlow SavedModel into a Tensorflow Lite -FlatBuffer to perform floating-point inference. - -``` -tflite_convert \ - --output_file=/tmp/foo.tflite \ - --saved_model_dir=/tmp/saved_model -``` - -[SavedModel](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators) -has fewer required flags than frozen graphs due to access to additional data -contained within the SavedModel. The values for `--input_arrays` and -`--output_arrays` are an aggregated, alphabetized list of the inputs and outputs -in the [SignatureDefs](https://www.tensorflow.org/serving/signature_defs) within -the -[MetaGraphDef](https://www.tensorflow.org/guide/saved_model#apis_to_build_and_load_a_savedmodel) -specified by `--saved_model_tag_set`. As with the GraphDef, the value for -`input_shapes` is automatically determined whenever possible. - -There is currently no support for MetaGraphDefs without a SignatureDef or for -MetaGraphDefs that use the [`assets/` -directory](https://www.tensorflow.org/guide/saved_model#structure_of_a_savedmodel_directory). - -### Convert a tf.Keras model - -The following example converts a `tf.keras` model into a TensorFlow Lite -Flatbuffer. The `tf.keras` file must contain both the model and the weights. - -``` -tflite_convert \ - --output_file=/tmp/foo.tflite \ - --keras_model_file=/tmp/keras_model.h5 -``` - -## Quantization - -### Convert a TensorFlow GraphDef for quantized inference - -The TensorFlow Lite Converter is compatible with fixed point quantization models -described [here](https://www.tensorflow.org/performance/quantization). These are -float models with -[`FakeQuant*`](https://www.tensorflow.org/api_guides/python/array_ops#Fake_quantization) -ops inserted at the boundaries of fused layers to record min-max range -information. This generates a quantized inference workload that reproduces the -quantization behavior that was used during training. - -The following command generates a quantized TensorFlow Lite FlatBuffer from a -"quantized" TensorFlow GraphDef. 
- -``` -tflite_convert \ - --output_file=/tmp/foo.tflite \ - --graph_def_file=/tmp/some_quantized_graph.pb \ - --inference_type=QUANTIZED_UINT8 \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 \ - --mean_values=128 \ - --std_dev_values=127 -``` - -### Use \"dummy-quantization\" to try out quantized inference on a float graph - -In order to evaluate the possible benefit of generating a quantized graph, the -converter allows "dummy-quantization" on float graphs. The flags -`--default_ranges_min` and `--default_ranges_max` accept plausible values for -the min-max ranges of the values in all arrays that do not have min-max -information. "Dummy-quantization" will produce lower accuracy but will emulate -the performance of a correctly quantized model. - -The example below contains a model using Relu6 activation functions. Therefore, -a reasonable guess is that most activation ranges should be contained in [0, 6]. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ - | tar xzv -C /tmp -tflite_convert \ - --output_file=/tmp/foo.cc \ - --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ - --inference_type=QUANTIZED_UINT8 \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 \ - --default_ranges_min=0 \ - --default_ranges_max=6 \ - --mean_values=128 \ - --std_dev_values=127 -``` - -## Specifying input and output arrays - -### Multiple input arrays - -The flag `input_arrays` takes in a comma-separated list of input arrays as seen -in the example below. This is useful for models or subgraphs with multiple -inputs. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ - | tar xzv -C /tmp -tflite_convert \ - --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ - --output_file=/tmp/foo.tflite \ - --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \ - --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \ - --output_arrays=InceptionV1/Logits/Predictions/Reshape_1 -``` - -Note that `input_shapes` is provided as a colon-separated list. Each input shape -corresponds to the input array at the same position in the respective list. - -### Multiple output arrays - -The flag `output_arrays` takes in a comma-separated list of output arrays as -seen in the example below. This is useful for models or subgraphs with multiple -outputs. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ - | tar xzv -C /tmp -tflite_convert \ - --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ - --output_file=/tmp/foo.tflite \ - --input_arrays=input \ - --output_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu -``` - -### Specifying subgraphs - -Any array in the input file can be specified as an input or output array in -order to extract subgraphs out of an input graph file. The TensorFlow Lite -Converter discards the parts of the graph outside of the specific subgraph. Use -[graph visualizations](#graph-visualizations) to identify the input and output -arrays that make up the desired subgraph. 
- -The follow command shows how to extract a single fused layer out of a TensorFlow -GraphDef. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/inception_v1_2016_08_28_frozen.pb.tar.gz \ - | tar xzv -C /tmp -tflite_convert \ - --graph_def_file=/tmp/inception_v1_2016_08_28_frozen.pb \ - --output_file=/tmp/foo.pb \ - --input_shapes=1,28,28,96:1,28,28,16:1,28,28,192:1,28,28,64 \ - --input_arrays=InceptionV1/InceptionV1/Mixed_3b/Branch_1/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_2/Conv2d_0a_1x1/Relu,InceptionV1/InceptionV1/Mixed_3b/Branch_3/MaxPool_0a_3x3/MaxPool,InceptionV1/InceptionV1/Mixed_3b/Branch_0/Conv2d_0a_1x1/Relu \ - --output_arrays=InceptionV1/InceptionV1/Mixed_3b/concat_v2 -``` - -Note that the final representation in TensorFlow Lite FlatBuffers tends to have -coarser granularity than the very fine granularity of the TensorFlow GraphDef -representation. For example, while a fully-connected layer is typically -represented as at least four separate ops in TensorFlow GraphDef (Reshape, -MatMul, BiasAdd, Relu...), it is typically represented as a single "fused" op -(FullyConnected) in the converter's optimized representation and in the final -on-device representation. As the level of granularity gets coarser, some -intermediate arrays (say, the array between the MatMul and the BiasAdd in the -TensorFlow GraphDef) are dropped. - -When specifying intermediate arrays as `--input_arrays` and `--output_arrays`, -it is desirable (and often required) to specify arrays that are meant to survive -in the final form of the graph, after fusing. These are typically the outputs of -activation functions (since everything in each layer until the activation -function tends to get fused). - -## Logging - - -## Graph visualizations - -The converter can export a graph to the Graphviz Dot format for easy -visualization using either the `--output_format` flag or the -`--dump_graphviz_dir` flag. The subsections below outline the use cases for -each. - -### Using `--output_format=GRAPHVIZ_DOT` - -The first way to get a Graphviz rendering is to pass `GRAPHVIZ_DOT` into -`--output_format`. This results in a plausible visualization of the graph. This -reduces the requirements that exist during conversion from a TensorFlow GraphDef -to a TensorFlow Lite FlatBuffer. This may be useful if the conversion to TFLite -is failing. - -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ - | tar xzv -C /tmp -tflite_convert \ - --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ - --output_file=/tmp/foo.dot \ - --output_format=GRAPHVIZ_DOT \ - --input_shape=1,128,128,3 \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 -``` - -The resulting `.dot` file can be rendered into a PDF as follows: - -``` -dot -Tpdf -O /tmp/foo.dot -``` - -And the resulting `.dot.pdf` can be viewed in any PDF viewer, but we suggest one -with a good ability to pan and zoom across a very large page. Google Chrome does -well in that respect. - -``` -google-chrome /tmp/foo.dot.pdf -``` - -Example PDF files are viewable online in the next section. - -### Using `--dump_graphviz_dir` - -The second way to get a Graphviz rendering is to pass the `--dump_graphviz_dir` -flag, specifying a destination directory to dump Graphviz rendering to. Unlike -the previous approach, this one retains the original output format. This -provides a visualization of the actual graph resulting from a specific -conversion process. 
- -``` -curl https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_0.50_128_frozen.tgz \ - | tar xzv -C /tmp -tflite_convert \ - --graph_def_file=/tmp/mobilenet_v1_0.50_128/frozen_graph.pb \ - --output_file=/tmp/foo.tflite \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 \ - --dump_graphviz_dir=/tmp -``` - -This generates a few files in the destination directory. The two most important -files are `toco_AT_IMPORT.dot` and `/tmp/toco_AFTER_TRANSFORMATIONS.dot`. -`toco_AT_IMPORT.dot` represents the original graph containing only the -transformations done at import time. This tends to be a complex visualization -with limited information about each node. It is useful in situations where a -conversion command fails. - -`toco_AFTER_TRANSFORMATIONS.dot` represents the graph after all transformations -were applied to it, just before it is exported. Typically, this is a much -smaller graph with more information about each node. - -As before, these can be rendered to PDFs: - -``` -dot -Tpdf -O /tmp/toco_*.dot -``` - -Sample output files can be seen here: - -* [toco_AT_IMPORT.dot.pdf](https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AT_IMPORT.dot.pdf) -* [toco_AFTER_TRANSFORMATIONS.dot.pdf](https://storage.googleapis.com/download.tensorflow.org/example_images/toco_AFTER_TRANSFORMATIONS.dot.pdf). - -### Graph "video" logging - -When `--dump_graphviz_dir` is used, one may additionally pass -`--dump_graphviz_video`. This causes a graph visualization to be dumped after -each individual graph transformation, resulting in thousands of files. -Typically, one would then bisect into these files to understand when a given -change was introduced in the graph. - -### Legend for the graph visualizations - -* Operators are red square boxes with the following hues of red: - * Most operators are - bright - red. - * Some typically heavy operators (e.g. Conv) are rendered in a - darker - red. -* Arrays are octogons with the following colors: - * Constant arrays are - blue. - * Activation arrays are gray: - * Internal (intermediate) activation arrays are - light - gray. - * Those activation arrays that are designated as `--input_arrays` or - `--output_arrays` are - dark - gray. - * RNN state arrays are green. Because of the way that the converter - represents RNN back-edges explicitly, each RNN state is represented by a - pair of green arrays: - * The activation array that is the source of the RNN back-edge (i.e. - whose contents are copied into the RNN state array after having been - computed) is - light - green. - * The actual RNN state array is - dark - green. It is the destination of the RNN back-edge updating - it. diff --git a/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md b/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md deleted file mode 100644 index 31200fd657..0000000000 --- a/tensorflow/contrib/lite/toco/g3doc/cmdline_reference.md +++ /dev/null @@ -1,168 +0,0 @@ -# TensorFlow Lite Converter command-line glossary - -This page is complete reference of command-line flags used by the TensorFlow -Lite Converter's command line starting from TensorFlow 1.9 up until the most -recent build of TensorFlow. 
It is complemented by the following other documents: - -* [README](../README.md) -* [Command-line examples](cmdline_examples.md) -* [Python API examples](python_api.md) - -Table of contents: - -* [High-level flags](#high-level-flags) -* [Model flags](#model-flags) -* [Transformation flags](#transformation-flags) -* [Logging flags](#logging-flags) - -## High-level flags - -The following high level flags specify the details of the input and output -files. The flag `--output_file` is always required. Additionally, either -`--graph_def_file`, `--saved_model_dir` or `--keras_model_file` is required. - -* `--output_file`. Type: string. Specifies the full path of the output file. -* `--graph_def_file`. Type: string. Specifies the full path of the input - GraphDef file frozen using - [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py). -* `--saved_model_dir`. Type: string. Specifies the full path to the directory - containing the SavedModel. -* `--keras_model_file`. Type: string. Specifies the full path of the HDF5 file - containing the tf.keras model. -* `--output_format`. Type: string. Default: `TFLITE`. Specifies the format of - the output file. Allowed values: - * `TFLITE`: TensorFlow Lite FlatBuffer format. - * `GRAPHVIZ_DOT`: GraphViz `.dot` format containg a visualization of the - graph after graph transformations. - * Note that passing `GRAPHVIZ_DOT` to `--output_format` leads to loss - of TFLite specific transformations. Therefore, the resulting - visualization may not reflect the final set of graph - transformations. To get a final visualization with all graph - transformations use `--dump_graphviz_dir` instead. - -The following flags specify optional parameters when using SavedModels. - -* `--saved_model_tag_set`. Type: string. Default: - [kSavedModelTagServe](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h). - Specifies a comma-separated set of tags identifying the MetaGraphDef within - the SavedModel to analyze. All tags in the tag set must be specified. -* `--saved_model_signature_key`. Type: string. Default: - [DEFAULT_SERVING_SIGNATURE_DEF_KEY](https://www.tensorflow.org/api_docs/python/tf/saved_model/signature_constants). - Specifies the key identifying the SignatureDef containing inputs and - outputs. - -## Model flags - -*Model flags* provide additional information about the model stored in the input -file. - -* `--input_arrays`. Type: comma-separated list of strings. Specifies the list - of names of input activation tensors. -* `--output_arrays`. Type: comma-separated list of strings. Specifies the list - of names of output activation tensors. - -The following flags define properties of the input tensors. Each item in the -`--input_arrays` flag should correspond to each item in the following flags -based on index. - -* `--input_shapes`. Type: colon-separated list of comma-separated lists of - integers. Each comma-separated list of integers gives the shape of one of - the input arrays specified in - [TensorFlow convention](https://www.tensorflow.org/versions/r1.2/programmers_guide/dims_types#shape). - * Example: `--input_shapes=1,60,80,3` for a typical vision model means a - batch size of 1, an input image height of 60, an input image width of - 80, and an input image depth of 3 (representing RGB channels). - * Example: `--input_arrays=foo,bar --input_shapes=2,3:4,5,6` means "foo" - has a shape of [2, 3] and "bar" has a shape of [4, 5, 6]. 
-* `--std_dev_values`, `--mean_values`. Type: comma-separated list of floats. - These specify the (de-)quantization parameters of the input array, when it - is quantized. This is only needed if `inference_input_type` is - `QUANTIZED_UINT8`. - * The meaning of `mean_values` and `std_dev_values` is as follows: each - quantized value in the quantized input array will be interpreted as a - mathematical real number (i.e. as an input activation value) according - to the following formula: - * `real_value = (quantized_input_value - mean_value) / std_dev_value`. - * When performing float inference (`--inference_type=FLOAT`) on a - quantized input, the quantized input would be immediately dequantized by - the inference code according to the above formula, before proceeding - with float inference. - * When performing quantized inference - (`--inference_type=QUANTIZED_UINT8`), no dequantization is performed by - the inference code. However, the quantization parameters of all arrays, - including those of the input arrays as specified by `mean_value` and - `std_dev_value`, determine the fixed-point multipliers used in the - quantized inference code. `mean_value` must be an integer when - performing quantized inference. - -## Transformation flags - -*Transformation flags* specify options of the transformations to be applied to -the graph, i.e. they specify requested properties that the output file should -have. - -* `--inference_type`. Type: string. Default: `FLOAT`. Data type of all - real-number arrays in the output file except for input arrays (defined by - `--inference_input_type`). Must be `{FLOAT, QUANTIZED_UINT8}`. - - This flag only impacts real-number arrays including float and quantized - arrays. This excludes all other data types including plain integer arrays - and string arrays. Specifically: - - * If `FLOAT`, then real-numbers arrays will be of type float in the output - file. If they were quantized in the input file, then they get - dequantized. - * If `QUANTIZED_UINT8`, then real-numbers arrays will be quantized as - uint8 in the output file. If they were float in the input file, then - they get quantized. - -* `--inference_input_type`. Type: string. Data type of a real-number input - array in the output file. By default the `--inference_type` is used as type - of all of the input arrays. Flag is primarily intended for generating a - float-point graph with a quantized input array. A Dequantized operator is - added immediately after the input array. Must be `{FLOAT, QUANTIZED_UINT8}`. - - The flag is typically used for vision models taking a bitmap as input but - requiring floating-point inference. For such image models, the uint8 input - is quantized and the quantization parameters used for such input arrays are - their `mean_value` and `std_dev_value` parameters. - -* `--default_ranges_min`, `--default_ranges_max`. Type: floating-point. - Default value for the (min, max) range values used for all arrays without a - specified range. Allows user to proceed with quantization of non-quantized - or incorrectly-quantized input files. These flags produce models with low - accuracy. They are intended for easy experimentation with quantization via - "dummy quantization". - -* `--drop_control_dependency`. Type: boolean. Default: True. Indicates whether - to drop control dependencies silently. This is due to TensorFlow Lite not - supporting control dependencies. - -* `--reorder_across_fake_quant`. Type: boolean. Default: False. 
-
-## Logging flags
-
-The following flags generate visualizations of the graph as
-[GraphViz](https://www.graphviz.org/) `.dot` files at various points during
-graph transformations:
-
-* `--dump_graphviz_dir`. Type: string. Specifies the full path of the
-  directory to output GraphViz `.dot` files. Dumps the graph immediately
-  after it is read and again after all of the transformations have completed
-  (see the example after this list).
-* `--dump_graphviz_video`. Type: boolean. Outputs GraphViz after every graph
-  transformation. Requires `--dump_graphviz_dir` to be specified.
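-
-For instance, a conversion that also dumps `.dot` visualizations might look
-like the following sketch; the paths and array names are placeholders:
-
-```
-tflite_convert \
-  --output_file=/tmp/foo.tflite \
-  --graph_def_file=/tmp/frozen_graph.pb \
-  --input_arrays=input \
-  --output_arrays=output \
-  --dump_graphviz_dir=/tmp/dump
-```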
diff --git a/tensorflow/contrib/lite/toco/g3doc/python_api.md b/tensorflow/contrib/lite/toco/g3doc/python_api.md
deleted file mode 100644
index 1f741360c6..0000000000
--- a/tensorflow/contrib/lite/toco/g3doc/python_api.md
+++ /dev/null
@@ -1,279 +0,0 @@
-# TensorFlow Lite Converter & Interpreter Python API reference
-
-This page provides examples of how to use the TensorFlow Lite Converter and
-the TensorFlow Lite interpreter through the Python API. It is complemented by
-the following documents:
-
-* [README](../README.md)
-* [Command-line examples](cmdline_examples.md)
-* [Command-line glossary](cmdline_reference.md)
-
-Table of contents:
-
-* [High-level overview](#high-level-overview)
-* [API](#api)
-* [Basic examples](#basic)
-  * [Exporting a GraphDef from tf.Session](#basic-graphdef-sess)
-  * [Exporting a GraphDef from file](#basic-graphdef-file)
-  * [Exporting a SavedModel](#basic-savedmodel)
-  * [Exporting a tf.keras File](#basic-keras-file)
-* [Complex examples](#complex)
-  * [Exporting a quantized GraphDef](#complex-quant)
-* [TensorFlow Lite Python interpreter](#interpreter)
-  * [Using the interpreter from a model file](#interpreter-file)
-  * [Using the interpreter from model data](#interpreter-data)
-* [Additional instructions](#additional-instructions)
-  * [Build from source code](#latest-package)
-  * [Converting models from TensorFlow 1.9 to TensorFlow 1.11](#pre-tensorflow-1.11)
-  * [Converting models prior to TensorFlow 1.9](#pre-tensorflow-1.9)
-
-## High-level overview
-
-While the TensorFlow Lite Converter can be used from the command line, it is
-often convenient to use it in a Python script as part of the model
-development pipeline. This allows you to know early on that you are designing
-a model that can be targeted to mobile devices.
-
-## API
-
-The API for converting TensorFlow models to TensorFlow Lite as of TensorFlow
-1.9 is `tf.contrib.lite.TFLiteConverter`. The API for the TensorFlow Lite
-Python interpreter is `tf.contrib.lite.Interpreter`.
-
-Note: See the "Additional instructions" sections for converting TensorFlow
-models to TensorFlow Lite
-[from TensorFlow 1.9 to TensorFlow 1.11](#pre-tensorflow-1.11) and
-[prior to TensorFlow 1.9](#pre-tensorflow-1.9).
-
-`TFLiteConverter` provides class methods based on the original format of the
-model: `TFLiteConverter.from_session()` is available for GraphDefs,
-`TFLiteConverter.from_saved_model()` is available for SavedModels, and
-`TFLiteConverter.from_keras_model_file()` is available for `tf.keras` files.
-Example usages for simple floating-point models are shown in
-[Basic examples](#basic). Example usages for more complex models are shown in
-[Complex examples](#complex).
-
-## Basic examples
-
-The following section shows examples of how to convert a basic floating-point
-model from each of the supported data formats into a TensorFlow Lite
-FlatBuffer.
-
-### Exporting a GraphDef from tf.Session
-
-The following example shows how to convert a TensorFlow GraphDef into a
-TensorFlow Lite FlatBuffer from a `tf.Session` object.
-
-```python
-import tensorflow as tf
-
-img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-var = tf.get_variable("weights", dtype=tf.float32, shape=(1, 64, 64, 3))
-val = img + var
-out = tf.identity(val, name="out")
-
-with tf.Session() as sess:
-  sess.run(tf.global_variables_initializer())
-  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
-  tflite_model = converter.convert()
-  open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-### Exporting a GraphDef from file
-
-The following example shows how to convert a TensorFlow GraphDef into a
-TensorFlow Lite FlatBuffer when the GraphDef is stored in a file. Both `.pb`
-and `.pbtxt` files are accepted.
-
-The example uses
-[Mobilenet_1.0_224](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz).
-The function only supports GraphDefs frozen using
-[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py).
-
-```python
-import tensorflow as tf
-
-graph_def_file = "/path/to/Downloads/mobilenet_v1_1.0_224/frozen_graph.pb"
-input_arrays = ["input"]
-output_arrays = ["MobilenetV1/Predictions/Softmax"]
-
-converter = tf.contrib.lite.TFLiteConverter.from_frozen_graph(
-    graph_def_file, input_arrays, output_arrays)
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-### Exporting a SavedModel
-
-The following example shows how to convert a SavedModel into a TensorFlow
-Lite FlatBuffer.
-
-```python
-import tensorflow as tf
-
-converter = tf.contrib.lite.TFLiteConverter.from_saved_model(saved_model_dir)
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-For more complex SavedModels, the optional parameters that can be passed into
-`TFLiteConverter.from_saved_model()` are `input_arrays`, `input_shapes`,
-`output_arrays`, `tag_set` and `signature_key`. Details of each parameter are
-available by running `help(tf.contrib.lite.TFLiteConverter)`.
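-
-As a sketch, a SavedModel that requires explicit inputs, outputs, or tags
-might be converted as follows; the directory, tag set, and tensor names here
-are hypothetical:
-
-```python
-import tensorflow as tf
-
-# Hypothetical SavedModel whose signature uses these names and tags.
-converter = tf.contrib.lite.TFLiteConverter.from_saved_model(
-    "/tmp/saved_model",
-    input_arrays=["Placeholder"],
-    input_shapes={"Placeholder": [1, 64, 64, 3]},
-    output_arrays=["out"],
-    tag_set=set(["serve"]),
-    signature_key="serving_default")
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```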
-
-### Exporting a tf.keras File
-
-The following example shows how to convert a `tf.keras` model into a
-TensorFlow Lite FlatBuffer. This example requires
-[`h5py`](http://docs.h5py.org/en/latest/build.html) to be installed.
-
-```python
-import tensorflow as tf
-
-converter = tf.contrib.lite.TFLiteConverter.from_keras_model_file("keras_model.h5")
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-The `tf.keras` file must contain both the model and the weights. A
-comprehensive example including model construction can be seen below.
-
-```python
-import numpy as np
-import tensorflow as tf
-
-# Generate tf.keras model.
-model = tf.keras.models.Sequential()
-model.add(tf.keras.layers.Dense(2, input_shape=(3,)))
-model.add(tf.keras.layers.RepeatVector(3))
-model.add(tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(3)))
-model.compile(loss=tf.keras.losses.MSE,
-              optimizer=tf.keras.optimizers.RMSprop(lr=0.0001),
-              metrics=[tf.keras.metrics.categorical_accuracy],
-              sample_weight_mode='temporal')
-
-x = np.random.random((1, 3))
-y = np.random.random((1, 3, 3))
-model.train_on_batch(x, y)
-model.predict(x)
-
-# Save tf.keras model in HDF5 format.
-keras_file = "keras_model.h5"
-tf.keras.models.save_model(model, keras_file)
-
-# Convert to TensorFlow Lite model.
-converter = tf.contrib.lite.TFLiteConverter.from_keras_model_file(keras_file)
-tflite_model = converter.convert()
-open("converted_model.tflite", "wb").write(tflite_model)
-```
-
-## Complex examples
-
-For models where the default values of the attributes are not sufficient, the
-attributes' values should be set before calling `convert()`. Constants such
-as `QUANTIZED_UINT8` are referenced via `tf.contrib.lite.constants`, as shown
-below. Run `help(tf.contrib.lite.TFLiteConverter)` in the Python terminal for
-detailed documentation on the attributes.
-
-Although the examples are demonstrated on GraphDefs containing only
-constants, the same logic can be applied irrespective of the input data
-format.
-
-### Exporting a quantized GraphDef
-
-The following example shows how to convert a quantized model into a
-TensorFlow Lite FlatBuffer.
-
-```python
-import tensorflow as tf
-
-img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
-val = img + const
-out = tf.fake_quant_with_min_max_args(val, min=0., max=1., name="output")
-
-with tf.Session() as sess:
-  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
-  converter.inference_type = tf.contrib.lite.constants.QUANTIZED_UINT8
-  input_arrays = converter.get_input_arrays()
-  converter.quantized_input_stats = {input_arrays[0]: (0., 1.)}  # mean, std_dev
-  tflite_model = converter.convert()
-  open("converted_model.tflite", "wb").write(tflite_model)
-```
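-
-As a related sketch, for a float graph without `FakeQuant` nodes the
-converter's `default_ranges_stats` attribute can supply a fallback
-(min, max) range ("dummy quantization"), mirroring the
-`--default_ranges_min`/`--default_ranges_max` command-line flags. This is an
-illustrative example for experimentation only; expect low accuracy:
-
-```python
-import tensorflow as tf
-
-img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
-val = img + const
-out = tf.identity(val, name="out")  # no FakeQuant, so no recorded ranges
-
-with tf.Session() as sess:
-  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
-  converter.inference_type = tf.contrib.lite.constants.QUANTIZED_UINT8
-  input_arrays = converter.get_input_arrays()
-  converter.quantized_input_stats = {input_arrays[0]: (0., 1.)}  # mean, std_dev
-  # Fallback (min, max) range for all arrays without recorded ranges.
-  converter.default_ranges_stats = (0, 6)
-  tflite_model = converter.convert()
-  open("converted_model.tflite", "wb").write(tflite_model)
-```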
-
-## TensorFlow Lite Python interpreter
-
-### Using the interpreter from a model file
-
-The following example shows how to use the TensorFlow Lite Python interpreter
-when provided a TensorFlow Lite FlatBuffer file. The example also
-demonstrates how to run inference on random input data. Run
-`help(tf.contrib.lite.Interpreter)` in the Python terminal to get detailed
-documentation on the interpreter.
-
-```python
-import numpy as np
-import tensorflow as tf
-
-# Load TFLite model and allocate tensors.
-interpreter = tf.contrib.lite.Interpreter(model_path="converted_model.tflite")
-interpreter.allocate_tensors()
-
-# Get input and output tensors.
-input_details = interpreter.get_input_details()
-output_details = interpreter.get_output_details()
-
-# Test model on random input data.
-input_shape = input_details[0]['shape']
-input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
-interpreter.set_tensor(input_details[0]['index'], input_data)
-
-interpreter.invoke()
-output_data = interpreter.get_tensor(output_details[0]['index'])
-print(output_data)
-```
-
-### Using the interpreter from model data
-
-The following example shows how to use the TensorFlow Lite Python interpreter
-when starting with a TensorFlow Lite FlatBuffer model that is already in
-memory. The example shows an end-to-end use case, starting from building the
-TensorFlow model.
-
-```python
-import numpy as np
-import tensorflow as tf
-
-img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
-const = tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.])
-val = img + const
-out = tf.identity(val, name="out")
-
-with tf.Session() as sess:
-  converter = tf.contrib.lite.TFLiteConverter.from_session(sess, [img], [out])
-  tflite_model = converter.convert()
-
-# Load TFLite model and allocate tensors.
-interpreter = tf.contrib.lite.Interpreter(model_content=tflite_model)
-interpreter.allocate_tensors()
-```
-
-## Additional instructions
-
-### Build from source code
-
-In order to run the latest version of the TensorFlow Lite Converter Python
-API, either install the nightly build with
-[pip](https://www.tensorflow.org/install/pip) (recommended) or
-[Docker](https://www.tensorflow.org/install/docker), or
-[build the pip package from source](https://www.tensorflow.org/install/source).
-
-### Converting models from TensorFlow 1.9 to TensorFlow 1.11
-
-To convert TensorFlow models to TensorFlow Lite in TensorFlow 1.9 through
-TensorFlow 1.11, use `TocoConverter`. `TocoConverter` is semantically
-identical to `TFLiteConverter` and provides the same class methods (for
-example, `tf.contrib.lite.TocoConverter.from_session()`).
-
-### Converting models prior to TensorFlow 1.9
-
-To convert TensorFlow models to TensorFlow Lite in TensorFlow 1.7 and
-TensorFlow 1.8, use the `toco_convert` function. Run
-`help(tf.contrib.lite.toco_convert)` to get details about accepted
-parameters.
diff --git a/tensorflow/contrib/lite/toco/g3doc/toco_landscape.svg b/tensorflow/contrib/lite/toco/g3doc/toco_landscape.svg
deleted file mode 100644
index 335debde57..0000000000
--- a/tensorflow/contrib/lite/toco/g3doc/toco_landscape.svg
+++ /dev/null
@@ -1 +0,0 @@
-
\ No newline at end of file
--
cgit v1.2.3