author | Billy Lamberta <blamb@google.com> | 2018-07-24 11:52:23 -0700
---|---|---
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-07-24 11:58:53 -0700
commit | ff2aa1b59d4a111af094c0c7724e453eefe1f3b7 (patch) |
tree | cd13149671e53a3b28e9a2fdb012310a46de03d9 /tensorflow/docs_src |
parent | badf913c0a2f83ca933b8fe73a29f7dd5d2bc5ce (diff) |
Setup for TFLite subsite
PiperOrigin-RevId: 205866236
Diffstat (limited to 'tensorflow/docs_src')
mode | file | lines changed
---|---|---
-rw-r--r-- | tensorflow/docs_src/mobile/README.md | 3
-rw-r--r-- | tensorflow/docs_src/mobile/android_build.md | 177
-rw-r--r-- | tensorflow/docs_src/mobile/index.md | 33
-rw-r--r-- | tensorflow/docs_src/mobile/ios_build.md | 107
-rw-r--r-- | tensorflow/docs_src/mobile/leftnav_files | 15
-rw-r--r-- | tensorflow/docs_src/mobile/linking_libs.md | 243
-rw-r--r-- | tensorflow/docs_src/mobile/mobile_intro.md | 248
-rw-r--r-- | tensorflow/docs_src/mobile/optimizing.md | 499
-rw-r--r-- | tensorflow/docs_src/mobile/prepare_models.md | 301
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/demo_android.md | 146
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/demo_ios.md | 68
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/devguide.md | 232
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/index.md | 201
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/performance.md | 174
14 files changed, 3 insertions, 2444 deletions
diff --git a/tensorflow/docs_src/mobile/README.md b/tensorflow/docs_src/mobile/README.md new file mode 100644 index 0000000000..ecf4267265 --- /dev/null +++ b/tensorflow/docs_src/mobile/README.md @@ -0,0 +1,3 @@ +# TF Lite subsite + +This subsite directory lives in [tensorflow/contrib/lite/g3doc](../../contrib/lite/g3doc/). diff --git a/tensorflow/docs_src/mobile/android_build.md b/tensorflow/docs_src/mobile/android_build.md deleted file mode 100644 index f4b07db459..0000000000 --- a/tensorflow/docs_src/mobile/android_build.md +++ /dev/null @@ -1,177 +0,0 @@ -# Building TensorFlow on Android - -To get you started working with TensorFlow on Android, we'll walk through two -ways to build our TensorFlow mobile demos and deploying them on an Android -device. The first is Android Studio, which lets you build and deploy in an -IDE. The second is building with Bazel and deploying with ADB on the command -line. - -Why choose one or the other of these methods? - -The simplest way to use TensorFlow on Android is to use Android Studio. If you -aren't planning to customize your TensorFlow build at all, or if you want to use -Android Studio's editor and other features to build an app and just want to add -TensorFlow to it, we recommend using Android Studio. - -If you are using custom ops, or have some other reason to build TensorFlow from -scratch, scroll down and see our instructions -for [building the demo with Bazel](#build_the_demo_using_bazel). - -## Build the demo using Android Studio - -**Prerequisites** - -If you haven't already, do the following two things: - -- Install [Android Studio](https://developer.android.com/studio/index.html), - following the instructions on their website. - -- Clone the TensorFlow repository from GitHub: - - git clone https://github.com/tensorflow/tensorflow - -**Building** - -1. Open Android Studio, and from the Welcome screen, select **Open an existing - Android Studio project**. - -2. From the **Open File or Project** window that appears, navigate to and select - the `tensorflow/examples/android` directory from wherever you cloned the - TensorFlow GitHub repo. Click OK. - - If it asks you to do a Gradle Sync, click OK. - - You may also need to install various platforms and tools, if you get - errors like "Failed to find target with hash string 'android-23' and similar. - -3. Open the `build.gradle` file (you can go to **1:Project** in the side panel - and find it under the **Gradle Scripts** zippy under **Android**). Look for - the `nativeBuildSystem` variable and set it to `none` if it isn't already: - - // set to 'bazel', 'cmake', 'makefile', 'none' - def nativeBuildSystem = 'none' - -4. Click the *Run* button (the green arrow) or select *Run > Run 'android'* from the - top menu. You may need to rebuild the project using *Build > Rebuild Project*. - - If it asks you to use Instant Run, click **Proceed Without Instant Run**. - - Also, you need to have an Android device plugged in with developer options - enabled at this - point. See [here](https://developer.android.com/studio/run/device.html) for - more details on setting up developer devices. - -This installs three apps on your phone that are all part of the TensorFlow -Demo. See [Android Sample Apps](#android_sample_apps) for more information about -them. 
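If you want to confirm that the demo actually landed on your device, a quick `adb` check is enough. This is just a sketch: the package name `org.tensorflow.demo` is what the example project has commonly used, so treat it as an assumption and adjust for your build. Note that the three demo icons come from a single APK, so you should expect to see one package:

    adb devices                                    # device must be listed and authorized
    adb shell pm list packages | grep tensorflow   # expect something like package:org.tensorflow.demo
    adb logcat | grep -i tensorflow                # watch log output while the demos run
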
- -## Adding TensorFlow to your apps using Android Studio - -To add TensorFlow to your own apps on Android, the simplest way is to add the -following lines to your Gradle build file: - - allprojects { - repositories { - jcenter() - } - } - - dependencies { - compile 'org.tensorflow:tensorflow-android:+' - } - -This automatically downloads the latest stable version of TensorFlow as an AAR -and installs it in your project. - -## Build the demo using Bazel - -Another way to use TensorFlow on Android is to build an APK -using [Bazel](https://bazel.build/) and load it onto your device -using [ADB](https://developer.android.com/studio/command-line/adb.html). This -requires some knowledge of build systems and Android developer tools, but we'll -guide you through the basics here. - -- First, follow our instructions for @{$install/install_sources$installing from sources}. - This will also guide you through installing Bazel and cloning the - TensorFlow code. - -- Download the Android [SDK](https://developer.android.com/studio/index.html) - and [NDK](https://developer.android.com/ndk/downloads/index.html) if you do - not already have them. You need at least version 12b of the NDK, and 23 of the - SDK. - -- In your copy of the TensorFlow source, update the - [WORKSPACE](https://github.com/tensorflow/tensorflow/blob/master/WORKSPACE) - file with the location of your SDK and NDK, where it says <PATH_TO_NDK> - and <PATH_TO_SDK>. - -- Run Bazel to build the demo APK: - - bazel build -c opt //tensorflow/examples/android:tensorflow_demo - -- Use [ADB](https://developer.android.com/studio/command-line/adb.html#move) to - install the APK onto your device: - - adb install -r bazel-bin/tensorflow/examples/android/tensorflow_demo.apk - -Note: In general when compiling for Android with Bazel you need -`--config=android` on the Bazel command line, though in this case this -particular example is Android-only, so you don't need it here. - -This installs three apps on your phone that are all part of the TensorFlow -Demo. See [Android Sample Apps](#android_sample_apps) for more information about -them. - -## Android Sample Apps - -The -[Android example code](https://www.tensorflow.org/code/tensorflow/examples/android/) is -a single project that builds and installs three sample apps which all use the -same underlying code. The sample apps all take video input from a phone's -camera: - -- **TF Classify** uses the Inception v3 model to label the objects it’s pointed - at with classes from Imagenet. There are only 1,000 categories in Imagenet, - which misses most everyday objects and includes many things you’re unlikely to - encounter often in real life, so the results can often be quite amusing. For - example there’s no ‘person’ category, so instead it will often guess things it - does know that are often associated with pictures of people, like a seat belt - or an oxygen mask. If you do want to customize this example to recognize - objects you care about, you can use - the - [TensorFlow for Poets codelab](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) as - an example for how to train a model based on your own data. - -- **TF Detect** uses a multibox model to try to draw bounding boxes around the - locations of people in the camera. These boxes are annotated with the - confidence for each detection result. Results will not be perfect, as this - kind of object detection is still an active research topic. 
The demo also - includes optical tracking for when objects move between frames, which runs - more frequently than the TensorFlow inference. This improves the user - experience since the apparent frame rate is faster, but it also gives the - ability to estimate which boxes refer to the same object between frames, which - is important for counting objects over time. - -- **TF Stylize** implements a real-time style transfer algorithm on the camera - feed. You can select which styles to use and mix between them using the - palette at the bottom of the screen, and also switch out the resolution of the - processing to go higher or lower rez. - -When you build and install the demo, you'll see three app icons on your phone, -one for each of the demos. Tapping on them should open up the app and let you -explore what they do. You can enable profiling statistics on-screen by tapping -the volume up button while they’re running. - -### Android Inference Library - -Because Android apps need to be written in Java, and core TensorFlow is in C++, -TensorFlow has a JNI library to interface between the two. Its interface is aimed -only at inference, so it provides the ability to load a graph, set up inputs, -and run the model to calculate particular outputs. You can see the full -documentation for the minimal set of methods in -[TensorFlowInferenceInterface.java](https://www.tensorflow.org/code/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java) - -The demos applications use this interface, so they’re a good place to look for -example usage. You can download prebuilt binary jars -at -[ci.tensorflow.org](https://ci.tensorflow.org/view/Nightly/job/nightly-android/). diff --git a/tensorflow/docs_src/mobile/index.md b/tensorflow/docs_src/mobile/index.md deleted file mode 100644 index 6032fcad02..0000000000 --- a/tensorflow/docs_src/mobile/index.md +++ /dev/null @@ -1,33 +0,0 @@ -# Overview - -TensorFlow was designed to be a good deep learning solution for mobile -platforms. Currently we have two solutions for deploying machine learning -applications on mobile and embedded devices: -@{$mobile/mobile_intro$TensorFlow for Mobile} and @{$mobile/tflite$TensorFlow Lite}. - -## TensorFlow Lite versus TensorFlow Mobile - -Here are a few of the differences between the two: - -- TensorFlow Lite is an evolution of TensorFlow Mobile. In most cases, apps - developed with TensorFlow Lite will have a smaller binary size, fewer - dependencies, and better performance. - -- TensorFlow Lite supports only a limited set of operators, so not all models - will work on it by default. TensorFlow for Mobile has a fuller set of - supported functionality. - -TensorFlow Lite provides better performance and a small binary size on mobile -platforms as well as the ability to leverage hardware acceleration if available -on their platforms. In addition, it has many fewer dependencies so it can be -built and hosted on simpler, more constrained device scenarios. TensorFlow Lite -also allows targeting accelerators through the [Neural Networks -API](https://developer.android.com/ndk/guides/neuralnetworks/index.html). - -TensorFlow Lite currently has coverage for a limited set of operators. While -TensorFlow for Mobile supports only a constrained set of ops by default, in -principle if you use an arbitrary operator in TensorFlow, it can be customized -to build that kernel. Thus use cases which are not currently supported by -TensorFlow Lite should continue to use TensorFlow for Mobile. 
As TensorFlow Lite -evolves, it will gain additional operators, and the decision will be easier to -make. diff --git a/tensorflow/docs_src/mobile/ios_build.md b/tensorflow/docs_src/mobile/ios_build.md deleted file mode 100644 index 4c84a1214a..0000000000 --- a/tensorflow/docs_src/mobile/ios_build.md +++ /dev/null @@ -1,107 +0,0 @@ -# Building TensorFlow on iOS - -## Using CocoaPods - -The simplest way to get started with TensorFlow on iOS is using the CocoaPods -package management system. You can add the `TensorFlow-experimental` pod to your -Podfile, which installs a universal binary framework. This makes it easy to get -started but has the disadvantage of being hard to customize, which is important -in case you want to shrink your binary size. If you do need the ability to -customize your libraries, see later sections on how to do that. - -## Creating your own app - -If you'd like to add TensorFlow capabilities to your own app, do the following: - -- Create your own app or load your already-created app in XCode. - -- Add a file named Podfile at the project root directory with the following content: - - target 'YourProjectName' - pod 'TensorFlow-experimental' - -- Run `pod install` to download and install the `TensorFlow-experimental` pod. - -- Open `YourProjectName.xcworkspace` and add your code. - -- In your app's **Build Settings**, make sure to add `$(inherited)` to the - **Other Linker Flags**, and **Header Search Paths** sections. - -## Running the Samples - -You'll need Xcode 7.3 or later to run our iOS samples. - -There are currently three examples: simple, benchmark, and camera. For now, you -can download the sample code by cloning the main tensorflow repository (we are -planning to make the samples available as a separate repository later). - -From the root of the tensorflow folder, download [Inception -v1](https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip), -and extract the label and graph files into the data folders inside both the -simple and camera examples using these steps: - - mkdir -p ~/graphs - curl -o ~/graphs/inception5h.zip \ - https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \ - && unzip ~/graphs/inception5h.zip -d ~/graphs/inception5h - cp ~/graphs/inception5h/* tensorflow/examples/ios/benchmark/data/ - cp ~/graphs/inception5h/* tensorflow/examples/ios/camera/data/ - cp ~/graphs/inception5h/* tensorflow/examples/ios/simple/data/ - -Change into one of the sample directories, download the -[Tensorflow-experimental](https://cocoapods.org/pods/TensorFlow-experimental) -pod, and open the Xcode workspace. Note that installing the pod can take a long -time since it is big (~450MB). If you want to run the simple example, then: - - cd tensorflow/examples/ios/simple - pod install - open tf_simple_example.xcworkspace # note .xcworkspace, not .xcodeproj - # this is created by pod install - -Run the simple app in the XCode simulator. You should see a single-screen app -with a **Run Model** button. Tap that, and you should see some debug output -appear below indicating that the example Grace Hopper image in directory data -has been analyzed, with a military uniform recognized. - -Run the other samples using the same process. The camera example requires a real -device connected. Once you build and run that, you should get a live camera view -that you can point at objects to get real-time recognition results. 
- -### iOS Example details - -There are three demo applications for iOS, all defined in Xcode projects inside -[tensorflow/examples/ios](https://www.tensorflow.org/code/tensorflow/examples/ios/). - -- **Simple**: This is a minimal example showing how to load and run a TensorFlow - model in as few lines as possible. It just consists of a single view with a - button that executes the model loading and inference when its pressed. - -- **Camera**: This is very similar to the Android TF Classify demo. It loads - Inception v3 and outputs its best label estimate for what’s in the live camera - view. As with the Android version, you can train your own custom model using - TensorFlow for Poets and drop it into this example with minimal code changes. - -- **Benchmark**: is quite close to Simple, but it runs the graph repeatedly and - outputs similar statistics to the benchmark tool on Android. - - -### Troubleshooting - -- Make sure you use the TensorFlow-experimental pod (and not TensorFlow). - -- The TensorFlow-experimental pod is current about ~450MB. The reason it is so - big is because we are bundling multiple platforms, and the pod includes all - TensorFlow functionality (e.g. operations). The final app size after build is - substantially smaller though (~25MB). Working with the complete pod is - convenient during development, but see below section on how you can build your - own custom TensorFlow library to reduce the size. - -## Building the TensorFlow iOS libraries from source - -While Cocoapods is the quickest and easiest way of getting started, you sometimes -need more flexibility to determine which parts of TensorFlow your app should be -shipped with. For such cases, you can build the iOS libraries from the -sources. [This -guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios#building-the-tensorflow-ios-libraries-from-source) -contains detailed instructions on how to do that. - diff --git a/tensorflow/docs_src/mobile/leftnav_files b/tensorflow/docs_src/mobile/leftnav_files deleted file mode 100644 index 97340ef7e1..0000000000 --- a/tensorflow/docs_src/mobile/leftnav_files +++ /dev/null @@ -1,15 +0,0 @@ -index.md -### TensorFlow Lite -tflite/index.md -tflite/devguide.md -tflite/demo_android.md -tflite/demo_ios.md -tflite/performance.md ->>> -### TensorFlow Mobile -mobile_intro.md -android_build.md -ios_build.md -linking_libs.md -prepare_models.md -optimizing.md diff --git a/tensorflow/docs_src/mobile/linking_libs.md b/tensorflow/docs_src/mobile/linking_libs.md deleted file mode 100644 index efef5dd0da..0000000000 --- a/tensorflow/docs_src/mobile/linking_libs.md +++ /dev/null @@ -1,243 +0,0 @@ -# Integrating TensorFlow libraries - -Once you have made some progress on a model that addresses the problem you’re -trying to solve, it’s important to test it out inside your application -immediately. There are often unexpected differences between your training data -and what users actually encounter in the real world, and getting a clear picture -of the gap as soon as possible improves the product experience. - -This page talks about how to integrate the TensorFlow libraries into your own -mobile applications, once you have already successfully built and deployed the -TensorFlow mobile demo apps. - -## Linking the library - -After you've managed to build the examples, you'll probably want to call -TensorFlow from one of your existing applications. 
The very easiest way to do -this is to use the Pod installation steps described -@{$mobile/ios_build#using_cocoapods$here}, but if you want to build TensorFlow -from source (for example to customize which operators are included) you'll need -to break out TensorFlow as a framework, include the right header files, and link -against the built libraries and dependencies. - -### Android - -For Android, you just need to link in a Java library contained in a JAR file -called `libandroid_tensorflow_inference_java.jar`. There are three ways to -include this functionality in your program: - -1. Include the jcenter AAR which contains it, as in this - [example app](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/tfmobile/build.gradle#L59-L65) - -2. Download the nightly precompiled version from -[ci.tensorflow.org](http://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/). - -3. Build the JAR file yourself using the instructions [in our Android GitHub repo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/android) - -### iOS - -Pulling in the TensorFlow libraries on iOS is a little more complicated. Here is -a checklist of what you’ll need to do to your iOS app: - -- Link against tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a, usually - by adding `-L/your/path/tensorflow/contrib/makefile/gen/lib/` and - `-ltensorflow-core` to your linker flags. - -- Link against the generated protobuf libraries by adding - `-L/your/path/tensorflow/contrib/makefile/gen/protobuf_ios/lib` and - `-lprotobuf` and `-lprotobuf-lite` to your command line. - -- For the include paths, you need the root of your TensorFlow source folder as - the first entry, followed by - `tensorflow/contrib/makefile/downloads/protobuf/src`, - `tensorflow/contrib/makefile/downloads`, - `tensorflow/contrib/makefile/downloads/eigen`, and - `tensorflow/contrib/makefile/gen/proto`. - -- Make sure your binary is built with `-force_load` (or the equivalent on your - platform), aimed at the TensorFlow library to ensure that it’s linked - correctly. More detail on why this is necessary can be found in the next - section, [Global constructor magic](#global_constructor_magic). On Linux-like - platforms, you’ll need different flags, more like - `-Wl,--allow-multiple-definition -Wl,--whole-archive`. - -You’ll also need to link in the Accelerator framework, since this is used to -speed up some of the operations. - -## Global constructor magic - -One of the subtlest problems you may run up against is the “No session factory -registered for the given session options” error when trying to call TensorFlow -from your own application. To understand why this is happening and how to fix -it, you need to know a bit about the architecture of TensorFlow. - -The framework is designed to be very modular, with a thin core and a large -number of specific objects that are independent and can be mixed and matched as -needed. To enable this, the coding pattern in C++ had to let modules easily -notify the framework about the services they offer, without requiring a central -list that has to be updated separately from each implementation. It also had to -allow separate libraries to add their own implementations without needing a -recompile of the core. - -To achieve this capability, TensorFlow uses a registration pattern in a lot of -places. 
In the code, it looks like this: - - class MulKernel : OpKernel { - Status Compute(OpKernelContext* context) { … } - }; - REGISTER_KERNEL(MulKernel, “Mul”); - -This would be in a standalone `.cc` file linked into your application, either -as part of the main set of kernels or as a separate custom library. The magic -part is that the `REGISTER_KERNEL()` macro is able to inform the core of -TensorFlow that it has an implementation of the Mul operation, so that it can be -called in any graphs that require it. - -From a programming point of view, this setup is very convenient. The -implementation and registration code live in the same file, and adding new -implementations is as simple as compiling and linking it in. The difficult part -comes from the way that the `REGISTER_KERNEL()` macro is implemented. C++ -doesn’t offer a good mechanism for doing this sort of registration, so we have -to resort to some tricky code. Under the hood, the macro is implemented so that -it produces something like this: - - class RegisterMul { - public: - RegisterMul() { - global_kernel_registry()->Register(“Mul”, [](){ - return new MulKernel() - }); - } - }; - RegisterMul g_register_mul; - -This sets up a class `RegisterMul` with a constructor that tells the global -kernel registry what function to call when somebody asks it how to create a -“Mul” kernel. Then there’s a global object of that class, and so the constructor -should be called at the start of any program. - -While this may sound sensible, the unfortunate part is that the global object -that’s defined is not used by any other code, so linkers not designed with this -in mind will decide that it can be deleted. As a result, the constructor is -never called, and the class is never registered. All sorts of modules use this -pattern in TensorFlow, and it happens that `Session` implementations are the -first to be looked for when the code is run, which is why it shows up as the -characteristic error when this problem occurs. - -The solution is to force the linker to not strip any code from the library, even -if it believes it’s unused. On iOS, this step can be accomplished with the -`-force_load` flag, specifying a library path, and on Linux you need -`--whole-archive`. These persuade the linker to not be as aggressive about -stripping, and should retain the globals. - -The actual implementation of the various `REGISTER_*` macros is a bit more -complicated in practice, but they all suffer the same underlying problem. If -you’re interested in how they work, [op_kernel.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_kernel.h#L1091) -is a good place to start investigating. - -## Protobuf problems - -TensorFlow relies on -the [Protocol Buffer](https://developers.google.com/protocol-buffers/) library, -commonly known as protobuf. This library takes definitions of data structures -and produces serialization and access code for them in a variety of -languages. The tricky part is that this generated code needs to be linked -against shared libraries for the exact same version of the framework that was -used for the generator. This can be an issue when `protoc`, the tool used to -generate the code, is from a different version of protobuf than the libraries in -the standard linking and include paths. For example, you might be using a copy -of `protoc` that was built locally in `~/projects/protobuf-3.0.1.a`, but you have -libraries installed at `/usr/local/lib` and `/usr/local/include` that are from -3.0.0. 
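Before digging through build errors, it can save time to confirm which generator and which runtime headers your build is actually picking up. A minimal sketch, assuming a Unix-like system with protobuf headers under `/usr/local/include` (adjust paths for your setup):

    which protoc
    protoc --version   # version of the code generator on your PATH
    # version of the runtime headers you compile and link against:
    grep GOOGLE_PROTOBUF_VERSION /usr/local/include/google/protobuf/stubs/common.h

If the generator and the headers report different versions, that mismatch is the likely cause of the problems described next.
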
- -The symptoms of this issue are errors during the compilation or linking phases -with protobufs. Usually, the build tools take care of this, but if you’re using -the makefile, make sure you’re building the protobuf library locally and using -it, as shown in [this Makefile](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/Makefile#L18). - -Another situation that can cause problems is when protobuf headers and source -files need to be generated as part of the build process. This process makes -building more complex, since the first phase has to be a pass over the protobuf -definitions to create all the needed code files, and only after that can you go -ahead and do a build of the library code. - -### Multiple versions of protobufs in the same app - -Protobufs generate headers that are needed as part of the C++ interface to the -overall TensorFlow library. This complicates using the library as a standalone -framework. - -If your application is already using version 1 of the protocol buffers library, -you may have trouble integrating TensorFlow because it requires version 2. If -you just try to link both versions into the same binary, you’ll see linking -errors because some of the symbols clash. To solve this particular problem, we -have an experimental script at [rename_protobuf.sh](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/rename_protobuf.sh). - -You need to run this as part of the makefile build, after you’ve downloaded all -the dependencies: - - tensorflow/contrib/makefile/download_dependencies.sh - tensorflow/contrib/makefile/rename_protobuf.sh - -## Calling the TensorFlow API - -Once you have the framework available, you then need to call into it. The usual -pattern is that you first load your model, which represents a preset set of -numeric computations, and then you run inputs through that model (for example, -images from a camera) and receive outputs (for example, predicted labels). - -On Android, we provide the Java Inference Library that is focused on just this -use case, while on iOS and Raspberry Pi you call directly into the C++ API. - -### Android - -Here’s what a typical Inference Library sequence looks like on Android: - - // Load the model from disk. - TensorFlowInferenceInterface inferenceInterface = - new TensorFlowInferenceInterface(assetManager, modelFilename); - - // Copy the input data into TensorFlow. - inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); - - // Run the inference call. - inferenceInterface.run(outputNames, logStats); - - // Copy the output Tensor back into the output array. - inferenceInterface.fetch(outputName, outputs); - -You can find the source of this code in the [Android examples](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java#L107). - -### iOS and Raspberry Pi - -Here’s the equivalent code for iOS and Raspberry Pi: - - // Load the model. - PortableReadFileToProto(file_path, &tensorflow_graph); - - // Create a session from the model. - tensorflow::Status s = session->Create(tensorflow_graph); - if (!s.ok()) { - LOG(FATAL) << "Could not create TensorFlow Graph: " << s; - } - - // Run the model. 
- std::string input_layer = "input"; - std::string output_layer = "output"; - std::vector<tensorflow::Tensor> outputs; - tensorflow::Status run_status = session->Run({{input_layer, image_tensor}}, - {output_layer}, {}, &outputs); - if (!run_status.ok()) { - LOG(FATAL) << "Running model failed: " << run_status; - } - - // Access the output data. - tensorflow::Tensor* output = &outputs[0]; - -This is all based on the -[iOS sample code](https://www.tensorflow.org/code/tensorflow/examples/ios/simple/RunModelViewController.mm), -but there’s nothing iOS-specific; the same code should be usable on any platform -that supports C++. - -You can also find specific examples for Raspberry Pi -[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/pi_examples/label_image/label_image.cc). diff --git a/tensorflow/docs_src/mobile/mobile_intro.md b/tensorflow/docs_src/mobile/mobile_intro.md deleted file mode 100644 index baad443308..0000000000 --- a/tensorflow/docs_src/mobile/mobile_intro.md +++ /dev/null @@ -1,248 +0,0 @@ -# Introduction to TensorFlow Mobile - -TensorFlow was designed from the ground up to be a good deep learning solution -for mobile platforms like Android and iOS. This mobile guide should help you -understand how machine learning can work on mobile platforms and how to -integrate TensorFlow into your mobile apps effectively and efficiently. - -## About this Guide - -This guide is aimed at developers who have a TensorFlow model that’s -successfully working in a desktop environment, who want to integrate it into -a mobile application, and cannot use TensorFlow Lite. Here are the -main challenges you’ll face during that process: - -- Understanding how to use Tensorflow for mobile. -- Building TensorFlow for your platform. -- Integrating the TensorFlow library into your application. -- Preparing your model file for mobile deployment. -- Optimizing for latency, RAM usage, model file size, and binary size. - -## Common use cases for mobile machine learning - -**Why run TensorFlow on mobile?** - -Traditionally, deep learning has been associated with data centers and giant -clusters of high-powered GPU machines. However, it can be very expensive and -time-consuming to send all of the data a device has access to across a network -connection. Running on mobile makes it possible to deliver very interactive -applications in a way that’s not possible when you have to wait for a network -round trip. - -Here are some common use cases for on-device deep learning: - -### Speech Recognition - -There are a lot of interesting applications that can be built with a -speech-driven interface, and many of these require on-device processing. Most of -the time a user isn’t giving commands, and so streaming audio continuously to a -remote server would be a waste of bandwidth, since it would mostly be silence or -background noises. To solve this problem it’s common to have a small neural -network running on-device -[listening out for a particular keyword](../tutorials/sequences/audio_recognition). -Once that keyword has been spotted, the rest of the -conversation can be transmitted over to the server for further processing if -more computing power is needed. - -### Image Recognition - -It can be very useful for a mobile app to be able to make sense of a camera -image. If your users are taking photos, recognizing what’s in them can help your -camera apps apply appropriate filters, or label the photos so they’re easily -findable. 
It’s important for embedded applications too, since you can use image -sensors to detect all sorts of interesting conditions, whether it’s spotting -endangered animals in the wild -or -[reporting how late your train is running](https://svds.com/tensorflow-image-recognition-raspberry-pi/). - -TensorFlow comes with several examples of recognizing the types of objects -inside images along with a variety of different pre-trained models, and they can -all be run on mobile devices. You can try out -our -[Tensorflow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) and -[Tensorflow for Poets 2: Optimize for Mobile](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/index.html#0) codelabs to -see how to take a pretrained model and run some very fast and lightweight -training to teach it to recognize specific objects, and then optimize it to -run on mobile. - -### Object Localization - -Sometimes it’s important to know where objects are in an image as well as what -they are. There are lots of augmented reality use cases that could benefit a -mobile app, such as guiding users to the right component when offering them -help fixing their wireless network or providing informative overlays on top of -landscape features. Embedded applications often need to count objects that are -passing by them, whether it’s pests in a field of crops, or people, cars and -bikes going past a street lamp. - -TensorFlow offers a pretrained model for drawing bounding boxes around people -detected in images, together with tracking code to follow them over time. The -tracking is especially important for applications where you’re trying to count -how many objects are present over time, since it gives you a good idea when a -new object enters or leaves the scene. We have some sample code for this -available for Android [on -GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android), -and also a [more general object detection -model](https://github.com/tensorflow/models/tree/master/research/object_detection/README.md) -available as well. - -### Gesture Recognition - -It can be useful to be able to control applications with hand or other -gestures, either recognized from images or through analyzing accelerometer -sensor data. Creating those models is beyond the scope of this guide, but -TensorFlow is an effective way of deploying them. - -### Optical Character Recognition - -Google Translate’s live camera view is a great example of how effective -interactive on-device detection of text can be. - -<div class="video-wrapper"> - <iframe class="devsite-embedded-youtube-video" data-video-id="06olHmcJjS0" - data-autohide="1" data-showinfo="0" frameborder="0" allowfullscreen> - </iframe> -</div> - -There are multiple steps involved in recognizing text in images. You first have -to identify the areas where the text is present, which is a variation on the -object localization problem, and can be solved with similar techniques. Once you -have an area of text, you then need to interpret it as letters, and then use a -language model to help guess what words they represent. The simplest way to -estimate what letters are present is to segment the line of text into individual -letters, and then apply a simple neural network to the bounding box of each. You -can get good results with the kind of models used for MNIST, which you can find -in TensorFlow’s tutorials, though you may want a higher-resolution input. 
A -more advanced alternative is to use an LSTM model to process a whole line of -text at once, with the model itself handling the segmentation into different -characters. - -### Translation - -Translating from one language to another quickly and accurately, even if you -don’t have a network connection, is an important use case. Deep networks are -very effective at this sort of task, and you can find descriptions of a lot of -different models in the literature. Often these are sequence-to-sequence -recurrent models where you’re able to run a single graph to do the whole -translation, without needing to run separate parsing stages. - -### Text Classification - -If you want to suggest relevant prompts to users based on what they’re typing or -reading, it can be very useful to understand the meaning of the text. This is -where text classification comes in. Text classification is an umbrella term -that covers everything from sentiment analysis to topic discovery. You’re likely -to have your own categories or labels that you want to apply, so the best place -to start is with an example -like -[Skip-Thoughts](https://github.com/tensorflow/models/tree/master/research/skip_thoughts/), -and then train on your own examples. - -### Voice Synthesis - -A synthesized voice can be a great way of giving users feedback or aiding -accessibility, and recent advances such as -[WaveNet](https://deepmind.com/blog/wavenet-generative-model-raw-audio/) show -that deep learning can offer very natural-sounding speech. - -## Mobile machine learning and the cloud - -These examples of use cases give an idea of how on-device networks can -complement cloud services. Cloud has a great deal of computing power in a -controlled environment, but running on devices can offer higher interactivity. -In situations where the cloud is unavailable, or your cloud capacity is limited, -you can provide an offline experience, or reduce cloud workload by processing -easy cases on device. - -Doing on-device computation can also signal when it's time to switch to working -on the cloud. A good example of this is hotword detection in speech. Since -devices are able to constantly listen out for the keywords, this then triggers a -lot of traffic to cloud-based speech recognition once one is recognized. Without -the on-device component, the whole application wouldn’t be feasible, and this -pattern exists across several other applications as well. Recognizing that some -sensor input is interesting enough for further processing makes a lot of -interesting products possible. - -## What hardware and software should you have? - -TensorFlow runs on Ubuntu Linux, Windows 10, and OS X. For a list of all -supported operating systems and instructions to install TensorFlow, see -@{$install$Installing Tensorflow}. - -Note that some of the sample code we provide for mobile TensorFlow requires you -to compile TensorFlow from source, so you’ll need more than just `pip install` -to work through all the sample code. - -To try out the mobile examples, you’ll need a device set up for development, -using -either [Android Studio](https://developer.android.com/studio/install.html), -or [XCode](https://developer.apple.com/xcode/) if you're developing for iOS. - -## What should you do before you get started? - -Before thinking about how to get your solution on mobile: - -1. Determine whether your problem is solvable by mobile machine learning -2. Create a labelled dataset to define your problem -3. 
Pick an effective model for the problem - -We'll discuss these in more detail below. - -### Is your problem solvable by mobile machine learning? - -Once you have an idea of the problem you want to solve, you need to make a plan -of how to build your solution. The most important first step is making sure that -your problem is actually solvable, and the best way to do that is to mock it up -using humans in the loop. - -For example, if you want to drive a robot toy car using voice commands, try -recording some audio from the device and listen back to it to see if you can -make sense of what’s being said. Often you’ll find there are problems in the -capture process, such as the motor drowning out speech or not being able to hear -at a distance, and you should tackle these problems before investing in the -modeling process. - -Another example would be giving photos taken from your app to people see if they -can classify what’s in them, in the way you’re looking for. If they can’t do -that (for example, trying to estimate calories in food from photos may be -impossible because all white soups look the same), then you’ll need to redesign -your experience to cope with that. A good rule of thumb is that if a human can’t -handle the task then it will be difficult to train a computer to do better. - -### Create a labelled dataset - -After you’ve solved any fundamental issues with your use case, you need to -create a labeled dataset to define what problem you’re trying to solve. This -step is extremely important, more than picking which model to use. You want it -to be as representative as possible of your actual use case, since the model -will only be effective at the task you teach it. It’s also worth investing in -tools to make labeling the data as efficient and accurate as possible. For -example, if you’re able to switch from having to click a button on a web -interface to simple keyboard shortcuts, you may be able to speed up the -generation process a lot. You should also start by doing the initial labeling -yourself, so you can learn about the difficulties and likely errors, and -possibly change your labeling or data capture process to avoid them. Once you -and your team are able to consistently label examples (that is once you -generally agree on the same labels for most examples), you can then try and -capture your knowledge in a manual and teach external raters how to run the same -process. - -### Pick an effective model - -The next step is to pick an effective model to use. You might be able to avoid -training a model from scratch if someone else has already implemented a model -similar to what you need; we have a repository of models implemented in -TensorFlow [on GitHub](https://github.com/tensorflow/models) that you can look -through. Lean towards the simplest model you can find, and try to get started as -soon as you have even a small amount of labelled data, since you’ll get the best -results when you’re able to iterate quickly. The shorter the time it takes to -try training a model and running it in its real application, the better overall -results you’ll see. It’s common for an algorithm to get great training accuracy -numbers but then fail to be useful within a real application because there’s a -mismatch between the dataset and real usage. Prototype end-to-end usage as soon -as possible to create a consistent user experience. - -## Next Steps - -We suggest you get started by building one of our demos for -@{$mobile/android_build$Android} or @{$mobile/ios_build$iOS}. 
diff --git a/tensorflow/docs_src/mobile/optimizing.md b/tensorflow/docs_src/mobile/optimizing.md deleted file mode 100644 index 778e4d3a62..0000000000 --- a/tensorflow/docs_src/mobile/optimizing.md +++ /dev/null @@ -1,499 +0,0 @@ -# Optimizing for mobile - -There are some special issues that you have to deal with when you’re trying to -ship on mobile or embedded devices, and you’ll need to think about these as -you’re developing your model. - -These issues are: - -- Model and Binary Size -- App speed and model loading speed -- Performance and threading - -We'll discuss a few of these below. - -## What are the minimum device requirements for TensorFlow? - -You need at least one megabyte of program memory and several megabytes of RAM to -run the base TensorFlow runtime, so it’s not suitable for DSPs or -microcontrollers. Other than those, the biggest constraint is usually the -calculation speed of the device, and whether you can run the model you need for -your application with a low enough latency. You can use the benchmarking tools -in [How to Profile your Model](#how_to_profile_your_model) to get an idea of how -many FLOPs are required for a model, and then use that to make rule-of-thumb -estimates of how fast they will run on different devices. For example, a modern -smartphone might be able to run 10 GFLOPs per second, so the best you could hope -for from a 5 GFLOP model is two frames per second, though you may do worse -depending on what the exact computation patterns are. - -This model dependence means that it’s possible to run TensorFlow even on very -old or constrained phones, as long as you optimize your network to fit within -the latency budget and possibly within limited RAM too. For memory usage, you -mostly need to make sure that the intermediate buffers that TensorFlow creates -aren’t too large, which you can examine in the benchmark output too. - -## Speed - -One of the highest priorities of most model deployments is figuring out how to -run the inference fast enough to give a good user experience. The first place to -start is by looking at the total number of floating point operations that are -required to execute the graph. You can get a very rough estimate of this by -using the `benchmark_model` tool: - - bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \ - bazel-bin/tensorflow/tools/benchmark/benchmark_model \ - --graph=/tmp/inception_graph.pb --input_layer="Mul:0" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true --show_flops=true --logtostderr - -This should show you an estimate of how many operations are needed to run the -graph. You can then use that information to figure out how feasible your model -is to run on the devices you’re targeting. For an example, a high-end phone from -2016 might be able to do 20 billion FLOPs per second, so the best speed you -could hope for from a model that requires 10 billion FLOPs is around 500ms. On a -device like the Raspberry Pi 3 that can do about 5 billion FLOPs, you may only -get one inference every two seconds. - -Having this estimate helps you plan for what you’ll be able to realistically -achieve on a device. If the model is using too many ops, then there are a lot of -opportunities to optimize the architecture to reduce that number. 
- -Advanced techniques include [SqueezeNet](https://arxiv.org/abs/1602.07360) -and [MobileNet](https://arxiv.org/abs/1704.04861), which are architectures -designed to produce models for mobile -- lean and fast but with a small accuracy -cost. You can also just look at alternative models, even older ones, which may -be smaller. For example, Inception v1 only has around 7 million parameters, -compared to Inception v3’s 24 million, and requires only 3 billion FLOPs rather -than 9 billion for v3. - -## Model Size - -Models that run on a device need to be stored somewhere on the device, and very -large neural networks can be hundreds of megabytes. Most users are reluctant to -download very large app bundles from app stores, so you want to make your model -as small as possible. Furthermore, smaller neural networks can persist in and -out of a mobile device's memory faster. - -To understand how large your network will be on disk, start by looking at the -size on disk of your `GraphDef` file after you’ve run `freeze_graph` and -`strip_unused_nodes` on it (see @{$mobile/prepare_models$Preparing models} for -more details on these tools), since then it should only contain -inference-related nodes. To double-check that your results are as expected, run -the `summarize_graph` tool to see how many parameters are in constants: - - bazel build tensorflow/tools/graph_transforms:summarize_graph && \ - bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \ - --in_graph=/tmp/tensorflow_inception_graph.pb - -That command should give you output that looks something like this: - - No inputs spotted. - Found 1 possible outputs: (name=softmax, op=Softmax) - Found 23885411 (23.89M) const parameters, 0 (0) variable parameters, - and 99 control_edges - Op types used: 489 Const, 99 CheckNumerics, 99 Identity, 94 - BatchNormWithGlobalNormalization, 94 Conv2D, 94 Relu, 11 Concat, 9 AvgPool, - 5 MaxPool, 1 Sub, 1 Softmax, 1 ResizeBilinear, 1 Reshape, 1 Mul, 1 MatMul, - 1 ExpandDims, 1 DecodeJpeg, 1 Cast, 1 BiasAdd - -The important part for our current purposes is the number of const -parameters. In most models these will be stored as 32-bit floats to start, so if -you multiply the number of const parameters by four, you should get something -that’s close to the size of the file on disk. You can often get away with only -eight-bits per parameter with very little loss of accuracy in the final result, -so if your file size is too large you can try using -@{$performance/quantization$quantize_weights} to transform the parameters down. - - bazel build tensorflow/tools/graph_transforms:transform_graph && \ - bazel-bin/tensorflow/tools/graph_transforms/transform_graph \ - --in_graph=/tmp/tensorflow_inception_optimized.pb \ - --out_graph=/tmp/tensorflow_inception_quantized.pb \ - --inputs='Mul:0' --outputs='softmax:0' --transforms='quantize_weights' - -If you look at the resulting file size, you should see that it’s about a quarter -of the original at 23MB. - -Another transform is `round_weights`, which doesn't make the file smaller, but it -makes the file compressible to about the same size as when `quantize_weights` is -used. This is particularly useful for mobile development, taking advantage of -the fact that app bundles are compressed before they’re downloaded by consumers. - -The original file does not compress well with standard algorithms, because the -bit patterns of even very similar numbers can be very different. 
The -`round_weights` transform keeps the weight parameters stored as floats, but -rounds them to a set number of step values. This means there are a lot more -repeated byte patterns in the stored model, and so compression can often bring -the size down dramatically, in many cases to near the size it would be if they -were stored as eight bit. - -Another advantage of `round_weights` is that the framework doesn’t have to -allocate a temporary buffer to unpack the parameters into, as we have to when -we just use `quantize_weights`. This saves a little bit of latency (though the -results should be cached so it’s only costly on the first run) and makes it -possible to use memory mapping, as described later. - -## Binary Size - -One of the biggest differences between mobile and server development is the -importance of binary size. On desktop machines it’s not unusual to have -executables that are hundreds of megabytes on disk, but for mobile and embedded -apps it’s vital to keep the binary as small as possible so that user downloads -are easy. As mentioned above, TensorFlow only includes a subset of op -implementations by default, but this still results in a 12 MB final -executable. To reduce this, you can set up the library to only include the -implementations of the ops that you actually need, based on automatically -analyzing your model. To use it: - -- Run `tools/print_required_ops/print_selective_registration_header.py` on your - model to produce a header file that only enables the ops it uses. - -- Place the `ops_to_register.h` file somewhere that the compiler can find - it. This can be in the root of your TensorFlow source folder. - -- Build TensorFlow with `SELECTIVE_REGISTRATION` defined, for example by passing - in `--copts=”-DSELECTIVE_REGISTRATION”` to your Bazel build command. - -This process recompiles the library so that only the needed ops and types are -included, which can dramatically reduce the executable size. For example, with -Inception v3, the new size is only 1.5MB. - -## How to Profile your Model - -Once you have an idea of what your device's peak performance range is, it’s -worth looking at its actual current performance. Using a standalone TensorFlow -benchmark, rather than running it inside a larger app, helps isolate just the -Tensorflow contribution to the -latency. The -[tensorflow/tools/benchmark](https://www.tensorflow.org/code/tensorflow/tools/benchmark/) tool -is designed to help you do this. 
To run it on Inception v3 on your desktop -machine, build this benchmark model: - - bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \ - bazel-bin/tensorflow/tools/benchmark/benchmark_model \ - --graph=/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true --show_flops=true --logtostderr - -You should see output that looks something like this: - -<pre> -============================== Top by Computation Time ============================== -[node - type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [Name] -Conv2D 22.859 14.212 13.700 4.972% 4.972% 3871.488 conv_4/Conv2D -Conv2D 8.116 8.964 11.315 4.106% 9.078% 5531.904 conv_2/Conv2D -Conv2D 62.066 16.504 7.274 2.640% 11.717% 443.904 mixed_3/conv/Conv2D -Conv2D 2.530 6.226 4.939 1.792% 13.510% 2765.952 conv_1/Conv2D -Conv2D 55.585 4.605 4.665 1.693% 15.203% 313.600 mixed_2/tower/conv_1/Conv2D -Conv2D 127.114 5.469 4.630 1.680% 16.883% 81.920 mixed_10/conv/Conv2D -Conv2D 47.391 6.994 4.588 1.665% 18.548% 313.600 mixed_1/tower/conv_1/Conv2D -Conv2D 39.463 7.878 4.336 1.574% 20.122% 313.600 mixed/tower/conv_1/Conv2D -Conv2D 127.113 4.192 3.894 1.413% 21.535% 114.688 mixed_10/tower_1/conv/Conv2D -Conv2D 70.188 5.205 3.626 1.316% 22.850% 221.952 mixed_4/conv/Conv2D - -============================== Summary by node type ============================== -[Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] -Conv2D 94 244.899 88.952% 88.952% 35869.953 -BiasAdd 95 9.664 3.510% 92.462% 35873.984 -AvgPool 9 7.990 2.902% 95.364% 7493.504 -Relu 94 5.727 2.080% 97.444% 35869.953 -MaxPool 5 3.485 1.266% 98.710% 3358.848 -Const 192 1.727 0.627% 99.337% 0.000 -Concat 11 1.081 0.393% 99.730% 9892.096 -MatMul 1 0.665 0.242% 99.971% 4.032 -Softmax 1 0.040 0.015% 99.986% 4.032 -<> 1 0.032 0.012% 99.997% 0.000 -Reshape 1 0.007 0.003% 100.000% 0.000 - -Timings (microseconds): count=50 first=330849 curr=274803 min=232354 max=415352 avg=275563 std=44193 -Memory (bytes): count=50 curr=128366400(all same) -514 nodes defined 504 nodes observed -</pre> - -This is the summary view, which is enabled by the show_summary flag. To -interpret it, the first table is a list of the nodes that took the most time, in -order by how long they took. From left to right, the columns are: - -- Node type, what kind of operation this was. - -- Start time of the op, showing where it falls in the sequence of operations. - -- First time in milliseconds. This is how long the operation took on the first - run of the benchmark, since by default 20 runs are executed to get more - reliable statistics. The first time is useful to spot which ops are doing - expensive calculations on the first run, and then caching the results. - -- Average time for the operation across all runs, in milliseconds. - -- What percentage of the total time for one run the op took. This is useful to - understand where the hotspots are. - -- The cumulative total time of this and the previous ops in the table. This is - handy for understanding what the distribution of work is across the layers, to - see if just a few of the nodes are taking up most of the time. - -- The amount of memory consumed by outputs of this type of op. - -- Name of the node. - -The second table is similar, but instead of breaking down the timings by -particular named nodes, it groups them by the kind of op. 
This is very useful to -understand which op implementations you might want to optimize or eliminate from -your graph. The table is arranged with the most costly operations at the start, -and only shows the top ten entries, with a placeholder for other nodes. The -columns from left to right are: - -- Type of the nodes being analyzed. - -- Accumulated average time taken by all nodes of this type, in milliseconds. - -- What percentage of the total time was taken by this type of operation. - -- Cumulative time taken by this and op types higher in the table, so you can - understand the distribution of the workload. - -- How much memory the outputs of this op type took up. - -Both of these tables are set up so that you can easily copy and paste their -results into spreadsheet documents, since they are output with tabs as -separators between the columns. The summary by node type can be the most useful -when looking for optimization opportunities, since it’s a pointer to the code -that’s taking the most time. In this case, you can see that the Conv2D ops are -almost 90% of the execution time. This is a sign that the graph is pretty -optimal, since convolutions and matrix multiplies are expected to be the bulk of -a neural network’s computing workload. - -As a rule of thumb, it’s more worrying if you see a lot of other operations -taking up more than a small fraction of the time. For neural networks, the ops -that don’t involve large matrix multiplications should usually be dwarfed by the -ones that do, so if you see a lot of time going into those it’s a sign that -either your network is non-optimally constructed, or the code implementing those -ops is not as optimized as it could -be. [Performance bugs](https://github.com/tensorflow/tensorflow/issues) or -patches are always welcome if you do encounter this situation, especially if -they include an attached model exhibiting this behavior and the command line -used to run the benchmark tool on it. - -The run above was on your desktop, but the tool also works on Android, which is -where it’s most useful for mobile development. Here’s an example command line to -run it on a 64-bit ARM device: - - bazel build -c opt --config=android_arm64 \ - tensorflow/tools/benchmark:benchmark_model - adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp - adb push /tmp/tensorflow_inception_graph.pb /data/local/tmp/ - adb shell '/data/local/tmp/benchmark_model \ - --graph=/data/local/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true' - -You can interpret the results in exactly the same way as the desktop version -above. If you have any trouble figuring out what the right input and output -names and types are, take a look at the @{$mobile/prepare_models$Preparing models} -page for details about detecting these for your model, and look at the -`summarize_graph` tool which may give you -helpful information. - -There isn’t good support for command line tools on iOS, so instead there’s a -separate example -at -[tensorflow/examples/ios/benchmark](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark) that -packages the same functionality inside a standalone app. This outputs the -statistics to both the screen of the device and the debug log. 
If you want -on-screen statistics for the Android example apps, you can turn them on by -pressing the volume-up button. - -## Profiling within your own app - -The output you see from the benchmark tool is generated from modules that are -included as part of the standard TensorFlow runtime, which means you have access -to them within your own applications too. You can see an example of how to do -that [here](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark/BenchmarkViewController.mm?l=139). - -The basic steps are: - -1. Create a StatSummarizer object: - - tensorflow::StatSummarizer stat_summarizer(tensorflow_graph); - -2. Set up the options: - - tensorflow::RunOptions run_options; - run_options.set_trace_level(tensorflow::RunOptions::FULL_TRACE); - tensorflow::RunMetadata run_metadata; - -3. Run the graph: - - run_status = session->Run(run_options, inputs, output_layer_names, {}, - output_layers, &run_metadata); - -4. Calculate the results and print them out: - - assert(run_metadata.has_step_stats()); - const tensorflow::StepStats& step_stats = run_metadata.step_stats(); - stat_summarizer->ProcessStepStats(step_stats); - stat_summarizer->PrintStepStats(); - -## Visualizing Models - -The most effective way to speed up your code is by altering your model so it -does less work. To do that, you need to understand what your model is doing, and -visualizing it is a good first step. To get a high-level overview of your graph, -use [TensorBoard](https://github.com/tensorflow/tensorboard). - -## Threading - -The desktop version of TensorFlow has a sophisticated threading model, and will -try to run multiple operations in parallel if it can. In our terminology this is -called “inter-op parallelism” (though to avoid confusion with “intra-op”, you -could think of it as “between-op” instead), and can be set by specifying -`inter_op_parallelism_threads` in the session options. - -By default, mobile devices run operations serially; that is, -`inter_op_parallelism_threads` is set to 1. Mobile processors usually have few -cores and a small cache, so running multiple operations accessing disjoint parts -of memory usually doesn’t help performance. “Intra-op parallelism” (or -“within-op”) can be very helpful though, especially for computation-bound -operations like convolutions where different threads can feed off the same small -set of memory. - -On mobile, how many threads an op will use is set to the number of cores by -default, or 2 when the number of cores can't be determined. You can override the -default number of threads that ops are using by setting -`intra_op_parallelism_threads` in the session options. It’s a good idea to -reduce the default if your app has its own threads doing heavy processing, so -that they don’t interfere with each other. - -To see more details on session options, look at [ConfigProto](https://www.tensorflow.org/code/tensorflow/core/protobuf/config.proto). - -## Retrain with mobile data - -The biggest cause of accuracy problems when running models on mobile apps is -unrepresentative training data. For example, most of the Imagenet photos are -well-framed so that the object is in the center of the picture, well-lit, and -shot with a normal lens. Photos from mobile devices are often poorly framed, -badly lit, and can have fisheye distortions, especially selfies. - -The solution is to expand your training set with data actually captured from -your application. 
This step can involve extra work, since you’ll have to label the examples yourself, but even if you just use it to expand your original training data, it can improve the training set dramatically. Improving the training set this way, and fixing other quality issues like duplicates or badly labeled examples, is the single best way to improve accuracy. It’s usually a bigger help than altering your model architecture or using different techniques.

## Reducing model loading time and/or memory footprint

Most operating systems allow you to load a file using memory mapping, rather than going through the usual I/O APIs. Instead of allocating an area of memory on the heap and then copying bytes from disk into it, you simply tell the operating system to make the entire contents of a file appear directly in memory. This has several advantages:

* Speeds up loading
* Reduces paging (increases performance)
* Does not count towards the RAM budget for your app

TensorFlow has support for memory mapping the weights that form the bulk of most model files. Because of limitations in the `ProtoBuf` serialization format, we have to make a few changes to our model loading and processing code. The way memory mapping works is that we have a single file where the first part is a normal `GraphDef` serialized into the protocol buffer wire format, but then the weights are appended in a form that can be directly mapped.

To create this file, run the `tensorflow/contrib/util:convert_graphdef_memmapped_format` tool. This takes in a `GraphDef` file that’s been run through `freeze_graph` and converts it to the format that has the weights appended at the end. Since that file’s no longer a standard `GraphDef` protobuf, you then need to make some changes to the loading code. You can see an example of this in the [iOS Camera demo app](https://www.tensorflow.org/code/tensorflow/examples/ios/camera/tensorflow_utils.mm?l=147), in the `LoadMemoryMappedModel()` function.

The same code (with the Objective-C calls for getting the filenames substituted) can be used on other platforms too. Because we’re using memory mapping, we need to start by creating a special TensorFlow environment object that’s set up with the file we’ll be using:

    std::unique_ptr<tensorflow::MemmappedEnv> memmapped_env;
    memmapped_env.reset(
        new tensorflow::MemmappedEnv(tensorflow::Env::Default()));
    tensorflow::Status mmap_status =
        memmapped_env->InitializeFromFile(file_path);

You then need to pass in this environment to subsequent calls, like this one for loading the graph:

    tensorflow::GraphDef tensorflow_graph;
    tensorflow::Status load_graph_status = ReadBinaryProto(
        memmapped_env.get(),
        tensorflow::MemmappedFileSystem::kMemmappedPackageDefaultGraphDef,
        &tensorflow_graph);

You also need to create the session with a pointer to the environment you’ve created:

    tensorflow::SessionOptions options;
    options.config.mutable_graph_options()
        ->mutable_optimizer_options()
        ->set_opt_level(::tensorflow::OptimizerOptions::L0);
    options.env = memmapped_env.get();

    tensorflow::Session* session_pointer = nullptr;
    tensorflow::Status session_status =
        tensorflow::NewSession(options, &session_pointer);

One thing to notice here is that we’re also disabling automatic optimizations, since in some cases these will fold constant sub-trees, and so create copies of tensor values that we don’t want and use up more RAM.
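Putting the pieces together, a minimal helper might look like the following sketch. It only combines the snippets above; the `LoadMemmappedModel` name, its signature, and the abbreviated error handling are illustrative and not an existing TensorFlow API (you will need the `memmapped_file_system.h`, `env.h`, and `session.h` headers):

    // Sketch: load a memory-mapped model file and create a session backed by it.
    tensorflow::Status LoadMemmappedModel(
        const std::string& file_path,
        std::unique_ptr<tensorflow::MemmappedEnv>* memmapped_env,
        std::unique_ptr<tensorflow::Session>* session) {
      // Wrap the default environment so reads resolve into the mapped file.
      memmapped_env->reset(
          new tensorflow::MemmappedEnv(tensorflow::Env::Default()));
      TF_RETURN_IF_ERROR((*memmapped_env)->InitializeFromFile(file_path));

      // Read the GraphDef portion of the memory-mapped package.
      tensorflow::GraphDef graph_def;
      TF_RETURN_IF_ERROR(ReadBinaryProto(
          memmapped_env->get(),
          tensorflow::MemmappedFileSystem::kMemmappedPackageDefaultGraphDef,
          &graph_def));

      // Create the session against the same environment, with optimizations
      // disabled so constant folding doesn't copy mapped weights onto the heap.
      tensorflow::SessionOptions options;
      options.config.mutable_graph_options()
          ->mutable_optimizer_options()
          ->set_opt_level(::tensorflow::OptimizerOptions::L0);
      options.env = memmapped_env->get();

      tensorflow::Session* session_pointer = nullptr;
      TF_RETURN_IF_ERROR(tensorflow::NewSession(options, &session_pointer));
      session->reset(session_pointer);
      return (*session)->Create(graph_def);
    }

This mirrors the flow of the snippets above, just gathered into one place with status checks.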
Once you’ve gone through these steps, you can use the session and graph as normal, and you should see a reduction in loading time and memory usage.

## Protecting model files from easy copying

By default, your models will be stored in the standard serialized protobuf format on disk. In theory this means that anybody can copy your model, which you may not want. However, in practice, most models are so application-specific and obfuscated by optimizations that the risk is similar to that of competitors disassembling and reusing your code. If you do want to make it tougher for casual users to access your files, it is possible to take some basic steps.

Most of our examples use the [ReadBinaryProto()](https://www.tensorflow.org/code/tensorflow/core/platform/env.cc?q=core/platform/env.cc&l=409) convenience call to load a `GraphDef` from disk. This does require an unencrypted protobuf on disk. Luckily though, the implementation of the call is pretty straightforward and it should be easy to write an equivalent that can decrypt in memory. Here's some code that shows how you can read and decrypt a protobuf using your own decryption routine:

    Status ReadEncryptedProto(Env* env, const string& fname,
                              ::tensorflow::protobuf::MessageLite* proto) {
      string data;
      TF_RETURN_IF_ERROR(ReadFileToString(env, fname, &data));

      DecryptData(&data);  // Your own function here.

      if (!proto->ParseFromString(&data)) {
        return errors::DataLoss("Can't parse ", fname, " as binary proto");
      }
      return Status::OK();
    }

To use this you’d need to define the `DecryptData()` function yourself. It could be as simple as something like:

    void DecryptData(string* data) {
      for (size_t i = 0; i < data->size(); ++i) {
        (*data)[i] = (*data)[i] ^ 0x23;
      }
    }

You may want something more complex, but exactly what you’ll need is outside the current scope here.

diff --git a/tensorflow/docs_src/mobile/prepare_models.md b/tensorflow/docs_src/mobile/prepare_models.md
deleted file mode 100644
index 2b84dbb973..0000000000
--- a/tensorflow/docs_src/mobile/prepare_models.md
+++ /dev/null
@@ -1,301 +0,0 @@
# Preparing models for mobile deployment

The requirements for storing model information during training are very different from when you want to release it as part of a mobile app. This section covers the tools involved in converting from a training model to something releasable in production.

## What is up with all the different saved file formats?

You may find yourself getting very confused by all the different ways that TensorFlow can save out graphs. To help, here’s a rundown of some of the different components, and what they are used for. The objects are mostly defined and serialized as protocol buffers:

- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto):
  Defines a single operation in a model. It has a unique name, a list of the names of other nodes it pulls inputs from, the operation type it implements (for example `Add`, or `Mul`), and any attributes that are needed to control that operation. This is the basic unit of computation for TensorFlow, and all work is done by iterating through a network of these nodes, applying each one in turn. One particular operation type that’s worth knowing about is `Const`, since this holds information about a constant. This may be a single, scalar number or string, but it can also hold an entire multi-dimensional tensor array.
The values for a `Const` are stored inside the `NodeDef`, and so large - constants can take up a lot of room when serialized. - -- [Checkpoint](https://www.tensorflow.org/code/tensorflow/core/util/tensor_bundle/tensor_bundle.h). Another - way of storing values for a model is by using `Variable` ops. Unlike `Const` - ops, these don’t store their content as part of the `NodeDef`, so they take up - very little space within the `GraphDef` file. Instead their values are held in - RAM while a computation is running, and then saved out to disk as checkpoint - files periodically. This typically happens as a neural network is being - trained and weights are updated, so it’s a time-critical operation, and it may - happen in a distributed fashion across many workers, so the file format has to - be both fast and flexible. They are stored as multiple checkpoint files, - together with metadata files that describe what’s contained within the - checkpoints. When you’re referring to a checkpoint in the API (for example - when passing a filename in as a command line argument), you’ll use the common - prefix for a set of related files. If you had these files: - - /tmp/model/model-chkpt-1000.data-00000-of-00002 - /tmp/model/model-chkpt-1000.data-00001-of-00002 - /tmp/model/model-chkpt-1000.index - /tmp/model/model-chkpt-1000.meta - - You would refer to them as `/tmp/model/chkpt-1000`. - -- [GraphDef](https://www.tensorflow.org/code/tensorflow/core/framework/graph.proto): - Has a list of `NodeDefs`, which together define the computational graph to - execute. During training, some of these nodes will be `Variables`, and so if - you want to have a complete graph you can run, including the weights, you’ll - need to call a restore operation to pull those values from - checkpoints. Because checkpoint loading has to be flexible to deal with all of - the training requirements, this can be tricky to implement on mobile and - embedded devices, especially those with no proper file system available like - iOS. This is where - the - [`freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py) script - comes in handy. As mentioned above, `Const` ops store their values as part of - the `NodeDef`, so if all the `Variable` weights are converted to `Const` nodes, - then we only need a single `GraphDef` file to hold the model architecture and - the weights. Freezing the graph handles the process of loading the - checkpoints, and then converts all Variables to Consts. You can then load the - resulting file in a single call, without having to restore variable values - from checkpoints. One thing to watch out for with `GraphDef` files is that - sometimes they’re stored in text format for easy inspection. These versions - usually have a ‘.pbtxt’ filename suffix, whereas the binary files end with - ‘.pb’. - -- [FunctionDefLibrary](https://www.tensorflow.org/code/tensorflow/core/framework/function.proto): - This appears in `GraphDef`, and is effectively a set of sub-graphs, each with - information about their input and output nodes. Each sub-graph can then be - used as an op in the main graph, allowing easy instantiation of different - nodes, in a similar way to how functions encapsulate code in other languages. - -- [MetaGraphDef](https://www.tensorflow.org/code/tensorflow/core/protobuf/meta_graph.proto): - A plain `GraphDef` only has information about the network of computations, but - doesn’t have any extra information about the model or how it can be - used. 
`MetaGraphDef` contains a `GraphDef` defining the computation part of - the model, but also includes information like ‘signatures’, which are - suggestions about which inputs and outputs you may want to call the model - with, data on how and where any checkpoint files are saved, and convenience - tags for grouping ops together for ease of use. - -- [SavedModel](https://www.tensorflow.org/code/tensorflow/core/protobuf/saved_model.proto): - It’s common to want to have different versions of a graph that rely on a - common set of variable checkpoints. For example, you might need a GPU and a - CPU version of the same graph, but keep the same weights for both. You might - also need some extra files (like label names) as part of your - model. The - [SavedModel](https://www.tensorflow.org/code/tensorflow/python/saved_model/README.md) format - addresses these needs by letting you save multiple versions of the same graph - without duplicating variables, and also storing asset files in the same - bundle. Under the hood, it uses `MetaGraphDef` and checkpoint files, along - with extra metadata files. It’s the format that you’ll want to use if you’re - deploying a web API using TensorFlow Serving, for example. - -## How do you get a model you can use on mobile? - -In most situations, training a model with TensorFlow will give you a folder -containing a `GraphDef` file (usually ending with the `.pb` or `.pbtxt` extension) and -a set of checkpoint files. What you need for mobile or embedded deployment is a -single `GraphDef` file that’s been ‘frozen’, or had its variables converted into -inline constants so everything’s in one file. To handle the conversion, you’ll -need the `freeze_graph.py` script, that’s held in -[`tensorflow/python/tools/freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py). You’ll run it like this: - - bazel build tensorflow/python/tools:freeze_graph - bazel-bin/tensorflow/python/tools/freeze_graph \ - --input_graph=/tmp/model/my_graph.pb \ - --input_checkpoint=/tmp/model/model.ckpt-1000 \ - --output_graph=/tmp/frozen_graph.pb \ - --output_node_names=output_node \ - -The `input_graph` argument should point to the `GraphDef` file that holds your -model architecture. It’s possible that your `GraphDef` has been stored in a text -format on disk, in which case it’s likely to end in `.pbtxt` instead of `.pb`, -and you should add an extra `--input_binary=false` flag to the command. - -The `input_checkpoint` should be the most recent saved checkpoint. As mentioned -in the checkpoint section, you need to give the common prefix to the set of -checkpoints here, rather than a full filename. - -`output_graph` defines where the resulting frozen `GraphDef` will be -saved. Because it’s likely to contain a lot of weight values that take up a -large amount of space in text format, it’s always saved as a binary protobuf. - -`output_node_names` is a list of the names of the nodes that you want to extract -the results of your graph from. This is needed because the freezing process -needs to understand which parts of the graph are actually needed, and which are -artifacts of the training process, like summarization ops. Only ops that -contribute to calculating the given output nodes will be kept. If you know how -your graph is going to be used, these should just be the names of the nodes you -pass into `Session::Run()` as your fetch targets. The easiest way to find the -node names is to inspect the Node objects while building your graph in python. 
-Inspecting your graph in TensorBoard is another simple way. You can get some -suggestions on likely outputs by running the [`summarize_graph` tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/README.md#inspecting-graphs). - -Because the output format for TensorFlow has changed over time, there are a -variety of other less commonly used flags available too, like `input_saver`, but -hopefully you shouldn’t need these on graphs trained with modern versions of the -framework. - -## Using the Graph Transform Tool - -A lot of the things you need to do to efficiently run a model on device are -available through the [Graph Transform -Tool](https://www.tensorflow.org/code/tensorflow/tools/graph_transforms/README.md). This -command-line tool takes an input `GraphDef` file, applies the set of rewriting -rules you request, and then writes out the result as a `GraphDef`. See the -documentation for more information on how to build and run this tool. - -### Removing training-only nodes - -TensorFlow `GraphDefs` produced by the training code contain all of the -computation that’s needed for back-propagation and updates of weights, as well -as the queuing and decoding of inputs, and the saving out of checkpoints. All of -these nodes are no longer needed during inference, and some of the operations -like checkpoint saving aren’t even supported on mobile platforms. To create a -model file that you can load on devices you need to delete those unneeded -operations by running the `strip_unused_nodes` rule in the Graph Transform Tool. - -The trickiest part of this process is figuring out the names of the nodes you -want to use as inputs and outputs during inference. You'll need these anyway -once you start to run inference, but you also need them here so that the -transform can calculate which nodes are not needed on the inference-only -path. These may not be obvious from the training code. The easiest way to -determine the node name is to explore the graph with TensorBoard. - -Remember that mobile applications typically gather their data from sensors and -have it as arrays in memory, whereas training typically involves loading and -decoding representations of the data stored on disk. In the case of Inception v3 -for example, there’s a `DecodeJpeg` op at the start of the graph that’s designed -to take JPEG-encoded data from a file retrieved from disk and turn it into an -arbitrary-sized image. After that there’s a `BilinearResize` op to scale it to -the expected size, followed by a couple of other ops that convert the byte data -into float and scale the value magnitudes it in the way the rest of the graph -expects. A typical mobile app will skip most of these steps because it’s getting -its input directly from a live camera, so the input node you will actually -supply will be the output of the `Mul` node in this case. - -<img src ="../images/inception_input.png" width="300"> - -You’ll need to do a similar process of inspection to figure out the correct -output nodes. - -If you’ve just been given a frozen `GraphDef` file, and are not sure about the -contents, try using the `summarize_graph` tool to print out information -about the inputs and outputs it finds from the graph structure. 
Here’s an -example with the original Inception v3 file: - - bazel run tensorflow/tools/graph_transforms:summarize_graph -- - --in_graph=tensorflow_inception_graph.pb - -Once you have an idea of what the input and output nodes are, you can feed them -into the graph transform tool as the `--input_names` and `--output_names` -arguments, and call the `strip_unused_nodes` transform, like this: - - bazel run tensorflow/tools/graph_transforms:transform_graph -- - --in_graph=tensorflow_inception_graph.pb - --out_graph=optimized_inception_graph.pb --inputs='Mul' --outputs='softmax' - --transforms=' - strip_unused_nodes(type=float, shape="1,299,299,3") - fold_constants(ignore_errors=true) - fold_batch_norms - fold_old_batch_norms' - -One thing to look out for here is that you need to specify the size and type -that you want your inputs to be. This is because any values that you’re going to -be passing in as inputs to inference need to be fed to special `Placeholder` op -nodes, and the transform may need to create them if they don’t already exist. In -the case of Inception v3 for example, a `Placeholder` node replaces the old -`Mul` node that used to output the resized and rescaled image array, since we’re -going to be doing that processing ourselves before we call TensorFlow. It keeps -the original name though, which is why we always feed in inputs to `Mul` when we -run a session with our modified Inception graph. - -After you’ve run this process, you’ll have a graph that only contains the actual -nodes you need to run your prediction process. This is the point where it -becomes useful to run metrics on the graph, so it’s worth running -`summarize_graph` again to understand what’s in your model. - -## What ops should you include on mobile? - -There are hundreds of operations available in TensorFlow, and each one has -multiple implementations for different data types. On mobile platforms, the size -of the executable binary that’s produced after compilation is important, because -app download bundles need to be as small as possible for the best user -experience. If all of the ops and data types are compiled into the TensorFlow -library then the total size of the compiled library can be tens of megabytes, so -by default only a subset of ops and data types are included. - -That means that if you load a model file that’s been trained on a desktop -machine, you may see the error “No OpKernel was registered to support Op” when -you load it on mobile. The first thing to try is to make sure you’ve stripped -out any training-only nodes, since the error will occur at load time even if the -op is never executed. If you’re still hitting the same problem once that’s done, -you’ll need to look at adding the op to your built library. - -The criteria for including ops and types fall into several categories: - -- Are they only useful in back-propagation, for gradients? Since mobile is - focused on inference, we don’t include these. - -- Are they useful mainly for other training needs, such as checkpoint saving? - These we leave out. - -- Do they rely on frameworks that aren’t always available on mobile, such as - libjpeg? To avoid extra dependencies we don’t include ops like `DecodeJpeg`. - -- Are there types that aren’t commonly used? We don’t include boolean variants - of ops for example, since we don’t see much use of them in typical inference - graphs. - -These ops are trimmed by default to optimize for inference on mobile, but it is -possible to alter some build files to change the default. 
After alternating the -build files, you will need to recompile TensorFlow. See below for more details -on how to do this, and also see @{$mobile/optimizing#binary_size$Optimizing} for -more on reducing your binary size. - -### Locate the implementation - -Operations are broken into two parts. The first is the op definition, which -declares the signature of the operation, which inputs, outputs, and attributes -it has. These take up very little space, and so all are included by default. The -implementations of the op computations are done in kernels, which live in the -`tensorflow/core/kernels` folder. You need to compile the C++ file containing -the kernel implementation of the op you need into the library. To figure out -which file that is, you can search for the operation name in the source -files. - -[Here’s an example search in github](https://github.com/search?utf8=%E2%9C%93&q=repo%3Atensorflow%2Ftensorflow+extension%3Acc+path%3Atensorflow%2Fcore%2Fkernels+REGISTER+Mul&type=Code&ref=searchresults). - -You’ll see that this search is looking for the `Mul` op implementation, and it -finds it in `tensorflow/core/kernels/cwise_op_mul_1.cc`. You need to look for -macros beginning with `REGISTER`, with the op name you care about as one of the -string arguments. - -In this case, the implementations are actually broken up across multiple `.cc` -files, so you’d need to include all of them in your build. If you’re more -comfortable using the command line for code search, here’s a grep command that -also locates the right files if you run it from the root of your TensorFlow -repository: - -`grep 'REGISTER.*"Mul"' tensorflow/core/kernels/*.cc` - -### Add the implementation to the build - -If you’re using Bazel, and building for Android, you’ll want to add the files -you’ve found to -the -[`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or -[`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You -may also need to include any .cc files they depend on in there. If the build -complains about missing header files, add the .h’s that are needed into -the -[`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target. - -If you’re using a makefile targeting iOS, Raspberry Pi, etc, go to -[`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt) and -add the right implementation files there. diff --git a/tensorflow/docs_src/mobile/tflite/demo_android.md b/tensorflow/docs_src/mobile/tflite/demo_android.md deleted file mode 100644 index fdf0bcf3c1..0000000000 --- a/tensorflow/docs_src/mobile/tflite/demo_android.md +++ /dev/null @@ -1,146 +0,0 @@ -# Android Demo App - -An example Android application using TensorFLow Lite is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo). -The demo is a sample camera app that classifies images continuously -using either a quantized Mobilenet model or a floating point Inception-v3 model. -To run the demo, a device running Android 5.0 ( API 21) or higher is required. - -In the demo app, inference is done using the TensorFlow Lite Java API. The demo -app classifies frames in real-time, displaying the top most probable -classifications. It also displays the time taken to detect the object. 
- -There are three ways to get the demo app to your device: - -* Download the [prebuilt binary APK](http://download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). -* Use Android Studio to build the application. -* Download the source code for TensorFlow Lite and the demo and build it using - bazel. - - -## Download the pre-built binary - -The easiest way to try the demo is to download the -[pre-built binary APK](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk) - -Once the APK is installed, click the app icon to start the program. The first -time the app is opened, it asks for runtime permissions to access the device -camera. The demo app opens the back-camera of the device and recognizes objects -in the camera's field of view. At the bottom of the image (or at the left -of the image if the device is in landscape mode), it displays top three objects -classified and the classification latency. - - -## Build in Android Studio with TensorFlow Lite AAR from JCenter - -Use Android Studio to try out changes in the project code and compile the demo -app: - -* Install the latest version of - [Android Studio](https://developer.android.com/studio/index.html). -* Make sure the Android SDK version is greater than 26 and NDK version is greater - than 14 (in the Android Studio settings). -* Import the `tensorflow/contrib/lite/java/demo` directory as a new - Android Studio project. -* Install all the Gradle extensions it requests. - -Now you can build and run the demo app. - -The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/contrib/lite/java/demo/app/src/main/assets/`. - -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md). - -### Using other models - -To use a different model: -* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip). -* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory. -* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br> - from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br> - to: `classifier = new ImageClassifierFloatInception(getActivity());`. - - -## Build TensorFlow Lite and the demo app from source - -### Clone the TensorFlow repo - -```sh -git clone https://github.com/tensorflow/tensorflow -``` - -### Install Bazel - -If `bazel` is not installed on your system, see -[Installing Bazel](https://bazel.build/versions/master/docs/install.html). - -Note: Bazel does not currently support Android builds on Windows. Windows users -should download the -[prebuilt binary](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). - -### Install Android NDK and SDK - -The Android NDK is required to build the native (C/C++) TensorFlow Lite code. The -current recommended version is *14b* and can be found on the -[NDK Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads) -page. 
- -The Android SDK and build tools can be -[downloaded separately](https://developer.android.com/tools/revisions/build-tools.html) -or used as part of -[Android Studio](https://developer.android.com/studio/index.html). To build the -TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on -devices with API >= 21). - -In the root of the TensorFlow repository, update the `WORKSPACE` file with the -`api_level` and location of the SDK and NDK. If you installed it with -Android Studio, the SDK path can be found in the SDK manager. The default NDK -path is:`{SDK path}/ndk-bundle.` For example: - -``` -android_sdk_repository ( - name = "androidsdk", - api_level = 23, - build_tools_version = "23.0.2", - path = "/home/xxxx/android-sdk-linux/", -) - -android_ndk_repository( - name = "androidndk", - path = "/home/xxxx/android-ndk-r10e/", - api_level = 19, -) -``` - -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md). - -### Build the source code - -To build the demo app, run `bazel`: - -``` -bazel build --cxxopt=--std=c++11 //tensorflow/contrib/lite/java/demo/app/src/main:TfLiteCameraDemo -``` - -Caution: Because of an bazel bug, we only support building the Android demo app -within a Python 2 environment. - - -## About the demo - -The demo app is resizing each camera image frame (224 width * 224 height) to -match the quantized MobileNets model (299 * 299 for Inception-v3). The resized -image is converted—row by row—into a -[ByteBuffer](https://developer.android.com/reference/java/nio/ByteBuffer.html). -Its size is 1 * 224 * 224 * 3 bytes, where 1 is the number of images in a batch. -224 * 224 (299 * 299) is the width and height of the image. 3 bytes represents -the 3 colors of a pixel. - -This demo uses the TensorFlow Lite Java inference API -for models which take a single input and provide a single output. This outputs a -two-dimensional array, with the first dimension being the category index and the -second dimension being the confidence of classification. Both models have 1001 -unique categories and the app sorts the probabilities of all the categories and -displays the top three. The model file must be downloaded and bundled within the -assets directory of the app. diff --git a/tensorflow/docs_src/mobile/tflite/demo_ios.md b/tensorflow/docs_src/mobile/tflite/demo_ios.md deleted file mode 100644 index 3be21da89f..0000000000 --- a/tensorflow/docs_src/mobile/tflite/demo_ios.md +++ /dev/null @@ -1,68 +0,0 @@ -# iOS Demo App - -The TensorFlow Lite demo is a camera app that continuously classifies whatever -it sees from your device's back camera, using a quantized MobileNet model. These -instructions walk you through building and running the demo on an iOS device. - -## Prerequisites - -* You must have [Xcode](https://developer.apple.com/xcode/) installed and have a - valid Apple Developer ID, and have an iOS device set up and linked to your - developer account with all of the appropriate certificates. For these - instructions, we assume that you have already been able to build and deploy an - app to an iOS device with your current developer environment. - -* The demo app requires a camera and must be executed on a real iOS device. You - can build it and run with the iPhone Simulator but it won't have any camera - information to classify. 
- -* You don't need to build the entire TensorFlow library to run the demo, but you - will need to clone the TensorFlow repository if you haven't already: - - git clone https://github.com/tensorflow/tensorflow - -* You'll also need the Xcode command-line tools: - - xcode-select --install - - If this is a new install, you will need to run the Xcode application once to - agree to the license before continuing. - -## Building the iOS Demo App - -1. Install CocoaPods if you don't have it: - - sudo gem install cocoapods - -2. Download the model files used by the demo app (this is done from inside the - cloned directory): - - sh tensorflow/contrib/lite/examples/ios/download_models.sh - -3. Install the pod to generate the workspace file: - - cd tensorflow/contrib/lite/examples/ios/camera - pod install - - If you have installed this pod before and that command doesn't work, try - - pod update - - At the end of this step you should have a file called - `tflite_camera_example.xcworkspace`. - -4. Open the project in Xcode by typing this on the command line: - - open tflite_camera_example.xcworkspace - - This launches Xcode if it isn't open already and opens the - `tflite_camera_example` project. - -5. Build and run the app in Xcode. - - Note that as mentioned earlier, you must already have a device set up and - linked to your Apple Developer account in order to deploy the app on a - device. - -You'll have to grant permissions for the app to use the device's camera. Point -the camera at various objects and enjoy seeing how the model classifies things! diff --git a/tensorflow/docs_src/mobile/tflite/devguide.md b/tensorflow/docs_src/mobile/tflite/devguide.md deleted file mode 100644 index b168d6c183..0000000000 --- a/tensorflow/docs_src/mobile/tflite/devguide.md +++ /dev/null @@ -1,232 +0,0 @@ -# Developer Guide - -Using a TensorFlow Lite model in your mobile app requires multiple -considerations: you must choose a pre-trained or custom model, convert the model -to a TensorFLow Lite format, and finally, integrate the model in your app. - -## 1. Choose a model - -Depending on the use case, you can choose one of the popular open-sourced models, -such as *InceptionV3* or *MobileNets*, and re-train these models with a custom -data set or even build your own custom model. - -### Use a pre-trained model - -[MobileNets](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html) -is a family of mobile-first computer vision models for TensorFlow designed to -effectively maximize accuracy, while taking into consideration the restricted -resources for on-device or embedded applications. MobileNets are small, -low-latency, low-power models parameterized to meet the resource constraints for -a variety of uses. They can be used for classification, detection, embeddings, and -segmentation—similar to other popular large scale models, such as -[Inception](https://arxiv.org/pdf/1602.07261.pdf). Google provides 16 pre-trained -[ImageNet](http://www.image-net.org/challenges/LSVRC/) classification checkpoints -for MobileNets that can be used in mobile projects of all sizes. - -[Inception-v3](https://arxiv.org/abs/1512.00567) is an image recognition model -that achieves fairly high accuracy recognizing general objects with 1000 classes, -for example, "Zebra", "Dalmatian", and "Dishwasher". The model extracts general -features from input images using a convolutional neural network and classifies -them based on those features with fully-connected and softmax layers. 
- -[On Device Smart Reply](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html) -is an on-device model that provides one-touch replies for incoming text messages -by suggesting contextually relevant messages. The model is built specifically for -memory constrained devices, such as watches and phones, and has been successfully -used in Smart Replies on Android Wear. Currently, this model is Android-specific. - -These pre-trained models are [available for download](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md) - -### Re-train Inception-V3 or MobileNet for a custom data set - -These pre-trained models were trained on the *ImageNet* data set which contains -1000 predefined classes. If these classes are not sufficient for your use case, -the model will need to be re-trained. This technique is called -*transfer learning* and starts with a model that has been already trained on a -problem, then retrains the model on a similar problem. Deep learning from -scratch can take days, but transfer learning is fairly quick. In order to do -this, you need to generate a custom data set labeled with the relevant classes. - -The [TensorFlow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/) -codelab walks through the re-training process step-by-step. The code supports -both floating point and quantized inference. - -### Train a custom model - -A developer may choose to train a custom model using Tensorflow (see the -[TensorFlow tutorials](../../tutorials/) for examples of building and training -models). If you have already written a model, the first step is to export this -to a @{tf.GraphDef} file. This is required because some formats do not store the -model structure outside the code, and we must communicate with other parts of the -framework. See -[Exporting the Inference Graph](https://github.com/tensorflow/models/blob/master/research/slim/README.md) -to create .pb file for the custom model. - -TensorFlow Lite currently supports a subset of TensorFlow operators. Refer to the -[TensorFlow Lite & TensorFlow Compatibility Guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md) -for supported operators and their usage. This set of operators will continue to -grow in future Tensorflow Lite releases. - - -## 2. Convert the model format - -The model generated (or downloaded) in the previous step is a *standard* -Tensorflow model and you should now have a .pb or .pbtxt @{tf.GraphDef} file. -Models generated with transfer learning (re-training) or custom models must be -converted—but, we must first freeze the graph to convert the model to the -Tensorflow Lite format. This process uses several model formats: - -* @{tf.GraphDef} (.pb) —A protobuf that represents the TensorFlow training or - computation graph. It contains operators, tensors, and variables definitions. -* *CheckPoint* (.ckpt) —Serialized variables from a TensorFlow graph. Since this - does not contain a graph structure, it cannot be interpreted by itself. -* `FrozenGraphDef` —A subclass of `GraphDef` that does not contain - variables. A `GraphDef` can be converted to a `FrozenGraphDef` by taking a - CheckPoint and a `GraphDef`, and converting each variable into a constant - using the value retrieved from the CheckPoint. -* `SavedModel` —A `GraphDef` and CheckPoint with a signature that labels - input and output arguments to a model. 
A `GraphDef` and CheckPoint can be - extracted from a `SavedModel`. -* *TensorFlow Lite model* (.tflite) —A serialized - [FlatBuffer](https://google.github.io/flatbuffers/) that contains TensorFlow - Lite operators and tensors for the TensorFlow Lite interpreter, similar to a - `FrozenGraphDef`. - -### Freeze Graph - -To use the `GraphDef` .pb file with TensorFlow Lite, you must have checkpoints -that contain trained weight parameters. The .pb file only contains the structure -of the graph. The process of merging the checkpoint values with the graph -structure is called *freezing the graph*. - -You should have a checkpoints folder or download them for a pre-trained model -(for example, -[MobileNets](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md)). - -To freeze the graph, use the following command (changing the arguments): - -``` -freeze_graph --input_graph=/tmp/mobilenet_v1_224.pb \ - --input_checkpoint=/tmp/checkpoints/mobilenet-10202.ckpt \ - --input_binary=true \ - --output_graph=/tmp/frozen_mobilenet_v1_224.pb \ - --output_node_names=MobileNetV1/Predictions/Reshape_1 -``` - -The `input_binary` flag must be enabled so the protobuf is read and written in -a binary format. Set the `input_graph` and `input_checkpoint` files. - -The `output_node_names` may not be obvious outside of the code that built the -model. The easiest way to find them is to visualize the graph, either with -[TensorBoard](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/#3) -or `graphviz`. - -The frozen `GraphDef` is now ready for conversion to the `FlatBuffer` format -(.tflite) for use on Android or iOS devices. For Android, the Tensorflow -Optimizing Converter tool supports both float and quantized models. To convert -the frozen `GraphDef` to the .tflite format: - -``` -toco --input_file=$(pwd)/mobilenet_v1_1.0_224/frozen_graph.pb \ - --input_format=TENSORFLOW_GRAPHDEF \ - --output_format=TFLITE \ - --output_file=/tmp/mobilenet_v1_1.0_224.tflite \ - --inference_type=FLOAT \ - --input_type=FLOAT \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 \ - --input_shapes=1,224,224,3 -``` - -The `input_file` argument should reference the frozen `GraphDef` file -containing the model architecture. The [frozen_graph.pb](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz) -file used here is available for download. `output_file` is where the TensorFlow -Lite model will get generated. The `input_type` and `inference_type` -arguments should be set to `FLOAT`, unless converting a -@{$performance/quantization$quantized model}. Setting the `input_array`, -`output_array`, and `input_shape` arguments are not as straightforward. The -easiest way to find these values is to explore the graph using Tensorboard. Reuse -the arguments for specifying the output nodes for inference in the -`freeze_graph` step. - -It is also possible to use the Tensorflow Optimizing Converter with protobufs -from either Python or from the command line (see the -[toco_from_protos.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco/python/toco_from_protos.py) -example). This allows you to integrate the conversion step into the model design -workflow, ensuring the model is easily convertible to a mobile inference graph. 
-For example: - -```python -import tensorflow as tf - -img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3)) -val = img + tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.]) -out = tf.identity(val, name="out") - -with tf.Session() as sess: - tflite_model = tf.contrib.lite.toco_convert(sess.graph_def, [img], [out]) - open("converteds_model.tflite", "wb").write(tflite_model) -``` - -For usage, see the Tensorflow Optimizing Converter -[command-line examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md). - -Refer to the -[Ops compatibility guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md) -for troubleshooting help, and if that doesn't help, please -[file an issue](https://github.com/tensorflow/tensorflow/issues). - -The [development repo](https://github.com/tensorflow/tensorflow) contains a tool -to visualize TensorFlow Lite models after conversion. To build the -[visualize.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/tools/visualize.py) -tool: - -```sh -bazel run tensorflow/contrib/lite/tools:visualize -- model.tflite model_viz.html -``` - -This generates an interactive HTML page listing subgraphs, operations, and a -graph visualization. - - -## 3. Use the TensorFlow Lite model for inference in a mobile app - -After completing the prior steps, you should now have a `.tflite` model file. - -### Android - -Since Android apps are written in Java and the core TensorFlow library is in C++, -a JNI library is provided as an interface. This is only meant for inference—it -provides the ability to load a graph, set up inputs, and run the model to -calculate outputs. - -The open source Android demo app uses the JNI interface and is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/app). -You can also download a -[prebuilt APK](http://download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). -See the @{$tflite/demo_android} guide for details. - -The @{$mobile/android_build} guide has instructions for installing TensorFlow on -Android and setting up `bazel` and Android Studio. - -### iOS - -To integrate a TensorFlow model in an iOS app, see the -[TensorFlow Lite for iOS](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/ios.md) -guide and @{$tflite/demo_ios} guide. - -#### Core ML support - -Core ML is a machine learning framework used in Apple products. In addition to -using Tensorflow Lite models directly in your applications, you can convert -trained Tensorflow models to the -[CoreML](https://developer.apple.com/machine-learning/) format for use on Apple -devices. To use the converter, refer to the -[Tensorflow-CoreML converter documentation](https://github.com/tf-coreml/tf-coreml). - -### Raspberry Pi - -Compile Tensorflow Lite for a Raspberry Pi by following the -[RPi build instructions](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/rpi.md) -This compiles a static library file (`.a`) used to build your app. There are -plans for Python bindings and a demo app. 
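Across Android, iOS, and Raspberry Pi, the underlying C++ interpreter follows the same basic pattern: load the `.tflite` FlatBuffer, build an interpreter with an op resolver, fill the input tensors, and invoke. Here is a rough sketch; the model filename, tensor indices, and preprocessing are illustrative, and in this release the headers live under `tensorflow/contrib/lite`:

```c++
#include <memory>

#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"

int main() {
  // Load the FlatBuffer model from disk.
  auto model =
      tflite::FlatBufferModel::BuildFromFile("mobilenet_v1_1.0_224.tflite");
  if (!model) return 1;

  // Build an interpreter using the built-in operator implementations.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (!interpreter || interpreter->AllocateTensors() != kTfLiteOk) return 1;

  // Copy preprocessed pixel data into the first input tensor
  // (1 x 224 x 224 x 3 floats for this model).
  float* input = interpreter->typed_input_tensor<float>(0);
  // ... fill `input` here ...

  // Run inference and read back the class scores.
  if (interpreter->Invoke() != kTfLiteOk) return 1;
  float* scores = interpreter->typed_output_tensor<float>(0);
  // ... pick the highest-scoring classes from `scores` ...
  return 0;
}
```

The Java `Interpreter` class used by the Android demo wraps this same native flow behind a JNI interface.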
diff --git a/tensorflow/docs_src/mobile/tflite/index.md b/tensorflow/docs_src/mobile/tflite/index.md deleted file mode 100644 index cc4af2a875..0000000000 --- a/tensorflow/docs_src/mobile/tflite/index.md +++ /dev/null @@ -1,201 +0,0 @@ -# Introduction to TensorFlow Lite - -TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded -devices. It enables on-device machine learning inference with low latency and a -small binary size. TensorFlow Lite also supports hardware acceleration with the -[Android Neural Networks -API](https://developer.android.com/ndk/guides/neuralnetworks/index.html). - -TensorFlow Lite uses many techniques for achieving low latency such as -optimizing the kernels for mobile apps, pre-fused activations, and quantized -kernels that allow smaller and faster (fixed-point math) models. - -Most of our TensorFlow Lite documentation is [on -GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite) -for the time being. - -## What does TensorFlow Lite contain? - -TensorFlow Lite supports a set of core operators, both quantized and -float, which have been tuned for mobile platforms. They incorporate pre-fused -activations and biases to further enhance performance and quantized -accuracy. Additionally, TensorFlow Lite also supports using custom operations in -models. - -TensorFlow Lite defines a new model file format, based on -[FlatBuffers](https://google.github.io/flatbuffers/). FlatBuffers is an -open-sourced, efficient cross platform serialization library. It is similar to -[protocol buffers](https://developers.google.com/protocol-buffers/?hl=en), but -the primary difference is that FlatBuffers does not need a parsing/unpacking -step to a secondary representation before you can access data, often coupled -with per-object memory allocation. Also, the code footprint of FlatBuffers is an -order of magnitude smaller than protocol buffers. - -TensorFlow Lite has a new mobile-optimized interpreter, which has the key goals -of keeping apps lean and fast. The interpreter uses a static graph ordering and -a custom (less-dynamic) memory allocator to ensure minimal load, initialization, -and execution latency. - -TensorFlow Lite provides an interface to leverage hardware acceleration, if -available on the device. It does so via the -[Android Neural Networks API](https://developer.android.com/ndk/guides/neuralnetworks/index.html), -available on Android 8.1 (API level 27) and higher. - -## Why do we need a new mobile-specific library? - -Machine Learning is changing the computing paradigm, and we see an emerging -trend of new use cases on mobile and embedded devices. Consumer expectations are -also trending toward natural, human-like interactions with their devices, driven -by the camera and voice interaction models. - -There are several factors which are fueling interest in this domain: - -- Innovation at the silicon layer is enabling new possibilities for hardware - acceleration, and frameworks such as the Android Neural Networks API make it - easy to leverage these. - -- Recent advances in real-time computer-vision and spoken language understanding - have led to mobile-optimized benchmark models being open sourced - (e.g. MobileNets, SqueezeNet). - -- Widely-available smart appliances create new possibilities for - on-device intelligence. - -- Interest in stronger user data privacy paradigms where user data does not need - to leave the mobile device. 
- -- Ability to serve ‘offline’ use cases, where the device does not need to be - connected to a network. - -We believe the next wave of machine learning applications will have significant -processing on mobile and embedded devices. - -## TensorFlow Lite highlights - -TensorFlow Lite provides: - -- A set of core operators, both quantized and float, many of which have been - tuned for mobile platforms. These can be used to create and run custom - models. Developers can also write their own custom operators and use them in - models. - -- A new [FlatBuffers](https://google.github.io/flatbuffers/)-based - model file format. - -- On-device interpreter with kernels optimized for faster execution on mobile. - -- TensorFlow converter to convert TensorFlow-trained models to the TensorFlow - Lite format. - -- Smaller in size: TensorFlow Lite is smaller than 300KB when all supported - operators are linked and less than 200KB when using only the operators needed - for supporting InceptionV3 and Mobilenet. - -- **Pre-tested models:** - - All of the following models are guaranteed to work out of the box: - - - Inception V3, a popular model for detecting the dominant objects - present in an image. - - - [MobileNets](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md), - a family of mobile-first computer vision models designed to effectively - maximize accuracy while being mindful of the restricted resources for an - on-device or embedded application. They are small, low-latency, low-power - models parameterized to meet the resource constraints of a variety of use - cases. They can be built upon for classification, detection, embeddings - and segmentation. MobileNet models are smaller but [lower in - accuracy](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html) - than Inception V3. - - - On Device Smart Reply, an on-device model which provides one-touch - replies for an incoming text message by suggesting contextually relevant - messages. The model was built specifically for memory constrained devices - such as watches & phones and it has been successfully used to surface - [Smart Replies on Android - Wear](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html) - to all first-party and third-party apps. - - Also see the complete list of - [TensorFlow Lite's supported models](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md), - including the model sizes, performance numbers, and downloadable model files. - -- Quantized versions of the MobileNet model, which runs faster than the - non-quantized (float) version on CPU. - -- New Android demo app to illustrate the use of TensorFlow Lite with a quantized - MobileNet model for object classification. - -- Java and C++ API support - - -## Getting Started - -We recommend you try out TensorFlow Lite with the pre-tested models indicated -above. If you have an existing model, you will need to test whether your model -is compatible with both the converter and the supported operator set. To test -your model, see the -[documentation on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). - -### Retrain Inception-V3 or MobileNet for a custom data set - -The pre-trained models mentioned above have been trained on the ImageNet data -set, which consists of 1000 predefined classes. If those classes are not -relevant or useful for your use case, you will need to retrain those -models. 
This technique is called transfer learning, which starts with a model -that has been already trained on a problem and will then be retrained on a -similar problem. Deep learning from scratch can take days, but transfer learning -can be done fairly quickly. In order to do this, you'll need to generate your -custom data set labeled with the relevant classes. - -The [TensorFlow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/) -codelab walks through this process step-by-step. The retraining code supports -retraining for both floating point and quantized inference. - -## TensorFlow Lite Architecture - -The following diagram shows the architectural design of TensorFlow Lite: - -<img src="https://www.tensorflow.org/images/tflite-architecture.jpg" - alt="TensorFlow Lite architecture diagram" - style="max-width:600px;"> - -Starting with a trained TensorFlow model on disk, you'll convert that model to -the TensorFlow Lite file format (`.tflite`) using the TensorFlow Lite -Converter. Then you can use that converted file in your mobile application. - -Deploying the TensorFlow Lite model file uses: - -- Java API: A convenience wrapper around the C++ API on Android. - -- C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The - same library is available on both Android and iOS. - -- Interpreter: Executes the model using a set of kernels. The interpreter - supports selective kernel loading; without kernels it is only 100KB, and 300KB - with all the kernels loaded. This is a significant reduction from the 1.5M - required by TensorFlow Mobile. - -- On select Android devices, the Interpreter will use the Android Neural - Networks API for hardware acceleration, or default to CPU execution if none - are available. - -You can also implement custom kernels using the C++ API that can be used by the -Interpreter. - -## Future Work - -In future releases, TensorFlow Lite will support more models and built-in -operators, contain performance improvements for both fixed point and floating -point models, improvements to the tools to enable easier developer workflows and -support for other smaller devices and more. As we continue development, we hope -that TensorFlow Lite will greatly simplify the developer experience of targeting -a model for small devices. - -Future plans include using specialized machine learning hardware to get the best -possible performance for a particular model on a particular device. - -## Next Steps - -The TensorFlow Lite [GitHub repository](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). -contains additional docs, code samples, and demo applications. diff --git a/tensorflow/docs_src/mobile/tflite/performance.md b/tensorflow/docs_src/mobile/tflite/performance.md deleted file mode 100644 index 79bacaaa1b..0000000000 --- a/tensorflow/docs_src/mobile/tflite/performance.md +++ /dev/null @@ -1,174 +0,0 @@ -# Performance - -This document lists TensorFlow Lite performance benchmarks when running well -known models on some Android and iOS devices. - -These performance benchmark numbers were generated with the -[Android TFLite benchmark binary](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark) -and the [iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios). 
-
-## Android performance benchmarks
-
-For Android benchmarks, the CPU affinity is set to use big cores on the device to
-reduce variance (see [details](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#reducing-variance-between-runs-on-android)).
-
-The benchmark assumes that the models were downloaded and unzipped to the
-`/data/local/tmp/tflite_models` directory. The benchmark binary is built
-using [these instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#on-android)
-and is assumed to be in the `/data/local/tmp` directory.
-
-To run the benchmark:
-
-```
-adb shell taskset ${CPU_MASK} /data/local/tmp/benchmark_model \
-  --num_threads=1 \
-  --graph=/data/local/tmp/tflite_models/${GRAPH} \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=false
-```
-
-Here, `${GRAPH}` is the name of the model file and `${CPU_MASK}` is the CPU
-affinity chosen according to the following table:
-
-Device   | CPU_MASK
----------|---------
-Pixel 2  | f0
-Pixel XL | 0c
-
-<table>
-  <thead>
-    <tr>
-      <th>Model Name</th>
-      <th>Device</th>
-      <th>Mean inference time (std dev)</th>
-    </tr>
-  </thead>
-  <tr>
-    <td rowspan="2">
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>166.5 ms (2.6 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>122.9 ms (1.8 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>69.5 ms (0.9 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>78.9 ms (2.2 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>273.8 ms (3.5 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>210.8 ms (4.2 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>234.0 ms (2.1 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>158.0 ms (2.1 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>2846.0 ms (15.0 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>1973.0 ms (15.0 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>3180.0 ms (11.7 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>2262.0 ms (21.0 ms)</td>
-  </tr>
-</table>
-
-## iOS benchmarks
-
-To run the iOS benchmarks, the [benchmark
-app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios)
-was modified to include the appropriate model, and `benchmark_params.json` was
-modified to set `num_threads` to 1.
-
-<table>
-  <thead>
-    <tr>
-      <th>Model Name</th>
-      <th>Device</th>
-      <th>Mean inference time (std dev)</th>
-    </tr>
-  </thead>
-  <tr>
-    <td>
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>32.2 ms (0.8 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>24.4 ms (0.8 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>60.3 ms (0.6 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>44.3 ms (0.7 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>562.4 ms (18.2 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>661.0 ms (29.2 ms)</td>
-  </tr>
-</table>
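-
-The numbers above were collected with a single thread on the CPU. As a rough
-sketch of how the effect of threading or NNAPI can be explored on an Android
-device, reusing the binary, flags, and CPU mask from the Android section above
-(the model file name is an assumption based on the contents of the downloaded
-archive), you could rerun the benchmark with different settings and compare
-against the tables:
-
-```
-# MobileNet float model on a Pixel 2, using 4 threads instead of 1.
-adb shell taskset f0 /data/local/tmp/benchmark_model \
-  --num_threads=4 \
-  --graph=/data/local/tmp/tflite_models/mobilenet_v1_1.0_224.tflite \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=false
-
-# Same model, single thread, with NNAPI acceleration enabled where available.
-adb shell taskset f0 /data/local/tmp/benchmark_model \
-  --num_threads=1 \
-  --graph=/data/local/tmp/tflite_models/mobilenet_v1_1.0_224.tflite \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=true
-```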