author | Billy Lamberta <blamb@google.com> | 2018-07-24 11:52:23 -0700
---|---|---
committer | TensorFlower Gardener <gardener@tensorflow.org> | 2018-07-24 11:58:53 -0700
commit | ff2aa1b59d4a111af094c0c7724e453eefe1f3b7 (patch) |
tree | cd13149671e53a3b28e9a2fdb012310a46de03d9 /tensorflow/docs_src |
parent | badf913c0a2f83ca933b8fe73a29f7dd5d2bc5ce (diff) |
Setup for TFLite subsite
PiperOrigin-RevId: 205866236
Diffstat (limited to 'tensorflow/docs_src')
mode | file | lines changed
---|---|---
-rw-r--r-- | tensorflow/docs_src/mobile/README.md | 3
-rw-r--r-- | tensorflow/docs_src/mobile/android_build.md | 177
-rw-r--r-- | tensorflow/docs_src/mobile/index.md | 33
-rw-r--r-- | tensorflow/docs_src/mobile/ios_build.md | 107
-rw-r--r-- | tensorflow/docs_src/mobile/leftnav_files | 15
-rw-r--r-- | tensorflow/docs_src/mobile/linking_libs.md | 243
-rw-r--r-- | tensorflow/docs_src/mobile/mobile_intro.md | 248
-rw-r--r-- | tensorflow/docs_src/mobile/optimizing.md | 499
-rw-r--r-- | tensorflow/docs_src/mobile/prepare_models.md | 301
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/demo_android.md | 146
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/demo_ios.md | 68
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/devguide.md | 232
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/index.md | 201
-rw-r--r-- | tensorflow/docs_src/mobile/tflite/performance.md | 174
14 files changed, 3 insertions, 2444 deletions
diff --git a/tensorflow/docs_src/mobile/README.md b/tensorflow/docs_src/mobile/README.md new file mode 100644 index 0000000000..ecf4267265 --- /dev/null +++ b/tensorflow/docs_src/mobile/README.md @@ -0,0 +1,3 @@ +# TF Lite subsite + +This subsite directory lives in [tensorflow/contrib/lite/g3doc](../../contrib/lite/g3doc/). diff --git a/tensorflow/docs_src/mobile/android_build.md b/tensorflow/docs_src/mobile/android_build.md deleted file mode 100644 index f4b07db459..0000000000 --- a/tensorflow/docs_src/mobile/android_build.md +++ /dev/null @@ -1,177 +0,0 @@ -# Building TensorFlow on Android - -To get you started working with TensorFlow on Android, we'll walk through two -ways to build our TensorFlow mobile demos and deploying them on an Android -device. The first is Android Studio, which lets you build and deploy in an -IDE. The second is building with Bazel and deploying with ADB on the command -line. - -Why choose one or the other of these methods? - -The simplest way to use TensorFlow on Android is to use Android Studio. If you -aren't planning to customize your TensorFlow build at all, or if you want to use -Android Studio's editor and other features to build an app and just want to add -TensorFlow to it, we recommend using Android Studio. - -If you are using custom ops, or have some other reason to build TensorFlow from -scratch, scroll down and see our instructions -for [building the demo with Bazel](#build_the_demo_using_bazel). - -## Build the demo using Android Studio - -**Prerequisites** - -If you haven't already, do the following two things: - -- Install [Android Studio](https://developer.android.com/studio/index.html), - following the instructions on their website. - -- Clone the TensorFlow repository from GitHub: - - git clone https://github.com/tensorflow/tensorflow - -**Building** - -1. Open Android Studio, and from the Welcome screen, select **Open an existing - Android Studio project**. - -2. From the **Open File or Project** window that appears, navigate to and select - the `tensorflow/examples/android` directory from wherever you cloned the - TensorFlow GitHub repo. Click OK. - - If it asks you to do a Gradle Sync, click OK. - - You may also need to install various platforms and tools, if you get - errors like "Failed to find target with hash string 'android-23' and similar. - -3. Open the `build.gradle` file (you can go to **1:Project** in the side panel - and find it under the **Gradle Scripts** zippy under **Android**). Look for - the `nativeBuildSystem` variable and set it to `none` if it isn't already: - - // set to 'bazel', 'cmake', 'makefile', 'none' - def nativeBuildSystem = 'none' - -4. Click the *Run* button (the green arrow) or select *Run > Run 'android'* from the - top menu. You may need to rebuild the project using *Build > Rebuild Project*. - - If it asks you to use Instant Run, click **Proceed Without Instant Run**. - - Also, you need to have an Android device plugged in with developer options - enabled at this - point. See [here](https://developer.android.com/studio/run/device.html) for - more details on setting up developer devices. - -This installs three apps on your phone that are all part of the TensorFlow -Demo. See [Android Sample Apps](#android_sample_apps) for more information about -them. 
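If you want to confirm that the demo actually landed on your device, a quick `adb` check is enough. This is just a sketch: the package name `org.tensorflow.demo` is what the example project has commonly used, so treat it as an assumption and adjust for your build. Note that the three demo icons come from a single APK, so you should expect to see one package:

    adb devices                                    # device must be listed and authorized
    adb shell pm list packages | grep tensorflow   # expect something like package:org.tensorflow.demo
    adb logcat | grep -i tensorflow                # watch log output while the demos run
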
- -## Adding TensorFlow to your apps using Android Studio - -To add TensorFlow to your own apps on Android, the simplest way is to add the -following lines to your Gradle build file: - - allprojects { - repositories { - jcenter() - } - } - - dependencies { - compile 'org.tensorflow:tensorflow-android:+' - } - -This automatically downloads the latest stable version of TensorFlow as an AAR -and installs it in your project. - -## Build the demo using Bazel - -Another way to use TensorFlow on Android is to build an APK -using [Bazel](https://bazel.build/) and load it onto your device -using [ADB](https://developer.android.com/studio/command-line/adb.html). This -requires some knowledge of build systems and Android developer tools, but we'll -guide you through the basics here. - -- First, follow our instructions for @{$install/install_sources$installing from sources}. - This will also guide you through installing Bazel and cloning the - TensorFlow code. - -- Download the Android [SDK](https://developer.android.com/studio/index.html) - and [NDK](https://developer.android.com/ndk/downloads/index.html) if you do - not already have them. You need at least version 12b of the NDK, and 23 of the - SDK. - -- In your copy of the TensorFlow source, update the - [WORKSPACE](https://github.com/tensorflow/tensorflow/blob/master/WORKSPACE) - file with the location of your SDK and NDK, where it says <PATH_TO_NDK> - and <PATH_TO_SDK>. - -- Run Bazel to build the demo APK: - - bazel build -c opt //tensorflow/examples/android:tensorflow_demo - -- Use [ADB](https://developer.android.com/studio/command-line/adb.html#move) to - install the APK onto your device: - - adb install -r bazel-bin/tensorflow/examples/android/tensorflow_demo.apk - -Note: In general when compiling for Android with Bazel you need -`--config=android` on the Bazel command line, though in this case this -particular example is Android-only, so you don't need it here. - -This installs three apps on your phone that are all part of the TensorFlow -Demo. See [Android Sample Apps](#android_sample_apps) for more information about -them. - -## Android Sample Apps - -The -[Android example code](https://www.tensorflow.org/code/tensorflow/examples/android/) is -a single project that builds and installs three sample apps which all use the -same underlying code. The sample apps all take video input from a phone's -camera: - -- **TF Classify** uses the Inception v3 model to label the objects it’s pointed - at with classes from Imagenet. There are only 1,000 categories in Imagenet, - which misses most everyday objects and includes many things you’re unlikely to - encounter often in real life, so the results can often be quite amusing. For - example there’s no ‘person’ category, so instead it will often guess things it - does know that are often associated with pictures of people, like a seat belt - or an oxygen mask. If you do want to customize this example to recognize - objects you care about, you can use - the - [TensorFlow for Poets codelab](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) as - an example for how to train a model based on your own data. - -- **TF Detect** uses a multibox model to try to draw bounding boxes around the - locations of people in the camera. These boxes are annotated with the - confidence for each detection result. Results will not be perfect, as this - kind of object detection is still an active research topic. 
The demo also - includes optical tracking for when objects move between frames, which runs - more frequently than the TensorFlow inference. This improves the user - experience since the apparent frame rate is faster, but it also gives the - ability to estimate which boxes refer to the same object between frames, which - is important for counting objects over time. - -- **TF Stylize** implements a real-time style transfer algorithm on the camera - feed. You can select which styles to use and mix between them using the - palette at the bottom of the screen, and also switch out the resolution of the - processing to go higher or lower rez. - -When you build and install the demo, you'll see three app icons on your phone, -one for each of the demos. Tapping on them should open up the app and let you -explore what they do. You can enable profiling statistics on-screen by tapping -the volume up button while they’re running. - -### Android Inference Library - -Because Android apps need to be written in Java, and core TensorFlow is in C++, -TensorFlow has a JNI library to interface between the two. Its interface is aimed -only at inference, so it provides the ability to load a graph, set up inputs, -and run the model to calculate particular outputs. You can see the full -documentation for the minimal set of methods in -[TensorFlowInferenceInterface.java](https://www.tensorflow.org/code/tensorflow/contrib/android/java/org/tensorflow/contrib/android/TensorFlowInferenceInterface.java) - -The demos applications use this interface, so they’re a good place to look for -example usage. You can download prebuilt binary jars -at -[ci.tensorflow.org](https://ci.tensorflow.org/view/Nightly/job/nightly-android/). diff --git a/tensorflow/docs_src/mobile/index.md b/tensorflow/docs_src/mobile/index.md deleted file mode 100644 index 6032fcad02..0000000000 --- a/tensorflow/docs_src/mobile/index.md +++ /dev/null @@ -1,33 +0,0 @@ -# Overview - -TensorFlow was designed to be a good deep learning solution for mobile -platforms. Currently we have two solutions for deploying machine learning -applications on mobile and embedded devices: -@{$mobile/mobile_intro$TensorFlow for Mobile} and @{$mobile/tflite$TensorFlow Lite}. - -## TensorFlow Lite versus TensorFlow Mobile - -Here are a few of the differences between the two: - -- TensorFlow Lite is an evolution of TensorFlow Mobile. In most cases, apps - developed with TensorFlow Lite will have a smaller binary size, fewer - dependencies, and better performance. - -- TensorFlow Lite supports only a limited set of operators, so not all models - will work on it by default. TensorFlow for Mobile has a fuller set of - supported functionality. - -TensorFlow Lite provides better performance and a small binary size on mobile -platforms as well as the ability to leverage hardware acceleration if available -on their platforms. In addition, it has many fewer dependencies so it can be -built and hosted on simpler, more constrained device scenarios. TensorFlow Lite -also allows targeting accelerators through the [Neural Networks -API](https://developer.android.com/ndk/guides/neuralnetworks/index.html). - -TensorFlow Lite currently has coverage for a limited set of operators. While -TensorFlow for Mobile supports only a constrained set of ops by default, in -principle if you use an arbitrary operator in TensorFlow, it can be customized -to build that kernel. Thus use cases which are not currently supported by -TensorFlow Lite should continue to use TensorFlow for Mobile. 
As TensorFlow Lite -evolves, it will gain additional operators, and the decision will be easier to -make. diff --git a/tensorflow/docs_src/mobile/ios_build.md b/tensorflow/docs_src/mobile/ios_build.md deleted file mode 100644 index 4c84a1214a..0000000000 --- a/tensorflow/docs_src/mobile/ios_build.md +++ /dev/null @@ -1,107 +0,0 @@ -# Building TensorFlow on iOS - -## Using CocoaPods - -The simplest way to get started with TensorFlow on iOS is using the CocoaPods -package management system. You can add the `TensorFlow-experimental` pod to your -Podfile, which installs a universal binary framework. This makes it easy to get -started but has the disadvantage of being hard to customize, which is important -in case you want to shrink your binary size. If you do need the ability to -customize your libraries, see later sections on how to do that. - -## Creating your own app - -If you'd like to add TensorFlow capabilities to your own app, do the following: - -- Create your own app or load your already-created app in XCode. - -- Add a file named Podfile at the project root directory with the following content: - - target 'YourProjectName' - pod 'TensorFlow-experimental' - -- Run `pod install` to download and install the `TensorFlow-experimental` pod. - -- Open `YourProjectName.xcworkspace` and add your code. - -- In your app's **Build Settings**, make sure to add `$(inherited)` to the - **Other Linker Flags**, and **Header Search Paths** sections. - -## Running the Samples - -You'll need Xcode 7.3 or later to run our iOS samples. - -There are currently three examples: simple, benchmark, and camera. For now, you -can download the sample code by cloning the main tensorflow repository (we are -planning to make the samples available as a separate repository later). - -From the root of the tensorflow folder, download [Inception -v1](https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip), -and extract the label and graph files into the data folders inside both the -simple and camera examples using these steps: - - mkdir -p ~/graphs - curl -o ~/graphs/inception5h.zip \ - https://storage.googleapis.com/download.tensorflow.org/models/inception5h.zip \ - && unzip ~/graphs/inception5h.zip -d ~/graphs/inception5h - cp ~/graphs/inception5h/* tensorflow/examples/ios/benchmark/data/ - cp ~/graphs/inception5h/* tensorflow/examples/ios/camera/data/ - cp ~/graphs/inception5h/* tensorflow/examples/ios/simple/data/ - -Change into one of the sample directories, download the -[Tensorflow-experimental](https://cocoapods.org/pods/TensorFlow-experimental) -pod, and open the Xcode workspace. Note that installing the pod can take a long -time since it is big (~450MB). If you want to run the simple example, then: - - cd tensorflow/examples/ios/simple - pod install - open tf_simple_example.xcworkspace # note .xcworkspace, not .xcodeproj - # this is created by pod install - -Run the simple app in the XCode simulator. You should see a single-screen app -with a **Run Model** button. Tap that, and you should see some debug output -appear below indicating that the example Grace Hopper image in directory data -has been analyzed, with a military uniform recognized. - -Run the other samples using the same process. The camera example requires a real -device connected. Once you build and run that, you should get a live camera view -that you can point at objects to get real-time recognition results. 
- -### iOS Example details - -There are three demo applications for iOS, all defined in Xcode projects inside -[tensorflow/examples/ios](https://www.tensorflow.org/code/tensorflow/examples/ios/). - -- **Simple**: This is a minimal example showing how to load and run a TensorFlow - model in as few lines as possible. It just consists of a single view with a - button that executes the model loading and inference when its pressed. - -- **Camera**: This is very similar to the Android TF Classify demo. It loads - Inception v3 and outputs its best label estimate for what’s in the live camera - view. As with the Android version, you can train your own custom model using - TensorFlow for Poets and drop it into this example with minimal code changes. - -- **Benchmark**: is quite close to Simple, but it runs the graph repeatedly and - outputs similar statistics to the benchmark tool on Android. - - -### Troubleshooting - -- Make sure you use the TensorFlow-experimental pod (and not TensorFlow). - -- The TensorFlow-experimental pod is current about ~450MB. The reason it is so - big is because we are bundling multiple platforms, and the pod includes all - TensorFlow functionality (e.g. operations). The final app size after build is - substantially smaller though (~25MB). Working with the complete pod is - convenient during development, but see below section on how you can build your - own custom TensorFlow library to reduce the size. - -## Building the TensorFlow iOS libraries from source - -While Cocoapods is the quickest and easiest way of getting started, you sometimes -need more flexibility to determine which parts of TensorFlow your app should be -shipped with. For such cases, you can build the iOS libraries from the -sources. [This -guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/ios#building-the-tensorflow-ios-libraries-from-source) -contains detailed instructions on how to do that. - diff --git a/tensorflow/docs_src/mobile/leftnav_files b/tensorflow/docs_src/mobile/leftnav_files deleted file mode 100644 index 97340ef7e1..0000000000 --- a/tensorflow/docs_src/mobile/leftnav_files +++ /dev/null @@ -1,15 +0,0 @@ -index.md -### TensorFlow Lite -tflite/index.md -tflite/devguide.md -tflite/demo_android.md -tflite/demo_ios.md -tflite/performance.md ->>> -### TensorFlow Mobile -mobile_intro.md -android_build.md -ios_build.md -linking_libs.md -prepare_models.md -optimizing.md diff --git a/tensorflow/docs_src/mobile/linking_libs.md b/tensorflow/docs_src/mobile/linking_libs.md deleted file mode 100644 index efef5dd0da..0000000000 --- a/tensorflow/docs_src/mobile/linking_libs.md +++ /dev/null @@ -1,243 +0,0 @@ -# Integrating TensorFlow libraries - -Once you have made some progress on a model that addresses the problem you’re -trying to solve, it’s important to test it out inside your application -immediately. There are often unexpected differences between your training data -and what users actually encounter in the real world, and getting a clear picture -of the gap as soon as possible improves the product experience. - -This page talks about how to integrate the TensorFlow libraries into your own -mobile applications, once you have already successfully built and deployed the -TensorFlow mobile demo apps. - -## Linking the library - -After you've managed to build the examples, you'll probably want to call -TensorFlow from one of your existing applications. 
The very easiest way to do -this is to use the Pod installation steps described -@{$mobile/ios_build#using_cocoapods$here}, but if you want to build TensorFlow -from source (for example to customize which operators are included) you'll need -to break out TensorFlow as a framework, include the right header files, and link -against the built libraries and dependencies. - -### Android - -For Android, you just need to link in a Java library contained in a JAR file -called `libandroid_tensorflow_inference_java.jar`. There are three ways to -include this functionality in your program: - -1. Include the jcenter AAR which contains it, as in this - [example app](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/tfmobile/build.gradle#L59-L65) - -2. Download the nightly precompiled version from -[ci.tensorflow.org](http://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/). - -3. Build the JAR file yourself using the instructions [in our Android GitHub repo](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/android) - -### iOS - -Pulling in the TensorFlow libraries on iOS is a little more complicated. Here is -a checklist of what you’ll need to do to your iOS app: - -- Link against tensorflow/contrib/makefile/gen/lib/libtensorflow-core.a, usually - by adding `-L/your/path/tensorflow/contrib/makefile/gen/lib/` and - `-ltensorflow-core` to your linker flags. - -- Link against the generated protobuf libraries by adding - `-L/your/path/tensorflow/contrib/makefile/gen/protobuf_ios/lib` and - `-lprotobuf` and `-lprotobuf-lite` to your command line. - -- For the include paths, you need the root of your TensorFlow source folder as - the first entry, followed by - `tensorflow/contrib/makefile/downloads/protobuf/src`, - `tensorflow/contrib/makefile/downloads`, - `tensorflow/contrib/makefile/downloads/eigen`, and - `tensorflow/contrib/makefile/gen/proto`. - -- Make sure your binary is built with `-force_load` (or the equivalent on your - platform), aimed at the TensorFlow library to ensure that it’s linked - correctly. More detail on why this is necessary can be found in the next - section, [Global constructor magic](#global_constructor_magic). On Linux-like - platforms, you’ll need different flags, more like - `-Wl,--allow-multiple-definition -Wl,--whole-archive`. - -You’ll also need to link in the Accelerator framework, since this is used to -speed up some of the operations. - -## Global constructor magic - -One of the subtlest problems you may run up against is the “No session factory -registered for the given session options” error when trying to call TensorFlow -from your own application. To understand why this is happening and how to fix -it, you need to know a bit about the architecture of TensorFlow. - -The framework is designed to be very modular, with a thin core and a large -number of specific objects that are independent and can be mixed and matched as -needed. To enable this, the coding pattern in C++ had to let modules easily -notify the framework about the services they offer, without requiring a central -list that has to be updated separately from each implementation. It also had to -allow separate libraries to add their own implementations without needing a -recompile of the core. - -To achieve this capability, TensorFlow uses a registration pattern in a lot of -places. 
In the code, it looks like this: - - class MulKernel : OpKernel { - Status Compute(OpKernelContext* context) { … } - }; - REGISTER_KERNEL(MulKernel, “Mul”); - -This would be in a standalone `.cc` file linked into your application, either -as part of the main set of kernels or as a separate custom library. The magic -part is that the `REGISTER_KERNEL()` macro is able to inform the core of -TensorFlow that it has an implementation of the Mul operation, so that it can be -called in any graphs that require it. - -From a programming point of view, this setup is very convenient. The -implementation and registration code live in the same file, and adding new -implementations is as simple as compiling and linking it in. The difficult part -comes from the way that the `REGISTER_KERNEL()` macro is implemented. C++ -doesn’t offer a good mechanism for doing this sort of registration, so we have -to resort to some tricky code. Under the hood, the macro is implemented so that -it produces something like this: - - class RegisterMul { - public: - RegisterMul() { - global_kernel_registry()->Register(“Mul”, [](){ - return new MulKernel() - }); - } - }; - RegisterMul g_register_mul; - -This sets up a class `RegisterMul` with a constructor that tells the global -kernel registry what function to call when somebody asks it how to create a -“Mul” kernel. Then there’s a global object of that class, and so the constructor -should be called at the start of any program. - -While this may sound sensible, the unfortunate part is that the global object -that’s defined is not used by any other code, so linkers not designed with this -in mind will decide that it can be deleted. As a result, the constructor is -never called, and the class is never registered. All sorts of modules use this -pattern in TensorFlow, and it happens that `Session` implementations are the -first to be looked for when the code is run, which is why it shows up as the -characteristic error when this problem occurs. - -The solution is to force the linker to not strip any code from the library, even -if it believes it’s unused. On iOS, this step can be accomplished with the -`-force_load` flag, specifying a library path, and on Linux you need -`--whole-archive`. These persuade the linker to not be as aggressive about -stripping, and should retain the globals. - -The actual implementation of the various `REGISTER_*` macros is a bit more -complicated in practice, but they all suffer the same underlying problem. If -you’re interested in how they work, [op_kernel.h](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_kernel.h#L1091) -is a good place to start investigating. - -## Protobuf problems - -TensorFlow relies on -the [Protocol Buffer](https://developers.google.com/protocol-buffers/) library, -commonly known as protobuf. This library takes definitions of data structures -and produces serialization and access code for them in a variety of -languages. The tricky part is that this generated code needs to be linked -against shared libraries for the exact same version of the framework that was -used for the generator. This can be an issue when `protoc`, the tool used to -generate the code, is from a different version of protobuf than the libraries in -the standard linking and include paths. For example, you might be using a copy -of `protoc` that was built locally in `~/projects/protobuf-3.0.1.a`, but you have -libraries installed at `/usr/local/lib` and `/usr/local/include` that are from -3.0.0. 
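Before digging through build errors, it can save time to confirm which generator and which runtime headers your build is actually picking up. A minimal sketch, assuming a Unix-like system with protobuf headers under `/usr/local/include` (adjust paths for your setup):

    which protoc
    protoc --version   # version of the code generator on your PATH
    # version of the runtime headers you compile and link against:
    grep GOOGLE_PROTOBUF_VERSION /usr/local/include/google/protobuf/stubs/common.h

If the generator and the headers report different versions, that mismatch is the likely cause of the problems described next.
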
- -The symptoms of this issue are errors during the compilation or linking phases -with protobufs. Usually, the build tools take care of this, but if you’re using -the makefile, make sure you’re building the protobuf library locally and using -it, as shown in [this Makefile](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/Makefile#L18). - -Another situation that can cause problems is when protobuf headers and source -files need to be generated as part of the build process. This process makes -building more complex, since the first phase has to be a pass over the protobuf -definitions to create all the needed code files, and only after that can you go -ahead and do a build of the library code. - -### Multiple versions of protobufs in the same app - -Protobufs generate headers that are needed as part of the C++ interface to the -overall TensorFlow library. This complicates using the library as a standalone -framework. - -If your application is already using version 1 of the protocol buffers library, -you may have trouble integrating TensorFlow because it requires version 2. If -you just try to link both versions into the same binary, you’ll see linking -errors because some of the symbols clash. To solve this particular problem, we -have an experimental script at [rename_protobuf.sh](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/makefile/rename_protobuf.sh). - -You need to run this as part of the makefile build, after you’ve downloaded all -the dependencies: - - tensorflow/contrib/makefile/download_dependencies.sh - tensorflow/contrib/makefile/rename_protobuf.sh - -## Calling the TensorFlow API - -Once you have the framework available, you then need to call into it. The usual -pattern is that you first load your model, which represents a preset set of -numeric computations, and then you run inputs through that model (for example, -images from a camera) and receive outputs (for example, predicted labels). - -On Android, we provide the Java Inference Library that is focused on just this -use case, while on iOS and Raspberry Pi you call directly into the C++ API. - -### Android - -Here’s what a typical Inference Library sequence looks like on Android: - - // Load the model from disk. - TensorFlowInferenceInterface inferenceInterface = - new TensorFlowInferenceInterface(assetManager, modelFilename); - - // Copy the input data into TensorFlow. - inferenceInterface.feed(inputName, floatValues, 1, inputSize, inputSize, 3); - - // Run the inference call. - inferenceInterface.run(outputNames, logStats); - - // Copy the output Tensor back into the output array. - inferenceInterface.fetch(outputName, outputs); - -You can find the source of this code in the [Android examples](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/android/src/org/tensorflow/demo/TensorFlowImageClassifier.java#L107). - -### iOS and Raspberry Pi - -Here’s the equivalent code for iOS and Raspberry Pi: - - // Load the model. - PortableReadFileToProto(file_path, &tensorflow_graph); - - // Create a session from the model. - tensorflow::Status s = session->Create(tensorflow_graph); - if (!s.ok()) { - LOG(FATAL) << "Could not create TensorFlow Graph: " << s; - } - - // Run the model. 
- std::string input_layer = "input"; - std::string output_layer = "output"; - std::vector<tensorflow::Tensor> outputs; - tensorflow::Status run_status = session->Run({{input_layer, image_tensor}}, - {output_layer}, {}, &outputs); - if (!run_status.ok()) { - LOG(FATAL) << "Running model failed: " << run_status; - } - - // Access the output data. - tensorflow::Tensor* output = &outputs[0]; - -This is all based on the -[iOS sample code](https://www.tensorflow.org/code/tensorflow/examples/ios/simple/RunModelViewController.mm), -but there’s nothing iOS-specific; the same code should be usable on any platform -that supports C++. - -You can also find specific examples for Raspberry Pi -[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/pi_examples/label_image/label_image.cc). diff --git a/tensorflow/docs_src/mobile/mobile_intro.md b/tensorflow/docs_src/mobile/mobile_intro.md deleted file mode 100644 index baad443308..0000000000 --- a/tensorflow/docs_src/mobile/mobile_intro.md +++ /dev/null @@ -1,248 +0,0 @@ -# Introduction to TensorFlow Mobile - -TensorFlow was designed from the ground up to be a good deep learning solution -for mobile platforms like Android and iOS. This mobile guide should help you -understand how machine learning can work on mobile platforms and how to -integrate TensorFlow into your mobile apps effectively and efficiently. - -## About this Guide - -This guide is aimed at developers who have a TensorFlow model that’s -successfully working in a desktop environment, who want to integrate it into -a mobile application, and cannot use TensorFlow Lite. Here are the -main challenges you’ll face during that process: - -- Understanding how to use Tensorflow for mobile. -- Building TensorFlow for your platform. -- Integrating the TensorFlow library into your application. -- Preparing your model file for mobile deployment. -- Optimizing for latency, RAM usage, model file size, and binary size. - -## Common use cases for mobile machine learning - -**Why run TensorFlow on mobile?** - -Traditionally, deep learning has been associated with data centers and giant -clusters of high-powered GPU machines. However, it can be very expensive and -time-consuming to send all of the data a device has access to across a network -connection. Running on mobile makes it possible to deliver very interactive -applications in a way that’s not possible when you have to wait for a network -round trip. - -Here are some common use cases for on-device deep learning: - -### Speech Recognition - -There are a lot of interesting applications that can be built with a -speech-driven interface, and many of these require on-device processing. Most of -the time a user isn’t giving commands, and so streaming audio continuously to a -remote server would be a waste of bandwidth, since it would mostly be silence or -background noises. To solve this problem it’s common to have a small neural -network running on-device -[listening out for a particular keyword](../tutorials/sequences/audio_recognition). -Once that keyword has been spotted, the rest of the -conversation can be transmitted over to the server for further processing if -more computing power is needed. - -### Image Recognition - -It can be very useful for a mobile app to be able to make sense of a camera -image. If your users are taking photos, recognizing what’s in them can help your -camera apps apply appropriate filters, or label the photos so they’re easily -findable. 
It’s important for embedded applications too, since you can use image -sensors to detect all sorts of interesting conditions, whether it’s spotting -endangered animals in the wild -or -[reporting how late your train is running](https://svds.com/tensorflow-image-recognition-raspberry-pi/). - -TensorFlow comes with several examples of recognizing the types of objects -inside images along with a variety of different pre-trained models, and they can -all be run on mobile devices. You can try out -our -[Tensorflow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/index.html#0) and -[Tensorflow for Poets 2: Optimize for Mobile](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/index.html#0) codelabs to -see how to take a pretrained model and run some very fast and lightweight -training to teach it to recognize specific objects, and then optimize it to -run on mobile. - -### Object Localization - -Sometimes it’s important to know where objects are in an image as well as what -they are. There are lots of augmented reality use cases that could benefit a -mobile app, such as guiding users to the right component when offering them -help fixing their wireless network or providing informative overlays on top of -landscape features. Embedded applications often need to count objects that are -passing by them, whether it’s pests in a field of crops, or people, cars and -bikes going past a street lamp. - -TensorFlow offers a pretrained model for drawing bounding boxes around people -detected in images, together with tracking code to follow them over time. The -tracking is especially important for applications where you’re trying to count -how many objects are present over time, since it gives you a good idea when a -new object enters or leaves the scene. We have some sample code for this -available for Android [on -GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android), -and also a [more general object detection -model](https://github.com/tensorflow/models/tree/master/research/object_detection/README.md) -available as well. - -### Gesture Recognition - -It can be useful to be able to control applications with hand or other -gestures, either recognized from images or through analyzing accelerometer -sensor data. Creating those models is beyond the scope of this guide, but -TensorFlow is an effective way of deploying them. - -### Optical Character Recognition - -Google Translate’s live camera view is a great example of how effective -interactive on-device detection of text can be. - -<div class="video-wrapper"> - <iframe class="devsite-embedded-youtube-video" data-video-id="06olHmcJjS0" - data-autohide="1" data-showinfo="0" frameborder="0" allowfullscreen> - </iframe> -</div> - -There are multiple steps involved in recognizing text in images. You first have -to identify the areas where the text is present, which is a variation on the -object localization problem, and can be solved with similar techniques. Once you -have an area of text, you then need to interpret it as letters, and then use a -language model to help guess what words they represent. The simplest way to -estimate what letters are present is to segment the line of text into individual -letters, and then apply a simple neural network to the bounding box of each. You -can get good results with the kind of models used for MNIST, which you can find -in TensorFlow’s tutorials, though you may want a higher-resolution input. 
A -more advanced alternative is to use an LSTM model to process a whole line of -text at once, with the model itself handling the segmentation into different -characters. - -### Translation - -Translating from one language to another quickly and accurately, even if you -don’t have a network connection, is an important use case. Deep networks are -very effective at this sort of task, and you can find descriptions of a lot of -different models in the literature. Often these are sequence-to-sequence -recurrent models where you’re able to run a single graph to do the whole -translation, without needing to run separate parsing stages. - -### Text Classification - -If you want to suggest relevant prompts to users based on what they’re typing or -reading, it can be very useful to understand the meaning of the text. This is -where text classification comes in. Text classification is an umbrella term -that covers everything from sentiment analysis to topic discovery. You’re likely -to have your own categories or labels that you want to apply, so the best place -to start is with an example -like -[Skip-Thoughts](https://github.com/tensorflow/models/tree/master/research/skip_thoughts/), -and then train on your own examples. - -### Voice Synthesis - -A synthesized voice can be a great way of giving users feedback or aiding -accessibility, and recent advances such as -[WaveNet](https://deepmind.com/blog/wavenet-generative-model-raw-audio/) show -that deep learning can offer very natural-sounding speech. - -## Mobile machine learning and the cloud - -These examples of use cases give an idea of how on-device networks can -complement cloud services. Cloud has a great deal of computing power in a -controlled environment, but running on devices can offer higher interactivity. -In situations where the cloud is unavailable, or your cloud capacity is limited, -you can provide an offline experience, or reduce cloud workload by processing -easy cases on device. - -Doing on-device computation can also signal when it's time to switch to working -on the cloud. A good example of this is hotword detection in speech. Since -devices are able to constantly listen out for the keywords, this then triggers a -lot of traffic to cloud-based speech recognition once one is recognized. Without -the on-device component, the whole application wouldn’t be feasible, and this -pattern exists across several other applications as well. Recognizing that some -sensor input is interesting enough for further processing makes a lot of -interesting products possible. - -## What hardware and software should you have? - -TensorFlow runs on Ubuntu Linux, Windows 10, and OS X. For a list of all -supported operating systems and instructions to install TensorFlow, see -@{$install$Installing Tensorflow}. - -Note that some of the sample code we provide for mobile TensorFlow requires you -to compile TensorFlow from source, so you’ll need more than just `pip install` -to work through all the sample code. - -To try out the mobile examples, you’ll need a device set up for development, -using -either [Android Studio](https://developer.android.com/studio/install.html), -or [XCode](https://developer.apple.com/xcode/) if you're developing for iOS. - -## What should you do before you get started? - -Before thinking about how to get your solution on mobile: - -1. Determine whether your problem is solvable by mobile machine learning -2. Create a labelled dataset to define your problem -3. 
Pick an effective model for the problem - -We'll discuss these in more detail below. - -### Is your problem solvable by mobile machine learning? - -Once you have an idea of the problem you want to solve, you need to make a plan -of how to build your solution. The most important first step is making sure that -your problem is actually solvable, and the best way to do that is to mock it up -using humans in the loop. - -For example, if you want to drive a robot toy car using voice commands, try -recording some audio from the device and listen back to it to see if you can -make sense of what’s being said. Often you’ll find there are problems in the -capture process, such as the motor drowning out speech or not being able to hear -at a distance, and you should tackle these problems before investing in the -modeling process. - -Another example would be giving photos taken from your app to people see if they -can classify what’s in them, in the way you’re looking for. If they can’t do -that (for example, trying to estimate calories in food from photos may be -impossible because all white soups look the same), then you’ll need to redesign -your experience to cope with that. A good rule of thumb is that if a human can’t -handle the task then it will be difficult to train a computer to do better. - -### Create a labelled dataset - -After you’ve solved any fundamental issues with your use case, you need to -create a labeled dataset to define what problem you’re trying to solve. This -step is extremely important, more than picking which model to use. You want it -to be as representative as possible of your actual use case, since the model -will only be effective at the task you teach it. It’s also worth investing in -tools to make labeling the data as efficient and accurate as possible. For -example, if you’re able to switch from having to click a button on a web -interface to simple keyboard shortcuts, you may be able to speed up the -generation process a lot. You should also start by doing the initial labeling -yourself, so you can learn about the difficulties and likely errors, and -possibly change your labeling or data capture process to avoid them. Once you -and your team are able to consistently label examples (that is once you -generally agree on the same labels for most examples), you can then try and -capture your knowledge in a manual and teach external raters how to run the same -process. - -### Pick an effective model - -The next step is to pick an effective model to use. You might be able to avoid -training a model from scratch if someone else has already implemented a model -similar to what you need; we have a repository of models implemented in -TensorFlow [on GitHub](https://github.com/tensorflow/models) that you can look -through. Lean towards the simplest model you can find, and try to get started as -soon as you have even a small amount of labelled data, since you’ll get the best -results when you’re able to iterate quickly. The shorter the time it takes to -try training a model and running it in its real application, the better overall -results you’ll see. It’s common for an algorithm to get great training accuracy -numbers but then fail to be useful within a real application because there’s a -mismatch between the dataset and real usage. Prototype end-to-end usage as soon -as possible to create a consistent user experience. - -## Next Steps - -We suggest you get started by building one of our demos for -@{$mobile/android_build$Android} or @{$mobile/ios_build$iOS}. 
diff --git a/tensorflow/docs_src/mobile/optimizing.md b/tensorflow/docs_src/mobile/optimizing.md deleted file mode 100644 index 778e4d3a62..0000000000 --- a/tensorflow/docs_src/mobile/optimizing.md +++ /dev/null @@ -1,499 +0,0 @@ -# Optimizing for mobile - -There are some special issues that you have to deal with when you’re trying to -ship on mobile or embedded devices, and you’ll need to think about these as -you’re developing your model. - -These issues are: - -- Model and Binary Size -- App speed and model loading speed -- Performance and threading - -We'll discuss a few of these below. - -## What are the minimum device requirements for TensorFlow? - -You need at least one megabyte of program memory and several megabytes of RAM to -run the base TensorFlow runtime, so it’s not suitable for DSPs or -microcontrollers. Other than those, the biggest constraint is usually the -calculation speed of the device, and whether you can run the model you need for -your application with a low enough latency. You can use the benchmarking tools -in [How to Profile your Model](#how_to_profile_your_model) to get an idea of how -many FLOPs are required for a model, and then use that to make rule-of-thumb -estimates of how fast they will run on different devices. For example, a modern -smartphone might be able to run 10 GFLOPs per second, so the best you could hope -for from a 5 GFLOP model is two frames per second, though you may do worse -depending on what the exact computation patterns are. - -This model dependence means that it’s possible to run TensorFlow even on very -old or constrained phones, as long as you optimize your network to fit within -the latency budget and possibly within limited RAM too. For memory usage, you -mostly need to make sure that the intermediate buffers that TensorFlow creates -aren’t too large, which you can examine in the benchmark output too. - -## Speed - -One of the highest priorities of most model deployments is figuring out how to -run the inference fast enough to give a good user experience. The first place to -start is by looking at the total number of floating point operations that are -required to execute the graph. You can get a very rough estimate of this by -using the `benchmark_model` tool: - - bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \ - bazel-bin/tensorflow/tools/benchmark/benchmark_model \ - --graph=/tmp/inception_graph.pb --input_layer="Mul:0" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true --show_flops=true --logtostderr - -This should show you an estimate of how many operations are needed to run the -graph. You can then use that information to figure out how feasible your model -is to run on the devices you’re targeting. For an example, a high-end phone from -2016 might be able to do 20 billion FLOPs per second, so the best speed you -could hope for from a model that requires 10 billion FLOPs is around 500ms. On a -device like the Raspberry Pi 3 that can do about 5 billion FLOPs, you may only -get one inference every two seconds. - -Having this estimate helps you plan for what you’ll be able to realistically -achieve on a device. If the model is using too many ops, then there are a lot of -opportunities to optimize the architecture to reduce that number. 
- -Advanced techniques include [SqueezeNet](https://arxiv.org/abs/1602.07360) -and [MobileNet](https://arxiv.org/abs/1704.04861), which are architectures -designed to produce models for mobile -- lean and fast but with a small accuracy -cost. You can also just look at alternative models, even older ones, which may -be smaller. For example, Inception v1 only has around 7 million parameters, -compared to Inception v3’s 24 million, and requires only 3 billion FLOPs rather -than 9 billion for v3. - -## Model Size - -Models that run on a device need to be stored somewhere on the device, and very -large neural networks can be hundreds of megabytes. Most users are reluctant to -download very large app bundles from app stores, so you want to make your model -as small as possible. Furthermore, smaller neural networks can persist in and -out of a mobile device's memory faster. - -To understand how large your network will be on disk, start by looking at the -size on disk of your `GraphDef` file after you’ve run `freeze_graph` and -`strip_unused_nodes` on it (see @{$mobile/prepare_models$Preparing models} for -more details on these tools), since then it should only contain -inference-related nodes. To double-check that your results are as expected, run -the `summarize_graph` tool to see how many parameters are in constants: - - bazel build tensorflow/tools/graph_transforms:summarize_graph && \ - bazel-bin/tensorflow/tools/graph_transforms/summarize_graph \ - --in_graph=/tmp/tensorflow_inception_graph.pb - -That command should give you output that looks something like this: - - No inputs spotted. - Found 1 possible outputs: (name=softmax, op=Softmax) - Found 23885411 (23.89M) const parameters, 0 (0) variable parameters, - and 99 control_edges - Op types used: 489 Const, 99 CheckNumerics, 99 Identity, 94 - BatchNormWithGlobalNormalization, 94 Conv2D, 94 Relu, 11 Concat, 9 AvgPool, - 5 MaxPool, 1 Sub, 1 Softmax, 1 ResizeBilinear, 1 Reshape, 1 Mul, 1 MatMul, - 1 ExpandDims, 1 DecodeJpeg, 1 Cast, 1 BiasAdd - -The important part for our current purposes is the number of const -parameters. In most models these will be stored as 32-bit floats to start, so if -you multiply the number of const parameters by four, you should get something -that’s close to the size of the file on disk. You can often get away with only -eight-bits per parameter with very little loss of accuracy in the final result, -so if your file size is too large you can try using -@{$performance/quantization$quantize_weights} to transform the parameters down. - - bazel build tensorflow/tools/graph_transforms:transform_graph && \ - bazel-bin/tensorflow/tools/graph_transforms/transform_graph \ - --in_graph=/tmp/tensorflow_inception_optimized.pb \ - --out_graph=/tmp/tensorflow_inception_quantized.pb \ - --inputs='Mul:0' --outputs='softmax:0' --transforms='quantize_weights' - -If you look at the resulting file size, you should see that it’s about a quarter -of the original at 23MB. - -Another transform is `round_weights`, which doesn't make the file smaller, but it -makes the file compressible to about the same size as when `quantize_weights` is -used. This is particularly useful for mobile development, taking advantage of -the fact that app bundles are compressed before they’re downloaded by consumers. - -The original file does not compress well with standard algorithms, because the -bit patterns of even very similar numbers can be very different. 
The -`round_weights` transform keeps the weight parameters stored as floats, but -rounds them to a set number of step values. This means there are a lot more -repeated byte patterns in the stored model, and so compression can often bring -the size down dramatically, in many cases to near the size it would be if they -were stored as eight bit. - -Another advantage of `round_weights` is that the framework doesn’t have to -allocate a temporary buffer to unpack the parameters into, as we have to when -we just use `quantize_weights`. This saves a little bit of latency (though the -results should be cached so it’s only costly on the first run) and makes it -possible to use memory mapping, as described later. - -## Binary Size - -One of the biggest differences between mobile and server development is the -importance of binary size. On desktop machines it’s not unusual to have -executables that are hundreds of megabytes on disk, but for mobile and embedded -apps it’s vital to keep the binary as small as possible so that user downloads -are easy. As mentioned above, TensorFlow only includes a subset of op -implementations by default, but this still results in a 12 MB final -executable. To reduce this, you can set up the library to only include the -implementations of the ops that you actually need, based on automatically -analyzing your model. To use it: - -- Run `tools/print_required_ops/print_selective_registration_header.py` on your - model to produce a header file that only enables the ops it uses. - -- Place the `ops_to_register.h` file somewhere that the compiler can find - it. This can be in the root of your TensorFlow source folder. - -- Build TensorFlow with `SELECTIVE_REGISTRATION` defined, for example by passing - in `--copts=”-DSELECTIVE_REGISTRATION”` to your Bazel build command. - -This process recompiles the library so that only the needed ops and types are -included, which can dramatically reduce the executable size. For example, with -Inception v3, the new size is only 1.5MB. - -## How to Profile your Model - -Once you have an idea of what your device's peak performance range is, it’s -worth looking at its actual current performance. Using a standalone TensorFlow -benchmark, rather than running it inside a larger app, helps isolate just the -Tensorflow contribution to the -latency. The -[tensorflow/tools/benchmark](https://www.tensorflow.org/code/tensorflow/tools/benchmark/) tool -is designed to help you do this. 
To run it on Inception v3 on your desktop -machine, build this benchmark model: - - bazel build -c opt tensorflow/tools/benchmark:benchmark_model && \ - bazel-bin/tensorflow/tools/benchmark/benchmark_model \ - --graph=/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true --show_flops=true --logtostderr - -You should see output that looks something like this: - -<pre> -============================== Top by Computation Time ============================== -[node - type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [Name] -Conv2D 22.859 14.212 13.700 4.972% 4.972% 3871.488 conv_4/Conv2D -Conv2D 8.116 8.964 11.315 4.106% 9.078% 5531.904 conv_2/Conv2D -Conv2D 62.066 16.504 7.274 2.640% 11.717% 443.904 mixed_3/conv/Conv2D -Conv2D 2.530 6.226 4.939 1.792% 13.510% 2765.952 conv_1/Conv2D -Conv2D 55.585 4.605 4.665 1.693% 15.203% 313.600 mixed_2/tower/conv_1/Conv2D -Conv2D 127.114 5.469 4.630 1.680% 16.883% 81.920 mixed_10/conv/Conv2D -Conv2D 47.391 6.994 4.588 1.665% 18.548% 313.600 mixed_1/tower/conv_1/Conv2D -Conv2D 39.463 7.878 4.336 1.574% 20.122% 313.600 mixed/tower/conv_1/Conv2D -Conv2D 127.113 4.192 3.894 1.413% 21.535% 114.688 mixed_10/tower_1/conv/Conv2D -Conv2D 70.188 5.205 3.626 1.316% 22.850% 221.952 mixed_4/conv/Conv2D - -============================== Summary by node type ============================== -[Node type] [count] [avg ms] [avg %] [cdf %] [mem KB] -Conv2D 94 244.899 88.952% 88.952% 35869.953 -BiasAdd 95 9.664 3.510% 92.462% 35873.984 -AvgPool 9 7.990 2.902% 95.364% 7493.504 -Relu 94 5.727 2.080% 97.444% 35869.953 -MaxPool 5 3.485 1.266% 98.710% 3358.848 -Const 192 1.727 0.627% 99.337% 0.000 -Concat 11 1.081 0.393% 99.730% 9892.096 -MatMul 1 0.665 0.242% 99.971% 4.032 -Softmax 1 0.040 0.015% 99.986% 4.032 -<> 1 0.032 0.012% 99.997% 0.000 -Reshape 1 0.007 0.003% 100.000% 0.000 - -Timings (microseconds): count=50 first=330849 curr=274803 min=232354 max=415352 avg=275563 std=44193 -Memory (bytes): count=50 curr=128366400(all same) -514 nodes defined 504 nodes observed -</pre> - -This is the summary view, which is enabled by the show_summary flag. To -interpret it, the first table is a list of the nodes that took the most time, in -order by how long they took. From left to right, the columns are: - -- Node type, what kind of operation this was. - -- Start time of the op, showing where it falls in the sequence of operations. - -- First time in milliseconds. This is how long the operation took on the first - run of the benchmark, since by default 20 runs are executed to get more - reliable statistics. The first time is useful to spot which ops are doing - expensive calculations on the first run, and then caching the results. - -- Average time for the operation across all runs, in milliseconds. - -- What percentage of the total time for one run the op took. This is useful to - understand where the hotspots are. - -- The cumulative total time of this and the previous ops in the table. This is - handy for understanding what the distribution of work is across the layers, to - see if just a few of the nodes are taking up most of the time. - -- The amount of memory consumed by outputs of this type of op. - -- Name of the node. - -The second table is similar, but instead of breaking down the timings by -particular named nodes, it groups them by the kind of op. 
This is very useful to -understand which op implementations you might want to optimize or eliminate from -your graph. The table is arranged with the most costly operations at the start, -and only shows the top ten entries, with a placeholder for other nodes. The -columns from left to right are: - -- Type of the nodes being analyzed. - -- Accumulated average time taken by all nodes of this type, in milliseconds. - -- What percentage of the total time was taken by this type of operation. - -- Cumulative time taken by this and op types higher in the table, so you can - understand the distribution of the workload. - -- How much memory the outputs of this op type took up. - -Both of these tables are set up so that you can easily copy and paste their -results into spreadsheet documents, since they are output with tabs as -separators between the columns. The summary by node type can be the most useful -when looking for optimization opportunities, since it’s a pointer to the code -that’s taking the most time. In this case, you can see that the Conv2D ops are -almost 90% of the execution time. This is a sign that the graph is pretty -optimal, since convolutions and matrix multiplies are expected to be the bulk of -a neural network’s computing workload. - -As a rule of thumb, it’s more worrying if you see a lot of other operations -taking up more than a small fraction of the time. For neural networks, the ops -that don’t involve large matrix multiplications should usually be dwarfed by the -ones that do, so if you see a lot of time going into those it’s a sign that -either your network is non-optimally constructed, or the code implementing those -ops is not as optimized as it could -be. [Performance bugs](https://github.com/tensorflow/tensorflow/issues) or -patches are always welcome if you do encounter this situation, especially if -they include an attached model exhibiting this behavior and the command line -used to run the benchmark tool on it. - -The run above was on your desktop, but the tool also works on Android, which is -where it’s most useful for mobile development. Here’s an example command line to -run it on a 64-bit ARM device: - - bazel build -c opt --config=android_arm64 \ - tensorflow/tools/benchmark:benchmark_model - adb push bazel-bin/tensorflow/tools/benchmark/benchmark_model /data/local/tmp - adb push /tmp/tensorflow_inception_graph.pb /data/local/tmp/ - adb shell '/data/local/tmp/benchmark_model \ - --graph=/data/local/tmp/tensorflow_inception_graph.pb --input_layer="Mul" \ - --input_layer_shape="1,299,299,3" --input_layer_type="float" \ - --output_layer="softmax:0" --show_run_order=false --show_time=false \ - --show_memory=false --show_summary=true' - -You can interpret the results in exactly the same way as the desktop version -above. If you have any trouble figuring out what the right input and output -names and types are, take a look at the @{$mobile/prepare_models$Preparing models} -page for details about detecting these for your model, and look at the -`summarize_graph` tool which may give you -helpful information. - -There isn’t good support for command line tools on iOS, so instead there’s a -separate example -at -[tensorflow/examples/ios/benchmark](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark) that -packages the same functionality inside a standalone app. This outputs the -statistics to both the screen of the device and the debug log. 
If you want -on-screen statistics for the Android example apps, you can turn them on by -pressing the volume-up button. - -## Profiling within your own app - -The output you see from the benchmark tool is generated from modules that are -included as part of the standard TensorFlow runtime, which means you have access -to them within your own applications too. You can see an example of how to do -that [here](https://www.tensorflow.org/code/tensorflow/examples/ios/benchmark/BenchmarkViewController.mm?l=139). - -The basic steps are: - -1. Create a StatSummarizer object: - - tensorflow::StatSummarizer stat_summarizer(tensorflow_graph); - -2. Set up the options: - - tensorflow::RunOptions run_options; - run_options.set_trace_level(tensorflow::RunOptions::FULL_TRACE); - tensorflow::RunMetadata run_metadata; - -3. Run the graph: - - run_status = session->Run(run_options, inputs, output_layer_names, {}, - output_layers, &run_metadata); - -4. Calculate the results and print them out: - - assert(run_metadata.has_step_stats()); - const tensorflow::StepStats& step_stats = run_metadata.step_stats(); - stat_summarizer->ProcessStepStats(step_stats); - stat_summarizer->PrintStepStats(); - -## Visualizing Models - -The most effective way to speed up your code is by altering your model so it -does less work. To do that, you need to understand what your model is doing, and -visualizing it is a good first step. To get a high-level overview of your graph, -use [TensorBoard](https://github.com/tensorflow/tensorboard). - -## Threading - -The desktop version of TensorFlow has a sophisticated threading model, and will -try to run multiple operations in parallel if it can. In our terminology this is -called “inter-op parallelism” (though to avoid confusion with “intra-op”, you -could think of it as “between-op” instead), and can be set by specifying -`inter_op_parallelism_threads` in the session options. - -By default, mobile devices run operations serially; that is, -`inter_op_parallelism_threads` is set to 1. Mobile processors usually have few -cores and a small cache, so running multiple operations accessing disjoint parts -of memory usually doesn’t help performance. “Intra-op parallelism” (or -“within-op”) can be very helpful though, especially for computation-bound -operations like convolutions where different threads can feed off the same small -set of memory. - -On mobile, how many threads an op will use is set to the number of cores by -default, or 2 when the number of cores can't be determined. You can override the -default number of threads that ops are using by setting -`intra_op_parallelism_threads` in the session options. It’s a good idea to -reduce the default if your app has its own threads doing heavy processing, so -that they don’t interfere with each other. - -To see more details on session options, look at [ConfigProto](https://www.tensorflow.org/code/tensorflow/core/protobuf/config.proto). - -## Retrain with mobile data - -The biggest cause of accuracy problems when running models on mobile apps is -unrepresentative training data. For example, most of the Imagenet photos are -well-framed so that the object is in the center of the picture, well-lit, and -shot with a normal lens. Photos from mobile devices are often poorly framed, -badly lit, and can have fisheye distortions, especially selfies. - -The solution is to expand your training set with data actually captured from -your application. 
This step can involve extra work, since you’ll have to label the examples yourself, but even if you just use it to expand your original training data, it can improve the training set dramatically. Improving the training set this way, and fixing other quality issues like duplicates or badly labeled examples, is the single best way to improve accuracy. It’s usually a bigger help than altering your model architecture or using different techniques.

## Reducing model loading time and/or memory footprint

Most operating systems allow you to load a file using memory mapping, rather than going through the usual I/O APIs. Instead of allocating an area of memory on the heap and then copying bytes from disk into it, you simply tell the operating system to make the entire contents of a file appear directly in memory. This has several advantages:

* Speeds up loading
* Reduces paging (increases performance)
* Does not count towards the RAM budget for your app

TensorFlow has support for memory mapping the weights that form the bulk of most model files. Because of limitations in the `ProtoBuf` serialization format, we have to make a few changes to our model loading and processing code. The way memory mapping works is that we have a single file where the first part is a normal `GraphDef` serialized into the protocol buffer wire format, but then the weights are appended in a form that can be directly mapped.

To create this file, run the `tensorflow/contrib/util:convert_graphdef_memmapped_format` tool. This takes in a `GraphDef` file that’s been run through `freeze_graph` and converts it to the format that has the weights appended at the end. Since that file’s no longer a standard `GraphDef` protobuf, you then need to make some changes to the loading code. You can see an example of this in the [iOS Camera demo app](https://www.tensorflow.org/code/tensorflow/examples/ios/camera/tensorflow_utils.mm?l=147), in the `LoadMemoryMappedModel()` function.

The same code (with the Objective-C calls for getting the filenames substituted) can be used on other platforms too. Because we’re using memory mapping, we need to start by creating a special TensorFlow environment object that’s set up with the file we’ll be using:

    std::unique_ptr<tensorflow::MemmappedEnv> memmapped_env;
    memmapped_env.reset(
        new tensorflow::MemmappedEnv(tensorflow::Env::Default()));
    tensorflow::Status mmap_status =
        memmapped_env->InitializeFromFile(file_path);

You then need to pass in this environment to subsequent calls, like this one for loading the graph:

    tensorflow::GraphDef tensorflow_graph;
    tensorflow::Status load_graph_status = ReadBinaryProto(
        memmapped_env.get(),
        tensorflow::MemmappedFileSystem::kMemmappedPackageDefaultGraphDef,
        &tensorflow_graph);

You also need to create the session with a pointer to the environment you’ve created:

    tensorflow::SessionOptions options;
    options.config.mutable_graph_options()
        ->mutable_optimizer_options()
        ->set_opt_level(::tensorflow::OptimizerOptions::L0);
    options.env = memmapped_env.get();

    tensorflow::Session* session_pointer = nullptr;
    tensorflow::Status session_status =
        tensorflow::NewSession(options, &session_pointer);

One thing to notice here is that we’re also disabling automatic optimizations, since in some cases these will fold constant sub-trees, and so create copies of tensor values that we don’t want and use up more RAM.
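Putting the pieces together, a minimal helper might look like the following sketch. It only combines the snippets above; the `LoadMemmappedModel` name, its signature, and the abbreviated error handling are illustrative and not an existing TensorFlow API (you will need the `memmapped_file_system.h`, `env.h`, and `session.h` headers):

    // Sketch: load a memory-mapped model file and create a session backed by it.
    tensorflow::Status LoadMemmappedModel(
        const std::string& file_path,
        std::unique_ptr<tensorflow::MemmappedEnv>* memmapped_env,
        std::unique_ptr<tensorflow::Session>* session) {
      // Wrap the default environment so reads resolve into the mapped file.
      memmapped_env->reset(
          new tensorflow::MemmappedEnv(tensorflow::Env::Default()));
      TF_RETURN_IF_ERROR((*memmapped_env)->InitializeFromFile(file_path));

      // Read the GraphDef portion of the memory-mapped package.
      tensorflow::GraphDef graph_def;
      TF_RETURN_IF_ERROR(ReadBinaryProto(
          memmapped_env->get(),
          tensorflow::MemmappedFileSystem::kMemmappedPackageDefaultGraphDef,
          &graph_def));

      // Create the session against the same environment, with optimizations
      // disabled so constant folding doesn't copy mapped weights onto the heap.
      tensorflow::SessionOptions options;
      options.config.mutable_graph_options()
          ->mutable_optimizer_options()
          ->set_opt_level(::tensorflow::OptimizerOptions::L0);
      options.env = memmapped_env->get();

      tensorflow::Session* session_pointer = nullptr;
      TF_RETURN_IF_ERROR(tensorflow::NewSession(options, &session_pointer));
      session->reset(session_pointer);
      return (*session)->Create(graph_def);
    }

This mirrors the flow of the snippets above, just gathered into one place with status checks.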
Once you’ve gone through these steps, you can use the session and graph as normal, and you should see a reduction in loading time and memory usage.

## Protecting model files from easy copying

By default, your models will be stored in the standard serialized protobuf format on disk. In theory this means that anybody can copy your model, which you may not want. However, in practice, most models are so application-specific and obfuscated by optimizations that the risk is similar to that of competitors disassembling and reusing your code. If you do want to make it tougher for casual users to access your files, it is possible to take some basic steps.

Most of our examples use the [ReadBinaryProto()](https://www.tensorflow.org/code/tensorflow/core/platform/env.cc?q=core/platform/env.cc&l=409) convenience call to load a `GraphDef` from disk. This does require an unencrypted protobuf on disk. Luckily though, the implementation of the call is pretty straightforward and it should be easy to write an equivalent that can decrypt in memory. Here's some code that shows how you can read and decrypt a protobuf using your own decryption routine:

    Status ReadEncryptedProto(Env* env, const string& fname,
                              ::tensorflow::protobuf::MessageLite* proto) {
      string data;
      TF_RETURN_IF_ERROR(ReadFileToString(env, fname, &data));

      DecryptData(&data);  // Your own function here.

      if (!proto->ParseFromString(&data)) {
        return errors::DataLoss("Can't parse ", fname, " as binary proto");
      }
      return Status::OK();
    }

To use this you’d need to define the `DecryptData()` function yourself. It could be as simple as something like:

    void DecryptData(string* data) {
      for (size_t i = 0; i < data->size(); ++i) {
        (*data)[i] = (*data)[i] ^ 0x23;
      }
    }

You may want something more complex, but exactly what you’ll need is outside the current scope here.

diff --git a/tensorflow/docs_src/mobile/prepare_models.md b/tensorflow/docs_src/mobile/prepare_models.md
deleted file mode 100644
index 2b84dbb973..0000000000
--- a/tensorflow/docs_src/mobile/prepare_models.md
+++ /dev/null
@@ -1,301 +0,0 @@
# Preparing models for mobile deployment

The requirements for storing model information during training are very different from when you want to release it as part of a mobile app. This section covers the tools involved in converting from a training model to something releasable in production.

## What is up with all the different saved file formats?

You may find yourself getting very confused by all the different ways that TensorFlow can save out graphs. To help, here’s a rundown of some of the different components, and what they are used for. The objects are mostly defined and serialized as protocol buffers:

- [NodeDef](https://www.tensorflow.org/code/tensorflow/core/framework/node_def.proto):
  Defines a single operation in a model. It has a unique name, a list of the names of other nodes it pulls inputs from, the operation type it implements (for example `Add`, or `Mul`), and any attributes that are needed to control that operation. This is the basic unit of computation for TensorFlow, and all work is done by iterating through a network of these nodes, applying each one in turn. One particular operation type that’s worth knowing about is `Const`, since this holds information about a constant. This may be a single, scalar number or string, but it can also hold an entire multi-dimensional tensor array.
The values for a `Const` are stored inside the `NodeDef`, and so large - constants can take up a lot of room when serialized. - -- [Checkpoint](https://www.tensorflow.org/code/tensorflow/core/util/tensor_bundle/tensor_bundle.h). Another - way of storing values for a model is by using `Variable` ops. Unlike `Const` - ops, these don’t store their content as part of the `NodeDef`, so they take up - very little space within the `GraphDef` file. Instead their values are held in - RAM while a computation is running, and then saved out to disk as checkpoint - files periodically. This typically happens as a neural network is being - trained and weights are updated, so it’s a time-critical operation, and it may - happen in a distributed fashion across many workers, so the file format has to - be both fast and flexible. They are stored as multiple checkpoint files, - together with metadata files that describe what’s contained within the - checkpoints. When you’re referring to a checkpoint in the API (for example - when passing a filename in as a command line argument), you’ll use the common - prefix for a set of related files. If you had these files: - - /tmp/model/model-chkpt-1000.data-00000-of-00002 - /tmp/model/model-chkpt-1000.data-00001-of-00002 - /tmp/model/model-chkpt-1000.index - /tmp/model/model-chkpt-1000.meta - - You would refer to them as `/tmp/model/chkpt-1000`. - -- [GraphDef](https://www.tensorflow.org/code/tensorflow/core/framework/graph.proto): - Has a list of `NodeDefs`, which together define the computational graph to - execute. During training, some of these nodes will be `Variables`, and so if - you want to have a complete graph you can run, including the weights, you’ll - need to call a restore operation to pull those values from - checkpoints. Because checkpoint loading has to be flexible to deal with all of - the training requirements, this can be tricky to implement on mobile and - embedded devices, especially those with no proper file system available like - iOS. This is where - the - [`freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py) script - comes in handy. As mentioned above, `Const` ops store their values as part of - the `NodeDef`, so if all the `Variable` weights are converted to `Const` nodes, - then we only need a single `GraphDef` file to hold the model architecture and - the weights. Freezing the graph handles the process of loading the - checkpoints, and then converts all Variables to Consts. You can then load the - resulting file in a single call, without having to restore variable values - from checkpoints. One thing to watch out for with `GraphDef` files is that - sometimes they’re stored in text format for easy inspection. These versions - usually have a ‘.pbtxt’ filename suffix, whereas the binary files end with - ‘.pb’. - -- [FunctionDefLibrary](https://www.tensorflow.org/code/tensorflow/core/framework/function.proto): - This appears in `GraphDef`, and is effectively a set of sub-graphs, each with - information about their input and output nodes. Each sub-graph can then be - used as an op in the main graph, allowing easy instantiation of different - nodes, in a similar way to how functions encapsulate code in other languages. - -- [MetaGraphDef](https://www.tensorflow.org/code/tensorflow/core/protobuf/meta_graph.proto): - A plain `GraphDef` only has information about the network of computations, but - doesn’t have any extra information about the model or how it can be - used. 
`MetaGraphDef` contains a `GraphDef` defining the computation part of - the model, but also includes information like ‘signatures’, which are - suggestions about which inputs and outputs you may want to call the model - with, data on how and where any checkpoint files are saved, and convenience - tags for grouping ops together for ease of use. - -- [SavedModel](https://www.tensorflow.org/code/tensorflow/core/protobuf/saved_model.proto): - It’s common to want to have different versions of a graph that rely on a - common set of variable checkpoints. For example, you might need a GPU and a - CPU version of the same graph, but keep the same weights for both. You might - also need some extra files (like label names) as part of your - model. The - [SavedModel](https://www.tensorflow.org/code/tensorflow/python/saved_model/README.md) format - addresses these needs by letting you save multiple versions of the same graph - without duplicating variables, and also storing asset files in the same - bundle. Under the hood, it uses `MetaGraphDef` and checkpoint files, along - with extra metadata files. It’s the format that you’ll want to use if you’re - deploying a web API using TensorFlow Serving, for example. - -## How do you get a model you can use on mobile? - -In most situations, training a model with TensorFlow will give you a folder -containing a `GraphDef` file (usually ending with the `.pb` or `.pbtxt` extension) and -a set of checkpoint files. What you need for mobile or embedded deployment is a -single `GraphDef` file that’s been ‘frozen’, or had its variables converted into -inline constants so everything’s in one file. To handle the conversion, you’ll -need the `freeze_graph.py` script, that’s held in -[`tensorflow/python/tools/freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py). You’ll run it like this: - - bazel build tensorflow/python/tools:freeze_graph - bazel-bin/tensorflow/python/tools/freeze_graph \ - --input_graph=/tmp/model/my_graph.pb \ - --input_checkpoint=/tmp/model/model.ckpt-1000 \ - --output_graph=/tmp/frozen_graph.pb \ - --output_node_names=output_node \ - -The `input_graph` argument should point to the `GraphDef` file that holds your -model architecture. It’s possible that your `GraphDef` has been stored in a text -format on disk, in which case it’s likely to end in `.pbtxt` instead of `.pb`, -and you should add an extra `--input_binary=false` flag to the command. - -The `input_checkpoint` should be the most recent saved checkpoint. As mentioned -in the checkpoint section, you need to give the common prefix to the set of -checkpoints here, rather than a full filename. - -`output_graph` defines where the resulting frozen `GraphDef` will be -saved. Because it’s likely to contain a lot of weight values that take up a -large amount of space in text format, it’s always saved as a binary protobuf. - -`output_node_names` is a list of the names of the nodes that you want to extract -the results of your graph from. This is needed because the freezing process -needs to understand which parts of the graph are actually needed, and which are -artifacts of the training process, like summarization ops. Only ops that -contribute to calculating the given output nodes will be kept. If you know how -your graph is going to be used, these should just be the names of the nodes you -pass into `Session::Run()` as your fetch targets. The easiest way to find the -node names is to inspect the Node objects while building your graph in python. 
-Inspecting your graph in TensorBoard is another simple way. You can get some -suggestions on likely outputs by running the [`summarize_graph` tool](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/tools/graph_transforms/README.md#inspecting-graphs). - -Because the output format for TensorFlow has changed over time, there are a -variety of other less commonly used flags available too, like `input_saver`, but -hopefully you shouldn’t need these on graphs trained with modern versions of the -framework. - -## Using the Graph Transform Tool - -A lot of the things you need to do to efficiently run a model on device are -available through the [Graph Transform -Tool](https://www.tensorflow.org/code/tensorflow/tools/graph_transforms/README.md). This -command-line tool takes an input `GraphDef` file, applies the set of rewriting -rules you request, and then writes out the result as a `GraphDef`. See the -documentation for more information on how to build and run this tool. - -### Removing training-only nodes - -TensorFlow `GraphDefs` produced by the training code contain all of the -computation that’s needed for back-propagation and updates of weights, as well -as the queuing and decoding of inputs, and the saving out of checkpoints. All of -these nodes are no longer needed during inference, and some of the operations -like checkpoint saving aren’t even supported on mobile platforms. To create a -model file that you can load on devices you need to delete those unneeded -operations by running the `strip_unused_nodes` rule in the Graph Transform Tool. - -The trickiest part of this process is figuring out the names of the nodes you -want to use as inputs and outputs during inference. You'll need these anyway -once you start to run inference, but you also need them here so that the -transform can calculate which nodes are not needed on the inference-only -path. These may not be obvious from the training code. The easiest way to -determine the node name is to explore the graph with TensorBoard. - -Remember that mobile applications typically gather their data from sensors and -have it as arrays in memory, whereas training typically involves loading and -decoding representations of the data stored on disk. In the case of Inception v3 -for example, there’s a `DecodeJpeg` op at the start of the graph that’s designed -to take JPEG-encoded data from a file retrieved from disk and turn it into an -arbitrary-sized image. After that there’s a `BilinearResize` op to scale it to -the expected size, followed by a couple of other ops that convert the byte data -into float and scale the value magnitudes it in the way the rest of the graph -expects. A typical mobile app will skip most of these steps because it’s getting -its input directly from a live camera, so the input node you will actually -supply will be the output of the `Mul` node in this case. - -<img src ="../images/inception_input.png" width="300"> - -You’ll need to do a similar process of inspection to figure out the correct -output nodes. - -If you’ve just been given a frozen `GraphDef` file, and are not sure about the -contents, try using the `summarize_graph` tool to print out information -about the inputs and outputs it finds from the graph structure. 
Here’s an -example with the original Inception v3 file: - - bazel run tensorflow/tools/graph_transforms:summarize_graph -- - --in_graph=tensorflow_inception_graph.pb - -Once you have an idea of what the input and output nodes are, you can feed them -into the graph transform tool as the `--input_names` and `--output_names` -arguments, and call the `strip_unused_nodes` transform, like this: - - bazel run tensorflow/tools/graph_transforms:transform_graph -- - --in_graph=tensorflow_inception_graph.pb - --out_graph=optimized_inception_graph.pb --inputs='Mul' --outputs='softmax' - --transforms=' - strip_unused_nodes(type=float, shape="1,299,299,3") - fold_constants(ignore_errors=true) - fold_batch_norms - fold_old_batch_norms' - -One thing to look out for here is that you need to specify the size and type -that you want your inputs to be. This is because any values that you’re going to -be passing in as inputs to inference need to be fed to special `Placeholder` op -nodes, and the transform may need to create them if they don’t already exist. In -the case of Inception v3 for example, a `Placeholder` node replaces the old -`Mul` node that used to output the resized and rescaled image array, since we’re -going to be doing that processing ourselves before we call TensorFlow. It keeps -the original name though, which is why we always feed in inputs to `Mul` when we -run a session with our modified Inception graph. - -After you’ve run this process, you’ll have a graph that only contains the actual -nodes you need to run your prediction process. This is the point where it -becomes useful to run metrics on the graph, so it’s worth running -`summarize_graph` again to understand what’s in your model. - -## What ops should you include on mobile? - -There are hundreds of operations available in TensorFlow, and each one has -multiple implementations for different data types. On mobile platforms, the size -of the executable binary that’s produced after compilation is important, because -app download bundles need to be as small as possible for the best user -experience. If all of the ops and data types are compiled into the TensorFlow -library then the total size of the compiled library can be tens of megabytes, so -by default only a subset of ops and data types are included. - -That means that if you load a model file that’s been trained on a desktop -machine, you may see the error “No OpKernel was registered to support Op” when -you load it on mobile. The first thing to try is to make sure you’ve stripped -out any training-only nodes, since the error will occur at load time even if the -op is never executed. If you’re still hitting the same problem once that’s done, -you’ll need to look at adding the op to your built library. - -The criteria for including ops and types fall into several categories: - -- Are they only useful in back-propagation, for gradients? Since mobile is - focused on inference, we don’t include these. - -- Are they useful mainly for other training needs, such as checkpoint saving? - These we leave out. - -- Do they rely on frameworks that aren’t always available on mobile, such as - libjpeg? To avoid extra dependencies we don’t include ops like `DecodeJpeg`. - -- Are there types that aren’t commonly used? We don’t include boolean variants - of ops for example, since we don’t see much use of them in typical inference - graphs. - -These ops are trimmed by default to optimize for inference on mobile, but it is -possible to alter some build files to change the default. 
After alternating the -build files, you will need to recompile TensorFlow. See below for more details -on how to do this, and also see @{$mobile/optimizing#binary_size$Optimizing} for -more on reducing your binary size. - -### Locate the implementation - -Operations are broken into two parts. The first is the op definition, which -declares the signature of the operation, which inputs, outputs, and attributes -it has. These take up very little space, and so all are included by default. The -implementations of the op computations are done in kernels, which live in the -`tensorflow/core/kernels` folder. You need to compile the C++ file containing -the kernel implementation of the op you need into the library. To figure out -which file that is, you can search for the operation name in the source -files. - -[Here’s an example search in github](https://github.com/search?utf8=%E2%9C%93&q=repo%3Atensorflow%2Ftensorflow+extension%3Acc+path%3Atensorflow%2Fcore%2Fkernels+REGISTER+Mul&type=Code&ref=searchresults). - -You’ll see that this search is looking for the `Mul` op implementation, and it -finds it in `tensorflow/core/kernels/cwise_op_mul_1.cc`. You need to look for -macros beginning with `REGISTER`, with the op name you care about as one of the -string arguments. - -In this case, the implementations are actually broken up across multiple `.cc` -files, so you’d need to include all of them in your build. If you’re more -comfortable using the command line for code search, here’s a grep command that -also locates the right files if you run it from the root of your TensorFlow -repository: - -`grep 'REGISTER.*"Mul"' tensorflow/core/kernels/*.cc` - -### Add the implementation to the build - -If you’re using Bazel, and building for Android, you’ll want to add the files -you’ve found to -the -[`android_extended_ops_group1`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3565) or -[`android_extended_ops_group2`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3632) targets. You -may also need to include any .cc files they depend on in there. If the build -complains about missing header files, add the .h’s that are needed into -the -[`android_extended_ops`](https://www.tensorflow.org/code/tensorflow/core/kernels/BUILD#L3525) target. - -If you’re using a makefile targeting iOS, Raspberry Pi, etc, go to -[`tensorflow/contrib/makefile/tf_op_files.txt`](https://www.tensorflow.org/code/tensorflow/contrib/makefile/tf_op_files.txt) and -add the right implementation files there. diff --git a/tensorflow/docs_src/mobile/tflite/demo_android.md b/tensorflow/docs_src/mobile/tflite/demo_android.md deleted file mode 100644 index fdf0bcf3c1..0000000000 --- a/tensorflow/docs_src/mobile/tflite/demo_android.md +++ /dev/null @@ -1,146 +0,0 @@ -# Android Demo App - -An example Android application using TensorFLow Lite is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo). -The demo is a sample camera app that classifies images continuously -using either a quantized Mobilenet model or a floating point Inception-v3 model. -To run the demo, a device running Android 5.0 ( API 21) or higher is required. - -In the demo app, inference is done using the TensorFlow Lite Java API. The demo -app classifies frames in real-time, displaying the top most probable -classifications. It also displays the time taken to detect the object. 
- -There are three ways to get the demo app to your device: - -* Download the [prebuilt binary APK](http://download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). -* Use Android Studio to build the application. -* Download the source code for TensorFlow Lite and the demo and build it using - bazel. - - -## Download the pre-built binary - -The easiest way to try the demo is to download the -[pre-built binary APK](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk) - -Once the APK is installed, click the app icon to start the program. The first -time the app is opened, it asks for runtime permissions to access the device -camera. The demo app opens the back-camera of the device and recognizes objects -in the camera's field of view. At the bottom of the image (or at the left -of the image if the device is in landscape mode), it displays top three objects -classified and the classification latency. - - -## Build in Android Studio with TensorFlow Lite AAR from JCenter - -Use Android Studio to try out changes in the project code and compile the demo -app: - -* Install the latest version of - [Android Studio](https://developer.android.com/studio/index.html). -* Make sure the Android SDK version is greater than 26 and NDK version is greater - than 14 (in the Android Studio settings). -* Import the `tensorflow/contrib/lite/java/demo` directory as a new - Android Studio project. -* Install all the Gradle extensions it requests. - -Now you can build and run the demo app. - -The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/contrib/lite/java/demo/app/src/main/assets/`. - -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md). - -### Using other models - -To use a different model: -* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip). -* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory. -* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)<br> - from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`<br> - to: `classifier = new ImageClassifierFloatInception(getActivity());`. - - -## Build TensorFlow Lite and the demo app from source - -### Clone the TensorFlow repo - -```sh -git clone https://github.com/tensorflow/tensorflow -``` - -### Install Bazel - -If `bazel` is not installed on your system, see -[Installing Bazel](https://bazel.build/versions/master/docs/install.html). - -Note: Bazel does not currently support Android builds on Windows. Windows users -should download the -[prebuilt binary](https://storage.googleapis.com/download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). - -### Install Android NDK and SDK - -The Android NDK is required to build the native (C/C++) TensorFlow Lite code. The -current recommended version is *14b* and can be found on the -[NDK Archives](https://developer.android.com/ndk/downloads/older_releases.html#ndk-14b-downloads) -page. 
- -The Android SDK and build tools can be -[downloaded separately](https://developer.android.com/tools/revisions/build-tools.html) -or used as part of -[Android Studio](https://developer.android.com/studio/index.html). To build the -TensorFlow Lite Android demo, build tools require API >= 23 (but it will run on -devices with API >= 21). - -In the root of the TensorFlow repository, update the `WORKSPACE` file with the -`api_level` and location of the SDK and NDK. If you installed it with -Android Studio, the SDK path can be found in the SDK manager. The default NDK -path is:`{SDK path}/ndk-bundle.` For example: - -``` -android_sdk_repository ( - name = "androidsdk", - api_level = 23, - build_tools_version = "23.0.2", - path = "/home/xxxx/android-sdk-linux/", -) - -android_ndk_repository( - name = "androidndk", - path = "/home/xxxx/android-ndk-r10e/", - api_level = 19, -) -``` - -Some additional details are available on the -[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md). - -### Build the source code - -To build the demo app, run `bazel`: - -``` -bazel build --cxxopt=--std=c++11 //tensorflow/contrib/lite/java/demo/app/src/main:TfLiteCameraDemo -``` - -Caution: Because of an bazel bug, we only support building the Android demo app -within a Python 2 environment. - - -## About the demo - -The demo app is resizing each camera image frame (224 width * 224 height) to -match the quantized MobileNets model (299 * 299 for Inception-v3). The resized -image is converted—row by row—into a -[ByteBuffer](https://developer.android.com/reference/java/nio/ByteBuffer.html). -Its size is 1 * 224 * 224 * 3 bytes, where 1 is the number of images in a batch. -224 * 224 (299 * 299) is the width and height of the image. 3 bytes represents -the 3 colors of a pixel. - -This demo uses the TensorFlow Lite Java inference API -for models which take a single input and provide a single output. This outputs a -two-dimensional array, with the first dimension being the category index and the -second dimension being the confidence of classification. Both models have 1001 -unique categories and the app sorts the probabilities of all the categories and -displays the top three. The model file must be downloaded and bundled within the -assets directory of the app. diff --git a/tensorflow/docs_src/mobile/tflite/demo_ios.md b/tensorflow/docs_src/mobile/tflite/demo_ios.md deleted file mode 100644 index 3be21da89f..0000000000 --- a/tensorflow/docs_src/mobile/tflite/demo_ios.md +++ /dev/null @@ -1,68 +0,0 @@ -# iOS Demo App - -The TensorFlow Lite demo is a camera app that continuously classifies whatever -it sees from your device's back camera, using a quantized MobileNet model. These -instructions walk you through building and running the demo on an iOS device. - -## Prerequisites - -* You must have [Xcode](https://developer.apple.com/xcode/) installed and have a - valid Apple Developer ID, and have an iOS device set up and linked to your - developer account with all of the appropriate certificates. For these - instructions, we assume that you have already been able to build and deploy an - app to an iOS device with your current developer environment. - -* The demo app requires a camera and must be executed on a real iOS device. You - can build it and run with the iPhone Simulator but it won't have any camera - information to classify. 
- -* You don't need to build the entire TensorFlow library to run the demo, but you - will need to clone the TensorFlow repository if you haven't already: - - git clone https://github.com/tensorflow/tensorflow - -* You'll also need the Xcode command-line tools: - - xcode-select --install - - If this is a new install, you will need to run the Xcode application once to - agree to the license before continuing. - -## Building the iOS Demo App - -1. Install CocoaPods if you don't have it: - - sudo gem install cocoapods - -2. Download the model files used by the demo app (this is done from inside the - cloned directory): - - sh tensorflow/contrib/lite/examples/ios/download_models.sh - -3. Install the pod to generate the workspace file: - - cd tensorflow/contrib/lite/examples/ios/camera - pod install - - If you have installed this pod before and that command doesn't work, try - - pod update - - At the end of this step you should have a file called - `tflite_camera_example.xcworkspace`. - -4. Open the project in Xcode by typing this on the command line: - - open tflite_camera_example.xcworkspace - - This launches Xcode if it isn't open already and opens the - `tflite_camera_example` project. - -5. Build and run the app in Xcode. - - Note that as mentioned earlier, you must already have a device set up and - linked to your Apple Developer account in order to deploy the app on a - device. - -You'll have to grant permissions for the app to use the device's camera. Point -the camera at various objects and enjoy seeing how the model classifies things! diff --git a/tensorflow/docs_src/mobile/tflite/devguide.md b/tensorflow/docs_src/mobile/tflite/devguide.md deleted file mode 100644 index b168d6c183..0000000000 --- a/tensorflow/docs_src/mobile/tflite/devguide.md +++ /dev/null @@ -1,232 +0,0 @@ -# Developer Guide - -Using a TensorFlow Lite model in your mobile app requires multiple -considerations: you must choose a pre-trained or custom model, convert the model -to a TensorFLow Lite format, and finally, integrate the model in your app. - -## 1. Choose a model - -Depending on the use case, you can choose one of the popular open-sourced models, -such as *InceptionV3* or *MobileNets*, and re-train these models with a custom -data set or even build your own custom model. - -### Use a pre-trained model - -[MobileNets](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html) -is a family of mobile-first computer vision models for TensorFlow designed to -effectively maximize accuracy, while taking into consideration the restricted -resources for on-device or embedded applications. MobileNets are small, -low-latency, low-power models parameterized to meet the resource constraints for -a variety of uses. They can be used for classification, detection, embeddings, and -segmentation—similar to other popular large scale models, such as -[Inception](https://arxiv.org/pdf/1602.07261.pdf). Google provides 16 pre-trained -[ImageNet](http://www.image-net.org/challenges/LSVRC/) classification checkpoints -for MobileNets that can be used in mobile projects of all sizes. - -[Inception-v3](https://arxiv.org/abs/1512.00567) is an image recognition model -that achieves fairly high accuracy recognizing general objects with 1000 classes, -for example, "Zebra", "Dalmatian", and "Dishwasher". The model extracts general -features from input images using a convolutional neural network and classifies -them based on those features with fully-connected and softmax layers. 
- -[On Device Smart Reply](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html) -is an on-device model that provides one-touch replies for incoming text messages -by suggesting contextually relevant messages. The model is built specifically for -memory constrained devices, such as watches and phones, and has been successfully -used in Smart Replies on Android Wear. Currently, this model is Android-specific. - -These pre-trained models are [available for download](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md) - -### Re-train Inception-V3 or MobileNet for a custom data set - -These pre-trained models were trained on the *ImageNet* data set which contains -1000 predefined classes. If these classes are not sufficient for your use case, -the model will need to be re-trained. This technique is called -*transfer learning* and starts with a model that has been already trained on a -problem, then retrains the model on a similar problem. Deep learning from -scratch can take days, but transfer learning is fairly quick. In order to do -this, you need to generate a custom data set labeled with the relevant classes. - -The [TensorFlow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/) -codelab walks through the re-training process step-by-step. The code supports -both floating point and quantized inference. - -### Train a custom model - -A developer may choose to train a custom model using Tensorflow (see the -[TensorFlow tutorials](../../tutorials/) for examples of building and training -models). If you have already written a model, the first step is to export this -to a @{tf.GraphDef} file. This is required because some formats do not store the -model structure outside the code, and we must communicate with other parts of the -framework. See -[Exporting the Inference Graph](https://github.com/tensorflow/models/blob/master/research/slim/README.md) -to create .pb file for the custom model. - -TensorFlow Lite currently supports a subset of TensorFlow operators. Refer to the -[TensorFlow Lite & TensorFlow Compatibility Guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md) -for supported operators and their usage. This set of operators will continue to -grow in future Tensorflow Lite releases. - - -## 2. Convert the model format - -The model generated (or downloaded) in the previous step is a *standard* -Tensorflow model and you should now have a .pb or .pbtxt @{tf.GraphDef} file. -Models generated with transfer learning (re-training) or custom models must be -converted—but, we must first freeze the graph to convert the model to the -Tensorflow Lite format. This process uses several model formats: - -* @{tf.GraphDef} (.pb) —A protobuf that represents the TensorFlow training or - computation graph. It contains operators, tensors, and variables definitions. -* *CheckPoint* (.ckpt) —Serialized variables from a TensorFlow graph. Since this - does not contain a graph structure, it cannot be interpreted by itself. -* `FrozenGraphDef` —A subclass of `GraphDef` that does not contain - variables. A `GraphDef` can be converted to a `FrozenGraphDef` by taking a - CheckPoint and a `GraphDef`, and converting each variable into a constant - using the value retrieved from the CheckPoint. -* `SavedModel` —A `GraphDef` and CheckPoint with a signature that labels - input and output arguments to a model. 
A `GraphDef` and CheckPoint can be - extracted from a `SavedModel`. -* *TensorFlow Lite model* (.tflite) —A serialized - [FlatBuffer](https://google.github.io/flatbuffers/) that contains TensorFlow - Lite operators and tensors for the TensorFlow Lite interpreter, similar to a - `FrozenGraphDef`. - -### Freeze Graph - -To use the `GraphDef` .pb file with TensorFlow Lite, you must have checkpoints -that contain trained weight parameters. The .pb file only contains the structure -of the graph. The process of merging the checkpoint values with the graph -structure is called *freezing the graph*. - -You should have a checkpoints folder or download them for a pre-trained model -(for example, -[MobileNets](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md)). - -To freeze the graph, use the following command (changing the arguments): - -``` -freeze_graph --input_graph=/tmp/mobilenet_v1_224.pb \ - --input_checkpoint=/tmp/checkpoints/mobilenet-10202.ckpt \ - --input_binary=true \ - --output_graph=/tmp/frozen_mobilenet_v1_224.pb \ - --output_node_names=MobileNetV1/Predictions/Reshape_1 -``` - -The `input_binary` flag must be enabled so the protobuf is read and written in -a binary format. Set the `input_graph` and `input_checkpoint` files. - -The `output_node_names` may not be obvious outside of the code that built the -model. The easiest way to find them is to visualize the graph, either with -[TensorBoard](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets-2/#3) -or `graphviz`. - -The frozen `GraphDef` is now ready for conversion to the `FlatBuffer` format -(.tflite) for use on Android or iOS devices. For Android, the Tensorflow -Optimizing Converter tool supports both float and quantized models. To convert -the frozen `GraphDef` to the .tflite format: - -``` -toco --input_file=$(pwd)/mobilenet_v1_1.0_224/frozen_graph.pb \ - --input_format=TENSORFLOW_GRAPHDEF \ - --output_format=TFLITE \ - --output_file=/tmp/mobilenet_v1_1.0_224.tflite \ - --inference_type=FLOAT \ - --input_type=FLOAT \ - --input_arrays=input \ - --output_arrays=MobilenetV1/Predictions/Reshape_1 \ - --input_shapes=1,224,224,3 -``` - -The `input_file` argument should reference the frozen `GraphDef` file -containing the model architecture. The [frozen_graph.pb](https://storage.googleapis.com/download.tensorflow.org/models/mobilenet_v1_1.0_224_frozen.tgz) -file used here is available for download. `output_file` is where the TensorFlow -Lite model will get generated. The `input_type` and `inference_type` -arguments should be set to `FLOAT`, unless converting a -@{$performance/quantization$quantized model}. Setting the `input_array`, -`output_array`, and `input_shape` arguments are not as straightforward. The -easiest way to find these values is to explore the graph using Tensorboard. Reuse -the arguments for specifying the output nodes for inference in the -`freeze_graph` step. - -It is also possible to use the Tensorflow Optimizing Converter with protobufs -from either Python or from the command line (see the -[toco_from_protos.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco/python/toco_from_protos.py) -example). This allows you to integrate the conversion step into the model design -workflow, ensuring the model is easily convertible to a mobile inference graph. 
-For example: - -```python -import tensorflow as tf - -img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3)) -val = img + tf.constant([1., 2., 3.]) + tf.constant([1., 4., 4.]) -out = tf.identity(val, name="out") - -with tf.Session() as sess: - tflite_model = tf.contrib.lite.toco_convert(sess.graph_def, [img], [out]) - open("converteds_model.tflite", "wb").write(tflite_model) -``` - -For usage, see the Tensorflow Optimizing Converter -[command-line examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/toco/g3doc/cmdline_examples.md). - -Refer to the -[Ops compatibility guide](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/tf_ops_compatibility.md) -for troubleshooting help, and if that doesn't help, please -[file an issue](https://github.com/tensorflow/tensorflow/issues). - -The [development repo](https://github.com/tensorflow/tensorflow) contains a tool -to visualize TensorFlow Lite models after conversion. To build the -[visualize.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/tools/visualize.py) -tool: - -```sh -bazel run tensorflow/contrib/lite/tools:visualize -- model.tflite model_viz.html -``` - -This generates an interactive HTML page listing subgraphs, operations, and a -graph visualization. - - -## 3. Use the TensorFlow Lite model for inference in a mobile app - -After completing the prior steps, you should now have a `.tflite` model file. - -### Android - -Since Android apps are written in Java and the core TensorFlow library is in C++, -a JNI library is provided as an interface. This is only meant for inference—it -provides the ability to load a graph, set up inputs, and run the model to -calculate outputs. - -The open source Android demo app uses the JNI interface and is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/app). -You can also download a -[prebuilt APK](http://download.tensorflow.org/deps/tflite/TfLiteCameraDemo.apk). -See the @{$tflite/demo_android} guide for details. - -The @{$mobile/android_build} guide has instructions for installing TensorFlow on -Android and setting up `bazel` and Android Studio. - -### iOS - -To integrate a TensorFlow model in an iOS app, see the -[TensorFlow Lite for iOS](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/g3doc/ios.md) -guide and @{$tflite/demo_ios} guide. - -#### Core ML support - -Core ML is a machine learning framework used in Apple products. In addition to -using Tensorflow Lite models directly in your applications, you can convert -trained Tensorflow models to the -[CoreML](https://developer.apple.com/machine-learning/) format for use on Apple -devices. To use the converter, refer to the -[Tensorflow-CoreML converter documentation](https://github.com/tf-coreml/tf-coreml). - -### Raspberry Pi - -Compile Tensorflow Lite for a Raspberry Pi by following the -[RPi build instructions](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/rpi.md) -This compiles a static library file (`.a`) used to build your app. There are -plans for Python bindings and a demo app. 
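Across Android, iOS, and Raspberry Pi, the underlying C++ interpreter follows the same basic pattern: load the `.tflite` FlatBuffer, build an interpreter with an op resolver, fill the input tensors, and invoke. Here is a rough sketch; the model filename, tensor indices, and preprocessing are illustrative, and in this release the headers live under `tensorflow/contrib/lite`:

```c++
#include <memory>

#include "tensorflow/contrib/lite/interpreter.h"
#include "tensorflow/contrib/lite/kernels/register.h"
#include "tensorflow/contrib/lite/model.h"

int main() {
  // Load the FlatBuffer model from disk.
  auto model =
      tflite::FlatBufferModel::BuildFromFile("mobilenet_v1_1.0_224.tflite");
  if (!model) return 1;

  // Build an interpreter using the built-in operator implementations.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  std::unique_ptr<tflite::Interpreter> interpreter;
  tflite::InterpreterBuilder(*model, resolver)(&interpreter);
  if (!interpreter || interpreter->AllocateTensors() != kTfLiteOk) return 1;

  // Copy preprocessed pixel data into the first input tensor
  // (1 x 224 x 224 x 3 floats for this model).
  float* input = interpreter->typed_input_tensor<float>(0);
  // ... fill `input` here ...

  // Run inference and read back the class scores.
  if (interpreter->Invoke() != kTfLiteOk) return 1;
  float* scores = interpreter->typed_output_tensor<float>(0);
  // ... pick the highest-scoring classes from `scores` ...
  return 0;
}
```

The Java `Interpreter` class used by the Android demo wraps this same native flow behind a JNI interface.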
diff --git a/tensorflow/docs_src/mobile/tflite/index.md b/tensorflow/docs_src/mobile/tflite/index.md deleted file mode 100644 index cc4af2a875..0000000000 --- a/tensorflow/docs_src/mobile/tflite/index.md +++ /dev/null @@ -1,201 +0,0 @@ -# Introduction to TensorFlow Lite - -TensorFlow Lite is TensorFlow’s lightweight solution for mobile and embedded -devices. It enables on-device machine learning inference with low latency and a -small binary size. TensorFlow Lite also supports hardware acceleration with the -[Android Neural Networks -API](https://developer.android.com/ndk/guides/neuralnetworks/index.html). - -TensorFlow Lite uses many techniques for achieving low latency such as -optimizing the kernels for mobile apps, pre-fused activations, and quantized -kernels that allow smaller and faster (fixed-point math) models. - -Most of our TensorFlow Lite documentation is [on -GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite) -for the time being. - -## What does TensorFlow Lite contain? - -TensorFlow Lite supports a set of core operators, both quantized and -float, which have been tuned for mobile platforms. They incorporate pre-fused -activations and biases to further enhance performance and quantized -accuracy. Additionally, TensorFlow Lite also supports using custom operations in -models. - -TensorFlow Lite defines a new model file format, based on -[FlatBuffers](https://google.github.io/flatbuffers/). FlatBuffers is an -open-sourced, efficient cross platform serialization library. It is similar to -[protocol buffers](https://developers.google.com/protocol-buffers/?hl=en), but -the primary difference is that FlatBuffers does not need a parsing/unpacking -step to a secondary representation before you can access data, often coupled -with per-object memory allocation. Also, the code footprint of FlatBuffers is an -order of magnitude smaller than protocol buffers. - -TensorFlow Lite has a new mobile-optimized interpreter, which has the key goals -of keeping apps lean and fast. The interpreter uses a static graph ordering and -a custom (less-dynamic) memory allocator to ensure minimal load, initialization, -and execution latency. - -TensorFlow Lite provides an interface to leverage hardware acceleration, if -available on the device. It does so via the -[Android Neural Networks API](https://developer.android.com/ndk/guides/neuralnetworks/index.html), -available on Android 8.1 (API level 27) and higher. - -## Why do we need a new mobile-specific library? - -Machine Learning is changing the computing paradigm, and we see an emerging -trend of new use cases on mobile and embedded devices. Consumer expectations are -also trending toward natural, human-like interactions with their devices, driven -by the camera and voice interaction models. - -There are several factors which are fueling interest in this domain: - -- Innovation at the silicon layer is enabling new possibilities for hardware - acceleration, and frameworks such as the Android Neural Networks API make it - easy to leverage these. - -- Recent advances in real-time computer-vision and spoken language understanding - have led to mobile-optimized benchmark models being open sourced - (e.g. MobileNets, SqueezeNet). - -- Widely-available smart appliances create new possibilities for - on-device intelligence. - -- Interest in stronger user data privacy paradigms where user data does not need - to leave the mobile device. 
- -- Ability to serve ‘offline’ use cases, where the device does not need to be - connected to a network. - -We believe the next wave of machine learning applications will have significant -processing on mobile and embedded devices. - -## TensorFlow Lite highlights - -TensorFlow Lite provides: - -- A set of core operators, both quantized and float, many of which have been - tuned for mobile platforms. These can be used to create and run custom - models. Developers can also write their own custom operators and use them in - models. - -- A new [FlatBuffers](https://google.github.io/flatbuffers/)-based - model file format. - -- On-device interpreter with kernels optimized for faster execution on mobile. - -- TensorFlow converter to convert TensorFlow-trained models to the TensorFlow - Lite format. - -- Smaller in size: TensorFlow Lite is smaller than 300KB when all supported - operators are linked and less than 200KB when using only the operators needed - for supporting InceptionV3 and Mobilenet. - -- **Pre-tested models:** - - All of the following models are guaranteed to work out of the box: - - - Inception V3, a popular model for detecting the dominant objects - present in an image. - - - [MobileNets](https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md), - a family of mobile-first computer vision models designed to effectively - maximize accuracy while being mindful of the restricted resources for an - on-device or embedded application. They are small, low-latency, low-power - models parameterized to meet the resource constraints of a variety of use - cases. They can be built upon for classification, detection, embeddings - and segmentation. MobileNet models are smaller but [lower in - accuracy](https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html) - than Inception V3. - - - On Device Smart Reply, an on-device model which provides one-touch - replies for an incoming text message by suggesting contextually relevant - messages. The model was built specifically for memory constrained devices - such as watches & phones and it has been successfully used to surface - [Smart Replies on Android - Wear](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html) - to all first-party and third-party apps. - - Also see the complete list of - [TensorFlow Lite's supported models](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md), - including the model sizes, performance numbers, and downloadable model files. - -- Quantized versions of the MobileNet model, which runs faster than the - non-quantized (float) version on CPU. - -- New Android demo app to illustrate the use of TensorFlow Lite with a quantized - MobileNet model for object classification. - -- Java and C++ API support - - -## Getting Started - -We recommend you try out TensorFlow Lite with the pre-tested models indicated -above. If you have an existing model, you will need to test whether your model -is compatible with both the converter and the supported operator set. To test -your model, see the -[documentation on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). - -### Retrain Inception-V3 or MobileNet for a custom data set - -The pre-trained models mentioned above have been trained on the ImageNet data -set, which consists of 1000 predefined classes. If those classes are not -relevant or useful for your use case, you will need to retrain those -models. 
This technique is called transfer learning, which starts with a model -that has been already trained on a problem and will then be retrained on a -similar problem. Deep learning from scratch can take days, but transfer learning -can be done fairly quickly. In order to do this, you'll need to generate your -custom data set labeled with the relevant classes. - -The [TensorFlow for Poets](https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/) -codelab walks through this process step-by-step. The retraining code supports -retraining for both floating point and quantized inference. - -## TensorFlow Lite Architecture - -The following diagram shows the architectural design of TensorFlow Lite: - -<img src="https://www.tensorflow.org/images/tflite-architecture.jpg" - alt="TensorFlow Lite architecture diagram" - style="max-width:600px;"> - -Starting with a trained TensorFlow model on disk, you'll convert that model to -the TensorFlow Lite file format (`.tflite`) using the TensorFlow Lite -Converter. Then you can use that converted file in your mobile application. - -Deploying the TensorFlow Lite model file uses: - -- Java API: A convenience wrapper around the C++ API on Android. - -- C++ API: Loads the TensorFlow Lite Model File and invokes the Interpreter. The - same library is available on both Android and iOS. - -- Interpreter: Executes the model using a set of kernels. The interpreter - supports selective kernel loading; without kernels it is only 100KB, and 300KB - with all the kernels loaded. This is a significant reduction from the 1.5M - required by TensorFlow Mobile. - -- On select Android devices, the Interpreter will use the Android Neural - Networks API for hardware acceleration, or default to CPU execution if none - are available. - -You can also implement custom kernels using the C++ API that can be used by the -Interpreter. - -## Future Work - -In future releases, TensorFlow Lite will support more models and built-in -operators, contain performance improvements for both fixed point and floating -point models, improvements to the tools to enable easier developer workflows and -support for other smaller devices and more. As we continue development, we hope -that TensorFlow Lite will greatly simplify the developer experience of targeting -a model for small devices. - -Future plans include using specialized machine learning hardware to get the best -possible performance for a particular model on a particular device. - -## Next Steps - -The TensorFlow Lite [GitHub repository](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). -contains additional docs, code samples, and demo applications. diff --git a/tensorflow/docs_src/mobile/tflite/performance.md b/tensorflow/docs_src/mobile/tflite/performance.md deleted file mode 100644 index 79bacaaa1b..0000000000 --- a/tensorflow/docs_src/mobile/tflite/performance.md +++ /dev/null @@ -1,174 +0,0 @@ -# Performance - -This document lists TensorFlow Lite performance benchmarks when running well -known models on some Android and iOS devices. - -These performance benchmark numbers were generated with the -[Android TFLite benchmark binary](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark) -and the [iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios). 
-
-## Android performance benchmarks
-
-For Android benchmarks, the CPU affinity is set to use big cores on the device to
-reduce variance (see [details](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#reducing-variance-between-runs-on-android)).
-
-The benchmark assumes that the models were downloaded and unzipped to the
-`/data/local/tmp/tflite_models` directory. The benchmark binary is built
-using [these instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#on-android)
-and is assumed to be in the `/data/local/tmp` directory.
-
-To run the benchmark:
-
-```
-adb shell taskset ${CPU_MASK} /data/local/tmp/benchmark_model \
-  --num_threads=1 \
-  --graph=/data/local/tmp/tflite_models/${GRAPH} \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=false
-```
-
-Here, `${GRAPH}` is the name of the model file and `${CPU_MASK}` is the CPU
-affinity chosen according to the following table:
-
-Device   | CPU_MASK
----------|---------
-Pixel 2  | f0
-Pixel XL | 0c
-
-<table>
-  <thead>
-    <tr>
-      <th>Model Name</th>
-      <th>Device</th>
-      <th>Mean inference time (std dev)</th>
-    </tr>
-  </thead>
-  <tr>
-    <td rowspan="2">
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>166.5 ms (2.6 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>122.9 ms (1.8 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>69.5 ms (0.9 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>78.9 ms (2.2 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>273.8 ms (3.5 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>210.8 ms (4.2 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>234.0 ms (2.1 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>158.0 ms (2.1 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>2846.0 ms (15.0 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>1973.0 ms (15.0 ms)</td>
-  </tr>
-  <tr>
-    <td rowspan="2">
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
-    </td>
-    <td>Pixel 2</td>
-    <td>3180.0 ms (11.7 ms)</td>
-  </tr>
-  <tr>
-    <td>Pixel XL</td>
-    <td>2262.0 ms (21.0 ms)</td>
-  </tr>
-</table>
-
-## iOS benchmarks
-
-To run the iOS benchmarks, the [benchmark
-app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios)
-was modified to include the appropriate model, and `benchmark_params.json` was
-modified to set `num_threads` to 1.
-
-<table>
-  <thead>
-    <tr>
-      <th>Model Name</th>
-      <th>Device</th>
-      <th>Mean inference time (std dev)</th>
-    </tr>
-  </thead>
-  <tr>
-    <td>
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224.tgz">Mobilenet_1.0_224 (float)</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>32.2 ms (0.8 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="http://download.tensorflow.org/models/mobilenet_v1_2018_02_22/mobilenet_v1_1.0_224_quant.tgz">Mobilenet_1.0_224 (quant)</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>24.4 ms (0.8 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/nasnet_mobile_2018_04_27.tgz">NASNet mobile</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>60.3 ms (0.6 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/squeezenet_2018_04_27.tgz">SqueezeNet</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>44.3 ms (0.7 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_resnet_v2_2018_04_27.tgz">Inception_ResNet_V2</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>562.4 ms (18.2 ms)</td>
-  </tr>
-  <tr>
-    <td>
-      <a href="https://storage.googleapis.com/download.tensorflow.org/models/tflite/model_zoo/upload_20180427/inception_v4_2018_04_27.tgz">Inception_V4</a>
-    </td>
-    <td>iPhone 8</td>
-    <td>661.0 ms (29.2 ms)</td>
-  </tr>
-</table>
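-
-The numbers above were collected with a single thread on the CPU. As a rough
-sketch of how the effect of threading or NNAPI can be explored on an Android
-device, reusing the binary, flags, and CPU mask from the Android section above
-(the model file name is an assumption based on the contents of the downloaded
-archive), you could rerun the benchmark with different settings and compare
-against the tables:
-
-```
-# MobileNet float model on a Pixel 2, using 4 threads instead of 1.
-adb shell taskset f0 /data/local/tmp/benchmark_model \
-  --num_threads=4 \
-  --graph=/data/local/tmp/tflite_models/mobilenet_v1_1.0_224.tflite \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=false
-
-# Same model, single thread, with NNAPI acceleration enabled where available.
-adb shell taskset f0 /data/local/tmp/benchmark_model \
-  --num_threads=1 \
-  --graph=/data/local/tmp/tflite_models/mobilenet_v1_1.0_224.tflite \
-  --warmup_runs=1 \
-  --num_runs=50 \
-  --use_nnapi=true
-```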