# A Tool Developer's Guide to TensorFlow Model Files

Most users shouldn't need to care about the internal details of how TensorFlow
stores data on disk, but you might if you're a tool developer. For example, you
may want to analyze models, or convert back and forth between TensorFlow and
other formats. This guide tries to explain some of the details of how you can
work with the main files that hold model data, to make it easier to develop
those kinds of tools.

[TOC]

## Protocol Buffers

All of TensorFlow's file formats are based on
[Protocol Buffers](https://developers.google.com/protocol-buffers/?hl=en), so to
start it's worth getting familiar with how they work. The summary is that you
define data structures in text files, and the protobuf tools generate classes in
C++, Python, and other languages that can load, save, and access the data in a
friendly way. We often refer to Protocol Buffers as protobufs, and I'll use
that convention in this guide.

## GraphDef

The foundation of computation in TensorFlow is the `Graph` object. This holds a
network of nodes, each representing one operation, connected to each other as
inputs and outputs. After you've created a `Graph` object, you can save it out
by calling `as_graph_def()`, which returns a `GraphDef` object.

The GraphDef class is an object created by the ProtoBuf library from the
definition in
[tensorflow/core/framework/graph.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto). The protobuf tools parse
this text file, and generate the code to load, store, and manipulate graph
definitions. If you see a standalone TensorFlow file representing a model, it's
likely to contain a serialized version of one of these `GraphDef` objects
saved out by the protobuf code.

This generated code is used to save and load the `GraphDef` files from disk. The
code that actually loads the model looks like this:

```python
graph_def = graph_pb2.GraphDef()
```

This line creates an empty `GraphDef` object, the class that's been created
from the textual definition in graph.proto. This is the object we're going to
populate with the data from our file.

```python
with open(FLAGS.graph, "rb") as f:
```

Here we get a file handle for the path we've passed in to the script.

```python
  if FLAGS.input_binary:
    graph_def.ParseFromString(f.read())
  else:
    text_format.Merge(f.read(), graph_def)
```

## Text or Binary?

There are actually two different formats that a ProtoBuf can be saved in.
TextFormat is a human-readable form, which makes it nice for debugging and
editing, but can get large when there's numerical data like weights stored in
it. You can see a small example of that in
[graph_run_run2.pbtxt](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tensorboard/demo/data/graph_run_run2.pbtxt).

Binary format files are a lot smaller than their text equivalents, even though
they're not as readable for us. In this script, we ask the user to supply a
flag indicating whether the input file is binary or text, so we know the right
function to call. You can find an example of a large binary file inside the
[inception_v3 archive](https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz),
as `inception_v3_2016_08_28_frozen.pb`.

The API itself can be a bit confusing - the binary call is actually
`ParseFromString()`, whereas you use a utility function from the `text_format`
module to load textual files.
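
As a rough consolidation of the fragments above, the loading steps could be wrapped in a single helper along these lines. The function name `load_graph_def` and its `binary` parameter are illustrative choices, not part of TensorFlow. The imports are placed inside the function only so that the helper can be defined on a machine without TensorFlow installed.

```python
def load_graph_def(path, binary=True):
    """Read a GraphDef from `path` in binary or text protobuf format.

    Hypothetical helper for illustration; the name and signature are
    assumptions, not a TensorFlow API.
    """
    # Deferred imports: both modules are needed only when the helper is called.
    from google.protobuf import text_format
    from tensorflow.core.framework import graph_pb2

    graph_def = graph_pb2.GraphDef()
    if binary:
        # Binary files hold the serialized bytes of the message.
        with open(path, "rb") as f:
            graph_def.ParseFromString(f.read())
    else:
        # Text files hold the human-readable TextFormat representation.
        with open(path, "r") as f:
            text_format.Merge(f.read(), graph_def)
    return graph_def
```

A call such as `load_graph_def("model.pb")` would then return a populated `GraphDef`, where `"model.pb"` stands in for whatever path you have.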

## Nodes

Once you've loaded a file into the `graph_def` variable, you can now access the
data inside it. For most practical purposes, the important section is the list
of nodes stored in the node member. Here's the code that loops through those:

```python
for node in graph_def.node:
  print(node.name, node.op)  # each entry is a NodeDef
```

Each node is a `NodeDef` object, defined in
[tensorflow/core/framework/node_def.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/node_def.proto). These
are the fundamental building blocks of TensorFlow graphs, with each one defining
a single operation along with its input connections. Here are the members of a
`NodeDef`, and what they mean.

### `name`

Every node should have a unique identifier that's not used by any other nodes
in the graph. If you don't specify one as you're building a graph using the
Python API, one reflecting the name of the operation, such as "MatMul",
concatenated with a monotonically increasing number, such as "5", will be
picked for you. The name is used when defining the connections between nodes,
and when setting inputs and outputs for the whole graph when it's run.
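
Since a tool may receive graphs from other converters, it can be worth checking the uniqueness requirement yourself. The sketch below is one possible check. The function `duplicate_names` is hypothetical, and the `SimpleNamespace` objects are stand-ins for `NodeDef` messages so the example runs without TensorFlow.

```python
from collections import Counter
from types import SimpleNamespace


def duplicate_names(nodes):
    """Return the sorted node names that appear more than once."""
    counts = Counter(node.name for node in nodes)
    return sorted(name for name, count in counts.items() if count > 1)


# Stand-ins for NodeDef objects; a real graph would pass graph_def.node.
example_nodes = [
    SimpleNamespace(name="MatMul"),
    SimpleNamespace(name="MatMul_1"),
    SimpleNamespace(name="MatMul"),
]
duplicates = duplicate_names(example_nodes)
```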

### `op`

This defines what operation to run, for example `"Add"`, `"MatMul"`, or
`"Conv2D"`. When a graph is run, this op name is looked up in a registry to
find an implementation. The registry is populated by calls to the
`REGISTER_OP()` macro, like those in
[tensorflow/core/ops/nn_ops.cc](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/ops/nn_ops.cc).

### `input`

A list of strings, each one of which is the name of another node, optionally
followed by a colon and an output port number. For example, a node with two
inputs might have a list like `["some_node_name", "another_node_name"]`, which
is equivalent to `["some_node_name:0", "another_node_name:0"]`, and defines the
node's first input as the first output from the node with the name
`"some_node_name"`, and a second input from the first output of
`"another_node_name"`.
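
To make the naming rule concrete, here is a small sketch that splits one entry of the `input` list into a node name and an output port. The helper name `split_input` is an assumption for illustration, and it only handles the plain `name` and `name:port` forms described above.

```python
def split_input(spec):
    """Split an input string into (node_name, output_port).

    A missing port is treated as port 0, matching the equivalence
    described in the text.
    """
    name, _, port = spec.partition(":")
    return name, int(port) if port else 0


first = split_input("some_node_name")
second = split_input("another_node_name:1")
```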

### `device`

In most cases you can ignore this, since it defines where to run a node in a
distributed environment, or when you want to force the operation onto CPU or
GPU.

### `attr`

This is a key/value store holding all the attributes of a node. These are the
permanent properties of nodes, things that don't change at runtime such as the
size of filters for convolutions, or the values of constant ops. Because there
can be so many different types of attribute values, from strings, to ints, to
arrays of tensor values, there's a separate protobuf file defining the data
structure that holds them, in
[tensorflow/core/framework/attr_value.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto).

Each attribute has a unique name string, and the expected attributes are listed
when the operation is defined. If an attribute isn't present in a node, but it
has a default listed in the operation definition, that default is used when the
graph is created.

You can access all of these members as attributes in Python, for example
`node.name`, `node.op`, and so on. The list of nodes stored in the `GraphDef` is
a full definition of the model architecture.
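
As a sketch of how a tool might use these members, the following builds a lookup table from node name to node and counts how often each op appears. The `summarize` function is hypothetical, and the `SimpleNamespace` objects stand in for `NodeDef` messages so the example runs on its own. With a real model you would pass `graph_def.node` instead.

```python
from collections import Counter
from types import SimpleNamespace


def summarize(nodes):
    """Return (name -> node lookup, op -> count) for a list of nodes."""
    by_name = {node.name: node for node in nodes}
    op_counts = Counter(node.op for node in nodes)
    return by_name, op_counts


# Stand-ins exposing the same name/op/input attributes as a NodeDef.
example_nodes = [
    SimpleNamespace(name="input", op="Placeholder", input=[]),
    SimpleNamespace(name="weights", op="Const", input=[]),
    SimpleNamespace(name="MatMul", op="MatMul", input=["input", "weights"]),
]
by_name, op_counts = summarize(example_nodes)
```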

## Freezing

One confusing part about this is that the weights usually aren't stored inside
the file format during training. Instead, they're held in separate checkpoint
files, and there are `Variable` ops in the graph that load the latest values
when they're initialized. It's often not very convenient to have separate files
when you're deploying to production, so there's the
[freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py) script that takes a graph definition and a set
of checkpoints and freezes them together into a single file.

What this does is load the `GraphDef`, pull in the values for all the variables
from the latest checkpoint file, and then replace each `Variable` op with a
`Const` that has the numerical data for the weights stored in its attributes.
It then strips away all the extraneous nodes that aren't used for forward
inference, and saves out the resulting `GraphDef` into an output file.
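
The two steps described above can be sketched in plain Python. This is a simplified illustration of the idea, not the code in freeze_graph.py. Nodes are modeled as dictionaries rather than `NodeDef` messages, the checkpoint is modeled as a dictionary keyed by variable name, and the function name `freeze_sketch` is made up for this example.

```python
def freeze_sketch(nodes, checkpoint_values, output_names):
    """Replace Variable nodes with Const nodes and drop unused nodes.

    nodes: list of dicts with 'name', 'op', 'input', and 'attr' keys.
    checkpoint_values: dict mapping variable node names to their weights.
    output_names: names of the nodes needed for forward inference.
    """
    by_name = {node["name"]: node for node in nodes}

    # Walk backwards from the outputs to find every node they depend on.
    needed, stack = set(), list(output_names)
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        for spec in by_name[name]["input"]:
            stack.append(spec.partition(":")[0])  # strip any ":port" suffix

    frozen = []
    for node in nodes:
        if node["name"] not in needed:
            continue  # extraneous node, e.g. one used only for saving
        if node["op"] == "Variable":
            # Store the checkpoint value directly in the node's attributes.
            node = {"name": node["name"], "op": "Const", "input": [],
                    "attr": {"value": checkpoint_values[node["name"]]}}
        frozen.append(node)
    return frozen


example_nodes = [
    {"name": "weights", "op": "Variable", "input": [], "attr": {}},
    {"name": "input", "op": "Placeholder", "input": [], "attr": {}},
    {"name": "matmul", "op": "MatMul", "input": ["input", "weights:0"], "attr": {}},
    {"name": "save", "op": "SaveV2", "input": ["weights"], "attr": {}},
]
frozen = freeze_sketch(example_nodes, {"weights": [1.0, 2.0]}, ["matmul"])
```

In this toy graph the `save` node is dropped because `matmul` does not depend on it, and `weights` becomes a `Const` carrying its values.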

## Weight Formats

If you're dealing with TensorFlow models that represent neural networks, one of
the most common problems is extracting and interpreting the weight values. A
common way to store them, for example in graphs created by the freeze_graph
script, is as `Const` ops containing the weights as `Tensors`. These are
defined in
[tensorflow/core/framework/tensor.proto](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto), and contain information
about the size and type of the data, as well as the values themselves. In
Python, you get a `TensorProto` object from a `NodeDef` representing a `Const`
op by calling something like `some_node_def.attr['value'].tensor`.

This will give you an object representing the weights data. The data itself
will be stored in one of the lists with the suffix `_val` as indicated by the
type of the object, for example `float_val` for 32-bit float data types.
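
Here is a minimal sketch of reading 32-bit float weights from such an object. It goes slightly beyond the text above in one respect: `TensorProto` also has a `tensor_content` bytes field that can hold the values in packed form, so the sketch checks that first. The function name `float_values`, the little-endian assumption, and the `SimpleNamespace` stand-in for a real `TensorProto` are all illustrative assumptions.

```python
import struct
from types import SimpleNamespace


def float_values(tensor):
    """Return (shape, values) for a float32 TensorProto-like object."""
    shape = [dim.size for dim in tensor.tensor_shape.dim]
    if tensor.tensor_content:
        # Packed form: raw 4-byte floats, assumed little-endian here.
        count = len(tensor.tensor_content) // 4
        values = list(struct.unpack("<%df" % count, tensor.tensor_content))
    else:
        # List form: one entry per value in the float_val field.
        values = list(tensor.float_val)
    return shape, values


# Stand-in for some_node_def.attr['value'].tensor on a 2x2 Const op.
example_tensor = SimpleNamespace(
    tensor_shape=SimpleNamespace(
        dim=[SimpleNamespace(size=2), SimpleNamespace(size=2)]),
    float_val=[1.0, 2.0, 3.0, 4.0],
    tensor_content=b"",
)
shape, values = float_values(example_tensor)
```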

The ordering of convolution weight values is often tricky to deal with when
converting between different frameworks. In TensorFlow, the filter weights for
the `Conv2D` operation are stored on the second input, and are expected to be
in the order `[filter_height, filter_width, input_depth, output_depth]`, where
`output_depth` increasing by one means moving to an adjacent value in memory.
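
The ordering can be made concrete with a little index arithmetic. The sketch below computes the flat position of one filter value and reorders a flat list of weights into `[output_depth, input_depth, filter_height, filter_width]` order. That target layout is a hypothetical example of what another framework might expect, and both function names are made up for illustration.

```python
def filter_offset(h, w, i, o, shape):
    """Flat index of element [h, w, i, o] in the TensorFlow filter layout."""
    _, width, in_depth, out_depth = shape
    return ((h * width + w) * in_depth + i) * out_depth + o


def to_output_major(values, shape):
    """Reorder flat filter values so that output_depth varies slowest."""
    height, width, in_depth, out_depth = shape
    return [values[filter_offset(h, w, i, o, shape)]
            for o in range(out_depth)
            for i in range(in_depth)
            for h in range(height)
            for w in range(width)]


# A 2x2 filter with 3 input channels and 4 output channels.
example_shape = (2, 2, 3, 4)
reordered = to_output_major(list(range(48)), example_shape)
```

Stepping the last index by one moves the offset by one, while stepping `filter_height` moves it by `filter_width * input_depth * output_depth`.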

Hopefully this rundown gives you a better idea of what's going on inside
TensorFlow model files, and will help you if you ever need to manipulate them.