
# Getting Started with TensorFlow

This document introduces the TensorFlow programming environment and shows you
how to solve the Iris classification problem in TensorFlow.

## Prerequisites

Prior to using the sample code in this document, you'll need to do the
following:

* @{$install$Install TensorFlow}.
* If you installed TensorFlow with virtualenv or Anaconda, activate your
  TensorFlow environment.
* Install or upgrade pandas by issuing the following command:

        pip install pandas

## Getting the sample code

Take the following steps to get the sample code we'll be going through:

1. Clone the TensorFlow Models repository from GitHub by entering the following
   command:

        git clone https://github.com/tensorflow/models

1. Change directory within that repository to the location containing the
   examples used in this document:

        cd models/samples/core/get_started/

The program described in this document is
[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
This program uses
[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
to fetch its training data.

### Running the program

You run TensorFlow programs as you would run any Python program. For example:

``` bash
python premade_estimator.py
```

The program should output training logs followed by some predictions against
the test set. For example, the first line in the following output shows that
the model thinks there is a 99.6% chance that the first example in the test
set is a Setosa. Since the test set expected Setosa, this appears to be
a good prediction.

``` None
...
Prediction is "Setosa" (99.6%), expected "Setosa"

Prediction is "Versicolor" (99.8%), expected "Versicolor"

Prediction is "Virginica" (97.9%), expected "Virginica"
```

If the program generates errors instead of answers, ask yourself the following
questions:

* Did you install TensorFlow properly?
* Are you using the correct version of TensorFlow?
* Did you activate the environment you installed TensorFlow in? (This is
  only relevant in certain installation mechanisms.)

## The programming stack

Before getting into the details of the program itself, let's investigate the
programming environment. As the following illustration shows, TensorFlow
provides a programming stack consisting of multiple API layers:

<div style="width:100%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%" src="../images/tensorflow_programming_environment.png">
</div>

We strongly recommend writing TensorFlow programs with the following APIs:

* @{$programmers_guide/estimators$Estimators}, which represent a complete model.
  The Estimator API provides methods to train the model, to judge the model's
  accuracy, and to generate predictions.
* @{$get_started/datasets_quickstart$Datasets}, which build a data input
  pipeline. The Dataset API has methods to load and manipulate data, and feed
  it into your model. The Dataset API meshes well with the Estimators API.

## Classifying irises: an overview

The sample program in this document builds and tests a model that
classifies Iris flowers into three different species based on the size of their
[sepals](https://en.wikipedia.org/wiki/Sepal) and
[petals](https://en.wikipedia.org/wiki/Petal).

<div style="width:80%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
  alt="Petal geometry compared for three iris species: Iris setosa, Iris versicolor, and Iris virginica"
  src="../images/iris_three_species.jpg">
</div>
**From left to right,
[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by
[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0),
[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by
[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0),
and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862)
(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA
2.0).**

### The data set

The Iris data set contains four features and one
[label](https://developers.google.com/machine-learning/glossary/#label).
The four features identify the following botanical characteristics of
individual Iris flowers:

* sepal length
* sepal width
* petal length
* petal width

Our model will represent these features as `float32` numerical data.

The label identifies the Iris species, which must be one of the following:

* Iris setosa (0)
* Iris versicolor (1)
* Iris virginica (2)

Our model will represent the label as `int32` categorical data.

The following table shows three examples in the data set:

| sepal length | sepal width | petal length | petal width | species (label) |
|-------------:|------------:|-------------:|------------:|:---------------:|
|          5.1 |         3.3 |          1.7 |         0.5 | 0 (Setosa)      |
|          5.0 |         2.3 |          3.3 |         1.0 | 1 (Versicolor)  |
|          6.4 |         2.8 |          5.6 |         2.2 | 2 (Virginica)   |

### The algorithm

The program trains a Deep Neural Network classifier model having the following
topology:

* 2 hidden layers.
* Each hidden layer contains 10 nodes.

The following figure illustrates the features, hidden layers, and predictions
(not all of the nodes in the hidden layers are shown):

<div style="width:80%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
  alt="A diagram of the network architecture: Inputs, 2 hidden layers, and outputs"
  src="../images/custom_estimators/full_network.png">
</div>
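
To get a feel for the size of this topology, the following sketch counts the trainable parameters of a network with 4 inputs, two hidden layers of 10 nodes, and 3 outputs. It assumes every layer is fully connected and that each non-input node has one bias term; the helper `count_dense_parameters` is a hypothetical name used only for this illustration, not part of TensorFlow.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of nodes per layer, from input to output.
    Assumes one weight per connection and one bias per non-input node.
    """
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out  # one weight per connection
        total += fan_out           # one bias per node in the receiving layer
    return total


# 4 input features, two hidden layers of 10 nodes each, 3 output classes.
iris_layer_sizes = [4, 10, 10, 3]
print(count_dense_parameters(iris_layer_sizes))  # 193
```

Under those assumptions the count is (4×10 + 10) + (10×10 + 10) + (10×3 + 3) = 193, which is small enough to train quickly on a laptop.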

### Inference

Running the trained model on an unlabeled example yields three predictions,
one per Iris species, each giving the likelihood that the flower belongs to
that species. The three predictions sum to 1.0. For example, the prediction on
an unlabeled example might be something like the following:

* 0.03 for Iris Setosa
* 0.95 for Iris Versicolor
* 0.02 for Iris Virginica

The preceding prediction indicates a 95% probability that the given unlabeled
example is an Iris Versicolor.
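
As a rough illustration of where such numbers can come from, the sketch below turns three raw output scores into probabilities that sum to 1.0 and then picks the most likely class. It assumes the scores are normalized with softmax, a common choice for multi-class classifiers; the score values and the `softmax` helper are made up for this example and are not taken from the sample program.

```python
import math

SPECIES = ['Setosa', 'Versicolor', 'Virginica']


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1.0."""
    largest = max(scores)
    # Subtracting the largest score avoids overflow without changing the result.
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for one unlabeled flower.
scores = [0.5, 4.0, 0.1]
probabilities = softmax(scores)

# The predicted class is the index of the largest probability.
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)

for name, p in zip(SPECIES, probabilities):
    print('{:.2f} for Iris {}'.format(p, name))
print('Predicted species:', SPECIES[class_id])
```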

## Overview of programming with Estimators

An Estimator is TensorFlow's high-level representation of a complete model. It
handles the details of initialization, logging, saving and restoring, and many
other features so you can concentrate on your model. For more details see
@{$programmers_guide/estimators}.

An Estimator is any class derived from @{tf.estimator.Estimator}. TensorFlow
provides a collection of
[pre-made Estimators](https://developers.google.com/machine-learning/glossary/#pre-made_Estimator)
(for example, `LinearRegressor`) to implement common ML algorithms. Beyond
those, you may write your own
[custom Estimators](https://developers.google.com/machine-learning/glossary/#custom_Estimator).
We recommend using pre-made Estimators when just getting started with
TensorFlow. After gaining expertise with the pre-made Estimators, we recommend
optimizing your model by creating your own custom Estimators.

To write a TensorFlow program based on pre-made Estimators, you must perform the
following tasks:

* Create one or more input functions.
* Define the model's feature columns.
* Instantiate an Estimator, specifying the feature columns and various
  hyperparameters.
* Call one or more methods on the Estimator object, passing the appropriate
  input function as the source of the data.

Let's see how those tasks are implemented for Iris classification.

## Create input functions

You must create input functions to supply data for training,
evaluation, and prediction.

An **input function** is a function that returns a @{tf.data.Dataset} object
which outputs the following two-element tuple:

* [`features`](https://developers.google.com/machine-learning/glossary/#feature) - A Python dictionary in which:
    * Each key is the name of a feature.
    * Each value is an array containing all of that feature's values.
* `label` - An array containing the values of the
  [label](https://developers.google.com/machine-learning/glossary/#label) for
  every example.

Just to demonstrate the format of the input function, here's a simple
implementation:

```python
import numpy as np


def input_evaluation_set():
    features = {'SepalLength': np.array([6.4, 5.0]),
                'SepalWidth':  np.array([2.8, 2.3]),
                'PetalLength': np.array([5.6, 3.3]),
                'PetalWidth':  np.array([2.2, 1.0])}
    labels = np.array([2, 1])
    return features, labels
```
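
Data often arrives one example per row, as in the table shown earlier, while the `features` dictionary is organized one feature per key. The following sketch shows one way to convert between the two layouts in plain Python. The helper `rows_to_features_and_labels` is a hypothetical name for this illustration; the sample program itself uses pandas for this step.

```python
FEATURE_NAMES = ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']

# Each row holds the four measurements followed by the integer label.
rows = [
    (6.4, 2.8, 5.6, 2.2, 2),
    (5.0, 2.3, 3.3, 1.0, 1),
]


def rows_to_features_and_labels(rows):
    """Regroup row-oriented examples into a features dict and a label list."""
    features = {name: [] for name in FEATURE_NAMES}
    labels = []
    for *values, label in rows:
        for name, value in zip(FEATURE_NAMES, values):
            features[name].append(value)
        labels.append(label)
    return features, labels


features, labels = rows_to_features_and_labels(rows)
print(features['SepalLength'])  # [6.4, 5.0]
print(labels)                   # [2, 1]
```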

Your input function may generate the `features` dictionary and `label` list any
way you like. However, we recommend using TensorFlow's Dataset API, which can
parse all sorts of data. At a high level, the Dataset API consists of the
following classes:

<div style="width:80%; margin:auto; margin-bottom:10px; margin-top:20px;">
<img style="width:100%"
  alt="A diagram showing subclasses of the Dataset class"
  src="../images/dataset_classes.png">
</div>

Where the individual members are:

* `Dataset` - Base class containing methods to create and transform
  datasets. Also allows you to initialize a dataset from data in memory, or from
  a Python generator.
* `TextLineDataset` - Reads lines from text files.
* `TFRecordDataset` - Reads records from TFRecord files.
* `FixedLengthRecordDataset` - Reads fixed size records from binary files.
* `Iterator` - Provides a way to access one data set element at a time.

The Dataset API can handle a lot of common cases for you. For example,
using the Dataset API, you can easily read in records from a large collection
of files in parallel and join them into a single stream.

To keep things simple in this example, we are going to load the data with
[pandas](https://pandas.pydata.org/), and build our input pipeline from this
in-memory data.

Here is the input function used for training in this program, which is available
in [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py):

``` python
def train_input_fn(features, labels, batch_size):
    """An input function for training"""
    # Convert the inputs to a Dataset.
    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))

    # Shuffle, repeat, and batch the examples.
    return dataset.shuffle(1000).repeat().batch(batch_size)
```
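
If the chained `shuffle`, `repeat`, and `batch` calls are unfamiliar, the following plain-Python analogy may help. It is only a sketch of the behavior, not how `tf.data` is implemented, and it assumes the shuffle buffer is at least as large as the data set so that each pass is fully shuffled. The generator name `shuffled_repeated_batches` is hypothetical.

```python
import itertools
import random


def shuffled_repeated_batches(examples, batch_size, seed=0):
    """Yield an endless stream of fixed-size batches in shuffled order."""
    rng = random.Random(seed)

    def endless_examples():
        while True:                 # repeat(): never run out of data
            epoch = list(examples)
            rng.shuffle(epoch)      # shuffle(): a new order on every pass
            yield from epoch

    stream = endless_examples()
    while True:
        # batch(): group consecutive examples into lists of batch_size
        yield list(itertools.islice(stream, batch_size))


# Five toy examples, batch size 2: the third batch spans two passes.
batches = shuffled_repeated_batches(range(5), batch_size=2)
first_three = list(itertools.islice(batches, 3))
print(first_three)
```

Because the stream never ends, something else has to decide when training stops, which is the job of the `steps` argument passed to `train`.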

## Define the feature columns

A [**feature column**](https://developers.google.com/machine-learning/glossary/#feature_columns)
is an object describing how the model should use raw input data from the
features dictionary. When you build an Estimator model, you pass it a list of
feature columns that describes each of the features you want the model to use.
The @{tf.feature_column} module provides many options for representing data
to the model.

For Iris, the 4 raw features are numeric values, so we'll build a list of
feature columns to tell the Estimator model to represent each of the four
features as 32-bit floating-point values. Therefore, the code to create the
feature column is:

```python
# Feature columns describe how to use the input.
my_feature_columns = []
for key in train_x.keys():
    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
```

Feature columns can be far more sophisticated than those we're showing here.  We
detail feature columns @{$get_started/feature_columns$later on} in our Getting
Started guide.

Now that we have the description of how we want the model to represent the raw
features, we can build the estimator.


## Instantiate an estimator

The Iris problem is a classic classification problem. Fortunately, TensorFlow
provides several pre-made classifier Estimators, including:

* @{tf.estimator.DNNClassifier} for deep models that perform multi-class
  classification.
* @{tf.estimator.DNNLinearCombinedClassifier} for wide & deep models.
* @{tf.estimator.LinearClassifier} for classifiers based on linear models.

For the Iris problem, `tf.estimator.DNNClassifier` seems like the best choice.
Here's how we instantiated this Estimator:

```python
# Build a DNN with 2 hidden layers and 10 nodes in each hidden layer.
classifier = tf.estimator.DNNClassifier(
    feature_columns=my_feature_columns,
    # Two hidden layers of 10 nodes each.
    hidden_units=[10, 10],
    # The model must choose between 3 classes.
    n_classes=3)
```

## Train, Evaluate, and Predict

Now that we have an Estimator object, we can call methods to do the following:

* Train the model.
* Evaluate the trained model.
* Use the trained model to make predictions.

### Train the model

Train the model by calling the Estimator's `train` method as follows:

```python
# Train the Model.
classifier.train(
    input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size),
    steps=args.train_steps)
```

Here we wrap up our `input_fn` call in a
[`lambda`](https://docs.python.org/3/tutorial/controlflow.html)
to capture the arguments while providing an input function that takes no
arguments, as expected by the Estimator. The `steps` argument tells the method
to stop training after the specified number of training steps.
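
The argument-capturing trick can be tried without TensorFlow. In the sketch below, `train_input_fn` and `call_input_fn` are simplified stand-ins written for this illustration (they are not the functions from `iris_data.py` or the Estimator API). It also shows `functools.partial` from the standard library as an alternative to `lambda`.

```python
import functools


def train_input_fn(features, labels, batch_size):
    """Stand-in for an input function that needs three arguments."""
    return features, labels, batch_size


def call_input_fn(input_fn):
    """Stand-in for a caller that invokes input_fn with no arguments."""
    return input_fn()


train_x = {'SepalLength': [6.4, 5.0]}
train_y = [2, 1]

# Both wrappers capture the arguments now and defer the call until later.
with_lambda = call_input_fn(lambda: train_input_fn(train_x, train_y, 100))
with_partial = call_input_fn(
    functools.partial(train_input_fn, train_x, train_y, 100))

print(with_lambda == with_partial)  # True
```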

### Evaluate the trained model

Now that the model has been trained, we can get some statistics on its
performance. The following code block evaluates the accuracy of the trained
model on the test data:

```python
# Evaluate the model.
eval_result = classifier.evaluate(
    input_fn=lambda:iris_data.eval_input_fn(test_x, test_y, args.batch_size))

print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
```

Unlike our call to the `train` method, our call to `evaluate` does not pass the
`steps` argument. It doesn't need one, because our `eval_input_fn` yields only
a single
[epoch](https://developers.google.com/machine-learning/glossary/#epoch) of data.

Running this code yields the following output (or something similar):

```none
Test set accuracy: 0.967
```
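
Accuracy here is simply the fraction of examples whose predicted label matches the expected label. The following sketch computes it by hand on made-up labels; the 30 examples and the single misclassification are hypothetical values chosen so the result rounds to the same figure, and the `accuracy` helper is not part of the sample program.

```python
def accuracy(predicted_labels, expected_labels):
    """Return the fraction of predictions that match the expected labels."""
    matches = sum(p == e for p, e in zip(predicted_labels, expected_labels))
    return matches / len(expected_labels)


# Hypothetical test set: 30 labels, 10 of each species.
expected = [0, 1, 2] * 10

# Hypothetical predictions: identical except for one mistake.
predicted = list(expected)
predicted[4] = 2  # a Versicolor (1) misclassified as Virginica (2)

print('Test set accuracy: {:0.3f}'.format(accuracy(predicted, expected)))
# Test set accuracy: 0.967
```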

### Making predictions (inferring) from the trained model

We now have a trained model that produces good evaluation results, so we can
use it to predict the species of an Iris flower based on some unlabeled
measurements. As with training and evaluation, we make predictions using a
single function call:

```python
# Generate predictions from the model
expected = ['Setosa', 'Versicolor', 'Virginica']
predict_x = {
    'SepalLength': [5.1, 5.9, 6.9],
    'SepalWidth': [3.3, 3.0, 3.1],
    'PetalLength': [1.7, 4.2, 5.4],
    'PetalWidth': [0.5, 1.5, 2.1],
}

predictions = classifier.predict(
    input_fn=lambda:iris_data.eval_input_fn(predict_x,
                                            batch_size=args.batch_size))
```

The `predict` method returns a Python iterable, yielding a dictionary of
prediction results for each example. The following code prints a few
predictions and their probabilities:


``` python
for pred_dict, expec in zip(predictions, expected):
    template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"')

    class_id = pred_dict['class_ids'][0]
    probability = pred_dict['probabilities'][class_id]

    print(template.format(iris_data.SPECIES[class_id],
                          100 * probability, expec))
```

Running the preceding code yields the following output:

``` None
...
Prediction is "Setosa" (99.6%), expected "Setosa"

Prediction is "Versicolor" (99.8%), expected "Versicolor"

Prediction is "Virginica" (97.9%), expected "Virginica"
```


## Summary

Pre-made Estimators are an effective way to quickly create standard models.

Now that you've gotten started writing TensorFlow programs, consider the
following material:

* @{$get_started/checkpoints$Checkpoints} to learn how to save and restore
  models.
* @{$get_started/datasets_quickstart$Datasets} to learn more about importing
  data into your model.
* @{$get_started/custom_estimators$Creating Custom Estimators} to learn how to
  write your own Estimator, customized for a particular problem.