From 60a9b8a71341d6809eaa165be50b46195d438e57 Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Tue, 21 Jun 2016 09:04:59 -0800 Subject: First pass on tf.learn quickstart tutorial and code files Change: 125463524 --- tensorflow/g3doc/tutorials/tflearn/index.md | 167 ++++++++++++++++++++++++++++ 1 file changed, 167 insertions(+) create mode 100644 tensorflow/g3doc/tutorials/tflearn/index.md diff --git a/tensorflow/g3doc/tutorials/tflearn/index.md b/tensorflow/g3doc/tutorials/tflearn/index.md new file mode 100644 index 0000000000..a0d9652fc9 --- /dev/null +++ b/tensorflow/g3doc/tutorials/tflearn/index.md @@ -0,0 +1,167 @@ +## TF.Learn Quickstart + +TensorFlow’s Learn API (TF.Learn) makes it easy to configure, train, and evaluate a +variety of machine learning models. In this quickstart tutorial, you’ll use TF.Learn +to construct a [Deep Neural Network](https://en.wikipedia.org/wiki/Artificial_neural_network) +classifier model and train it on [Fisher’s Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) +to predict flower species based on sepal/petal geometry. You’ll perform the following four steps: + +1. Load CSVs containing Iris training/test data into a TensorFlow `Dataset` +2. Construct a [Deep Neural Network classifier](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#DNNClassifier) +3. Fit the DNN model using the training data +4. Evaluate the accuracy of the model + +## Get Started +Remember to [install TensorFlow on your machine](https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#download-and-setup) +before getting started with this tutorial. The full code and datasets for this tutorial +can be found [here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/), +and the following sections walk through them in detail. 
+ +## Load the Iris CSV data to TensorFlow + +The [Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) +contains 150 rows of data, comprising 50 samples from each of three related +Iris species: *Iris setosa*, *Iris virginica*, and *Iris versicolor*. Each row +contains the following data for each flower sample: [sepal](https://en.wikipedia.org/wiki/Sepal) +length, sepal width, [petal](https://en.wikipedia.org/wiki/Petal) length, petal width, +and flower species. Flower species are represented as integers, with 0 denoting *Iris +setosa*, 1 denoting *Iris versicolor*, and 2 denoting *Iris virginica*. + +Sepal Length | Sepal Width | Petal Length | Petal Width | Species +:----------- | :---------- | :----------- | :---------- | :------ +5.1 | 3.5 | 1.4 | 0.2 | 0 +4.9 | 3.0 | 1.4 | 0.2 | 0 +4.7 | 3.2 | 1.3 | 0.2 | 0 +… | … | … | … | … +7.0 | 3.2 | 4.7 | 1.4 | 1 +6.4 | 3.2 | 4.5 | 1.5 | 1 +6.9 | 3.1 | 4.9 | 1.5 | 1 +… | … | … | … | … +6.5 | 3.0 | 5.2 | 2.0 | 2 +6.2 | 3.4 | 5.4 | 2.3 | 2 +5.9 | 3.0 | 5.1 | 1.8 | 2 + + +For this tutorial, the Iris data has been randomized and split into two separate CSVs: +a training set of 120 samples ([iris_training.csv](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/iris_training.csv)) +and a test set of 30 samples ([iris_test.csv](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/iris_test.csv)). + +To get started, first import TensorFlow, TF.Learn, and numpy: + +```python +import tensorflow as tf +from tensorflow.contrib import learn +import numpy as np +``` + +Next, load the training and test sets into `Dataset`s using the [`load_csv()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/base.py#L36) +method in `learn.datasets.base`. 
`load_csv()` has two required arguments: +`filename`, which takes the filepath to the CSV file, +and `target_dtype`, which takes the [`numpy` datatype](http://docs.scipy.org/doc/numpy/user/basics.types.html) +of the dataset's target value. Here, the target (the value you're training the model to predict) is +flower species, which is an integer from 0–2, so the appropriate `numpy` datatype is `np.int`: + +```python +# Data sets +IRIS_TRAINING = "iris_training.csv" +IRIS_TEST = "iris_test.csv" + +# Load datasets. +training_set = learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int) +test_set = learn.datasets.base.load_csv(filename=IRIS_TEST, target_dtype=np.int) +``` + +Next, assign variables to the feature data and target values: `x_train` for training-set feature data, +`x_test` for test-set feature data, `y_train` for training-set target values, and `y_test` for test-set +target values. Datasets in TensorFlow are [named tuples](https://docs.python.org/2/library/collections.html#collections.namedtuple), +and you can access feature data and target values via the `data` and `target` fields, respectively: + +```python +x_train, x_test, y_train, y_test = training_set.data, test_set.data, \ + training_set.target, test_set.target +``` + +Later on, in "Fit the DNNClassifier to the Iris Training Data," you'll use `x_train` and `y_train` to +train your model, and in "Evaluate Model Accuracy", you'll use `x_test` and `y_test`. But first, +you'll construct your model in the next section. + +## Construct a Deep Neural Network Classifier + +TF.Learn offers a variety of predefined models, called [`Estimator`s](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#estimators), +which you can use "out of the box" to run training and evaluation operations on your data. +Here, you'll configure a Deep Neural Network Classifier model to fit the iris data. 
Using TF.Learn, +you can instantiate your [`DNNClassifier`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#DNNClassifier) +with just one line of code: + +```python +# Build 3 layer DNN with 10, 20, 10 units respectively. +classifier = learn.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3) +``` + +The code above creates a `DNNClassifier` model with three [hidden layers](http://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw), +containing 10, 20, and 10 neurons, respectively (`hidden_units=[10, 20, 10]`), and three target +classes (`n_classes=3`). + + +## Fit the DNNClassifier to the Iris Training Data + +Now that you've configured your DNN `classifier` model, you can fit it to the Iris training data +using the [`fit`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#BaseEstimator.fit) +method. Pass as arguments your feature data (`x_train`), target values +(`y_train`), and the number of steps to train (here, 200): + +```python +# Fit model +classifier.fit(x=x_train, y=y_train, steps=200) +``` + + + + +The state of the model is preserved in the `classifier`, which means you can train iteratively if +you like. For example, the above is equivalent to the following: + +```python +classifier.fit(x=x_train, y=y_train, steps=100) +classifier.fit(x=x_train, y=y_train, steps=100) +``` + + +However, if you're looking to track the model while it trains, you'll likely +want to instead use a TensorFlow [`monitor`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/monitors.py) +to perform logging operations. + +## Evaluate Model Accuracy + +You've fit your `DNNClassifier` model on the Iris training data; now, you can check its accuracy on +the Iris test data using the [`evaluate`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#BaseEstimator.evaluate) +method. 
Like `fit`, `evaluate` takes feature data and target values as arguments, +and returns a `dict` with the evaluation results. The following code passes the Iris +test data—`x_test` and `y_test`—to `evaluate`, retrieves `accuracy` from the +results, and prints it to output: + +```python +accuracy_score = classifier.evaluate(x=x_test, y=y_test)["accuracy"] +print('Accuracy: {0:f}'.format(accuracy_score)) +``` + +Run the full script, and check the accuracy results. You should get: + +``` +Accuracy: 0.933333 +``` + +Not bad for a relatively small data set! + +## Additional Resources + +* For further reference materials on TF.Learn, see the official [API docs](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html). + + +* To learn more about using TF.Learn to create linear models, see +[Large-scale Linear Models with TensorFlow](https://www.tensorflow.org/versions/r0.9/tutorials/linear/index.html). + +* To experiment with neural network modeling and visualization in the browser, check out [Deep Playground](http://playground.tensorflow.org/). + +* For more advanced tutorials on neural networks, see [Convolutional Neural Networks](https://www.tensorflow.org/versions/r0.9/tutorials/deep_cnn/index.html) +and [Recurrent Neural Networks](https://www.tensorflow.org/versions/r0.9/tutorials/recurrent/index.html). -- cgit v1.2.3 From c0714595cb454e9f2115831b9794fe74e284907b Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Wed, 22 Jun 2016 18:24:41 -0800 Subject: Adds an overview of the tf.learn linear model tools. 
Change: 125636045 --- tensorflow/g3doc/tutorials/linear/overview.md | 237 ++++++++++++++++++++++++++ 1 file changed, 237 insertions(+) create mode 100644 tensorflow/g3doc/tutorials/linear/overview.md diff --git a/tensorflow/g3doc/tutorials/linear/overview.md b/tensorflow/g3doc/tutorials/linear/overview.md new file mode 100644 index 0000000000..b592495212 --- /dev/null +++ b/tensorflow/g3doc/tutorials/linear/overview.md @@ -0,0 +1,237 @@ +# Large-scale Linear Models with TensorFlow + +The tf.learn API provides (among other things) a rich set of tools for working +with linear models in TensorFlow. This document provides an overview of those +tools. It explains: + + * what a linear model is. + * why you might want to use a linear model. + * how tf.learn makes it easy to build linear models in TensorFlow. + * how you can use tf.learn to combine linear models with + deep learning to get the advantages of both. + +Read this overview to decide whether the tf.learn linear model tools might be +useful to you. Then do the [Linear Models tutorial](wide/) to +give it a try. This overview uses code samples from the tutorial, but the +tutorial walks through the code in greater detail. + +To understand this overview it will help to have some familiarity +with basic machine learning concepts, and also with +[tf.learn](../tflearn/). + +[TOC] + +## What is a linear model? + +A *linear model* uses a single weighted sum of features to make a prediction. +For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names) +on age, years of education, and weekly hours of +work for a population, you can learn weights for each of those numbers so that +their weighted sum estimates a person's salary. You can also use linear models +for classification. + +Some linear models transform the weighted sum into a more convenient form. 
For +example, *logistic regression* plugs the weighted sum into the logistic +function to turn the output into a value between 0 and 1. But you still just +have one weight for each input feature. + +## Why would you want to use a linear model? + +Why would you want to use so simple a model when recent research has +demonstrated the power of more complex neural networks with many layers? + +Linear models: + + * train quickly, compared to deep neural nets. + * can work well on very large feature sets. + * can be trained with algorithms that don't require a lot of fiddling + with learning rates, etc. + * can be interpreted and debugged more easily than neural nets. + You can examine the weights assigned to each feature to figure out what's + having the biggest impact on a prediction. + * provide an excellent starting point for learning about machine learning. + * are widely used in industry. + +## How does tf.learn help you build linear models? + +You can build a linear model from scratch in TensorFlow without the help of a +special API. But tf.learn provides some tools that make it easier to build +effective large-scale linear models. + +### Feature columns and transformations + +Much of the work of designing a linear model consists of transforming raw data +into suitable input features. tf.learn uses the `FeatureColumn` abstraction to +enable these transformations. + +A `FeatureColumn` represents a single feature in your data. A `FeatureColumn` +may represent a quantity like 'height', or it may represent a category like +'eye_color' where the value is drawn from a set of discrete possibilities like {'blue', 'brown', 'green'}. + +In the case of both *continuous features* like 'height' and *categorical +features* like 'eye_color', a single value in the data might get transformed +into a sequence of numbers before it is input into the model. The +`FeatureColumn` abstraction lets you manipulate the feature as a single +semantic unit in spite of this fact. 
You can specify transformations and +select features to include without dealing with specific indices in the +tensors you feed into the model. + +#### Sparse columns + +Categorical features in linear models are typically translated into a sparse +vector in which each possible value has a corresponding index or id. For +example, if there are only three possible eye colors you can represent +'eye_color' as a length 3 vector: 'brown' would become [1, 0, 0], 'blue' would +become [0, 1, 0] and 'green' would become [0, 0, 1]. These vectors are called +"sparse" because they may be very long, with many zeros, when the set of +possible values is very large (such as all English words). + +While you don't need to use sparse columns to use tf.learn linear models, one +of the strengths of linear models is their ability to deal with large sparse +vectors. Sparse features are a primary use case for the tf.learn linear model +tools. + +##### Encoding sparse columns + +`FeatureColumn` handles the conversion of categorical values into vectors +automatically, with code like this: + +```python +eye_color = tf.contrib.layers.sparse_column_with_keys( + column_name="eye_color", keys=["blue", "brown", "green"]) +``` + +where `eye_color` is the name of a column in your source data. + +You can also generate `FeatureColumn`s for categorical features for which you +don't know all possible values. For this case you would use +`sparse_column_with_hash_bucket()`, which uses a hash function to assign +indices to feature values. + +```python +education = tf.contrib.layers.sparse_column_with_hash_bucket(\ + "education", hash_bucket_size=1000) +``` + +##### Feature Crosses + +Because linear models assign independent weights to separate features, they +can't learn the relative importance of specific combinations of feature +values. 
If you have a feature 'favorite_sport' and a feature 'home_city' and +you're trying to predict whether a person likes to wear red, your linear model +won't be able to learn that baseball fans from St. Louis especially like to +wear red. + +You can get around this limitation by creating a new feature +'favorite_sport_x_home_city'. The value of this feature for a given person is +just the concatenation of the values of the two source features: +'baseball_x_stlouis', for example. This sort of combination feature is called +a *feature cross*. + +The `crossed_column()` method makes it easy to set up feature crosses: + +```python +sport = tf.contrib.layers.sparse_column_with_hash_bucket(\ + "sport", hash_bucket_size=1000) +city = tf.contrib.layers.sparse_column_with_hash_bucket(\ + "city", hash_bucket_size=1000) +sport_x_city = tf.contrib.layers.crossed_column( + [sport, city], hash_bucket_size=int(1e4)) +``` + +#### Continuous columns + +You can specify a continuous feature like so: + +```python +age = tf.contrib.layers.real_valued_column("age") +``` + +Although, as a single real number, a continuous feature can often be input +directly into the model, tf.learn offers useful transformations for this sort +of column as well. + +##### Bucketization + +*Bucketization* turns a continuous column into a categorical column. This +transformation lets you use continuous features in feature crosses, or learn +cases where specific value ranges have particular importance. + +Bucketization divides the range of possible values into subranges called +buckets: + +```python +age_buckets = tf.contrib.layers.bucketized_column( + age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +``` + +The bucket into which a value falls becomes the categorical label for +that value. + +#### Input function + +`FeatureColumn`s provide a specification for the input data for your model, +indicating how to represent and transform the data. But they do not provide +the data itself. 
You provide the data through an input function. + +The input function must return a dictionary of tensors. Each key corresponds +to the name of a `FeatureColumn`. Each key's value is a tensor containing the +values of that feature for all data instances. See `input_fn` in the [linear +models tutorial code]( +https://www.tensorflow.org/code/tensorflow/examples/learn/wide_n_deep_tutorial.py?l=160) +for an example of an input function. + +The input function is passed to the `fit()` and `evaluate()` calls that +initiate training and testing, as described in the next section. + +### Linear estimators + +tf.learn's estimator classes provide a unified training and evaluation harness +for regression and classification models. They take care of the details of the +training and evaluation loops and allow the user to focus on model inputs and +architecture. + +To build a linear estimator, you can use either the +`tf.contrib.learn.LinearClassifier` estimator or the +`tf.contrib.learn.LinearRegressor` estimator, for classification and +regression respectively. + +As with all tf.learn estimators, to run the estimator you just: + + 1. Instantiate the estimator class. For the two linear estimator classes, + you pass a list of `FeatureColumn`s to the constructor. + 2. Call the estimator's `fit()` method to train it. + 3. Call the estimator's `evaluate()` method to see how it does. + +For example: + +```python +e = tf.contrib.learn.LinearClassifier(feature_columns=[ + native_country, education, occupation, workclass, marital_status, + race, age_buckets, education_x_occupation, age_buckets_x_race_x_occupation], + model_dir=YOUR_MODEL_DIRECTORY) +e.fit(input_fn=input_fn_train, steps=200) +# Evaluate for one step (one pass through the test data). +results = e.evaluate(input_fn=input_fn_test, steps=1) + +# Print the stats for the evaluation. 
+for key in sorted(results): + print "%s: %s" % (key, results[key]) +``` + +### Wide and deep learning + +The tf.learn API also provides an estimator class that lets you jointly train +a linear model and a deep neural network. This novel approach combines the +ability of linear models to "memorize" key features with the generalization +ability of neural nets. Use `tf.contrib.learn.DNNLinearCombinedClassifier` to +create this sort of "wide and deep" model: + +```python +e = tf.contrib.learn.DNNLinearCombinedClassifier( + model_dir=YOUR_MODEL_DIR, + linear_feature_columns=wide_columns, + dnn_feature_columns=deep_columns, + dnn_hidden_units=[100, 50]) +``` +For more information, see the [Wide and Deep Learning tutorial](../wide_n_deep/). -- cgit v1.2.3 From 06d5dbbbb9994ed18f363d5b6b6c2c27ab4ecbbd Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Tue, 28 Jun 2016 08:24:35 -0800 Subject: Publishes tutorials for tf.contrib.learn linear models and wide and deep models. Change: 126082003 --- tensorflow/g3doc/images/wide_n_deep.svg | 1540 +++++++++++++++++++++ tensorflow/g3doc/tutorials/wide/index.md | 482 +++++++ tensorflow/g3doc/tutorials/wide_and_deep/index.md | 275 ++++ 3 files changed, 2297 insertions(+) create mode 100644 tensorflow/g3doc/images/wide_n_deep.svg create mode 100644 tensorflow/g3doc/tutorials/wide/index.md create mode 100644 tensorflow/g3doc/tutorials/wide_and_deep/index.md diff --git a/tensorflow/g3doc/images/wide_n_deep.svg b/tensorflow/g3doc/images/wide_n_deep.svg new file mode 100644 index 0000000000..6dfe9e7f10 --- /dev/null +++ b/tensorflow/g3doc/images/wide_n_deep.svg @@ -0,0 +1,1540 @@ +[SVG image data: wide_n_deep.svg (image/svg+xml)]
diff --git a/tensorflow/g3doc/tutorials/wide/index.md b/tensorflow/g3doc/tutorials/wide/index.md new file mode 100644 index 0000000000..5dd409f4e4 --- /dev/null +++ b/tensorflow/g3doc/tutorials/wide/index.md @@ -0,0 +1,482 @@ +# TensorFlow Linear Model Tutorial + +In this tutorial, we will use the TF.Learn API in TensorFlow to solve a binary +classification problem: Given census data about a person such as age, gender, +education and occupation (the features), we will try to predict whether or not +the person earns more than 50,000 dollars a year (the target label). We will +train a **logistic regression** model, and given an individual's information our +model will output a number between 0 and 1, which can be interpreted as the +probability that the individual has an annual income of over 50,000 dollars. + +## Setup + +To try the code for this tutorial: + +1. [Install TensorFlow](../../get_started/os_setup.md) if you haven't +already. + +2. Download [the tutorial code]( +https://www.tensorflow.org/code/tensorflow/examples/learn/wide_n_deep_tutorial.py). + +3. Install the pandas data analysis library. tf.learn doesn't require pandas, but it does support it, and this tutorial uses pandas. To install pandas: + 1. Get `pip`: + + ```shell + # Ubuntu/Linux 64-bit + $ sudo apt-get install python-pip python-dev + + # Mac OS X + $ sudo easy_install pip + $ sudo easy_install --upgrade six + ``` + + 2. Use `pip` to install pandas: + + ```shell + $ sudo pip install pandas + ``` + + If you have trouble installing pandas, consult the [instructions] +(http://pandas.pydata.org/pandas-docs/stable/install.html) on the pandas site. + +4. 
Execute the tutorial code with the following command to train the linear +model described in this tutorial: + + ```shell + $ python wide_n_deep_tutorial.py --model_type=wide + ``` + +Read on to find out how this code builds its linear model. + +## Reading The Census Data + +The dataset we'll be using is the [Census Income Dataset] +(https://archive.ics.uci.edu/ml/datasets/Census+Income). You can download the +[training data] +(https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data) and +[test data] +(https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test) +manually or use code like this: + +```python +import tempfile +import urllib +train_file = tempfile.NamedTemporaryFile() +test_file = tempfile.NamedTemporaryFile() +urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", train_file.name) +urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test", test_file.name) +``` + +Once the CSV files are downloaded, let's read them into [Pandas] +(http://pandas.pydata.org/) dataframes. + +```python +import pandas as pd +COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num", + "marital_status", "occupation", "relationship", "race", "gender", + "capital_gain", "capital_loss", "hours_per_week", "native_country", + "income_bracket"] +df_train = pd.read_csv(train_file, names=COLUMNS, skipinitialspace=True) +df_test = pd.read_csv(test_file, names=COLUMNS, skipinitialspace=True, skiprows=1) +``` + +Since the task is a binary classification problem, we'll construct a label +column named "label" whose value is 1 if the income is over 50K, and 0 +otherwise. 
+ +```python +LABEL_COLUMN = "label" +df_train[LABEL_COLUMN] = (df_train["income_bracket"].apply(lambda x: ">50K" in x)).astype(int) +df_test[LABEL_COLUMN] = (df_test["income_bracket"].apply(lambda x: ">50K" in x)).astype(int) +``` + +Next, let's take a look at the dataframe and see which columns we can use to +predict the target label. The columns can be grouped into two types—categorical +and continuous columns: + +* A column is called **categorical** if its value can only be one of the + categories in a finite set. For example, the native country of a person + (U.S., India, Japan, etc.) or the education level (high school, college, + etc.) are categorical columns. +* A column is called **continuous** if its value can be any numerical value in + a continuous range. For example, the capital gain of a person (e.g. $14,084) + is a continuous column. + +```python +CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation", + "relationship", "race", "gender", "native_country"] +CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss", "hours_per_week"] +``` + +Here's a list of columns available in the Census Income dataset: + +| Column Name | Type | Description | {.sortable} +| -------------- | ----------- | --------------------------------- | +| age | Continuous | The age of the individual | +| workclass | Categorical | The type of employer the | +: : : individual has (government, : +: : : military, private, etc.). : +| fnlwgt | Continuous | The number of people the census | +: : : takers believe that observation : +: : : represents (sample weight). This : +: : : variable will not be used. : +| education | Categorical | The highest level of education | +: : : achieved for that individual. : +| education_num | Continuous | The highest level of education in | +: : : numerical form. : +| marital_status | Categorical | Marital status of the individual. | +| occupation | Categorical | The occupation of the individual. 
| +| relationship | Categorical | Wife, Own-child, Husband, | +: : : Not-in-family, Other-relative, : +: : : Unmarried. : +| race | Categorical | White, Asian-Pac-Islander, | +: : : Amer-Indian-Eskimo, Other, Black. : +| gender | Categorical | Female, Male. | +| capital_gain | Continuous | Capital gains recorded. | +| capital_loss | Continuous | Capital losses recorded. | +| hours_per_week | Continuous | Hours worked per week. | +| native_country | Categorical | Country of origin of the | +: : : individual. : +| income | Categorical | ">50K" or "<=50K", meaning | +: : : whether the person makes more : +: : : than \$50,000 annually. : + +## Converting Data into Tensors + +When building a TF.Learn model, the input data is specified by means of an Input +Builder function. This builder function will not be called until it is later +passed to TF.Learn methods such as `fit` and `evaluate`. The purpose of this +function is to construct the input data, which is represented in the form of +[Tensors] +(https://www.tensorflow.org/versions/r0.9/api_docs/python/framework.html#Tensor) +or [SparseTensors] +(https://www.tensorflow.org/versions/r0.9/api_docs/python/sparse_ops.html#SparseTensor). +In more detail, the Input Builder function returns the following as a pair: + +1. `feature_cols`: A dict from feature column names to `Tensors` or + `SparseTensors`. +2. `label`: A `Tensor` containing the label column. + +The keys of `feature_cols` will be used when constructing columns in the +next section. Because we want to call the `fit` and `evaluate` methods with +different data, we define two different input builder functions, +`train_input_fn` and `eval_input_fn`, which are identical except that they pass +different data to `input_fn`. Note that `input_fn` will be called while +constructing the TensorFlow graph, not while running the graph. 
What it is +returning is a representation of the input data as the fundamental unit of +TensorFlow computations, a `Tensor` (or `SparseTensor`). + +Our model represents the input data as *constant* tensors, meaning that the +tensor represents a constant value, in this case the values of a particular +column of `df_train` or `df_test`. This is the simplest way to pass data into +TensorFlow. Another, more advanced way to represent input data would be to +construct an [Input Reader] +(https://www.tensorflow.org/versions/r0.9/api_docs/python/io_ops.html#inputs-and-readers) +that represents a file or other data source, and iterates through the file as +TensorFlow runs the graph. Each continuous column in the train or test dataframe +will be converted into a `Tensor`, which in general is a good format to +represent dense data. For categorical data, we must represent the data as a +`SparseTensor`. This data format is good for representing sparse data. + +```python +import tensorflow as tf + +def input_fn(df): + # Creates a dictionary mapping from each continuous feature column name (k) to + # the values of that column stored in a constant Tensor. + continuous_cols = {k: tf.constant(df[k].values) + for k in CONTINUOUS_COLUMNS} + # Creates a dictionary mapping from each categorical feature column name (k) + # to the values of that column stored in a tf.SparseTensor. + categorical_cols = {k: tf.SparseTensor( + indices=[[i, 0] for i in range(df[k].size)], + values=df[k].values, + shape=[df[k].size, 1]) + for k in CATEGORICAL_COLUMNS} + # Merges the two dictionaries into one. + feature_cols = dict(continuous_cols.items() + categorical_cols.items()) + # Converts the label column into a constant Tensor. + label = tf.constant(df[LABEL_COLUMN].values) + # Returns the feature columns and the label. 
+ return feature_cols, label + +def train_input_fn(): + return input_fn(df_train) + +def eval_input_fn(): + return input_fn(df_test) +``` + +## Selecting and Engineering Features for the Model + +Selecting and crafting the right set of feature columns is key to learning an +effective model. A **feature column** can be either one of the raw columns in +the original dataframe (let's call them **base feature columns**), or any new +columns created based on some transformations defined over one or multiple base +columns (let's call them **derived feature columns**). Basically, "feature +column" is an abstract concept of any raw or derived variable that can be used +to predict the target label. + +### Base Categorical Feature Columns + +To define a feature column for a categorical feature, we can create a +`SparseColumn` using the TF.Learn API. If you know the set of all possible +feature values of a column and there are only a few of them, you can use +`sparse_column_with_keys`. Each key in the list will get assigned an +auto-incremental ID starting from 0. For example, for the `gender` column we can +assign the feature string "Female" to an integer ID of 0 and "Male" to 1 by +doing: + +```python +gender = tf.contrib.layers.sparse_column_with_keys( + column_name="gender", keys=["Female", "Male"]) +``` + +What if we don't know the set of possible values in advance? Not a problem. We +can use `sparse_column_with_hash_bucket` instead: + +```python +education = tf.contrib.layers.sparse_column_with_hash_bucket("education", hash_bucket_size=1000) +``` + +What will happen is that each possible value in the feature column `education` +will be hashed to an integer ID as we encounter it in training. See an example +illustration below: + +ID | Feature +--- | ------------- +... | +9 | `"Bachelors"` +... | +103 | `"Doctorate"` +... | +375 | `"Masters"` +... 
| + +No matter which way we choose to define a `SparseColumn`, each feature string +will be mapped into an integer ID by looking up a fixed mapping or by hashing. +Note that hashing collisions are possible, but may not significantly impact the +model quality. Under the hood, the `LinearModel` class is responsible for +managing the mapping and creating `tf.Variable` to store the model parameters +(also known as model weights) for each feature ID. The model parameters will be +learned through the model training process we'll go through later. + +We'll use a similar approach to define the other categorical features: + +```python +race = tf.contrib.layers.sparse_column_with_keys(column_name="race", keys=[ + "Amer-Indian-Eskimo", "Asian-Pac-Islander", "Black", "Other", "White"]) +marital_status = tf.contrib.layers.sparse_column_with_hash_bucket("marital_status", hash_bucket_size=100) +relationship = tf.contrib.layers.sparse_column_with_hash_bucket("relationship", hash_bucket_size=100) +workclass = tf.contrib.layers.sparse_column_with_hash_bucket("workclass", hash_bucket_size=100) +occupation = tf.contrib.layers.sparse_column_with_hash_bucket("occupation", hash_bucket_size=1000) +native_country = tf.contrib.layers.sparse_column_with_hash_bucket("native_country", hash_bucket_size=1000) +``` + +### Base Continuous Feature Columns + +Similarly, we can define a `RealValuedColumn` for each continuous feature column +that we want to use in the model: + +```python +age = tf.contrib.layers.real_valued_column("age") +education_num = tf.contrib.layers.real_valued_column("education_num") +capital_gain = tf.contrib.layers.real_valued_column("capital_gain") +capital_loss = tf.contrib.layers.real_valued_column("capital_loss") +hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week") +``` + +### Making Continuous Features Categorical through Bucketization + +Sometimes the relationship between a continuous feature and the label is not +linear. 
As a hypothetical example, a person's income may grow with age in the +early stage of one's career, then the growth may slow at some point, and finally +the income decreases after retirement. In this scenario, using the raw `age` as +a real-valued feature column might not be a good choice because the model can +only learn one of the three cases: + +1. Income always increases at some rate as age grows (positive correlation), +1. Income always decreases at some rate as age grows (negative correlation), or +1. Income stays the same at every age (no correlation) + +If we want to learn the fine-grained correlation between income and each age +group separately, we can leverage **bucketization**. Bucketization is a process +of dividing the entire range of a continuous feature into a set of consecutive +bins/buckets, and then converting the original numerical feature into a bucket +ID (as a categorical feature) depending on which bucket that value falls into. +So, we can define a `bucketized_column` over `age` as: + +```python +age_buckets = tf.contrib.layers.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +``` + +where `boundaries` is a list of bucket boundaries. In this case, there are +10 boundaries, resulting in 11 age group buckets (from age 17 and below, 18-24, +25-29, ..., to 65 and over). + +### Intersecting Multiple Columns with CrossedColumn + +Using each base feature column separately may not be enough to explain the data. +For example, the correlation between education and the label (earning > 50,000 +dollars) may be different for different occupations. Therefore, if we only learn +a single model weight for `education="Bachelors"` and `education="Masters"`, we +won't be able to capture every single education-occupation combination (e.g. +distinguishing between `education="Bachelors" AND occupation="Exec-managerial"` +and `education="Bachelors" AND occupation="Craft-repair"`).
To learn the +differences between different feature combinations, we can add **crossed feature +columns** to the model. + +```python +education_x_occupation = tf.contrib.layers.crossed_column([education, occupation], hash_bucket_size=int(1e4)) +``` + +We can also create a `CrossedColumn` over more than two columns. Each +constituent column can be a base feature column that is categorical +(`SparseColumn`), a bucketized real-valued feature column (`BucketizedColumn`), +or even another `CrossedColumn`. Here's an example: + +```python +age_buckets_x_race_x_occupation = tf.contrib.layers.crossed_column( + [age_buckets, race, occupation], hash_bucket_size=int(1e6)) +``` + +## Defining The Logistic Regression Model + +After processing the input data and defining all the feature columns, we're now +ready to put them all together and build a Logistic Regression model. In the +previous section we saw several types of base and derived feature columns, +including: + +* `SparseColumn` +* `RealValuedColumn` +* `BucketizedColumn` +* `CrossedColumn` + +All of these are subclasses of the abstract `FeatureColumn` class, and can be +added to the `feature_columns` field of a model: + +```python +model_dir = tempfile.mkdtemp() +m = tf.contrib.learn.LinearClassifier(feature_columns=[ + gender, native_country, education, occupation, workclass, marital_status, race, + age_buckets, education_x_occupation, age_buckets_x_race_x_occupation], + model_dir=model_dir) +``` + +The model also automatically learns a bias term, which controls the prediction +one would make without observing any features (see the section "How Logistic +Regression Works" for more explanation). The learned model files will be stored +in `model_dir`. + +## Training and Evaluating Our Model + +After adding all the features to the model, now let's look at how to actually +train the model.
Training a model is just a one-liner using the TF.Learn API: + +```python +m.fit(input_fn=train_input_fn, steps=200) +``` + +After the model is trained, we can evaluate how good our model is at predicting +the labels of the holdout data: + +```python +results = m.evaluate(input_fn=eval_input_fn, steps=1) +for key in sorted(results): + print("%s: %s" % (key, results[key])) +``` + +The first line of the output should be something like `accuracy: 0.83557522`, +which means the accuracy is 83.6%. Feel free to try more features and +transformations and see if you can do even better! + +If you'd like to see a working end-to-end example, you can download our [example +code] +(https://www.tensorflow.org/code/tensorflow/examples/learn/wide_n_deep_tutorial.py) +and set the `model_type` flag to `wide`. + +## Adding Regularization to Prevent Overfitting + +Regularization is a technique used to avoid **overfitting**. Overfitting happens +when your model does well on the data it is trained on, but worse on test data +that the model has not seen before, such as live traffic. Overfitting generally +occurs when a model is excessively complex, such as having too many parameters +relative to the number of observed training examples. Regularization allows you +to control your model's complexity and makes the model more generalizable to +unseen data.
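
As a rough illustration of how regularization limits complexity, the sketch below adds penalty terms on the model weights to an unregularized loss value. It is a minimal, library-free sketch of the generic L1 and L2 penalty idea, not the exact objective TensorFlow optimizes; the function name `regularized_loss` and the sample numbers are hypothetical.

```python
def regularized_loss(data_loss, weights, l1_strength=0.0, l2_strength=0.0):
    """Returns data_loss plus generic L1 and L2 penalties on the weights.

    Assumption: this is the textbook penalty form, used only for illustration.
    """
    l1_penalty = l1_strength * sum(abs(w) for w in weights)
    l2_penalty = l2_strength * sum(w * w for w in weights)
    return data_loss + l1_penalty + l2_penalty


# Hypothetical weights: one of them is already exactly zero.
weights = [0.0, -2.0, 0.5]
unpenalized = regularized_loss(0.3, weights)
with_l1 = regularized_loss(0.3, weights, l1_strength=0.1)
with_l2 = regularized_loss(0.3, weights, l2_strength=0.1)
print(unpenalized, with_l1, with_l2)
```

Larger weights raise the penalized loss, so the optimizer is pushed toward smaller (or zero) weights unless they clearly reduce the data loss.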
+ +In the Linear Model library, you can add L1 and L2 regularization to the model +as follows: + +```python +m = tf.contrib.learn.LinearClassifier(feature_columns=[ + gender, native_country, education, occupation, workclass, marital_status, race, + age_buckets, education_x_occupation, age_buckets_x_race_x_occupation], + optimizer=tf.train.FtrlOptimizer( + learning_rate=0.1, + l1_regularization_strength=1.0, + l2_regularization_strength=1.0), + model_dir=model_dir) +``` + +One important difference between L1 and L2 regularization is that L1 +regularization tends to make model weights stay at zero, creating sparser +models, whereas L2 regularization also tries to make the model weights closer to +zero but not necessarily zero. Therefore, if you increase the strength of L1 +regularization, you will have a smaller model size because many of the model +weights will be zero. This is often desirable when the feature space is very +large but sparse, and when there are resource constraints that prevent you from +serving a model that is too large. + +In practice, you should try various combinations of L1 and L2 regularization +strengths and find the parameters that best control overfitting and give +you a desirable model size. + +## How Logistic Regression Works + +Finally, let's take a minute to talk about what the Logistic Regression model +actually looks like in case you're not already familiar with it. We'll denote +the label as $$Y$$, and the set of observed features as a feature vector +$$\mathbf{x}=[x_1, x_2, ..., x_d]$$. We define $$Y=1$$ if an individual earned > +50,000 dollars and $$Y=0$$ otherwise. In Logistic Regression, the probability of +the label being positive ($$Y=1$$) given the features $$\mathbf{x}$$ is given +as: + +$$ P(Y=1|\mathbf{x}) = \frac{1}{1+\exp(-(\mathbf{w}^T\mathbf{x}+b))}$$ + +where $$\mathbf{w}=[w_1, w_2, ..., w_d]$$ are the model weights for the features +$$\mathbf{x}=[x_1, x_2, ..., x_d]$$.
$$b$$ is a constant that is often called +the **bias** of the model. The equation consists of two parts: a linear model and +a logistic function. + +* **Linear Model**: First, we can see that $$\mathbf{w}^T\mathbf{x}+b = b + + w_1x_1 + ... +w_dx_d$$ is a linear model where the output is a linear + function of the input features $$\mathbf{x}$$. The bias $$b$$ is the + prediction one would make without observing any features. The model weight + $$w_i$$ reflects how the feature $$x_i$$ is correlated with the positive + label. If $$x_i$$ is positively correlated with the positive label, the + weight $$w_i$$ increases, and the probability $$P(Y=1|\mathbf{x})$$ will be + closer to 1. On the other hand, if $$x_i$$ is negatively correlated with the + positive label, then the weight $$w_i$$ decreases and the probability + $$P(Y=1|\mathbf{x})$$ will be closer to 0. + +* **Logistic Function**: Second, we can see that there's a logistic function + (also known as the sigmoid function) $$S(t) = 1/(1+\exp(-t))$$ being applied + to the linear model. The logistic function is used to convert the output of + the linear model $$\mathbf{w}^T\mathbf{x}+b$$ from any real number into the + range $$(0, 1)$$, which can be interpreted as a probability. + +Model training is an optimization problem: The goal is to find a set of model +weights (i.e. model parameters) to minimize a **loss function** defined over the +training data, such as logistic loss for Logistic Regression models. The loss +function measures the discrepancy between the ground-truth label and the model's +prediction. If the prediction is very close to the ground-truth label, the loss +value will be low; if the prediction is very far from the label, then the loss +value will be high.
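
The two parts of the equation, and the logistic loss, can be sketched in a few lines of plain Python. This is only an illustrative sketch and not how TF.Learn implements the model; the function names and the example weights, bias, and features are hypothetical.

```python
import math


def sigmoid(t):
    """Logistic function S(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def predict_proba(weights, bias, features):
    """P(Y=1|x): the sigmoid applied to the linear model w^T x + b."""
    linear_output = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(linear_output)


def logistic_loss(label, probability):
    """Logistic (log) loss for one example with label 0 or 1."""
    return -(label * math.log(probability) +
             (1 - label) * math.log(1.0 - probability))


# Hypothetical two-feature model.
p = predict_proba(weights=[0.8, -0.4], bias=-0.1, features=[1.0, 2.0])
print("P(Y=1|x) = %f" % p)
print("loss if the true label is 1: %f" % logistic_loss(1, p))
print("loss if the true label is 0: %f" % logistic_loss(0, p))
```

With zero weights and zero bias the prediction is exactly 0.5, and the loss shrinks as the predicted probability moves toward the true label.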
+ +## Learn Deeper + +If you're interested in learning more, check out our [Wide & Deep Learning +Tutorial](../wide_and_deep/) where we'll show you how to combine +the strengths of linear models and deep neural networks by jointly training them +using the TF.Learn API. diff --git a/tensorflow/g3doc/tutorials/wide_and_deep/index.md b/tensorflow/g3doc/tutorials/wide_and_deep/index.md new file mode 100644 index 0000000000..910e91e1d0 --- /dev/null +++ b/tensorflow/g3doc/tutorials/wide_and_deep/index.md @@ -0,0 +1,275 @@ +# TensorFlow Wide & Deep Learning Tutorial + +In the previous [TensorFlow Linear Model Tutorial](../wide/), +we trained a logistic regression model to predict the probability that an +individual has an annual income of over 50,000 dollars using the [Census Income +Dataset](https://archive.ics.uci.edu/ml/datasets/Census+Income). TensorFlow is +great for training deep neural networks too, and you might be wondering which one +you should choose. Well, why not both? Would it be possible to combine the +strengths of both in one model? + +In this tutorial, we'll show how to use the TF.Learn API to jointly train a +wide linear model and a deep feed-forward neural network. This approach combines +the strengths of memorization and generalization. It's useful for generic +large-scale regression and classification problems with sparse input features +(e.g., categorical features with a large number of possible feature values). If +you're interested in learning more about how Wide & Deep Learning works, please +check out our [research paper](http://arxiv.org/abs/1606.07792). + +![Wide & Deep Spectrum of Models] +(../../images/wide_n_deep.svg "Wide & Deep") + +The figure above shows a comparison of a wide model (logistic regression with +sparse features and transformations), a deep model (feed-forward neural network +with an embedding layer and several hidden layers), and a Wide & Deep model +(joint training of both).
At a high level, there are only 3 steps to configure a +wide, deep, or Wide & Deep model using the TF.Learn API: + +1. Select features for the wide part: Choose the sparse base columns and + crossed columns you want to use. +1. Select features for the deep part: Choose the continuous columns, the + embedding dimension for each categorical column, and the hidden layer sizes. +1. Put them all together in a Wide & Deep model + (`DNNLinearCombinedClassifier`). + +And that's it! Let's go through a simple example. + +## Setup + +To try the code for this tutorial: + +1. [Install TensorFlow](../../get_started/os_setup.md) if you haven't +already. + +2. Download [the tutorial code]( +https://www.tensorflow.org/code/tensorflow/examples/learn/wide_n_deep_tutorial.py). + +3. Install the pandas data analysis library. TF.Learn doesn't require pandas, but it does support it, and this tutorial uses pandas. To install pandas: + 1. Get `pip`: + + ```shell + # Ubuntu/Linux 64-bit + $ sudo apt-get install python-pip python-dev + + # Mac OS X + $ sudo easy_install pip + $ sudo easy_install --upgrade six + ``` + + 2. Use `pip` to install pandas: + + ```shell + $ sudo pip install pandas + ``` + + If you have trouble installing pandas, consult the [instructions] +(http://pandas.pydata.org/pandas-docs/stable/install.html) on the pandas site. + +4. Execute the tutorial code with the following command to train the Wide & Deep +model described in this tutorial: + + ```shell + $ python wide_n_deep_tutorial.py --model_type=wide_n_deep + ``` + +Read on to find out how this code builds its Wide & Deep model. + + +## Define Base Feature Columns + +First, let's define the base categorical and continuous feature columns that +we'll use. These base columns will be the building blocks used by both the wide +part and the deep part of the model. + +```python +import tensorflow as tf + +# Categorical base columns.
+gender = tf.contrib.layers.sparse_column_with_keys(column_name="gender", keys=["female", "male"]) +race = tf.contrib.layers.sparse_column_with_keys(column_name="race", keys=[ + "Amer-Indian-Eskimo", "Asian-Pac-Islander", "Black", "Other", "White"]) +education = tf.contrib.layers.sparse_column_with_hash_bucket("education", hash_bucket_size=1000) +marital_status = tf.contrib.layers.sparse_column_with_hash_bucket("marital_status", hash_bucket_size=100) +relationship = tf.contrib.layers.sparse_column_with_hash_bucket("relationship", hash_bucket_size=100) +workclass = tf.contrib.layers.sparse_column_with_hash_bucket("workclass", hash_bucket_size=100) +occupation = tf.contrib.layers.sparse_column_with_hash_bucket("occupation", hash_bucket_size=1000) +native_country = tf.contrib.layers.sparse_column_with_hash_bucket("native_country", hash_bucket_size=1000) + +# Continuous base columns. +age = tf.contrib.layers.real_valued_column("age") +age_buckets = tf.contrib.layers.bucketized_column(age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +education_num = tf.contrib.layers.real_valued_column("education_num") +capital_gain = tf.contrib.layers.real_valued_column("capital_gain") +capital_loss = tf.contrib.layers.real_valued_column("capital_loss") +hours_per_week = tf.contrib.layers.real_valued_column("hours_per_week") +``` + +## The Wide Model: Linear Model with Crossed Feature Columns + +The wide model is a linear model with a wide set of sparse and crossed feature +columns: + +```python +wide_columns = [ + gender, native_country, education, occupation, workclass, marital_status, relationship, age_buckets, + tf.contrib.layers.crossed_column([education, occupation], hash_bucket_size=int(1e4)), + tf.contrib.layers.crossed_column([native_country, occupation], hash_bucket_size=int(1e4)), + tf.contrib.layers.crossed_column([age_buckets, race, occupation], hash_bucket_size=int(1e6))] +``` + +Wide models with crossed feature columns can memorize sparse interactions +between 
features effectively. That being said, one limitation of crossed feature +columns is that they do not generalize to feature combinations that have not +appeared in the training data. Let's add a deep model with embeddings to fix +that. + +## The Deep Model: Neural Network with Embeddings + +The deep model is a feed-forward neural network, as shown in the previous +figure. Each of the sparse, high-dimensional categorical features is first +converted into a low-dimensional and dense real-valued vector, often referred to +as an embedding vector. These low-dimensional dense embedding vectors are +concatenated with the continuous features, and then fed into the hidden layers +of a neural network in the forward pass. The embedding values are initialized +randomly, and are trained along with all other model parameters to minimize the +training loss. If you're interested in learning more about embeddings, check out +the TensorFlow tutorial on [Vector Representations of Words] +(https://www.tensorflow.org/versions/r0.9/tutorials/word2vec/index.html), or +[Word Embedding](https://en.wikipedia.org/wiki/Word_embedding) on Wikipedia.
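
To make the lookup-and-concatenate step concrete, here is a minimal sketch in plain Python. It only mimics the forward-pass input construction: the tables are randomly initialized and never trained, and the helper names, the initialization range, and the feature IDs are hypothetical rather than taken from TensorFlow.

```python
import random


def make_embedding_table(num_ids, dimension, seed=0):
    """Builds a randomly initialized table with one dense vector per feature ID."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dimension)]
            for _ in range(num_ids)]


def deep_input(embedding_tables, categorical_ids, continuous_values):
    """Concatenates the embedding of each categorical ID with the continuous features."""
    vector = []
    for name in sorted(categorical_ids):
        vector.extend(embedding_tables[name][categorical_ids[name]])
    vector.extend(continuous_values)
    return vector


tables = {
    "education": make_embedding_table(num_ids=1000, dimension=8, seed=0),
    "occupation": make_embedding_table(num_ids=1000, dimension=8, seed=1),
}
# Hypothetical IDs for one example, plus two continuous features
# (for instance age and hours_per_week).
example = deep_input(tables, {"education": 9, "occupation": 42}, [39.0, 40.0])
print(len(example))
```

The resulting vector has one 8-dimensional slice per categorical feature followed by the raw continuous values; in a real model this vector would feed the first hidden layer, and training would update the table entries.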
+ +We'll configure the embeddings for the categorical columns using +`embedding_column`, and concatenate them with the continuous columns: + +```python +deep_columns = [ + tf.contrib.layers.embedding_column(workclass, dimension=8), + tf.contrib.layers.embedding_column(education, dimension=8), + tf.contrib.layers.embedding_column(marital_status, dimension=8), + tf.contrib.layers.embedding_column(gender, dimension=8), + tf.contrib.layers.embedding_column(relationship, dimension=8), + tf.contrib.layers.embedding_column(race, dimension=8), + tf.contrib.layers.embedding_column(native_country, dimension=8), + tf.contrib.layers.embedding_column(occupation, dimension=8), + age, education_num, capital_gain, capital_loss, hours_per_week] +``` + +The higher the `dimension` of the embedding is, the more degrees of freedom the +model will have to learn the representations of the features. For simplicity, we +set the dimension to 8 for all feature columns here. Empirically, a more +informed decision for the number of dimensions is to start with a value on the +order of $$k\log_2(n)$$ or $$k\sqrt[4]{n}$$, where $$n$$ is the number of unique +feature values in a feature column and $$k$$ is a small constant (usually smaller than +10). + +Through dense embeddings, deep models can generalize better and make predictions +on feature pairs that were previously unseen in the training data. However, it +is difficult to learn effective low-dimensional representations for feature +columns when the underlying interaction matrix between two feature columns is +sparse and high-rank. In such cases, the interaction between most feature pairs +should be zero except for a few, but dense embeddings will lead to nonzero +predictions for all feature pairs, and thus can over-generalize. On the other +hand, linear models with crossed features can memorize these “exception rules” +effectively with fewer model parameters.
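
The rule of thumb for choosing the embedding dimension mentioned earlier in this section can be turned into a small helper. This is only a sketch for getting a starting value: the function name, the default `k=2`, and the decision to round up are assumptions made for illustration, not part of the TF.Learn API.

```python
import math


def suggested_embedding_dimensions(num_unique_values, k=2):
    """Returns starting points (k * log2(n), k * n ** 0.25), rounded up."""
    log_based = k * math.log(num_unique_values, 2)
    root_based = k * num_unique_values ** 0.25
    return int(math.ceil(log_based)), int(math.ceil(root_based))


# A column hashed into 1000 buckets, such as `occupation` above.
log_dim, root_dim = suggested_embedding_dimensions(1000, k=2)
print("k*log2(n) -> %d, k*n^(1/4) -> %d" % (log_dim, root_dim))
```

Either value is just a first guess; the dimension is still a hyperparameter worth tuning against held-out data.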
+ +Now, let's see how to jointly train wide and deep models and allow them to +complement each other’s strengths and weaknesses. + +## Combining Wide and Deep Models into One + +The wide models and deep models are combined by summing up their final output +log odds as the prediction, then feeding the prediction to a logistic loss +function. All the graph definition and variable allocations have already been +handled for you under the hood, so you simply need to create a +`DNNLinearCombinedClassifier`: + +```python +import tempfile +model_dir = tempfile.mkdtemp() +m = tf.contrib.learn.DNNLinearCombinedClassifier( + model_dir=model_dir, + linear_feature_columns=wide_columns, + dnn_feature_columns=deep_columns, + dnn_hidden_units=[100, 50]) +``` + +## Training and Evaluating The Model + +Before we train the model, let's read in the Census dataset as we did in the +[TensorFlow Linear Model tutorial](../wide/). The code for +input data processing is provided here again for your convenience: + +```python +import pandas as pd +import urllib + +# Define the column names for the data sets. +COLUMNS = ["age", "workclass", "fnlwgt", "education", "education_num", + "marital_status", "occupation", "relationship", "race", "gender", + "capital_gain", "capital_loss", "hours_per_week", "native_country", "income_bracket"] +LABEL_COLUMN = 'label' +CATEGORICAL_COLUMNS = ["workclass", "education", "marital_status", "occupation", + "relationship", "race", "gender", "native_country"] +CONTINUOUS_COLUMNS = ["age", "education_num", "capital_gain", "capital_loss", + "hours_per_week"] + +# Download the training and test data to temporary files. +# Alternatively, you can download them yourself and change train_file and +# test_file to your own paths. 
+train_file = tempfile.NamedTemporaryFile() +test_file = tempfile.NamedTemporaryFile() +urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data", train_file.name) +urllib.urlretrieve("https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test", test_file.name) + +# Read the training and test data sets into Pandas dataframe. +df_train = pd.read_csv(train_file, names=COLUMNS, skipinitialspace=True) +df_test = pd.read_csv(test_file, names=COLUMNS, skipinitialspace=True, skiprows=1) +df_train[LABEL_COLUMN] = (df_train['income_bracket'].apply(lambda x: '>50K' in x)).astype(int) +df_test[LABEL_COLUMN] = (df_test['income_bracket'].apply(lambda x: '>50K' in x)).astype(int) + +def input_fn(df): + # Creates a dictionary mapping from each continuous feature column name (k) to + # the values of that column stored in a constant Tensor. + continuous_cols = {k: tf.constant(df[k].values) + for k in CONTINUOUS_COLUMNS} + # Creates a dictionary mapping from each categorical feature column name (k) + # to the values of that column stored in a tf.SparseTensor. + categorical_cols = {k: tf.SparseTensor( + indices=[[i, 0] for i in range(df[k].size)], + values=df[k].values, + shape=[df[k].size, 1]) + for k in CATEGORICAL_COLUMNS} + # Merges the two dictionaries into one. + feature_cols = dict(continuous_cols.items() + categorical_cols.items()) + # Converts the label column into a constant Tensor. + label = tf.constant(df[LABEL_COLUMN].values) + # Returns the feature columns and the label. 
+  return feature_cols, label + +def train_input_fn(): + return input_fn(df_train) + +def eval_input_fn(): + return input_fn(df_test) +``` + +After reading in the data, you can train and evaluate the model: + +```python +m.fit(input_fn=train_input_fn, steps=200) +results = m.evaluate(input_fn=eval_input_fn, steps=1) +for key in sorted(results): + print("%s: %s" % (key, results[key])) +``` + +The first line of the output should be something like `accuracy: 0.84429705`. We +can see that the accuracy was improved from about 83.6% using a wide-only linear +model to about 84.4% using a Wide & Deep model. If you'd like to see a working +end-to-end example, you can download our [example code] +(https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/wide_n_deep_tutorial.py). + +Note that this tutorial is just a quick example on a small dataset to get you +familiar with the API. Wide & Deep Learning will be even more powerful if you +try it on a large dataset with many sparse feature columns that have a large +number of possible feature values. Again, feel free to take a look at our +[research paper](http://arxiv.org/abs/1606.07792) for more ideas about how to +apply Wide & Deep Learning in real-world large-scale machine learning problems. -- cgit v1.2.3 From 1fce6b722566b75a9100a1795ed26e8c5f28db98 Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Tue, 28 Jun 2016 11:46:50 -0800 Subject: Adds tf.contrib.learn quickstart and tf-wide (and deep) stuff to tutorial left nav and index page (and fixes an URL in the wide overview). Also breaks the left nav and index page into sections, because the list of tutorials is getting long.
Change: 126105974 --- tensorflow/g3doc/tutorials/index.md | 121 ++++++++++++++++---------- tensorflow/g3doc/tutorials/leftnav_files | 14 ++- tensorflow/g3doc/tutorials/linear/overview.md | 2 +- 3 files changed, 89 insertions(+), 48 deletions(-) diff --git a/tensorflow/g3doc/tutorials/index.md b/tensorflow/g3doc/tutorials/index.md index ccb05ab5f0..292596837d 100644 --- a/tensorflow/g3doc/tutorials/index.md +++ b/tensorflow/g3doc/tutorials/index.md @@ -1,7 +1,8 @@ # Tutorials +## Basic Neural Networks -## MNIST For ML Beginners +### MNIST For ML Beginners If you're new to machine learning, we recommend starting here. You'll learn about a classic problem, handwritten digit classification (MNIST), and get a @@ -10,33 +11,75 @@ gentle introduction to multiclass classification. [View Tutorial](../tutorials/mnist/beginners/index.md) -## Deep MNIST for Experts +### Deep MNIST for Experts If you're already familiar with other deep learning software packages, and are -already familiar with MNIST, this tutorial will give you a very brief primer on -TensorFlow. +already familiar with MNIST, this tutorial will give you a very brief primer +on TensorFlow. [View Tutorial](../tutorials/mnist/pros/index.md) - -## TensorFlow Mechanics 101 +### TensorFlow Mechanics 101 This is a technical tutorial, where we walk you through the details of using -TensorFlow infrastructure to train models at scale. We again use MNIST as the +TensorFlow infrastructure to train models at scale. We use MNIST as the example. [View Tutorial](../tutorials/mnist/tf/index.md) +### MNIST Data Download + +Details about downloading the MNIST handwritten digits data set. Exciting +stuff. + +[View Tutorial](../tutorials/mnist/download/index.md) + + +## Easy ML with tf.contrib.learn + +### tf.contrib.learn Quickstart + +A quick introduction to tf.contrib.learn, a high-level API for TensorFlow. +Build, train, and evaluate a neural network with just a few lines of +code. 
+ +[View Tutorial](../tutorials/tflearn/index.md) + +### Overview of Linear Models with tf.contrib.learn + +An overview of tf.contrib.learn's rich set of tools for working with linear +models in TensorFlow. + +[View Tutorial](../tutorials/linear/overview.md) + +### Linear Model Tutorial + +This tutorial walks you through the code for building a linear model using +tf.contrib.learn. + +[View Tutorial](../tutorials/wide/index.md) + +### Wide and Deep Learning Tutorial + +This tutorial shows you how to use tf.contrib.learn to jointly train a linear +model and a deep neural net to harness the advantages of each type of model. + +[View Tutorial](../tutorials/wide_and_deep/index.md) + ## TensorFlow Serving +### TensorFlow Serving + An introduction to TensorFlow Serving, a flexible, high-performance system for serving machine learning models, designed for production environments. [View Tutorial](../tutorials/tfserve/index.md) -## Convolutional Neural Networks +## Image Processing + +### Convolutional Neural Networks An introduction to convolutional neural networks using the CIFAR-10 data set. Convolutional neural nets are particularly tailored to images, since they @@ -45,8 +88,25 @@ representations of visual content. [View Tutorial](../tutorials/deep_cnn/index.md) +### Image Recognition + +How to run object recognition using a convolutional neural network +trained on ImageNet Challenge data and label set. + +[View Tutorial](../tutorials/image_recognition/index.md) + +### Deep Dream Visual Hallucinations + +Building on the Inception recognition model, we will release a TensorFlow +version of the [Deep Dream](https://github.com/google/deepdream) neural network +visual hallucination software. 
+ +[View Tutorial](https://www.tensorflow.org/code/tensorflow/examples/tutorials/deepdream/deepdream.ipynb) + + +## Language and Sequence Processing -## Vector Representations of Words +### Vector Representations of Words This tutorial motivates why it is useful to learn to represent words as vectors (called *word embeddings*). It introduces the word2vec model as an efficient @@ -56,16 +116,14 @@ embeddings). [View Tutorial](../tutorials/word2vec/index.md) - -## Recurrent Neural Networks +### Recurrent Neural Networks An introduction to RNNs, wherein we train an LSTM network to predict the next word in an English sentence. (A task sometimes called language modeling.) [View Tutorial](../tutorials/recurrent/index.md) - -## Sequence-to-Sequence Models +### Sequence-to-Sequence Models A follow on to the RNN tutorial, where we assemble a sequence-to-sequence model for machine translation. You will learn to build your own English-to-French @@ -73,8 +131,7 @@ translator, entirely machine learned, end-to-end. [View Tutorial](../tutorials/seq2seq/index.md) - -## SyntaxNet: Neural Models of Syntax +### SyntaxNet: Neural Models of Syntax An introduction to SyntaxNet, a Natural Language Processing framework for TensorFlow. @@ -82,44 +139,18 @@ TensorFlow. [View Tutorial](../tutorials/syntaxnet/index.md) -## Mandelbrot Set +## Non-ML Applications + +### Mandelbrot Set TensorFlow can be used for computation that has nothing to do with machine learning. Here's a naive implementation of Mandelbrot set visualization. [View Tutorial](../tutorials/mandelbrot/index.md) - -## Partial Differential Equations +### Partial Differential Equations As another example of non-machine learning computation, we offer an example of a naive PDE simulation of raindrops landing on a pond. [View Tutorial](../tutorials/pdes/index.md) - - -## MNIST Data Download - -Details about downloading the MNIST handwritten digits data set. Exciting -stuff. 
- -[View Tutorial](../tutorials/mnist/download/index.md) - - -## Image Recognition - -How to run object recognition using a convolutional neural network -trained on ImageNet Challenge data and label set. - -[View Tutorial](../tutorials/image_recognition/index.md) - -We will soon be releasing code for training a state-of-the-art Inception model. - - -## Deep Dream Visual Hallucinations - -Building on the Inception recognition model, we will release a TensorFlow -version of the [Deep Dream](https://github.com/google/deepdream) neural network -visual hallucination software. - -[View Tutorial](https://www.tensorflow.org/code/tensorflow/examples/tutorials/deepdream/deepdream.ipynb) diff --git a/tensorflow/g3doc/tutorials/leftnav_files b/tensorflow/g3doc/tutorials/leftnav_files index c35a936995..09cd084b49 100644 --- a/tensorflow/g3doc/tutorials/leftnav_files +++ b/tensorflow/g3doc/tutorials/leftnav_files @@ -1,13 +1,23 @@ +### Basic Neural Networks mnist/beginners/index.md mnist/pros/index.md mnist/tf/index.md +mnist/download/index.md +### Easy ML with tf.contrib.learn +tflearn/index.md +linear/overview.md +wide/index.md +wide_and_deep/index.md +### TensorFlow Serving tfserve/index.md +### Image Processing deep_cnn/index.md +image_recognition/index.md +### Language and Sequence Processing word2vec/index.md recurrent/index.md seq2seq/index.md syntaxnet/index.md +### Non-ML Applications mandelbrot/index.md pdes/index.md -mnist/download/index.md -image_recognition/index.md \ No newline at end of file diff --git a/tensorflow/g3doc/tutorials/linear/overview.md b/tensorflow/g3doc/tutorials/linear/overview.md index b592495212..8614011290 100644 --- a/tensorflow/g3doc/tutorials/linear/overview.md +++ b/tensorflow/g3doc/tutorials/linear/overview.md @@ -234,4 +234,4 @@ e = tf.contrib.learn.DNNLinearCombinedClassifier( dnn_feature_columns=deep_columns, dnn_hidden_units=[100, 50]) ``` -For more information, see the [Wide and Deep Learning tutorial](../wide_n_deep/). 
+For more information, see the [Wide and Deep Learning tutorial](../wide_and_deep/). -- cgit v1.2.3 From 990bc2c2a3d11522c21c1c54376611f28f5c55ed Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Tue, 28 Jun 2016 11:00:48 -0800 Subject: Edits to tf.learn quickstart. Change: 126101067 --- tensorflow/g3doc/tutorials/tflearn/index.md | 190 ++++++++++++++++++++-------- 1 file changed, 137 insertions(+), 53 deletions(-) diff --git a/tensorflow/g3doc/tutorials/tflearn/index.md b/tensorflow/g3doc/tutorials/tflearn/index.md index a0d9652fc9..0a228baf08 100644 --- a/tensorflow/g3doc/tutorials/tflearn/index.md +++ b/tensorflow/g3doc/tutorials/tflearn/index.md @@ -1,21 +1,62 @@ -## TF.Learn Quickstart +## tf.contrib.learn Quickstart -TensorFlow’s Learn API (TF.Learn) makes it easy to configure, train, and evaluate a -variety of machine learning models. In this quickstart tutorial, you’ll use TF.Learn -to construct a [Deep Neural Network](https://en.wikipedia.org/wiki/Artificial_neural_network) -classifier model and train it on [Fisher’s Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set) -to predict flower species based on sepal/petal geometry. You’ll perform the following four steps: +TensorFlow’s high-level machine learning API (tf.contrib.learn) makes it easy +to configure, train, and evaluate a variety of machine learning models. In +this quickstart tutorial, you’ll use tf.contrib.learn to construct a [neural +network](https://en.wikipedia.org/wiki/Artificial_neural_network) classifier +and train it on [Fisher’s Iris data +set](https://en.wikipedia.org/wiki/Iris_flower_data_set) to predict flower +species based on sepal/petal geometry. You’ll perform the following five +steps: 1. Load CSVs containing Iris training/test data into a TensorFlow `Dataset` -2. Construct a [Deep Neural Network classifier](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#DNNClassifier) -3. Fit the DNN model using the training data +2. 
Construct a [neural network classifier]( +../../api_docs/python/contrib.learn.html#DNNClassifier) +3. Fit the model using the training data 4. Evaluate the accuracy of the model +5. Classify new samples ## Get Started -Remember to [install TensorFlow on your machine](https://www.tensorflow.org/versions/r0.9/get_started/os_setup.html#download-and-setup) -before getting started with this tutorial. The full code and datasets for this tutorial -can be found [here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/), -and the following sections walk through them in detail. + +Remember to [install TensorFlow on your +machine](../../get_started/os_setup.html#download-and-setup) before getting +started with this tutorial. + +Here is the full code for our neural network: + +```python +import tensorflow as tf +import numpy as np + +# Data sets +IRIS_TRAINING = "iris_training.csv" +IRIS_TEST = "iris_test.csv" + +# Load datasets. +training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int) +test_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TEST, target_dtype=np.int) + +x_train, x_test, y_train, y_test = training_set.data, test_set.data, \ + training_set.target, test_set.target + +# Build 3 layer DNN with 10, 20, 10 units respectively. +classifier = tf.contrib.learn.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3) + +# Fit model. +classifier.fit(x=x_train, y=y_train, steps=200) + +# Evaluate accuracy. +accuracy_score = classifier.evaluate(x=x_test, y=y_test)["accuracy"] +print('Accuracy: {0:f}'.format(accuracy_score)) + +# Classify two new flower samples. +new_samples = np.array( + [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float) +y = classifier.predict(new_samples) +print ('Predictions: {}'.format(str(y))) +``` + +The following sections walk through the code in detail. 
## Load the Iris CSV data to TensorFlow @@ -41,25 +82,31 @@ Sepal Length | Sepal Width | Petal Length | Petal Width | Species 6.2 | 3.4 | 5.4 | 2.3 | 2 5.9 | 3.0 | 5.1 | 1.8 | 2 - -For this tutorial, the Iris data has been randomized and split into two separate CSVs: -a training set of 120 samples ([iris_training.csv](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/iris_training.csv)) -and a test set of 30 samples ([iris_test.csv](https://www.tensorflow.org/code/tensorflow/examples/tutorials/tflearnqs/iris_test.csv)). + For this +tutorial, the Iris data has been randomized and split into two separate CSVs: +a training set of 120 samples +([iris_training.csv](http://download.tensorflow.org/data/iris_training.csv)) +and a test set of 30 samples +([iris_test.csv](http://download.tensorflow.org/data/iris_test.csv)). -To get started, first import TensorFlow, TF.Learn, and numpy: +To get started, first import TensorFlow and numpy: ```python import tensorflow as tf -from tensorflow.contrib import learn import numpy as np ``` -Next, load the training and test sets into `Dataset`s using the [`load_csv()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/datasets/base.py#L36) -method in `learn.datasets.base`. `load_csv()` has two required arguments: -`filename`, which takes the filepath to the CSV file, -and `target_dtype`, which takes the [`numpy` datatype](http://docs.scipy.org/doc/numpy/user/basics.types.html) -of the dataset's target value. Here, the target (the value you're training the model to predict) is -flower species, which is an integer from 0–2, so the appropriate `numpy` datatype is `np.int`: +Next, load the training and test sets into `Dataset`s using the [`load_csv()`](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn/datasets/base.py) +method in `learn.datasets.base`.
The +`load_csv()` method has two required arguments: + +* `filename`, which takes the filepath to the CSV file, and +* `target_dtype`, which takes the [`numpy` datatype](http://docs.scipy.org/doc/numpy/user/basics.types.html) of the dataset's target value. + +Here, the target (the value you're training the model to predict) is flower +species, which is an integer from 0–2, so the appropriate `numpy` +datatype is `np.int`: ```python # Data sets @@ -67,35 +114,40 @@ IRIS_TRAINING = "iris_training.csv" IRIS_TEST = "iris_test.csv" # Load datasets. -training_set = learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int) -test_set = learn.datasets.base.load_csv(filename=IRIS_TEST, target_dtype=np.int) +training_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TRAINING, target_dtype=np.int) +test_set = tf.contrib.learn.datasets.base.load_csv(filename=IRIS_TEST, target_dtype=np.int) ``` -Next, assign variables to the feature data and target values: `x_train` for training-set feature data, -`x_test` for test-set feature data, `y_train` for training-set target values, and `y_test` for test-set -target values. Datasets in TensorFlow are [named tuples](https://docs.python.org/2/library/collections.html#collections.namedtuple), -and you can access feature data and target values via the `data` and `target` fields, respectively: +Next, assign variables to the feature data and target values: `x_train` for +training-set feature data, `x_test` for test-set feature data, `y_train` for +training-set target values, and `y_test` for test-set target values. 
`Dataset`s +in tf.contrib.learn are [named tuples](https://docs.python.org/2/library/collections.html#collections.namedtuple), +and you can access feature data and target values +via the `data` and `target` fields, respectively: ```python x_train, x_test, y_train, y_test = training_set.data, test_set.data, \ training_set.target, test_set.target ``` -Later on, in "Fit the DNNClassifier to the Iris Training Data," you'll use `x_train` and `y_train` to -train your model, and in "Evaluate Model Accuracy", you'll use `x_test` and `y_test`. But first, -you'll construct your model in the next section. +Later on, in "Fit the DNNClassifier to the Iris Training Data," you'll use +`x_train` and `y_train` to train your model, and in "Evaluate Model +Accuracy", you'll use `x_test` and `y_test`. But first, you'll construct your +model in the next section. ## Construct a Deep Neural Network Classifier -TF.Learn offers a variety of predefined models, called [`Estimator`s](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#estimators), -which you can use "out of the box" to run training and evaluation operations on your data. -Here, you'll configure a Deep Neural Network Classifier model to fit the iris data. Using TF.Learn, -you can instantiate your [`DNNClassifier`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#DNNClassifier) -with just one line of code: +tf.contrib.learn offers a variety of predefined models, called [`Estimator`s +](../../api_docs/python/contrib.learn.html#estimators), which you can use "out +of the box" to run training and evaluation operations on your data. Here, +you'll configure a Deep Neural Network Classifier model to fit the Iris data. +Using tf.contrib.learn, you can instantiate your +[`DNNClassifier`](../../api_docs/python/contrib.learn.html#DNNClassifier) with +just one line of code: ```python # Build 3 layer DNN with 10, 20, 10 units respectively.
-classifier = learn.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3) +classifier = tf.contrib.learn.DNNClassifier(hidden_units=[10, 20, 10], n_classes=3) ``` The code above creates a `DNNClassifier` model with three [hidden layers](http://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw), @@ -106,7 +158,7 @@ classes (`n_classes=3`). ## Fit the DNNClassifier to the Iris Training Data Now that you've configured your DNN `classifier` model, you can fit it to the Iris training data -using the [`fit`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#BaseEstimator.fit) +using the [`fit`](../../api_docs/python/contrib.learn.html#BaseEstimator.fit) method. Pass as arguments your feature data (`x_train`), target values (`y_train`), and the number of steps to train (here, 200): @@ -128,17 +180,18 @@ classifier.fit(x=x_train, y=y_train, steps=100) However, if you're looking to track the model while it trains, you'll likely -want to instead use a TensorFlow [`monitor`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/monitors.py) +want to instead use a TensorFlow [`monitor`](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn/monitors.py) to perform logging operations. ## Evaluate Model Accuracy -You've fit your `DNNClassifier` model on the Iris training data; now, you can check its accuracy on -the Iris test data using the [`evaluate`](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html#BaseEstimator.evaluate) -method. Like `fit`, `evaluate` takes feature data and target values as arguments, -and returns a `dict` with the evaluation results. 
The following code passes the Iris -test data—`x_test` and `y_test`—to `evaluate`, retrieves `accuracy` from the -results, and prints it to output: +You've fit your `DNNClassifier` model on the Iris training data; now, you can +check its accuracy on the Iris test data using the [`evaluate` +](../../api_docs/python/contrib.learn.html#BaseEstimator.evaluate) method. +Like `fit`, `evaluate` takes feature data and target values as +arguments, and returns a `dict` with the evaluation results. The following +code passes the Iris test data—`x_test` and `y_test`—to `evaluate` +and prints the `accuracy` from the results: ```python accuracy_score = classifier.evaluate(x=x_test, y=y_test)["accuracy"] @@ -153,15 +206,46 @@ Accuracy: 0.933333 Not bad for a relatively small data set! +## Classify New Samples + +Use the estimator's `predict()` method to classify new samples. For example, +say you have these two new flower samples: + +Sepal Length | Sepal Width | Petal Length | Petal Width +:----------- | :---------- | :----------- | :---------- +6.4 | 3.2 | 4.5 | 1.5 +5.8 | 3.1 | 5.0 | 1.7 + +You can predict their species with the following code: + +```python +# Classify two new flower samples. +new_samples = np.array( + [[6.4, 3.2, 4.5, 1.5], [5.8, 3.1, 5.0, 1.7]], dtype=float) +y = classifier.predict(new_samples) +print ('Predictions: {}'.format(str(y))) +``` + +The `predict()` method returns an array of predictions, one for each sample: + +```python +Predictions: [1 2] +``` + +The model thus predicts that the first sample is *Iris versicolor*, and the +second sample is *Iris virginica*. + ## Additional Resources -* For further reference materials on TF.Learn, see the official [API docs](https://www.tensorflow.org/versions/r0.9/api_docs/python/contrib.learn.html). +* For further reference materials on tf.contrib.learn, see the official +[API docs](../../api_docs/python/contrib.learn.md).
-* To learn more about using TF.Learn to create linear models, see -[Large-scale Linear Models with TensorFlow](https://www.tensorflow.org/versions/r0.9/tutorials/linear/index.html). +* To learn more about using tf.contrib.learn to create linear models, see +[Large-scale Linear Models with TensorFlow](../linear/). -* To experiment with neural network modeling and visualization in the browser, check out [Deep Playground](http://playground.tensorflow.org/). +* To experiment with neural network modeling and visualization in the browser, +check out [Deep Playground](http://playground.tensorflow.org/). -* For more advanced tutorials on neural networks, see [Convolutional Neural Networks](https://www.tensorflow.org/versions/r0.9/tutorials/deep_cnn/index.html) -and [Recurrent Neural Networks](https://www.tensorflow.org/versions/r0.9/tutorials/recurrent/index.html). +* For more advanced tutorials on neural networks, see [Convolutional Neural +Networks](../deep_cnn/) and [Recurrent Neural Networks](../recurrent/). -- cgit v1.2.3
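
The quickstart in this patch relies on three small ideas: a named tuple with `data` and `target` fields, integer species labels (0, 1, 2), and accuracy as the fraction of correct predictions. The following is a minimal standard-library sketch of those ideas only. It does not use or reproduce `tf.contrib.learn`; `load_iris_csv`, `accuracy`, `SPECIES`, and the `Dataset` defined here are hypothetical names chosen for illustration, and the header row in the sample CSV is an assumption about the file layout rather than something the patch states.

```python
import collections
import csv
import io

# Hypothetical stand-in for the named tuple the tutorial describes;
# only the `data` and `target` field names come from the tutorial text.
Dataset = collections.namedtuple("Dataset", ["data", "target"])

# Label-to-species mapping as stated in the tutorial (0, 1, 2).
SPECIES = ["Iris setosa", "Iris versicolor", "Iris virginica"]


def load_iris_csv(lines, target_dtype=int):
    """Parse rows of four float features followed by one target value."""
    reader = csv.reader(lines)
    next(reader)  # Assumption: the first row is a header and is skipped.
    data, target = [], []
    for row in reader:
        if not row:
            continue  # Ignore blank lines.
        data.append([float(value) for value in row[:-1]])
        target.append(target_dtype(row[-1]))
    return Dataset(data=data, target=target)


def accuracy(predictions, targets):
    """Return the fraction of predictions that equal their targets."""
    correct = sum(1 for p, t in zip(predictions, targets) if p == t)
    return correct / len(targets)


# Feature rows copied from the tutorial's sample table; the header is assumed.
SAMPLE_CSV = """3,4,setosa,versicolor,virginica
5.1,3.5,1.4,0.2,0
7.0,3.2,4.7,1.4,1
6.5,3.0,5.2,2.0,2
"""

sample_set = load_iris_csv(io.StringIO(SAMPLE_CSV))
x_sample, y_sample = sample_set.data, sample_set.target

# Map the tutorial's two predicted labels to species names.
predicted = [1, 2]
names = [SPECIES[label] for label in predicted]
print(names)
```

The sketch passes the built-in `int` as the target type where the tutorial passes `np.int`, so it runs without NumPy. Swapping in a real estimator would replace only the hard-coded `predicted` list.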