Diffstat (limited to 'tensorflow/docs_src/tutorials/representation/linear.md')
-rw-r--r-- | tensorflow/docs_src/tutorials/representation/linear.md | 239 |
1 files changed, 239 insertions, 0 deletions
diff --git a/tensorflow/docs_src/tutorials/representation/linear.md b/tensorflow/docs_src/tutorials/representation/linear.md
new file mode 100644
index 0000000000..1b418cf065
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/representation/linear.md
@@ -0,0 +1,239 @@
+# Large-scale Linear Models with TensorFlow
+
+@{tf.estimator$Estimators} provide (among other things) a rich set of tools for
+working with linear models in TensorFlow. This document provides an overview of
+those tools. It explains:
+
+  * What a linear model is.
+  * Why you might want to use a linear model.
+  * How Estimators make it easy to build linear models in TensorFlow.
+  * How you can use Estimators to combine linear models with
+    deep learning to get the advantages of both.
+
+Read this overview to decide whether the Estimator's linear model tools might
+be useful to you. Then work through the
+[Estimator wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep)
+to give it a try. This overview uses code samples from the tutorial, but the
+tutorial walks through the code in greater detail.
+
+To understand this overview it will help to have some familiarity
+with basic machine learning concepts, and also with
+@{$premade_estimators$Estimators}.
+
+[TOC]
+
+## What is a linear model?
+
+A **linear model** uses a single weighted sum of features to make a prediction.
+For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
+on age, years of education, and weekly hours of
+work for a population, a model can learn weights for each of those numbers so that
+their weighted sum estimates a person's salary. You can also use linear models
+for classification.
+
+Some linear models transform the weighted sum into a more convenient form.
+For example, [**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression) plugs the weighted sum into the logistic
+function to turn the output into a value between 0 and 1. But you still just
+have one weight for each input feature.
+
+## Why would you want to use a linear model?
+
+Why would you want to use so simple a model when recent research has
+demonstrated the power of more complex neural networks with many layers?
+
+Linear models:
+
+  * train quickly, compared to deep neural nets.
+  * can work well on very large feature sets.
+  * can be trained with algorithms that don't require a lot of fiddling
+    with learning rates, etc.
+  * can be interpreted and debugged more easily than neural nets.
+    You can examine the weights assigned to each feature to figure out what's
+    having the biggest impact on a prediction.
+  * provide an excellent starting point for learning about machine learning.
+  * are widely used in industry.
+
+## How do Estimators help you build linear models?
+
+You can build a linear model from scratch in TensorFlow without the help of a
+special API. But Estimators provide some tools that make it easier to build
+effective large-scale linear models.
+
+### Feature columns and transformations
+
+Much of the work of designing a linear model consists of transforming raw data
+into suitable input features. TensorFlow uses the `FeatureColumn` abstraction to
+enable these transformations.
+
+A `FeatureColumn` represents a single feature in your data. A `FeatureColumn`
+may represent a quantity like 'height', or it may represent a category like
+'eye_color' where the value is drawn from a set of discrete possibilities like
+{'blue', 'brown', 'green'}.
+
+In the case of both *continuous features* like 'height' and *categorical
+features* like 'eye_color', a single value in the data might get transformed
+into a sequence of numbers before it is input into the model.
+The `FeatureColumn` abstraction lets you manipulate the feature as a single
+semantic unit in spite of this fact. You can specify transformations and
+select features to include without dealing with specific indices in the
+tensors you feed into the model.
+
+#### Sparse columns
+
+Categorical features in linear models are typically translated into a sparse
+vector in which each possible value has a corresponding index or id. For
+example, if there are only three possible eye colors you can represent
+'eye_color' as a length 3 vector: 'blue' would become [1, 0, 0], 'brown' would
+become [0, 1, 0] and 'green' would become [0, 0, 1]. These vectors are called
+"sparse" because they may be very long, with many zeros, when the set of
+possible values is very large (such as all English words).
+
+While you don't need to use categorical columns to use the linear model tools
+provided by Estimators, one of the strengths of linear models is their ability
+to deal with large sparse vectors. Sparse features are a primary use case for
+the linear model tools provided by Estimators.
+
+##### Encoding sparse columns
+
+`FeatureColumn` handles the conversion of categorical values into vectors
+automatically, with code like this:
+
+```python
+eye_color = tf.feature_column.categorical_column_with_vocabulary_list(
+    "eye_color", vocabulary_list=["blue", "brown", "green"])
+```
+
+where `eye_color` is the name of a column in your source data.
+
+You can also generate `FeatureColumn`s for categorical features for which you
+don't know all possible values. For this case you would use
+`categorical_column_with_hash_bucket()`, which uses a hash function to assign
+indices to feature values.
+
+```python
+education = tf.feature_column.categorical_column_with_hash_bucket(
+    "education", hash_bucket_size=1000)
+```
+
+##### Feature crosses
+
+Because linear models assign independent weights to separate features, they
+can't learn the relative importance of specific combinations of feature
+values. If you have a feature 'favorite_sport' and a feature 'home_city' and
+you're trying to predict whether a person likes to wear red, your linear model
+won't be able to learn that baseball fans from St. Louis especially like to
+wear red.
+
+You can get around this limitation by creating a new feature
+'favorite_sport_x_home_city'. The value of this feature for a given person is
+just the concatenation of the values of the two source features:
+'baseball_x_stlouis', for example. This sort of combination feature is called
+a *feature cross*.
+
+The `crossed_column()` method makes it easy to set up feature crosses:
+
+```python
+sport_x_city = tf.feature_column.crossed_column(
+    ["favorite_sport", "home_city"], hash_bucket_size=int(1e4))
+```
+
+#### Continuous columns
+
+You can specify a continuous feature like so:
+
+```python
+age = tf.feature_column.numeric_column("age")
+```
+
+Although, as a single real number, a continuous feature can often be input
+directly into the model, TensorFlow offers useful transformations for this sort
+of column as well.
+
+##### Bucketization
+
+*Bucketization* turns a continuous column into a categorical column. This
+transformation lets you use continuous features in feature crosses, or learn
+cases where specific value ranges have particular importance.
+
+Bucketization divides the range of possible values into subranges called
+buckets:
+
+```python
+age_buckets = tf.feature_column.bucketized_column(
+    age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
+```
+
+The bucket into which a value falls becomes the categorical label for
+that value.
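
Editorial sketch, not part of the committed file: the mapping from an age to a bucket index can be reproduced in plain Python. This assumes each boundary is the inclusive lower edge of its bucket, so the ten boundaries above produce eleven buckets (index 0 for ages below 18, index 10 for ages 65 and over). The helper name `bucket_index` is hypothetical and is not a TensorFlow function.

```python
import bisect

# Same boundaries as the age_buckets example above.
BOUNDARIES = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]


def bucket_index(value, boundaries=BOUNDARIES):
    """Return the index of the bucket that `value` falls into.

    Assumption: buckets include their left boundary and exclude their
    right boundary, so a value equal to a boundary opens the next bucket.
    """
    return bisect.bisect_right(boundaries, value)


if __name__ == "__main__":
    for sample_age in (17, 18, 24, 25, 70):
        print(sample_age, "->", bucket_index(sample_age))
```

The bucket index, rather than the raw age, is what a linear model would then weight, which is why each age range can receive its own weight.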
+
+#### Input function
+
+`FeatureColumn`s provide a specification for the input data for your model,
+indicating how to represent and transform the data. But they do not provide
+the data itself. You provide the data through an input function.
+
+The input function must return a dictionary of tensors. Each key corresponds to
+the name of a `FeatureColumn`. Each key's value is a tensor containing the
+values of that feature for all data instances. See
+@{$premade_estimators#input_fn} for a
+more comprehensive look at input functions, and `input_fn` in the
+[wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep)
+for an example implementation of an input function.
+
+The input function is passed to the `train()` and `evaluate()` calls that
+initiate training and testing, as described in the next section.
+
+### Linear estimators
+
+TensorFlow estimator classes provide a unified training and evaluation harness
+for regression and classification models. They take care of the details of the
+training and evaluation loops and allow the user to focus on model inputs and
+architecture.
+
+To build a linear estimator, you can use either the
+`tf.estimator.LinearClassifier` estimator or the
+`tf.estimator.LinearRegressor` estimator, for classification and
+regression respectively.
+
+As with all TensorFlow estimators, to run the estimator you just:
+
+  1. Instantiate the estimator class. For the two linear estimator classes,
+     you pass a list of `FeatureColumn`s to the constructor.
+  2. Call the estimator's `train()` method to train it.
+  3. Call the estimator's `evaluate()` method to see how it does.
+
+For example:
+
+```python
+e = tf.estimator.LinearClassifier(
+    feature_columns=[
+        native_country, education, occupation, workclass, marital_status,
+        race, age_buckets, education_x_occupation,
+        age_buckets_x_race_x_occupation],
+    model_dir=YOUR_MODEL_DIRECTORY)
+e.train(input_fn=input_fn_train, steps=200)
+# Evaluate for one step (one pass through the test data).
+results = e.evaluate(input_fn=input_fn_test)
+
+# Print the stats for the evaluation.
+for key in sorted(results):
+    print("%s: %s" % (key, results[key]))
+```
+
+### Wide and deep learning
+
+The `tf.estimator` module also provides an estimator class that lets you jointly
+train a linear model and a deep neural network. This novel approach combines the
+ability of linear models to "memorize" key features with the generalization
+ability of neural nets. Use `tf.estimator.DNNLinearCombinedClassifier` to
+create this sort of "wide and deep" model:
+
+```python
+e = tf.estimator.DNNLinearCombinedClassifier(
+    model_dir=YOUR_MODEL_DIR,
+    linear_feature_columns=wide_columns,
+    dnn_feature_columns=deep_columns,
+    dnn_hidden_units=[100, 50])
+```
+For more information, see the
+[wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep).
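
Editorial sketch, not part of the committed file: the idea of jointly using a linear ("wide") part and a neural ("deep") part can be illustrated in plain Python. The sketch assumes the two parts each produce a single score and that the scores are added before the logistic function is applied, which is how the wide and deep approach is commonly described. It does not use or reproduce the `tf.estimator` implementation, and every function name and parameter value below is hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_logit(features, weights, bias):
    """Wide part: a single weighted sum of the input features."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def deep_logit(features, hidden_weights, output_weights):
    """Deep part: one ReLU hidden layer followed by a weighted sum."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, features)))
              for row in hidden_weights]
    return sum(w * h for w, h in zip(output_weights, hidden))


def wide_and_deep_probability(wide_features, deep_features, params):
    """Add the wide and deep scores, then squash the sum with the logistic."""
    logit = (linear_logit(wide_features, params["wide_w"], params["wide_b"])
             + deep_logit(deep_features, params["hidden_w"], params["out_w"]))
    return sigmoid(logit)


# Hand-picked, purely illustrative parameters (assumed, not learned).
params = {
    "wide_w": [0.5, -0.25],
    "wide_b": 0.0,
    "hidden_w": [[1.0, 0.0], [0.0, -1.0]],
    "out_w": [0.5, 0.5],
}

probability = wide_and_deep_probability([1.0, 2.0], [2.0, 3.0], params)
print(probability)
```

In a real model both sets of parameters would be learned together during training; here they are fixed only to show how the two scores combine into one prediction.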