Diffstat (limited to 'tensorflow/docs_src/tutorials/representation/linear.md')
1 files changed, 239 insertions, 0 deletions
diff --git a/tensorflow/docs_src/tutorials/representation/linear.md b/tensorflow/docs_src/tutorials/representation/linear.md
new file mode 100644
index 0000000000..1b418cf065
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/representation/linear.md
@@ -0,0 +1,239 @@
+# Large-scale Linear Models with TensorFlow
+@{tf.estimator$Estimators} provide (among other things) a rich set of tools for
+working with linear models in TensorFlow. This document provides an overview of
+those tools. It explains:
+ * What a linear model is.
+ * Why you might want to use a linear model.
+ * How Estimators make it easy to build linear models in TensorFlow.
+ * How you can use Estimators to combine linear models with
+ deep learning to get the advantages of both.
+Read this overview to decide whether the Estimator's linear model tools might
+be useful to you. Then work through the
+[Estimator wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep)
+to give it a try. This overview uses code samples from the tutorial, but the
+tutorial walks through the code in greater detail.
+To understand this overview, it helps to have some familiarity
+with basic machine learning concepts, and also with Estimators.
+## What is a linear model?
+A **linear model** uses a single weighted sum of features to make a prediction.
+For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
+on age, years of education, and weekly hours of
+work for a population, a model can learn weights for each of those numbers so that
+their weighted sum estimates a person's salary. You can also use linear models
+for classification.
+Some linear models transform the weighted sum into a more convenient form. For
+example, [**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression) plugs the weighted sum into the logistic
+function to turn the output into a value between 0 and 1. But you still just
+have one weight for each input feature.
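+As a minimal sketch in plain Python (not the Estimator API), the snippet below
+computes a weighted sum and passes it through the logistic function. The
+weights, the bias term, and the feature values are made-up numbers chosen only
+for illustration; a trained model would learn them from data.
```python
import math


def weighted_sum(features, weights, bias=0.0):
    """Return bias + sum of weight * value, using one weight per feature."""
    return bias + sum(weights[name] * value for name, value in features.items())


def logistic(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights and one hypothetical person; not learned from real data.
weights = {"age": 0.02, "education_years": 0.1, "hours_per_week": 0.03}
person = {"age": 40, "education_years": 16, "hours_per_week": 45}

score = weighted_sum(person, weights, bias=-4.0)  # regression-style output
probability = logistic(score)                     # classification-style output
```
+Note that `logistic()` only reshapes the output. The model still has exactly
+one weight per input feature.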
+## Why would you want to use a linear model?
+Why would you want to use so simple a model when recent research has
+demonstrated the power of more complex neural networks with many layers?
+Linear models:
+ * train quickly, compared to deep neural nets.
+ * can work well on very large feature sets.
+ * can be trained with algorithms that don't require a lot of fiddling
+ with learning rates, etc.
+ * can be interpreted and debugged more easily than neural nets.
+ You can examine the weights assigned to each feature to figure out what's
+ having the biggest impact on a prediction.
+ * provide an excellent starting point for learning about machine learning.
+ * are widely used in industry.
+## How do Estimators help you build linear models?
+You can build a linear model from scratch in TensorFlow without the help of a
+special API. But Estimators provide some tools that make it easier to build
+effective large-scale linear models.
+### Feature columns and transformations
+Much of the work of designing a linear model consists of transforming raw data
+into suitable input features. TensorFlow uses the `FeatureColumn` abstraction to
+enable these transformations.
+A `FeatureColumn` represents a single feature in your data. A `FeatureColumn`
+may represent a quantity like 'height', or it may represent a category like
+'eye_color' where the value is drawn from a set of discrete possibilities like
+{'blue', 'brown', 'green'}.
+In the case of both *continuous features* like 'height' and *categorical
+features* like 'eye_color', a single value in the data might get transformed
+into a sequence of numbers before it is input into the model. The
+`FeatureColumn` abstraction lets you manipulate the feature as a single
+semantic unit in spite of this fact. You can specify transformations and
+select features to include without dealing with specific indices in the
+tensors you feed into the model.
+#### Sparse columns
+Categorical features in linear models are typically translated into a sparse
+vector in which each possible value has a corresponding index or id. For
+example, if there are only three possible eye colors you can represent
+'eye_color' as a length 3 vector: 'brown' would become [1, 0, 0], 'blue' would
+become [0, 1, 0] and 'green' would become [0, 0, 1]. These vectors are called
+"sparse" because they may be very long, with many zeros, when the set of
+possible values is very large (such as all English words).
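+The encoding described above can be sketched in plain Python. This is an
+illustration of the idea only, not how TensorFlow stores sparse features
+internally; `one_hot` and `EYE_COLORS` are hypothetical names.
```python
def one_hot(value, vocabulary):
    """Return a list with 1 at the index of `value` and 0 everywhere else."""
    index = vocabulary.index(value)  # raises ValueError for unknown values
    return [1 if i == index else 0 for i in range(len(vocabulary))]


# Vocabulary order follows the example in the text.
EYE_COLORS = ["brown", "blue", "green"]
encoded = {color: one_hot(color, EYE_COLORS) for color in EYE_COLORS}
```
+With a vocabulary of thousands of values, almost every entry of such a vector
+is zero, which is why it pays to store only the index of the nonzero entry.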
+While you don't need to use categorical columns to use the linear model tools
+provided by Estimators, one of the strengths of linear models is their ability
+to deal with large sparse vectors. Sparse features are a primary use case for
+the linear model tools provided by Estimators.
+##### Encoding sparse columns
+`FeatureColumn` handles the conversion of categorical values into vectors
+automatically, with code like this:
+import tensorflow as tf
+# Assign each listed string an integer id (its position in the list).
+eye_color = tf.feature_column.categorical_column_with_vocabulary_list(
+    "eye_color", vocabulary_list=["blue", "brown", "green"])
+where `eye_color` is the name of a column in your source data.
+You can also generate `FeatureColumn`s for categorical features for which you
+don't know all possible values. For this case you would use
+`categorical_column_with_hash_bucket()`, which uses a hash function to assign
+indices to feature values.
+education = tf.feature_column.categorical_column_with_hash_bucket(
+ "education", hash_bucket_size=1000)
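+To see how a hash function can assign indices without a vocabulary, here is a
+rough sketch in plain Python. It uses MD5 from the standard library purely for
+illustration. TensorFlow uses its own hash function, so the indices below will
+not match the ones TensorFlow computes, and `hash_bucket` is a hypothetical
+helper.
```python
import hashlib


def hash_bucket(value, hash_bucket_size):
    """Map a string to a stable index in the range [0, hash_bucket_size)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % hash_bucket_size


# The same value always lands in the same bucket, even if it was never
# listed in advance.
bachelors_id = hash_bucket("Bachelors", 1000)
masters_id = hash_bucket("Masters", 1000)
```
+Two different values can hash to the same bucket. A larger `hash_bucket_size`
+makes such collisions less likely at the cost of a larger weight vector.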
+##### Feature Crosses
+Because linear models assign independent weights to separate features, they
+can't learn the relative importance of specific combinations of feature
+values. If you have a feature 'favorite_sport' and a feature 'home_city' and
+you're trying to predict whether a person likes to wear red, your linear model
+won't be able to learn that baseball fans from St. Louis especially like to
+wear red.
+You can get around this limitation by creating a new feature
+'favorite_sport_x_home_city'. The value of this feature for a given person is
+just the concatenation of the values of the two source features:
+'baseball_x_stlouis', for example. This sort of combination feature is called
+a *feature cross*.
+The `crossed_column()` function makes it easy to set up feature crosses:
+sport_x_city = tf.feature_column.crossed_column(
+ ["sport", "city"], hash_bucket_size=int(1e4))
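+Conceptually, a feature cross combines the source values into one new
+categorical value and then hashes it into a bucket. The plain-Python sketch
+below illustrates this under the assumption of a simple string join and an MD5
+hash. TensorFlow combines and hashes the values differently, and `cross` is a
+hypothetical helper.
```python
import hashlib


def cross(values, hash_bucket_size):
    """Join the source values into one string and hash it to a bucket index."""
    crossed_value = "_x_".join(values)
    digest = hashlib.md5(crossed_value.encode("utf-8")).hexdigest()
    return crossed_value, int(digest, 16) % hash_bucket_size


# One person's values for the two source features.
crossed_value, bucket = cross(["baseball", "stlouis"], int(1e4))
```
+Because each combination gets its own index, the model can learn a separate
+weight for 'baseball_x_stlouis' in addition to the weights for 'baseball' and
+'stlouis' alone.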
+#### Continuous columns
+You can specify a continuous feature like so:
+age = tf.feature_column.numeric_column("age")
+Although, as a single real number, a continuous feature can often be input
+directly into the model, TensorFlow offers useful transformations for this sort
+of column as well.
+##### Bucketization
+*Bucketization* turns a continuous column into a categorical column. This
+transformation lets you use continuous features in feature crosses, or learn
+cases where specific value ranges have particular importance.
+Bucketization divides the range of possible values into subranges called *buckets*:
+age_buckets = tf.feature_column.bucketized_column(
+ age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
+The bucket into which a value falls becomes the categorical label for
+that value.
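+The mapping from a value to its bucket can be sketched with the standard
+library's `bisect` module. This sketch assumes that each boundary belongs to
+the bucket that starts at it, so ten boundaries produce eleven buckets;
+`bucket_index` is a hypothetical helper, not a TensorFlow function.
```python
import bisect

BOUNDARIES = [18, 25, 30, 35, 40, 45, 50, 55, 60, 65]


def bucket_index(value, boundaries):
    """Return the index of the subrange that contains `value`.

    Index 0 covers values below the first boundary, and the last index
    covers values at or above the last boundary.
    """
    return bisect.bisect_right(boundaries, value)


ages = [17, 18, 24, 25, 70]
indices = [bucket_index(age, BOUNDARIES) for age in ages]
```
+Here ages 18 and 24 share a bucket, so the model learns one weight for that
+whole age range instead of a single weight for age as a number.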
+#### Input function
+`FeatureColumn`s provide a specification for the input data for your model,
+indicating how to represent and transform the data. But they do not provide
+the data itself. You provide the data through an input function.
+The input function must return a dictionary of tensors. Each key corresponds to
+the name of a `FeatureColumn`. Each key's value is a tensor containing the
+values of that feature for all data instances. See
+@{$premade_estimators#input_fn} for a
+more comprehensive look at input functions, and `input_fn` in the
+[wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep)
+for an example implementation of an input function.
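+The column-oriented layout that an input function returns can be illustrated
+without TensorFlow. In the sketch below, plain lists stand in for tensors, and
+the records, the label name, and `to_feature_dict` are all hypothetical. It
+also returns the labels alongside the features, as Estimator input functions
+do for training and evaluation.
```python
def to_feature_dict(rows, label_key):
    """Turn row-oriented records into (features, labels) in column layout."""
    feature_names = [name for name in rows[0] if name != label_key]
    # One entry per feature, holding that feature's value for every instance.
    features = {name: [row[name] for row in rows] for name in feature_names}
    labels = [row[label_key] for row in rows]
    return features, labels


# Hypothetical records; a real input function would read them from a file.
rows = [
    {"age": 39, "education": "Bachelors", "income_over_50k": 0},
    {"age": 52, "education": "Masters", "income_over_50k": 1},
]
features, labels = to_feature_dict(rows, label_key="income_over_50k")
```
+Each key in `features` matches the name given to a `FeatureColumn`, which is
+how a column finds its data.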
+The input function is passed to the `train()` and `evaluate()` calls that
+initiate training and testing, as described in the next section.
+### Linear estimators
+TensorFlow estimator classes provide a unified training and evaluation harness
+for regression and classification models. They take care of the details of the
+training and evaluation loops and allow the user to focus on model inputs and
+architecture.
+To build a linear estimator, you can use either the
+`tf.estimator.LinearClassifier` estimator or the
+`tf.estimator.LinearRegressor` estimator, for classification and
+regression respectively.
+As with all TensorFlow estimators, to run the estimator you just:
+ 1. Instantiate the estimator class. For the two linear estimator classes,
+ you pass a list of `FeatureColumn`s to the constructor.
+ 2. Call the estimator's `train()` method to train it.
+ 3. Call the estimator's `evaluate()` method to see how it does.
+For example:
+e = tf.estimator.LinearClassifier(
+    feature_columns=[
+        native_country, education, occupation, workclass, marital_status,
+        race, age_buckets, education_x_occupation,
+        age_buckets_x_race_x_occupation])
+e.train(input_fn=input_fn_train, steps=200)
+# Evaluate for one step (one pass through the test data).
+results = e.evaluate(input_fn=input_fn_test)
+# Print the stats for the evaluation.
+for key in sorted(results):
+    print("%s: %s" % (key, results[key]))
+### Wide and deep learning
+The `tf.estimator` module also provides an estimator class that lets you jointly
+train a linear model and a deep neural network. This novel approach combines the
+ability of linear models to "memorize" key features with the generalization
+ability of neural nets. Use `tf.estimator.DNNLinearCombinedClassifier` to
+create this sort of "wide and deep" model:
+e = tf.estimator.DNNLinearCombinedClassifier(
+ model_dir=YOUR_MODEL_DIR,
+ linear_feature_columns=wide_columns,
+ dnn_feature_columns=deep_columns,
+ dnn_hidden_units=[100, 50])
+For more information, see the
+[wide and deep learning tutorial](https://github.com/tensorflow/models/tree/master/official/wide_deep).