diff options
author | 2016-11-08 13:28:56 -0800 | |
---|---|---|
committer | 2016-11-08 16:31:10 -0800 | |
commit | e8e72f64dcd5401d923707063da8b23fbc5d65d8 (patch) | |
tree | 9560ca141b0b9629a18c1a3ed8aeb01d82a591b2 | |
parent | 128ce5d5f7aa9ed4b4354fe78af1c2c686327640 (diff) |
Add docs for the estimators with examples.
Change: 138557107
-rw-r--r-- | tensorflow/contrib/learn/python/learn/estimators/__init__.py | 249 |
1 files changed, 248 insertions, 1 deletions
diff --git a/tensorflow/contrib/learn/python/learn/estimators/__init__.py b/tensorflow/contrib/learn/python/learn/estimators/__init__.py index 620f23d1ff..ef5a16d7dc 100644 --- a/tensorflow/contrib/learn/python/learn/estimators/__init__.py +++ b/tensorflow/contrib/learn/python/learn/estimators/__init__.py @@ -13,7 +13,254 @@ # limitations under the License. # ============================================================================== -"""Estimators.""" +"""An estimator is a rule for calculating an estimate of a given quantity. + +# Estimators + +* **Estimators** are used to train and evaluate TensorFlow models. +They support regression and classification problems. +* **Classifiers** are functions that have discrete outcomes. +* **Regressors** are functions that predict continuous values. + +## Choosing the correct estimator + +* For **Regression** problems use one of the following: + * `LinearRegressor`: Uses linear model. + * `DNNRegressor`: Uses DNN. + * `DNNLinearCombinedRegressor`: Uses Wide & Deep. + * `TensorForestEstimator`: Uses RandomForest. Use `.predict()` for + regression problems. + * `Estimator`: Use when you need a custom model. + +* For **Classification** problems use one of the following: + * `LinearClassifier`: Multiclass classifier using Linear model. + * `DNNClassifier`: Multiclass classifier using DNN. + * `DNNLinearCombinedClassifier`: Multiclass classifier using Wide & Deep. + * `TensorForestEstimator`: Uses RandomForest. Use `.predict_proba()` when + using for binary classification problems. + * `SVM`: Binary classifier using linear SVMs. + * `LogisticRegressor`: Use when you need custom model for binary + classification. + * `Estimator`: Use when you need custom model for N class classification. + +## Pre-canned Estimators + +Pre-canned estimators are machine learning estimators premade for general +purpose problems. If you need more customization, you can always write your +own custom estimator as described in the section below. + +Pre-canned estimators are tested and optimized for speed and quality. + +### Define the feature columns + +Here are some possible types of feature columns used as inputs to a pre-canned +estimator. + +Feature columns may vary based on the estimator used. So you can see which +feature columns are fed to each estimator in the below section. + +```python +sparse_feature_a = sparse_column_with_keys( + column_name="sparse_feature_a", keys=["AB", "CD", ...]) + +embedding_feature_a = embedding_column( + sparse_id_column=sparse_feature_a, dimension=3, combiner="sum") + +sparse_feature_b = sparse_column_with_hash_bucket( + column_name="sparse_feature_b", hash_bucket_size=1000) + +embedding_feature_b = embedding_column( + sparse_id_column=sparse_feature_b, dimension=16, combiner="sum") + +crossed_feature_a_x_b = crossed_column( + columns=[sparse_feature_a, sparse_feature_b], hash_bucket_size=10000) + +real_feature = real_valued_column("real_feature") +real_feature_buckets = bucketized_column( + source_column=real_feature, + boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +``` + +### Create the pre-canned estimator + +DNNClassifier, DNNRegressor, and DNNLinearCombinedClassifier are all pretty +similar to each other in how you use them. You can easily plug in an +optimizer and/or regularization to those estimators. + +#### DNNClassifier + +A classifier for TensorFlow DNN models. + +```python +my_features = [embedding_feature_a, embedding_feature_b] +estimator = DNNClassifier( + feature_columns=my_features, + hidden_units=[1024, 512, 256], + optimizer=tf.train.ProximalAdagradOptimizer( + learning_rate=0.1, + l1_regularization_strength=0.001 + )) +``` + +#### DNNRegressor + +A regressor for TensorFlow DNN models. + +```python +my_features = [embedding_feature_a, embedding_feature_b] + +estimator = DNNRegressor( +feature_columns=my_features, +hidden_units=[1024, 512, 256]) + +# Or estimator using the ProximalAdagradOptimizer optimizer with +# regularization. +estimator = DNNRegressor( + feature_columns=my_features, + hidden_units=[1024, 512, 256], + optimizer=tf.train.ProximalAdagradOptimizer( + learning_rate=0.1, + l1_regularization_strength=0.001 + )) +``` + +#### DNNLinearCombinedClassifier + +A classifier for TensorFlow Linear and DNN joined training models. + +* Wide and deep model +* Multi class (2 by default) + +```python +my_linear_features = [crossed_feature_a_x_b] +my_deep_features = [embedding_feature_a, embedding_feature_b] +estimator = DNNLinearCombinedClassifier( + # Common settings + n_classes=n_classes, + weight_column_name=weight_column_name, + # Wide settings + linear_feature_columns=my_linear_features, + linear_optimizer=tf.train.FtrlOptimizer(...), + # Deep settings + dnn_feature_columns=my_deep_features, + dnn_hidden_units=[1000, 500, 100], + dnn_optimizer=tf.train.AdagradOptimizer(...)) +``` + +#### LinearClassifier + +Train a linear model to classify instances into one of multiple possible +classes. When number of possible classes is 2, this is binary classification. + +```python +my_features = [sparse_feature_b, crossed_feature_a_x_b] +estimator = LinearClassifier( + feature_columns=my_features, + optimizer=tf.train.FtrlOptimizer( + learning_rate=0.1, + l1_regularization_strength=0.001 + )) +``` + +#### LinearRegressor + +Train a linear regression model to predict a label value given observation of +feature values. + +```python +my_features = [sparse_feature_b, crossed_feature_a_x_b] +estimator = LinearRegressor( + feature_columns=my_features) +``` + +#### SVM - Support Vector Machine + +Support Vector Machine (SVM) model for binary classification. + +Currently only linear SVMs are supported. + +```python +my_features = [real_feature, sparse_feature_a] +estimator = SVM( + example_id_column='example_id', + feature_columns=my_features, + l2_regularization=10.0) +``` + +#### TensorForestEstimator + +Supports regression and binary classification. + +### Use the estimator + +There are two main functions for using estimators, one of which is for +training, and one of which is for evaluation. +You can specify different data sources for each one in order to use different +datasets for train and eval. + +```python +# Input builders +def input_fn_train: # returns x, Y + ... +estimator.fit(input_fn=input_fn_train) + +def input_fn_eval: # returns x, Y + ... +estimator.evaluate(input_fn=input_fn_eval) +estimator.predict(x=x) +``` + +## Creating Custom Estimator + +To create a custom `Estimator`, provide a function to `Estimator`'s +constructor that builds your model (`model_fn`, below): + + +```python +estimator = tf.contrib.learn.Estimator( + model_fn=model_fn, + model_dir=model_dir) # Where the model's data (e.g., checkpoints) + # are saved. +``` + +Here is a skeleton of this function, with descriptions of its arguments and +return values in the accompanying tables: + +```python +def model_fn(features, targets, mode, params): + # Logic to do the following: + # 1. Configure the model via TensorFlow operations + # 2. Define the loss function for training/evaluation + # 3. Define the training operation/optimizer + # 4. Generate predictions + return predictions, loss, train_op +``` + +You may use `mode` and check against +`tf.contrib.learn.ModeKeys.{TRAIN, EVAL, INFER}` to parameterize `model_fn`. + +In the Further Reading section below, there is an end-to-end TensorFlow +tutorial for building a custom estimator. + +## Additional Estimators + +There are two additional estimators under +`tensorflow.contrib.factorization.python.ops`: + +* K-Means +* Gaussian mixture model (GMM) clustering + +## Further reading + +For further reading, there are several tutorials with relevant topics, +including: + +* [Overview of linear models](../../../tutorials/linear/overview.md) +* [Linear model tutorial](../../../tutorials/wide/index.md) +* [Wide and deep learning tutorial](../../../tutorials/wide_and_deep/index.md) +* [Custom estimator tutorial](../../../tutorials/estimators/index.md) +* [Building input functions](../../../tutorials/input_fn/index.md) +""" from __future__ import absolute_import from __future__ import division from __future__ import print_function |