author      2017-10-24 10:08:03 -0700
committer   2017-10-24 10:11:45 -0700
commit      58b071639d97afdbc5ac5e222a4be81dcb344962 (patch)
tree        393af2e6b989038c5b59476f0a7ab0b0da42e289
parent      86895d4a87a4d2cf2e1106b3fa3c176378d1029a (diff)
Added a dataset page to the api guide
PiperOrigin-RevId: 173272637
-rw-r--r--   tensorflow/docs_src/api_guides/python/input_dataset.md | 81
-rw-r--r--   tensorflow/docs_src/api_guides/python/reading_data.md  | 23
-rw-r--r--   tensorflow/docs_src/programmers_guide/datasets.md      |  2
3 files changed, 98 insertions, 8 deletions
diff --git a/tensorflow/docs_src/api_guides/python/input_dataset.md b/tensorflow/docs_src/api_guides/python/input_dataset.md
new file mode 100644
index 0000000000..2798d76be9
--- /dev/null
+++ b/tensorflow/docs_src/api_guides/python/input_dataset.md
@@ -0,0 +1,81 @@
+# `Dataset` Input Pipeline
+[TOC]
+
+@{tf.data.Dataset} allows you to build complex input pipelines. See the
+@{$datasets$programmer's guide} for an in-depth explanation of how to use this
+API.
+
+## Reader classes
+
+Classes that create a dataset from input files.
+
+* @{tf.data.FixedLengthRecordDataset}
+* @{tf.data.TextLineDataset}
+* @{tf.data.TFRecordDataset}
+
+## Creating new datasets
+
+Static methods in `Dataset` that create new datasets.
+
+* @{tf.data.Dataset.from_generator}
+* @{tf.data.Dataset.from_sparse_tensor_slices}
+* @{tf.data.Dataset.from_tensor_slices}
+* @{tf.data.Dataset.from_tensors}
+* @{tf.data.Dataset.list_files}
+* @{tf.data.Dataset.range}
+* @{tf.data.Dataset.zip}
+
+## Transformations on existing datasets
+
+These functions transform an existing dataset, and return a new dataset. Calls
+can be chained together, as shown in the example below:
+
+```
+train_data = train_data.batch(100).shuffle(buffer_size=10).repeat()
+```
+
+* @{tf.data.Dataset.apply}
+* @{tf.data.Dataset.batch}
+* @{tf.data.Dataset.cache}
+* @{tf.data.Dataset.concatenate}
+* @{tf.data.Dataset.filter}
+* @{tf.data.Dataset.flat_map}
+* @{tf.data.Dataset.interleave}
+* @{tf.data.Dataset.map}
+* @{tf.data.Dataset.padded_batch}
+* @{tf.data.Dataset.prefetch}
+* @{tf.data.Dataset.repeat}
+* @{tf.data.Dataset.shard}
+* @{tf.data.Dataset.shuffle}
+* @{tf.data.Dataset.skip}
+* @{tf.data.Dataset.take}
+
+### Custom transformation functions
+
+Custom transformation functions can be applied to a `Dataset` using @{tf.data.Dataset.apply}. Below are custom transformation functions from `tf.contrib.data`:
+
+* @{tf.contrib.data.batch_and_drop_remainder}
+* @{tf.contrib.data.dense_to_sparse_batch}
+* @{tf.contrib.data.enumerate_dataset}
+* @{tf.contrib.data.group_by_window}
+* @{tf.contrib.data.ignore_errors}
+* @{tf.contrib.data.rejection_resample}
+* @{tf.contrib.data.sloppy_interleave}
+* @{tf.contrib.data.unbatch}
+
+## Iterating over datasets
+
+These functions make a @{tf.data.Iterator} from a `Dataset`.
+
+* @{tf.data.Dataset.make_initializable_iterator}
+* @{tf.data.Dataset.make_one_shot_iterator}
+
+The `Iterator` class also contains static methods that create a @{tf.data.Iterator} that can be used with multiple `Dataset` objects.
+
+* @{tf.data.Iterator.from_structure}
+* @{tf.data.Iterator.from_string_handle}
+
+## Extra functions from `tf.contrib.data`
+
+* @{tf.contrib.data.read_batch_features}
+
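As an illustrative aside, here is a minimal sketch of how the reader, transformation, and iterator entry points listed in the new page fit together, assuming the TensorFlow 1.4-era `tf.data` API; the `train.csv` file, its four-column layout, and the `parse_line` helper are hypothetical and only stand in for real input data.

```python
import tensorflow as tf

def parse_line(line):
    # Hypothetical parser: split a CSV line into a 3-element feature vector
    # and an integer label.
    fields = tf.decode_csv(line, record_defaults=[[0.0], [0.0], [0.0], [0]])
    return tf.stack(fields[:-1]), fields[-1]

# Reader class: one dataset element per line of the (assumed) input file.
dataset = tf.data.TextLineDataset("train.csv")

# Chained transformations; each call returns a new Dataset.
dataset = (dataset
           .map(parse_line)            # decode each text line
           .shuffle(buffer_size=1000)  # shuffle within a 1000-element buffer
           .batch(100)                 # group elements into batches of 100
           .repeat())                  # cycle through the data indefinitely

# Iterate over the dataset with a one-shot iterator.
iterator = dataset.make_one_shot_iterator()
features, labels = iterator.get_next()

with tf.Session() as sess:
    batch_features, batch_labels = sess.run([features, labels])
```

Because every transformation returns a new `Dataset`, the chain can be split or reordered freely; a one-shot iterator suits pipelines that need no explicit initialization, while `make_initializable_iterator` is the usual choice when the pipeline is parameterized by placeholders.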
diff --git a/tensorflow/docs_src/api_guides/python/reading_data.md b/tensorflow/docs_src/api_guides/python/reading_data.md
index 8b6196ea34..7609ca91d0 100644
--- a/tensorflow/docs_src/api_guides/python/reading_data.md
+++ b/tensorflow/docs_src/api_guides/python/reading_data.md
@@ -3,16 +3,25 @@
 Note: The preferred way to feed data into a tensorflow program is using the
 @{$datasets$Datasets API}.
 
-There are three other methods of getting data into a TensorFlow program:
+There are four methods of getting data into a TensorFlow program:
 
+* `Dataset` API: Easily construct a complex input pipeline. (preferred method)
 * Feeding: Python code provides the data when running each step.
-* Reading from files: an input pipeline reads the data from files
+* `QueueRunner`: a queue-based input pipeline reads the data from files
   at the beginning of a TensorFlow graph.
 * Preloaded data: a constant or variable in the TensorFlow graph holds
   all the data (for small data sets).
 
 [TOC]
 
+## Dataset API
+
+See the @{$datasets$programmer's guide} for an in-depth explanation of
+@{tf.data.Dataset}. The `Dataset` API allows you to extract and preprocess data
+from different input/file formats, and apply transformations such as batch,
+shuffle, and map to the dataset. This is an improved version of the old input
+methods, feeding and `QueueRunner`.
+
 ## Feeding
 
 TensorFlow's feed mechanism lets you inject data into any Tensor in a
@@ -22,7 +31,7 @@ graph.
 Supply feed data through the `feed_dict` argument to a run() or eval() call
 that initiates computation.
 
-Note: "Feeding" is the least efficient way to feed data into a tensorflow
+Warning: "Feeding" is the least efficient way to feed data into a tensorflow
 program and should only be used for small experiments and debugging.
 
 ```python
@@ -44,9 +53,9 @@ in
 [`tensorflow/examples/tutorials/mnist/fully_connected_feed.py`](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/fully_connected_feed.py),
 and is described in the @{$mechanics$MNIST tutorial}.
 
-## Reading from files
+## `QueueRunner`
 
-A typical pipeline for reading records from files has the following stages:
+A typical queue-based pipeline for reading records from files has the following stages:
 
 1. The list of filenames
 2. *Optional* filename shuffling
@@ -57,8 +66,8 @@
 7. *Optional* preprocessing
 8. Example queue
 
-Note: This section discusses implementing input pipelines using the
-queue-based APIs which can be cleanly replaced by the ${$datasets$Dataset API}.
+Warning: This section discusses implementing input pipelines using the
+queue-based APIs which can be cleanly replaced by the @{$datasets$Dataset API}.
 
 ### Filenames, shuffling, and epoch limits
 
diff --git a/tensorflow/docs_src/programmers_guide/datasets.md b/tensorflow/docs_src/programmers_guide/datasets.md
index fd1c927539..38e5612fb4 100644
--- a/tensorflow/docs_src/programmers_guide/datasets.md
+++ b/tensorflow/docs_src/programmers_guide/datasets.md
@@ -1,6 +1,6 @@
 # Importing Data
 
-The `Dataset` API enables you to build complex input pipelines from
+The @{tf.data.Dataset$`Dataset`} API enables you to build complex input pipelines from
 simple, reusable pieces. For example, the pipeline for an image model might
 aggregate data from files in a distributed file system, apply random
 perturbations to each image, and merge randomly selected images into a batch
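As a companion illustration of the feeding mechanism that the updated `reading_data.md` now discourages in favour of `tf.data`, here is a minimal sketch assuming the TensorFlow 1.x `placeholder`/`Session` API; the placeholder shape and the fed array are made up for the example.

```python
import numpy as np
import tensorflow as tf

# A placeholder receives its value at run time through feed_dict.
x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
y = tf.reduce_sum(x, axis=1)

with tf.Session() as sess:
    batch = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)
    # "Feeding": the Python array is injected into the graph for this one step.
    print(sess.run(y, feed_dict={x: batch}))  # -> [3. 7.]
```

Every fed step copies the array from Python into the runtime, which is why the guide calls feeding the least efficient option and recommends it only for small experiments and debugging.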