path: root/tensorflow/docs_src/api_guides/python/input_dataset.md
# Dataset Input Pipeline
[TOC]

@{tf.data.Dataset} allows you to build complex input pipelines. See the
@{$guide/datasets} for an in-depth explanation of how to use this API.

## Reader classes

Classes that create a dataset from input files.

*   @{tf.data.FixedLengthRecordDataset}
*   @{tf.data.TextLineDataset}
*   @{tf.data.TFRecordDataset}

## Creating new datasets

Static methods in `Dataset` that create new datasets.

*   @{tf.data.Dataset.from_generator}
*   @{tf.data.Dataset.from_tensor_slices}
*   @{tf.data.Dataset.from_tensors}
*   @{tf.data.Dataset.list_files}
*   @{tf.data.Dataset.range}
*   @{tf.data.Dataset.zip}

## Transformations on existing datasets

These methods transform an existing dataset and return a new dataset. Calls
can be chained together, as shown in the example below:

```python
train_data = train_data.shuffle(buffer_size=10000).batch(100).repeat()
```

*   @{tf.data.Dataset.apply}
*   @{tf.data.Dataset.batch}
*   @{tf.data.Dataset.cache}
*   @{tf.data.Dataset.concatenate}
*   @{tf.data.Dataset.filter}
*   @{tf.data.Dataset.flat_map}
*   @{tf.data.Dataset.interleave}
*   @{tf.data.Dataset.map}
*   @{tf.data.Dataset.padded_batch}
*   @{tf.data.Dataset.prefetch}
*   @{tf.data.Dataset.repeat}
*   @{tf.data.Dataset.shard}
*   @{tf.data.Dataset.shuffle}
*   @{tf.data.Dataset.skip}
*   @{tf.data.Dataset.take}

### Custom transformation functions

Custom transformation functions can be applied to a `Dataset` using @{tf.data.Dataset.apply}. Below are custom transformation functions from `tf.contrib.data`:

*   @{tf.contrib.data.batch_and_drop_remainder}
*   @{tf.contrib.data.dense_to_sparse_batch}
*   @{tf.contrib.data.enumerate_dataset}
*   @{tf.contrib.data.group_by_window}
*   @{tf.contrib.data.ignore_errors}
*   @{tf.contrib.data.map_and_batch}
*   @{tf.contrib.data.padded_batch_and_drop_remainder}
*   @{tf.contrib.data.parallel_interleave}
*   @{tf.contrib.data.rejection_resample}
*   @{tf.contrib.data.scan}
*   @{tf.contrib.data.shuffle_and_repeat}
*   @{tf.contrib.data.unbatch}

## Iterating over datasets

These methods create a @{tf.data.Iterator} from a `Dataset`.

*   @{tf.data.Dataset.make_initializable_iterator}
*   @{tf.data.Dataset.make_one_shot_iterator}

The `Iterator` class also contains static methods that create a @{tf.data.Iterator} that can be used with multiple `Dataset` objects.

*   @{tf.data.Iterator.from_structure}
*   @{tf.data.Iterator.from_string_handle}

## Extra functions from `tf.contrib.data`

*   @{tf.contrib.data.get_single_element}
*   @{tf.contrib.data.make_saveable_from_iterator}
*   @{tf.contrib.data.read_batch_features}