path: root/tensorflow/contrib/data/python/ops/BUILD
* [tf.data] Deprecate `tf.contrib.data` and introduce `tf.data.experimental` to replace it. (Derek Murray, 2018-10-01)

  This change prepares `tf.data` for TensorFlow 2.0, where `tf.contrib` will no longer exist. It retains the pre-existing endpoints in `tf.contrib.data` with deprecation warnings. Note that there are some exceptions to the move:

  * Deprecated symbols in `tf.contrib.data` have not been moved to `tf.data.experimental`, because replacements already exist.
  * `tf.contrib.data.LMDBDataset` has not been moved, because we plan to move it to a SIG-maintained repository.
  * `tf.contrib.data.assert_element_shape()` has not yet been moved, because it depends on functionality in `tf.contrib`, and it will move in a later change.
  * `tf.contrib.data.AUTOTUNE` has not yet been moved, because we have not yet determined how to `tf_export()` a Python integer.
  * The stats-related API endpoints have not yet appeared in a released version of TensorFlow, so these are moved to `tf.data.experimental` without retaining an endpoint in `tf.contrib.data`.

  In addition, this change includes some build rule and ApiDef refactoring:

  * Some of the "//third_party/tensorflow/python:training" dependencies had to be split in order to avoid a circular dependency.
  * The `tf.contrib.stateless` ops now have a private core library for the generated wrappers (and accordingly are hidden in their ApiDef) so that `tf.data.experimental.sample_from_datasets()` can depend on them.

  PiperOrigin-RevId: 215304249
* [tf.data] Move `tf.contrib.data` C++ code to a core "experimental" directory. (Derek Murray, 2018-09-28)

  NOTE: All ops and kernels previously defined in tensorflow/contrib/data have had their names prefixed with "Experimental" to indicate that they are not (yet) stable, and thus not subject to backwards or forwards compatibility guarantees.

  PiperOrigin-RevId: 214940819
* [tf.data] Changes the default `prefetch` buffer size of `make_batched_features_dataset` and `make_tf_record_dataset` to auto-tune (from 1). (Shivani Agrawal, 2018-09-13)
  PiperOrigin-RevId: 212900920
* [tf.data] Adds a test for `ParseExampleDataset` serialization. (Shivani Agrawal, 2018-08-24)
  PiperOrigin-RevId: 210135480
* [tf.data] Add highly experimental IndexedDatasets. (Brennan Saeta, 2018-08-22)
  PiperOrigin-RevId: 209864144
* [tf.data] Implements the dataset transformation `parse_example_dataset(..)`, which will replace `dataset.map(parsing_ops.parse_example(..))`. (Shivani Agrawal, 2018-08-22)
  PiperOrigin-RevId: 209836033
* Add an (internal) `MapFn` op that maps a function over batches of inputs. (Rachel Lim, 2018-08-08)
  PiperOrigin-RevId: 207935291
* [tf.data] Adding `tf.contrib.data.reduce_dataset`, which can be used to reduce a dataset to a single element. (Jiri Simsa, 2018-07-19)
  PiperOrigin-RevId: 205281140
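  The reduce semantics described in this entry can be sketched in plain Python. This is a hypothetical stand-in for the tf.data op, not its actual implementation; the function name and signature are illustrative only:

  ```python
  def reduce_dataset(dataset, reduce_fn, initial_state):
      """Fold every element of `dataset` into a single final state."""
      state = initial_state
      for element in dataset:
          state = reduce_fn(state, element)
      return state

  # Sum the elements of a small stand-in "dataset".
  total = reduce_dataset(range(5), lambda state, x: state + x, 0)  # 0+1+2+3+4 = 10
  ```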
* [tf.data] Experimental transformations for windowing and batching of windows. (Jiri Simsa, 2018-07-03)

  This change introduces the `window` tf.data transformation, which can be used to create windows of elements (represented as a dataset) from a dataset. This transformation enables applying different batching logic to different components of a dataset. To illustrate the benefits of the transformation, this CL also introduces transformations for batching and padded batching of windows of both dense and sparse tensors. Notably, padded batching of sparse tensors was previously not possible.

  PiperOrigin-RevId: 203179522
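  The windowing semantics can be sketched in plain Python. In this hedged sketch, plain lists stand in for the nested datasets that the real transformation produces, and the `shift` parameter is an illustrative assumption:

  ```python
  def window(dataset, size, shift=None):
      """Yield successive windows of up to `size` elements.

      Lists stand in for the nested per-window datasets; a trailing
      partial window is yielded as-is.
      """
      shift = shift or size
      elements = list(dataset)
      for start in range(0, len(elements), shift):
          chunk = elements[start:start + size]
          if chunk:
              yield chunk

  windows = list(window(range(7), size=3))  # [[0, 1, 2], [3, 4, 5], [6]]
  ```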
* [tf.data] Cleanup of `tf.contrib.data`, properly exporting the public API. (Jiri Simsa, 2018-06-21)
  PiperOrigin-RevId: 201542140
* [tf.data] Improve the error message for `Dataset.padded_batch()`. (Derek Murray, 2018-06-08)

  Previously, we accepted the `padded_shapes` argument without validating that it was compatible with `input_dataset.output_shapes`. In many cases, we have enough static shape information to do this, and so we now raise an actionable error at the point where the mistake is made, rather than at runtime.

  PiperOrigin-RevId: 199800348
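  The kind of static compatibility check described above can be sketched in plain Python. This is an illustrative model of the validation idea, not TensorFlow's actual shape-inference code; `None` models an unknown dimension:

  ```python
  def shapes_compatible(element_shape, padded_shape):
      """Check a requested padded shape against a static element shape.

      `None` means "unknown" (element side) or "pad to the longest"
      (padded side); each known padded dimension must be at least as
      large as the corresponding element dimension.
      """
      if len(element_shape) != len(padded_shape):
          return False
      return all(
          p is None or e is None or p >= e
          for e, p in zip(element_shape, padded_shape)
      )

  shapes_compatible((3, 4), (None, 4))  # OK: pad dim 0 to the longest
  shapes_compatible((3, 4), (2, 4))     # error case: cannot pad 3 down to 2
  ```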
* [data-stats] Adds support for collecting `features` and `feature-values` statistics from the `Example` records of a dataset. (Shivani Agrawal, 2018-06-07)

  This changelist also applies the transformation function `feature_stats()` in `make_batched_features_dataset()` by default, collecting stats into an associated stats_aggregator (if any).

  PiperOrigin-RevId: 199718439
* [tf.data] Input pipeline rewrites prototype. (Jiri Simsa, 2018-06-03)

  This CL:
  - adds the `tf.contrib.data.optimize()` transformation, which can be used to trigger rewrite-based optimization of the input pipeline.
  - adds the `tf.data.Dataset._as_serialized_graph()` method, which returns the serialized graph representation of the dataset.

  PiperOrigin-RevId: 199068055
* Merge changes from github. (Yifei Feng, 2018-05-24)

  Revert #18413: too many internal test failures due to the name scope change caused by this change. Revert #18192: cannot use re2::StringPiece internally; need an alternative for the set call. Will pull in and clean this up in a separate change.

  PiperOrigin-RevId: 197991247
* Add a hook for checkpointing the input pipeline while training with Estimator. (Saurabh Saxena, 2018-05-11)
  PiperOrigin-RevId: 196331223
* Prototype for the tf.data writer API. (Jiri Simsa, 2018-04-19)
  PiperOrigin-RevId: 193534333
* [tf.data] Add an API for randomly sampling from multiple datasets. (Derek Murray, 2018-04-16)

  Fixes #15999.

  PiperOrigin-RevId: 193152683
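  The behavior of such a sampling transformation can be sketched in plain Python. This is a hedged model of the idea (pick a source at random per step, weighted, until all sources are exhausted), with illustrative names rather than the actual op:

  ```python
  import random

  def sample_from_datasets(sources, weights, seed=None):
      """Interleave elements by randomly choosing a live source per step."""
      rng = random.Random(seed)
      iterators = [iter(s) for s in sources]
      live = list(range(len(iterators)))
      while live:
          idx = rng.choices(live, weights=[weights[i] for i in live])[0]
          try:
              yield next(iterators[idx])
          except StopIteration:
              live.remove(idx)  # source exhausted; stop sampling from it

  mixed = list(sample_from_datasets([["a"] * 3, ["b"] * 3], [0.5, 0.5], seed=0))
  ```

  Every element of every source appears exactly once; only the interleaving order is random.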
* Merge changes from github. (Scott Zhu, 2018-04-13)
  PiperOrigin-RevId: 192850372
* [tf.data] Clean up //tensorflow/contrib/data/python/ops/BUILD. (Derek Murray, 2018-04-12)

  Create separate targets for each submodule, so that each test can depend on the appropriate subset of Python files.

  PiperOrigin-RevId: 192679856
* Remove all_opensource_files. It's not needed anymore. (Martin Wicke, 2018-03-28)
  PiperOrigin-RevId: 190878279
* [tf.data] Usability improvements to `tf.contrib.data.make_csv_dataset`. (Rachel Lim, 2018-03-26)
  PiperOrigin-RevId: 190489086
* [tf.data] Add `tf.contrib.data.prefetch_to_device()`, which supports prefetching to GPU memory. (Derek Murray, 2018-03-22)
  PiperOrigin-RevId: 190158272
* Merge changes from github. (Jacques Pienaar, 2018-03-21)
  PiperOrigin-RevId: 189945839
* Automated g4 rollback of changelist 189231636. (A. Unique TensorFlower, 2018-03-15)
  PiperOrigin-RevId: 189258641
* Merge changes from github. (Jacques Pienaar, 2018-03-15)
  PiperOrigin-RevId: 189231636
* Added tf.contrib.data.make_csv_dataset as a convenience method to create a dataset from CSV files. (Rachel Lim, 2018-03-13)
  PiperOrigin-RevId: 188954555
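  The convenience this entry describes, turning CSV rows into batches of named columns, can be sketched in plain Python. This is a hedged, minimal model of the idea (function name and batching behavior are illustrative, not the actual API):

  ```python
  import csv
  import io

  def make_csv_dataset(csv_text, batch_size):
      """Yield batches as dicts mapping column name -> list of values."""
      rows = list(csv.DictReader(io.StringIO(csv_text)))
      for start in range(0, len(rows), batch_size):
          batch = rows[start:start + batch_size]
          yield {name: [row[name] for row in batch] for name in batch[0]}

  data = "x,y\n1,a\n2,b\n3,c\n"
  batches = list(make_csv_dataset(data, batch_size=2))
  # batches[0] == {"x": ["1", "2"], "y": ["a", "b"]}
  ```

  The real API additionally infers column types and handles files rather than in-memory text; this sketch keeps everything as strings.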
* Added tf.contrib.data.make_batched_features_dataset as a replacement for tf.contrib.learn.io.read_batch_features. Added a warning about the deprecation of tf.contrib.data.read_batch_features. (Katherine Wu, 2018-03-07)
  PiperOrigin-RevId: 188240046
* [tf.data] Add a `num_parallel_reads` argument to `tf.data.TFRecordDataset`. (Derek Murray, 2018-02-28)

  This provides a convenient way to use the `tf.contrib.data.parallel_interleave()` idiom for reading multiple TFRecord files in parallel. In addition, the `filenames` argument to the initializer can now be a `tf.data.Dataset` of strings, which makes it easier to use `TFRecordDataset` with `Dataset.list_files()`.

  PiperOrigin-RevId: 187384812
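  The interleave idiom referenced above can be sketched in plain Python: draw one record at a time, round-robin, from up to `cycle_length` open sources, opening the next source as one is exhausted. A hedged model only; names are illustrative:

  ```python
  def interleave_readers(file_contents, cycle_length):
      """Round-robin interleave records from up to `cycle_length` sources.

      `file_contents` is a list of record lists, standing in for files.
      """
      iterators = [iter(records) for records in file_contents[:cycle_length]]
      pending = file_contents[cycle_length:]
      while iterators:
          for it in list(iterators):
              try:
                  yield it.__next__()
              except StopIteration:
                  iterators.remove(it)            # source exhausted
                  if pending:                     # open the next one, if any
                      iterators.append(iter(pending.pop(0)))

  records = list(interleave_readers([[1, 2], [3, 4], [5, 6]], cycle_length=2))
  # [1, 3, 2, 4, 5, 6]
  ```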
* [tf.data] Add experimental ability to override the function threadpool. (Derek Murray, 2018-02-22)

  The purpose of this feature is to enable experimentation with differentiating the CPU resources available to different stages of a `tf.data` pipeline. As a concrete example, we might use the new feature to move all input-related work from the inter-op threadpool onto a separate threadpool, leaving the inter-op threadpool free to execute higher-priority work (such as dispatching ops that send tensors to an accelerator).

  The current implementation only allows users to create fixed-size `tensorflow::ThreadPool` resources, but we could imagine opening up this API to allow custom threadpools as well.

  PiperOrigin-RevId: 186614315
* [tf.data] Remove the deprecated `tf.contrib.data.Dataset` class. (Derek Murray, 2018-02-09)

  This change removes the following class:
  * `tf.contrib.data.Dataset`

  IF THIS BREAKS YOU: Replace `tf.contrib.data.Dataset` with `tf.data.Dataset` when constructing a dataset. Note that you may have to modify downstream transformations to use the core API. See "tensorflow/contrib/data/README.md" for details of how to update your code to use the core API.

  PiperOrigin-RevId: 185197005
* [tf.data] Move the C++ code backing `tf.contrib.data.ignore_errors()` to contrib. (Derek Murray, 2018-02-07)

  This change moves the `OpKernel` and `DatasetBase` implementations to "tensorflow/contrib/data/kernels", where they are packaged as a custom op library. This demonstrates (and enforces by continuous integration) the ability to build a C++ Dataset implementation in a custom op library. Other contrib Dataset implementations will move in subsequent changes.

  PiperOrigin-RevId: 184938885
* Add prefetching into parallel_interleave. (Brennan Saeta, 2017-12-20)

  This change adds two parameters to parallel_interleave:
  - prefetch_input_elements: determines the number of iterators to prefetch, allowing buffers to warm up and data to be prefetched without blocking the main thread (i.e. the GetNext() call).
  - buffer_output_elements: in order to avoid creating thousands of threads, we fuse in the .prefetch() operator as an additional parameter. The value of this parameter is identical to the value passed to `.prefetch()`.

  PiperOrigin-RevId: 179726088
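  The core prefetching idea, a background producer filling a bounded buffer so the consumer rarely blocks, can be sketched in plain Python with a thread and a queue. A minimal hedged model, not TensorFlow's implementation:

  ```python
  import queue
  import threading

  def prefetch(generator, buffer_size):
      """Run `generator` on a background thread, buffering up to
      `buffer_size` elements ahead of the consumer."""
      q = queue.Queue(maxsize=buffer_size)
      done = object()  # sentinel marking end of the stream

      def producer():
          for item in generator:
              q.put(item)  # blocks when the buffer is full
          q.put(done)

      threading.Thread(target=producer, daemon=True).start()
      while True:
          item = q.get()
          if item is done:
              return
          yield item

  out = list(prefetch(iter(range(5)), buffer_size=2))  # [0, 1, 2, 3, 4]
  ```

  The bounded queue is what corresponds to the buffer-size parameters above: it caps how far production can run ahead of consumption.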
* [tf.data] Add a transformation for getting the unique elements of a dataset. (Derek Murray, 2017-12-20)
  PiperOrigin-RevId: 179625211
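  The semantics of such a uniqueness transformation are simple to sketch in plain Python (an illustrative model, not the actual op):

  ```python
  def unique(dataset):
      """Yield each distinct element the first time it appears,
      preserving first-occurrence order."""
      seen = set()
      for element in dataset:
          if element not in seen:
              seen.add(element)
              yield element

  result = list(unique([1, 2, 1, 3, 2]))  # [1, 2, 3]
  ```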
* Add RandomDataset, which generates pseudorandom numbers of type int64. (Saurabh Saxena, 2017-11-29)

  Also add tf.contrib.data.shuffle_and_repeat, which reshuffles its input on each epoch. Going forward, this will replace reshuffle_each_iteration=True.

  PiperOrigin-RevId: 177339570
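  The "reshuffle on each epoch" behavior can be sketched in plain Python. A hedged model only: the real transformation streams and fuses shuffle with repeat, while this sketch simply materializes the elements:

  ```python
  import random

  def shuffle_and_repeat(dataset, count, seed=None):
      """Repeat `dataset` `count` times, drawing a fresh shuffle order
      for every epoch (rather than reusing one permutation)."""
      rng = random.Random(seed)
      elements = list(dataset)
      for _ in range(count):
          epoch = elements[:]
          rng.shuffle(epoch)
          yield from epoch

  out = list(shuffle_and_repeat(range(4), count=3, seed=0))
  ```

  Each consecutive group of four outputs is a (generally different) permutation of 0..3.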
* Add tf.contrib.data.Counter. (Eugene Brevdo, 2017-11-21)
  PiperOrigin-RevId: 176597546
* [tf.data] Add an experimental API for gathering statistics from an Iterator. (Derek Murray, 2017-11-16)
  PiperOrigin-RevId: 176057576
* Support sparse tensors as inputs and outputs for user-defined functions passed into tf.data transformations. (Jiri Simsa, 2017-11-13)
  PiperOrigin-RevId: 175559045
* Further BUILD cleanup in contrib/... (A. Unique TensorFlower, 2017-11-10)
  PiperOrigin-RevId: 175370768
* Automated g4 rollback of changelist 174912490. (A. Unique TensorFlower, 2017-11-10)
  PiperOrigin-RevId: 174961746
* [tf.data] Move contrib op registrations to tensorflow/contrib/data/ops. (Derek Murray, 2017-11-07)

  This change moves the contrib-level Dataset ops out of the core op libraries (where they were placed for short-term technical reasons) into a more appropriate location in contrib. This enables us to modify the signatures of these ops without being subject to the core backwards-compatibility requirements. (We currently must modify the backwards-compatibility ledger each time we make an allowed change to a contrib op.) For now, the kernel implementations remain in the core library, because they depend on headers that are not part of the public API.

  The change also removes some code testing contrib features in the core kernel_tests, since it relies on the contrib op registrations and was already adequately tested in the contrib tests.

  PiperOrigin-RevId: 174912490
* Adds a PrefetchWithFn op to contrib/data. (Rohan Jain, 2017-10-30)

  Along with the FunctionBufferingResource, this can be used to prefetch and fill up a buffer by making repeated function calls. Also fixes a TODO in the ProcessFLR implementation to respect alloc_attrs for Rendezvous calls.

  PiperOrigin-RevId: 173990680
* Generalizing sloppy_interleave, making sloppiness an option. (Jiri Simsa, 2017-10-27)
  PiperOrigin-RevId: 173687797
* BUILD dependency cleanup in contrib/... (A. Unique TensorFlower, 2017-10-26)
  PiperOrigin-RevId: 173613863
* Make Iterators saveable. (Saurabh Saxena, 2017-10-24)

  Add tf.contrib.data.make_saveable_from_iterator(iterator), which builds a SaveableObject for an iterator so that it can be saved/restored using tf.Saver.

  PiperOrigin-RevId: 173340191
* [tf.data] Add a new custom transformation: `tf.contrib.data.scan()`. (Derek Murray, 2017-10-09)

  `scan()` is similar to `Dataset.map()`, with the addition of a generic piece of state that is accumulated across the elements of the input, and that may be used in the computation of the output elements. This change also updates `rejection_resample()` to use `scan()` rather than a local `tf.ResourceVariable` for accumulating the number of times each class has been encountered.

  PiperOrigin-RevId: 171542274
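  The scan semantics described above, a map that threads an accumulated state across elements, can be sketched in plain Python (an illustrative model with assumed names, not the actual op):

  ```python
  def scan(dataset, initial_state, scan_fn):
      """Like map, but `scan_fn` also receives and returns a state.

      `scan_fn` maps (state, element) -> (new_state, output).
      """
      state = initial_state
      for element in dataset:
          state, output = scan_fn(state, element)
          yield output

  # Running sum over a small integer "dataset".
  running = list(scan(range(5), 0, lambda s, x: (s + x, s + x)))
  # [0, 1, 3, 6, 10]
  ```

  This is also how the `rejection_resample()` change uses it: the per-class counts live in the scanned state instead of in a mutable variable.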
* [tf.data] Internal minor code restructure. (A. Unique TensorFlower, 2017-10-02)
  PiperOrigin-RevId: 170787468
* [tf.data] Internal cleanup. (A. Unique TensorFlower, 2017-09-28)
  PiperOrigin-RevId: 170421375
* Internal core dataset restructuring. (A. Unique TensorFlower, 2017-09-28)
  PiperOrigin-RevId: 170379989
* Internal minor restructuring. (A. Unique TensorFlower, 2017-09-27)
  PiperOrigin-RevId: 170250936
* Move the Dataset API to tf.data. (A. Unique TensorFlower, 2017-09-22)

  This also modifies the semantics of `dataset.enumerate()`, `dataset.dense_to_sparse_batch()`, `dataset.ignore_errors()`, and `dataset.unbatch()`, which now return a transformation function from `Dataset` to `Dataset`. In addition, the APIs `tf.contrib.data.batch_and_drop_remainder()`, `tf.contrib.data.dense_to_sparse_batch()`, `tf.contrib.data.enumerate_dataset()`, `tf.contrib.data.group_by_window()`, `tf.contrib.data.ignore_errors()`, `tf.contrib.data.read_batch_features()`, `tf.contrib.data.sloppy_interleave()`, and `tf.contrib.data.unbatch()` did not move to tf.data.

  PiperOrigin-RevId: 169746213