path: root/tensorflow/python/data
Commit message (Author, Age)
* [tf.data] `Dataset.make_one_shot_iterator()` inherits the random seed from the calling graph. (Derek Murray, 2018-10-10)
| This change makes a subtle difference to the behavior of existing programs that create multiple iterators. Previously, one-shot iterators would not inherit the graph seed, and so their values would be non-deterministic (unless explicit seeds were set). After this change, an iterator will inherit its seed from the outer graph. Multiple one-shot iterators created from the same dataset will inherit different seeds, matching the semantics of creating multiple ops with the same graph seed.
| PiperOrigin-RevId: 216532256
* Automated rollback of commit 950cf87104bfee28e2165fe368f66337b8a1336d (A. Unique TensorFlower, 2018-10-10)
| PiperOrigin-RevId: 216500702
* [tf.data vectorization] Add vectorizer for `Add` op (Rachel Lim, 2018-10-09)
| PiperOrigin-RevId: 216424512
* [tf.data] Lift parameterized test parameters into lambdas if they create TF ops. (Derek Murray, 2018-10-09)
| The existing code triggers parts of the TensorFlow runtime that may not have been fully initialized at the time the parameters are evaluated. Lifting into a lambda and invoking the lambda inside the test method will achieve the proper order.
| PiperOrigin-RevId: 216419757
* [tf.data] NUMA-aware MapAndBatch dataset. (Brennan Saeta, 2018-10-09)
| PiperOrigin-RevId: 216395709
* [tf.data vectorization] Handle captured inputs in MapVectorization optimization (Rachel Lim, 2018-10-09)
| PiperOrigin-RevId: 216381943
* Improve docstring for tf.data.Dataset.shuffle() (A. Unique TensorFlower, 2018-10-09)
| PiperOrigin-RevId: 216370329
* Fix the seeding for `Dataset.shuffle(..., reshuffle_each_iteration=False)`. (Derek Murray, 2018-10-08)
| Previously, we were passing the first (graph-level) seed for both the graph-level and op-level seeds when creating a C++ dataset. This change passes the op-level seed to the appropriate point, and adds a test for the behavior with graph-but-not-op-level seeds.
| PiperOrigin-RevId: 216280641
* Simple comment fix in CheckpointInputPipelineHook. (Ruoxin Sang, 2018-10-08)
| PiperOrigin-RevId: 216260216
* Automated rollback of commit 09b0fc199129e0f487a39741bdf674cf09035cbc (Derek Murray, 2018-10-08)
| PiperOrigin-RevId: 216256115
* [tf.data] Choose non-deterministic seed once per Python-level `Dataset` object. (Derek Murray, 2018-10-08)
| This changes the behavior of randomness-introducing datasets (`tf.data.Dataset.shuffle()`, `tf.data.experimental.shuffle_and_repeat()`, and `tf.data.experimental.RandomDataset`). Previously, when you used the same `tf.data.Dataset` object multiple times in a pipeline (e.g. by zipping two datasets derived from the same randomness-introducing dataset) *and* you did not specify an explicit `seed`, the implementation would choose different non-deterministic seeds for each use of the `Dataset` object. With this change, the seed will be chosen once per `Dataset` (technically, once per `Dataset`-`Graph` combination, due to the vagaries of capturing state in `Dataset.make_one_shot_iterator()`), which means that all uses of the same dataset object will observe the same sequence of values.
| This change also revealed a small bug in how `Dataset.shuffle(..., reshuffle_each_iteration=False)` is serialized when an explicit seed is specified. The op-level seed was dropped, which could lead to non-deterministic behavior. This change fixes that issue by forwarding the op-level seed to the appropriate place.
| PiperOrigin-RevId: 216248013
* [tf.data] Adding specialization for `MapDataset`, `ParallelMapDataset`, and `MapAndBatchDataset` whose user-provided functions have the property that each output argument takes its value directly from an input argument (e.g. `lambda x, y: (y, x)`). (Jiri Simsa, 2018-10-08)
| This specialization can produce the result without having to schedule the function using the executor.
| PiperOrigin-RevId: 216206232
* Fix typo (Makoto Uchida, 2018-10-08)
| PiperOrigin-RevId: 216203408
* Automated rollback of commit ae0bc6f006497cc04a2ee75166d4ec71c7154fd8 (Jiri Simsa, 2018-10-05)
| PiperOrigin-RevId: 215969360
* [tf.data] Adding specialization for `MapDataset`, `ParallelMapDataset`, and `MapAndBatchDataset` whose user-provided functions have the property that each output argument takes its value directly from an input argument (e.g. `lambda x, y: (y, x)`). (Jiri Simsa, 2018-10-05)
| This specialization can produce the result without having to schedule the function using the executor.
| PiperOrigin-RevId: 215957592
* Automated rollback of commit 6b538d9ce54e878576131cde0c76e43a893180c2 (Smit Hinsu, 2018-10-04)
| PiperOrigin-RevId: 215808649
* [tf.data] Add a notion of `captured args` to MapDefun (Rachel Lim, 2018-10-04)
| PiperOrigin-RevId: 215788485
* [tf.data] Clean up tests for `tf.data.experimental`. (Derek Murray, 2018-10-04)
| This change splits up large test files into smaller ones, and re-enables tests that were disabled for obsolete reasons.
| PiperOrigin-RevId: 215785396
* Add ability to vectorize nodes that do not derive from function arguments. (Rachel Lim, 2018-10-04)
| (This indirectly handles "Const" outputs automagically, since they are always unstacked.)
| PiperOrigin-RevId: 215749824
* Automated rollback of commit 70a395f9795a48c21bc35cdf1dc44778f73a7bba (A. Unique TensorFlower, 2018-10-04)
| PiperOrigin-RevId: 215710849
* [tf.data] Fix bug in `tf.data.experimental.unbatch()`. (Derek Murray, 2018-10-03)
| Previously, if the rank of the input to this transformation was statically unknown, we would erroneously report that the output is a scalar, and violate downstream shape integrity checks. Instead, in that case the output shape should be unknown.
| PiperOrigin-RevId: 215683027
* Update size of multi_device_iterator_test to medium to fix timeouts (Smit Hinsu, 2018-10-03)
| PiperOrigin-RevId: 215637785
* [data-stats] Sets user-given `tag` and `counter_prefix` with `set_stats_aggregator`. (Shivani Agrawal, 2018-10-03)
| `tag` is prepended to all the statistics recorded as summaries, and `counter_prefix` sets the prefix for the statistics recorded as counters.
| Note: `counter` defaults to `\tensorflow`, and `tag` and `prefix` are associated with the dataset (not the stats_aggregator).
| PiperOrigin-RevId: 215609159
* [tf.data] Fix noisy warning. (Jiri Simsa, 2018-10-03)
| PiperOrigin-RevId: 215607171
* Automated rollback of commit 51b266fba181dffb6b3f9207280cde6b7670dd90 (Jiri Simsa, 2018-10-03)
| PiperOrigin-RevId: 215593867
* [tf.data] Fix noisy warning. (Jiri Simsa, 2018-10-03)
| PiperOrigin-RevId: 215592456
* Automated rollback of commit b7e9cbab27c893283acc4a6154d7a59dffb23758 (Derek Murray, 2018-10-02)
| PiperOrigin-RevId: 215503549
* Use `defun` instead of `Defun` for `tf.data`, except for `make_one_shot_iterator`, which is to be deprecated in the future. (Shivani Agrawal, 2018-10-02)
| PiperOrigin-RevId: 215491729
* Internal change. (Revan Sopher, 2018-10-02)
| PiperOrigin-RevId: 215473351
* [tf.data] Adding `tf.data.Options()`, `tf.data.Dataset.options()`, and `tf.data.Dataset.with_options()` to make it possible to respectively represent, get, and set options, such as optimization configuration, of a tf.data input pipeline. (Jiri Simsa, 2018-10-01)
| PiperOrigin-RevId: 215310764
* [tf.data] Deprecate `tf.contrib.data` and introduce `tf.data.experimental` to replace it. (Derek Murray, 2018-10-01)
| This change prepares `tf.data` for TensorFlow 2.0, where `tf.contrib` will no longer exist. It retains the pre-existing endpoints in `tf.contrib.data` with deprecation warnings.
| Note there are some exceptions to the move:
|   * Deprecated symbols in `tf.contrib.data` have not been moved to `tf.data.experimental`, because replacements already exist.
|   * `tf.contrib.data.LMDBDataset` has not been moved, because we plan to move it to a SIG-maintained repository.
|   * `tf.contrib.data.assert_element_shape()` has not yet been moved, because it depends on functionality in `tf.contrib`, and it will move in a later change.
|   * `tf.contrib.data.AUTOTUNE` has not yet been moved, because we have not yet determined how to `tf_export()` a Python integer.
|   * The stats-related API endpoints have not yet appeared in a released version of TensorFlow, so these are moved to `tf.data.experimental` without retaining an endpoint in `tf.contrib.data`.
| In addition, this change includes some build rule and ApiDef refactoring:
|   * Some of the "//third_party/tensorflow/python:training" dependencies had to be split in order to avoid a circular dependency.
|   * The `tf.contrib.stateless` ops now have a private core library for the generated wrappers (and accordingly are hidden in their ApiDef) so that `tf.data.experimental.sample_from_datasets()` can depend on them.
| PiperOrigin-RevId: 215304249
* Automated rollback of commit d78595d333c9b5c8a0705ba6852c08b107d6c462 (A. Unique TensorFlower, 2018-09-29)
| PiperOrigin-RevId: 215073584
* Make cuda_py_test create a gpu and cpu target. (A. Unique TensorFlower, 2018-09-29)
| Currently, we run tests on machines with GPUs based on the "gpu" tag, and the tests automatically adapt to whether a GPU is available. Creating two targets, one tagged with "gpu" and one not, will make us run the tests in both modes.
| PiperOrigin-RevId: 215045035
* [tf.data] Merged contrib.data's DatasetTestBase with the DatasetTestBase in core (and added that as a base class for all the contrib tests). (Rachel Lim, 2018-09-28)
| Also changed the assertDatasetsEqual functions so they are both graph and eager compatible (took the code from CSVDatasetTest) :)
| PiperOrigin-RevId: 215004892
* [tf.data] Introducing tf.data.Dataset.reduce() which reduces elements of a (finite) dataset to a single element. (Jiri Simsa, 2018-09-27)
| PiperOrigin-RevId: 214852364
* [tf.data] Minor refactoring of tf.data tests. (Jiri Simsa, 2018-09-27)
| PiperOrigin-RevId: 214781794
* [tf.data] Adding a private method for (recursively) tracking dataset inputs. (Jiri Simsa, 2018-09-25)
| PiperOrigin-RevId: 214495925
* Disabling MultiDeviceIterator in Eager mode. Support is coming soon. (Rohan Jain, 2018-09-24)
| PiperOrigin-RevId: 214296771
* [tf.data] Add `tf.contrib.data.Optional` support to `Structure`. (Derek Murray, 2018-09-23)
| This change switches `tf.contrib.data.Optional` to use a `Structure` class to represent the structure of its value, instead of `output_types`, `output_shapes`, and `output_classes` properties. It adds support for nesting `Optional` objects and representing their structure.
| This change also makes a modification to the `Structure` class: `Structure.is_compatible_with(x)` now takes another `Structure` as the `x` argument, instead of a value. This makes it easier to work with nested structures (where we might not have a value readily available), and better matches the interface of other `is_compatible_with()` methods (e.g. in `tf.TensorShape` and `tf.DType`).
| Finally, in the process of making this change, I observed possible crash-failures when a DT_VARIANT tensor containing another DT_VARIANT tensor is copied between CPU and GPU. This change "fixes" the immediate problem by raising an UnimplementedError, but more work will be necessary to support the full range of use cases.
| PiperOrigin-RevId: 214198993
* Moving MultiDeviceIterator from contrib to core. (Rohan Jain, 2018-09-23)
| PiperOrigin-RevId: 214173896
* Merge pull request #22170 from Smokrow:patch-1 (TensorFlower Gardener, 2018-09-21)
|\  PiperOrigin-RevId: 214058098
| * Fix lint errors (Martin Wicke, 2018-09-21)
| |
* | [tf.data] Add a test for state persistence between iterators over the same MapDataset. (Derek Murray, 2018-09-18)
| | PiperOrigin-RevId: 213555982
| * Moved example and changed wording (Moritz Kröger, 2018-09-18)
| |
* | [tf.data] Introducing `tf.data.Dataset.window(size, shift, stride, drop_remainder)`, which can be used for combining elements of input dataset into "windows". (Jiri Simsa, 2018-09-17)
| | A window is itself a finite dataset and, among other things, can be used for generalized batching (see https://github.com/tensorflow/community/pull/5 for details).
| | PiperOrigin-RevId: 213360134
* | Move from deprecated self.test_session() to self.cached_session(). (A. Unique TensorFlower, 2018-09-17)
| | self.test_session() has been deprecated in 9962eb5e84b15e309410071b06c2ed2d6148ed44 as its name confuses readers of the test. Moving to cached_session() instead which is more explicit about:
| |   * the fact that the session may be reused.
| |   * the session is not closed even when doing a "with self.test_session()" statement.
| | PiperOrigin-RevId: 213326167
* | Automated rollback of commit d31f360e1574553ed23b8d483512a2065ac426eb (A. Unique TensorFlower, 2018-09-11)
| | PiperOrigin-RevId: 212551965
| * Update of flat_map (Smokrow, 2018-09-11)
| | Rework based on Mark's review
| * added example for flat_map (Smokrow, 2018-09-11)
| |
* | Move from deprecated self.test_session() to self.cached_session(). (A. Unique TensorFlower, 2018-09-10)
| | self.test_session() has been deprecated in 9962eb5e84b15e309410071b06c2ed2d6148ed44 as its name confuses readers of the test. Moving to cached_session() instead which is more explicit about:
| |   * the fact that the session may be reused.
| |   * the session is not closed even when doing a "with self.test_session()" statement.
| | PiperOrigin-RevId: 212336464