[tf.data] Choose non-deterministic seed once per Python-level `Dataset` object. - tensorflow


author	Derek Murray <mrry@google.com>	2018-10-08 14:17:24 -0700
committer	TensorFlower Gardener <gardener@tensorflow.org>	2018-10-08 14:24:34 -0700
commit	09b0fc199129e0f487a39741bdf674cf09035cbc (patch)
tree	de5935da3271c91bc88dd5700100082b4fc00673 /tensorflow/contrib
parent	13b47e6c4f9d7b295948b1057139bf676e394b6f (diff)

[tf.data] Choose non-deterministic seed once per Python-level `Dataset` object.

This changes the behavior of randomness-introducing datasets (`tf.data.Dataset.shuffle()`, `tf.data.experimental.shuffle_and_repeat()`, and `tf.data.experimental.RandomDataset`). Previously, when you used the same `tf.data.Dataset` object multiple times in a pipeline (e.g. by zipping two datasets derived from the same randomness-introducing dataset) *and* you did not specify an explicit `seed`, the implementation would choose different non-deterministic seeds for each use of the `Dataset` object. With this change, the seed will be chosen once per `Dataset` (technically, once per `Dataset`-`Graph` combination, due to the vagaries of capturing state in `Dataset.make_one_shot_iterator()`), which means that all uses of the same dataset object will observe the same sequence of values.

This change also revealed a small bug in how `Dataset.shuffle(..., reshuffle_each_iteration=False)` is serialized when an explicit seed is specified. The op-level seed was dropped, which could lead to non-deterministic behavior. This change fixes that issue by forwarding the op-level seed to the appropriate place.

PiperOrigin-RevId: 216248013
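The seed-selection change described above can be illustrated with a small pure-Python model. This is a sketch under stated assumptions, not TensorFlow code: `ShuffledDataset`, `seed_once_per_object`, and `zipped_with_itself` are hypothetical names invented for this example, and the model ignores the per-`Graph` aspect of the real behavior.

```python
import random


class ShuffledDataset:
    """Toy stand-in for a shuffling dataset (hypothetical, not a tf.data class)."""

    def __init__(self, values, seed=None, seed_once_per_object=True):
        self._values = list(values)
        self._seed = seed
        # True models the behavior after this change, False the behavior before.
        self._seed_once_per_object = seed_once_per_object
        self._cached_seed = None

    def _resolve_seed(self):
        # An explicit seed is always used as given.
        if self._seed is not None:
            return self._seed
        # Old behavior: a fresh non-deterministic seed for every use.
        if not self._seed_once_per_object:
            return random.SystemRandom().getrandbits(63)
        # New behavior: choose a non-deterministic seed once and reuse it.
        if self._cached_seed is None:
            self._cached_seed = random.SystemRandom().getrandbits(63)
        return self._cached_seed

    def __iter__(self):
        # Each use of the dataset creates an independent iterator.
        rng = random.Random(self._resolve_seed())
        shuffled = list(self._values)
        rng.shuffle(shuffled)
        return iter(shuffled)


def zipped_with_itself(dataset):
    """Uses the same dataset object twice, like zipping a dataset with itself."""
    return list(zip(dataset, dataset))


new_style = ShuffledDataset(range(10))
old_style = ShuffledDataset(range(10), seed_once_per_object=False)
print(zipped_with_itself(new_style))  # both components follow the same order
print(zipped_with_itself(old_style))  # components are usually ordered differently
```

In this model, every pair produced from `new_style` has equal components because both uses share one cached seed, while `old_style` draws a new seed per use and so usually yields mismatched pairs.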

Diffstat (limited to 'tensorflow/contrib')

0 files changed, 0 insertions, 0 deletions

