aboutsummaryrefslogtreecommitdiffhomepage
path: root/tensorflow/core
diff options
context:
space:
mode:
authorGravatar Derek Murray <mrry@google.com>2018-10-08 14:17:24 -0700
committerGravatar TensorFlower Gardener <gardener@tensorflow.org>2018-10-08 14:24:34 -0700
commit09b0fc199129e0f487a39741bdf674cf09035cbc (patch)
treede5935da3271c91bc88dd5700100082b4fc00673 /tensorflow/core
parent13b47e6c4f9d7b295948b1057139bf676e394b6f (diff)
[tf.data] Choose non-deterministic seed once per Python-level `Dataset` object.
This changes the behavior of randomness-introducing datasets (`tf.data.Dataset.shuffle()`, `tf.data.experimental.shuffle_and_repeat()`, and `tf.data.experimental.RandomDataset`). Previously, when you used the same `tf.data.Dataset` object multiple times in a pipeline (e.g. by zipping two datasets derived from the same randomness-introducing dataset) *and* you did not specify an explicit `seed`, the implementation would choose different non-deterministic seeds for each use of the `Dataset` object. With this change, the seed will be chosen once per `Dataset` (technically, once per `Dataset`-`Graph` combination, due to the vagaries of capturing state in `Dataset.make_one_shot_iterator()`), which means that all uses of the same dataset object will observe the same sequence of values. This change also revealed a small bug in how `Dataset.shuffle(..., reshuffle_each_iteration=False)` is serialized when an explicit seed is specified. The op-level seed was dropped, which could lead to non-deterministic behavior. This change fixes that issue by forwarding the op-level seed to the appropriate place. PiperOrigin-RevId: 216248013
Diffstat (limited to 'tensorflow/core')
-rw-r--r--tensorflow/core/kernels/data/shuffle_dataset_op.cc2
1 files changed, 1 insertions, 1 deletions
diff --git a/tensorflow/core/kernels/data/shuffle_dataset_op.cc b/tensorflow/core/kernels/data/shuffle_dataset_op.cc
index 66466d6a36..9f54c381a9 100644
--- a/tensorflow/core/kernels/data/shuffle_dataset_op.cc
+++ b/tensorflow/core/kernels/data/shuffle_dataset_op.cc
@@ -485,7 +485,7 @@ class ShuffleDatasetOp : public ShuffleDatasetOpBase {
int64 buffer_size, int64 seed, int64 seed2, int64 count)
: ShuffleDatasetBase(ctx, input, buffer_size, count),
seed_(seed),
- seed2_(seed) {}
+ seed2_(seed2) {}
string DebugString() const override {
return strings::StrCat("ShuffleDatasetOp(", buffer_size_, ", ", seed_,