| Commit message (Collapse) | Author | Age |
|
In the common case of clean termination, we can avoid performing several atomic
operations and allocations.
PiperOrigin-RevId: 188339594
|
absl::string_view equivalents.
This will allow for a more convenient transition to absl::string_view.
Calls to StringPiece::set and StringPiece::clear were replaced with the StringPiece constructor as follows:
string_piece_foo.set(data, size) => string_piece_foo = StringPiece(data, size)
string_piece_foo.clear() => string_piece_foo = StringPiece()
PiperOrigin-RevId: 175326576
|
END_PUBLIC
---
Commit 9f8523640 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update ops-related pbtxt files.
PiperOrigin-RevId: 173145770
---
Commit 01b6b0638 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Cut tracing memory cost
PiperOrigin-RevId: 173144626
---
Commit 5e23e0e67 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Erase cloned instructions on the fly when merging fusion nodes.
This avoids the awkward situation where an RNG which is clearly eligible for fusion becomes ineligible mid-fusion because it suddenly has an extra (dead) user.
PiperOrigin-RevId: 173141716
---
Commit 1038927c0 authored by Saurabh Saxena<srbs@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add SerializeIterator op that serializes an IteratorResource into a variant tensor.
Add DeserializeIterator op that builds IteratorResource from a variant tensor.
Move BundleReaderWrapper and BundleWriterWrapper from dataset.h to iterator_ops.cc.
Add generic key-value store interfaces IteratorStateReader and IteratorStateWriter for reading/writing state of iterators.
Get rid of IteratorBundleReader and IteratorBundleWriter.
PiperOrigin-RevId: 173140858
---
Commit 57f3e529d authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Internal change
PiperOrigin-RevId: 173136642
---
Commit 0e56ffb7b authored by Shanqing Cai<cais@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix breakages in OSS builds
See example breakages logs at:
http://ci.tensorflow.org/job/tensorflow-cl-cpu-python3-pip/10847/console
http://ci.tensorflow.org/job/tensorflow-cl-gpu/11008/console
1. CL/172477381 added the no_oss tag to tests with oss_serial tags, which broke the logic of OSS_SERIAL tests in pip.sh and run_pip_test.sh. This CL fixes that.
2. The nccl_kernels BUILD target in contrib/nccl/BUILD was missing some dependencies. This CL adds the missing ones.
Fixes: #13918
PiperOrigin-RevId: 173133914
---
Commit 3ed049b67 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Allows calling keras layers in eager mode.
PiperOrigin-RevId: 173129805
---
Commit 4ec6f2b07 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Switching contrib.summaries API to be context-manager-centric
PiperOrigin-RevId: 173129793
---
Commit 03b02ffc9 authored by Justine Tunney<jart@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Put Bazel mirror URLs first
PiperOrigin-RevId: 173127955
---
Commit 46ab25e4d authored by David Majnemer<majnemer@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[XLA] Add support for convolutions with no spatial dimensions
PiperOrigin-RevId: 173126950
---
Commit fc56349b7 authored by Derek Murray<mrry@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
[tf.data] Convert dataset arguments to tensors as early as possible.
This change raises a `TypeError` earlier if (for example) the `batch_size`
argument to `Dataset.batch()` has the incorrect type.
PiperOrigin-RevId: 173126678
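A minimal sketch of the fail-early idea described above, written in plain Python rather than TensorFlow. The `Batch` class is hypothetical and is not part of `tf.data`; it only shows an argument being converted and checked at construction time, so that a wrong type raises `TypeError` at the call site instead of later, when elements are first pulled.

```python
import operator


class Batch:
    """Hypothetical stand-in for a batching transformation (not tf.data)."""

    def __init__(self, elements, batch_size):
        # Convert and validate the argument immediately, at construction time.
        # operator.index() accepts integers and raises TypeError for floats,
        # strings, None, and other non-integer values.
        self._batch_size = operator.index(batch_size)
        if self._batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self._elements = list(elements)

    def __iter__(self):
        # By the time iteration starts, batch_size is known to be valid.
        for start in range(0, len(self._elements), self._batch_size):
            yield self._elements[start:start + self._batch_size]
```

With this shape, `Batch(range(5), "2")` fails on the line that constructs it, which is usually the line the user needs to fix.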
---
Commit 4f7503a87 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
K-FAC: Support for registering multiple minibatches with register_fully_connected()
PiperOrigin-RevId: 173121735
---
Commit 2845bfcd6 authored by Tim Harley<tharley@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Avoid listing all modified Enter/RefEnter nodes on INFO, use VLOG(1) instead.
Leave a single, simple, message on INFO.
PiperOrigin-RevId: 173121726
---
Commit 434695921 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
K-FAC: _check_registration() supports multiple towers.
PiperOrigin-RevId: 173115870
---
Commit 670dddf4a authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Multi-minibatch support for
tf.contrib.kfac.fisher_blocks.FullyConnectedKFACBasicFB.
PiperOrigin-RevId: 173109677
---
Commit dc13a8e2f authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Fix import of meta graphs with partitioned variables into a scope.
Saver inspects SliceInfo to decide the variable name when creating a
checkpoint. Before this fix even if a partitioned variable ("weights")
was imported into a scope "a" it would still be checkpointed as ("weights")
instead of ("a/weights") since import_scoped_meta_graph was not adjusting
the SliceInfo.
WARNING: if you use import_meta_graph on graphs with partitioned_variables WITH an import_scope argument AND then create a Saver to write/read checkpoints this change
may break your checkpoint loading.
PiperOrigin-RevId: 173105796
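The naming effect described above can be sketched in plain Python. The helper names `scoped_name` and `checkpoint_keys` are hypothetical and do not correspond to TensorFlow functions; the sketch only illustrates how a variable name picks up the import scope as a prefix.

```python
def scoped_name(name, import_scope=None):
    """Return `name` prefixed with `import_scope`, e.g. "a/weights"."""
    if not import_scope:
        return name
    return import_scope.rstrip("/") + "/" + name


def checkpoint_keys(variable_names, import_scope=None):
    """Keys a hypothetical saver would write for the imported variables."""
    return [scoped_name(name, import_scope) for name in variable_names]
```

Under this scheme a checkpoint written with the key `weights` no longer matches a reader that looks up `a/weights`, which is the incompatibility the warning describes.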
---
Commit eea089bdb authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
K-FAC: Multi-tower support for ConvDiagonalFB.
PiperOrigin-RevId: 173105412
---
Commit 9b9cbbe2a authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add int64 Tperm type support for `Transpose` (#13909)
* Add int64 Tperm type support for `Transpose`
This fix adds int64 Tperm support for `Transpose`. In
`array_ops.cc`, `Transpose` and `ConjugateTranspose`
have been specified as accepting int32 and int64 perm
types. However, only int32 kernels have been registered.
This fix adds the int64 perm support by removing
the constraint on Tperm, resolving the type at runtime,
and copying the data type accordingly to correctly handle
the int64/int32 types.
Additional tests have been added as well.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add test cases for int64 of perm in Transpose.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add namespace to hide PermutationHelper
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Enable use_gpu=True for perm type test.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* extra // namespace annotation
* Adding a comment about int32 casting that should be safe.
Permutations only contain values that refer to dimensions, and the maximum number of dimensions we have is 254, so an int32 is always safe here.
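A small sketch of the reasoning in the last comment, in plain Python. The function name `checked_perm` is hypothetical, and the limit of 254 dimensions is taken from the message above; the point is only that a valid permutation holds values below the rank, so narrowing them to 32 bits cannot overflow.

```python
import operator

INT32_MAX = 2**31 - 1
MAX_DIMS = 254  # limit quoted in the commit message


def checked_perm(perm, rank):
    """Validate `perm` as a permutation of range(rank) and return it as ints."""
    if rank > MAX_DIMS:
        raise ValueError("rank exceeds the maximum number of dimensions")
    values = [operator.index(p) for p in perm]
    if sorted(values) != list(range(rank)):
        raise ValueError("perm must be a permutation of range(rank)")
    # Every entry is < rank <= 254, so it always fits in a signed 32-bit int.
    assert max(values, default=0) <= INT32_MAX
    return values
```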
---
Commit ac0004e71 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add int64 shape support on GPU for stateless random ops. (#13908)
* Add int64 shape support on GPU for stateless random ops.
This fix adds int64 shape support on GPU for stateless random ops
`StatelessRandomUniform`, `StatelessRandomNormal`, `StatelessTruncatedNormal`.
The int64 shape for stateless random ops is already supported on CPU
with int32/int64 processed properly through `MakeShape`.
However, on GPU a type constraint `.TypeConstraint<int32>("T")`
has been improperly added. Such a type constraint actually prevents
an int64 shape type from running on GPU. (For comparison, there is no type
constraint on CPU.)
This fix removes the type constraint and allows int64 shape to be run on GPU.
This fix also adds test cases for int64 shape support on stateless random ops.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add test cases for int64 shape support for stateless random ops.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add int32 to shape types tested.
---
Commit 0d437c3be authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add int64 padding support for MirrorPad (#13907)
* Add int64 padding support for MirrorPad
This fix adds int64 padding support for `MirrorPad`.
In `array_ops.cc`, `MirrorPad`/`MirrorPadGrad`
have been specified as supporting int64 padding. The related
kernels do not have the int64 padding registered though.
This fix adds the int64 padding support. This fix also adds
additional test cases for coverage.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Update template for CPU and GPU support of int64 paddings.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add int64 padding support for MirrorPad
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Put eigen header first like before, just in case.
---
Commit 690003cc0 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add `int64` type `multiples` support for `tf.tile` (#13884)
* Add `int64` type `multiples` support for `tf.tile`
In the doc of `tf.tile` (tf.tile.__doc__) both `int32`
and `int64` are supported for `multiples`. However, the kernel
for `int64` is not registered yet.
This fix adds the support of `int64` `multiples` so that the
behavior matches the description of the docs.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Update functors for int64 multiples support in `tf.tile`
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Update test cases for int64 of multiples in `tf.tile`
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add GPU and non GPU tests
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* format with clang-format -i
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Move Tmultiples after T (as it is auxiliary)
And use `use_gpu=True`
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit fd8d517b9 authored by Yunxing Dai<yunxing@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add tests for convolution 1D
RELNOTES: n/a
PiperOrigin-RevId: 173060283
---
Commit 40c475b48 authored by formath<jinpengliu@163.com>
Committed by Vijay Vasudevan<vrv@google.com>:
add segment_reduction_ops to tf_op_files (#13901)
---
Commit bfa4ec194 authored by Tayo Oguntebi<10927929+tayo@users.noreply.github.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Update node_def.proto comments (#13874)
The device field had outdated comments.
Note: We could consider adding tpu as an example here, e.g. "gpu" | "cpu" | "tpu". Thoughts?
---
Commit c9cb5a58d authored by formath<jinpengliu@163.com>
Committed by Vijay Vasudevan<vrv@google.com>:
protobuf lib path bug fix for benchmark on osx (#13878)
---
Commit 1c1dad105 authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Add int64 axis support for reduction ops. (#13891)
* Add int64 axis support for reduction ops.
This fix is a follow up to PR 13863. In PR 13863 the
program crash is fixed if int64 axis is passed to reduction ops,
e.g. reduce_sum, reduce_max, etc. However, 13863 does not
process the case of int64 support, it merely fixes the crash.
This fix adds the support for int64 axis of reduction ops.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add int64 axis support for mean, prod, sum
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add int64 axis support for min and max.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add int64 axis support for reduce_all and reduce_any
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add test cases for int64 axis support of reduce_any and reduce_all
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit 17096081e authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Improve resize_bicubic performance by reorganizing loops (#13840)
* Improve resize_bicubic performance by reorganizing loops
This fix tries to address the issue raised in 13693 where
performance of `resize_bicubic` is not on par with opencv.
This fix rearranges the loops so that it is the same for
num_channel=40 and num_channel=3:
Pre-fix:
```
CHANNEL=40
opencv: 145.08ms
tf: 314.26ms
CHANNEL=3
opencv: 11.95ms
tf: 8.95ms
```
Post-fix:
```
CHANNEL=40
opencv: 144.25ms
tf: 214.55ms
CHANNEL=3
opencv: 11.78ms
tf: 14.07ms
```
This fix fixes 13693.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Keep special handling of `num_channels=3` for `resize_bicubic`
This commit keeps special handling of `num_channels=3` for
`resize_bicubic`:
Without special handling:
```
opencv: 11.78ms
tf: 14.07ms
```
With special handling:
```
opencv: 11.74ms
tf: 9.46ms
```
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Expand Benchmark test for resize_bicubic
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Update from review feedback.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit b927df57f authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Update protobuf.cmake to b04e5cba356212e4e8c66c61bbe0c3a20537c5b9 (#13893)
This fix tries to address the issue raised in 8187 where
protobuf.cmake used a different version than bazel.
The discrepancy arose because a customized
protobuf was needed with a Windows patch. Since the patch has been
merged in (https://github.com/google/protobuf/pull/2203),
it makes sense to update protobuf.cmake so that the same version
of protobuf is used.
This fix fixes 8187.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit d1183ca6a authored by Vijay Vasudevan<vrv@google.com>
Committed by GitHub<noreply@github.com>:
Give each variable a unique name in accumulate_n_v2_eager_test. (#13886)
---
Commit a69945810 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Update pin for bazel-toolchains to latest version
PiperOrigin-RevId: 173002530
---
Commit 9d55c249c authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix doc in TF_CALL_ when invoked in mobile platform (#13881)
* Fix doc in TF_CALL_ when defined(IS_MOBILE_PLATFORM) && !defined(__ANDROID_TYPES_FULL__)
This is a small doc fix that includes bool as part of the types
that are supported on mobile (IS_MOBILE_PLATFORM && !__ANDROID_TYPES_FULL__),
as bool is clearly invoked in the following define.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Also add bool to android full version.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit ba49d8583 authored by Bjarke Hammersholt Roune<broune@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Slight change to reduce_test to avoid generating inf, which was triggering an inf detector unnecessarily.
PiperOrigin-RevId: 172965466
---
Commit 93e8f3c67 authored by Anna R<annarev@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adding Python ApiDef overrides.
PiperOrigin-RevId: 172960496
---
Commit 0d6a2e353 authored by Anna R<annarev@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Internal change.
PiperOrigin-RevId: 172960439
---
Commit 62df65c72 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add dtype argument to Mean and Accuracy object-oriented metrics.
PiperOrigin-RevId: 172957714
---
Commit d7409d32b authored by Simone Cirillo<my.accounts@gmx.se>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix import of spatial_softmax from tensorflow.contrib.layers (#13833)
---
Commit df8bce63d authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix crash when `int64` axis is passed to `tf.reduce_sum` (#13863)
* Fix crash when `int64` axis is passed to `tf.reduce_sum`
This fix tries to fix the crash triggered by `int64` axis passed
to `tf.reduce_sum`:
```
ubuntu@ubuntu:~/tensorflow2$ (cd && python)
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> v = tf.reduce_sum([1,2,3], tf.constant(0, tf.int64))
2017-10-20 15:55:06.993430: F tensorflow/core/framework/tensor.cc:601] Check failed: dtype() == expected_dtype (9 vs. 3)
ubuntu@ubuntu:~/tensorflow2$
```
The issue is caused by the fact that shape inference in `common_shape_fns.cc`
assumes int32 without proper handling of different types. In `math_ops.cc`
both int32 and int64 are mentioned.
NOTE that this fix does not address the issue that int64 is not supported.
Allowing an int64 axis takes more than adding a template in `ReductionOp`, as the type
of the axis seems to be decided in some other way in Eigen.
This fix merely fixes the crash so that an error message is returned without
exiting the python program: "No OpKernel was registered to support Op 'Sum' with these attrs".
Still, I think it's worth at least allowing the program to continue in case of an unsupported kernel.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Update implementation with a template helper function.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit 29c7b4658 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Adding the Stanford Tensorflow class to community resources.
PiperOrigin-RevId: 172956049
---
Commit f758b24a8 authored by Alexandre Passos<apassos@google.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Variable name for the eager test (#13873)
---
Commit a5fe66b15 authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Removed some unnecessary broadcasts in binary ops where only one input needs
broadcasting (which is a fairly common case, even in the fallback path).
PiperOrigin-RevId: 172950493
---
Commit c77090a0a authored by Yong Tang<yong.tang.github@outlook.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Fix issues where int64 crops could not be passed to batch_to_space. (#13862)
* Fix issues where int64 crops could not be passed to batch_to_space.
This fix tries to address the issue where int64 `crops` could
not be passed to `batch_to_space` even though both int32 and
int64 are specified as supported in the docs (tf.batch_to_space.__doc__)
The reason is that the BatchToSpace kernel constrains the `crops`
data type to int32.
This fix removes the constraint so that int64 `crops` can be supported.
NOTE: Just removing the constraint should work and it is not necessary
to add specification to the kernel class template, as `SubtleMustCopyFlat`
called in the class already correctly handled both int32 and int64 cases.
Besides, other data types (e.g., float or double) will not be passed to the
kernel as they are guarded by the specification in `array_ops.cc`.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Also remove int64/int32 type constraints for SpaceToBatch kernels
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Add test cases for int64 crops of batch_to_space and space_to_batch
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
* Fix test failures.
Signed-off-by: Yong Tang <yong.tang.github@outlook.com>
---
Commit 494837936 authored by Joshua V. Dillon<jvdillon@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make `tf.contrib.distributions` quadrature family accept a `Tensor` for
`quadrature_grid_and_probs` argument.
PiperOrigin-RevId: 172950094
---
Commit 9c825d32c authored by Jinze Bai<baijinze1994@163.com>
Committed by Vijay Vasudevan<vrv@google.com>:
Merge two GPU kernel launches into one in DiagOp. (#13859)
---
Commit c0ca50a47 authored by Yan Facai (???)<facai.yan@gmail.com>
Committed by Vijay Vasudevan<vrv@google.com>:
ENH: add Relu6GradGrad (#13268)
* ENH: add Relu6GradGrad
* TST: add test case
* CLN: import nn_grad
* TST: add init value
---
Commit 8ff33271e authored by Justin Lebar<jlebar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Dump the computation's SessionModule as part of the tf_compile rule.
PiperOrigin-RevId: 172946149
---
Commit ebcae4a5e authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add streaming_precision_recall_at_equal_thresholds
This helper method computes streaming tp, fp, tn, fn, precision, and recall for the user in a way that exhibits O(T + N) time and space complexity (instead of O(T * N)), where T is the number of thresholds and N is the size of the predictions tensor.
Thanks to Frank Chu for the efficient algorithm!
PiperOrigin-RevId: 172946073
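One way to reach O(T + N) for evenly spaced thresholds is to histogram the predictions once and then take suffix sums. The sketch below is plain Python and is not the TensorFlow implementation: the function name, the `>=` comparison against each threshold, and the convention that precision is 0.0 when nothing is predicted positive are all assumptions made for illustration.

```python
def pr_at_equal_thresholds(predictions, labels, num_thresholds):
    """Confusion counts at thresholds i / (num_thresholds - 1), i = 0..T-1.

    A prediction p counts as positive at threshold t when p >= t (assumed).
    """
    if num_thresholds < 2:
        raise ValueError("need at least two thresholds")
    last = num_thresholds - 1

    # O(N): drop each prediction into the bucket of the highest threshold
    # it still passes, split by whether the label is positive.
    tp_hist = [0] * num_thresholds
    fp_hist = [0] * num_thresholds
    for p, is_positive in zip(predictions, labels):
        if not 0.0 <= p <= 1.0:
            raise ValueError("predictions must lie in [0, 1]")
        bucket = int(p * last)
        if is_positive:
            tp_hist[bucket] += 1
        else:
            fp_hist[bucket] += 1

    # O(T): suffix sums turn per-bucket counts into counts at each threshold.
    tp = [0] * num_thresholds
    fp = [0] * num_thresholds
    running_tp = running_fp = 0
    for i in range(last, -1, -1):
        running_tp += tp_hist[i]
        running_fp += fp_hist[i]
        tp[i] = running_tp
        fp[i] = running_fp

    total_pos, total_neg = tp[0], fp[0]  # threshold 0 accepts everything
    fn = [total_pos - t for t in tp]
    tn = [total_neg - f for f in fp]
    precision = [t / (t + f) if t + f else 0.0 for t, f in zip(tp, fp)]
    recall = [t / total_pos if total_pos else 0.0 for t in tp]
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn,
            "precision": precision, "recall": recall}
```

A direct approach would compare every prediction against every threshold, which is the O(T * N) cost the message refers to; here each prediction is touched once and each threshold once.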
---
Commit ccfd9c1e5 authored by Sanjoy Das<sanjoy@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Log Hlo IR during AOT compilation
PiperOrigin-RevId: 172944165
---
Commit 985031a10 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Allows tfe.enable_eager_execution(device_policy=tfe.DEVICE_POLICY_WARN).
PiperOrigin-RevId: 172943398
---
Commit 703182d85 authored by Mingxing Tan<tanmingxing@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Add performance guide for fused decode_and_crop_jpeg optimization.
PiperOrigin-RevId: 172943116
---
Commit 66b1f4383 authored by Francois Chollet<fchollet@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Make Network compatible with eager mode. Currently it only allows instantiating a Network in eager mode using the regular Keras API and calling it on eager tensors.
PiperOrigin-RevId: 172942569
---
Commit 41df2cec2 authored by ashankar<ashankar@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Testing pending CL: 172939383
---
Commit 37fd95179 authored by Alexandre Passos<apassos@google.com>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
Simplifies capturing code in graph_callable to use recent function improvements.
PiperOrigin-RevId: 172937003
---
Commit d1e7382af authored by A. Unique TensorFlower<gardener@tensorflow.org>
Committed by TensorFlower Gardener<gardener@tensorflow.org>:
BEGIN_PUBLIC
Automated g4 rollback of changelist 172924803
PiperOrigin-RevId: 173347587
|
PiperOrigin-RevId: 162986106
|
PiperOrigin-RevId: 162809937
|
PiperOrigin-RevId: 162782660
|
PiperOrigin-RevId: 157837211
|
microseconds from milliseconds.
Change: 147764063
|
Change: 139591752
|
std::unordered_map and std::unordered_set.
Add default template argument of std::hash<Key> to gtl::FlatMap and gtl::FlatSet
to better match std::unordered_{map,set}
Improves performance on an RPC-intensive benchmark by ~0.4%
Change: 137754417
|
could be deleted by the abort.
Change: 129916678
|
Send/Recv paths:
o Allow SendOp and RecvOp implementations to directly use the string buffer
contained in a Rendezvous::ParsedKey object, rather than allocating their
own string object. Saves two allocations per Send/Recv pair.
o Use std::move in a few places to avoid copying a std::function object.
o Eliminated unused ParsedName variable declaration in
DeviceNameUtils::ParseLocalName.
Change: 127630066
|
done by models distributed across many devices. A small
microbenchmark model that runs two banks (A and B) of 30 nodes with a
30x30 full shuffle between them, where each of the nodes in A and in B
run with one node on each of the 30 devices (so 30*29+30+30, or ~930
separate RPCs) was showing ~111,000 allocations per iteration of the graph.
With the changes here, this is now down to ~64,300 allocations per iteration.
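The figures in this paragraph can be checked with a few lines of arithmetic. The per-RPC averages and the percentage reduction below are derived here for illustration and are not stated in the message; the allocation counts are the approximate values quoted above.

```python
devices = 30

# RPC count exactly as written in the message: 30*29 + 30 + 30.
rpcs = devices * (devices - 1) + devices + devices

allocs_before = 111_000  # approximate allocations per iteration, before
allocs_after = 64_300    # approximate allocations per iteration, after

# Rough averages, assuming allocations are spread evenly across RPCs.
per_rpc_before = allocs_before / rpcs
per_rpc_after = allocs_after / rpcs
reduction = 1 - allocs_after / allocs_before

print(rpcs)                        # 930
print(round(per_rpc_before, 1))    # 119.4
print(round(per_rpc_after, 1))     # 69.1
print(round(reduction * 100, 1))   # 42.1
```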
Changes include:
o DeviceContext::CopyDeviceTensorToCPU and related helper routines:
use StringPiece instead of const string& for the tensor name (avoids
creating a string in some cases where the caller only has a
StringPiece available).
o Change some Rendezvous and BaseRemoteRendezvous interfaces to
take a 'const Rendezvous::ParsedKey& key', rather than 'const string& key'.
In many cases, the callers were already having to parse the key
into a ParsedKey, and so we were doing the parsing multiple times at
different levels as we processed receiving or sending of a tensor. This
reduces the number of times that we parse a key as it flows from a Send
node through to a Recv node on another worker.
o Changed Rendezvous::ParsedKey so that it makes a copy of the underlying
full key, and then uses StringPiece objects to point into this copy for
the src_device, dst_device, and edge_name pieces. This turns 3 string
allocations into 1 per Rendezvous::ParseKey call.
o Added new StringPiece Rendezvous::ParsedKey::FullKey() accessor to
return a StringPiece for the underlying full key, and used that in a
few places (mostly logging) where that is useful.
o In many places, used std::move(function_variable) when assigning to
an instance variable. This eliminates a very large number of excess
std::function allocations/initializations (~56000 of the baseline
allocations were related to std::function setup or cloning, and this
is now down to ~11000 after this cl).
o In the RPC-based remote workers (StubbyRemoteWorker and
GrpcRemoteWorker), changed the code path in RecvTensorAsync to avoid
creation of a std::function with 6 arguments unless necessary. There
are three cases now handled separately:
(a) We're not logging, and we didn't make a copy of the request that we
need to free: just use the passed in 'StatusCallback done' object
directly, without creating a wrapper std::function object at all
(b) We're not logging, but we made a copy of the request that we
need to free: we create a simple wrapper std::function that
invokes the passed in 'done' callback, and then frees the
req_copy request copy object.
(c) We're logging: we create the std::function object with all the
necessary state to log when the recv has finished.
o Changed DeviceMgr::LookupDevice to take a StringPiece, rather than a
const string&, and changed the hash table to use StringPiece keys.
This allows clients that just have a StringPiece device name in their
hand to avoid a string creation to lookup the Device* object.
o Changed ExecutorState to use a specialized TaggedNodeReadyQueue that
internally uses a gtl::InlinedVector<TaggedNode, 16>, rather than
using a std::deque<TaggedNode> for keeping track of nodes ready to
execute. This is faster because it avoids allocations entirely if the
ready node queue doesn't get bigger than 16, and inlined vectors are
generally faster than std::deque, at a minor risk of using more memory
if this queue grows to very large numbers of ready nodes (mostly imaginable
only in pathological graphs).
o In ExecutorState::Process, allocated a single ExecutorState::AsyncState
object to keep track of all the state we need to preserve for an asynchronously
executed node, rather than keeping this state implicitly via a very large
number of arguments to a lambda function.
o Added new atomic std::atomic<bool> status_is_ok_ in
BaseRemoteRendezvous. This allows us to avoid acquiring the lock when
we just want to check if the status is non-OK in
BaseRemoteRendezvous::Send and BaseRemoteRendezvous::ValidateDevices.
o In GraphMgr::RunAllDone, changed assignment of args.runner to avoid
one extra level of std::function indirection (binding the function directly
to the ThreadPool::Schedule routine, rather than creating an intermediate
lambda function that invokes this inside the body of the lambda).
o Added freelist of RpcRecvTensorCall objects in
third_party/tensorflow/core/distributed_runtime/rpc/rpc_rendezvous_mgr.cc
o Changed third_party/tensorflow/core/framework/rendezvous.cc to keep the
hashtable of Item* objects keyed by uint64 (hash of the tensor name), rather
than the full-string tensor name. Collisions in the 64-bit hash space
should basically never happen.
o Sped up DeviceNameUtils::ParseFullName by optimizing for the common
ordering of parts of /job, /replica, /task, /device. The parsing code
was general enough to handle any order, but did so by comparing the
prefixes 4, 3, 2, and 1 times, respectively, rather than 1, 1, 1, and 1 times.
o Sped up DeviceNameUtils::SplitDeviceName to avoid extra string copies.
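The `ParseFullName` item in the list above describes trying the expected part first while still accepting any order. A rough Python sketch of that idea follows; it is not the TensorFlow code, the function name and return shape are hypothetical, and the comparison counts it reports for out-of-order names are specific to this sketch.

```python
PREFIXES = ("/job:", "/replica:", "/task:", "/device:")


def parse_full_name(name):
    """Split a device name into parts, trying the common order first.

    Returns (parts, comparisons), where comparisons counts prefix checks.
    """
    parts = {}
    comparisons = 0
    pos = 0
    expected = 0  # index of the prefix we expect to see next
    while pos < len(name):
        first = expected % len(PREFIXES)
        # Fast path: check the expected prefix before any of the others.
        order = [first] + [i for i in range(len(PREFIXES)) if i != first]
        for idx in order:
            comparisons += 1
            if name.startswith(PREFIXES[idx], pos):
                break
        else:
            raise ValueError("unrecognized part at offset %d" % pos)
        key = PREFIXES[idx][1:-1]  # "/job:" -> "job"
        if key in parts:
            raise ValueError("duplicate part: " + key)
        start = pos + len(PREFIXES[idx])
        end = name.find("/", start)
        if end == -1:
            end = len(name)
        parts[key] = name[start:end]
        pos = end
        expected = idx + 1
    return parts, comparisons
```

For a name in the usual order, such as `/job:worker/replica:0/task:1/device:GPU:0`, each part matches on its first check, giving four comparisons in total; other orders still parse, at the cost of extra checks.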
Change: 125991891
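Cases (a) to (c) of the RecvTensorAsync item above amount to building a wrapper callback only when one is needed. The sketch below shows that selection in plain Python; `make_done_callback`, the `release()` method, and the order of logging versus invoking `done` are assumptions for illustration, not the actual worker code.

```python
def make_done_callback(done, req_copy=None, log=None):
    """Pick the cheapest completion callback for a hypothetical async recv.

    done:     callable taking a status, supplied by the caller.
    req_copy: object with a release() method if the request was copied.
    log:      callable taking a status if logging is enabled, else None.
    """
    if log is None and req_copy is None:
        # Case (a): nothing extra to do, so reuse the caller's callback as-is.
        return done

    if log is None:
        # Case (b): thin wrapper that runs `done`, then frees the copy.
        def done_and_free(status):
            done(status)
            req_copy.release()
        return done_and_free

    # Case (c): full wrapper carrying the state needed for logging.
    def done_and_log(status):
        log(status)
        done(status)
        if req_copy is not None:
            req_copy.release()
    return done_and_log
```

In case (a) the function returns the very object it was given, so no new callable is allocated on the common path.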
|
Change: 123900938
|
to/from strings and used these in the rendezvous code. Improves
performance for ptb_word_lm slightly (saves several allocations and an
sscanf per CPU <-> GPU transfer).
Change: 115852277
|
Change: 113089380
|
tensorflow/core/ files and build targets.
Change: 113080064
|
After this we can replace port.h with types.h.
Change: 112727463
|
directly so we can drop it from port.h.
Change: 111613643
|
directly so we can drop it from port.h.
Change: 111528649
|
imported.
Change: 110842260
|
Changes:
* error message that refers to removed `DefaultSession` method.
* -Wnull-conversion warnings
* the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
* typo in tutorial data download progress message.
* a typo ("however their installing"=>"however installing").
* typo, rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
* a typo ("subtact"=>"subtract").
* protobuf examples in comments in tensorflow::Example.proto.
* formula formatting in MNIST beginner tutorial
* negative fraction-of-queue-full stats
* protobuf inclusion path so that Android demo will build under Blaze.
* small typo (moderatly > moderately)
* Session.run() to check that tensor arguments come from the session's graph.
* another six import
* seq2seq typo in bazel command
Base CL: 108349164
|
error handling, updates to website.
Changes:
- Removes redundant reshape from image models by @mrry
- Default TensorBoard to localhost by @danmane
- Reformatting of tensorflow/core by @josh11b
- Make tutorials backwards compatible to 0.5.0 by @girving
- Improve print documentation (md files not updated).
- Add proper scrolling to sitemap by @martinwicke
Base CL: 107956254
|
Changes:
- Updates to op documentation and index by Josh
- More changes to BUILD files for python 3 support by @girving
- Fix to Eigen to use DenseIndex everywhere by @jiayq
- Enable configuration for cuda compute capability by @zheng-xq,
including updates to docs.
- Route aggregation method through optimizer by schuster
- Updates to install instructions for bazel 0.1.1.
Base CL: 107702099
|
|
TensorFlow is an open source software library for numerical computation
using data flow graphs.
Base CL: 107276108
|