TF_FORCE_GPU_ALLOW_GROWTH environment variable.
PiperOrigin-RevId: 213728460
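A minimal sketch of how this environment variable is used, assuming the documented behavior where setting it to "true" makes the GPU allocator grow memory on demand instead of reserving nearly all device memory at startup (the variable must be set before TensorFlow initializes its GPU devices; the surrounding code is illustrative):

    import os

    # Set before importing/initializing TensorFlow so the GPU allocator
    # starts small and grows as the program actually needs memory.
    os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

    import tensorflow as tf
    # ... build and run the model as usual; this has the same effect as
    # enabling allow_growth on the GPU options in code.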
PiperOrigin-RevId: 213726710
VerifiedHloModule is derived from HloModule and verifies itself on destruction. It is designed to be used in HloVerifiedTestBase, and replaces the current mechanism which verifies HloModules in the TearDown method. The VerifiedHloModule approach is cleaner (less state on the test object) and more capable, because these verified HLO modules can be passed to methods that require taking ownership of the module (e.g., HloTestBase::Execute).
This change required some parser changes to enable constructing the parsed result into an already-allocated HloModule, along with some trivial changes to HloModule itself.
PiperOrigin-RevId: 213718126
PiperOrigin-RevId: 213718019
PiperOrigin-RevId: 213716034
standard Python `print` method, and deprecates the old `tf.Print` operator (to be removed in v2.0).
It follows the design doc specified in https://github.com/tensorflow/community/pull/14 and additionally incorporates community feedback and design review decisions.
This CL adds two new internal graph operators: a StringFormat operator that formats a template string with a list of input tensors to insert into the string and outputs a string scalar containing the result, and a PrintV2 operator that prints a string scalar to a specified output stream or logging level.
The formatting op is exposed at `tf.strings.Format`. A new Python method is exposed at `tf.print` that takes a list of inputs that may be nested structures and may contain tensors, formats them nicely using the formatting op, and returns a PrintV2 operator that prints them. In eager mode and inside defuns this PrintV2 operator is executed automatically, but in graph mode it must either be added to `sess.run` or used as a control dependency for other operators being executed.
As compared to the previous print function, the new print function:
- Has an API that more closely aligns with the standard Python 3 `print`
- Supports changing the print logging level/output stream
- Allows printing arbitrary (optionally nested) data structures, as opposed to just flat lists of tensors
- Supports printing sparse tensors
- Changes the printed tensor format to show a more meaningful summary (recursively printing the first and last elements of each tensor dimension, instead of just the first few elements of the tensor regardless of dimension)
PiperOrigin-RevId: 213709924
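A brief usage sketch of the `tf.print` API described above (variable names and values here are illustrative, not taken from this change):

    import sys
    import tensorflow as tf

    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])

    # Inputs may be arbitrary (optionally nested) structures mixing tensors and
    # plain Python objects; the logging level / output stream is selectable.
    tf.print("step:", 7, {"activations": x}, output_stream=sys.stderr)

    # In eager mode and inside defuns the print executes automatically.
    # In graph mode, run the returned op via sess.run or use it as a control
    # dependency, e.g.:
    #   print_op = tf.print("loss:", loss)
    #   with tf.control_dependencies([print_op]):
    #       train_op = optimizer.minimize(loss)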
properly set.
PiperOrigin-RevId: 213706101
make sure we run fit for the right number of steps.
PiperOrigin-RevId: 213706042
This is done by making the TapeTensor a template rather than a concrete struct.
PiperOrigin-RevId: 213700425
PiperOrigin-RevId: 213698663
PiperOrigin-RevId: 213693027
large number of debugging outputs in the INFO log that look like:
I0917 16:20:11.073992 9191 meta_optimizer.cc:334] Starting optimization for grappler item: tf_graph
I0917 16:20:11.079458 9191 meta_optimizer.cc:334] Starting optimization for grappler item: tf_graph
I0917 16:20:11.084827 12447 meta_optimizer.cc:334] Starting optimization for grappler item: tf_graph
I0917 16:20:11.089359 12447 meta_optimizer.cc:334] Starting optimization for grappler item: tf_graph
After this change those lines will simply no longer appear.
RELNOTES: n/a
PiperOrigin-RevId: 213690759
vectorize a MapDefun function. Also implements conversion for two ops: Cast and Unpack.
PiperOrigin-RevId: 213686720
PiperOrigin-RevId: 213684048
PiperOrigin-RevId: 213681549
1. Before inserting a new Transpose node, check if there already is one that
may be reused. In practice, there are two cases: either the array being
transposed is a constant (by far the most common case) or it's not.
* If it is constant, then this doesn't really make a difference:
ResolveConstantTranspose runs anyway, eliminating these Transpose nodes
     and also rendering this change moot, as it leaves no Transpose node to be
reused. So in that case, constant-array-deduping is really the only
thing that prevents duplication of data.
* If it is not constant, that's where this new logic really helps, as
the resulting Transpose nodes are here to stay in the final graph,
and this avoids inserting more than are needed.
2. transpose_a is not supported. However, rather than CHECK-fail, it's more
useful to have this graph transformation bail with a log message. The
resulting 'unresolved' MatMul node could still be handled in some way
at the TFLite level, or we could end up having support for MatMul per se.
PiperOrigin-RevId: 213678294
The main issue is we were keeping the input array, updating it in place and discarding the output array. That was a problem when the input array had multiple consumer ops. Now we're keeping the output array instead, which is the correct thing to do. However, in order to minimize disruption, we keep using the input array's name whenever possible, by means of some array renamings.
PiperOrigin-RevId: 213678219
custom call and try to understand what's inside. convolution_thunk does
it anyway.
PiperOrigin-RevId: 213676051
on the "c_api" target.
PiperOrigin-RevId: 213673549
PiperOrigin-RevId: 213673402
The bug is still there and makes this test flakily fail with fp16.
PiperOrigin-RevId: 213669453
PiperOrigin-RevId: 213667385
PiperOrigin-RevId: 213665390
PiperOrigin-RevId: 213661062
directional feature contributions); fixed ExampleDebugOutputs bug where it errors with empty trees.
PiperOrigin-RevId: 213658470
If this causes trouble (makes graph visualizations harder to read, etc.),
then consider increasing the default value of dedupe_array_min_size_bytes.
PiperOrigin-RevId: 213656796
PiperOrigin-RevId: 213655969
This avoids problems that arise because most optimizers do not have sparse-updating GPU kernels implemented.
Fixes #22042
PiperOrigin-RevId: 213654354
PiperOrigin-RevId: 213653853
ROCmSoftwarePlatform:upstream-staging-gpu-common-runtime-1
PiperOrigin-RevId: 213653830
PiperOrigin-RevId: 213653403
PiperOrigin-RevId: 213651158
This is used by the random number generator. It uses the same algorithm as for float, just with
more precision. fp16 is upcast to fp32 and then processed with the float algorithm.
PiperOrigin-RevId: 213648736
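Assuming this code path feeds the standard uniform random ops, the user-visible effect is that half- and double-precision samples can be requested directly; a hedged sketch:

    import tensorflow as tf

    # Half precision: produced by upcasting to fp32 and running the float
    # algorithm, as described above.
    h = tf.random.uniform([4], dtype=tf.float16)

    # Double precision: same algorithm as float, just with more precision.
    d = tf.random.uniform([4], dtype=tf.float64)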
PiperOrigin-RevId: 213648091
PiperOrigin-RevId: 213640434
PiperOrigin-RevId: 213637804
It doesn't access the data in any way, similarly to kTuple, so it should
be handled the same way.
PiperOrigin-RevId: 213630620
PiperOrigin-RevId: 213630404
Derive HloModulePass and HloModuleGroupPass from HloPassInterface; these run module-scoped and module-group-scoped, respectively. Replace all existing uses of HloPassInterface with HloModulePass, because all existing passes are module-scoped. Also rewrite HloPassPipeline to support both module-scoped and module-group-scoped passes.
PiperOrigin-RevId: 213629604
PiperOrigin-RevId: 213618350
instead of the fallback exception (probably not implemented).
Additionally, in a nested structure of transformed distributions, it can be useful to know which distribution is raising this error.
PiperOrigin-RevId: 213618306
This was fixed a while ago. Even though TF allows ClipByValue for complex
types, it's not implemented anywhere (and it doesn't make sense for complex
numbers), so blacklist complex types.
PiperOrigin-RevId: 213615429
PiperOrigin-RevId: 213611371
PiperOrigin-RevId: 213610324
Being able to run CPU tests remotely while running GPU tests locally required
multiple changes:
1. Unify how we tag GPU tests in TF; we now always use tf_cuda_tests_tags().
2. Tag tests using tf_cuda_tests_tags() with 'local' and 'gpu'; this keeps
   them from running on non-GPU builds and always runs them locally.
PiperOrigin-RevId: 213601626
PiperOrigin-RevId: 213595705
PiperOrigin-RevId: 213595499
In TensorFlow we don't have DLOG, and we should not use LOG(FATAL).
PiperOrigin-RevId: 213595376
Also don't allow parallelization for the sort op in parallel_task_assignment.
PiperOrigin-RevId: 213592046
PiperOrigin-RevId: 213589710