| Commit message | Author | Age |
PiperOrigin-RevId: 181643790
construction and reduce startup time in TensorFlow.
In the motivating case, this change decreases the time for a RegisterGraph RPC with a large graph from 10 minutes to 40 seconds. Without this change, a large portion of the time is spent walking the red-black tree.
PiperOrigin-RevId: 181643594
PiperOrigin-RevId: 181642475
PiperOrigin-RevId: 181641673
Adds special code for reducing a tensor of arbitrary rank to a scalar.
This is similar to the column reduction code that we used to run, but it
has a shfl loop at the end like the row reduction code. The result is
much faster for the reduction-to-scalar case. (A shfl loop doesn't make
sense for the column reduction case.)
PiperOrigin-RevId: 181640117
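As a rough illustration of the shfl loop mentioned in the reduction-to-scalar change, the following Python sketch simulates a warp-level shuffle-down reduction on the host. It is not the code emitted by the compiler; the warp width of 32 and the helper names `shfl_down` and `warp_reduce_sum` are assumptions made for this example.

```python
WARP_SIZE = 32  # assumed warp width


def shfl_down(values, offset):
    """Simulate a shuffle-down: lane i reads the value held by lane i + offset.

    Lanes whose source lane would fall outside the warp keep their own value.
    """
    n = len(values)
    return [values[i + offset] if i + offset < n else values[i] for i in range(n)]


def warp_reduce_sum(values):
    """Sum one warp's partial results; lane 0 ends up holding the total."""
    assert len(values) == WARP_SIZE
    lanes = list(values)
    offset = WARP_SIZE // 2
    while offset > 0:
        shifted = shfl_down(lanes, offset)
        # Every lane adds the value it fetched from the lane `offset` above it.
        lanes = [a + b for a, b in zip(lanes, shifted)]
        offset //= 2
    return lanes[0]
```

Each iteration halves the number of lanes that still carry useful partial sums, so a 32-lane warp needs five steps and no shared-memory round trip in this model.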
PiperOrigin-RevId: 181636626
PiperOrigin-RevId: 181635835
Generalize the static analysis across while and for loops.
Convert len builtin to tf.shape()[0].
Add for loop canonicalization and companion tests.
Modify the template behavior for Name nodes to let the template control the target, which allows simplifying the caller.
PiperOrigin-RevId: 181633983
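The len-to-tf.shape()[0] conversion described in the change above can be sketched with the standard `ast` module. This is a minimal sketch, not the converter from the commit: the class name `LenToShape` and the helper `convert_len` are hypothetical, and the template splice is only loosely modeled on the template mechanism the message refers to.

```python
import ast


class LenToShape(ast.NodeTransformer):
    """Rewrite len(x) calls into tf.shape(x)[0] (illustrative only)."""

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        is_len = (
            isinstance(node.func, ast.Name)
            and node.func.id == "len"
            and len(node.args) == 1
            and not node.keywords
        )
        if not is_len:
            return node
        # Parse a template expression and splice the original argument into it.
        template = ast.parse("tf.shape(_arg)[0]", mode="eval").body
        template.value.args[0] = node.args[0]
        return ast.copy_location(template, node)


def convert_len(source):
    """Return `source` with every single-argument len() call rewritten."""
    tree = LenToShape().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # ast.unparse requires Python 3.9+
```

The sketch only rewrites source text; it does not import TensorFlow, so the `tf` name in the output is assumed to be bound wherever the converted code runs.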
This partially addresses the concerns in issue #13161.
PiperOrigin-RevId: 181629980
PiperOrigin-RevId: 181629745
PiperOrigin-RevId: 181625889
Passing around copies of std::functions incurs heap allocations and deallocations, which, unfortunately, matters in this case. Minimize the number of copies.
PiperOrigin-RevId: 181625079
Changing the default value of colocate_gradients_with_ops to True.
PiperOrigin-RevId: 181624864
tf.AUTO_REUSE type.
PiperOrigin-RevId: 181620379
PiperOrigin-RevId: 181619109
This allows LLVM to vectorize loads/stores in these kernels, among other
things.
PiperOrigin-RevId: 181618991
definition
PiperOrigin-RevId: 181617501
PiperOrigin-RevId: 181610422
Remove obsolete shape_size_fn_ from HloVerifier/ShapeVerifier.
Adds a rank check to FFT shape inference.
PiperOrigin-RevId: 181601294
Note that there are already existing checks in BufferAllocation::AddAssignment
to ensure all buffers are no larger than the allocation. But colocated buffer
sets are used to handle forced aliasing, e.g. kWhile, kCall and kConditional,
which require all buffers to be identically sized.
PiperOrigin-RevId: 181565074
each other.
PiperOrigin-RevId: 181563474
PiperOrigin-RevId: 181553949
Started home page.
PiperOrigin-RevId: 181548668
PiperOrigin-RevId: 181548597
PiperOrigin-RevId: 181548023
PiperOrigin-RevId: 181547286
Shapes were not correctly converted to C types.
PiperOrigin-RevId: 181546120
Also enables the C API in slot_create_test.py, which exercises the new
behavior (previously it would fail because it would create an op with
no OpDef and list inputs).
PiperOrigin-RevId: 181544033
PiperOrigin-RevId: 181537854
have been used.
PiperOrigin-RevId: 181532901
CPU and 10% faster on GPU.
PiperOrigin-RevId: 181528321
PiperOrigin-RevId: 181527872
Bazel silently uses the wrong build settings for --config=android_arm64 (--cpu=arm64-v8a is not enough), and actually still uses armeabi-v7a in at least some cases. --fat_apk_cpu fixes this.
See #15581 for more.
PiperOrigin-RevId: 181525260
to fix toolchain installation error
PiperOrigin-RevId: 181524891
Fixes #15737
PiperOrigin-RevId: 181523430
PiperOrigin-RevId: 181523204
Merge input values at phi nodes correctly: If a phi operand is the phi itself,
and the other operands are all the same, then the phi node is redundant.
PiperOrigin-RevId: 181521522
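The phi-merging rule stated above can be written as a small check. This is a sketch under an assumed representation (a phi identified by any hashable value, with its operands given as a list); the function name `simplify_phi` is hypothetical and not taken from the commit.

```python
def simplify_phi(phi, operands):
    """Return the single value a redundant phi can be replaced with, else None.

    `phi` identifies the phi node itself; `operands` lists its incoming values.
    """
    # Ignore self-references: a phi that feeds itself contributes no new value.
    distinct = {op for op in operands if op != phi}
    if len(distinct) == 1:
        # All remaining operands agree, so the phi is redundant.
        return next(iter(distinct))
    return None
```

For example, `simplify_phi("p", ["p", "a", "a"])` yields `"a"`, while a phi with two different non-self operands is left alone.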
PiperOrigin-RevId: 181519635
ExecutionProfile::compute_cycle_count never worked for CPU and GPU with Hlo
profiling disabled, as far as I can tell.
PiperOrigin-RevId: 181517824
needs to use the scope symbols, not their last assigned value.
PiperOrigin-RevId: 181511978
PiperOrigin-RevId: 181511871
PiperOrigin-RevId: 181511142
Runtime constant folding happens after the graph has been rewritten to include any
feeds, so it's safe and desirable to constant fold PlaceholderWithDefaults
at this point.
PiperOrigin-RevId: 181510650
PiperOrigin-RevId: 181508517
PiperOrigin-RevId: 181506626
* Nesting is implemented by sharing a single EagerVariableStore among a top-level EagerTemplate and all children EagerTemplate objects that are nested underneath it. Variables added to an EagerTemplate object are also added to all EagerTemplate objects under which it is nested.
* This change also simplifies the implementation of __call__ for both Template and EagerTemplate.
PiperOrigin-RevId: 181506600
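The nesting scheme described in the bullets above (one store shared by a top-level template and its children, with variables also recorded on every enclosing template) might look roughly like the following. This is a toy model, not TensorFlow code: `VariableStore` and `Template` are hypothetical stand-ins for EagerVariableStore and EagerTemplate, and the naming scheme is invented for the example.

```python
class VariableStore:
    """Hypothetical stand-in for a variable store: a name-to-value mapping."""

    def __init__(self):
        self.variables = {}

    def get_variable(self, name, initializer):
        # Create the variable on first use and reuse it afterwards.
        if name not in self.variables:
            self.variables[name] = initializer()
        return self.variables[name]


class Template:
    """Hypothetical template that shares its store with nested templates."""

    _active = []  # stack of templates whose functions are currently running

    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.store = None
        self.variable_names = []

    def __call__(self, *args):
        parent = Template._active[-1] if Template._active else None
        if self.store is None:
            # A nested template adopts the store of the template calling it.
            self.store = parent.store if parent else VariableStore()
        Template._active.append(self)
        try:
            return self.func(self, *args)
        finally:
            Template._active.pop()

    def get_variable(self, name, initializer):
        full_name = f"{self.name}/{name}"
        value = self.store.get_variable(full_name, initializer)
        # Record the variable on this template and on every enclosing one.
        for template in Template._active:
            if full_name not in template.variable_names:
                template.variable_names.append(full_name)
        return value


# Usage: `inner` is nested under `outer` and shares its store.
inner = Template("inner", lambda t, x: x * t.get_variable("w", lambda: 2.0))
outer = Template("outer", lambda t, x: inner(x) + t.get_variable("b", lambda: 1.0))
result = outer(3.0)
```

After the call, `outer` lists both variables while `inner` lists only its own, and both templates refer to the same store object.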
The macOS build fails because of a missing include of <array>.
PiperOrigin-RevId: 181506335
PiperOrigin-RevId: 181505090
both client and server side. Thread count is hardcoded to 8 for now; it should be tuned in the future.
PiperOrigin-RevId: 181504374
easier to package custom ops (tfmini) with the core binary on iOS.
PiperOrigin-RevId: 181503662