| Commit message (Collapse) | Author | Age |
|
|
|
|
|
| |
benchmarking. At the moment, it returns a default config with only the Grappler dependency optimizer disabled. Many benchmarks wrap the subgraph they want to time in control_flow_ops.group() so that the measurement excludes the overhead of copying the output back to the Python client. In the graph, this only adds a control dependency between the subgraph output and the fetch node, which in turn (often) causes the dependency optimizer to turn all nodes in the graph into no-ops.
PiperOrigin-RevId: 216242463
|
|
|
|
|
|
| |
sqrt(v + epsilon**2) and changed flag name accordingly.
PiperOrigin-RevId: 216240045
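As a small numeric illustration of the expression named in this entry, the sketch below evaluates sqrt(v + epsilon**2) for a few values of v. The function name and the sample values are illustrative assumptions; the entry itself is truncated and does not say which optimizer or flag is involved.

```python
import math


def denom_eps_in_sqrt(v, epsilon):
    """Return sqrt(v + epsilon**2), the expression quoted in the entry above."""
    return math.sqrt(v + epsilon ** 2)


epsilon = 1e-3  # illustrative value, not taken from the commit
for v in (0.0, 1e-6, 1.0):
    # At v == 0 the result is exactly epsilon; for large v it approaches sqrt(v).
    print(v, denom_eps_in_sqrt(v, epsilon))
```

With epsilon inside the square root, the value never drops below epsilon, and the expression stays smooth at v = 0.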
|
|
|
|
|
|
| |
mechanism, since the meta optimizer only checks if it has been cancelled before running each sub-optimizer. We can add cancellation to each sub-optimizer if necessary.
PiperOrigin-RevId: 216234262
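The behaviour described in this entry, checking for cancellation only between sub-optimizers, can be sketched roughly as follows. This is a plain-Python stand-in, not Grappler code: `run_meta_optimizer`, the list-as-graph, and the use of `threading.Event` are all hypothetical names chosen for illustration.

```python
import threading


def run_meta_optimizer(graph, sub_optimizers, cancelled):
    """Apply sub-optimizers in order, checking for cancellation between them.

    `cancelled` is a threading.Event. A sub-optimizer that has already
    started is never interrupted, which mirrors the limitation noted above.
    """
    for optimize in sub_optimizers:
        if cancelled.is_set():
            break  # stop before starting the next sub-optimizer
        graph = optimize(graph)
    return graph


cancel = threading.Event()


def add_a(g):
    return g + ["a"]


def cancel_then_add_b(g):
    cancel.set()      # simulates cancellation arriving while this pass runs
    return g + ["b"]  # the pass still runs to completion


result = run_meta_optimizer([], [add_a, cancel_then_add_b, add_a], cancel)
print(result)  # ['a', 'b']
```

The third pass is skipped because the flag is seen before it starts. Finer-grained cancellation would need each sub-optimizer to poll the flag itself.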
|
|
|
|
| |
PiperOrigin-RevId: 216230391
|
|\
| |
| |
| | |
PiperOrigin-RevId: 216228494
|
| |
| |
| |
| |
| |
| | |
must be on axis 0, and can only be on 1 or 2-d inputs.
PiperOrigin-RevId: 216226776
|
| |
| |
| |
| | |
PiperOrigin-RevId: 216225505
|
| |
| |
| |
| | |
PiperOrigin-RevId: 216224026
|
| |
| |
| |
| | |
PiperOrigin-RevId: 216217887
|
|\ \
| | |
| | |
| | | |
PiperOrigin-RevId: 216217509
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216212953
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216211467
|
| | |
| | |
| | |
| | |
| | |
| | | |
existing namespace.
PiperOrigin-RevId: 216211286
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216211279
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216210141
|
| | |
| | |
| | |
| | |
| | |
| | | |
shared resources are functionally very similar to global variables and are initialized at the same time, but because workers only wait for global variables to be initialized, there is a race condition in which the shared resource is sometimes not ready.
PiperOrigin-RevId: 216208679
|
| | |
| | |
| | |
| | |
| | |
| | | |
Will be helpful for specifying serving signatures when exporting SavedModels
PiperOrigin-RevId: 216207284
|
| | |
| | |
| | |
| | |
| | |
| | | |
`MapAndBatchDataset` whose user-provided functions have the property that each output argument takes its value directly from an input argument (e.g. `lambda x, y: (y, x)`). This specialization can produce the result without having to schedule the function using the executor.
PiperOrigin-RevId: 216206232
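The idea behind this specialization can be sketched in plain Python: if every output is known to be a direct copy of some input, the result can be assembled by index without calling the function at all. The helper name `make_short_circuit` and the index-list representation are assumptions for illustration and do not reflect the actual dataset kernel code.

```python
def make_short_circuit(output_to_input):
    """Build a fast path for a function whose i-th output is simply its
    output_to_input[i]-th input, so the function itself is never called."""
    def apply(*args):
        return tuple(args[i] for i in output_to_input)
    return apply


# Equivalent to `lambda x, y: (y, x)`: output 0 is input 1, output 1 is input 0.
swap = make_short_circuit([1, 0])
print(swap("x", "y"))  # ('y', 'x')
```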
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216205396
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216203408
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
This benchmark creates many intermediate values, so we can make sure there's no performance overhead (it looks like there might be currently, or it might be from some other difference). It also runs in a defun and in legacy graph mode.
Results from my machine:
entry {
name: "CondWithManyIntermediatesBenchmark.benchmark_cond_v1_defun"
iters: 500
wall_time: 1.25822591782
}
entry {
name: "CondWithManyIntermediatesBenchmark.benchmark_cond_v2_defun"
iters: 500
wall_time: 5.99376106262
}
entry {
name: "CondWithManyIntermediatesBenchmark.benchmark_cond_v1_graph"
iters: 500
wall_time: 2.05277585983
}
entry {
name: "CondWithManyIntermediatesBenchmark.benchmark_cond_v2_graph"
iters: 500
wall_time: 2.84808516502
}
Clearly we have some work to do! I haven't looked into the time differences at all yet.
PiperOrigin-RevId: 216202325
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216201732
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216201714
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Prior to this change, tf.colocate_with(v) would insert spurious operations (a ReadVariableOp and an Identity) in the graph when v is a resource variable, and then colocate the operations within the block with those newly added, otherwise disconnected, operations.
This commit avoids adding the unnecessary ReadVariableOp/Identity nodes and colocates operations within the block with the VarHandleOp.
PiperOrigin-RevId: 216201638
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216200439
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
benchmarks.
original runtime: 4.83492736816 secs
w/ cache runtime: 2.19033999443 secs
PiperOrigin-RevId: 216195286
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216191084
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216189458
|
| | |
| | |
| | |
| | | |
PiperOrigin-RevId: 216187878
|
|\ \ \
| | | |
| | | |
| | | | |
PiperOrigin-RevId: 216185979
|
| | | |
| | | |
| | | |
| | | | |
PiperOrigin-RevId: 216151605
|
| | | |
| | | |
| | | |
| | | | |
PiperOrigin-RevId: 216079665
|
| | | |
| | | |
| | | |
| | | | |
PiperOrigin-RevId: 216066634
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
It fails 1/1000 runs in OSS builds.
PiperOrigin-RevId: 216050192
|
|\ \ \ \
| | | | |
| | | | |
| | | | | |
PiperOrigin-RevId: 216046506
|
| | | | |
| | | | |
| | | | |
| | | | | |
PiperOrigin-RevId: 216041507
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | | |
PiperOrigin-RevId: 216040541
|
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
PiperOrigin-RevId: 216021117
|
|\ \ \ \ \ \
| | | | | | |
| | | | | | |
| | | | | | | |
PiperOrigin-RevId: 216009475
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
for now.
PiperOrigin-RevId: 216003028
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
PiperOrigin-RevId: 216001984
|
| | | | | | |
| | | | | | |
| | | | | | |
| | | | | | | |
PiperOrigin-RevId: 216000752
|
|\ \ \ \ \ \ \
| | | | | | | |
| | | | | | | |
| | | | | | | | |
PiperOrigin-RevId: 215995215
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
The current logic tries to bubble the forward pass tensor to the outermost
graph. That might not always be doable, e.g. when the cond is inside a while
loop, where it would need to know the accumulator logic for while_loop. So
instead, the cond_grad now captures tensors from the forward If op's graph.
When the grad If op is built, these tensors will be appropriately captured by
the surrounding FuncGraph.
PiperOrigin-RevId: 215993009
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
Heuristic NCHW/NHWC layout assignment works great; we've never had to flip this
flag. Might as well remove it and simplify things a bit.
PiperOrigin-RevId: 215989807
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
PiperOrigin-RevId: 215989259
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
PiperOrigin-RevId: 215989111
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
CudnnConvolutionAlgorithmPicker::PickBestAlgorithm.
Using a struct lets us return additional data -- namely, the elapsed time to
run the best algo -- without adding a fourth entry to the tuple, which would be
confusing.
No functional change.
PiperOrigin-RevId: 215987795
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
instead of assertMultiLineEqual if input is too large
(https://bugs.python.org/issue11763). This change switches to unified_diff
in that case.
PiperOrigin-RevId: 215987656
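For reference, a minimal sketch of producing a unified diff of two multi-line strings with the standard library's `difflib.unified_diff` is shown below. The helper name `unified_diff_text` and the sample strings are hypothetical; the size threshold and the actual test helper changed by this commit are not shown in the entry.

```python
import difflib


def unified_diff_text(expected, actual):
    """Return a unified diff of two multi-line strings, or '' if they are equal."""
    lines = difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",  # inputs have no trailing newlines after splitlines()
    )
    return "\n".join(lines)


diff = unified_diff_text("a\nb\nc\n", "a\nB\nc\n")
print(diff)
```

A test could fail with this text as its message when the diff is non-empty, instead of relying on the slower character-level diff that assertMultiLineEqual builds.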
|
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | | | | | |
PiperOrigin-RevId: 215985679
|