| Commit message (Collapse) | Author | Age |
PiperOrigin-RevId: 215324035
PiperOrigin-RevId: 215272497
Derive HloModulePass and HloModuleGroupPass from HloPassInterface which run module-scoped and module-group-scoped respectively. Replace all existing uses of HloPassInterface with HloModulePass because all existing passes are module-scoped. Also rewrite HloPassPipeline to support both module-scoped and module-group-scoped passes.
PiperOrigin-RevId: 213629604
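A minimal sketch of the hierarchy this change describes, using hypothetical stand-in types (`Module`, `ModuleGroup`, `PassInterface`, `MarkTouched`) rather than the real XLA classes. It only illustrates how a module-scoped pass can satisfy a group-scoped entry point by visiting each module in turn. What a group-scoped pass should do when asked to run on a single module is a design choice assumed here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the real XLA types; only the shape of the
// hierarchy is taken from the commit message.
struct Module {
  bool touched = false;
};
struct ModuleGroup {
  std::vector<Module*> modules;
};

// Common base: every pass can be asked to run on a module or on a group.
class PassInterface {
 public:
  virtual ~PassInterface() = default;
  virtual bool Run(Module* module) = 0;
  virtual bool RunOnModuleGroup(ModuleGroup* group) = 0;
};

// Module-scoped pass: the group entry point visits each module in turn.
class ModulePass : public PassInterface {
 public:
  bool RunOnModuleGroup(ModuleGroup* group) override {
    bool changed = false;
    for (Module* m : group->modules) changed |= Run(m);
    return changed;
  }
};

// Group-scoped pass: running on a single module is treated as a misuse
// (an assumption of this sketch).
class ModuleGroupPass : public PassInterface {
 public:
  bool Run(Module*) override {
    assert(false && "group-scoped pass cannot run on a single module");
    return false;
  }
};

// Example module-scoped pass, used only for illustration.
class MarkTouched : public ModulePass {
 public:
  bool Run(Module* module) override {
    bool changed = !module->touched;
    module->touched = true;
    return changed;
  }
};
```

With this split, a pipeline can hold `PassInterface` pointers and call `RunOnModuleGroup` uniformly, whichever scope a pass was written for.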
Now that HloSchedule is a field on the HLO module, scheduling can be done as an HLO pass. Similarly, rematerialization, which requires a schedule, can also be a pass that simply gets the schedule from the module.
Also, as a cleanup, hoist calls to CopyInsertion out of rematerialization.
PiperOrigin-RevId: 212119795
*** Original change description ***
Add HloSchedule class representing a sequential order of an HloModule.
Currently we represent a sequential schedule of a module using a SequentialHloOrdering::HloModuleSequence which is a type alias of a bare map from HloComputation* to std::vector<HloInstruction*>. This CL replaces this with a proper class which results in better encap...
***
PiperOrigin-RevId: 211726890
Automated rollback of commit 7fa693209fe238478739b3982f652a7e35be91f3
PiperOrigin-RevId: 211681957
Currently we represent a sequential schedule of a module using a SequentialHloOrdering::HloModuleSequence which is a type alias of a bare map from HloComputation* to std::vector<HloInstruction*>. This CL replaces this with a proper class which results in better encapsulation of code which deals with schedules and better enforcement of invariants.
This CL also fixes a corner-case bug in dataflow analysis, where values of instructions which are live out of the computation erroneously did not interfere with the values of instructions scheduled after the root instruction.
PiperOrigin-RevId: 211656888
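To illustrate the encapsulation argument above, here is a rough sketch of wrapping the bare map in a class that can check its own invariants. `Schedule`, `Computation`, `Instruction`, and `Verify` are simplified, hypothetical names, not the actual HloSchedule API. The invariant checked (each scheduled sequence contains exactly the computation's instructions, each once) is an assumption about what "enforcement of invariants" might cover.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for HloInstruction / HloComputation.
struct Instruction {
  int id;
};
struct Computation {
  std::vector<Instruction*> instructions;
};

// A schedule class instead of a bare
// map<Computation*, vector<Instruction*>> type alias.
class Schedule {
 public:
  void set_sequence(const Computation* c, std::vector<Instruction*> seq) {
    sequences_[c] = std::move(seq);
  }
  bool is_scheduled(const Computation* c) const {
    return sequences_.count(c) > 0;
  }
  const std::vector<Instruction*>& sequence(const Computation* c) const {
    return sequences_.at(c);
  }
  // Assumed invariant: every sequence holds exactly the instructions of its
  // computation, each exactly once.
  bool Verify() const {
    for (const auto& entry : sequences_) {
      std::set<const Instruction*> expected(entry.first->instructions.begin(),
                                            entry.first->instructions.end());
      std::set<const Instruction*> seen(entry.second.begin(),
                                        entry.second.end());
      if (seen.size() != entry.second.size()) return false;  // duplicate
      if (seen != expected) return false;  // missing or foreign instruction
    }
    return true;
  }

 private:
  std::unordered_map<const Computation*, std::vector<Instruction*>> sequences_;
};
```

Because callers can no longer reach the underlying map directly, a check such as `Verify()` can be run after every pass that touches the schedule.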
PiperOrigin-RevId: 210998142
Unlike Printf, StrFormat does not require type-length qualifiers, e.g.
%z, %ll. Nor does it require that you call c_str() to print strings.
So these are fixed up here as well.
PiperOrigin-RevId: 210435915
RemoveUnnecessaryCopies, which runs in rematerialization to take advantage of scheduling, can sometimes remove copies that are needed for non-interference reasons. This requires running AddSpecialCaseCopies to add them back in. Furthermore, the schedule needs to be updated to account for the changes to the module, so add an UpdateSchedule function which can patch up a schedule in light of a limited set of transformations to the module (addition and deletion of instructions).
PiperOrigin-RevId: 210186375
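One way such a schedule patch-up could work is sketched below. This is not the actual UpdateSchedule implementation: `Instr` and `UpdateSequence` are hypothetical names, and the placement policy (drop deleted instructions, keep survivors in their old order, place each new instruction as soon as all of its operands are scheduled) is an assumption.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified instruction: an id plus its operands.
struct Instr {
  int id;
  std::vector<const Instr*> operands;
};

// Patches `old_sequence` so that it covers exactly `current`.
std::vector<const Instr*> UpdateSequence(
    const std::vector<const Instr*>& old_sequence,
    const std::vector<const Instr*>& current) {
  std::unordered_set<const Instr*> live(current.begin(), current.end());
  std::unordered_set<const Instr*> old(old_sequence.begin(),
                                       old_sequence.end());
  std::vector<const Instr*> pending;  // instructions added since scheduling
  for (const Instr* i : current) {
    if (old.count(i) == 0) pending.push_back(i);
  }

  std::vector<const Instr*> result;
  std::unordered_set<const Instr*> scheduled;
  // Emits every pending instruction whose operands are all scheduled,
  // repeating until nothing more becomes ready.
  auto flush_ready = [&]() {
    bool progress = true;
    while (progress) {
      progress = false;
      for (auto it = pending.begin(); it != pending.end();) {
        bool ready = true;
        for (const Instr* op : (*it)->operands) {
          ready = ready && scheduled.count(op) > 0;
        }
        if (ready) {
          result.push_back(*it);
          scheduled.insert(*it);
          it = pending.erase(it);
          progress = true;
        } else {
          ++it;
        }
      }
    }
  };

  flush_ready();  // new instructions with no operands go first
  for (const Instr* i : old_sequence) {
    if (live.count(i) == 0) continue;  // deleted from the module
    result.push_back(i);
    scheduled.insert(i);
    flush_ready();
  }
  assert(pending.empty() && "new instruction depends on an unscheduled one");
  return result;
}
```

The sketch only handles additions and deletions, matching the limited set of transformations named above. Arbitrary rewrites would need a full reschedule.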
Also move 'using' statements into namespaces.
PiperOrigin-RevId: 210055083
PiperOrigin-RevId: 210049592
PiperOrigin-RevId: 210018843
Unfortunately this has to be one big patch, because e.g. absl::StrCat
doesn't accept a TF StringPiece, but as soon as we switch to
absl::string_view, we have to switch away from all of the TF functions.
PiperOrigin-RevId: 209957896
209663919 by yifeif<yifeif@google.com>:
Internal change.
--
209663914 by amitpatankar<amitpatankar@google.com>:
Fix the topk_op_test for numpy>1.15.
--
209660476 by jdduke<jdduke@google.com>:
Fix model lifetime for TensorFlow Lite C# bindings
Ensure the model's existence for the duration of the interpreter,
as per API requirements.
--
209655960 by scottzhu<scottzhu@google.com>:
Unify RNN Cell interface between TF and Keras.
--
209655731 by A. Unique TensorFlower<gardener@tensorflow.org>:
Added tests for PredictionOps and PartitionExamplesOps
--
209655291 by nolivia<nolivia@google.com>:
Add a rate class so that we can save global_step/sec using tf.contrib.summary. The function takes the rate in relation to any tensors, provided that the numerator and denominator are broadcastable and have dtypes that can be cast to float64.
--
209654655 by kramerb<kramerb@google.com>:
[XLA] Switch from tensorflow::gtl::InlinedVector to absl::InlinedVector
This one comes with extra goodies like a move constructor.
--
209653851 by A. Unique TensorFlower<gardener@tensorflow.org>:
Internal build specification change
--
PiperOrigin-RevId: 209663919
Make use of the back-end specific fusion-operand sharing functions in copy insertion.
This allows potentially better copy insertion/elision.
PiperOrigin-RevId: 205481101
PiperOrigin-RevId: 203211687
This allows copy elision to use the same backend-specific HLO dataflow analysis and potentially elide more copies than when running copy insertion/elision on its own.
PiperOrigin-RevId: 203171335
It already detects layout-changing copies and those are already left unchanged
by copy elision. Special case copies are also skipped because they are tagged
separately (SetCopyElisionAllowed)
PiperOrigin-RevId: 202574858
PiperOrigin-RevId: 200472722
PiperOrigin-RevId: 200309129
After scheduling HLOs it is very beneficial to try more copy elision: The
sequential ordering from the schedule is stricter than the data-dependency ordering used
during copy insertion.
Also, allow more operands to share a buffer with their user. In particular, the user has to be element-wise only with respect to the specified operand, not with respect to all operands.
These two changes allow more copies to be eliminated.
PiperOrigin-RevId: 200292049
These methods have nothing to do with scheduling.
Also, rename methods CreateMemoryMinimizingSequence in hlo_scheduling.
PiperOrigin-RevId: 200254100
PiperOrigin-RevId: 199567935
liveness_util to methods on TuplePointsToAnalysis and HloDataflowAnalysis.
PiperOrigin-RevId: 196903216
easier to migrate from TuplePointsToAnalysis/LogicalBuffer to
HloDataflowAnalysis/HloValue. No functional changes.
PiperOrigin-RevId: 195179676
MemorySchedulerAlgorithm which is a function instead of an enum to allow experimentation with non-standard schedulers. Refactoring only; no functional changes to the scheduling itself.
PiperOrigin-RevId: 189830685
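The idea of passing the scheduler as a function rather than selecting it with an enum can be sketched roughly as follows. All names here (`SchedulerAlgorithm`, `ScheduleComputation`, `InputOrderScheduler`, `LargestFirstScheduler`) are hypothetical, and the toy schedulers ignore data dependencies entirely. They only show how a caller can plug in a non-standard algorithm.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical, simplified instruction: an id and its output size in bytes.
struct Instr {
  int id;
  long bytes;
};
using Sequence = std::vector<const Instr*>;

// The algorithm is a callable, so callers are not limited to a fixed enum.
using SchedulerAlgorithm = std::function<Sequence(const std::vector<Instr>&)>;

// Default: keep the order in which instructions were given.
Sequence InputOrderScheduler(const std::vector<Instr>& computation) {
  Sequence seq;
  for (const Instr& i : computation) seq.push_back(&i);
  return seq;
}

// An experimental scheduler: largest outputs first (toy example only).
Sequence LargestFirstScheduler(const std::vector<Instr>& computation) {
  Sequence seq = InputOrderScheduler(computation);
  std::stable_sort(seq.begin(), seq.end(),
                   [](const Instr* a, const Instr* b) {
                     return a->bytes > b->bytes;
                   });
  return seq;
}

// Uses the supplied algorithm, or the default when none is given.
Sequence ScheduleComputation(const std::vector<Instr>& computation,
                             const SchedulerAlgorithm& algorithm) {
  return algorithm ? algorithm(computation) : InputOrderScheduler(computation);
}
```

A lambda that captures tuning parameters can be passed just as easily, which an enum-based selector would not allow without adding a new enumerator.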
PiperOrigin-RevId: 186038783
PiperOrigin-RevId: 185623948
PiperOrigin-RevId: 185598764
List scheduling is sometimes more easily rematerialized; this
gives the ability to force list scheduling via the API.
PiperOrigin-RevId: 179246142
PiperOrigin-RevId: 178040190
PiperOrigin-RevId: 177229069
determine
if rematerializability or fusibility.
PiperOrigin-RevId: 176064783
and Recv into {Recv, RecvDone}. See operation_semantics.md for the updated
semantics.
PiperOrigin-RevId: 175216012
Like HloComputation::instructions(), HloModule::computations() used to
return a list of unique_ptrs. But this is an implementation detail that
shouldn't be leaked into the public API.
This patch also adds HloModule::MakeNonFusionComputations(), because
many of the callers of computations() went on to filter out all the
fusion computations.
It would be possible to implement MakeNonFusionComputations() "in place"
using a filtering iterator, but I don't think it's necessary -- we never
have *that* many computations, and since many callers go on to copy the
list of non-fusion computations, making it unconditionally a copy is
simpler and avoids a footgun.
PiperOrigin-RevId: 170529051
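The API shape argued for above can be sketched as follows: storage stays as unique_ptrs internally while accessors hand out raw pointers. `Module`, `Computation`, and `Add` are simplified stand-ins. Unlike the real accessor, which may return a lightweight view, `computations()` in this sketch also returns a copied vector for brevity.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical, simplified computation.
struct Computation {
  bool is_fusion;
};

class Module {
 public:
  Computation* Add(bool is_fusion) {
    computations_.push_back(
        std::unique_ptr<Computation>(new Computation{is_fusion}));
    return computations_.back().get();
  }

  // Raw pointers only: ownership via unique_ptr stays an implementation
  // detail.
  std::vector<Computation*> computations() const {
    std::vector<Computation*> result;
    for (const auto& c : computations_) result.push_back(c.get());
    return result;
  }

  // Always a fresh copy, so callers may add or remove computations while
  // iterating over the returned list.
  std::vector<Computation*> MakeNonFusionComputations() const {
    std::vector<Computation*> result;
    for (const auto& c : computations_) {
      if (!c->is_fusion) result.push_back(c.get());
    }
    return result;
  }

 private:
  std::vector<std::unique_ptr<Computation>> computations_;
};
```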
Currently it returns a view of unique_ptr<HloInstruction>s. But the
fact that these are unique_ptrs is an implementation detail, and it's
ugly to leak it everywhere.
PiperOrigin-RevId: 170445375
that the memory use tracker used in remat matched expectations, but the
expectation was not always correct in the case of dead values.
PiperOrigin-RevId: 168604525
PiperOrigin-RevId: 167609460
called_computations for a fusion node should only include the fusion
computation that it calls.
PiperOrigin-RevId: 167149669
* Added CompactPointerSet<T>, which is optimized for set size <= 1.
* Changed expensive CHECKs to DCHECKs in buffer_assignment.cc
* Reserve space in DFS state array before starting DFS.
* Use unsigned arithmetic in DFS state maintenance.
* HloInstruction:
- Moved frequently used fields to start for better cache locality.
- Use InlinedVector instead of vector for operand array.
- Use InlinedVector instead of vector for DFS stack.
* Pre-compute "is array" and "is tuple" for LogicalBuffer.
* PointsToSet:
- Combine two ShapeTrees into one.
- Use CompactPointerSet instead of std::set to hold sources.
- Use CompactPointerSet instead of std::set to hold flattened buffers.
* ShapeTree: use unique_ptr instead of optional for shape storage
(reduces size and destruction overhead).
* Add proper const qualifiers to some FlatSet iterator methods.
Co-author=jeff
PiperOrigin-RevId: 165759117
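As a rough illustration of a pointer set optimized for size <= 1, the sketch below stores a single element inline and only allocates a `std::set` once a second distinct element arrives. This is not the actual CompactPointerSet implementation, whose representation is not described here. It assumes `T` is a pointer type and that null is never inserted.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

// Sketch of a small-size-optimized pointer set. T must be a pointer type;
// nullptr is reserved to mean "empty" (an assumption of this sketch).
template <typename T>
class CompactPointerSet {
 public:
  // Returns true if `p` was newly inserted.
  bool insert(T p) {
    assert(p != nullptr);
    if (big_) return big_->insert(p).second;
    if (single_ == nullptr) {
      single_ = p;  // first element: no allocation
      return true;
    }
    if (single_ == p) return false;
    // Second distinct element: fall back to a heap-allocated std::set.
    big_.reset(new std::set<T>{single_, p});
    single_ = nullptr;
    return true;
  }

  bool contains(T p) const {
    if (big_) return big_->count(p) > 0;
    return single_ != nullptr && single_ == p;
  }

  std::size_t size() const {
    if (big_) return big_->size();
    return single_ != nullptr ? 1 : 0;
  }

 private:
  T single_ = nullptr;
  std::unique_ptr<std::set<T>> big_;
};
```

In the common case of zero or one element, the set costs two pointers and never touches the heap.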
We added conservative logic to not rematerialize operations with control dependencies, since the rematerialized operations could result in undesired ordering. However, we now realize that when we remat an operation, we also copy its dependencies, which guarantees that the rematerialized operation has the same constraints as the original operation.
PiperOrigin-RevId: 165654629
hlo rematerialization pass.
Changes:
* Wrap each HloInstruction* inside an Item structure that keeps
associated data. This allows us to get rid of a bunch of
hash tables indexed by HloInstruction*.
* Switch to an intrusive linked list (instead of std::list) so
that we can avoid a hash table that maps to std::list::iterator.
* Use inlined vector in a few places.
PiperOrigin-RevId: 163848365
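The first two changes listed above can be sketched together: per-instruction data lives in an `Item`, and the `Item` carries its own list links, so no side hash table from instruction to `std::list::iterator` is needed. The field names (`placed`, `memory_bytes`) and the `ItemList` interface are hypothetical, not taken from the rematerialization pass.

```cpp
#include <cassert>

// Hypothetical, simplified instruction.
struct Instr {
  int id;
};

// Per-instruction bookkeeping plus intrusive list links.
struct Item {
  const Instr* instruction = nullptr;
  bool placed = false;    // example of associated data
  long memory_bytes = 0;  // example of associated data
  Item* prev = nullptr;
  Item* next = nullptr;
};

// Doubly linked list that does not own its items. Given an Item*, insertion
// and removal are O(1) with no lookup.
class ItemList {
 public:
  Item* head() const { return head_; }

  void PushBack(Item* item) {
    item->prev = tail_;
    item->next = nullptr;
    if (tail_) tail_->next = item; else head_ = item;
    tail_ = item;
  }

  void InsertBefore(Item* item, Item* pos) {
    item->next = pos;
    item->prev = pos->prev;
    if (pos->prev) pos->prev->next = item; else head_ = item;
    pos->prev = item;
  }

  void Remove(Item* item) {
    if (item->prev) item->prev->next = item->next; else head_ = item->next;
    if (item->next) item->next->prev = item->prev; else tail_ = item->prev;
    item->prev = nullptr;
    item->next = nullptr;
  }

 private:
  Item* head_ = nullptr;
  Item* tail_ = nullptr;
};
```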
PiperOrigin-RevId: 163535344
PiperOrigin-RevId: 163210327
This is required for upcoming changes to convert the sequence creation functions
(and HeapSimulator and BufferAssignment) over to using the new
Hlo{Dataflow,Alias}Analysis.
It's required because otherwise there's a dependency cycle:
Hlo{Dataflow,Alias}Analysis depends on HloOrdering
CreateMemoryMinimizingSequence will depend on Hlo{Dataflow,Alias}Analysis
There's already a cycle here, if both HloOrdering and
CreateMemoryMinimizingSequence are in the same file. Also note that:
MinimumMemoryForSequence depends on HeapSimulator
HeapSimulator will depend on Hlo{Dataflow,Alias}Analysis
Hlo{Dataflow,Alias}Analysis depends on HloOrdering
Splitting out the sequence functions resolves the cycle.
Refactoring only; no functional changes.
PiperOrigin-RevId: 159731836
(total_memory/saved_memory).
PiperOrigin-RevId: 159290105
Simplify shape traversal visitors in ShapeUtil and ShapeTree. Add a non-Status form because most uses of the traversal methods do not use it, and remove is_leaf parameter from ShapeTree.ForEach* as it is not frequently used.
PiperOrigin-RevId: 158201574
PiperOrigin-RevId: 157415647
155425029 by A. Unique TensorFlower <gardener@tensorflow.org>:
Internal change.
--
155424167 by A. Unique TensorFlower <gardener@tensorflow.org>:
Internal change.
--
PiperOrigin-RevId: 155425029