| Commit message (Collapse) | Author | Age |
PiperOrigin-RevId: 215324035
PiperOrigin-RevId: 212701024
PiperOrigin-RevId: 210998142
AllUsersConsumeBF16() incorrectly used ValueTypeAfterChange() for the current value being checked, but it should be the original type.
Also, the fusion computation should be adjusted as soon as the fusion root is adjusted.
There was also redundant work for while computations. Now removed.
PiperOrigin-RevId: 206216822
The while loop input and output alias each other, so if an input is also used by other ops that cannot use BF16, the propagation pass cannot change that input/output to BF16 even if all uses in the while loop can use BF16. Add copies for each while loop operand. This increases the chance to propagate BF16 through the while loop; if some of these copies do not help, they will remain same-shape copies and be removed at the end.
This can sometimes increase HBM usage because both BF16 and F32 copies are alive, and can sometimes reduce HBM usage.
PiperOrigin-RevId: 203848348
Currently the Literal classes sit in literal_util.{h,cc} instead of literal.{h,cc}.
literal_util also contains helper functions that would fit better in their own
separate class/namespace. This change starts this process by moving most static
factory methods to the LiteralUtil namespace.
PiperOrigin-RevId: 203217065
Using simple keys is more efficient.
PiperOrigin-RevId: 202377039
Domain instructions are only there to carry some metadata, so they don't
affect the precision of the data, and we should propagate BF16 through
them.
The special code for handling domain instructions is needed because this
is the only HLO that has the same tuple-shaped operand and result.
PiperOrigin-RevId: 200968713
std::list is just hilariously inefficient and the postorder list creation has
been rewritten to no longer depend on splicing, so there's no need for the
list. While there, remove the old unused postorder list creation code.
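
As a rough illustration of building a post order without list splicing, the sketch below appends to a std::vector from an iterative depth-first traversal. It is not the XLA implementation: the function name PostOrder, the adjacency-list representation, and the integer node ids are all assumptions made for this example, and the graph is assumed to be acyclic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not XLA code. operands[i] lists the nodes that node i
// consumes. Returns the nodes reachable from `root` so that every operand
// appears before its users. Assumes the graph is a DAG.
std::vector<int> PostOrder(const std::vector<std::vector<int>>& operands,
                           int root) {
  enum State { kUnvisited, kVisiting, kDone };
  std::vector<State> state(operands.size(), kUnvisited);
  std::vector<int> order;          // result, only ever appended to
  std::vector<int> stack = {root}; // explicit DFS stack

  while (!stack.empty()) {
    int node = stack.back();
    if (state[node] == kDone) {
      // Duplicate stack entry for a node that was already emitted.
      stack.pop_back();
    } else if (state[node] == kVisiting) {
      // Second time on top of the stack: all operands have been emitted.
      state[node] = kDone;
      order.push_back(node);
      stack.pop_back();
    } else {
      // First visit: keep the node on the stack and schedule its operands.
      state[node] = kVisiting;
      for (int operand : operands[node]) {
        if (state[operand] == kUnvisited) stack.push_back(operand);
      }
    }
  }
  return order;
}
```

Because the vector is only appended to, the result is built with contiguous storage and no per-node allocation, which is the kind of saving the message above alludes to.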
PiperOrigin-RevId: 200743677
A TOKEN primitive type was added with cl/199215963 and XLA also has an OPAQUE primitive type. However, in many places in XLA we assume either a tuple or array. This CL fixes many of those instances, but some may remain. Identified instances were discovered by searching for IsTuple or IsArray so the set of fixes is not exhaustive.
Also opportunistically addressed a couple of potential points of confusion in the ShapeUtil interface:
(1) Rename ShapeUtil::HasZeroElements to ShapeUtil::IsZeroElementArray. The point of confusion here is that tuples can also have zero elements, and HasZeroElements would CHECK-fail on tuple shapes. The method no longer CHECK-fails if the given shape is not an array.
(2) ShapeUtil::IsNil now returns true only for empty tuples. Previously it also returned true for zero-element array types which was confusing because ShapeUtil::MakeNil creates an empty tuple.
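
The semantics described in (1) and (2) can be sketched on a toy shape type. This is only an illustration under stated assumptions: ToyShape and Kind are hypothetical stand-ins rather than xla::Shape, and the two free functions merely mirror the behavior the message describes for ShapeUtil::IsZeroElementArray and ShapeUtil::IsNil.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a shape; NOT the real xla::Shape.
enum class Kind { kArray, kTuple, kToken, kOpaque };

struct ToyShape {
  Kind kind;
  std::vector<int64_t> dimensions;       // meaningful only for kArray
  std::vector<ToyShape> tuple_elements;  // meaningful only for kTuple
};

// True only for arrays that hold no elements (some dimension is 0).
// Returns false, rather than aborting, for tuples, tokens and opaques.
bool IsZeroElementArray(const ToyShape& shape) {
  if (shape.kind != Kind::kArray) return false;
  for (int64_t dim : shape.dimensions) {
    if (dim == 0) return true;
  }
  return false;
}

// True only for the empty tuple, never for a zero-element array.
bool IsNil(const ToyShape& shape) {
  return shape.kind == Kind::kTuple && shape.tuple_elements.empty();
}
```

With these definitions a shape such as f32[2,0] is a zero-element array but not nil, while the empty tuple is nil but not a zero-element array.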
PiperOrigin-RevId: 200452672
fusion.
We now use a set to track all the potential changes, and do the actual changes
on the HLOs at the end. This also makes the boolean return value (whether
anything is changed) correct.
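
A minimal sketch of the deferred-change pattern described above, assuming a much simpler setting than the real pass: instructions are represented by names in a map, and ApplyDeferredChanges, Type, and the container choices are hypothetical. Candidates are collected in a set first, and the mutation plus the "changed" flag are computed in a single step at the end.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

enum class Type { F32, BF16 };

// Hypothetical helper, not XLA code. `candidates` holds the names that the
// analysis phase decided could become BF16. Nothing is mutated until this
// function runs, so the return value reflects real changes only.
bool ApplyDeferredChanges(std::map<std::string, Type>& types,
                          const std::set<std::string>& candidates) {
  bool changed = false;
  for (const std::string& name : candidates) {
    auto it = types.find(name);
    if (it != types.end() && it->second == Type::F32) {
      it->second = Type::BF16;
      changed = true;  // set only when a value was actually rewritten
    }
  }
  return changed;
}
```

Running the function a second time with the same candidates returns false, since every candidate is already BF16.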
PiperOrigin-RevId: 195160025
PiperOrigin-RevId: 193465140
PiperOrigin-RevId: 190251081
If CRS has tuple output, it needs special handling in conversion folding.
BF16 propagation could result in BF16->BF16 conversions, which can be removed.
PiperOrigin-RevId: 189380578
PiperOrigin-RevId: 187899955
PiperOrigin-RevId: 187644155
Previously, the propagation pass might produce a different precision in the fused
computation's root than in the fusion itself when the fused root doesn't define a buffer.
Add explicit converts at such fusion roots.
PiperOrigin-RevId: 186812368
Use the BFloat16Support provided by the backend to determine what precision is needed for
each HloInstruction. If the implementation of some HLOs already reduces input precision to BF16, this pass can enable BF16 on more ops without affecting the result.
PiperOrigin-RevId: 186656378