| Commit message | Author | Age |
duplicate definitions.
PiperOrigin-RevId: 211662523
PiperOrigin-RevId: 211661670
use_gpu does not affect the creation of the session, it only affects the
context manager in which nodes are added to the graph, so it should not
be included in the consistency check.
PiperOrigin-RevId: 211659833
LSTM Op.
This introduces a connection between forward and backward cells across subsequent layers when stacking bidirectional LSTM Ops on top of each other.
In more detail:
Previously, the Op had only one input that was fed into the layer in the
following way:
      INPUT      (INPUT_REVERSED)
        |               |
  ---------------------------
  |  FW_LSTM        BW_LSTM  |   <----- bidi-LSTM cell (with one input / two outputs)
  ---------------------------
        |               |
     FW_OUT          BW_OUT
Now, the Op can have an (optional) auxiliary input in the following way:
           AUX_INPUT   (AUX_INPUT_REVERSED)
               |              |
     INPUT     |  (INPUT_R'D.)|
       |       |       |      |
  -----------------------------
  |     \     /         \    /  |
  |    FW_LSTM         BW_LSTM  |   <----- bidi-LSTM cell (with 2 inputs / 2 outputs)
  -----------------------------
         |                |
      FW_OUT           BW_OUT
When stacking these Ops, previously, only the following flow was allowed:
           Input
          /     \
   FW_LSTM1     BW_LSTM1
      |             |
      |             |
   FW_LSTM2     BW_LSTM2
      |             |
      |             |
   FW_LSTM3     BW_LSTM3
          \     /
           Output
With the introduction of an auxiliary input to the bidi-LSTM layer, the forward
(FW_LSTMi) output of the ith layer is fed as the input to the next layer
(hence, as the input to both FW_LSTM{i+1} and BW_LSTM{i+1}) and the backward
output is fed as the auxiliary input to both FW_LSTM{i+1} and BW_LSTM{i+1}.
This way, the stacking can be changed to allow for "cross-linking" between
subsequent layers in the following way:
           Input
          /     \
   FW_LSTM1     BW_LSTM1
      |   \   /   |
      |   /   \   |
   FW_LSTM2     BW_LSTM2
      |   \   /   |
      |   /   \   |
   FW_LSTM3     BW_LSTM3
          \     /
           Output
PiperOrigin-RevId: 211659472
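The cross-linked data flow described in the commit above can be sketched in plain Python. This is only an illustration of the wiring, not the Op's implementation: `make_toy_layer` and `stack_cross_linked` are hypothetical names, running sums stand in for real LSTM cells, and adding the auxiliary input to the main input element-wise is an assumption made to keep the example small.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Seq = List[float]
# Toy stand-in for one bidirectional layer: (input, aux_input) -> (fw_out, bw_out).
Layer = Callable[[Seq, Optional[Seq]], Tuple[Seq, Seq]]


def make_toy_layer(scale: float) -> Layer:
    """Builds a hypothetical layer; running sums stand in for real LSTM cells."""

    def run_direction(xs: Seq) -> Seq:
        total, out = 0.0, []
        for x in xs:
            total += scale * x
            out.append(total)
        return out

    def layer(inp: Seq, aux: Optional[Seq] = None) -> Tuple[Seq, Seq]:
        # The auxiliary input is optional; when present, both directions see it.
        # Element-wise addition is an assumption of this sketch.
        merged = inp if aux is None else [a + b for a, b in zip(inp, aux)]
        fw_out = run_direction(merged)
        # The backward direction consumes the reversed sequence, then restores order.
        bw_out = run_direction(merged[::-1])[::-1]
        return fw_out, bw_out

    return layer


def stack_cross_linked(layers: Sequence[Layer], inp: Seq) -> Tuple[Seq, Seq]:
    """Feeds FW_OUT of layer i as input and BW_OUT as aux input of layer i+1."""
    fw_out, bw_out = layers[0](inp, None)
    for layer in layers[1:]:
        fw_out, bw_out = layer(fw_out, bw_out)
    return fw_out, bw_out
```

Calling `stack_cross_linked([make_toy_layer(1.0), make_toy_layer(1.0)], [1.0, 2.0])` runs the first layer without an auxiliary input and passes both of its outputs to the second layer, which matches the second stacking diagram.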
Currently we represent a sequential schedule of a module using a SequentialHloOrdering::HloModuleSequence, which is a type alias of a bare map from HloComputation* to std::vector<HloInstruction*>. This CL replaces this with a proper class, which results in better encapsulation of code that deals with schedules and better enforcement of invariants.
This CL also fixes a corner-case bug in dataflow analysis, where values of instructions which are live out of the computation erroneously did not interfere with the values of instructions scheduled after the root instruction.
PiperOrigin-RevId: 211656888
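As a rough illustration of why a dedicated class helps compared with a bare map, the sketch below wraps a computation-to-instruction-order mapping and checks an invariant on every update. It is written in Python rather than C++, and the class name `Schedule`, its methods, and the "no duplicate instructions" invariant are all assumptions made for the example, not the interface introduced by the CL.

```python
from typing import Dict, List, Sequence


class Schedule:
    """Hypothetical wrapper around a computation -> instruction-order map."""

    def __init__(self) -> None:
        self._sequences: Dict[str, List[str]] = {}

    def set_sequence(self, computation: str, instructions: Sequence[str]) -> None:
        # Invariant (an assumption of this sketch): no instruction appears twice.
        if len(set(instructions)) != len(instructions):
            raise ValueError(f"duplicate instruction in schedule of {computation}")
        self._sequences[computation] = list(instructions)

    def sequence(self, computation: str) -> List[str]:
        # Return a copy so callers cannot bypass the invariant check.
        return list(self._sequences[computation])

    def is_scheduled_before(self, computation: str, a: str, b: str) -> bool:
        order = self._sequences[computation]
        return order.index(a) < order.index(b)
```

With a bare map, any caller could insert a malformed sequence; here the only write path goes through `set_sequence`, so the check cannot be skipped.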
~25-30% speedup when compiled with AVX.
* collapse inner dims before contraction
* eval kernel tensor before contraction
PiperOrigin-RevId: 211651030
This prevents these build-time rules from accessing any GPUs which might
be present on the build machine and interfering with GPU tests which
might be running concurrently.
PiperOrigin-RevId: 211647681
PiperOrigin-RevId: 211643209
accumulator.
PiperOrigin-RevId: 211642436
PiperOrigin-RevId: 211639440
PiperOrigin-RevId: 211637019
PiperOrigin-RevId: 211633744
PiperOrigin-RevId: 211631516
* Split CondState into CondState (which corresponds to scope previously) and
AncestorState (which tracks which switch/merge nodes are ancestors of a
node). Previously CondState tracked both, but that resulted in meet rules
that were difficult to follow. With the two split out, the meets for merge
and non-merge are straightforward set operations. The ancestor relation is
similarly easy to compute along with the CondState computation.
* Enhance the redundant switch checking: previously we only considered the
predicates but
%s=switch(val=%P, pred=switch(%P_1, %P):then)
is also redundant as if %P is true then %s:else is dead.
* Enhance in-edge testing to insert a switch if a value from an outer context
is consumed inside an inner context.
* Rename CondStateMap to StateMap to match new usage.
PiperOrigin-RevId: 211622021
PiperOrigin-RevId: 211621189
PiperOrigin-RevId: 211598349
PiperOrigin-RevId: 211592901
Add a missing check to InferConvolveShape(): the output feature dimension needs to be divisible by feature_group_count.
Also fix some tests which took a const reference to the return value of
a function which doesn't return a reference.
PiperOrigin-RevId: 211592011
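The divisibility rule stated in the commit above amounts to a simple modulo test. The following is a minimal Python sketch of such a check, not the code in InferConvolveShape(): the function name `check_feature_group_count` is hypothetical, and the additional requirement that the group count be positive is an assumption of this example.

```python
def check_feature_group_count(output_features: int, feature_group_count: int) -> None:
    """Raises ValueError if the output feature dimension cannot be split evenly."""
    # Assumption of this sketch: a group count must be at least 1.
    if feature_group_count <= 0:
        raise ValueError("feature_group_count must be positive")
    # The rule from the commit message: output features must divide evenly.
    if output_features % feature_group_count != 0:
        raise ValueError(
            f"output feature dimension {output_features} is not divisible by "
            f"feature_group_count {feature_group_count}"
        )
```

For instance, 64 output features with 4 groups passes, while 10 output features with 4 groups is rejected.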
PiperOrigin-RevId: 211588937
PiperOrigin-RevId: 211586062
PiperOrigin-RevId: 211584024
PiperOrigin-RevId: 211581486
PiperOrigin-RevId: 211581348
strategy.
PiperOrigin-RevId: 211576839
the user doesn't have to pass it again to session_config.
PiperOrigin-RevId: 211576564
Also minor fix of enabling quantization of shared weights if hybrid evaluation is true.
PiperOrigin-RevId: 211573947
Parameter server strategy where variables are shared across sessions.
PiperOrigin-RevId: 211573447
This is implemented as a custom op instead of a builtin op because Relu1 is not
supported in TensorFlow and is not commonly used.
PiperOrigin-RevId: 211571619
PiperOrigin-RevId: 211570665
function.
PiperOrigin-RevId: 211564198
- Remove unnecessary use of test_session() in tests that run with eager
execution enabled.
- Use cached_session() instead of test_session()
(self.test_session() has been deprecated in
9962eb5e84b15e309410071b06c2ed2d6148ed44 as its name confuses readers of the
test. Moving to cached_session() instead which is more explicit about:
* the fact that the session may be reused.
* the session is not closed even when doing a "with self.test_session()"
statement.)
PiperOrigin-RevId: 211562969
PiperOrigin-RevId: 211562900
If the feature_group_count is 1, don't bother showing it as it is not very
informative and a very common scenario. This is consistent with the
HloCustomCall's feature_group_count attribute.
PiperOrigin-RevId: 211560372
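The "omit the attribute when it has its default value" behaviour can be sketched as below. This is an illustrative Python fragment only: `format_conv_attributes` and the exact output text are hypothetical and are not taken from the HLO printer.

```python
def format_conv_attributes(window: str, feature_group_count: int = 1) -> str:
    """Formats convolution attributes, skipping feature_group_count when it is 1."""
    parts = [f"window={{{window}}}"]
    # A group count of 1 is the common case and carries no information, so skip it.
    if feature_group_count != 1:
        parts.append(f"feature_group_count={feature_group_count}")
    return ", ".join(parts)
```

Under these assumptions, the default case prints only the window, and the group count appears only when it differs from 1.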
PiperOrigin-RevId: 211557743
PiperOrigin-RevId: 211557740
This patch uses the take-by-value-and-move idiom to optimize copying of constructor arguments.
PiperOrigin-RevId: 211553877
PiperOrigin-RevId: 211552101
value}}` and `^^key:value^^`. This change consolidates these two formats.
PiperOrigin-RevId: 211550259
switching from assertAllEqual to assertAllClose.
PiperOrigin-RevId: 211543406
microcontrollers
PiperOrigin-RevId: 211543125
PiperOrigin-RevId: 211542593
- Remove unnecessary test_session() boilerplate when executing eagerly
- Use self.cached_session() instead of self.test_session() when using graphs
self.test_session() has been deprecated in 9962eb5e84b15e309410071b06c2ed2d6148ed44 as its name confuses readers of the test. Moving to cached_session() instead which is more explicit about:
* the fact that the session may be reused.
* the session is not closed even when doing a "with self.test_session()" statement.
PiperOrigin-RevId: 211542360
PiperOrigin-RevId: 211541639
PiperOrigin-RevId: 211540844
PiperOrigin-RevId: 211535930
PiperOrigin-RevId: 211534283
PiperOrigin-RevId: 211532963
PiperOrigin-RevId: 211531374
PiperOrigin-RevId: 211524810