| Commit message | Author | Age |
Merge XLA-specific operator registrations into a single file rather than having many tiny files.
In passing, register a fill function for the bfloat16 numpy type; this is needed for the np.arange() call in the sort unit test.
PiperOrigin-RevId: 201005718
PiperOrigin-RevId: 201005676
PiperOrigin-RevId: 201004909
This makes the failure output less confusing.
PiperOrigin-RevId: 201001511
comments on renaming the file and op.
PiperOrigin-RevId: 200999974
fusion test
This allows making the GPU emitter checks more restrictive (otherwise this would be a miscompile). Layout assignment currently cannot run with pre-assigned layouts.
PiperOrigin-RevId: 200993754
(Prepare for upgrading Bazel to 0.14.1 on Windows)
PiperOrigin-RevId: 200988382
Domain instructions are only there to carry metadata; they don't affect the
precision of the data, so we should propagate BF16 through them.
Special code is needed to handle domain instructions because they are the
only HLOs that have the same tuple-shaped operand and result.
PiperOrigin-RevId: 200968713
If the input and output element types of a bitcast are the same (i.e., it is
only a layout and shape change), then its effective output precision is the
same as its input precision.
PiperOrigin-RevId: 200966788
PiperOrigin-RevId: 200934420
Also fixed other problems:
- Fix bounds checking on tensor index
- Fix tensor byte size to be size_t
- Fix memory leak in buffer allocation
- Remove dependency on core tensorflow
In a subsequent CL I will refactor to not require logging and instead send
ValueErrors or RuntimeErrors back as exceptions that properly use TFLite
ErrorReporters.
PiperOrigin-RevId: 200915674
Handles the case where max_buckets_for_bucketized overwrites an existing key in the bucket_size_to_feature_ids_dict.
This can happen if:
a) There are no bucketized features
b) The max buckets for bucketized features is actually 2 (clashing with max_buckets_for_indicator)
PiperOrigin-RevId: 200908269
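The clash described above can be avoided by merging into an existing entry instead of overwriting it. A minimal sketch in plain Python, with hypothetical names (the real dictionary lives in the boosted-trees estimator code):

```python
# Hypothetical helper: when two feature kinds map to the same bucket
# size, extend the stored list of feature ids rather than letting the
# later write replace the earlier one.
def add_features(bucket_size_to_feature_ids_dict, bucket_size, feature_ids):
    # setdefault keeps any list already stored under this bucket size.
    bucket_size_to_feature_ids_dict.setdefault(bucket_size, []).extend(feature_ids)

d = {}
add_features(d, 2, [0, 1])   # indicator features (max buckets == 2)
add_features(d, 2, [5])      # bucketized features clashing on the same key
print(d)  # {2: [0, 1, 5]}
```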
PiperOrigin-RevId: 200895985
PiperOrigin-RevId: 200892708
PiperOrigin-RevId: 200852741
PiperOrigin-RevId: 200849332
PiperOrigin-RevId: 200843761
I was hoping not to do this, but the motivating benchmark for all this work has
reshapes on degenerate dimensions. This also forced me to introduce a new node
into the analysis, which isn't great (we don't want to replicate HLO inside
IndexedArrayAnalysis!), but this is the cleanest solution I can think of.
In brief, I support gather-reshape folding with degenerate dimensions by
disallowing them in the core, tricky part of the algorithm and instead reshaping
the degenerate dimensions "in and out" in a helper that calls the core part of
the folding logic.
Also worth calling out: before, we weren't being conservative -- we were just
buggy. For instance, the CHECK_NE(candidate_operand_dim, 0) in
ComputeReshapePassthroughDimPairs can fail with degenerate dims.
I also made some other supporting changes:
- I was not checking window bounds in ComputeArrayForGather. I've fixed this
  and beefed up testing in this area (the hammer for all my nails).
- Added a bunch of VLOG(3) info that was useful when debugging.
- Added a simple helper to the test that makes the strings I'm matching against
  "whitespace insensitive" so that I can indent them.
I'm happy to pull these out into separate CLs if that makes reviewing easier,
but for now I took the path of least resistance. :)
PiperOrigin-RevId: 200821883
PiperOrigin-RevId: 200820561
PiperOrigin-RevId: 200817339
PiperOrigin-RevId: 200814177
PiperOrigin-RevId: 200809229
`tf.data.Dataset.map()`, switching from using a fixed-size circular buffer to a deque.
PiperOrigin-RevId: 200808498
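The switch described above can be sketched in plain Python (the actual change is in the C++ `tf.data` kernels; this is only an illustration of why a deque is a better fit than a fixed-size circular buffer):

```python
from collections import deque

# A deque grows and shrinks with the number of in-flight elements,
# whereas a fixed-size circular buffer reserves worst-case capacity
# up front and needs index-wrapping bookkeeping.
buf = deque()
for i in range(5):
    buf.append(i)        # producer enqueues a result

first = buf.popleft()    # consumer dequeues in FIFO order
print(first, list(buf))  # 0 [1, 2, 3, 4]
```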
ControlEdge. There cannot be a duplicate, since fetch_node and feed_node are newly created. This change reduces the complexity of FetchOutputs from quadratic to linear.
PiperOrigin-RevId: 200807286
PiperOrigin-RevId: 200802842
When a function being converted to a defun runs, it can output
non-tensor values. The "shape" of these non-tensor values is
set to None in _output_shapes. When we set the shapes at the
end of the defun's __call__, we need to skip these Nones.
Also, unrelatedly, add a test for basic gradient tape and a test
for using strided_slice inside a compiled and taped defun.
strided_slice "stresses" XLA's constant inference for arguments
that must be constant.
PiperOrigin-RevId: 200802717
TensorShape.__eq__ will return False if there are any unknown
dimensions in the shapes being compared, even if both shapes have
unknown dims in the same places. This means that the assert in
control_flow_ops.py would sometimes trigger spuriously. This change
removes the assert, since it was only for debugging anyway.
PiperOrigin-RevId: 200801159
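The equality semantics described above can be mimicked in plain Python (a hypothetical stand-in, not the real TensorShape class):

```python
class Shape:
    """Toy shape: dims is a list where None marks an unknown dimension."""

    def __init__(self, dims):
        self.dims = dims

    def __eq__(self, other):
        # Any unknown dimension makes the comparison False, even when
        # both shapes have None in the same positions -- the behavior
        # the commit message attributes to TensorShape.__eq__.
        if any(d is None for d in self.dims + other.dims):
            return False
        return self.dims == other.dims

print(Shape([None, 3]) == Shape([None, 3]))  # False: unknown dims present
print(Shape([2, 3]) == Shape([2, 3]))        # True: fully known and equal
```

This is why an assert comparing two partially-known shapes can fire even when the shapes are "the same", which is what made the removed assert spurious.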
PiperOrigin-RevId: 200800013
PiperOrigin-RevId: 200799531
(Also planning to use this in Sonnet)
PiperOrigin-RevId: 200797385
PiperOrigin-RevId: 200795402
This simply wraps the Transpose with a Conj.
PiperOrigin-RevId: 200793274
The full list of inputs to the generated TF function is created by appending
the Tensor-valued keyword arguments (sorted by key) to the list of
Tensor-valued args.
PiperOrigin-RevId: 200792676
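The input ordering described above can be sketched with a hypothetical helper (names are illustrative, not the actual TF function-capture code):

```python
def flatten_inputs(args, kwargs):
    # Positional tensor arguments first, then keyword tensor arguments
    # sorted by key, as the commit message describes.
    return list(args) + [kwargs[k] for k in sorted(kwargs)]

# Strings stand in for tensors here.
print(flatten_inputs(("a", "b"), {"y": "t_y", "x": "t_x"}))
# ['a', 'b', 't_x', 't_y']
```

Sorting by key makes the generated function's signature deterministic regardless of the order keyword arguments were passed in.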
* EagerPyFunc now validates its assumption that returned tensors are backed by memory on the same device that the EagerPyFunc kernel executed on.
* Make the Python trampolining mechanism ensure that this requirement of the kernel is met.
* Allow tf.contrib.eager.py_func to execute correctly on devices other than CPU and GPU:0.
Prior to this change, tf.contrib.eager.py_func() would copy data from CPU to GPU:0 if necessary, but not the other way around. As a result, the assumptions made by the EagerPyFunc kernel implementation about the placement of returned tensors could be violated.
The test added in py_func_test.py, when executed on a machine with a GPU, will:
- Fail with a segmentation fault (dereferencing GPU memory) without the changes to py_func.cc and script_ops.py
- Fail with an error message with the change to py_func.cc but without the change to script_ops.py
- Pass with the changes to both py_func.cc and script_ops.py
PiperOrigin-RevId: 200792057
PiperOrigin-RevId: 200791799
PiperOrigin-RevId: 200791012
PiperOrigin-RevId: 200790410
1) Update stats.
2) Update the number of examples visited.
3) If the number of examples reaches the target, grow the tree.
PiperOrigin-RevId: 200790145
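The three steps above can be sketched as follows; all names here are hypothetical (the real logic lives in the boosted-trees kernels):

```python
class GrowthState:
    """Toy model of the per-batch update loop described above."""

    def __init__(self, target):
        self.stats = 0.0
        self.num_examples = 0
        self.target = target
        self.tree_grown = False

    def process_batch(self, examples):
        self.stats += sum(examples)           # 1) update stats
        self.num_examples += len(examples)    # 2) update examples visited
        if self.num_examples >= self.target:  # 3) grow tree at the target
            self.tree_grown = True

s = GrowthState(target=3)
s.process_batch([1.0, 2.0])
print(s.tree_grown)  # False: only 2 of 3 examples seen
s.process_batch([0.5])
print(s.tree_grown)  # True: target reached
```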
PiperOrigin-RevId: 200789288
Prior to this change, we were using the C API for everything except
Tensor.shape calls, which returned the result from the original Python
shape inference code. With this change, we use the C API in this case
as well. The C API has better shape inference, so this has the effect
of returning more precise shapes in some cases.
This change can be disabled by setting the environment variable
TF_C_API_GRAPH_CONSTRUCTION_SHAPES=0. However, this toggle will
be removed altogether in the near future.
This also fixes a bug in the SWIG layer that could cause large shape
dimensions to be incorrect.
PiperOrigin-RevId: 200783822
This makes it more convenient to use layers of different dtypes in a model. Instead of having to manually cast intermediate tensors between layers of different dtypes, they will be cast automatically.
This is also useful for the upcoming mixed precision API.
PiperOrigin-RevId: 200783477
PiperOrigin-RevId: 200783258
This change:
* Creates a new global variable, control_flow_ops._ENABLE_COND_V2, to use
cond_v2 by default when calling tf.cond. This variable can also be
controlled via the environment variable TF_ENABLE_COND_V2.
* Moves cond_v2 out of contrib so it's accessible from control_flow_ops.py.
* Lazily "imports" some modules in cond_v2 to avoid circular dependencies.
Note that these lazy "imports" must be imported by the cond_v2 caller (or
recursively by one of the caller's imports) in order for cond_v2 to have
access to them.
* Renames the cond_v2 module to cond_v2_impl, and creates a new cond_v2 module
that imports the cond_v2 method and the necessary extra imports. This is
useful for explicitly calling cond_v2 outside of control_flow_ops.cond.
PiperOrigin-RevId: 200778208
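An environment-variable toggle like the one described above can be sketched as follows; the variable names come from the commit message, but the wiring is illustrative, not the actual control_flow_ops code:

```python
import os

# Module-level flag, overridable via the environment (default: off).
_ENABLE_COND_V2 = os.getenv("TF_ENABLE_COND_V2", "0") != "0"

def cond_v2(pred, true_fn, false_fn):
    # Hypothetical stand-in for the new implementation.
    return true_fn() if pred else false_fn()

def cond(pred, true_fn, false_fn):
    # Dispatch to the new implementation only when the flag is set.
    if _ENABLE_COND_V2:
        return cond_v2(pred, true_fn, false_fn)
    return true_fn() if pred else false_fn()  # legacy path

print(cond(True, lambda: "taken", lambda: "not taken"))  # taken
```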
PiperOrigin-RevId: 200777514
PiperOrigin-RevId: 200774484
PiperOrigin-RevId: 200771096
Delete most print statements, use logging instead of print, and close files (to clear the "Unclosed file" warnings).
Normally this produces thousands of lines of output, mostly noise.
PiperOrigin-RevId: 200769210
mode.
PiperOrigin-RevId: 200768236
PiperOrigin-RevId: 200766687
PiperOrigin-RevId: 200764324