path: root/tensorflow/compiler
Commit message | Author | Age
...
* [XLA] Fix handling of tuple constants in HLO constant folding. (Peter Hawkins, 2018-10-03)
  PiperOrigin-RevId: 215676675
* [TF:XLA] Use xla::Iota rather than expanding Range ops to constants. (Peter Hawkins, 2018-10-03)
  PiperOrigin-RevId: 215668016
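  For illustration, a minimal sketch of the replacement this commit describes, assuming the xla::XlaBuilder client API of this era; the actual tf2xla kernel change may differ in detail:

    #include "tensorflow/compiler/xla/client/xla_builder.h"

    // Builds a [0, size) range as a single iota HLO instruction instead
    // of materializing an O(size) constant literal at compile time.
    xla::XlaOp MakeRange(xla::XlaBuilder* b, int64_t size) {
      return xla::Iota(b, xla::S32, size);  // one instruction, O(1) program size
    }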
* [XLA] Add a size limit to the constant folder to avoid forming giant constants during compilation. (Peter Hawkins, 2018-10-03)
  PiperOrigin-RevId: 215663002
* [TF:XLA] Improve the accounting for subcomputations in the heap simulator. (Dimitris Vardoulakis, 2018-10-03)
  Subtract the size of the aliased buffers from the subcomputation estimate instead of from the current computation. This way, the memory estimate for the current computation is more accurate. For the newly added test, the heap simulation calculates 48 bytes at head instead of the correct 64 bytes.
  PiperOrigin-RevId: 215653047
* [XLA] Revise the way to express a CPU-specific test. (Bixia Zheng, 2018-10-03)
  Use #ifdef XLA_TEST_BACKEND_CPU to protect the test instead of disabling it for every backend other than CPU.
  PiperOrigin-RevId: 215651036
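  A minimal sketch of the #ifdef pattern; the macro name is from the commit, while the test and fixture names here are invented:

    #ifdef XLA_TEST_BACKEND_CPU
    // Compiled (and therefore run) only when the test target is built
    // against the CPU backend; other backends never see the test.
    XLA_TEST_F(SomeCpuOnlyTest, ChecksCpuSpecificBehavior) {
      // ... CPU-only assertions ...
    }
    #endif  // XLA_TEST_BACKEND_CPU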
* [XLA] Disable a test for layout-changing elementwise operations. (Bixia Zheng, 2018-10-03)
  Rename the test to make it obvious that it tests codegen correctness in handling layout-changing elementwise operations. Keep the test only for the CPU backend.
  PiperOrigin-RevId: 215630611
* Move out-params to end of argument list and add an out_ prefix; NFC (Sanjoy Das, 2018-10-03)
  PiperOrigin-RevId: 215624875
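  A hypothetical before/after showing the convention; the function and types are invented, not the code touched by the commit:

    // Before: out-param leads the argument list with no naming hint.
    Status ComputeSchedule(Schedule* schedule, const HloModule& module);
    // After: inputs first, out-params last, flagged with an out_ prefix.
    Status ComputeSchedule(const HloModule& module, Schedule* out_schedule);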
* Fix handling of tuples in CreateCopyWithNewLayout. (A. Unique TensorFlower, 2018-10-03)
  If the layout of a single tensor in a tuple differs from its use, CreateCopyWithNewLayout will do a deep copy of the entire tuple. Not only does this create unnecessary copies of elements whose layout is unchanged, it also throws an error if the tuple contains elements such as token[] that cannot be copied. As a result, layout assignment on TPU occasionally caused mysterious compilation failures for code that runs correctly on CPU and GPU.
  PiperOrigin-RevId: 215615731
* Automated rollback of commit 2af8fd975aaf5c70ebb396895fa15a8f034a8440 (Tong Shen, 2018-10-03)
  PiperOrigin-RevId: 215608349
* Skip control flow functionalization if there is no Switch or Merge node. (Tong Shen, 2018-10-03)
  PiperOrigin-RevId: 215580891
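  A minimal sketch of such a check, assuming TensorFlow's Graph API (Node::IsSwitch and Node::IsMerge are real accessors; the helper itself is invented):

    // Returns true if functionalization has any work to do.
    bool HasSwitchOrMerge(const tensorflow::Graph& graph) {
      for (const tensorflow::Node* n : graph.nodes()) {
        if (n->IsSwitch() || n->IsMerge()) return true;
      }
      return false;
    }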
* [XLA] In the HLO parser, give the module a non-empty default name. (A. Unique TensorFlower, 2018-10-02)
  Otherwise, when parsing a single instruction, the parsed module doesn't have a name, which fails the HLO verifier check.
  PiperOrigin-RevId: 215519412
* [XLA:CPU] Re-enable the inliner pass in the cpu compiler. (A. Unique TensorFlower, 2018-10-02)
  PiperOrigin-RevId: 215517752
* [XLA] Modify the function that determines whether an instruction can change layout so that it can be used by the HLO verifier. (Bixia Zheng, 2018-10-02)
  Change the function to a static member function of the LayoutAssignment class. Add an std::function member to LayoutAssignment to store the function object passed down from the backend compiler class, and use it to decide whether an instruction can change layouts. Fix affected test cases.
  PiperOrigin-RevId: 215515611
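  A minimal sketch of the shape this change gives LayoutAssignment; the member and method names here are assumptions, not the exact XLA declarations:

    class LayoutAssignment : public HloModulePass {
     public:
      // Static so the HLO verifier can call it without a pass instance.
      static bool InstructionCanChangeLayout(const HloInstruction* instruction);

     private:
      // Backend-provided override, consulted instead of the default
      // when the backend compiler passes one down.
      std::function<bool(const HloInstruction*)>
          instruction_can_change_layout_func_;
    };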
* [XLA] Merge the single-instruction parsing and the full-module parsing into one function. (A. Unique TensorFlower, 2018-10-02)
  PiperOrigin-RevId: 215501702
* [XLA] Support parsing the canonical format of HLO text. (A. Unique TensorFlower, 2018-10-02)
  Also stop truncating operands in the canonical format.
  PiperOrigin-RevId: 215466465
* [XLA] A test that disables layout assignment should only contain layout-consistent HLO instructions. (Bixia Zheng, 2018-10-02)
  Fix a dot test that disables the layout assignment pass so that it does not generate layout-inconsistent HLO instructions. This means only adding the dot result to an addend with the same layout, and disabling algebraic simplification, which may transform a dot into a multiplication with inconsistent layouts.
  PiperOrigin-RevId: 215463477
* Fixes for a few issues in HloModule::CreateFromProto() (A. Unique TensorFlower, 2018-10-02)
  PiperOrigin-RevId: 215460064
* [xla] Improve validation of Broadcast shape. (Keno Fischer, 2018-10-02)
  If one misreads the semantics of this instruction, it's easy to cause an out-of-bounds access into the dimensions here. Add an extra check to return a proper error to the user rather than crashing in that case. Ref #22130
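  A minimal sketch of the kind of bounds check added; the variable names are assumptions, and the real validation lives in XLA's shape inference:

    for (int64_t dim : broadcast_dimensions) {
      if (dim < 0 || dim >= output_rank) {
        // Return a proper error instead of indexing out of bounds below.
        return InvalidArgument(
            "Broadcast dimension %lld is out of range for rank-%lld shape",
            dim, output_rank);
      }
    }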
* Add proto serialization/deserialization testing to the HLO parser tests. (Mark Heffernan, 2018-10-02)
  Many of the HLO parser tests verify that the text form of an HLO module preserves all information when run through ToString and then parsed back. It makes sense to also use these tests to exercise proto serialization/deserialization, which is done by adding additional instantiations of the parameterized parsing tests. This caught several bugs, which are fixed in this CL:
  (1) Domain instructions were not being serialized properly.
  (2) Host send/recv instructions did not preserve the is_host_transfer bit.
  (3) Sparse literals could not be serialized or deserialized.
  PiperOrigin-RevId: 215445200
* [XLA] Replace the last FlatMap in XLA with a simple array. (Benjamin Kramer, 2018-10-02)
  A hash map for 18 pointers is just a waste of space.
  PiperOrigin-RevId: 215428176
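  A hypothetical illustration of the trade-off (types invented): for a small, dense, integer-keyed domain, a fixed-size array is both smaller and faster than a hash map:

    // Before: hash table machinery for 18 entries.
    //   gtl::FlatMap<int, HloInstruction*> cache_;
    // After: direct indexing, zero hashing, no separate heap allocation.
    std::array<HloInstruction*, 18> cache_ = {};  // index doubles as the key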
* [XLA] Fix some outdated comments referring to FlatMap. (Benjamin Kramer, 2018-10-02)
  Also convert unordered_map to flat/node_hash_map where the comments allow.
  PiperOrigin-RevId: 215410566
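  The flat/node choice matters because the two Abseil maps differ in pointer stability, which is what such comments typically promise; illustrative types only:

    // absl::flat_hash_map stores elements inline: rehashing moves them,
    // invalidating pointers and references into the map.
    absl::flat_hash_map<std::string, int> dense_and_fast;
    // absl::node_hash_map heap-allocates each element: pointers stay
    // valid across rehashes, matching std::unordered_map's guarantee.
    absl::node_hash_map<std::string, int> pointer_stable;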
* Make StatelessRandomOpsTest.testRandomNormalIsFinite actually test stateless_random_normal. (Peter Hawkins, 2018-10-02)
  Fixes #22611
  PiperOrigin-RevId: 215385610
* Add a hint parameter to TransferLiteralToDeviceAsync that the implementation can use to accelerate transfers. (A. Unique TensorFlower, 2018-10-02)
  PiperOrigin-RevId: 215362667
* Fix layout assignment for cross-module all-reduce. (A. Unique TensorFlower, 2018-10-02)
  Previously, the participating HLOs could end up with different layouts, which made lowering impossible. This change enforces a consistent layout between the communicating nodes, the same way it is done for send/recv pairs.
  PiperOrigin-RevId: 215359420
* Merge pull request #21958 from MattConley:CudaOccupancy (TensorFlower Gardener, 2018-10-01)
  PiperOrigin-RevId: 215331087
* [XLA] Migrate from gtl::FlatSet to absl::flat_hash_set. (Benjamin Kramer, 2018-10-01)
  PiperOrigin-RevId: 215324035
* [XLA] Add kAllToAll and kCollectivePermute to the EffectiveOperandPrecisionIsOutputPrecision list. (A. Unique TensorFlower, 2018-10-01)
  PiperOrigin-RevId: 215311766
* [TF/XLA] Optimize `Encapsulator::GetFunctionNameAttr()`. (Derek Murray, 2018-10-01)
  The previous version was hitting a very slow path in `GetNodeAttr()`, which is expensive when the named attr is not found. This change inlines the logic of finding the two relevant attrs inside `GetFunctionNameAttr()` and avoids constructing a status object with a serialized `NodeDef` when the attr can't be found.
  PiperOrigin-RevId: 215298411
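  A minimal sketch of the cheaper lookup, assuming TF's AttrSlice API (AttrSlice::Find returns nullptr on a miss); the attr names here are placeholders, not the two attrs the commit actually inlines:

    // Unlike GetNodeAttr, a miss here costs a null pointer, not a Status
    // carrying a serialized NodeDef.
    const tensorflow::AttrValue* attr = node->attrs().Find("first_attr_name");
    if (attr == nullptr) attr = node->attrs().Find("second_attr_name");
    if (attr != nullptr && attr->has_func()) {
      const std::string& function_name = attr->func().name();
      // ...
    }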
* Improve error message in transpose shape inference. (Tayo Oguntebi, 2018-10-01)
  PiperOrigin-RevId: 215294817
* Clean up the build_xla_ops pass to use the generated C++ TF op wrappers. (Sanjoy Das, 2018-10-01)
  This cleanup will make the future CL implementing lazy compilation simpler. Includes some supporting changes:
  - Teach NewInternalScope to create a scope that doesn't do shape inference. We need this because we don't have a ShapeRefiner that has been run over the entire graph available in the build_xla_ops pass.
  - Add a WithAssignedDevice modifier to tensorflow::Scope.
  - Make cc_op_gen write out an Operation field for nodes which may not necessarily have any outputs. We already did this in most cases, but we weren't doing it for nodes that have possibly-empty list outputs.
  - Minor change renaming ops/xla_jit_op.cc to ops/xla_jit_ops.cc, now that we have more than one XLA JIT op.
  PiperOrigin-RevId: 215293817
* [XLA] Migrate from gtl::FlatMap to absl::flat_hash_map. (Benjamin Kramer, 2018-10-01)
  PiperOrigin-RevId: 215272497
* Allow zero inputs in the XRT execute operation. (A. Unique TensorFlower, 2018-10-01)
  PiperOrigin-RevId: 215252408
* Bugfix: When a subgraph is encapsulated and replaced by an XlaLaunch op, the requested device placement of the XlaLaunch op must be derived from the subgraph. (A. Unique TensorFlower, 2018-10-01)
  PiperOrigin-RevId: 215239672
* Name fusion parameters simply "param_X". (Mark Heffernan, 2018-10-01)
  Here "X" is the parameter number. Previously, fusion parameter names included the name of the original instruction that produced the value, which was confusing.
  PiperOrigin-RevId: 215238171
* [TF:XLA] Teach deadness analysis more of the distributive property. (A. Unique TensorFlower, 2018-10-01)
  PiperOrigin-RevId: 215183847
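  In predicate terms, the distributive property at play is the standard Boolean identity

    (a & p) | (b & p)  ==  (a | b) & p

  so two paths guarded by a common predicate can be recognized as one, instead of being treated as an opaque disjunction.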
* [HloOrdering] Make parameters always defined before other instructions. (Yunxing Dai, 2018-09-30)
  - Make parameters always defined before other instructions.
  - Add extra indentation to the predecessor field in the ToString() method to make it clearer.
  PiperOrigin-RevId: 215162840
* Handle noinline gradient function in control flow functionalization. (Tong Shen, 2018-09-28)
  PiperOrigin-RevId: 215003704
* [TF:XLA] Add comment explaining why there is no PrimitiveTypeToDataType function. (Peter Hawkins, 2018-09-28)
  PiperOrigin-RevId: 214945748
* [XLA] Use a result cache to speed up InstructionFusion::CanFuseOnAllPaths(). (Yuanzhong Xu, 2018-09-27)
  PiperOrigin-RevId: 214848216
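  A minimal sketch of the memoization pattern (names assumed; not the actual InstructionFusion code):

    using FusionKey = std::pair<const HloInstruction*, const HloInstruction*>;
    absl::flat_hash_map<FusionKey, bool> can_fuse_on_all_paths_cache_;

    bool CachedCanFuseOnAllPaths(const HloInstruction* producer,
                                 const HloInstruction* consumer) {
      auto it = can_fuse_on_all_paths_cache_.find({producer, consumer});
      if (it != can_fuse_on_all_paths_cache_.end()) return it->second;
      // The uncached query walks every producer->consumer path, which is
      // what makes repeated calls expensive.
      bool result = CanFuseOnAllPaths(producer, consumer);
      can_fuse_on_all_paths_cache_.emplace(FusionKey{producer, consumer}, result);
      return result;
    }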
* Don't use tensorflow::Edge after freeing it. (Sanjoy Das, 2018-09-27)
  Even with this bug we were accidentally doing the right thing (so the test case doesn't actually fail without the fix): deleting an Edge sets its input and output indices to kControlSlot-1, so we'd normally expect a failure when there is a control edge out of the TF cluster, because such a control edge would be recognized as a data edge. But AddEdge(x, -1, y, -1) happens to do the right thing for both control and data edges.
  PiperOrigin-RevId: 214831204
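  A minimal sketch of the fix pattern, using TF's Edge/Graph accessors (the surrounding pass logic is assumed): copy everything out of the Edge before RemoveEdge() frees it.

    const tensorflow::Edge* e = /* edge leaving the cluster */;
    tensorflow::Node* src = e->src();
    int src_output = e->src_output();    // kControlSlot for control edges
    tensorflow::Node* dst = e->dst();
    int dst_input = e->dst_input();
    graph->RemoveEdge(e);                // e must not be touched after this
    graph->AddEdge(src, src_output, dst, dst_input);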
* Updating the V2 variables API. (Alexandre Passos, 2018-09-27)
  PiperOrigin-RevId: 214824023
* Add opaque field to custom call. (Mark Heffernan, 2018-09-27)
  The intent of this field is to enable more information to be encoded in the custom call and passed through to the backend.
  PiperOrigin-RevId: 214800539
* [XLA] Allow the stream used for host-to-device transfers to be specified separately from the compute stream in ServiceRunOptions. (A. Unique TensorFlower, 2018-09-27)
  PiperOrigin-RevId: 214778267
* Fixes bug in tf2xla NMS implementation. (Tayo Oguntebi, 2018-09-26)
  PiperOrigin-RevId: 214711381
* Skip SymbolicGradientOp when doing constant folding in control flow functionalization. (Tong Shen, 2018-09-26)
  If we want to evaluate a SymbolicGradient op in constant folding, we need to construct a Device object and attach it to the FunctionLibraryRuntime. In the graph rewriting pass, we do not have a Device object created yet; it will only be created in XlaCompiler.
  PiperOrigin-RevId: 214702943
* Add xlogy and xdivy ops. (A. Unique TensorFlower, 2018-09-26)
  PiperOrigin-RevId: 214700693
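  Both ops are defined to return 0 when x is 0, sidestepping the 0 * log(0) and 0/0 indeterminate forms:

    xlogy(x, y) = 0 if x == 0, otherwise x * log(y)
    xdivy(x, y) = 0 if x == 0, otherwise x / y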
* [TF:XLA] Fix XLA lowering of TF BroadcastTo operator. (Peter Hawkins, 2018-09-26)
  PiperOrigin-RevId: 214675055
* [XLA] Remove use of DeconstructTuple from MakeFakeArgumentsOrDie. (Peter Hawkins, 2018-09-26)
  DeconstructTuple doesn't support nested tuples yet, so MakeFakeArgumentsOrDie failed if any of the arguments were tuple-shaped. We don't really need it here anyway; just build the arguments one by one.
  PiperOrigin-RevId: 214671374
* [TF] Add new internal ops _VarHandlesOp and _ReadVariablesOp. (Peter Hawkins, 2018-09-26)
  The purpose of these ops is to fix a latency problem observed in an inference benchmark. Often an inference step starts by reading the values of many (hundreds of) weights. For a resource variable, this requires a VarHandleOp and a ReadVariableOp per variable. Running hundreds of trivial ops can add hundreds of microseconds of latency to the critical path of an inference step: the inter-op latency of the executor can be hundreds of nanoseconds, which rapidly adds up.
  This change introduces two fused ops, _VarHandlesOp and _ReadVariablesOp, that allow us to read many variables in a pair of larger ops rather than many tiny ones.
  PiperOrigin-RevId: 214662338
* [XLA] Don't use NumUniqueInstructionIds() as a proxy for instruction_count(). (Michael Kuperstein, 2018-09-26)
  It used to be a reasonable proxy, but that's no longer the case, because GetUniqueId() in XlaBuilder uses a *global* (rather than a module-global) counter. Since HloModule::CreateFromProto no longer uniquifies ids coming in from protos, the potentially very high IDs coming from GetUniqueId() become the module's next_unique_id. There is another case of this in TuplePointsTo, which will be handled separately.
  PiperOrigin-RevId: 214614576