path: root/tensorflow/compiler/aot
Commit message | Author | Age
* Updating the V2 variables API. (Alexandre Passos, 2018-09-27)
  PiperOrigin-RevId: 214824023

* Automated rollback of commit da3357ecbdd6772413e8bbceeab8238971be11ce (Adrian Kuegel, 2018-09-20)
  PiperOrigin-RevId: 213771631

* Automated rollback of commit 31c0857f6b5d79f4a7b16ee4af85f0bde8b5f5da (A. Unique TensorFlower, 2018-09-20)
  PiperOrigin-RevId: 213764810
* Add AOT test case for XlaSort. (Adrian Kuegel, 2018-09-20)
  The only TensorFlow op that uses XlaSort is nn.top_k, so we add a test case using nn.top_k.
  PiperOrigin-RevId: 213763591
* Preserve unique ids when serializing/deserializing HLO protos. (Mark Heffernan, 2018-09-12)
  Re-assigning unique IDs broke serialization of HloSchedule, and keeping IDs stable improves the fidelity of the proto serialization. This change requires that instructions in HLO module protos have valid, module-scope-unique ids, so the XLA builder now hands out module-scope-unique ids. Previously, instruction ids were unique only within a computation's scope.
  PiperOrigin-RevId: 212692339
* Automated rollback of commit d6f107761459dfdf8773a148e11193a3512a51a6 (A. Unique TensorFlower, 2018-09-10)
  PiperOrigin-RevId: 212289067

* Automated rollback of commit 24787842adfefe35f5a520313d775b14c29f143a (A. Unique TensorFlower, 2018-09-06)
  PiperOrigin-RevId: 211895566
* [XLA] Make tensorflow/compiler use absl::{StrCat,string_view,InlinedVector} consistently (Benjamin Kramer, 2018-09-05)
  StringPiece is an alias for absl::string_view, and InlinedVector is aliased to absl::InlinedVector. StrCat is compatible, so swapping it out is safe.
  PiperOrigin-RevId: 211691840
* Set CUDA_VISIBLE_DEVICES='' in tfcompile and tfcompile tests' genrules. (Justin Lebar, 2018-09-05)
  This prevents these build-time rules from accessing any GPUs which might be present on the build machine and interfering with GPU tests which might be running concurrently.
  PiperOrigin-RevId: 211647681
* Change headers to directly include absl::Span, and clean up the build dependencies as well. (Tim Shen, 2018-08-30)
  PiperOrigin-RevId: 211038094

* [XLA] Rename all (Mutable)ArraySlice to absl::Span. (Tim Shen, 2018-08-30)
  PiperOrigin-RevId: 210998142
* Mark MatMulAndAddCompWithProfiling as manual. (Justin Lebar, 2018-08-27)
  PiperOrigin-RevId: 210495040

* Use absl functions instead of str_util within tf2xla. (Justin Lebar, 2018-08-23)
  PiperOrigin-RevId: 210040583

* Remove tf2xla's str_util in favor of absl. (Justin Lebar, 2018-08-23)
  PiperOrigin-RevId: 210038492
* [XLA] Use absl string types and functions instead of the TF versions. (Justin Lebar, 2018-08-23)
  Unfortunately this has to be one big patch, because e.g. absl::StrCat doesn't accept a TF StringPiece, but as soon as we switch to absl::string_view, we have to switch away from all of the TF functions.
  PiperOrigin-RevId: 209957896
* [XLA] gtl::optional -> absl::optional (Yunxing Dai, 2018-08-21)
  PiperOrigin-RevId: 209686671
* [XLA] Use absl::make_unique instead of xla::MakeUnique. (Justin Lebar, 2018-08-20)
  Same for WrapUnique.
  PiperOrigin-RevId: 209531124
* Remove XlaCompiledCpuFunction::args() (Sanjoy Das, 2018-08-10)
  This lets us remove XlaCompiledCpuFunction::args_ and some awkwardness from XlaCompiledCpuFunction::Run.
  PiperOrigin-RevId: 208309249
* Introduce and use a BufferInfo class. (Sanjoy Das, 2018-08-10)
  The BufferInfo represents information about buffer assignment in XlaCompiledCpuFunction. Arg sizes and temp sizes are now derived from BufferInfo instead of being discrete sources of information. Also made StaticData() private; tfcompile clients should not need to access it directly.
  PiperOrigin-RevId: 208283305
* [XLA:CPU] Migrate aot/runtime.{h,cc} to xla_compiled_cpu_function.{h,cc} (Sanjoy Das, 2018-08-03)
  As a follow-on cleanup for cl/206980796 ("Overhaul XLA:CPU's calling convention.") I want to introduce a BufferInfo class that encapsulates whether a buffer is a constant, an entry parameter, or a temp, without using the fragile "size < 0" scheme I have today. To do this efficiently I need a place to put the BufferInfo class that will be visible to MallocContiguousBuffers. Instead of creating (what seemed to me) an odd layering with BufferInfo in aot/runtime.h, I decided to pull the runtime into xla_compiled_cpu_function since that's the only user.
  PiperOrigin-RevId: 207333245
* Reformat tensorflow/compiler/aot/tfcompile.bzl; NFC (Sanjoy Das, 2018-08-01)
  And fix two lint issues.
  PiperOrigin-RevId: 207051473
* Reland "Overhaul XLA:CPU's calling convention." (Sanjoy Das, 2018-08-01)
  aligned_buffer_bytes in compiler/aot/runtime.cc was checking sizes[i] == -1 (as opposed to checking sizes[i] < 0) to decide whether sizes[i] should count towards the total size.

  Original CL description:

  Overhaul XLA:CPU's calling convention. This CL introduces a clean separation between calls to "thread local" and "global" computations in XLA:CPU.

  Global computations are:
  - kWhile body and condition computations
  - kConditional true and false computations
  - kCall callees

  Parameter and result buffers for these calls are assigned a static BufferAllocation::Slice by buffer assignment, so they don't require pointers to result buffers and parameters to be explicitly passed in. In fact, passing in result and parameter buffers is actively misleading, because in cases like:

    while_condition {
      val = (s32[], pred[]) infeed()
      ROOT result = get-tuple-element(val), index=0
    }

  there is no instruction explicitly copying the result of the computation into the result buffer. Instead, it is up to the caller to pick up the correct result buffer by asking buffer assignment (which would be the buffer where infeed wrote its second tuple component).

  Thread local computations are all the other nested computations except fusion, e.g. computations used by kMap and kReduce. Parameter and result buffers for these calls are assigned a "thread local" BufferAllocation::Slice, which in XLA:CPU is mapped to allocas. Since these are not static addresses, we *do* need to pass in parameter and result buffers. The output is written to the result buffer by "allocating" the storage for the root into the result buffer passed in by the caller.

  There are two cleanup items that I kept off this CL to make reviews easier:
  - We should rename "temps" to something more generic, like "buffer_table". I'll do that in a followup CL.
  - We should use GatherComputationsByAllocationType from buffer_assignment.cc to CHECK that we use thread local calls for thread local callees and global calls for global callees.

  PiperOrigin-RevId: 206980796
* Automated rollback of commit fba2d773f45f10882aa475ac75cbf9884995d626 (Sanjoy Das, 2018-07-31)
  PiperOrigin-RevId: 206855848
* Overhaul XLA:CPU's calling convention. (Sanjoy Das, 2018-07-31)
  This CL introduces a clean separation between calls to "thread local" and "global" computations in XLA:CPU.

  Global computations are:
  - kWhile body and condition computations
  - kConditional true and false computations
  - kCall callees

  Parameter and result buffers for these calls are assigned a static BufferAllocation::Slice by buffer assignment, so they don't require pointers to result buffers and parameters to be explicitly passed in. In fact, passing in result and parameter buffers is actively misleading, because in cases like:

    while_condition {
      val = (s32[], pred[]) infeed()
      ROOT result = get-tuple-element(val), index=0
    }

  there is no instruction explicitly copying the result of the computation into the result buffer. Instead, it is up to the caller to pick up the correct result buffer by asking buffer assignment (which would be the buffer where infeed wrote its second tuple component).

  Thread local computations are all the other nested computations except fusion, e.g. computations used by kMap and kReduce. Parameter and result buffers for these calls are assigned a "thread local" BufferAllocation::Slice, which in XLA:CPU is mapped to allocas. Since these are not static addresses, we *do* need to pass in parameter and result buffers. The output is written to the result buffer by "allocating" the storage for the root into the result buffer passed in by the caller.

  There are two cleanup items that I kept off this CL to make reviews easier:
  - We should rename "temps" to something more generic, like "buffer_table". I'll do that in a followup CL.
  - We should use GatherComputationsByAllocationType from buffer_assignment.cc to CHECK that we use thread local calls for thread local callees and global calls for global callees.

  PiperOrigin-RevId: 206843794
* Move xla_computation.* from xla/client/xla_client up to xla/client. (Mark Heffernan, 2018-07-25)
  Plan is to move everything in xla/client/xla_client up to xla/client and remove the directory. No functional change.
  PiperOrigin-RevId: 206055680
* Fix OOB check for result_index in header generation (Jongmin Baek, 2018-06-20)

* Merge commit for internal changes (Gunhan Gulsoy, 2018-06-03)
* Add an explanatory comment. (Sanjoy Das, 2018-06-01)
  PiperOrigin-RevId: 198949796

* fix typo (ManHyuk, 2018-05-27)

* Merge commit for internal changes (Ankur Taly, 2018-05-22)

* Update calls to addPassesToEmitFile (A. Unique TensorFlower, 2018-05-22)
  PiperOrigin-RevId: 197564506
* Fix alignment crashes in AVX512 builds (#19121) (Mark Ryan, 2018-05-17)

  * Fix issue #15588 by simplifying the code
    The allocator.h code tried to be clever and use 32 byte alignment for SSE/AVX2/etc. use, and 64 byte alignment for AVX512. Unfortunately, the #ifdef in use (from Eigen) is not useful; the bazel BUILD files do not propagate the tf_copts() compiler flags when the allocator.cc/allocator.h files get compiled, so Eigen does not see the actual AVX512 compiler flags. Rather than changing compiler flag propagation throughout a whole bunch of code, there's an opportunity to just simplify the code and always use 64 byte alignment. Yes, it wastes a bit of space, but on the other hand these allocations are now cache line aligned, which isn't a bad thing, and an ifdef can be dropped.
    Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

  * Set EIGEN_MAX_ALIGN_BYTES=64
    This patch sets a 64 byte upper bound on the alignment of memory allocated by Eigen. This is necessary to prevent crashes during the execution of the unit tests when they are compiled with AVX512 support.
    Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>

  * Update the tensorflow/compiler/aot tests for 64 byte alignment
    Modifications to tensorflow/core/framework/allocator.h to always use 64 byte alignment cause failures in the tensorflow/compiler/aot unit tests. This patch updates these tests so that they pass with 64 byte aligned allocated memory.
    Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>

  * Update Tensor.Slice_Basic for 64 byte alignment
    The test case //tensorflow/core:framework_tensor_test:Tensor.Slice_Basic fails with EIGEN_MAX_ALIGN_BYTES set to 64. The reason is that the slices it takes of the sample tensor are 32 byte and not 64 byte aligned. This commit increases one of the dimensions of the original tensor to ensure that the slices taken by the test cases are indeed 64 byte aligned.
    Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>

  * Update ScopedAllocatorConcatOpTest.Reshape for 64 byte alignment
    The ScopedAllocatorConcatOpTest.Reshape test requires that the elements of the field_shapes parameter of ExecOp are multiples of Allocator::kAllocatorAlignment in size. If they are not, the backing tensor allocated by PrepOp will have too many elements and reshaping will fail. This commit modifies the test case, making the elements 64 bytes in size, the new value for Allocator::kAllocatorAlignment.
    Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
* Automated g4 rollback of changelist 196691101 (Peter Hawkins, 2018-05-16)
  PiperOrigin-RevId: 196879933

* Remove unused BUILD dependencies (A. Unique TensorFlower, 2018-05-15)
  PiperOrigin-RevId: 196742598

* Automated g4 rollback of changelist 196683444 (Peter Hawkins, 2018-05-15)
  PiperOrigin-RevId: 196691101
* [TF:XLA] Generalize existing support for keeping variables on an XLA device in reshaped form, instead allowing XLA devices to keep all tensors in a reshaped form outside an XLA computation. (Peter Hawkins, 2018-05-15)
  PiperOrigin-RevId: 196683444
* [XLA] Ergonomic improvements to --xla_hlo_profile. (Justin Lebar, 2018-05-14)
  - Don't display ops with 0 optimal seconds and 0 actual cycles. These are ops that were expected to be free and were actually free.
  - Fix HloCostAnalysis to mark parameters, constants, and get-tuple-element as expected-to-be-free per the definition above.
  - Allow optimal-seconds < 0 to indicate "I don't know". Use this for custom calls, and then hide such ops from the "seconds above the optimum" table.
  - Don't display "<none>" and "<unknown>"; instead, just display the empty string. Less visual noise.
  - Instead of showing ~5 ops per category in the categories tables, show everything. This isn't so noisy now that we're hiding "free" ops, and it makes finding optimization opportunities much easier.
  PiperOrigin-RevId: 196564177
* Add test with tf.cond. (Jacques Pienaar, 2018-05-07)
  PiperOrigin-RevId: 195745718

* [XLA] Redesign: migrate tensorflow/compiler/tf2xla, tensorflow/compiler/aot: (A. Unique TensorFlower, 2018-04-30)
  - xla::ComputationBuilder -> xla::XlaBuilder
  - xla::ComputationDataHandle -> xla::XlaOp
  - xla::Computation -> xla::XlaComputation
  - xla::CompileOnlyClient::AotComputationInstance -> xla::CompileOnlyClient::AotXlaComputationInstance
  - xla::SessionModule -> xla::HloSnapshot
  PiperOrigin-RevId: 194874462
* HLO profiling for tfcompile. (Sanjoy Das, 2018-04-27)
  This CL extends the --xla_hlo_profile knob to tfcompile. tf_library rules can now set enable_xla_hlo_profiling to True to:
  - Have the generated code update per-HLO profile counters as it executes.
  - Have tfcompile generate and serialize an instance of HloProfilePrinterData with a compiled model, which can be used to pretty-print the collected profile counters.
  PiperOrigin-RevId: 194627272
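Per the commit, a tf_library rule opts into HLO profiling via the enable_xla_hlo_profiling attribute. A minimal sketch of such a rule; the target, graph, config, and class names are hypothetical:

```python
# BUILD file sketch (hypothetical target); only enable_xla_hlo_profiling
# is taken from the commit message above.
tf_library(
    name = "my_model",
    config = "my_model.config.pbtxt",
    cpp_class = "MyModel",
    graph = "my_model.pb",
    enable_xla_hlo_profiling = True,
)
```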
* Merge changes from github. (Yifei Feng, 2018-04-23)
  PiperOrigin-RevId: 194031845

* Use tensorflow::se instead of perftools::gputools for StreamExecutor. (Justin Lebar, 2018-04-23)
  PiperOrigin-RevId: 194010749

* Internal Change (A. Unique TensorFlower, 2018-04-23)
  PiperOrigin-RevId: 193929733

* Update for ObjectMemoryBuffer.h rename in upstream LLVM. (Benjamin Kramer, 2018-04-16)
  This will require a version bump in workspace.bzl.
  PiperOrigin-RevId: 193052084

* Automated g4 rollback of changelist 192768744 (A. Unique TensorFlower, 2018-04-13)
  PiperOrigin-RevId: 192770717

* Split byte_order.h off cpu_info.h (A. Unique TensorFlower, 2018-04-13)
  PiperOrigin-RevId: 192768744
* Move dummy AssertOp and CheckNumericsOp to //third_party/tensorflow/compiler/tf2xla/kernels. (A. Unique TensorFlower, 2018-04-12)
  Enable type DT_STRING for AssertOp and ConstOp, in order to make a dummy Assert compile with a const string (the assert message) as its input.
  PiperOrigin-RevId: 192695938
* Remove all_opensource_files. It's not needed any more. (Martin Wicke, 2018-03-28)
  PiperOrigin-RevId: 190878279

* Replaced calls to deprecated tensorflow::StringPiece methods with their tensorflow::str_util equivalents. (A. Unique TensorFlower, 2018-03-27)
  This will allow the deprecated methods to be removed.
  PiperOrigin-RevId: 190650553

* Update file due to changes in Bazel (PACKAGE_NAME is deprecated) (A. Unique TensorFlower, 2018-03-22)
  PiperOrigin-RevId: 190051589