path: root/third_party/eigen.BUILD
* eigen: Add install_eigen_headers target for installing to system (#20281) [Jason Zaman, 2018-06-28]
  Eigen provides files that are both GPL and MPL. TensorFlow uses only the MPL headers. This target collects all the headers into genfiles so they can be easily installed to /usr/include/ later. Thanks to dennisjenkins@google.com for all the help testing and figuring out what was missing, and to pcloudy@google.com for pointers to the solution. Signed-off-by: Jason Zaman <jason@perfinion.com>
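  A minimal Starlark sketch of what such a header-collecting target can look like. This is an illustration, not the actual eigen.BUILD contents: the real change copies headers into genfiles, and the glob/exclude patterns below are assumptions standing in for the true MPL/GPL split.

      # Sketch only: collect Eigen headers into one target that a packaging
      # step can later install under /usr/include/. The exclude pattern is a
      # hypothetical placeholder for Eigen's GPL-licensed pieces, which must
      # not ship when only MPL-covered headers are wanted.
      filegroup(
          name = "install_eigen_headers",
          srcs = glob(
              include = [
                  "Eigen/**",
                  "unsupported/Eigen/**",
              ],
              exclude = [
                  "**/NonMPL2/**",  # hypothetical GPL-licensed subtrees
              ],
          ),
          visibility = ["//visibility:public"],
      )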
* Fix alignment crashes in AVX512 builds (#19121) [Mark Ryan, 2018-05-17]
  * Fix issue #15588 by simplifying the code. The allocator.h code tried to be clever and use 32-byte alignment for SSE/AVX2/etc. use and 64-byte alignment for AVX512. Unfortunately, the #ifdef in use (from Eigen) is not useful: the bazel BUILD files do not propagate the tf_copts() compiler flags when the allocator.cc/allocator.h files get compiled, so Eigen does not see the actual AVX512 compiler flags. Rather than changing compiler-flag propagation throughout a whole bunch of code, there's an opportunity to just simplify the code and always use 64-byte alignment. Yes, it wastes a bit of space, but on the other hand these allocations are now cache-line aligned, which isn't a bad thing, and an #ifdef can be dropped. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
  * Set EIGEN_MAX_ALIGN_BYTES=64. This patch sets a 64-byte upper bound on the alignment of memory allocated by Eigen. This is necessary to prevent crashes during the execution of the unit tests when they are compiled with AVX512 support. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
  * Update the tensorflow/compiler/aot tests for 64-byte alignment. Modifying tensorflow/core/framework/allocator.h to always use 64-byte alignment causes failures in the tensorflow/compiler/aot unit tests. This patch updates these tests so that they pass with 64-byte-aligned allocated memory. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
  * Update Tensor.Slice_Basic for 64-byte alignment. The test case //tensorflow/core:framework_tensor_test:Tensor.Slice_Basic fails with EIGEN_MAX_ALIGN_BYTES set to 64 because the slices it takes of the sample tensor are 32-byte, not 64-byte, aligned. This commit increases one of the dimensions of the original tensor to ensure that the slices taken by the test cases are indeed 64-byte aligned (illustrated in the sketch following this commit). Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
  * Update ScopedAllocatorConcatOpTest.Reshape for 64-byte alignment. The ScopedAllocatorConcatOpTest.Reshape test requires that the elements of the field_shapes parameter of ExecOp be multiples of Allocator::kAllocatorAlignment in size. If they are not, the backing tensor allocated by PrepOp will have too many elements and reshaping will fail. This commit modifies the test case, making the elements 64 bytes in size, matching the new value of Allocator::kAllocatorAlignment. Signed-off-by: Mark Ryan <mark.d.ryan@intel.com>
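  The Tensor.Slice_Basic fix is easiest to see with a small NumPy sketch (NumPy stands in for TensorFlow's allocator here; aligned_zeros is a hypothetical helper, not TensorFlow code): a slice of a 64-byte-aligned buffer is itself 64-byte aligned only when its byte offset into the buffer is a multiple of 64, which is why the test tensor's dimensions had to grow once the alignment bound moved to 64.

      import numpy as np

      # Hypothetical helper: carve a 64-byte-aligned float32 view out of an
      # over-allocated byte buffer, mimicking an allocator whose
      # kAllocatorAlignment is 64.
      def aligned_zeros(num_floats, boundary=64):
          raw = np.zeros(num_floats * 4 + boundary, dtype=np.uint8)
          offset = (-raw.ctypes.data) % boundary
          return raw[offset:offset + num_floats * 4].view(np.float32)

      base = aligned_zeros(1024)
      print(base.ctypes.data % 64)       # 0: the base tensor is 64-byte aligned
      print(base[8:].ctypes.data % 64)   # 32: an 8-float slice is only 32-byte aligned
      print(base[16:].ctypes.data % 64)  # 0: a 16-float (64-byte) slice stays aligned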
* Branch 174861804 (#14326) [Martin Wicke, 2017-11-07]
  * Add ImportGraphDefTest.testMultipleImport to importer_test.py. This tests the name-deduping behavior of import_graph_def. This behavior is actually defined by the op creation logic, not import_graph_def, but I added a test here since the C++ ImportGraphDef function must emulate it (and presumably we'd like to maintain the import_graph_def behavior moving forward). PiperOrigin-RevId: 174536014
  * Apply lib_internal defines to both lib_internal and lib_internal_impl. Should fix checkpoint reading with snappy compression. Will follow up with testing for this sort of checkpoint issue. PiperOrigin-RevId: 174538693
  * n/a (internal change only). PiperOrigin-RevId: 174539513
  * A few changes to ApiDef generation: create a separate api_def_*.pbtxt file for each op; add attribute and argument descriptions to ApiDef; apply overrides based on the op_gen_overrides.pbtxt file. PiperOrigin-RevId: 174540421
  * Add uniquify_names option to ImportGraphDef. This option allows ImportGraphDef to mimic the behavior of the Python import_graph_def function, which automatically creates unique node names instead of raising an exception (this is due to the Python op construction logic, not import_graph_def directly). This change is a step toward switching import_graph_def to use the C API version (a sketch of the deduping behavior follows this list). PiperOrigin-RevId: 174541334
  * Fix bad_color param on tf.contrib.summary.image. PiperOrigin-RevId: 174549117
  * HLO parser: support control-predecessors. Also changed from printing control-successors to printing control-predecessors, because predecessors are defined before use, and surrounded the predecessors with {}. PiperOrigin-RevId: 174552224
  * Support pad node. PiperOrigin-RevId: 174581035
  * Add tf.contrib.framework.sort, wrapping tf.nn.top_k (#288). Comparable to np.sort, but np.sort's "kind" parameter is not implemented (there is only one sort algorithm) and "order" is not applicable (tensors do not have fields); a sketch of the top_k mechanism follows this list. PiperOrigin-RevId: 174588000
  * [TF2XLA] Don't change output port for control dependency in CopySubgraph. If the output is being squashed then we want control output 0, except where the input is a control dependency. PiperOrigin-RevId: 174633829
  * Use latest nsync; allows running bazel after having downloaded for the "make" build. The downloads directory for the make build is within the source tree seen by bazel, which means that BUILD files (by whatever name) within those downloaded trees must all be valid in their new location, or not be recognized by bazel as BUILD files. The new version of nsync handles that, and this change pulls in that new version. PiperOrigin-RevId: 174652898
  * Add profiling support to Service::ExecuteParallel. PiperOrigin-RevId: 174682772
  * Replicate `Estimator.model_fn` across available GPUs:
        def replicate_model_fn(model_fn, optimizer_fn, devices=None):
          """Replicate `Estimator.model_fn` over GPUs. ..."""
    I tested that it seems to give the right result on cnn_mnist.py on 1 CPU, 1 real GPU, and 4 allow_soft_placement=True GPUs. Some measurements on CNN MNIST across steps 19300-20000 (global_step/sec):
      1) no replicate_model_fn call: 156.254, 155.074, 155.74, 153.636, 157.218, 159.644
      2) replicate across one hardware GPU: 158.171, 165.618, 162.773, 159.204, 162.289, 167.173
      3) replicate across 4 software GPUs on one hardware GPU (soft placement): 75.47, 76.16, 75.18
    Loss numbers didn't change across the three configurations. PiperOrigin-RevId: 174704385
  * Enables wrapping input pipeline into tf.while_loop for all users. PiperOrigin-RevId: 174708213
  * SerializeIterator: do not unref the resource until we're finished using it. This change avoids a potential use-after-free error if the resource is concurrently serialized and destroyed (e.g. by a DestroyResourceOp or Session::Reset()). PiperOrigin-RevId: 174713115
  * Improve error message when a function is already defined with the same name and a different hash string. PiperOrigin-RevId: 174715563
  * Fix generate_examples build: add -march=native to host_copts and host_cxxopts in configure.py; add a string.h for abstracting string differences at the core interpreter level; use tensorflow's special arg parse instead of flags; switch to using tool instead of data for the dependency; fix python3 compatibility (use six.StringIO instead of StringIO.StringIO, use print_function, properly set binary flags on the TempFiles used in toco_convert); misc other path fixes. PiperOrigin-RevId: 174717673
  * Add input-format-agnostic way to parse HLOs. PiperOrigin-RevId: 174719153
  * Remove misleading comment from Eigen build file. PiperOrigin-RevId: 174719222
  * Basic plumbing for calling the C API from import_graph_def(). PiperOrigin-RevId: 174724070
  * Fix a memory leak detected when running a heap checker in our tests. PiperOrigin-RevId: 174726228
  * [tpu:profiler] Support the Input Pipeline Analyzer tool in the TPU profiler (WIP): move input-pipeline-analyzer-related protos for gRPC between red and green VMs; rename perftools.gputools.profiler.collector::TfStatsHelperResult to tensorflow::tpu::TfOpStats. PiperOrigin-RevId: 174730411
  * Clean up some reference cycles in eager mode. ResourceVariables enter graph mode to get a handle. We should probably revisit that, but in the meantime we can break the resulting reference cycles. PiperOrigin-RevId: 174732964
  * Improved encoding of shapes in grappler. PiperOrigin-RevId: 174733491
  * [tf.data] Remove unused members from IteratorContext. PiperOrigin-RevId: 174734277
  * Refactor helper functions a bit for virtual gpu changes later. PiperOrigin-RevId: 174735029
  * Fix invalid flush_secs argument. PiperOrigin-RevId: 174745329
  * Replace the implementation of tf.flags with absl.flags. The previous tf.flags implementation was based on argparse; it contains -h/--help flags, which display all flags. absl.app's --help flag only displays flags defined in the main module, and there is a --helpfull flag that displays all flags, so --helpshort and --helpfull flags were added. app.run now raises SystemError on unknown flags (fixes #11195). Accessing flags before flags are parsed now raises an UnparsedFlagAccessError, instead of causing implicit flag parsing as before. PiperOrigin-RevId: 174747028
  * Fold Transpose into Matmul and SparseMatmul; fold ConjugateTranspose into BatchMatmul. PiperOrigin-RevId: 174750173
  * BUGFIX: special_math.ndtri didn't work with dynamic shapes. This was due to use of constant_op.constant(..., shape=p.shape), where sometimes p was a Tensor of unknown shape. PiperOrigin-RevId: 174764744
  * Create a routine that can collapse a subgraph into a fused op. PiperOrigin-RevId: 174765540
  * Force CUDA runtime initialization only when device count is larger than 0. PiperOrigin-RevId: 174767565
  * Remove use of xrange, which is not python3 compatible. PiperOrigin-RevId: 174768741
  * More thoroughly disable the should_use_result decorator when executing eagerly; it was creating reference cycles. Adds a test that TensorArrays create no reference cycles in eager mode. PiperOrigin-RevId: 174768765
  * Fix device querying in Keras backend. PiperOrigin-RevId: 174769308
  * Fix race bug in AdaptiveSharedBatchScheduler. In ASBSQueue::Schedule, when a new batch was created, it was added to the scheduler outside of the queue's lock. This was done to prevent any unforeseen interactions between the queue lock and the scheduler lock, but it wasn't being done in a thread-safe way. PiperOrigin-RevId: 174769383
  * Supports multi-dimensional logits and labels in multi-class head. PiperOrigin-RevId: 174770444
  * Refactor eager benchmarks to subclass Benchmark. PiperOrigin-RevId: 174770787
  * Add `parallel_interleave` to tf/contrib/data/__init__.py so that it is directly addressable from tf.contrib.data. PiperOrigin-RevId: 174771870
  * Fix DepthToSpaceGrad and SpaceToDepthGrad on data_format NCHW. This fixes #14243. PiperOrigin-RevId: 174772870
  * Allow for an old_row_vocab_size, in case a subset of the old_row_vocab_file was used during checkpoint creation (as is allowed in FeatureColumn._VocabularyListCategoricalColumn). PiperOrigin-RevId: 174781749
  * Go: Update generated wrapper functions for TensorFlow ops. PiperOrigin-RevId: 174781987
  * [BufferAssignment] Sort an allocation's "Assigned" objects before converting to a proto. This makes the buffer assignment's proto dump deterministic. RELNOTES: BufferAssignment's protocol buffer dump is now deterministic. PiperOrigin-RevId: 174783549
  * [TF TensorArray] Allow reading from an unwritten index if a fully defined element_shape is given. This allows one to write to only some indices of a TensorArray before calling stack; elements that were not written are treated as all-zero tensors. PiperOrigin-RevId: 174783569
  * Remove binary dependency from optimize_for_inference_lib. PiperOrigin-RevId: 174787363
  * Update ops-related pbtxt files. PiperOrigin-RevId: 174787397
  * Automated g4 rollback of changelist 174523638. PiperOrigin-RevId: 174788331
  * Skip non-existent fetch nodes. PiperOrigin-RevId: 174795864
  * Automated g4 rollback of changelist 174735029. PiperOrigin-RevId: 174796480
  * Add InceptionResNetV2 to tf.keras and update the applications module to match Keras 2.0.9. PiperOrigin-RevId: 174796893
  * Fix for LLVM API changes for fast math (https://reviews.llvm.org/rL317488). PiperOrigin-RevId: 174799735
  * [TF:XLA] Add two disabled tests with while ops that permute tuple elements. These tests permute the tuple elements of a 3-tuple in each iteration in a cyclic manner (132), i.e. a shift to the left. The first test just returns the result tuple; the second returns the sum of all tuple elements (which is expected to be the constant 6, no matter the permutation). Both tests are disabled for now because they fail on all back-ends. PiperOrigin-RevId: 174806092
  * Refactor function Optimize. PiperOrigin-RevId: 174813300
  * Add a unit test for gradient computation with layout optimizer. PiperOrigin-RevId: 174814136
  * Previously, if ComputeConstant saw a parameter it failed to proceed. After this change we can specify a list of parameters to it, and if we specify enough of them it will do the computation. The primary goal of this change is to make the HloEvaluator usable with ComputationBuilder from tests through ComputeConstant in cases where the input is a parameter (fed by a literal). PiperOrigin-RevId: 174845108
  * Use nesting to reduce the number of modules listed in the API TOC. PiperOrigin-RevId: 174846842
  * Added CPU matrix exponential op to TensorFlow. Uses Eigen's unsupported implementation. PiperOrigin-RevId: 174858966
  * variables_to_restore: Differentiate python variables by string name rather than object. variables_to_restore ensured that duplicate variables weren't added to the return map by comparing python variable objects. Normally there is only one Variable object for each underlying variable, so this wasn't a problem; but when one initializes a graph by importing a GraphDef, duplicate python Variable objects are created for each occurrence of a variable in a collection (say, global variables and moving-average variables). This change fixes variables_to_restore to work with an imported graph def by not comparing Variable objects. PiperOrigin-RevId: 174861804
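  For the uniquify_names item above, a small TF 1.x-era Python sketch of the name-deduping behavior being emulated; the printed names are illustrative of the expected deduping, not captured output:

      import tensorflow as tf  # TF 1.x API, contemporary with this merge

      with tf.Graph().as_default():
          tf.constant(1.0, name="a")
          gdef = tf.get_default_graph().as_graph_def()

      with tf.Graph().as_default() as g:
          # Importing the same GraphDef twice: rather than raising on the
          # name collision, op construction uniquifies the second "a".
          tf.import_graph_def(gdef, name="")
          tf.import_graph_def(gdef, name="")
          print([op.name for op in g.get_operations()])  # e.g. ['a', 'a_1']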
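  And for the tf.contrib.framework.sort item, a sketch of the mechanism it wraps: tf.nn.top_k with k equal to the tensor's size yields a descending sort.

      import tensorflow as tf  # TF 1.x API

      x = tf.constant([3.0, 1.0, 4.0, 1.0, 5.0])
      values, _ = tf.nn.top_k(x, k=tf.size(x))  # descending sort via top_k
      with tf.Session() as sess:
          print(sess.run(values))  # [5. 4. 3. 1. 1.]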
* Move most foo.BUILD files into third_party [Justine Tunney, 2016-12-29]
  This frees up space on the TensorFlow GitHub home page! Change: 143161497