path: root/tensorflow/core/BUILD
* On Android, don't preserve doc strings for registered ops. (A. Unique TensorFlower, 2016-03-10)
  In a binary that doesn't filter op registrations, this saves >185k. Also, this may save a few cycles during startup for mobile (untested), since the doc string won't be parsed. This introduces a TF_LEAN_BINARY macro that we can use to control other such options. Change: 116889235
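  A stand-alone sketch of the idea behind such a macro (illustrative builder, not TensorFlow's actual op-registration code):

```cpp
// Sketch only: when TF_LEAN_BINARY is defined, Doc() becomes an inline no-op,
// so the doc text is never parsed and can be optimized away on lean builds.
#include <cstdio>
#include <string>

struct OpSketchBuilder {
  std::string doc;
#ifndef TF_LEAN_BINARY
  OpSketchBuilder& Doc(const char* text) { doc = text; return *this; }
#else
  OpSketchBuilder& Doc(const char*) { return *this; }  // dropped on lean builds
#endif  // TF_LEAN_BINARY
};

int main() {
  OpSketchBuilder b;
  b.Doc("Long human-readable documentation that lean (Android) builds drop.");
  std::printf("stored doc bytes: %zu\n", b.doc.size());
}
```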
* Rollback of "Add MirrorPad op. This op is a variety of Pad op implementing ↵Gravatar Vijay Vasudevan2016-03-09
| | | | | | | reflect and symmetric modes of Numpy pad." Change: 116836742
* Add MirrorPad op. This op is a variety of Pad op implementing reflect and symmetric modes of Numpy pad. (A. Unique TensorFlower, 2016-03-09)
  Change: 116828726
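  For context, "reflect" mirrors around the edge value without repeating it, while "symmetric" includes the edge value (NumPy semantics). A minimal 1-D sketch; mirror_pad_1d is an illustrative helper, not the op's implementation:

```cpp
#include <cstdio>
#include <vector>

// Illustrative 1-D mirror pad (assumes pad < x.size()).
std::vector<int> mirror_pad_1d(const std::vector<int>& x, int pad, bool symmetric) {
  const int n = static_cast<int>(x.size());
  const int offset = symmetric ? 1 : 0;  // include the edge value or not
  std::vector<int> out;
  for (int i = pad; i >= 1; --i) out.push_back(x[i - offset]);          // left pad
  out.insert(out.end(), x.begin(), x.end());                            // body
  for (int i = 1; i <= pad; ++i) out.push_back(x[n - 1 - i + offset]);  // right pad
  return out;
}

int main() {
  // reflect:   {1,2,3} -> 3 2 1 2 3 2 1
  // symmetric: {1,2,3} -> 2 1 1 2 3 3 2
  for (bool symmetric : {false, true}) {
    for (int v : mirror_pad_1d({1, 2, 3}, 2, symmetric)) std::printf("%d ", v);
    std::printf("\n");
  }
}
```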
* Replace uses of RE2 with Scanner, for sources that are used on mobile. (A. Unique TensorFlower, 2016-03-09)
  This CL also adds the Scanner class to do simple scans over strings, to mimic regexp behavior like [a-zA-Z][a-zA-Z0-9]* with: Scanner scan(s); scan.One(Scanner::LETTER); scan.Any(Scanner::LETTER_DIGIT); bool matched = scan.GetResult(); Change: 116803757
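  A minimal sketch of that pattern as a helper function; the header paths and namespace are assumptions based on where the class usually lives, not verified against this revision:

```cpp
// Sketch: checks whether `s` matches [a-zA-Z][a-zA-Z0-9]* without RE2,
// mirroring the Scanner calls shown in the commit message above.
#include "tensorflow/core/lib/core/stringpiece.h"
#include "tensorflow/core/lib/strings/scanner.h"

bool IsIdentifier(tensorflow::StringPiece s) {
  tensorflow::strings::Scanner scan(s);
  scan.One(tensorflow::strings::Scanner::LETTER);        // first char: [a-zA-Z]
  scan.Any(tensorflow::strings::Scanner::LETTER_DIGIT);  // rest: [a-zA-Z0-9]*
  return scan.GetResult();
}
```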
* Remove redundant filegroup definitions from tensorflow/core/BUILD and just use filegroups from tensorflow/core/kernels/BUILD for Android targets. (Andrew Harp, 2016-03-08)
  Change: 116688861
* Remove some unnecessary py_api_versions from BUILD files. (Vijay Vasudevan, 2016-03-08)
  Change: 116619279
* TensorFlow: create a CPU device only when GPUs are also potentially linked into the binary. (Vijay Vasudevan, 2016-03-08)
  This makes sure that the :core_cpu target will never have any GPU code linked in; the previous behavior was confusing and weird. Change: 116618884
* Enables all `image_ops` functions to accept NumPy arrays. (Derek Murray, 2016-03-07)
  Fixes #1409. Change: 116554875
* TensorFlow: use GOOGLE_CUDA to prevent GPU library code from needing to be linked into CPU-only binaries. (Vijay Vasudevan, 2016-03-05)
  Change: 116473970
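  A stand-alone sketch of the guard pattern (compile with -DGOOGLE_CUDA=1 to simulate a CUDA build); in TensorFlow the same #if fences GPU kernel registrations and CUDA-specific code so CPU-only binaries never link them:

```cpp
// Sketch only: code inside the guard, and any CUDA-only dependencies it
// pulls in, exists only when GOOGLE_CUDA is defined by the build.
#include <cstdio>

#if GOOGLE_CUDA
// In a real build this branch would call into CUDA; here it just marks the path.
static const char* BuildMode() { return "GPU path compiled in"; }
#else
static const char* BuildMode() { return "CPU-only binary"; }
#endif  // GOOGLE_CUDA

int main() { std::printf("%s\n", BuildMode()); }
```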
* TensorFlow: enable CUDA host memory allocation for GPU-compatible buffers when copying to the CPU device. (Vijay Vasudevan, 2016-03-05)
  Re-arranges some of the internal GPU libraries to be library- vs. runtime-specific. Change: 116472314
* Merge attention_ops (which only has one op) into image_ops. (Josh Levenberg, 2016-03-03)
  Change: 116256253
* Merge summary_ops into logging_ops (both of which were small). (Josh Levenberg, 2016-03-03)
  Change: 116199874
* Remove extra LICENSE files. (Josh Levenberg, 2016-03-03)
  Change: 116188107
* Add optional comprehensive logging of memory allocation/deallocation events. (A. Unique TensorFlower, 2016-03-01)
  When enabled, the following events are recorded:
  - The start of a step, with the numerical step_id and a textual handle describing the step.
  - A Tensor allocation, including the step_id, the name of the OpKernel, the data type, shape, allocation size, allocation_id, data pointer location, and allocator used (the allocation_id is local to an allocator).
  - A Tensor deallocation, including the allocation_id and allocator used.
  - A raw memory allocation, including the step_id, the name of the component (e.g. Eigen), the number of bytes, data pointer location, allocation_id and allocator used.
  - A raw memory deallocation, including the step_id, the name of the component (e.g. Eigen), allocation_id and allocator used.
  For now many Tensor allocations show 'unknown' for the kernel and step_id. These mostly come from Tensors allocated by the system from protocol buffers, and Tensors allocated by Ops using the Tensor constructor directly instead of calling OpKernelContext::allocate_temp. The latter can in principle be cleaned up one by one as necessary. The former would require some plumbing to associate an allocation with the appropriate step_id.
  With this CL memory logging is enabled by raising the VLOG level to 1. Once there is an ability to set process-wide options programmatically, it would make sense to update the machinery to do that. Currently recorded events are logged as INFO, and they can all be retrieved by filtering the log for lines including __LOG_MEMORY__. Some example lines are as follows:
  I0301 13:38:55.797563 81179 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorAllocation { step_id: -6 kernel_name: "Unknown (from Proto)" tensor { dtype: DT_FLOAT shape { } allocation_description { requested_bytes: 4 allocated_bytes: 4 allocator_name: "cuda_host" allocation_id: 2 has_single_reference: true ptr: 8717861408 } } }
  I0301 13:38:55.802245 81179 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorAllocation { step_id: -6 kernel_name: "Unknown" tensor { dtype: DT_FLOAT shape { } allocation_description { requested_bytes: 4 allocated_bytes: 256 allocator_name: "gpu_bfc" allocation_id: 1 has_single_reference: true ptr: 47378989056 } } }
  I0301 13:38:55.802347 81179 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorDeallocation { allocation_id: 2 allocator_name: "cuda_host" }
  [...]
  I0301 13:38:55.806454 81179 log_memory.cc:18] __LOG_MEMORY__ MemoryLogStep { step_id: 1 handle: "->/init;0" }
  I0301 13:38:55.806659 81220 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorOutput { step_id: 1 kernel_name: "random_normal/shape" tensor { dtype: DT_INT32 shape { dim { size: 4 } } allocation_description { requested_bytes: 16 allocated_bytes: 16 allocator_name: "cuda_host" allocation_id: 1 ptr: 8717860896 } } }
  [...]
  I0301 13:38:56.362898 81218 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorAllocation { step_id: 1 kernel_name: "conv1/truncated_normal" tensor { dtype: DT_FLOAT shape { dim { size: 11 } dim { size: 11 } dim { size: 3 } dim { size: 96 } } allocation_description { requested_bytes: 139392 allocated_bytes: 139520 allocator_name: "gpu_bfc" allocation_id: 36 has_single_reference: true ptr: 47379030016 } } }
  I0301 13:38:56.362894 81217 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorDeallocation { allocation_id: 24 allocator_name: "gpu_bfc" }
  I0301 13:38:56.362903 81213 log_memory.cc:18] __LOG_MEMORY__ MemoryLogTensorOutput { step_id: 1 kernel_name: "conv5/truncated_normal/mul" tensor { dtype: DT_FLOAT shape { dim { size: 3 } dim { size: 3 } dim { size: 1024 } dim { size: 1024 } } allocation_description { requested_bytes: 37748736 allocated_bytes: 37748736 allocator_name: "gpu_bfc" allocation_id: 34 ptr: 48512711168 } } }
  [...]
  I0229 16:39:57.482980 76558 log_memory.cc:18] __LOG_MEMORY__ MemoryLogRawAllocation { step_id: 13 operation: "xentropy/EigenAllocator" num_bytes: 64 ptr: 47386857472 allocation_id: 625 allocator_name: "gpu_bfc" }
  I0229 16:39:57.483147 76558 log_memory.cc:18] __LOG_MEMORY__ MemoryLogRawDeallocation { step_id: 13 operation: "xentropy/EigenAllocator" allocation_id: 625 allocator_name: "gpu_bfc" deferred: true }
  I0229 16:39:57.483197 76558 log_memory.cc:18] __LOG_MEMORY__ MemoryLogRawDeallocation { step_id: 13 operation: "xentropy/EigenAllocator" allocation_id: 625 allocator_name: "gpu_bfc" }
  Change: 116065112
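  Since the commit notes that every event can be recovered by filtering the log for the __LOG_MEMORY__ marker, here is a minimal stand-alone sketch of that filtering step; the log file name and the event-type bucketing are illustrative, not part of the CL:

```cpp
// Sketch: count logged memory events by type from a saved INFO log.
// "tf_info.log" is a hypothetical path; the event names come from the
// example log lines above. Nothing here is TensorFlow API.
#include <fstream>
#include <iostream>
#include <map>
#include <string>

int main() {
  const std::string kMarker = "__LOG_MEMORY__";
  const char* kEvents[] = {"MemoryLogStep",          "MemoryLogTensorAllocation",
                           "MemoryLogTensorOutput",  "MemoryLogTensorDeallocation",
                           "MemoryLogRawAllocation", "MemoryLogRawDeallocation"};
  std::map<std::string, int> counts;
  std::ifstream log("tf_info.log");
  for (std::string line; std::getline(log, line);) {
    if (line.find(kMarker) == std::string::npos) continue;  // keep only memory events
    for (const char* event : kEvents) {
      if (line.find(event) != std::string::npos) { ++counts[event]; break; }
    }
  }
  for (const auto& kv : counts) std::cout << kv.first << ": " << kv.second << "\n";
}
```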
* Added a benchmark for the performance of dynamic rnn with memory swapping. (Yuan Yu, 2016-02-29)
  For the large model, the overhead is roughly 35%. More improvements over this baseline implementation are coming. But if you have long sequence models that are currently running out of memory, I encourage you to try this out.
  Calculation: Dynamic LSTM, No Memory Swap vs. Memory Swap (times in seconds)
  batch  max_t  units  no_swap   swap      swap/no_swap
  512    100    512    0.702892  0.946286  1.346275
  512    100    256    0.292875  0.451330  1.541033
  512    100    128    0.162116  0.257621  1.589119
  Change: 115912325
* Fix OSS build breakage due to missing std namespace on vector. (A. Unique TensorFlower, 2016-02-29)
  Change: 115879158
* Bugfix to the Any protobuf. (Eugene Brevdo, 2016-02-29)
  Change: 115740568
* Delete deprecated core:kernel_lib target now that it is no longer referenced. (Josh Levenberg, 2016-02-29)
  Change: 115711086
* Add CTC (Connectionist Temporal Classification) Ops to TF contrib. (Eugene Brevdo, 2016-02-26)
  This includes:
  - ctc_loss
  - ctc_greedy_decoder
  - ctc_beam_search_decoder
  Change: 115683564
* Final fix to TestReporter (hopefully). (Eugene Brevdo, 2016-02-26)
  Change: 115675044
* Move core/common_runtime/gpu_device_context.h into core:gpu_lib, out of the non-gpu core:core_cpu* targets. (Josh Levenberg, 2016-02-26)
  Change: 115641392
* Initial version of the open-source distributed TensorFlow runtime. (Derek Murray, 2016-02-25)
  This includes a gRPC server (grpc_tensorflow_server) that can serve as both the master of a distributed TensorFlow computation and an individual worker in the computation. The GrpcSession class is included to allow client programs (including Python clients) to interact with a server. See tensorflow/core/distributed_runtime/README.md for usage instructions. This change partially addresses issue #23. Change: 115634191
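  As a rough illustration of the client side, a C++ client can point the regular Session API at a grpc:// target once GrpcSession is linked in; the server address below is hypothetical and the error handling is minimal, so treat this as a sketch rather than the documented usage:

```cpp
// Sketch: open a session against a running grpc_tensorflow_server.
#include <iostream>
#include <memory>
#include "tensorflow/core/public/session.h"

int main() {
  tensorflow::SessionOptions options;
  options.target = "grpc://localhost:2222";  // assumed server address/port
  std::unique_ptr<tensorflow::Session> session(tensorflow::NewSession(options));
  if (session == nullptr) {
    std::cerr << "Failed to create session (is GrpcSession linked in?)\n";
    return 1;
  }
  // From here, Create()/Run() behave as with a local session.
  return 0;
}
```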
* Rollback of "TestReporter is back in. Maybe also fixed the Android build."Gravatar Vijay Vasudevan2016-02-25
| | | | | Test fails. Change: 115602477
* Make gpu_lib for non-CUDA deps that we use in public kernels. (Vijay Vasudevan, 2016-02-25)
  Change: 115598732
* Execute TODO to rename io.* to save_restore_tensor.*. (Josh Levenberg, 2016-02-25)
  This will hopefully reduce confusion since io.* is not the implementation of the ".../kernels:io" build target. Change: 115593814
* TestReporter is back in. Maybe also fixed the Android build. (Eugene Brevdo, 2016-02-25)
  Change: 115589642
* TensorFlow: fix bug in StringPiece::contains which made it always return true. (Vijay Vasudevan, 2016-02-25)
  Add a unit test to catch this type of regression in the future. Change: 115573280
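  For reference, a correct substring check of this shape can be written with std::search; this is a stand-alone sketch mirroring the intent of the fix, not the actual StringPiece code:

```cpp
// Sketch: a contains() that actually checks for the substring, plus a tiny
// regression test of the kind the commit adds.
#include <algorithm>
#include <cassert>
#include <string>

bool Contains(const std::string& haystack, const std::string& needle) {
  return std::search(haystack.begin(), haystack.end(),
                     needle.begin(), needle.end()) != haystack.end();
}

int main() {
  assert(Contains("tensorflow", "flow"));
  assert(!Contains("tensorflow", "torch"));  // a broken contains() would return true here
  return 0;
}
```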
* Added TestReporter and test / benchmark reporting tools. (Eugene Brevdo, 2016-02-25)
  These tools are meant to allow recording of benchmark & unit test structured output to pbtxt files in a directory, only when the environment variable TEST_REPORT_FILE_PREFIX is set. For now, only saving of C++ microbenchmark output is supported. Change: 115518303
* Make core/framework/graph_def_util.h publicly accessible. (Josh Levenberg, 2016-02-24)
  Change: 115384748
* Give tensorflow/core/kernels/ its own BUILD file. (Josh Levenberg, 2016-02-24)
  Change: 115379524
* Adds a C++ test utility that makes it possible to run multi-process tests. (Derek Murray, 2016-02-23)
  This will be necessary for tests of the distributed runtime (issue #23). Change: 115339579
* Flush denormals to zero on both CPU and GPU. (Geoffrey Irving, 2016-02-19)
  Two different mechanisms are required. On the CPU, we push and pop the appropriate processor flags in the executor (for the master thread) *and* in each threadpool thread, since the processor flags are thread local. On the GPU, we set -ftz=true for both nvcc and gcudacc so that the kernels we build flush denormals to zero using instruction flags. Caveat: on the GPU, only single-precision denormals are flushed to zero; double precision is unchanged. Change: 115114845
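  On x86 the CPU-side mechanism boils down to per-thread SSE control-register bits; a minimal sketch using the standard SSE intrinsics (not TensorFlow's own wrapper), which is why it must be applied in every threadpool thread:

```cpp
// Sketch: enable flush-to-zero (FTZ) and denormals-are-zero (DAZ) on the
// current thread, run some work, then restore the previous modes. The MXCSR
// register is thread-local, so each worker thread must set these bits itself.
#include <pmmintrin.h>  // _MM_SET_DENORMALS_ZERO_MODE / _MM_GET_DENORMALS_ZERO_MODE
#include <xmmintrin.h>  // _MM_SET_FLUSH_ZERO_MODE / _MM_GET_FLUSH_ZERO_MODE

void RunWithDenormalsFlushed(void (*work)()) {
  const unsigned int old_ftz = _MM_GET_FLUSH_ZERO_MODE();
  const unsigned int old_daz = _MM_GET_DENORMALS_ZERO_MODE();
  _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
  _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
  work();  // denormal inputs/results are treated as zero inside here
  _MM_SET_FLUSH_ZERO_MODE(old_ftz);
  _MM_SET_DENORMALS_ZERO_MODE(old_daz);
}
```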
* Support NCHW in forward and backward convolution op. (Xiaoqiang Zheng, 2016-02-19)
  Test both layouts in tests. Change: 115096872
* TensorFlow: more fixes to the Android BUILD. (Vijay Vasudevan, 2016-02-17)
  Change: 114834125
* TensorFlow: fix the Android build. (Vijay Vasudevan, 2016-02-16)
  Change: 114831071
* Fix an error where a .cc file was accidentally included in extended_ops_headers. (A. Unique TensorFlower, 2016-02-16)
  Change: 114784491
* Split android_tensorflow_lib into lite and full versions, to make it easier to package custom operator sets with the core binary. (A. Unique TensorFlower, 2016-02-16)
  Change: 114781921
* Update versions of bower components to reflect those inside Google. (Dan Smilkov, 2016-02-16)
  This also fixes the problem where users are asked which version of polymer to install when they run `bower install`. Change: 114774859
* Rewrite of transpose so that its compilation time is tolerable. (A. Unique TensorFlower, 2016-02-16)
  Main approach:
  1. Do not instantiate templates for all TF types. Instead, the various types are cast to one of uint8/uint16/uint32/uint64/string.
  2. Use eigen3 for transposes of rank-2/3/4 tensors and fall back to a naive routine that is templatized only on the type T, not on NDIMS.
  Change: 114763098
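  The first point is essentially dispatch-by-element-size rather than by dtype, so far fewer template instantiations are emitted. A minimal sketch of that idea with illustrative names, not the kernel's actual code (the permutation logic is omitted; only the type erasure is shown):

```cpp
// Sketch: instantiate the data-movement template once per element *size*
// instead of once per TensorFlow dtype, so e.g. float/int32 share the
// uint32 instantiation and double/int64 share uint64.
#include <cstddef>
#include <cstdint>

template <typename U>
void MoveElements(const void* in, void* out, std::size_t count /*, perm, dims... */) {
  // A real transpose would permute indices here; this only shows the cast.
  const U* src = static_cast<const U*>(in);
  U* dst = static_cast<U*>(out);
  for (std::size_t i = 0; i < count; ++i) dst[i] = src[i];
}

void MoveBySize(const void* in, void* out, std::size_t count, std::size_t elem_size) {
  switch (elem_size) {  // one instantiation per size class, not per dtype
    case 1: MoveElements<std::uint8_t>(in, out, count); break;
    case 2: MoveElements<std::uint16_t>(in, out, count); break;
    case 4: MoveElements<std::uint32_t>(in, out, count); break;
    case 8: MoveElements<std::uint64_t>(in, out, count); break;
    default: break;  // e.g. string tensors need their own path
  }
}
```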
* Split all headers for extended ops into their own filegroup. (A. Unique TensorFlower, 2016-02-12)
  This allows one group to include headers from the other. Change: 114578983
* TensorFlow: fix the Android OSS build by removing a stray rule. (Vijay Vasudevan, 2016-02-12)
  Change: 114565136
* Link with -lm. (Manjunath Kudlur, 2016-02-12)
  Framework calls ceil somewhere deep inside. Change: 114539499
* Clean up saturate_cast, test, and move to tf.saturate_cast. (Geoffrey Irving, 2016-02-11)
  Change: 114470777
* Moves MemoryType inference code out of OpKernel so that it can be reused. (A. Unique TensorFlower, 2016-02-11)
  Change: 114448861
* Add argmax and equal_to to the Android extended ops. (A. Unique TensorFlower, 2016-02-10)
  Change: 114378906
* Add versions to checkpoints. (Geoffrey Irving, 2016-02-10)
  Checkpoints now have a version scheme analogous to that for GraphDefs. We have no plans to ever deprecate a checkpoint version, but it's good to have the scheme in place in case we need to. Change: 114364388
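  For context, the GraphDef-style scheme referred to here is a producer / min-consumer pair plus a bad-consumers list; a hedged sketch of that compatibility check (field and function names are illustrative, not the checkpoint code):

```cpp
// Sketch of a GraphDef-style version check. A reader with version `consumer`
// accepts data written by `data.producer` iff the data is new enough for the
// reader, the reader is new enough for the data, and the reader is not on the
// data's known-bad list.
#include <algorithm>
#include <vector>

struct Versions {
  int producer = 0;                 // version of the writer
  int min_consumer = 0;             // oldest reader the writer still supports
  std::vector<int> bad_consumers;   // readers known to mishandle this data
};

bool IsCompatible(const Versions& data, int consumer, int min_producer) {
  if (data.producer < min_producer) return false;  // data too old for this reader
  if (consumer < data.min_consumer) return false;  // reader too old for this data
  return std::find(data.bad_consumers.begin(), data.bad_consumers.end(),
                   consumer) == data.bad_consumers.end();
}
```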
* Split the TensorFlow Android standard ops into two modules to enable separate compilation. (A. Unique TensorFlower, 2016-02-10)
  Change: 114356795
* Improvements to the TensorFlow Android rules. (A. Unique TensorFlower, 2016-02-09)
  Change: 114273085
* Move private "framework_headers" target out of the public section of the BUILD file. (Josh Levenberg, 2016-02-09)
  Change: 114255368
* Clean up pylint directives: remove some unnecessary ones and be more careful about re-enabling wildcard-import where appropriate. (David G. Andersen, 2016-02-08)
  Change: 114167131