Commit message | Author | Age
...
* Get rid of the attempt to adaptively use Cholesky for inverting SPD matrices in matrix_inverse. (A. Unique TensorFlower, 2016-02-22)
  Cholesky is not sufficiently faster than PartialPivLU in Eigen to be worth the extra pass over the data to check for symmetry, even for relatively large matrices. For (n=2k) SPD matrices Cholesky does give a ~5% speedup over LU, but for other matrix types we see an equivalent slowdown. Change: 115276994
* Make the TensorFlow Serving tutorial layout consistent with the other tutorials on tensorflow.org. (Jarek Wilkiewicz, 2016-02-22)
  Change: 115270889
* Correct device mis-placement for switch with apply_gradients. (Lukasz Kaiser, 2016-02-22)
  Change: 115269320
* Upgraded to the latest version of Eigen, which adds a missing #include. (Benoit Steiner, 2016-02-22)
  Change: 115268843
* Added the ability to track allocation sizes in the tracking allocator itself if the underlying allocator doesn't already do it. (Benoit Steiner, 2016-02-22)
  Change: 115263741
* Fix the license in tensor_format.cc. (Xiaoqiang Zheng, 2016-02-22)
  Change: 115261957
* Change the API for node-radar so that, instead of having many callbacks, it returns metadata at a single point of control. (A. Unique TensorFlower, 2016-02-22)
  Change: 115255052
* Added support for half floats to Eigen, which is the first step to supporting half floats in TensorFlow. (Benoit Steiner, 2016-02-22)
  The code was tested on Tegra X1. Change: 115253733
* Add IsAligned method to Tensor. (A. Unique TensorFlower, 2016-02-22)
  This is useful when you produce tensors using Split but don't want to copy them unnecessarily if you need them aligned. Without it, you must pessimistically assume that they are all unaligned and copy all of them. Change: 115251844
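The check-before-copy pattern this commit enables can be sketched in pure Python with the stdlib array module; the ALIGNMENT constant and helper names here are illustrative, not TensorFlow's API.

```python
import array

ALIGNMENT = 16  # illustrative; real TF checks Eigen's required alignment


def is_aligned(buf, alignment=ALIGNMENT):
    # array.buffer_info() returns (address, element_count); mimic
    # Tensor::IsAligned() by checking the buffer's start address.
    address, _ = buf.buffer_info()
    return address % alignment == 0


def maybe_copy(buf):
    # Copy only when the buffer is actually unaligned, instead of
    # pessimistically copying every tensor produced by Split.
    if is_aligned(buf):
        return buf
    return array.array(buf.typecode, buf)


a = array.array('d', [1.0, 2.0, 3.0])
b = maybe_copy(a)  # same object if aligned, a fresh copy otherwise
```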
* Added a "mark_as_used" option to unique_name() to control whether the call changes state. (Sherry Moore, 2016-02-22)
  If it is set to False, the new name that would be used is returned without actually marking the name as used. Change: 115249981
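The peek-without-consuming semantics can be sketched as follows; this is a minimal illustration of the behavior described above, not TensorFlow's implementation.

```python
class NameScope:
    """Minimal sketch of unique_name() with a mark_as_used flag."""

    def __init__(self):
        self._used = {}  # base name -> number of times consumed

    def unique_name(self, name, mark_as_used=True):
        count = self._used.get(name, 0)
        new_name = name if count == 0 else "%s_%d" % (name, count)
        if mark_as_used:
            # Only mutate state when asked to; otherwise just peek.
            self._used[name] = count + 1
        return new_name


g = NameScope()
g.unique_name("foo", mark_as_used=False)  # returns "foo", state unchanged
g.unique_name("foo")                      # returns "foo", now consumed
g.unique_name("foo")                      # returns "foo_1"
```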
* Fix division by 0 bug in EventAccumulator. (A. Unique TensorFlower, 2016-02-22)
  Change: 115249194
* Add shape function for RandomCrop. (A. Unique TensorFlower, 2016-02-22)
  The absence of a shape function makes import_graph_def() fail when RandomCrop exists in a GraphDef. Change: 115243268
* Add operator overload for !=. (A. Unique TensorFlower, 2016-02-22)
  Change: 115243253
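In Python terms, the convention this kind of change follows is that once equality is overloaded elementwise, inequality should be overloaded to match; the TensorLike class below is a hypothetical illustration, not the class this commit touched.

```python
class TensorLike:
    """Toy elementwise comparisons: != must mirror the overloaded ==."""

    def __init__(self, values):
        self.values = list(values)

    def __eq__(self, other):
        return [a == b for a, b in zip(self.values, other.values)]

    def __ne__(self, other):
        # Without this, `a != b` would fall back to identity-based
        # inequality and disagree with the elementwise ==.
        return [a != b for a, b in zip(self.values, other.values)]
```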
* Add dummy Variable.__iter__ to avoid an endless hang when accidentally treating a var as a list. (A. Unique TensorFlower, 2016-02-22)
  Change: 115243053
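The hang comes from Python's legacy iteration protocol: an object with `__getitem__` but no `__iter__` is iterated by calling `__getitem__(0)`, `__getitem__(1)`, ... until IndexError, which a sliceable Variable never raises. A sketch of the fix (not TF's actual Variable class):

```python
class Variable:
    """Sketch of the hang and its fix."""

    def __getitem__(self, i):
        return i  # stands in for slicing support; never raises IndexError

    def __iter__(self):
        # Without this, `for x in v` or `list(v)` falls back to
        # __getitem__ with i = 0, 1, 2, ... and never terminates.
        raise TypeError("'Variable' object is not iterable")
```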
* Adds implementations of PosixEnv::SchedClosure[After](). (Derek Murray, 2016-02-22)
  The current implementations are very simplistic: they spawn a thread in response to each call. This is needed because some users of SchedClosure rely on the ability to spawn blocking closures, and we lack an unbounded threadpool. TODO(mrry): Replace the currently-blocking users of this API with asynchronous implementations. Change: 115239594
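The thread-per-call approach described above can be sketched in Python with the stdlib threading module; function names are illustrative, and the real implementation is C++.

```python
import threading


def sched_closure(closure):
    """Spawn a dedicated thread per closure, so a blocking closure
    cannot starve a bounded pool (there is no pool at all here)."""
    t = threading.Thread(target=closure)
    t.start()
    return t


def sched_closure_after(delay_secs, closure):
    """Delayed variant: a timer thread fires the closure later."""
    t = threading.Timer(delay_secs, closure)
    t.start()
    return t
```

The obvious cost is one OS thread per call, which is why the commit calls this simplistic; it trades efficiency for the guarantee that blocking closures make progress.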
* Adds functionality to the platform library. (Derek Murray, 2016-02-22)
  The testing::SrcDir() function supports data dependencies in cc_test targets. Change: 115239392
* Replace info-card non-printable whitespace with <wbr>. (A. Unique TensorFlower, 2016-02-22)
  Change: 115220444
* Add a few more special GPU kernels for int32 and bool. (Yuan Yu, 2016-02-22)
  This improves the performance of dynamic RNN into the reasonable range when compared with cond-based static unrolling.

  Before:

  Graph Creation: Static Unroll vs. Dynamic Unroll LSTM
    max_t  dt(static)  dt(dynamic)  dt(dynamic)/dt(static)
        1    0.973783     1.598944                1.641992
       25   13.994146     1.849802                0.132184
       50   27.849715     2.052574                0.073702

  Calculation: Static Unroll with Dynamic Flow LSTM vs. Dynamic Unroll LSTM
    batch  max_t  units  gpu    dt(static)  dt(dynamic)  dt(dynamic)/dt(static)
      256     50    512  False    1.262335     1.349654                1.069172
      256     50    256  False    0.720269     0.742385                1.030706
      256     50    128  False    0.342915     0.360554                1.051439
      256    100    512  False    2.512101     2.592826                1.032134
      256    100    256  False    1.398599     1.449359                1.036294
      256    100    128  False    0.688278     0.723332                1.050930
      512     50    512  False    1.777011     2.040112                1.148058
      512     50    256  False    0.854183     0.915705                1.072024
      512     50    128  False    0.609203     0.624703                1.025443
      512    100    512  False    3.731255     4.289601                1.149640
      512    100    256  False    1.763375     1.867427                1.059007
      512    100    128  False    1.226971     1.274628                1.038841
      256     50    512  True     0.190479     0.217636                1.142570
      256     50    256  True     0.086440     0.119876                1.386814
      256     50    128  True     0.061334     0.097079                1.582790
      256    100    512  True     0.381617     0.432454                1.133215
      256    100    256  True     0.174479     0.239955                1.375264
      256    100    128  True     0.122436     0.190479                1.555740
      512     50    512  True     0.322039     0.355348                1.103433
      512     50    256  True     0.129060     0.163209                1.264603
      512     50    128  True     0.073067     0.106976                1.464091
      512    100    512  True     0.653037     0.719606                1.101936
      512    100    256  True     0.259759     0.323882                1.246856
      512    100    128  True     0.147856     0.215792                1.459475

  After:

  Graph Creation: Static Unroll vs. Dynamic Unroll LSTM
    max_t  dt(static)  dt(dynamic)  dt(dynamic)/dt(static)
        1    0.945166     1.643999                1.739376
       25   13.471901     1.787826                0.132708
       50   26.668288     2.041938                0.076568

  Calculation: Static Unroll with Dynamic Flow LSTM vs. Dynamic Unroll LSTM
    batch  max_t  units  gpu    dt(static)  dt(dynamic)  dt(dynamic)/dt(static)
      256     50    512  False    1.282594     1.293548                1.008540
      256     50    256  False    0.707062     0.738919                1.045055
      256     50    128  False    0.353723     0.365117                1.032211
      256    100    512  False    2.573490     2.579687                1.002408
      256    100    256  False    1.397638     1.448193                1.036172
      256    100    128  False    0.699666     0.727913                1.040371
      512     50    512  False    1.755335     1.849683                1.053749
      512     50    256  False    0.857895     0.917298                1.069242
      512     50    128  False    0.606808     0.625990                1.031610
      512    100    512  False    3.608412     3.964380                1.098649
      512    100    256  False    1.744636     1.862331                1.067461
      512    100    128  False    1.221435     1.277420                1.045835
      256     50    512  True     0.191454     0.204069                1.065890
      256     50    256  True     0.083181     0.092068                1.106844
      256     50    128  True     0.055699     0.064500                1.158020
      256    100    512  True     0.377481     0.403046                1.067727
      256    100    256  True     0.171492     0.189591                1.105542
      256    100    128  True     0.112558     0.135522                1.204021
      512     50    512  True     0.324426     0.348642                1.074641
      512     50    256  True     0.125665     0.143196                1.139510
      512     50    128  True     0.069971     0.077949                1.114019
      512    100    512  True     0.670467     0.704176                1.050278
      512    100    256  True     0.256430     0.300047                1.170094
      512    100    128  True     0.142816     0.161151                1.128383

  Change: 115179042
* Add cache_device to Variable construction and variable_scope. (Eugene Brevdo, 2016-02-22)
  This is necessary for, e.g., RNNs where one wants to cache Variables locally even when they are accessed through a conditional like cond. Without local caching, each cond creates a Switch that bypasses the current Variable copy-deduplication code and forces a (possibly slow) copy on each iteration. With local caching, the Variable is copied once to the local device and that local copy is accessed on each iteration. Change: 115151788
* Temporarily disable denormal test for open source. (Geoffrey Irving, 2016-02-20)
  The test is breaking for OSS, and I can't look at it in detail right now. The fact that it breaks is harmless (it's equivalent to yesterday), so disable the test in the open source case for the moment. Change: 115128265
* Add assert_scalar_int to contrib. (A. Unique TensorFlower, 2016-02-19)
  Change: 115121755
* Flush denormals to zero on both CPU and GPU. (Geoffrey Irving, 2016-02-19)
  Two different mechanisms are required. On the CPU, we push and pop the appropriate processor flags in the executor (for the master thread) *and* in each threadpool thread, since the processor flags are thread-local. On the GPU, we set -ftz=true for both nvcc and gcudacc so that the kernels we build flush denormals to zero via instruction flags. Caveat: on GPU, only single-precision denormals are flushed to zero; double precision is unchanged. Change: 115114845
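For reference, a denormal (subnormal) is a float below the smallest normal magnitude, representable at reduced precision but often slow in hardware; a quick pure-Python check (CPython itself does not flush denormals, so the values below stay nonzero):

```python
import sys

smallest_normal = sys.float_info.min   # smallest normal double, ~2.23e-308
denormal = smallest_normal / 2.0       # a subnormal double

# Subnormals extend the representable range below float_info.min.
assert 0.0 < denormal < smallest_normal

# Under flush-to-zero (the mode this commit enables in TF kernels),
# such a value would instead be treated as 0.0.
```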
* Fixed a small mistake that prevented gpu_bfc_allocator_test from compiling. (Benoit Steiner, 2016-02-19)
  Change: 115113975
* Adopt TensorFlow's standard variable regularization mechanism. (A. Unique TensorFlower, 2016-02-19)
  Support for variable regularization was recently added to tf.get_variable(), and layers.py has been modified to take advantage of it. This CL changes the semantics of tf.fully_connected() to match TensorFlow's standard conventions: a regularization function is applied only at the time a variable is first created. Change: 115111632
* Fixing the backward-compatibility test. (Xiaoqiang Zheng, 2016-02-19)
  Change: 115111581
* Updated image recognition docs to point to new tutorial. (Pete Warden, 2016-02-19)
  Change: 115111428
* Add tag option to summarize_tensor. (A. Unique TensorFlower, 2016-02-19)
  Change: 115102931
* Support NCHW in the forward and backward convolution ops. (Xiaoqiang Zheng, 2016-02-19)
  Test both layouts in tests. Change: 115096872
* Exposes the memory limit in the allocator's stats. (A. Unique TensorFlower, 2016-02-19)
  Change: 115036211
* TensorFlow: disable proto_test for now, since we don't know how to support it in pip-installed form. (Vijay Vasudevan, 2016-02-18)
  Change: 115034582
* Loosen test tolerances that seem to fail on Mac. (Vincent Vanhoucke, 2016-02-18)
  Change: 115027725
* TensorFlow: make pip wheel depend on 3.0.0b2 protobuf, since that's roughly what we package too. (Vijay Vasudevan, 2016-02-18)
  Change: 115027171
* Create tf.image.central_crop() to output a centrally cropped version of an image. (Jon Shlens, 2016-02-18)
  Change: 115023941
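The offset arithmetic behind central cropping can be sketched in pure Python on a list-of-rows "image"; tf.image.central_crop operates on tensors, but the idea is the same.

```python
def central_crop(image, central_fraction):
    """Keep the central `central_fraction` of each dimension.

    `image` is a list of rows; real TF takes a 3-D tensor.
    """
    assert 0.0 < central_fraction <= 1.0
    height, width = len(image), len(image[0])
    # Discard an equal margin on each side of both dimensions.
    off_h = int(height * (1.0 - central_fraction) / 2.0)
    off_w = int(width * (1.0 - central_fraction) / 2.0)
    return [row[off_w:width - off_w] for row in image[off_h:height - off_h]]


img = [[r * 4 + c for c in range(4)] for r in range(4)]
crop = central_crop(img, 0.5)  # keeps the central 2x2 block
```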
* Fixed new lines in URLs in file format docs for bug #1169. (Pete Warden, 2016-02-18)
  Change: 115018272
* Fixed new lines in URLs in file format docs for bug #1169. (Pete Warden, 2016-02-18)
  Change: 115017744
* Added _MetaGraphFilename() function to return the name of the meta_graph file. (Sherry Moore, 2016-02-18)
  Remove the correct file in the case of sharded checkpoints. Change: 115015294
* TensorFlow: no such thing as LOG(INFO) << endl. (Vijay Vasudevan, 2016-02-18)
  Change: 115013578
* TensorFlow: change cuda-diagnostics to search for so.1. (Vijay Vasudevan, 2016-02-18)
  Change: 115010103
* Support more than two session factories. (Derek Murray, 2016-02-18)
  This CL changes how the session factory is chosen. Previously, an empty target would always use DIRECT_SESSION, and a non-empty target would always use REMOTE_SESSION. In preparation for multiple distributed session implementations (issue #23), we now delegate to the SessionFactory to see whether it accepts a given SessionOptions. Existing programs should continue to work unmodified. NOTE: this CL assumes that the domains of the registered session factories do not overlap. We may need to revisit this in the future. Change: 115008046
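The delegation scheme described above can be sketched as a small registry; the class and method names here are illustrative Python stand-ins (the real mechanism is a C++ SessionFactory registry keyed on SessionOptions).

```python
class SessionFactory:
    def accepts_options(self, target):
        raise NotImplementedError

    def new_session(self, target):
        return (type(self).__name__, target)


class DirectSessionFactory(SessionFactory):
    def accepts_options(self, target):
        return target == ""  # in-process sessions use an empty target


class GrpcSessionFactory(SessionFactory):
    def accepts_options(self, target):
        return target.startswith("grpc://")


_factories = [DirectSessionFactory(), GrpcSessionFactory()]


def get_factory(target):
    # Ask every registered factory; the non-overlap assumption means
    # exactly one should accept any valid target.
    matches = [f for f in _factories if f.accepts_options(target)]
    if len(matches) != 1:
        raise ValueError("expected exactly 1 factory for %r, got %d"
                         % (target, len(matches)))
    return matches[0]
```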
* Added documentation for import_meta_graph and export_meta_graph. (Sherry Moore, 2016-02-18)
  Change: 115005379
* Tracks some basic statistics in the CPU and GPU allocators. (A. Unique TensorFlower, 2016-02-18)
  The basic stats are basically free in the GPU allocator. The CPU stats collection can optionally be turned on. Change: 115000479
* Enable deserializing protos > 64MB in length. (Manjunath Kudlur, 2016-02-18)
  Update the protobuf commit. Change: 114990608
* TensorFlow: switch dsoloader path for libcuda to end in .so.1. (Vijay Vasudevan, 2016-02-18)
  Change: 114990321
* Move iron-component-page from dev-deps to deps in bower.json, since it is used by the tf-regex-group, tf-categorizer and tf-collapsable-pane components. (Dan Smilkov, 2016-02-18)
  Also make the version number exact instead of ^1.0.0, and make it match the version inside Google. Change: 114987970
* Context manager should return self. (A. Unique TensorFlower, 2016-02-18)
  Change: 114985229
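The bug class this fixes: if `__enter__` returns None (Python's implicit default), then `with obj as x:` binds None to `x`. A minimal sketch, with a hypothetical Device class standing in for whatever manager the commit fixed:

```python
class Device:
    """Sketch: __enter__ must return self so `as` binds the manager."""

    def __enter__(self):
        # Returning nothing (i.e. None) would make `with Device() as d:`
        # bind d = None, breaking any use of d in the block.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        return False  # don't swallow exceptions


with Device() as dev:
    assert dev is not None  # dev is the Device instance itself
```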
* Avoid use of std::function in GPUBFCAllocator deallocation path. (A. Unique TensorFlower, 2016-02-18)
  Speeds up allocation microbenchmarks by 8% to 15%.

  Run on REDACTED (40 X 2801 MHz CPUs); 2016/02/17-16:56:24
  CPU: Intel Ivybridge with HyperThreading (20 cores) dL1:32KB dL2:256KB dL3:25MB

  Benchmark                  Base (ns)  New (ns)  Improvement
  -----------------------------------------------------------
  BM_Allocation                    184       164       +10.9%
  BM_AllocationThreaded/1          185       169        +8.6%
  BM_AllocationThreaded/4         1966      1771        +9.9%
  BM_AllocationThreaded/16        9989      9197        +7.9%
  BM_AllocationDelayed/1           204       183       +10.3%
  BM_AllocationDelayed/10          171       146       +14.6%
  BM_AllocationDelayed/100         152       130       +14.5%
  BM_AllocationDelayed/1000        155       131       +15.5%

  Change: 114984794
* Create tensorboard/lib/js/backend, a typed, convenient interface for talking to TensorBoard. (A. Unique TensorFlower, 2016-02-18)
  It loads data from the TensorBoard backend and presents a slightly cleaner abstraction. It is typed and tested. Change: 114984720
* TensorFlow: fix missing GetCudaVersion() for libcuda.so. h/t to @3XX0. (Vijay Vasudevan, 2016-02-18)
  Change: 114983764
* TensorFlow: change mutex.h to include the platform-independent thread_annotations, and change thread_annotations.h to prefer the default for Android. (Vijay Vasudevan, 2016-02-18)
  Change: 114980803
* Fix shape inference for tf.BatchMatrixSolve and tf.BatchMatrixSolveLs. (A. Unique TensorFlower, 2016-02-18)
  Change: 114975142