path: root/tensorflow
Commit message (Author, Age)
* Drop failed sub-streams during both Get and Return. (Todd Wang, 2018-08-03)
| The old code ensured that failed sub-streams would not be re-used, but had two flaws:
| 1) It only checked for failed sub-streams during Return.
| 2) It didn't actually remove the failed sub-streams from our state.
| The new code fixes these two flaws, and adds an extra test that explains why (1) is insufficient.
| PiperOrigin-RevId: 207333296
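The behaviour described in this commit message can be sketched roughly as follows. This is only an illustration in Python, not the actual change: `FakeSubStream`, `SubStreamPool`, `get`, `give_back` and `idle_count` are hypothetical names, and the idea that a stream can fail while sitting idle in the pool is an assumption about why checking only on Return falls short.

```python
class FakeSubStream:
    """Hypothetical stand-in for a sub-stream; `ok` becomes False on failure."""

    def __init__(self, name):
        self.name = name
        self.ok = True


class SubStreamPool:
    """Hypothetical pool that drops failed sub-streams on both get and return."""

    def __init__(self):
        self._idle = []

    def get(self):
        # Check on Get: a stream may have failed after it was returned.
        # Failed streams are popped and discarded, so they leave our state.
        while self._idle:
            stream = self._idle.pop()
            if stream.ok:
                return stream
        return None  # the caller would create a fresh sub-stream here

    def give_back(self, stream):
        # Check on Return: never store a stream that has already failed.
        if stream.ok:
            self._idle.append(stream)

    def idle_count(self):
        return len(self._idle)


pool = SubStreamPool()
a = FakeSubStream("a")
pool.give_back(a)      # healthy when returned
a.ok = False           # fails while idle (assumed scenario)
reused = pool.get()    # the failed stream is dropped, not handed out
```

With a check only in `give_back`, the stream `a` above would have been handed out again by `get`.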
* PUBLIC: Support bfloat16 for Spatial Partition (Youlong Cheng, 2018-08-03)
| RELNOTES: n/a
| PiperOrigin-RevId: 207333246
* [XLA:CPU] Migrate aot/runtime.{h,cc} to xla_compiled_cpu_function.{h,cc} (Sanjoy Das, 2018-08-03)
| As a follow-on cleanup for cl/206980796 ("Overhaul XLA:CPU's calling convention.") I want to introduce a BufferInfo class that encapsulates whether a buffer is a constant, an entry parameter or a temp without using the fragile "size < 0" scheme I have today.
| To do this efficiently I need a place to put the BufferInfo class that will be visible to MallocContiguousBuffers. Instead of creating (what seemed to me) an odd layering with BufferInfo in aot/runtime.h I decided to pull in the runtime into xla_compiled_cpu_function since that's the only user.
| PiperOrigin-RevId: 207333245
* [XLA] Show cumulative cycle percent in xla_hlo_profile table. (Justin Lebar, 2018-08-03)
| Looks like:
|   5624727 cycles (100.% 100Σ) :: 3865.8 usec [...] TOTAL
|   2121832 cycles (37.72% 38Σ) :: 1458.3 usec
|   1932379 cycles (34.36% 72Σ) :: 1328.1 usec
|   264366 cycles ( 4.70% 77Σ) :: 181.7 usec
| The first line with the total is a little weird, but I figured it was better to do it this way than to waste a precious character of horizontal space.
| I also considered rendering it as e.g. "Σ38%". This is slightly more expressive, but it gets hard to read pretty fast with two characters smushed against both of the numbers. I put the sigma at the end because I find it easier to read: With the sigma at the beginning, its tips often blend in with the first number; e.g. I find "Σ77" less readable than "77Σ".
| Similarly I considered displaying more than two significant figures in the percent, but since it's cumulative *anyway*, I didn't think these were relevant.
| This formatting is somewhat inconsistent with how we do the categories tables:
|   258 ( 6.68% Σ87.81%) non-fusion elementwise (12 ops)
| I can change these to match if we want, but I sort of think of them as a different case. The categories tables have a lot more whitespace in between entries (namely, one line per instruction in the category), so noisiness is not nearly as significant a concern.
| PiperOrigin-RevId: 207329731
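The cumulative column in the sample above can be reproduced from the cycle counts with a running sum. The following is a minimal sketch in Python, not the XLA implementation; the function name `cumulative_percents` is hypothetical, and the cycle counts are the ones quoted in the commit message.

```python
def cumulative_percents(cycles, total_cycles):
    """Return (percent, cumulative_percent) pairs for each entry, in order."""
    rows = []
    running = 0
    for c in cycles:
        running += c
        rows.append((100.0 * c / total_cycles, 100.0 * running / total_cycles))
    return rows


# Cycle counts taken from the sample table in the commit message.
total = 5624727
rows = cumulative_percents([2121832, 1932379, 264366], total)

# Per-entry percent to two decimals, cumulative percent to whole numbers.
per_entry = [f"{pct:.2f}" for pct, _ in rows]
cumulative = [round(cum) for _, cum in rows]
```

Rounding the cumulative value to a whole number matches the commit's choice of showing only two significant figures in that column.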
* Merge pull request #21282 from av8ramit:merge_r110_back (TensorFlower Gardener, 2018-08-03)
|\
| | PiperOrigin-RevId: 207329479
* | PUBLIC: PREDICT mode should respect ctx.device_assignment. (Youlong Cheng, 2018-08-03)
| | PiperOrigin-RevId: 207326276
* | Merge pull request #21002 from miaout17:protobuf (TensorFlower Gardener, 2018-08-03)
|\ \
| | | PiperOrigin-RevId: 207325536
* \ \ Merge pull request #21119 from bstriner:patch_eigen (TensorFlower Gardener, 2018-08-03)
|\ \ \
| | | | PiperOrigin-RevId: 207325529
* | | | Include same-line comments in origin_info. (Dan Moldovan, 2018-08-03)
| | | | PiperOrigin-RevId: 207325109
* | | | [XLA] Show metric name in categories table header. (Justin Lebar, 2018-08-03)
| | | | Instead of
| | | |   ********** microseconds above estimated optimum report **********
| | | |   [...]
| | | |   ********** categories table **********
| | | |   The left hand side numbers are microseconds above estimated optimum.
| | | |   [...]
| | | | we now print
| | | |   ********** microseconds above estimated optimum report **********
| | | |   [...]
| | | |   ********** categories table for microseconds above estimated optimum **********
| | | |   [...]
| | | | which I think is more explicit and harder to misread.
| | | | PiperOrigin-RevId: 207325046
* | | | Add experimental Callable API to ClientSession. (A. Unique TensorFlower, 2018-08-03)
| | | | PiperOrigin-RevId: 207323298
* | | | [tf.data] Add checkpointing for memory-based `cache()`. (Jiri Simsa, 2018-08-03)
| | | | PiperOrigin-RevId: 207320100
* | | | Merge pull request #21092 from karllessard:java-constants (TensorFlower Gardener, 2018-08-03)
|\ \ \ \
| | | | | PiperOrigin-RevId: 207319780
* | | | | Fix the pip install TLS issue for pip test builds. (Amit Patankar, 2018-08-03)
| | | | | PiperOrigin-RevId: 207319608
* | | | | [XLA:GPU] Add VLOG to ForThunk. (Justin Lebar, 2018-08-03)
| | | | | PiperOrigin-RevId: 207317857
* | | | | Fix tf.quantize_and_dequantize_(v2|v3) to be consistent with the doc. (Jingyue Wu, 2018-08-03)
| | | | | I choose round-half-to-even, which matches cudnnConvolutionBiasActivationForward and cudnnTransformTensor. I can add an attribute for the rounding mode in the future if necessary.
| | | | | PiperOrigin-RevId: 207316630
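For readers unfamiliar with the rounding mode named in this commit, the sketch below contrasts round-half-to-even with round-half-away-from-zero in plain Python. It only illustrates the two rounding rules; it is not the TensorFlow kernel, and both helper names are hypothetical.

```python
import math


def round_half_to_even(x):
    # Python's built-in round() already rounds exact halves to the nearest even integer.
    return round(x)


def round_half_away_from_zero(x):
    # Alternative rule shown for contrast: exact halves move away from zero.
    return math.copysign(math.floor(abs(x) + 0.5), x)


halves = [0.5, 1.5, 2.5, -0.5, -1.5]
to_even = [round_half_to_even(v) for v in halves]
away = [round_half_away_from_zero(v) for v in halves]
```

The two rules differ only on exact halves such as 0.5 and 2.5; every other input rounds identically.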
* | | | | Update loss and metric function weighting logic in keras. (Pavithra Vijay, 2018-08-03)
| | | | | - Fixes divide by zero error when all batch weights are 0.
| | | | | - Unifies the logic between the existing keras metrics and the new metrics module.
| | | | | - This change is not backward compatible (since logic is different).
| | | | | PiperOrigin-RevId: 207311700
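The divide-by-zero case mentioned in this commit arises when a weighted mean is normalised by the sum of the weights. A minimal sketch of guarding against it, in plain Python rather than Keras: the function name `safe_weighted_mean` is hypothetical, and returning 0.0 for an all-zero weight batch is an assumption, since the commit message does not say what the fixed code returns.

```python
def safe_weighted_mean(values, weights):
    """Weighted mean that avoids dividing by zero when all weights are 0."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    weight_sum = sum(weights)
    if weight_sum == 0:
        # Assumed fallback: report 0.0 instead of raising ZeroDivisionError.
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / weight_sum


all_zero = safe_weighted_mean([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
mixed = safe_weighted_mean([1.0, 3.0], [1.0, 1.0])
```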
* | | | | Allow setting server_def directly on TFE_Context. (Akshay Modi, 2018-08-03)
| | | | | Any time that the server def is updated, the context is effectively "reset" by clearing all the caches.
| | | | | - Check that the FLR returned is not a nullptr instead of seg faulting.
| | | | | - Consolidate caches within the context object.
| | | | | PiperOrigin-RevId: 207308086
* | | | | Merge pull request #21138 from samikama:WiringUpdate (TensorFlower Gardener, 2018-08-03)
|\ \ \ \ \
| | | | | | PiperOrigin-RevId: 207306967
* | | | | | Add support for float output arrays in Quantized custom operators (custom ops only). (A. Unique TensorFlower, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207306198
| | * | | | Add licence notice (karl@kubx.ca, 2018-08-03)
| | | | | |
* | | | | | Unbreaks tests broken after the defun while loop change. (Alexandre Passos, 2018-08-03)
| | | | | | Do not add placeholders to the function body as XLA cannot compile them.
| | | | | | PiperOrigin-RevId: 207299427
* | | | | | [XLA:GPU] cuBlas supports complex floats, use gemm instead of our O(n^3) implementation (Benjamin Kramer, 2018-08-03)
| | | | | | Also increase test coverage for C64 a bit.
| | | | | | PiperOrigin-RevId: 207297946
* | | | | | [XLA] In ReplayComputation, make it easier to see the final run if --xla_hlo_profile is enabled. (Justin Lebar, 2018-08-03)
| | | | | | I often find myself searching for the last profiling run in a ReplayComputation log. This makes it much easier to find.
| | | | | | PiperOrigin-RevId: 207294644
* | | | | | Automated rollback of commit 493d7588172bcf476309b3954db342839ca37872 (Akshay Agrawal, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207294037
* | | | | | Add a way to whitebox test deadness analysis. (Sanjoy Das, 2018-08-03)
| | | | | | I don't think adding whitebox tests is necessary for the code that's checked in today, but I'm working on a CL for which I'd prefer writing whitebox tests.
| | | | | | Also fix a minor issue with SymbolPredicate::ToString() where we were dropping the must_be_true() bit.
| | | | | | PiperOrigin-RevId: 207289695
* | | | | | Make `optimized_dataset_op_test` work with pip again. (Shivani Agrawal, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207289283
* | | | | | [XLA:GPU] Add a fast version of gemmStridedBatched for cuda 9.1 (Benjamin Kramer, 2018-08-03)
| | | | | | It's unfortunate that this was only added in 9.1, but I haven't found a good way of emulating the behavior on 9.0 without falling back to non-batched gemms.
| | | | | | PiperOrigin-RevId: 207286575
* | | | | | [XLA:GPU] Use strided batched gemm instead of building pointer tables. (Benjamin Kramer, 2018-08-03)
| | | | | | This is mostly a huge amount of plumbing just to call into the cublas functions. blasGemmStridedBatched has been available since CUDA 8.0.
| | | | | | For autotuning we'd need cublasGemmStridedBatchedEx, which is new in CUDA 9.2 so I didn't wire that up yet.
| | | | | | PiperOrigin-RevId: 207285707
* | | | | | Delete unused member (Alexandre Passos, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207284323
* | | | | | [Doc]: Fix #21344 (Asim Shankar, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207283527
* | | | | | Disable two eager on TPU tests until we find a fix (Igor Ganichev, 2018-08-03)
| | | | | | PiperOrigin-RevId: 207282495
* | | | | | Merge pull request #20318 from aaroey:create_inference_graph_use_grappler (TensorFlower Gardener, 2018-08-03)
|\ \ \ \ \ \
| | | | | | | PiperOrigin-RevId: 207278109
* | | | | | | Estimator test for multiclass, core head for multiclass. (A. Unique TensorFlower, 2018-08-03)
| | | | | | | PiperOrigin-RevId: 207268708
* | | | | | | [XLA:GPU] Allow multi-output fusion of element-wise instructions, in addition to loop fusions. (Thomas Joerg, 2018-08-03)
| | | | | | | PiperOrigin-RevId: 207253181
* | | | | | | [XLA:GPU] Forward batched dot to cublas instead of expanding it (Benjamin Kramer, 2018-08-03)
| | | | | | | This gives a huge speedup for users of batchdot. This is a minimal implementation without autotuning and without support for strided batch gemm.
| | | | | | | PiperOrigin-RevId: 207247740
* | | | | | | compat: Update forward compatibility horizon to 2018-08-03 (A. Unique TensorFlower, 2018-08-03)
| | | | | | | PiperOrigin-RevId: 207238096
* | | | | | | Add the CollectiveAllReduceStrategy. (Yuefeng Zhou, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207215423
* | | | | | | Add missing dependency of ops.py (Igor Ganichev, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207215039
* | | | | | | [XLA] Add guard for bytes accessed in HloCostAnalysis, need layout to determine. (Chris Leary, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207213865
* | | | | | | Implementation of ctc beam search decoder op in custom op fashion. (A. Unique TensorFlower, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207210333
* | | | | | | [tf.data / Bigtable] Update docs and method docstrings (Misha Brukman, 2018-08-02)
| | | | | | | * Add link to updating scope on a running VM
| | | | | | | * Add code formatting and Python syntax highlighting
| | | | | | | * Clarify kwargs argument formatting
| | | | | | | * Fix method name in docstring
| | | | | | | PiperOrigin-RevId: 207204628
* | | | | | | [XLA:GPU] Don't emit HostToDevice copies (Sanjoy Das, 2018-08-02)
| | | | | | | This became unnecessary with cl/206243319 "Implement constant buffer allocation for XLA:GPU".
| | | | | | | PiperOrigin-RevId: 207204478
| | | * | | | 2nd code review: Documentation fixes (karl@kubx.ca, 2018-08-02)
| | | | | | |
* | | | | | | Add some symbols back to saver.py temporarily to unbreak some users of non-public TF APIs (Allen Lavoie, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207197647
* | | | | | | Remove sagan example. (Xuechen Li, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207195679
* | | | | | | Experimental CL which adds `LatencyStatsDataset` op after each `Dataset` op to record latency on each edge of dataset input pipeline. (Shivani Agrawal, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207190025
* | | | | | | Respect log_device_placement when op is executed remotely. (Akshay Modi, 2018-08-02)
| | | | | | | For logging copies, we can set the device_policy to DEVICE_PLACEMENT_WARN
| | | | | | | PiperOrigin-RevId: 207186848
* | | | | | | In ring_reducer.cc if the reduction does not need a final op, then signal group_size_tensor_ready_ immediately, without initialization. (A. Unique TensorFlower, 2018-08-02)
| | | | | | | PiperOrigin-RevId: 207184621
* | | | | | | Merge pull request #21346 from Intel-tensorflow:nightly-build-fix (TensorFlower Gardener, 2018-08-02)
|\ \ \ \ \ \ \
| | | | | | | | PiperOrigin-RevId: 207183550