PiperOrigin-RevId: 177989542
Also add support for rank != 4 tensors to the TF/XLA fused batchnorm operators, although the TF core ops don't actually support other ranks yet so this is not tested.
PiperOrigin-RevId: 177987592
This requires absl-py 0.1.6.
Also remove the manual tag on //tensorflow/python:app_test.
PiperOrigin-RevId: 177986813
Also fix a TODO in XlaOpRegistry to filter by the types allowed by the OpDef.
Also see #14798
PiperOrigin-RevId: 177986664
This option is necessary to mimic the Python import_graph_def method's
behavior.
PiperOrigin-RevId: 177986165
PiperOrigin-RevId: 177972555
PiperOrigin-RevId: 177971801
Change dependency optimizer to remove isolated NoOps when it is safe.
Fix bug in arithmetic optimizer: Only remove deduped nodes if we know the fetches.
PiperOrigin-RevId: 177970063
PiperOrigin-RevId: 177966156
PiperOrigin-RevId: 177964932
requiring a ShapeTree.
PiperOrigin-RevId: 177956572
PiperOrigin-RevId: 177956552
PiperOrigin-RevId: 177953076
that rng instructions are not rematerialized. This also lists Rng as non-rematerializable.
PiperOrigin-RevId: 177932160
with input shape != output shape.
PiperOrigin-RevId: 177920882
PiperOrigin-RevId: 177908680
Use ShapedBuffer to allocate required memory for the shape, then transfer the
literal to the allocated addresses on each replica. Also, add Allocate() method
to ShapedBuffer.
PiperOrigin-RevId: 177900588
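
A minimal Python sketch of the scheme this commit describes: allocate one buffer per leaf of the shape, then copy the literal's bytes into the allocated addresses on every replica. The names (`transfer_to_replicas`, `alloc`) are hypothetical stand-ins, not XLA's actual API.

```python
def transfer_to_replicas(literal_leaves, leaf_sizes, num_replicas, alloc):
    """Toy model: literal_leaves is a list of byte strings (one per leaf
    buffer of the shape); alloc(size) stands in for the device allocator
    that a ShapedBuffer-style Allocate() method would call."""
    replicas = []
    for _ in range(num_replicas):
        # Allocate() step: one buffer per leaf shape.
        buffers = [alloc(size) for size in leaf_sizes]
        # Transfer step: copy the literal into the allocated addresses.
        for buf, data in zip(buffers, literal_leaves):
            buf[:len(data)] = data
        replicas.append(buffers)
    return replicas
```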
rather than Pad.
PiperOrigin-RevId: 177896187
Also arrange for continuous testing with GPUs.
PiperOrigin-RevId: 177895214
PiperOrigin-RevId: 177892591
This fixes subtle problems with partitioned variables.
PiperOrigin-RevId: 177892499
copy.
PiperOrigin-RevId: 177891209
PiperOrigin-RevId: 177890892
The colocation attrs must be updated after all NodeDefs have been
processed. The nodes are processed and uniquified in topological
order, which lets us rewrite each node's inputs as we go, but the same
trick doesn't work for the colocation groups, which may refer to nodes
that haven't been renamed yet.
I also considered updating all the NodeDefs with prefixes or unique
names at the very beginning, before starting conversion. That would
make the logic simpler, but would require potentially keeping a full
copy of all the NodeDefs in memory (so we could edit them), so I
decided to edit in place after construction. We might want to
reconsider this alternative in the future, though.
PiperOrigin-RevId: 177890362
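
The two-pass scheme described above can be sketched in Python; the dict-based node encoding and the `import_nodes` helper are illustrative inventions, not TensorFlow's real data structures.

```python
def import_nodes(nodes, existing_names):
    """Toy model of the two-pass import: `nodes` is a list of dicts in
    topological order, e.g. {"name": ..., "inputs": [...],
    "colocate_with": [...]}; `existing_names` are names already taken."""
    renames, used, imported = {}, set(existing_names), []
    for node in nodes:
        new_name, suffix = node["name"], 0
        while new_name in used:
            suffix += 1
            new_name = f'{node["name"]}_{suffix}'
        used.add(new_name)
        renames[node["name"]] = new_name
        imported.append({
            "name": new_name,
            # Inputs refer only to already-processed nodes (topological
            # order), so the rename map is complete for them here.
            "inputs": [renames.get(i, i) for i in node["inputs"]],
            "colocate_with": list(node["colocate_with"]),
        })
    # Second pass: colocation targets can point anywhere in the graph,
    # so they can only be rewritten once every rename is known.
    for node in imported:
        node["colocate_with"] = [renames.get(c, c)
                                 for c in node["colocate_with"]]
    return imported
```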
Before, we assumed that if you passed --use_fake_data, you didn't care
about the output of the computation. With this patch, we decouple the
decision of using fake data from the decision of whether or not to print
the results.
PiperOrigin-RevId: 177889877
PiperOrigin-RevId: 177886163
PiperOrigin-RevId: 177884096
Before this change, we supported two algorithms for choosing the number
of threads per block:
* "optimize-for-latency" algorithm assumed that each thread would want
the maximum number of registers it could have, and chose a block size
small enough to accommodate this.
* "optimize-for-throughput" algorithm packed as many threads into a
block as possible.
In practice we always chose the optimize-for-latency algorithm.
This change removes the choice of algorithm and changes us to
unconditionally use a new one. In our new algorithm, we choose the
smallest block size that still has the potential to allow the GPU to
reach maximum occupancy.
When each thread's register usage is small, we can pack many of these
blocks into one SM and hit maximum occupancy. When the threads'
register usage is larger, we degrade gracefully (unlike with larger
block sizes, where the occupancy degradation is more quantized).
On our benchmarks, this is a moderate (0-10%) speedup on K40, and a
large (10-25%) speedup on P100.
PiperOrigin-RevId: 177879741
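
The "smallest block size that can still reach full occupancy" rule can be sketched as follows; the per-SM limits here are hypothetical defaults, and `pick_block_size` is a toy stand-in for XLA's actual GPU-backend logic.

```python
def pick_block_size(max_threads_per_sm=2048, max_blocks_per_sm=32,
                    warp_size=32, max_block_size=1024):
    """Return the smallest block size that can still reach full occupancy,
    assuming made-up per-SM hardware caps (threads and resident blocks)."""
    for block_size in range(warp_size, max_block_size + 1, warp_size):
        # With blocks this small, can the per-SM block-count limit still
        # keep every thread slot on the SM occupied?
        resident_threads = min(max_blocks_per_sm * block_size,
                               max_threads_per_sm)
        if resident_threads == max_threads_per_sm:
            return block_size
    return max_block_size
```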
PiperOrigin-RevId: 177878887
PiperOrigin-RevId: 177877751
PiperOrigin-RevId: 177876455
* add a bfloat16 Python type and NumPy extension.
* allow the bfloat16 type in a number of places in the Python libraries.
PiperOrigin-RevId: 177875784
PiperOrigin-RevId: 177875589
Some properties of NVIDIA GPUs cannot be queried via the driver API --
these are hardcoded in the UnqueryableDeviceParams struct in
StreamExecutor.
Before this change, we only had values for sm_35. This change adds the
values for all other NVIDIA GPUs, sm_20 through sm_70.
PiperOrigin-RevId: 177874401
This change fixes the case where a newly-generated uniquified name
conflicts with another NodeDef being imported (the original NodeDef
names are required to be unique among each other, so this is only an
issue when we create new names).
Note that this behavior is not well defined in the Python
import_graph_def method. It will always generate unique names, but the
exact naming scheme may depend on the order the NodeDefs are
imported. I didn't write a corresponding Python unit test or try to
make this change produce the same names for this reason.
PiperOrigin-RevId: 177872720
PiperOrigin-RevId: 177871523
PiperOrigin-RevId: 177871286
PiperOrigin-RevId: 177870577
PiperOrigin-RevId: 177869591
We sometimes pass scalars to non-entry computations, and since these are
pointers pointing to elements in a buffer and are not individually allocated
buffers, they don't have to follow the same alignment rules as buffers, even
though they incidentally do so today.
PiperOrigin-RevId: 177868506
The optimization is to use the vhaddps instruction when possible.
PiperOrigin-RevId: 177868238
PiperOrigin-RevId: 177865604
- Uses the GuaranteeConstOp.
- Runs a backward analysis on the args to see whether all paths lead to GuaranteeConstOps/ConstOps.
PiperOrigin-RevId: 177862716
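
The backward analysis can be sketched as a graph walk; the dict-based graph encoding and `is_effectively_const` are illustrative, not TensorFlow's actual representation.

```python
def is_effectively_const(node, producers, op_type):
    """Toy backward walk: `node` is effectively constant iff every path
    from it back through `producers` terminates at a Const-like op."""
    stack, seen = [node], set()
    while stack:
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        if op_type[n] in ("Const", "GuaranteeConst"):
            continue  # this path terminates at a constant source
        preds = producers.get(n, [])
        if not preds:
            return False  # non-const source, e.g. a variable or feed
        stack.extend(preds)
    return True
```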
https://github.com/bazelbuild/bazel-toolchains/releases/tag/b49ba36
PiperOrigin-RevId: 177858255
LayerCollection. Replaced it with a simple function that returns a list of all the registered variables.
PiperOrigin-RevId: 177857623
Fixes list formatting and sanitizes words in angle brackets, which aren't rendered in the web doc:
https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/lookup/IdTableWithHashBuckets
Follows the working formatting example of TextFileInitializer.
PiperOrigin-RevId: 177856349
PiperOrigin-RevId: 177851804
to avoid port conflicts with other tests during parallel bazel tests.
PiperOrigin-RevId: 177851615
PiperOrigin-RevId: 177851421
maxntid specifies the max number of threads in a block, whereas reqntid
says that we will use *exactly* this many threads in a block.
This doesn't have any effect on the benchmarks I ran, but we might as
well do it in case it helps ptxas make a better decision at some point
on some GPU. At least it will prevent the next person to come along
from doing this same investigation I just did. :)
PiperOrigin-RevId: 177851116
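
Schematically, the change in the emitted PTX looks something like the fragment below (the kernel name and thread counts are made up; only the directive changes):

```
.visible .entry example_kernel(.param .u64 p0)
// before: an upper bound only -- ptxas must still plan for smaller launches
.maxntid 128, 1, 1
// after: an exact thread count that ptxas can rely on when allocating
// registers and scheduling
.reqntid 128, 1, 1
{ ... }
```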