| Commit message | Author | Age |
* add a new config option sycl_nodouble for the SYCL build
When TF is built with SYCL enabled, the SYCL device code is generated
at build time. Currently, all data types, such as float and double,
are registered when generating the device code.
The SYCL device code is compiled into SPIR at build time and then
passed to an OpenCL implementation at runtime. Since double precision is
an optional feature in the OpenCL spec, an OpenCL implementation may
not support double.
To make platforms without double support work, this new config
option disables double registration for SYCL device code.
This patch changes only the cwise_add operation as an example;
other operations will be changed one by one in future small patches.
* change action_env to cxxopt in tools/bazel.rc to pass the build option
* correct naming and move the #ifdef into a common place
Rename SYCL_NO_DOUBLE to TENSORFLOW_SYCL_NO_DOUBLE.
Move the #ifdef into cwise_ops_common.h, so enabling/disabling the
double operations is defined in a single place for all the cwise ops.
* add TF_CALL_SYCL_NUMBER_TYPES to unify the SYCL kernel registration
* also consider __ANDROID_TYPES_SLIM__
One more thing to mention: once all cwise ops are finished, the
REGISTER* macros defined within __ANDROID_TYPES_SLIM__ in
cwise_ops_common.h will be defined as empty. That will be done
in another patch.
Used `configure` script for reference
* Enable grappler to propagate shapes through queues.
Change: 154789133
* Add whitelist support in uid of RunConfig.
Change: 154794859
* Fix a bunch of bad links and missing docs in contrib.
Change: 154820641
* Don't try to refine the shapes for a node if its inference context wasn't
successfully built by the AddNode() method.
Change: 154838211
* Fix issue related to empty bazel.rc file.
Change: 154840138
* Remove overly precise CHECK when rendering debug output for a function.
An `_Arg` node can have more than three attrs, because the runtime may
(and does) add system-defined attrs (viz. "_output_shapes") that do
not change the meaning of the op.
Change: 154850526
* Port makefile build breakage
Change: 154855106
* [TF:XLA] Try to incorporate Tensorflow node structure for large HLO GraphDefs.
This change assumes that a TF subgraph/op does not cross the boundary of an HLO
computation and always puts top-level TF subgraphs/ops under HLO computations.
Change: 154855884
* Added a unit test to check what happens when 2 shapes with known rank but
unknown dimensions are merged
Change: 154856675
* [XLA] Refactor constant folding operations into a dedicated module
Refactor constant folding operations into a dedicated module, and add a new
ReplaceInstruction() API to collapse { computation->ReplaceInstruction();
changed=true; }.
Change: 154857025
* Java: Docs: Update instructions for Windows.
Inspired by
http://stackoverflow.com/questions/43741775/tensorflow-in-java-running-failed
Change: 154859066
* Add more documentation for features and labels.
Change: 154859649
* Added link to high-performance models
Change: 154860213
* Navigation and index for new performance section documents.
Change: 154862215
* Fix shape mismatch between loss and weights.
Change: 154862650
* Add examples to TensorShape documentation and ran autoformatter.
Change: 154862667
* Move linking of cudnn_plugin, cublas_plugin and cufft_plugin from
stream_executor to the ops that need them.
Change: 154863520
* Properly track the persistent memory usage of lookup tables.
Change: 154866686
* Reset the inputs to ShapeRefiner::RunShapeFn so that it behaves the same every time it's called.
To properly handle queues that have been populated by several enqueue ops, merge the shapes of the inputs to all the enqueue ops before calling InferenceContext::set_output_handle_shape(). This ensures that we detect incorrect queue setups (where two enqueue ops might generate tensors with incompatible shapes), and that we take all the known shape information instead of just one enqueue op's.
Change: 154866747
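The merge step described above can be sketched as follows. This is an illustrative outline only, not TensorFlow's actual ShapeRefiner code; the function names are invented, and unknown dimensions are modeled as None:

```python
# Illustrative sketch (not TensorFlow's code) of merging partially known
# shapes from several enqueue ops: an unknown dimension (None) takes the
# known value from the other shape, and two known but unequal dimensions
# signal an incorrectly set up queue.
def merge_dim(a, b):
    if a is None:
        return b
    if b is None or a == b:
        return a
    raise ValueError("incompatible dimensions: %r vs %r" % (a, b))

def merge_shapes(s1, s2):
    # Both shapes must have the same (known) rank to be mergeable.
    if len(s1) != len(s2):
        raise ValueError("incompatible ranks: %d vs %d" % (len(s1), len(s2)))
    return [merge_dim(a, b) for a, b in zip(s1, s2)]
```

For example, merging [None, 3] with [2, None] yields [2, 3], while merging [2, 3] with [2, 4] raises an error, mirroring the incompatible-enqueue detection described above.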
* Making sure an error message will be produced by session_manager when a non-tensor object is passed in.
Otherwise the 'name' property is missing.
Change: 154868022
* Don't needlessly synchronize the CUDA stream in CropAndResize.
Make the op Async so we don't block an executor thread while waiting for the result of the box bounds check to be copied back to the host.
Change: 154868460
* Add contribution guidelines and standards section to CONTRIBUTING.md
Several parts are largely based on the post by @yaroslavvb at: #7443#issuecomment-279182613
Fixes #7443
Change: 154876045
* Final draft
Change: 154876563
* Final draft
Change: 154876646
* Fix losses documentation.
Fix documentation of get_total_loss() to be correct.
And add a helpful comment about a common pitfall.
Change: 154876822
* [XLA] Second change for the HLO interpreter.
Extends HloEvaluator to allow evaluation of an HLO Computation or a single HLO
instruction with non-constant operands, by traversing the instructions in post
order and keeping track of each instruction's evaluated literal along the way.
Change: 154877580
* [tf distributions] Move the remaining whitelisted distributions to core.
Change: 154878206
* Add shape to error message.
Change: 154880260
* Revert "Fix build issue when `/usr/bin/python` path is not available (#9547)"
This reverts commit 95f37ebf0bd46c328266f65bbd16d319c0efab3d.
Change: 154978617
Avoid the csh-style redirection ">>& foo.txt".
Instead use ">> foo.txt 2>&1"
Fixes #9587
Change: 154840138
* Update issue template (thanks jart@) and add env collection script.
* Improve shell script and issue template further (address review)
- Make the shell script more stringent by using -u
- Use uppercase GIT_VERSION instead of __git_version__
- Add OS platform to required information
Change: 151705528
Change: 148954491
Change: 146918929
Change: 144609556
Change: 142074581
Additionally:
- change single quotes to double quotes to make path rewriting easier
- guard the Windows lib reference with PLATFORM_WINDOWS
- fix a failing kmeans test
Change: 141515942
Change: 138675832
Change: 134721831
Change: 131310818
Change: 124202095
Change: 123975418
Change: 123427036
This has no practical effect, as CUDA builds are always with nvcc, but
it lets us modify the build config rule
//third_party/gpus/cuda:using_nvcc so it returns true, rather than
false, for CUDA builds.
Change: 122288952
protobufs. This speeds up graph serialization by ~15x for users building TensorFlow from source.
Note that you can now install a faster pip binary for protobuf using the instructions
in https://github.com/tensorflow/tensorflow/commit/8ac009728db931ef3119a337bd23250c89bc7efe
This only affects building and running from within the bazel environment.
Change: 118374862
bazelrc template, since they don't actually do anything.
Change: 116606748
Checking for a specific crosstool_top directory doesn't work when TensorFlow
is a submodule of a different project.
Change: 116592676
Update protobuf commit
Change: 114990608
Change: 112920860
Change 109922312
Update dockerfiles and instructions.
This CL does two things:
* updates dockerfiles to use 0.6.0
* updates the instructions for the new tag format.
Change 109920508
Fix broken cast_op_test
Change 109919316
Enforce converting to int64 for SparseTensor indices and shape
Change 109916130
Fix imagenet for Python 3
It needed some binary file modes and an iteritems -> items.
Change 109912827
Enable fast c++ implementation in protobuf's python interface.
Base CL: 109922840
Change 109344341
Teach ./configure about Python 3 (and other minor Python 3 issues)
./configure now writes bazel.rc based on a bazel.rc.template, which gives us a
place to tell bazel which version of Python we are using.
Also fix a few tests whose Python 3 support had degraded.
The only thing left before we have Python 3 support is
https://github.com/google/protobuf/pull/1023
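The template mechanism described above can be sketched roughly as follows. This is a hedged illustration: the template text and the PYTHON_BIN_PATH option name are assumptions for the example, not the actual contents of bazel.rc.template:

```python
# Illustrative sketch of rendering a bazel.rc from a template, filling in
# the Python binary chosen by ./configure. The option name and template
# line are invented for illustration.
from string import Template

BAZEL_RC_TEMPLATE = Template(
    'build --action_env PYTHON_BIN_PATH="$python_bin"\n'
)

def render_bazel_rc(python_bin):
    # Substitute the configured interpreter path into the template.
    return BAZEL_RC_TEMPLATE.substitute(python_bin=python_bin)
```

A ./configure-style script would call render_bazel_rc with the interpreter the user selected and write the result to bazel.rc, which bazel then picks up on every build.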
Change 109343002
Update ops.pbtxt to reflect 109321497.
Change 109342838
Do memory deallocation outside the critical section in gpu_event_mgr.cc.
Change 109334210
PTB LSTM example: use slicing instead of splitting the inputs.
Change 109332238
Cleanup TensorBoard local development environment
Change 109331051
Use __all__ in __init__.py to restrict exported modules
Specifically, __all__ is now anything that (1) doesn't begin with an underscore
and (2) isn't a non-whitelisted module.
This fixes one tiny piece of b/25561952. Specifically, the following no longer
exist: tf.np, tf.math_ops, and tf.variables. tf.ops and tf.tensor_util still
exist but shouldn't; that will have to wait for a later CL.
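The __all__ rule described above (non-underscore names, modules only if whitelisted) can be sketched as follows; the whitelist contents and function name are illustrative, not TensorFlow's actual code:

```python
# Illustrative sketch of the described __all__ filtering: export every
# top-level name that does not begin with an underscore and is not a
# module outside the whitelist. Whitelist entries are invented examples.
import types

_WHITELISTED_MODULES = frozenset(["nn", "image", "errors"])  # illustrative

def make_all(namespace):
    return sorted(
        name for name, value in namespace.items()
        if not name.startswith("_")
        and (not isinstance(value, types.ModuleType)
             or name in _WHITELISTED_MODULES)
    )
```

Under this rule a stray module like np is filtered out of the package namespace, while ordinary symbols and whitelisted submodules remain exported.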
Change 109327154
Allow tf.tuple to accept Tensors as control_inputs, like tf.control_dependencies.
Change 109324239
Make tf.control_dependencies(None) clear the control dependencies.
Use that to prevent ops created for Variables from inheriting the current
control dependencies.
This fixes issues when using ExponentialMovingAverage with control
dependencies.
Change 109323719
Added support for boolean tf.scatter_update.
Base CL: 109348398
TensorFlow is an open source software library for numerical computation
using data flow graphs.
Base CL: 107276108