Change: 119706066

Change: 119703236

Change: 119698585

Allow colocation of variable initialization with the device the variable is on, instead of always running initialization on the chief supervisor.
Also update variable_scope.get_variable() and create_partitioned_variables() to take advantage of this when an initializer fn is passed in.
Change: 119697860

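The change above defers a variable's initial value behind a function so that initialization can be colocated with the variable's own device rather than the chief. A minimal pure-Python sketch of why a deferred initializer fn enables this (the `DeviceTracker` class and function names are hypothetical illustrations, not TensorFlow internals):

```python
class DeviceTracker:
    """Records the device each operation would run on (illustrative)."""
    def __init__(self):
        self.current = "/job:chief"
        self.log = []

    def run(self, op_name):
        self.log.append((op_name, self.current))

tracker = DeviceTracker()

def init_eagerly():
    # An already-materialized initial value was computed wherever the
    # caller happened to be -- i.e. on the chief.
    tracker.run("init_value")

def init_deferred(init_fn, device):
    # An initializer *function* can be invoked after switching to the
    # variable's device, colocating initialization with the variable.
    previous, tracker.current = tracker.current, device
    init_fn()
    tracker.current = previous

init_eagerly()
init_deferred(lambda: tracker.run("init_value"), "/job:worker/task:3")
print(tracker.log)
# [('init_value', '/job:chief'), ('init_value', '/job:worker/task:3')]
```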
Change: 119695799

util/command_line_flags.h
Change: 119694323

Change: 119691101

Tensor::flat_outer_dims() by adding a template param to specify the desired rank of the output. This is useful, e.g., in order to turn a tensor into a (rank-3) batch of matrices.
Change: 119685357

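The shape arithmetic behind the flat_outer_dims() change above can be sketched in Python (a hand-rolled illustration of the behavior described, not the TensorFlow API): the last rank-1 dimensions are kept and everything before them is folded into the first dimension.

```python
import math

def flat_outer_dims_shape(shape, rank):
    """Shape after collapsing leading dims so the result has `rank` dims.
    Illustrative sketch only."""
    split = len(shape) - (rank - 1)
    folded = math.prod(shape[:split])  # product of the collapsed dims
    return (folded,) + tuple(shape[split:])

# A [2, 3, 4, 5] tensor viewed as a rank-3 batch of 4x5 matrices:
print(flat_outer_dims_shape((2, 3, 4, 5), 3))  # (6, 4, 5)
```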
Change: 119671640

Change: 119666432

Change: 119657458

* Set supervisor fields for event writing to None for non-chiefs, so that non-chiefs do not accidentally write summaries.
* Added RuntimeErrors for public API endpoints to alert users when they try to use non-chief supervisors to write events.
Change: 119654303

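The two guards described above can be condensed into a toy sketch (a hypothetical `MiniSupervisor`, not TensorFlow's Supervisor class):

```python
class SummaryWriter:
    def __init__(self):
        self.events = []

    def add(self, summary):
        self.events.append(summary)

class MiniSupervisor:
    def __init__(self, is_chief):
        self.is_chief = is_chief
        # Guard 1: non-chiefs get no writer, so they cannot
        # accidentally write summaries.
        self.summary_writer = SummaryWriter() if is_chief else None

    def summary_computed(self, summary):
        # Guard 2: the public endpoint raises loudly for non-chiefs.
        if not self.is_chief:
            raise RuntimeError("Only chief supervisors can write events.")
        self.summary_writer.add(summary)

chief = MiniSupervisor(is_chief=True)
chief.summary_computed("loss=0.5")
worker = MiniSupervisor(is_chief=False)
# worker.summary_computed("loss=0.5") would raise RuntimeError
```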
Change: 119650236

Change: 119649248

Change: 119648305

Change: 119643238

Change: 119605636

Usage example: ./remote_test.sh --num-workers 3 --sync-replicas
Also changed:
1) In local and remote tests, let different workers contact separate GRPC
sessions.
2) In local and remote tests, added the ability to specify the number of
workers; previously it was hard-coded at 2.
Usage example:
./remote_test.sh --num-workers 2 --sync-replicas
3) Using device setter in mnist_replica.py
Change: 119599547

Change: 119591021

Change: 119589456

Goals:
- Have enough of each summary type that tag grouping is useful.
(Wound up recording e.g. mean, stddev, and min/max for each variable.)
- Use every summary type (adds images)
- Write to multiple directories so there are several "runs"
Change: 119585022

Update bower dependencies.
Also force URLs to lowercase.
Change: 119584968

Change: 119572994

Moving the favicon to a data URI.
Change: 119569013

Change: 119565375

Change: 119565115

Clarify that OUT_OF_RANGE is raised only when reaching the end of
input for iterable contents.
Change the few places where we incorrectly raised OUT_OF_RANGE to raise
ILLEGAL_ARGUMENT instead.
This will make code that catches the OUT_OF_RANGE exception more robust,
as it won't get confused by spurious uses of the exception class.
Change: 119560848

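The discipline above -- OUT_OF_RANGE strictly means "end of input", while bad arguments get a distinct error -- can be sketched in plain Python (the names here are illustrative stand-ins, not TensorFlow's Status codes):

```python
class OutOfRangeError(Exception):
    """Raised only on reaching the end of iterable input."""

def read_record(records, index):
    if index < 0:
        # A bad argument is not end-of-input; raise a distinct error
        # (the analogue of switching OUT_OF_RANGE to ILLEGAL_ARGUMENT).
        raise ValueError("index must be non-negative")
    if index >= len(records):
        raise OutOfRangeError("end of input")
    return records[index]

def read_all(records):
    # Callers can now rely on OutOfRangeError meaning exactly one thing.
    out, i = [], 0
    while True:
        try:
            out.append(read_record(records, i))
        except OutOfRangeError:
            return out
        i += 1

print(read_all(["a", "b", "c"]))  # ['a', 'b', 'c']
```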
on the caller. This allows us to use the macros for other purposes than
calling REGISTER_KERNEL, in particular in variadic template parameter lists.
Update REGISTER_KERNEL_BUILDER accordingly to add a semicolon, so that existing
code continues to compile.
Change: 119551677

Change: 119549296

correctly.
Change: 119549145

Change: 119544956

This is used only within Google.
Change: 119543426

Dimension.__str__ now spits out ? for unknown dimensions and an integer
otherwise. Previously this logic was contained in TensorShape.__str__.
In addition, the exception produced by TensorShape.merge_with now encodes
both shapes in full, so something like
Dimensions 2 and 9 are not compatible
becomes
Shapes (?, 2) and (4, 9) are not compatible
Change: 119536897

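A toy model of the two behaviors described above (hand-rolled `Dimension`/`Shape` classes for illustration, not TensorFlow's):

```python
class Dimension:
    """Prints '?' for an unknown dimension, the integer otherwise."""
    def __init__(self, value=None):
        self.value = value  # None means unknown

    def __str__(self):
        return "?" if self.value is None else str(self.value)

class Shape:
    def __init__(self, dims):
        self.dims = [Dimension(d) for d in dims]

    def __str__(self):
        return "(%s)" % ", ".join(str(d) for d in self.dims)

    def merge_with(self, other):
        for a, b in zip(self.dims, other.dims):
            if a.value is not None and b.value is not None \
                    and a.value != b.value:
                # Report the whole shapes, not just the clashing dims.
                raise ValueError("Shapes %s and %s are not compatible"
                                 % (self, other))

print(Shape([None, 2]))  # (?, 2)
# Shape([None, 2]).merge_with(Shape([4, 9])) raises:
#   Shapes (?, 2) and (4, 9) are not compatible
```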
Change: 119533248

Use the Eigen ThreadPool instead of the TensorFlow one if TENSORFLOW_USE_EIGEN_THREADPOOL is defined. This will allow switching to the new non-blocking ThreadPool.
Change: 119512280

The RNN performance bug:
* When passing sequence_length to rnn(), calculations were being performed past
max_sequence_length.
This bug had one major side effect:
* It slowed down the calculation past max_sequence_length (it *should*
return zeros for outputs and copy state through).
The calculations themselves were still correct: the state was still
copied through and the output was still all zeros. But that calculation
was performed via a vector-conditional select() instead of a single
scalar cond(). As a result, a lot of extra copying was happening in
both the forward and backward passes.
Thanks to Nat Roth (natusroth@gmail) for unearthing this bug.
**************
Also:
- updates to benchmarks.py (allow more specific benchmarks, added
support for --benchmarks=all).
- cleaned up RNN benchmarks code a bit.
New and updated benchmarks:
Calculation: Static Unroll with Halved Sequence Length vs. Half Static Unroll
batch full_t units gpu dt(half_seq_len) dt(unroll_half) dt(half_seq_len)/dt(unroll_half)
128 50 256 False 0.164351 0.155019 1.060204
128 50 256 True 0.033295 0.028203 1.180550
Calculation: Static Unroll with Dynamic Flow LSTM vs. Dynamic Unroll LSTM
batch max_t units gpu dt(static) dt(dynamic) dt(dynamic)/dt(static)
256 50 512 False 1.759111 1.692570 0.962173
256 50 512 True 0.178953 0.190454 1.064269
256 50 256 False 0.533132 0.567228 1.063955
256 50 256 True 0.078298 0.085024 1.085905
256 50 128 False 0.220362 0.215350 0.977255
256 50 128 True 0.053379 0.059129 1.107723
Change: 119495675

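The intended behavior above -- zero outputs and copied-through state past each example's length, plus a single scalar check (rather than per-element selects) once every sequence has ended -- can be sketched as a toy list-based loop. This is an illustration of the logic, not TensorFlow's rnn() implementation:

```python
def rnn_sketch(inputs, sequence_length, step_fn, init_state):
    """inputs: list over time of per-batch value lists."""
    max_len = max(sequence_length)
    state, outputs = list(init_state), []
    for t, x_t in enumerate(inputs):
        if t >= max_len:
            # The fix: one scalar cond past max_len, no per-element work.
            outputs.append([0.0] * len(state))
            continue
        out_t, new_state = step_fn(x_t, state)
        # Per-element select where only some sequences have ended.
        outputs.append([o if t < sequence_length[b] else 0.0
                        for b, o in enumerate(out_t)])
        state = [ns if t < sequence_length[b] else s
                 for b, (ns, s) in enumerate(zip(new_state, state))]
    return outputs, state

# Toy "cell": output and new state are both input + old state.
step = lambda x, s: ([xi + si for xi, si in zip(x, s)],
                     [xi + si for xi, si in zip(x, s)])
outs, final = rnn_sketch([[1, 1]] * 4, [2, 3], step, [0, 0])
print(outs)   # [[1, 1], [2, 2], [0.0, 3], [0.0, 0.0]]
print(final)  # [2, 3]
```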
Deprecated control_flow_ops.While. Use tf.while_loop.
Change: 119488170

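For reference, tf.while_loop(cond, body, loop_vars) has the same observable semantics as this pure-Python reduction (ignoring graph construction, shape invariants, and gradients):

```python
def while_loop(cond, body, loop_vars):
    """Repeatedly apply `body` while `cond` holds, threading the
    loop variables through each iteration."""
    while cond(*loop_vars):
        loop_vars = body(*loop_vars)
    return loop_vars

# Sum the integers 0..9:
i, total = while_loop(lambda i, total: i < 10,
                      lambda i, total: (i + 1, total + i),
                      (0, 0))
print(i, total)  # 10 45
```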
users. It allows a tensor produced by a run call to stay "in-place" so that a future run call can use it in place. To achieve this, a run call can now return a handle to a tensor to the client, which can then be fed to a subsequent run call. This feature is complementary to partial run, though there are some overlaps.
Here are a few properties of the current implementation:
1. Tensors are stored in the state of a session. The tensors are garbage collected if the client doesn't have a reference to the tensor or the session is closed.
2. There is no change to the current session API. We introduced two ops to manage the conversions between tensors and their handles. (There is a third op to garbage collect a tensor.) See the example below.
3. It fits quite well into the current feed-fetch design/implementation. It tries to reuse the graph (and caches) as much as possible so as to make things efficient.
Below is a simple example. More examples can be found in session_ops_test.py.
# Return a handle.
a = tf.constant(10)
b = tf.constant(5)
c = tf.mul(a, b)
h = tf.get_session_handle(c).eval()
# Feed a tensor handle.
f, x = tf.get_session_tensor(dtypes.int32)
y = tf.mul(x, 10)
result = sess.run(y, feed_dict={f: h.handle})
# result == 500
Change: 119481352

Change: 119458778

Change: 119448828

Change: 119434669

Change: 119431584

module.
io_wrapper provides some functions that check whether the path is a GCS path and
calls the relevant functions from either gfile or gcs. This is *not* intended to
be a general-purpose interface; it only implements the things that are necessary
for loading events from GCS/GFile storage.
We're doing this because having the entire event loader stack care about the
difference between GCS and GFile is bad from an encapsulation perspective; this
way, we can present one consistent interface.
Change: 119427191

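The dispatch idea above can be sketched as follows; `gcs_list` and `gfile_list` are hypothetical stand-ins for the two backends, and the scheme check is an assumption about how GCS paths are recognized:

```python
def gcs_list(path):    # hypothetical GCS backend
    return ("gcs", path)

def gfile_list(path):  # hypothetical gfile backend
    return ("gfile", path)

def is_gcs_path(path):
    # GCS paths are identified by their scheme prefix.
    return path.startswith("gs://")

def list_run_directory(path):
    """One consistent interface: only what the event loader needs."""
    backend = gcs_list if is_gcs_path(path) else gfile_list
    return backend(path)

print(list_run_directory("gs://bucket/run1"))  # ('gcs', 'gs://bucket/run1')
print(list_run_directory("/tmp/run1"))         # ('gfile', '/tmp/run1')
```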
Change: 119427077

due to internal issues.
Change: 119424490

Previously, if the port was undefined, an out-of-bounds access would
be made. This change adds the appropriate checks.
Change: 119424297

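The kind of check involved above can be sketched as a hypothetical parser (not the actual server code): validate that a port is present before using it, instead of reading past the end of the string.

```python
def parse_host_port(spec):
    """Reject a missing or malformed port rather than assuming one
    exists (the analogue of the out-of-bounds access fixed above)."""
    host, sep, port = spec.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("expected host:port, got %r" % spec)
    return host, int(port)

print(parse_host_port("localhost:2222"))  # ('localhost', 2222)
```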
Consolidate linalg shape inference functions.
Change: 119423897

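Consolidation here typically means one shared helper that each linalg op's shape function calls instead of duplicating its own checks. A hedged sketch of what such helpers might look like (names and signatures are hypothetical):

```python
def batch_matrix_shape(shape):
    """Shared check: a linalg input must have rank >= 2.
    Returns (batch_dims, rows, cols)."""
    if len(shape) < 2:
        raise ValueError("input must be at least a matrix, got rank %d"
                         % len(shape))
    return tuple(shape[:-2]), shape[-2], shape[-1]

def square_matrix_shape(shape):
    """Built on the shared helper: additionally require a square matrix."""
    batch, rows, cols = batch_matrix_shape(shape)
    if rows != cols:
        raise ValueError("matrix must be square, got %dx%d" % (rows, cols))
    return batch, rows

print(batch_matrix_shape((5, 2, 3)))  # ((5,), 2, 3)
print(square_matrix_shape((7, 4, 4)))  # ((7,), 4)
```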
Change: 119423048

Change: 119420831

Change: 119419160