Commit message

Change: 146741901

Change: 146741882

Change: 146740995

Change: 146740764

Change: 146739972
by using a .tfdbg_history file under the user's home directory.
Previously, each instance of CommandHistory held an ephemeral command history, which went away as soon as, e.g., a run-start UI or run-end UI exited.
Change: 146739723
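For illustration, a minimal sketch of the idea described above: persist the command list to a dotfile under the home directory so it survives across UI sessions. The .tfdbg_history file name comes from the commit message; the class and method names below are illustrative, not tfdbg's actual CommandHistory implementation.

```python
# Minimal sketch of file-backed command history; the real tfdbg
# CommandHistory lives in tensorflow/python/debug and differs in detail.
import os


class PersistentCommandHistory:
    """Keeps commands in ~/.tfdbg_history so they survive UI restarts."""

    def __init__(self, history_file=None):
        self._path = history_file or os.path.join(
            os.path.expanduser("~"), ".tfdbg_history")
        self._commands = []
        if os.path.isfile(self._path):
            with open(self._path) as f:
                self._commands = [line.rstrip("\n") for line in f]

    def add_command(self, command):
        # Append to both the in-memory list and the history file.
        self._commands.append(command)
        with open(self._path, "a") as f:
            f.write(command + "\n")

    def most_recent_n(self, n):
        return self._commands[-n:]
```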
Add a RunStats Java class that is a limited version of the C++ StatSummarizer.
Between this and the rest of the Java API, I think we have the required feature
coverage to switch the implementation of TensorFlowInferenceInterface from
custom JNI to pure Java, implemented on top of the full Java API.
Change: 146738405
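For context, the kind of per-node timing summary a stat summarizer produces can be approximated from a traced run in the TF 1.x Python API. This is a rough sketch of that idea only; it is not the Java RunStats class or the C++ StatSummarizer, and the graph (x, y) is made up for the example.

```python
# Rough Python approximation of a per-node timing summary (TF 1.x tracing);
# the Java RunStats / C++ StatSummarizer referenced above do more than this.
import tensorflow as tf

x = tf.random_normal([256, 256])
y = tf.matmul(x, x)

with tf.Session() as sess:
    opts = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
    meta = tf.RunMetadata()
    sess.run(y, options=opts, run_metadata=meta)

    # Print each node's execution time for the step, per device.
    for dev in meta.step_stats.dev_stats:
        for node in dev.node_stats:
            print("%-30s %6d us" % (node.node_name, node.all_end_rel_micros))
```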
Change: 146738266

Change: 146735638

Change: 146731752
Updated the SummaryWriter link -- it is deprecated, but this points to the correct replacement.
Renamed the TensorBoard tutorial.
Change: 146728429
Change: 146725170

Change: 146722046
This version removes the 64 MB default limit for serialized proto sizes.
Change: 146721509
Change: 146718366

Change: 146717512

Change: 146717131

Change: 146715468

Change: 146713025

Change: 146712044

Change: 146699987
broke export/inference for any classification model using weights.
+ Added a test.
Change: 146697049
This lets us control whether fast-math is enabled during tests, now that
fast-math is no longer controlled via a flag.
Change: 146696974
It should match the parameter type of LiteralUtil::IsAll.
Change: 146692488
Change: 146690299
This is like LiteralUtil::IsAll, but for floating-point values. Used in
a later patch.
Change: 146687678
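For intuition, a rough NumPy analogue of an IsAll-style predicate over floating-point values; the helper name and use of NumPy are illustrative, since XLA's LiteralUtil operates on its own Literal type in C++.

```python
import numpy as np

def is_all_float(literal, value):
    """Return True if every element of `literal` equals `value` exactly.

    Hypothetical NumPy analogue of an IsAll-style check for
    floating-point literals.
    """
    arr = np.asarray(literal, dtype=np.float32)
    return bool(np.all(arr == np.float32(value)))

print(is_all_float([[0.0, 0.0], [0.0, 0.0]], 0.0))  # True
print(is_all_float([[0.0, 1.0]], 0.0))              # False
```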
Change: 146684854
(1) We now allocate a contiguous array of memory to hold count information
    for each node in a frame, allocating either 8 bits to hold a
    PendingCounts::PackedCounts structure (if the dead and pending count
    fields fit in 8 bits) or an aligned 64-bit region to hold a
    PendingCounts::LargeCounts structure (if the counts are too large for
    the 8-bit representation). The byte offset and representation of the
    data for a given node are now held in a 32-bit handle. This makes all
    the count data for a frame contiguous (better cache locality) and
    avoids the FlatMap we were using to map from id to LargeCounts for the
    nodes that needed the overflow space. A toy sketch of this layout
    follows after this list.
(2) Sped up the fast path of adjustments needed when normal non-merge
    nodes complete by adding a specialized "adjust_for_activation" routine
    that does all the necessary bookkeeping in one call, rather than
    manipulating the count data structure several times.
(3) Several related changes in tensorflow/core/common_runtime/executor.cc:
    - Stored the new PendingCounts::Handle for a given node in the ItemState
      data structure rather than in a separate frame_local_ids_ array, so we
      touch one less cache line per edge in ActivateNodes.
    - Changed initialization code to use the new PendingCounts::AllocateHandle
      routine for allocating space to hold the counts for a node. This allowed
      simplifying the ControlFlowInfo computed at initialization (we now need
      only the unique set of frame names, rather than a mapping from frame
      name to the number of nodes in that frame).
(4) Updated tensorflow/core/common_runtime/pending_counts_test.cc to reflect
    the new interface and to test the new compound adjust_for_activation routine.
Speeds up an image processing benchmark on my desktop from 3370 images/sec
to 3540 images/sec (+5.0%).
Change: 146684811
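To make the layout in (1) concrete, here is a toy Python sketch of the "contiguous buffer plus handles" idea. The bit widths, names, and packing are illustrative only; the real code is the C++ PendingCounts class in tensorflow/core/common_runtime, not this.

```python
# Toy sketch: one contiguous bytearray holds either a packed single-byte
# count or an 8-byte "large" count per node; a handle records the offset
# and which representation was chosen.
import struct


class PendingCountsSketch:
    def __init__(self):
        self._bytes = bytearray()

    def allocate_handle(self, max_pending, max_dead):
        offset = len(self._bytes)
        if max_pending <= 0xF and max_dead <= 0x7:
            # Packed: low 4 bits = pending count, next 3 bits = dead count.
            self._bytes.append(max_pending & 0xF)
            return (offset, "packed")
        # Large: two 32-bit counters (pending, dead) in an 8-byte slot.
        self._bytes.extend(struct.pack("<II", max_pending, 0))
        return (offset, "large")

    def decrement_pending(self, handle, by=1):
        offset, kind = handle
        if kind == "packed":
            pending = (self._bytes[offset] & 0xF) - by
            dead = (self._bytes[offset] >> 4) & 0x7
            self._bytes[offset] = (pending & 0xF) | (dead << 4)
            return pending
        pending, dead = struct.unpack_from("<II", self._bytes, offset)
        pending -= by
        struct.pack_into("<II", self._bytes, offset, pending, dead)
        return pending


counts = PendingCountsSketch()
h_small = counts.allocate_handle(max_pending=3, max_dead=1)      # packed byte
h_big = counts.allocate_handle(max_pending=100000, max_dead=0)   # 8-byte slot
print(counts.decrement_pending(h_small))  # 2
print(counts.decrement_pending(h_big))    # 99999
```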
* Add a public last_updated() method to the NodeStepper class.
* The stepper CLI uses this new method to inform the user which variables (if any) were updated by the last cont/step action.
Change: 146683089
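A hedged usage sketch of the method named above. The module path, constructor details, and the toy graph are assumptions that vary across TF 1.x versions; only NodeStepper, cont(), and last_updated() are taken from the commit message.

```python
# Illustrative only; the tfdbg stepper module location and exact signatures
# differ between TensorFlow versions.
import tensorflow as tf
from tensorflow.python.debug.lib import stepper  # location may differ

with tf.Session() as sess:
    v = tf.Variable(1.0, name="v")
    inc = tf.assign_add(v, 1.0, name="inc")
    sess.run(tf.global_variables_initializer())

    node_stepper = stepper.NodeStepper(sess, inc)
    node_stepper.cont("inc:0")          # continue to the assign_add output
    print(node_stepper.last_updated())  # variables updated by the last cont()
```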
Change: 146677928
This ensures slots end up colocated on the right device.
Change: 146660303
Change: 146642486

Change: 146639240
Shallow copies are not enough when using a MultiRNNCell.
Change: 146611566
Change: 146608995
benchmarking scripts
Change: 146578260
Each cont() call will use TFDBG's own RunOptions.debug_options to generate intermediate tensor dumps, load the dumps, and cache the relevant DebugTensorDatum objects. If future cont() calls require the intermediate tensor values, they will be obtained from the cached DebugTensorDatum objects, saving unnecessary recomputation.
The use of cached intermediate tensor dumps can be disabled for individual cont() calls with "use_dumped_intermediates=False".
Change: 146572672
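A short illustrative sketch of the switch described above. It assumes a NodeStepper instance named node_stepper and a target named "train_op" (both hypothetical); only cont() and the use_dumped_intermediates keyword come from the commit message.

```python
# node_stepper and "train_op" are placeholder names for this sketch.
node_stepper.cont("train_op", use_dumped_intermediates=True)   # reuse cached dumps
node_stepper.cont("train_op", use_dumped_intermediates=False)  # force recomputation
```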
Change: 146572093

Change: 146565394

Change: 146564734

Change: 146552777
Pass in benchmark argument to tf_cc_logged_benchmark. Also, add _gpu suffix to benchmarks.
Change: 146552100
Change: 146548294
RelaxedOneHotCategorical distributions. Fixed some bugs in the implementations that caused tests to fail. Used numpy.finfo(np_dtype).tiny to prevent underflow issues.
Change: 146548214
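A sketch of the underflow guard mentioned above: clip probabilities away from zero with the smallest positive normal float before taking logs. The function and variable names here are illustrative, not the distribution's actual code.

```python
import numpy as np

def safe_log_probs(probs, np_dtype=np.float32):
    probs = np.asarray(probs, dtype=np_dtype)
    tiny = np.finfo(np_dtype).tiny          # smallest positive normal value
    return np.log(np.maximum(probs, tiny))  # avoids log(0) -> -inf / underflow

print(safe_log_probs([0.5, 1e-45, 0.0]))
```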
Change: 146539043
gather has a GPU kernel, and scatter_nd will soon have a GPU kernel. This should
allow more of the calculations to stay on the GPU.
Change: 146538474
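For reference, generic TF 1.x usage of the two ops named above; this only shows what they compute, not the specific rewrite this commit performs (whose context is truncated).

```python
import tensorflow as tf

params = tf.constant([10.0, 20.0, 30.0, 40.0])
indices = tf.constant([3, 0])

gathered = tf.gather(params, indices)                  # -> [40.0, 10.0]
scattered = tf.scatter_nd(indices=tf.constant([[3], [0]]),
                          updates=tf.constant([1.0, 2.0]),
                          shape=tf.constant([4]))      # -> [2.0, 0.0, 0.0, 1.0]

with tf.Session() as sess:
    print(sess.run([gathered, scattered]))
```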
operations when propagating tensors to the input slots of its consumers.
Change: 146536099
Change: 146535501
Remove HistogramAccumulatorSummary from hidden_ops.txt and logging_ops.py; this
op no longer exists.
Change: 146534447
Change: 146532572