path: root/tensorflow/core/graph/graph.cc
Commit message / Author / Age
* Automated rollback of commit 950cf87104bfee28e2165fe368f66337b8a1336dGravatar A. Unique TensorFlower2018-10-10
| | | | PiperOrigin-RevId: 216500702
* [tf.data vectorization] Add vectorizer for `Add` opGravatar Rachel Lim2018-10-09
| | | | PiperOrigin-RevId: 216424512
* Partial support tfe.defun in tf.gradients.Gravatar Alexandre Passos2018-10-08
| | | | | | | | Doesn't attempt to deal with cases where we might have already generated the functiondef for the parent function as in that case we cannot easily modify the forward pass. PiperOrigin-RevId: 216243224
* [tf.data] Add utility to deduplicate graph node names (after vectorization)Gravatar Rachel Lim2018-10-03
| | | | PiperOrigin-RevId: 215595078
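A minimal sketch of what deduplicating node names can look like. This is not the tf.data utility itself: the function name `DeduplicateNames` and the `/_<n>` suffix format are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Makes every name in `names` unique, editing the vector in place.
// The "/_<n>" suffix format is an assumption made for this sketch.
inline void DeduplicateNames(std::vector<std::string>* names) {
  assert(names != nullptr);
  std::unordered_set<std::string> seen;
  for (std::string& name : *names) {
    if (seen.insert(name).second) continue;  // first use keeps its name
    int suffix = 1;
    std::string candidate;
    do {
      candidate = name + "/_" + std::to_string(suffix++);
    } while (seen.count(candidate) > 0);
    seen.insert(candidate);
    name = candidate;
  }
}
```

The first occurrence of a name is left untouched, so references to it from elsewhere stay valid; only later duplicates are renamed.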
* Check that IsValid{Input|Output}Tensor is only given non-control edgesGravatar Sanjoy Das2018-10-01
| | | | PiperOrigin-RevId: 215338658
* Make Graph::UpdateEdge() be O(e) instead of O(E)Gravatar Asim Shankar2018-08-28
| | | | | | | | | | | where: - E = number of edges in the graph - e = number of edges on the node of interest e is necessarily <= E and is typically really small (# of inputs to an operation + control edges) PiperOrigin-RevId: 210624296
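The complexity claim above can be sketched as follows, assuming simplified stand-ins for `Node` and `Edge` (the real classes carry more state) and a hypothetical free function `UpdateInput`. The point is only that the search touches the edge lists of the nodes involved, never the graph-wide edge list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-ins for the graph's Node and Edge types.
struct Node;
struct Edge {
  Node* src;
  int src_output;
  Node* dst;
  int dst_input;
};
struct Node {
  std::vector<Edge*> in_edges;   // edges whose dst is this node
  std::vector<Edge*> out_edges;  // edges whose src is this node
};

// Rewires input slot `dst_input` of `dst` to read `new_src:new_src_output`.
// Only the edge lists of the affected nodes are scanned (size e), never a
// graph-wide edge list (size E). Returns false if the slot has no edge.
inline bool UpdateInput(Node* new_src, int new_src_output, Node* dst,
                        int dst_input) {
  assert(new_src != nullptr && dst != nullptr);
  for (Edge* e : dst->in_edges) {
    if (e->dst_input != dst_input) continue;
    // Detach the edge from its old source.
    std::vector<Edge*>& old_outs = e->src->out_edges;
    for (auto it = old_outs.begin(); it != old_outs.end(); ++it) {
      if (*it == e) {
        old_outs.erase(it);
        break;
      }
    }
    // Attach it to the new source.
    e->src = new_src;
    e->src_output = new_src_output;
    new_src->out_edges.push_back(e);
    return true;
  }
  return false;
}
```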
* Removed redundant std::string -> string conversions.Gravatar A. Unique TensorFlower2018-08-28
| | | | PiperOrigin-RevId: 210565027
* [TF:XLA] Replace bespoke NodeSlot class in subgraph encapsulation code with ↵Gravatar Peter Hawkins2018-06-13
| | | | | | | | | | InputTensor and OutputTensor classes from TF core. Add equality and hash methods to InputTensor and OutputTensor. No functional changes intended. PiperOrigin-RevId: 200440015
* Collective Ops Part 7Gravatar A. Unique TensorFlower2018-05-22
| | | | | | | | Complete just enough of the core implementation to run multi-device collectives locally within a single process. Interfaces are still private and not available for general use. PiperOrigin-RevId: 197617132
* Replaced calls to tensorflow::StringPiece::ToString with std::string ↵Gravatar A. Unique TensorFlower2018-05-07
| | | | | | | | | | conversions. That is, instances of sp.ToString() are replaced with std::string(sp). This will allow tensorflow::StringPiece::ToString to be removed, which is necessary before it can be replaced with absl::string_view. PiperOrigin-RevId: 195689392
* Prepare nodes that will be allocated using ScopedAllocator.Gravatar Ayush Dubey2018-04-30
| | | | | | | | This includes changes to Executor that (1) set scope_id on nodes that are decorated with _scoped_allocator attribute, (2) mark such nodes to never forward input. PiperOrigin-RevId: 194807086
* Sort control inputs alphabetically in ToGraphDefSubRange.Gravatar Skye Wanderman-Milne2018-04-04
| | | | PiperOrigin-RevId: 191677358
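A sketch of why sorting helps: with control inputs emitted in sorted order, the serialized input list no longer depends on the order in which edges were added. The helper name `BuildInputList` is hypothetical; the `^` prefix is the GraphDef convention for control inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Builds a NodeDef-style input list: data inputs in slot order, followed by
// control inputs ("^name") in alphabetical order.
inline std::vector<std::string> BuildInputList(
    const std::vector<std::string>& data_inputs,
    std::vector<std::string> control_srcs) {
  std::sort(control_srcs.begin(), control_srcs.end());
  std::vector<std::string> inputs = data_inputs;
  for (const std::string& src : control_srcs) inputs.push_back("^" + src);
  assert(inputs.size() == data_inputs.size() + control_srcs.size());
  return inputs;
}
```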
* Added const to Node* in various parts of the code base.Gravatar Mingsheng Hong2018-02-26
| | | | PiperOrigin-RevId: 187050526
* Enabled XLA for TF C API.Gravatar Mingsheng Hong2018-02-09
| | | | Summary of changes: 1. Set MarkForCompilationPassFlags::tf_xla_cpu_global_jit default to true in C_API unit test env when XLA-execute is intended. Together with setting session config config.graph_options.optimizer_options.global_jit_level to > 0, this turns on XLA for the entire graph (eligible nodes only, with _Arg and _RetVal nodes excluded). We decided against defaulting MarkForCompilationPassFlags::tf_xla_cpu_global_jit to true, due to performance concerns with the single-threaded nature of the XLA CPU backend (see https://www.tensorflow.org/performance/xla/jit#turning_on_jit_compilation). 2. In FindCompilationCandidates() during MarkForCompilationPass, skip compiling any '_Arg'-typed nodes. This is necessary to avoid hitting an "Invalid argument number" error during MarkForCompilationPass. 3. Extended C API based build rules to link in XLA libraries, and added unit test "CAPI.Session_Min_XLA_CPU". Also added some misc improvements and debugging aids. PiperOrigin-RevId: 185193314
* Merge changes from github.Gravatar Yifei Feng2017-11-22
| | | | PiperOrigin-RevId: 176695926
* Automated g4 rollback of changelist 176615107Gravatar Yifei Feng2017-11-22
| | | | PiperOrigin-RevId: 176622438
* Merge changes from github.Gravatar Yifei Feng2017-11-21
| | | | PiperOrigin-RevId: 176615107
* Enable Operation._add_control_inputs() with the C API and related improvementsGravatar Skye Wanderman-Milne2017-10-13
| | | | | | | | | | | | | | This change: - Implements the C API logic for Operation._add_control_inputs() - Adds type-checking to Operation._add_control_input() - Makes Graph::AddControlEdge() update the node def if necessary - Makes Graph::AddControlEdge() a no-op if the control edge already exists The AddControlEdge() changes may have a performance impact if anything is sensitive to AddControlEdge(), but nothing is to my knowledge. I'm not sure what benchmarks would confirm this. PiperOrigin-RevId: 172158589
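The "no-op if the control edge already exists" behaviour can be sketched with a set of (source, destination) pairs. The class `ControlEdges` and the use of integer node ids are assumptions for illustration. The real method also updates the node def, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical container that stores control edges as (src_id, dst_id) pairs.
class ControlEdges {
 public:
  // Returns true if a new edge was added, false if it already existed
  // (in which case nothing changes).
  bool Add(int src, int dst) {
    assert(src != dst);  // sketch-only precondition: no self edges
    return edges_.emplace(src, dst).second;
  }
  std::size_t size() const { return edges_.size(); }

 private:
  std::set<std::pair<int, int>> edges_;
};
```

The lookup before insertion is the extra cost the commit message mentions as a possible performance impact.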
* Bump min graph consumer version when adding functions to itGravatar Igor Ganichev2017-10-06
| | | | PiperOrigin-RevId: 171352662
* implementing _update_input for the C APIGravatar Olivia Nordquist2017-09-26
| | | | PiperOrigin-RevId: 170147211
* adding InputTensor functionality for symmetry with OutputTensorGravatar Olivia Nordquist2017-09-14
| | | | PiperOrigin-RevId: 168708049
* Add WhileContext class and add plumbing for creating them.Gravatar Skye Wanderman-Milne2017-09-13
| | | | | | | | | | | | | | | | This change introduces WhileContext, which stores information about a while loop and will be used in future changes to generate while loop gradient graphs. Exit nodes in a while loop now have a pointer to their associated WhileContext. This will be used to retrieve the context for a given loop. This change adds an optional parameter to BuildWhileLoop() to create a WhileContext for the while loop (currently this is always true, but gradients will generate while loops without associated contexts). This change also adds an as-yet-unused option to BuildWhileLoop() to return the predicate output. PiperOrigin-RevId: 168562303
* Add function support to TensorFlow C APIGravatar Igor Ganichev2017-08-30
| | | | | | | | This change adds minimal functionality. Support for FunctionOptions, attributes, output name rewriting, function name generation, etc. is coming next. PiperOrigin-RevId: 167091238
* Make Graph::IsValidNode publicGravatar Igor Ganichev2017-08-24
| | | | | | | It can be reimplemented with existing public APIs, but instead of doing so, making this one public seems better. PiperOrigin-RevId: 166407897
* Speed up the graph to graphdef conversionGravatar Benoit Steiner2017-08-17
| | | | PiperOrigin-RevId: 165640923
* Add log messages to Graph::IsValidNodeGravatar Igor Ganichev2017-08-16
| | | | | | Also, add Edge::DebugString() method. PiperOrigin-RevId: 165510102
* Implementing set_device for the C APIGravatar Olivia Nordquist2017-07-18
| | | | PiperOrigin-RevId: 162379684
* Merge changes from github.Gravatar Shanqing Cai2017-07-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | END_PUBLIC --- Commit d0f53f77f authored by Penghao Cen<scorpiocph@gmail.com> Committed by Shanqing Cai<cais@google.com>: Minor fix typo (#11323) --- Commit 02fcf564e authored by Chris Song<sjhshy@gmail.com> Committed by Chris Song<sjhshy@gmail.com>: Fix misspells. --- Commit 764c9b6b4 authored by Louis Tiao<ltiao@users.noreply.github.com> Committed by GitHub<noreply@github.com>: Fixed typo in docstring --- Commit f8cd1283e authored by Shanqing Cai<cais@google.com> Committed by Shanqing Cai<cais@google.com>: Chaser --- Commit 01383b946 authored by Shanqing Cai<cais@google.com> Committed by Shanqing Cai<cais@google.com>: Adapt TensorFlowTestCase.setUp() to new reset_default_graph() semantics Avoid calling reset_default_graph() directly to prevent exceptions in cases where test methods error out from within nested graph contexts, which can leave _default_graph_stack non-empty in certain Python versions. --- Commit 0ffc37890 authored by Amit Patankar<amitpatankar@google.com> Committed by Amit Patankar<amitpatankar@google.com>: Removing second declaration of functions. --- Commit f9c9cacb0 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Refactor ElementalIrEmitter's slice index finding code into IrArray::Index::SourceIndexOfSlice(). PiperOrigin-RevId: 161140653 --- Commit ba297aec9 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Update ops-related pbtxt files. 
PiperOrigin-RevId: 161138258 --- Commit 68d666737 authored by Alexandre Passos<apassos@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Fixes a reentrant lock issue with tensors using ndarray memory which uses tensor memory. PiperOrigin-RevId: 161137788 --- Commit a2ee8bca3 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Add support for int8 x int8 -> int32 matrix multiplication via cublasGemmEx to stream_executor. PiperOrigin-RevId: 161137741 --- Commit 755fa7b50 authored by Mark Daoust<markdaoust@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Block generate_test, and docs generating from running in python3. - Doc generation is currently unsupported in python3 - These both end in errors in python 3.5.1+ PiperOrigin-RevId: 161137467 --- Commit 97cbcac45 authored by Peter Hawkins<phawkins@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: [TF:XLA] Fix failure in functionalize_control_flow rewrite for Enter nodes that are unused. Make sure we ignore such nodes without producing an error. PiperOrigin-RevId: 161136545 --- Commit dabcb60bc authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: [XLA] Add reasonable error messages to Builder::Build for bad parameter numbers. PiperOrigin-RevId: 161136262 --- Commit 0cbd249e8 authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Add complex tensors support to `matrix_determinant`. PiperOrigin-RevId: 161132422 --- Commit 335f1f14d authored by A. Unique TensorFlower<gardener@tensorflow.org> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Extend static shape inference for SparseTensors with dense_shapes constructed using slicing. 
PiperOrigin-RevId: 161132391 --- Commit 53604916e authored by Jianwei Xie<xiejw@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Fixed the missing labels test in TPUEstimator. PiperOrigin-RevId: 161131282 --- Commit 9f57dc8dd authored by Bruno Rosa<bruno.rosa@eldorado.org.br> Committed by Bruno Rosa<bruno.rosa@eldorado.org.br>: Use mcpu instead of march for ppc64le march is not support by gcc on ppc64le --- Commit 7d5c74a9c authored by Skye Wanderman-Milne<skyewm@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Move duplicate detection logic from Graph to FunctionLibraryDefinition Turns out this is more useful, since there are many function libraries that don't belong to a graph. This will be used in a future change. Note that this maintains the current behavior of Graph. In addition, updates FunctionDefsEqual() to handle unset attr entries (I ran into this when using this in said future change). PiperOrigin-RevId: 161126628 --- Commit 2caec3af1 authored by Shanqing Cai<cais@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Disable more timeseries py tests failing in OSS PIP GPU builds PiperOrigin-RevId: 161124799 --- Commit 0b5cce367 authored by Eugene Brevdo<ebrevdo@google.com> Committed by TensorFlower Gardener<gardener@tensorflow.org>: Get TopK op working on GPU again. Extend using cub's radix sort. 1. Undo rollback of Andreas Kirsch's initial implementation. 2. Use cub segmented radix sort if Andreas' heap-based impl for large k and small num_cols (thresholds of k=100, n=1000 determined empirically). 3. Use cub segmented radix sort if k == num_cols (this case is always faster). 4. Added benchmarks. Benchmarks show that the GPU implementation is up to 3x slower for small k but can be 10x faster for large num_cols and k. 
Benchmarks: Benchmark: m_128_n_10_k_5_use_gpu_False wall_time: 0.000166 s Throughput: 0.0077 GB/s Benchmark: m_128_n_10_k_5_use_gpu_True wall_time: 0.000796 s Throughput: 0.00161 GB/s Benchmark: m_128_n_10_k_9_use_gpu_False wall_time: 0.00017 s Throughput: 0.00751 GB/s Benchmark: m_128_n_10_k_9_use_gpu_True wall_time: 0.000796 s Throughput: 0.00161 GB/s Benchmark: m_128_n_10_k_10_use_gpu_False wall_time: 0.00017 s Throughput: 0.00753 GB/s Benchmark: m_128_n_10_k_10_use_gpu_True wall_time: 0.000775 s Throughput: 0.00165 GB/s Benchmark: m_128_n_100_k_1_use_gpu_False wall_time: 0.000155 s Throughput: 0.0826 GB/s Benchmark: m_128_n_100_k_1_use_gpu_True wall_time: 0.000796 s Throughput: 0.0161 GB/s Benchmark: m_128_n_100_k_50_use_gpu_False wall_time: 0.000247 s Throughput: 0.0519 GB/s Benchmark: m_128_n_100_k_50_use_gpu_True wall_time: 0.0008 s Throughput: 0.016 GB/s Benchmark: m_128_n_100_k_99_use_gpu_False wall_time: 0.000261 s Throughput: 0.049 GB/s Benchmark: m_128_n_100_k_99_use_gpu_True wall_time: 0.000794 s Throughput: 0.0161 GB/s Benchmark: m_128_n_100_k_100_use_gpu_False wall_time: 0.000239 s Throughput: 0.0536 GB/s Benchmark: m_128_n_100_k_100_use_gpu_True wall_time: 0.000777 s Throughput: 0.0165 GB/s Benchmark: m_128_n_1000_k_1_use_gpu_False wall_time: 0.000324 s Throughput: 0.395 GB/s Benchmark: m_128_n_1000_k_1_use_gpu_True wall_time: 0.000916 s Throughput: 0.14 GB/s Benchmark: m_128_n_1000_k_10_use_gpu_False wall_time: 0.00042 s Throughput: 0.305 GB/s Benchmark: m_128_n_1000_k_10_use_gpu_True wall_time: 0.000902 s Throughput: 0.142 GB/s Benchmark: m_128_n_1000_k_500_use_gpu_False wall_time: 0.0011 s Throughput: 0.116 GB/s Benchmark: m_128_n_1000_k_500_use_gpu_True wall_time: 0.00097 s Throughput: 0.132 GB/s Benchmark: m_128_n_1000_k_990_use_gpu_False wall_time: 0.00133 s Throughput: 0.0962 GB/s Benchmark: m_128_n_1000_k_990_use_gpu_True wall_time: 0.000993 s Throughput: 0.129 GB/s Benchmark: m_128_n_1000_k_1000_use_gpu_False wall_time: 0.00102 s 
Throughput: 0.126 GB/s Benchmark: m_128_n_1000_k_1000_use_gpu_True wall_time: 0.000964 s Throughput: 0.133 GB/s Benchmark: m_128_n_10000_k_10_use_gpu_False wall_time: 0.002 s Throughput: 0.64 GB/s Benchmark: m_128_n_10000_k_10_use_gpu_True wall_time: 0.00288 s Throughput: 0.445 GB/s Benchmark: m_128_n_10000_k_100_use_gpu_False wall_time: 0.00233 s Throughput: 0.549 GB/s Benchmark: m_128_n_10000_k_100_use_gpu_True wall_time: 0.00325 s Throughput: 0.394 GB/s Benchmark: m_128_n_10000_k_5000_use_gpu_False wall_time: 0.0127 s Throughput: 0.101 GB/s Benchmark: m_128_n_10000_k_5000_use_gpu_True wall_time: 0.00381 s Throughput: 0.336 GB/s Benchmark: m_128_n_10000_k_9900_use_gpu_False wall_time: 0.015 s Throughput: 0.0853 GB/s Benchmark: m_128_n_10000_k_9900_use_gpu_True wall_time: 0.00438 s Throughput: 0.292 GB/s Benchmark: m_128_n_10000_k_10000_use_gpu_False wall_time: 0.0104 s Throughput: 0.123 GB/s Benchmark: m_128_n_10000_k_10000_use_gpu_True wall_time: 0.00427 s Throughput: 0.3 GB/s Benchmark: m_128_n_100000_k_100_use_gpu_False wall_time: 0.0148 s Throughput: 0.865 GB/s Benchmark: m_128_n_100000_k_100_use_gpu_True wall_time: 0.0262 s Throughput: 0.488 GB/s Benchmark: m_128_n_100000_k_1000_use_gpu_False wall_time: 0.0201 s Throughput: 0.636 GB/s Benchmark: m_128_n_100000_k_1000_use_gpu_True wall_time: 0.0263 s Throughput: 0.486 GB/s Benchmark: m_128_n_100000_k_50000_use_gpu_False wall_time: 0.214 s Throughput: 0.0599 GB/s Benchmark: m_128_n_100000_k_50000_use_gpu_True wall_time: 0.0322 s Throughput: 0.398 GB/s Benchmark: m_128_n_100000_k_99000_use_gpu_False wall_time: 0.262 s Throughput: 0.0489 GB/s Benchmark: m_128_n_100000_k_99000_use_gpu_True wall_time: 0.0377 s Throughput: 0.34 GB/s Benchmark: m_128_n_100000_k_100000_use_gpu_False wall_time: 0.118 s Throughput: 0.108 GB/s Benchmark: m_128_n_100000_k_100000_use_gpu_True wall_time: 0.0365 s Throughput: 0.351 GB/s END_PUBLIC BEGIN_PUBLIC BEGIN_PUBLIC Automated g4 rollback of changelist 157169178 PiperOrigin-RevId: 
161476569
* Move duplicate detection logic from Graph to FunctionLibraryDefinitionGravatar Skye Wanderman-Milne2017-07-06
| | | | | | | | | | | Turns out this is more useful, since there are many function libraries that don't belong to a graph. This will be used in a future change. Note that this maintains the current behavior of Graph. In addition, updates FunctionDefsEqual() to handle unset attr entries (I ran into this when using this in said future change). PiperOrigin-RevId: 161126628
* Don't crash when converting ill formed graphs to graph defs: this can happenGravatar Benoit Steiner2017-06-29
| | | | | | with legacy fed inputs whose 2 inputs may remain unconnected. PiperOrigin-RevId: 160589677
* Prepare to remove a bunch of proto.h includes from tensorflow/core headersGravatar Geoffrey Irving2017-06-29
| | | | | | | | | | | | The goal is to make kernels mostly independent of proto headers, which will let us lock down our .so imports. This CL does not remove any actual headers, but changes a bunch of files so that header removal is possible in a followup CL. It also marks the headers that will be removed with // TODO(b/62899350): Remove RELNOTES: n/a PiperOrigin-RevId: 160552878
* Use std::shared_ptr instead of core::RefCounted for Node::PropertiesGravatar Skye Wanderman-Milne2017-06-23
| | | | | | Also changes Node::Properties to a struct and removes underscores from public member variables. This change should make it easier to work with Properties moving forward as the refcount will be automatically updated. PiperOrigin-RevId: 160003281
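A sketch of the ownership pattern described above, assuming a pared-down `Properties` struct and `Node` class that are not the real definitions. Copying a node shares the properties, and the reference count follows automatically with no manual Ref/Unref calls.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified properties shared between copies of a node.
struct Properties {
  std::string name;
  std::string op;
};

class Node {
 public:
  explicit Node(std::shared_ptr<Properties> props) : props_(std::move(props)) {
    assert(props_ != nullptr);
  }
  const std::string& name() const { return props_->name; }
  // Number of owners currently sharing the properties (for illustration).
  long shared_count() const { return props_.use_count(); }

 private:
  std::shared_ptr<Properties> props_;  // refcount is maintained automatically
};
```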
* Prepare to not include node_def.proto.h in node_def_util.hGravatar Geoffrey Irving2017-06-23
| | | | | | | | | | The goal is to make kernels mostly independent of proto headers, which will let us lock down our .so imports. This CL makes a bunch of .cc files either include node_def.proto.h themselves or not need the definition of NodeDef; a second CL will make node_def_util.h not include node_def.proto.h. RELNOTES: n/a PiperOrigin-RevId: 159982117
* This change significantly reduces time and resources used to load large ↵Gravatar A. Unique TensorFlower2017-06-01
| | | | | | | | | | | | | | | | | | | | | | | | TensorFlow graphs. For a real-world large graph (13k nodes, 20k edges), this change: * reduces all heap allocations by 19% * reduces retained (final) heap allocations by 2.2% * reduces CPU time by 11.2% In most TF graphs, the set of unique values set to Node::assigned_device_name() is quite small. This change adds an interning table to the Graph object, which contains all of the unique values used for Node::set_assigned_device_name(), as well as a look-up table. This is the main source of the reduction in retained heap memory; nearly all nodes are assigned to just one or two unique devices. This change removes the "string assigned_device_name_" field from the Node class, and replaces it with "int assigned_device_name_index_". However, because you need both the index and the name table to get the actual value, the Node::assigned_device_name() accessor needs access to the parent Graph. This requires adding a "Graph* graph_" field to the Node class. In the future, if all users of this property are converted to use Graph::assigned_device_name(Node*), then the Node::graph_ field can be deleted, and the space reclaimed. However, doing so is out of the scope of this CL, and even with this new pointer field, the Node class is smaller than it was before, so this is still a net win. The placement algorithm in simple_placer.cc is one of the main accessors of the Node::assigned_device_name property. This CL contains significant changes to simple_placer.cc, which directly take advantage of the fact that the property is an index into a name table, rather than treating it simply as a string. Many temporary allocations are also removed, which is the main source of the reduction in total heap allocations. This CL also contains a few changes that remove short-lived allocations in unrelated code, such as the changes in op.cc/h, costmodel.cc, etc. 
It is extremely easy in C++ to accidentally allocate memory, especially when implicit conversions and copy constructors allocate memory. All of the changes in this CL were motivated by empirical measurement, using CPU profiling and heap profiling. PiperOrigin-RevId: 157762909
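The interning table described in this commit can be sketched as a vector of unique names plus a reverse lookup map. The class name `DeviceNameTable` and the choice of index 0 for the empty (unassigned) name are assumptions for illustration. A node would then store only the `int` index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical interning table: each distinct device name is stored once,
// and nodes refer to it by a small integer index.
class DeviceNameTable {
 public:
  DeviceNameTable() {
    names_.push_back("");  // index 0 means "unassigned" (assumption)
    index_.emplace("", 0);
  }

  // Returns the index for `name`, adding it to the table on first use.
  int Intern(const std::string& name) {
    auto it = index_.find(name);
    if (it != index_.end()) return it->second;
    const int id = static_cast<int>(names_.size());
    names_.push_back(name);
    index_.emplace(name, id);
    return id;
  }

  // Resolves an index back to the device name.
  const std::string& Name(int id) const {
    assert(id >= 0 && static_cast<std::size_t>(id) < names_.size());
    return names_[static_cast<std::size_t>(id)];
  }

  std::size_t size() const { return names_.size(); }

 private:
  std::vector<std::string> names_;
  std::unordered_map<std::string, int> index_;
};
```

Because resolving an index needs the table, an accessor on the node has to reach the owning graph, which is why the commit adds a back pointer to `Node`.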
* Many algorithms need to enumerate the set of nodes within a graph, while ↵Gravatar A. Unique TensorFlower2017-05-22
| | | | | | | | | | | | excluding the special Sink and Source nodes. The checks for skipping Source and Sink are duplicated in dozens of loops. This CL adds a new Graph::op_nodes() method, which returns an enumerable range of all operation nodes, excluding Sink and Source. This allows many for loops to be simplified. This simplification is being done mainly for readability / reliability. There may be a tiny performance difference owing to this change (as well as making the Graph::nodes() and Graph::op_nodes() methods inlineable), but the measured difference is not reliably large enough to be significant. The changes to graph.h and graph.cc are quite minimal. I updated all of the uses of Graph::nodes() that I could reliably determine were unaffected by the change. Most uses immediately checked node->IsOp(). Some compared node->type_string() against literal strings, none of which were "_SINK" or "_SOURCE", and so using op_nodes() was more appropriate than nodes(). In some cases, it was not obvious whether an existing use of Graph::node() wanted to enumerate Sink / Source, so I left those uses unaffected. PiperOrigin-RevId: 156782112
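A sketch of the idea behind `op_nodes()`, using a toy `Graph` that reserves ids 0 and 1 for Source and Sink (an assumption of this sketch). For brevity it returns a filtered vector, whereas the commit describes an enumerable range.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical node: ids 0 and 1 are reserved for Source and Sink.
struct Node {
  int id;
  bool IsOp() const { return id > 1; }
};

class Graph {
 public:
  Graph() {
    AddNode();  // Source
    AddNode();  // Sink
  }

  Node* AddNode() {
    const int id = static_cast<int>(nodes_.size());
    nodes_.push_back(std::unique_ptr<Node>(new Node{id}));
    return nodes_.back().get();
  }

  // All nodes, including Source and Sink.
  std::vector<Node*> nodes() const {
    std::vector<Node*> out;
    for (const auto& n : nodes_) out.push_back(n.get());
    return out;
  }

  // Operation nodes only, so callers need no per-loop IsOp() check.
  std::vector<Node*> op_nodes() const {
    assert(nodes_.size() >= 2);  // Source and Sink always exist
    std::vector<Node*> out;
    for (const auto& n : nodes_) {
      if (n->IsOp()) out.push_back(n.get());
    }
    return out;
  }

 private:
  std::vector<std::unique_ptr<Node>> nodes_;
};
```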
* Automated g4 rollback of changelist 156251356Gravatar Geoffrey Irving2017-05-17
| | | | PiperOrigin-RevId: 156315860
* Automated g4 rollback of changelist 156244933Gravatar Geoffrey Irving2017-05-16
| | | | PiperOrigin-RevId: 156251356
* Reduce direct references to NodeDef in favor of Node and AttrSliceGravatar Geoffrey Irving2017-05-16
| | | | | | | | This is one step towards replacing in-memory use of NodeDef with a customized NodeInfo class. There are still quite a few Node::def() references, but far fewer than before. Those remaining require more work, either because they are part of kernel registration (which is a bunch of functions), copy and modify the NodeDef, etc. Follow-on CLs will remove more. RELNOTES: n/a PiperOrigin-RevId: 156244933
* Fix use of incorrect OpDef from another graph's function library in ↵Gravatar Peter Hawkins2017-05-15
| | | | | | Graph::CopyNode(). PiperOrigin-RevId: 156086326
* This change reduces the CPU time spent adding nodes to a graph. For an ↵Gravatar A. Unique TensorFlower2017-05-11
| | | | | | | | example large graph (13k nodes, 20k edges), this change reduces the CPU time spent loading the graph by 5%. The existing code uses a long sequence of string comparisons and tests, whenever a node is added. The CHECK(class_ == NC_UNINITIALIZED) statement can never actually test anything, because all of the string comparisons (except those against empty strings, which serve no purpose) test against a disjoint set of strings, so no collisions are possible. PiperOrigin-RevId: 155768893
* This CL removes the Graph.edge_set_ field. This field stores a set of the ↵Gravatar A. Unique TensorFlower2017-05-01
| | | | | | | | | Edge* that are in a Graph. However, Graph already stores this information, in Graph.edges_. There's really no good reason to keep both of these collections. To convert everything to use Graph.edges_ instead of Graph.edge_set_, I defined a class which handled excluding nullptr from iteration of the edges_ vector. This caused changes to non-contractual behavior of the runtime (enumeration order), so the unit tests are updated to reflect this. On a real-world graph used by our team, which contains 13190 nodes and 20796 edges, this change reduced heap allocation from 39.1 MB to 38.0 MB, for a drop of about 3%. Change: 154781831
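A sketch of iterating a single edge vector while skipping removed entries, as described above. The `EdgeList` class and its `ForEachLive` method are hypothetical. The commit describes a class that hides the nullptr entries during iteration, and a callback is used here only to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct Edge {
  int src;
  int dst;
};

// Hypothetical edge store: removing an edge leaves a null slot behind, so
// the ids of the remaining edges stay stable and no second set is needed.
class EdgeList {
 public:
  std::size_t Add(int src, int dst) {
    edges_.push_back(std::unique_ptr<Edge>(new Edge{src, dst}));
    return edges_.size() - 1;
  }

  void Remove(std::size_t id) {
    assert(id < edges_.size());
    edges_[id].reset();
  }

  // Visits only edges that have not been removed, in insertion order.
  template <typename Fn>
  void ForEachLive(Fn fn) const {
    for (const auto& e : edges_) {
      if (e != nullptr) fn(*e);
    }
  }

 private:
  std::vector<std::unique_ptr<Edge>> edges_;
};
```

Iteration follows vector order rather than set order, which matches the commit's note that enumeration order changed.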
* Make FunctionLibraryDefinition::AddFunctionDef() check for conflicting op nameGravatar Skye Wanderman-Milne2017-05-01
| | | | | This prevents a function from masking an existing op. Change: 154720287
* Split graph_to_functiondef into its own library.Gravatar Peter Hawkins2017-04-25
| | | | | | Add a non-const overload of Graph::input_node(). Fix comment in description of Merge op. Change: 154225918
* Make ImportGraphDef() work with functions.Gravatar Skye Wanderman-Milne2017-04-04
| | | | | | | | | In addition to modifying graph_constructor.cc, this patch adds some other functionality to enable importing functions: * Ability to add FunctionDefLibraries to Graphs and FunctionLibraryDefinitions (in addition to existing functions) * FunctionDefsEqual() utility function Change: 152205258
* Replace OpRegistryInterface* with FunctionLibraryDefinition in Graph.Gravatar Skye Wanderman-Milne2017-03-16
| | | | | This is a first step towards supporting functions in C++ graph construction, e.g. being able to import GraphDefs with functions. Change: 150382046
* Enable the direct use of TensorHandles as feed values through ResourceHandlesGravatar Shanqing Cai2017-03-09
| | | | | | | | | | | This is motivated by, among other goals, the need to enhance memory efficiency during TFDBG's stepper operations. The stepper caches TensorHandles to already-continued-to tensors and uses them as feeds if later continue-to actions depend on the tensors as transitive inputs. However, previously the TensorHandles had to be converted to Numpy arrays by calling eval() and the Numpy arrays were then fed back to next Session.run() calls. This mode of operation involved at least two unnecessary copies, tensor-to-numpy and numpy-to-tensor. This CL makes it possible to use the ResourceHandle representations of TensorHandles directly as feed values, eliminating the need for the aforementioned copying. To this end, the following changes are made: 1) the underlying representations of TensorHandles are changed from string to ResourceHandle. A custom numpy struct type is created to allow ResourceHandle of the TensorHandle subtype to be fed during Session.run() calls. 2) added GetSessionHandleOpV2, which deprecates GetSessionHandleOp. The V2 op outputs a DT_RESOURCE Tensor, instead of a string Tensor in the deprecated version. Change: 149672538
* Add a helper method Node::input_edges() that populates a vector of all input ↵Gravatar Peter Hawkins2017-01-23
| | | | | | edges to a node, indexed by input number. Change: 145301512
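A sketch of collecting input edges indexed by input number. The `Edge` layout, the free function `InputEdges`, and the use of a negative `dst_input` to mark control edges are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge record. A negative dst_input marks a control edge.
struct Edge {
  int src_id;
  int src_output;
  int dst_input;
};

// Fills `out` so that (*out)[i] is the edge feeding input slot i.
// Returns false if a slot is out of range, duplicated, or left unfilled.
inline bool InputEdges(const std::vector<Edge>& in_edges, int num_inputs,
                       std::vector<const Edge*>* out) {
  assert(out != nullptr && num_inputs >= 0);
  out->assign(static_cast<std::size_t>(num_inputs), nullptr);
  for (const Edge& e : in_edges) {
    if (e.dst_input < 0) continue;  // control edges carry no data
    if (e.dst_input >= num_inputs) return false;
    const Edge*& slot = (*out)[static_cast<std::size_t>(e.dst_input)];
    if (slot != nullptr) return false;  // two edges into one slot
    slot = &e;
  }
  for (const Edge* e : *out) {
    if (e == nullptr) return false;  // an input slot has no edge
  }
  return true;
}
```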
* Add control edge support to TensorId.Gravatar Skye Wanderman-Milne2017-01-03
| | | | | | | | | | | | | | | | | | | | | | Benchmark before this patch: Benchmark Time(ns) CPU(ns) Iterations ----------------------------------------------------- BM_ParseTensorName/0 11 11 62210085 BM_ParseTensorName/1 11 11 61929533 BM_ParseTensorName/2 15 15 47013375 BM_ParseTensorName/3 11 11 61695893 BM_ParseTensorName/4 13 13 53245481 With patch: Benchmark Time(ns) CPU(ns) Iterations ----------------------------------------------------- BM_ParseTensorName/0 12 12 58120033 BM_ParseTensorName/1 13 13 55956526 BM_ParseTensorName/2 16 16 41977752 BM_ParseTensorName/3 12 12 58654154 BM_ParseTensorName/4 14 14 51448825 BM_ParseTensorName/5 14 14 51380180 Change: 143492979
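A sketch of parsing tensor names with control-edge support. The struct layout, the constant `kControlSlot = -1`, and the parsing details are assumptions for illustration rather than the benchmarked `ParseTensorName` implementation. The `^name` form is the GraphDef convention for control inputs.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical sentinel output index used for control edges.
constexpr int kControlSlot = -1;

struct TensorId {
  std::string name;
  int index;
};

// Parses "name", "name:3", or "^name" (control input) into a TensorId.
inline TensorId ParseTensorName(const std::string& s) {
  assert(!s.empty());
  if (s[0] == '^') return {s.substr(1), kControlSlot};
  const std::size_t colon = s.rfind(':');
  if (colon != std::string::npos && colon + 1 < s.size()) {
    bool all_digits = true;
    for (std::size_t i = colon + 1; i < s.size(); ++i) {
      if (!std::isdigit(static_cast<unsigned char>(s[i]))) all_digits = false;
    }
    if (all_digits) return {s.substr(0, colon), std::stoi(s.substr(colon + 1))};
  }
  return {s, 0};  // no explicit index means output 0
}
```

The extra leading-character check is consistent with the small slowdown visible in the benchmark numbers above.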
* Adds VariableV2 with a sane shape_fn.Gravatar A. Unique TensorFlower2016-12-05
| | | | Change: 141071094
* Add to the C++ Node class the ability to fetch input nodes and edgesGravatar Vijay Vasudevan2016-08-25
| | | | | | | | | | | | | | | | | by index. There are various locations in code where users currently use iteration to find the edge by its already known index, and these functions would be useful to accomplish. In addition, this implements the equivalent functionality of 'op.inputs[i]' in our python Operation class. Given the new functionality, it exposed a weird use of NoOp for nodes that actually had multiple inputs. Modified the test to use custom op definitions to be more correct. Currently this iterates over the edge list, which in the common case will be fast and introduces no additional state to Node. In the future we may want to revisit this. Change: 131299794