diff options
author | 2016-03-02 12:25:22 -0800 | |
---|---|---|
committer | 2016-03-03 10:01:28 -0800 | |
commit | 3212eb3c874452ea928e23246e524b147f31cd15 (patch) | |
tree | 3675f54596a57c372dd78a7b2d0ccc0eb0a0df4d | |
parent | 0b93524659052624f7c6437db2f0edc44bbb2644 (diff) |
Filter graph on server-side before serving it to the client.
Updated the HTTP endpoint (and the doc) so the user can control filtering using the query parameters `limit_attr_size` and `large_attrs_key`.
For now, the filtering includes removal of values of large attributes (size > 1 KB). This significantly reduces the size of graphs that have embedded tensor values in them. For instance, a graph shared by our users with embedded values is reduced from 94 MB to 290 KB.
The values for the filtered-out attributes show up in the info card in the graph view as "Too large to show...".
The filtering doesn't add significant computation cost. On a graph of 30K nodes (very large by typical standards), the filtering added 0.4 seconds.
Also added a test covering the case where the server serves the graph.
Change: 116168623
10 files changed, 245 insertions, 17 deletions
diff --git a/tensorflow/tensorboard/backend/BUILD b/tensorflow/tensorboard/backend/BUILD index 6eece0eff3..627e166e0b 100644 --- a/tensorflow/tensorboard/backend/BUILD +++ b/tensorflow/tensorboard/backend/BUILD @@ -13,6 +13,7 @@ py_library( srcs_version = "PY2AND3", deps = [ ":float_wrapper", + ":process_graph", "//tensorflow/python:platform", "//tensorflow/python:summary", "//tensorflow/python:util", @@ -31,6 +32,15 @@ py_test( ) py_library( + name = "process_graph", + srcs = ["process_graph.py"], + srcs_version = "PY2AND3", + # Used by mldash and potentially other projects + visibility = ["//visibility:public"], + deps = [], +) + +py_library( name = "float_wrapper", srcs = ["float_wrapper.py"], srcs_version = "PY2AND3", diff --git a/tensorflow/tensorboard/backend/handler.py b/tensorflow/tensorboard/backend/handler.py index ae74125149..2e626355aa 100644 --- a/tensorflow/tensorboard/backend/handler.py +++ b/tensorflow/tensorboard/backend/handler.py @@ -41,6 +41,8 @@ from tensorflow.python.platform import resource_loader from tensorflow.python.summary import event_accumulator from tensorflow.python.util import compat from tensorflow.tensorboard.backend import float_wrapper +from tensorflow.tensorboard.backend import process_graph + DATA_PREFIX = '/data' RUNS_ROUTE = '/runs' @@ -242,6 +244,23 @@ class TensorboardHandler(BaseHTTPServer.BaseHTTPRequestHandler): self.send_response(404) return + limit_attr_size = query_params.get('limit_attr_size', None) + if limit_attr_size is not None: + try: + limit_attr_size = int(limit_attr_size) + except ValueError: + self.send_error(400, 'The query param `limit_attr_size` must be' + 'an integer') + return + + large_attrs_key = query_params.get('large_attrs_key', None) + try: + process_graph.prepare_graph_for_ui(graph, limit_attr_size, + large_attrs_key) + except ValueError as e: + self.send_error(400, e.message) + return + # Serialize the graph to pbtxt format. graph_pbtxt = str(graph) # Gzip it and send it to the user. 
diff --git a/tensorflow/tensorboard/backend/process_graph.py b/tensorflow/tensorboard/backend/process_graph.py new file mode 100644 index 0000000000..07a302ec86 --- /dev/null +++ b/tensorflow/tensorboard/backend/process_graph.py @@ -0,0 +1,67 @@ +# Copyright 2016 Google Inc. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== +"""Graph post-processing logic. Used by both TensorBoard and mldash.""" + + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + + +def prepare_graph_for_ui(graph, limit_attr_size=1024, + large_attrs_key='_too_large_attrs'): + """Prepares (modifies in-place) the graph to be served to the front-end. + + For now, it supports filtering out attributes that are + too large to be shown in the graph UI. + + Args: + graph: The GraphDef proto message. + limit_attr_size: Maximum allowed size in bytes, before the attribute + is considered large. Default is 1024 (1KB). Must be > 0 or None. + If None, there will be no filtering. + large_attrs_key: The attribute key that will be used for storing attributes + that are too large. Default is '_too_large_attrs'. Must be != None if + `limit_attr_size` is != None. + + Raises: + ValueError: If `large_attrs_key is None` while `limit_attr_size != None`. + ValueError: If `limit_attr_size` is defined, but <= 0. + """ + # Check input for validity. 
+ if limit_attr_size is not None: + if large_attrs_key is None: + raise ValueError('large_attrs_key must be != None when limit_attr_size' + '!= None.') + + if limit_attr_size <= 0: + raise ValueError('limit_attr_size must be > 0, but is %d' % + limit_attr_size) + + # Filter only if a limit size is defined. + if limit_attr_size is not None: + for node in graph.node: + # Go through all the attributes and filter out ones bigger than the + # limit. + keys = node.attr.keys() + for key in keys: + size = node.attr[key].ByteSize() + if size > limit_attr_size or size < 0: + del node.attr[key] + # Add the attribute key to the list of "too large" attributes. + # This is used in the info card in the graph UI to show the user + # that some attributes are too large to be shown. + node.attr[large_attrs_key].list.s.append(str(key)) + diff --git a/tensorflow/tensorboard/backend/server_test.py b/tensorflow/tensorboard/backend/server_test.py index a9d59638cd..8d5e06f468 100644 --- a/tensorflow/tensorboard/backend/server_test.py +++ b/tensorflow/tensorboard/backend/server_test.py @@ -22,15 +22,19 @@ from __future__ import division from __future__ import print_function import base64 +import gzip import json import os import shutil +import StringIO import threading +import zlib from six.moves import http_client from six.moves import xrange # pylint: disable=redefined-builtin import tensorflow as tf +from google.protobuf import text_format from tensorflow.python.summary import event_multiplexer from tensorflow.tensorboard.backend import server @@ -70,6 +74,18 @@ class TensorboardServerTest(tf.test.TestCase): self.assertEqual(response.status, 200) return json.loads(response.read().decode('utf-8')) + def _decodeResponse(self, response): + """Decompresses (if necessary) the response from the server.""" + encoding = response.getheader('Content-Encoding') + content = response.read() + if encoding in ('gzip', 'x-gzip', 'deflate'): + if encoding == 'deflate': + data = 
StringIO.StringIO(zlib.decompress(content)) + else: + data = gzip.GzipFile('', 'rb', 9, StringIO.StringIO(content)) + content = data.read() + return content + def testBasicStartup(self): """Start the server up and then shut it down immediately.""" pass @@ -97,7 +113,7 @@ class TensorboardServerTest(tf.test.TestCase): 'scalars': ['simple_values'], 'histograms': ['histogram'], 'images': ['image'], - 'graph': False}}) + 'graph': True}}) def testHistograms(self): """Test the format of /data/histograms.""" @@ -137,12 +153,34 @@ class TensorboardServerTest(tf.test.TestCase): response = self._get('/data/individualImage?%s' % image_query) self.assertEqual(response.status, 200) + def testGraph(self): + """Test retrieving the graph definition.""" + response = self._get('/data/graph?run=run1&limit_attr_size=1024' + '&large_attrs_key=_very_large_attrs') + self.assertEqual(response.status, 200) + # Decompress (unzip) the response, since graphs come gzipped. + graph_pbtxt = self._decodeResponse(response) + # Parse the graph from pbtxt into a graph message. + graph = tf.GraphDef() + graph = text_format.Parse(graph_pbtxt, graph) + self.assertEqual(len(graph.node), 2) + self.assertEqual(graph.node[0].name, 'a') + self.assertEqual(graph.node[1].name, 'b') + # Make sure the second node has an attribute that was filtered out because + # it was too large and was added to the "too large" attributes list. + self.assertEqual(graph.node[1].attr.keys(), ['_very_large_attrs']) + self.assertEqual(graph.node[1].attr['_very_large_attrs'].list.s, + ['very_large_attr']) + def _GenerateTestData(self): """Generates the test data directory. - The test data has a single run named run1 which contains a histogram and an - image at timestamp and step 0 and scalar events containing the value i at - step 10 * i and wall time 100 * i, for i in [1, _SCALAR_COUNT). 
+ The test data has a single run named run1 which contains: + - a histogram + - an image at timestamp and step 0 + - scalar events containing the value i at step 10 * i and wall time + 100 * i, for i in [1, _SCALAR_COUNT). + - a graph definition """ temp_dir = self.get_temp_dir() self.addCleanup(shutil.rmtree, temp_dir) @@ -157,6 +195,14 @@ class TensorboardServerTest(tf.test.TestCase): sum_squares=5, bucket_limit=[0, 1, 2], bucket=[1, 1, 1]) + # Add a simple graph event. + graph_def = tf.GraphDef() + node1 = graph_def.node.add() + node1.name = 'a' + node2 = graph_def.node.add() + node2.name = 'b' + node2.attr['very_large_attr'].s = 'a' * 2048 # 2 KB attribute + writer.add_event(tf.Event(graph_def=graph_def.SerializeToString())) # 1x1 transparent GIF. encoded_image = base64.b64decode( diff --git a/tensorflow/tensorboard/components/tf-dashboard-common/urlGenerator.ts b/tensorflow/tensorboard/components/tf-dashboard-common/urlGenerator.ts index 3932bac1d9..7148fd3fce 100644 --- a/tensorflow/tensorboard/components/tf-dashboard-common/urlGenerator.ts +++ b/tensorflow/tensorboard/components/tf-dashboard-common/urlGenerator.ts @@ -25,7 +25,8 @@ module TF { compressedHistograms: RunTagUrlFn; images: RunTagUrlFn; individualImage: (query: string) => string; - graph: (run: string) => string; + graph: (run: string, limit_attr_size?: number, large_attrs_key?: string) + => string; }; export var routes = ["runs", "scalars", "histograms", @@ -43,8 +44,19 @@ module TF { function individualImageUrl(query: string) { return "/data/individualImage?" 
+ query; } - function graphUrl(run: string) { - return "/data/graph?run=" + encodeURIComponent(run); + function graphUrl(run: string, limit_attr_size?: number, + large_attrs_key?: string) { + let query_params = [["run", run]]; + if (limit_attr_size != null) { + query_params.push(["limit_attr_size", String(limit_attr_size)]); + } + if (large_attrs_key != null) { + query_params.push(["large_attrs_key", large_attrs_key]); + } + let query = query_params.map(param => { + return param[0] + "=" + encodeURIComponent(param[1]); + }).join("&"); + return "/data/graph?" + query; } return { runs: () => "/data/runs", diff --git a/tensorflow/tensorboard/components/tf-graph-common/lib/graph.ts b/tensorflow/tensorboard/components/tf-graph-common/lib/graph.ts index 7e303e9059..ed89706b45 100644 --- a/tensorflow/tensorboard/components/tf-graph-common/lib/graph.ts +++ b/tensorflow/tensorboard/components/tf-graph-common/lib/graph.ts @@ -20,6 +20,14 @@ module tf.graph { export const NAMESPACE_DELIM = "/"; export const ROOT_NAME = "__root__"; +/** Attribute key used for storing attributes that are too large. */ +export const LARGE_ATTRS_KEY = "_too_large_attrs"; +/** + * Maximum allowed size in bytes, before the attribute is considered large + * and filtered out of the graph. + */ +export const LIMIT_ATTR_SIZE = 1024; + // Separator between the source and the destination name of the edge. export const EDGE_KEY_DELIM = "--"; @@ -618,8 +626,8 @@ class MetaedgeImpl implements Metaedge { this.totalSize += MetaedgeImpl.computeSizeOfEdge(edge, h); } - private static computeSizeOfEdge(edge: BaseEdge, h: hierarchy.Hierarchy) - : number { + private static computeSizeOfEdge(edge: BaseEdge, h: hierarchy.Hierarchy): + number { let opNode = <OpNode> h.node(edge.v); if (opNode.outputShapes == null) { // No shape information. Asssume a single number. 
This gives diff --git a/tensorflow/tensorboard/components/tf-graph-common/lib/parser.ts b/tensorflow/tensorboard/components/tf-graph-common/lib/parser.ts index bc7c6c836f..f88da0dd33 100644 --- a/tensorflow/tensorboard/components/tf-graph-common/lib/parser.ts +++ b/tensorflow/tensorboard/components/tf-graph-common/lib/parser.ts @@ -137,7 +137,8 @@ export function parsePbtxt(input: string): TFNode[] { "node.attr.value.tensor.string_val": true, "node.attr.value.tensor.tensor_shape.dim": true, "node.attr.value.list.shape": true, - "node.attr.value.list.shape.dim": true + "node.attr.value.list.shape.dim": true, + "node.attr.value.list.s": true }; /** diff --git a/tensorflow/tensorboard/components/tf-graph-dashboard/tf-graph-dashboard.html b/tensorflow/tensorboard/components/tf-graph-dashboard/tf-graph-dashboard.html index 1b0079f9d9..d26bf2e8f4 100644 --- a/tensorflow/tensorboard/components/tf-graph-dashboard/tf-graph-dashboard.html +++ b/tensorflow/tensorboard/components/tf-graph-dashboard/tf-graph-dashboard.html @@ -115,7 +115,8 @@ Polymer({ return _.map(runsWithGraph, function(runName) { return { name: runName, - path: graphUrlGen(runName) + path: graphUrlGen(runName, tf.graph.LIMIT_ATTR_SIZE, + tf.graph.LARGE_ATTRS_KEY) }; }); }, diff --git a/tensorflow/tensorboard/components/tf-graph-info/tf-node-info.html b/tensorflow/tensorboard/components/tf-graph-info/tf-node-info.html index cb9abc6cee..d715925d2c 100644 --- a/tensorflow/tensorboard/components/tf-graph-info/tf-node-info.html +++ b/tensorflow/tensorboard/components/tf-graph-info/tf-node-info.html @@ -396,10 +396,25 @@ }, _getAttributes: function(node) { this.async(this._resizeList.bind(this, "#attributesList")); - return node && node.attr ? 
node.attr.map(function(entry) { - return {key: entry.key, value: JSON.stringify(entry.value)}; - }) : []; - + if (!node || !node.attr) { + return []; + } + var attrs = []; + _.each(node.attr, function(entry) { + // Unpack the "too large" attributes into separate attributes + // in the info card, with values "too large to show". + if (entry.key === tf.graph.LARGE_ATTRS_KEY) { + attrs = attrs.concat(entry.value.list.s.map(function(key) { + return {key: key, value: "Too large to show..."}; + })); + } else { + attrs.push({ + key: entry.key, + value: JSON.stringify(entry.value) + }); + } + }); + return attrs; }, _getDevice: function(node) { return node ? node.device : null; diff --git a/tensorflow/tensorboard/http_api.md b/tensorflow/tensorboard/http_api.md index f2fea8756f..055c4bcef7 100644 --- a/tensorflow/tensorboard/http_api.md +++ b/tensorflow/tensorboard/http_api.md @@ -187,13 +187,27 @@ replaced with other images. (See Notes for details on the reservoir sampling.) An example call to this route would look like this: /individualImage?index=0&tagname=input%2Fimage%2F2&run=train -## `/graph?run=foo` +## `/graph?run=foo&limit_attr_size=1024&large_attrs_key=key` Returns the graph definition for the given run in gzipped pbtxt format. The graph is composed of a list of nodes, where each node is a specific TensorFlow operation which takes as inputs other nodes (operations). -An example pbtxt response of graph with 3 nodes: +The query parameters `limit_attr_size` and `large_attrs_key` are optional. + +`limit_attr_size` specifies the maximum allowed size in bytes, before the +attribute is considered large and filtered out of the graph. If specified, +it must be an int and > 0. If not specified, no filtering is applied. + +`large_attrs_key` is the attribute key that will be used for storing +attributes that are too large. The value of this key (list of strings) +should be used by the client in order to determine which attributes +have been filtered. 
Must be specified if `limit_attr_size` is specified. + +For the query `/graph?run=foo&limit_attr_size=1024&large_attrs_key=_too_large`, +here is an example pbtxt response of a graph with 3 nodes, where the second +node had two large attributes "a" and "b" that were filtered out (size > 1024): + node { op: "Input" name: "A" @@ -201,6 +215,21 @@ node { node { op: "Input" name: "B" + attr { + key: "small_attr" + value: { + s: "some string" + } + } + attr { + key: "_too_large" + value { + list { + s: "a" + s: "b" + } + } + } } node { op: "MatMul" @@ -209,6 +238,26 @@ node { input: "B" } +Prior to filtering, the original node "B" had the following content: +node { + op: "Input" + name: "B" + attr { + key: "small_attr" + value: { + s: "some string" + } + } + attr { + key: "a" + value { Very large object... } + } + attr { + key: "b" + value { Very large object... } + } +} + ## Notes All returned values, histograms, and images are returned in the order they were |