path: root/tensorflow/python/kernel_tests/gradient_checker.py
author Vijay Vasudevan <vrv@google.com> 2015-12-01 13:26:53 -0800
committer Vijay Vasudevan <vrv@google.com> 2015-12-01 13:26:53 -0800
commit 795f35da2d458cbae477ac2fe2bff80c1427a771
tree fd17153d79e388c1c017c33eb8bfcf70b0d929fb /tensorflow/python/kernel_tests/gradient_checker.py
parent 3972c791b9f4d9a61b9ad6399b481df396f359ff
TensorFlow: upstream changes to git
Change: Clean up documentation for ReverseSequence
Change: Updated several tensorflow operations to use 32bit indices on GPU.
Change: Add attribute batch_dim to ReverseSequenceOp.
Change: Fix error in convert_to_records.py. As reported in https://github.com/tensorflow/tensorflow/issues/370 by AlexUnderMicrocontRoll.
Change: Update TensorBoard README.
Change: Fixes to boolean flags reported in https://github.com/tensorflow/tensorflow/issues/379. Supports:
  --bool_flag=True --> True
  --bool_flag=False --> False
  --bool_flag=gibberish --> False
  --bool_flag --> True
  --nobool_flag --> False
  Fixes #379
Change: Update generated Op docs.
Change: Enable local development of TensorBoard using gulp
  Also make tf-tensorboard a regular component rather than special case
  This is mostly effected by creating tfserve.js, which is a small server with clever routing to load from bower_components/ and components/ using the paths that work within google3.
  Workflow: `gulp serve`
Change: Add a full working code example to the tensorboard and summaries tutorial
Change: Fix seq2seq_test when running on GPU. The "proj_w" and "proj_b" variables were being created before the `test_session()`'s device function took effect, which pushed the placement algorithm into making an incorrect decision.
Change: Add a sentence in TensorBoard README on how to serialize summary data to logs and provide link to the how-to tutorial on the TensorFlow website.
Change: Add error-catching code if string_input_producer is supplied a null input. Before this change, it would die with an opaque shape error from inside the queue. This change catches (most) python null lists being passed directly in, and at runtime detects null tensors. Adds two tests for this to input_test.py
Change: Speed up for models that use the same variable multiple times in the case where variables must be copied across devices:
  - Have Variables wrap the Variable op in an Identity op when converted to Tensor. This avoids multiple copies across devices if a variable is used multiple time in a computation.
  - Add Variable.mutable() to return the non-wrapped Variable op for used when assigning new values.
  - Add an as_ref parameter to convert_to_tensor() to allow code to specify if they plan to assign a new value to the result of the conversion. Make Variable return the result of Variable.mutable() when as_ref is True.
  - Make all ops that assign values to variables pass as_ref=True when converting their arguments.
Change: Change to reduce critical section times in gpu_event_mgr.h:
  (1) Call stream->ThenRecordEvent outside the EventMgr critical section
  (2) Do memory deallocation outside the critical section
  Speeds up one configuration of ptb_word_lm from 2924 words per second (wps) to 3278 wps on my desktop machine with a Titan X.
Change: Remove some colons that break the open source build
  ::tensorflow::StringPiece breaks for @raingo, see https://github.com/tensorflow/tensorflow/issues/358. tensorflow::StringPiece (without the leading colons) seems to fix the problem.
Change: Added check that inputs to Operation is a list and make a defensive copy of the input. This is for cases where the input list is changed such as in _add_input.
Change: Use standard names for TensorFlow dtypes in the tutorial.
Change: Add tests for tensor inputs.
Change: Fix build after declaring more types for ops
Change: Switch to 32 bit indexing to speedup convolutions and concatenations.
Change: Add convert_image op to convert between types for images (similar to OpenCV's cvtScale).
Change: Make cast work between numeric types (bool, uint8, int16, int32, int64, float, double).
Change: Padding input data for odd number of paddings, so we can use cudnn anyway.
  + Fix total padding computation when padding==VALID.
  + This CL makes the Googlenet benchmark run 5x faster.
Change: Support IndexedSlices in ConcatGrad
Change:
  * sampled softmax op uses one embedding lookup for positive and negative samples
  * float64 support for sampled softmax
Change: Move RNN code out of models.rnn (without breaking existing code). The API may still undergo minor changes, until full documentation as added.
Change: Changed to use per-step stacks for the accumulators used in while-loop gradient computation. This addresses the problem caused by using concat without sufficient static shape information. It should also improve performance as we avoided those expensive concats.
Change: Update generated Op docs.
Change: Improve error messages when the optimizer finds no variables to minimize or when none of the variables has gradients.
Change: Say that -1 isn't just for flattening in reshape docs
  Also add scalar reshape (reshape(t, [])) as an example.
  This fixes https://github.com/tensorflow/tensorflow/issues/281.
Change: This is a test.
Base CL: 109118714
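For readers who want the boolean-flag semantics from the commit message in executable form, here is a minimal sketch. It is not the TensorFlow flags implementation: `parse_bool_flag` is a hypothetical helper, and it assumes that only the literal string `True` counts as true after `=`, which matches the five listed cases but may be stricter than the real parser.

```python
def parse_bool_flag(argv, name):
    """Hypothetical helper: returns the flag's value, or None if it is absent."""
    value = None
    for arg in argv:
        if arg == "--" + name:
            # Bare flag, e.g. --bool_flag
            value = True
        elif arg == "--no" + name:
            # Negated flag, e.g. --nobool_flag
            value = False
        elif arg.startswith("--" + name + "="):
            # Assumption: anything other than the literal "True" is False.
            value = arg.split("=", 1)[1] == "True"
    return value


cases = ["--bool_flag=True", "--bool_flag=False", "--bool_flag=gibberish",
         "--bool_flag", "--nobool_flag"]
results = [parse_bool_flag([case], "bool_flag") for case in cases]
```

Under these assumptions `results` mirrors the table in the commit message, case by case.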
Diffstat (limited to 'tensorflow/python/kernel_tests/gradient_checker.py')
-rw-r--r-- tensorflow/python/kernel_tests/gradient_checker.py 83
1 files changed, 52 insertions, 31 deletions
diff --git a/tensorflow/python/kernel_tests/gradient_checker.py b/tensorflow/python/kernel_tests/gradient_checker.py
index 69cc811a6b..d0cdc3b3bc 100644
--- a/tensorflow/python/kernel_tests/gradient_checker.py
+++ b/tensorflow/python/kernel_tests/gradient_checker.py
@@ -34,7 +34,7 @@ from tensorflow.python.ops import gradients
from tensorflow.python.platform import logging
-def _Product(t):
+def _product(t):
if isinstance(t, int):
return t
else:
@@ -44,11 +44,11 @@ def _Product(t):
return y
-def _ComputeTheoricalJacobian(x, x_shape, x_data, dy, dy_shape, dx):
+def _compute_theoretical_jacobian(x, x_shape, x_data, dy, dy_shape, dx):
"""Computes the theoretical Jacobian for dy/dx.
Computes the theoretical Jacobian using the ops generated by
- ComputeGradient().
+ compute_gradient().
Args:
x: the tensor "x".
@@ -64,9 +64,9 @@ def _ComputeTheoricalJacobian(x, x_shape, x_data, dy, dy_shape, dx):
"dy_size" is the number of elements in dy.
"""
# To compute the jacobian, we treat x and y are one-dimensional vectors
- x_size = _Product(x_shape)
- x_val_size = _Product(x_shape[1:]) # This is used for sparse gradients
- dy_size = _Product(dy_shape)
+ x_size = _product(x_shape)
+ x_val_size = _product(x_shape[1:]) # This is used for sparse gradients
+ dy_size = _product(dy_shape)
jacobian = np.zeros((x_size, dy_size), dtype=x_data.dtype)
# For each of the entry of dy, we set this to be 1 and
@@ -92,7 +92,7 @@ def _ComputeTheoricalJacobian(x, x_shape, x_data, dy, dy_shape, dx):
return jacobian
-def _ComputeNumericJacobian(x, x_shape, x_data, y, y_shape, delta):
+def _compute_numeric_jacobian(x, x_shape, x_data, y, y_shape, delta):
"""Computes the numeric Jacobian for dy/dx.
Computes the numeric Jacobian by slightly perturbing the inputs and
@@ -113,8 +113,8 @@ def _ComputeNumericJacobian(x, x_shape, x_data, y, y_shape, delta):
"""
# To compute the jacobian, we treat x and y are one-dimensional vectors
- x_size = _Product(x_shape)
- y_size = _Product(y_shape)
+ x_size = _product(x_shape)
+ y_size = _product(y_shape)
jacobian = np.zeros((x_size, y_size), dtype=x_data.dtype)
# For each of the entry of x, we slightly perturbs this by adding and
@@ -134,7 +134,7 @@ def _ComputeNumericJacobian(x, x_shape, x_data, y, y_shape, delta):
return jacobian
-def _ComputeDxAndDy(x, y, y_shape):
+def _compute_dx_and_dy(x, y, y_shape):
"""Returns a node to compute gradient of x wrt y."""
# We make up a dy so that we can compute the gradients. We don't really use
# the value of dy -- we will always feed it. We need to add an identity node
@@ -149,8 +149,14 @@ def _ComputeDxAndDy(x, y, y_shape):
return grads[0], dy_orig
-def _ComputeGradient(x, x_shape, dx, y, y_shape, dy,
- x_init_value=None, delta=1e-3):
+def _compute_gradient(x,
+ x_shape,
+ dx,
+ y,
+ y_shape,
+ dy,
+ x_init_value=None,
+ delta=1e-3):
"""Computes the theoretical and numerical jacobian."""
t = dtypes.as_dtype(x.dtype)
allowed_types = [dtypes.float32, dtypes.float64]
@@ -170,16 +176,21 @@ def _ComputeGradient(x, x_shape, dx, y, y_shape, dy,
dtype = np.float64
x_data = np.asfarray(np.random.random_sample(x_shape), dtype=dtype)
- jacob_t = _ComputeTheoricalJacobian(x, x_shape, x_data, dy, y_shape, dx)
- jacob_n = _ComputeNumericJacobian(x, x_shape, x_data, y, y_shape, delta)
+ jacob_t = _compute_theoretical_jacobian(x, x_shape, x_data, dy, y_shape, dx)
+ jacob_n = _compute_numeric_jacobian(x, x_shape, x_data, y, y_shape, delta)
return jacob_t, jacob_n
-def _ComputeGradientList(
- x, x_shape, y, y_shape, x_init_value=None, delta=1e-3, init_targets=None):
+def _compute_gradient_list(x,
+ x_shape,
+ y,
+ y_shape,
+ x_init_value=None,
+ delta=1e-3,
+ init_targets=None):
"""Compute gradients for a list of x values."""
assert isinstance(x, list)
- dx, dy = zip(*[_ComputeDxAndDy(xi, y, y_shape) for xi in x])
+ dx, dy = zip(*[_compute_dx_and_dy(xi, y, y_shape) for xi in x])
if init_targets is not None:
assert isinstance(init_targets, (list, tuple))
@@ -187,15 +198,20 @@ def _ComputeGradientList(
init.run()
if x_init_value is None:
x_init_value = [None] * len(x)
- ret = [_ComputeGradient(xi, x_shapei, dxi, y, y_shape, dyi,
- x_init_valuei, delta)
- for xi, x_shapei, dxi, dyi, x_init_valuei in
- zip(x, x_shape, dx, dy, x_init_value)]
+ ret = [_compute_gradient(xi, x_shapei, dxi, y, y_shape, dyi, x_init_valuei,
+ delta)
+ for xi, x_shapei, dxi, dyi, x_init_valuei in zip(x, x_shape, dx, dy,
+ x_init_value)]
return ret
-def ComputeGradient(
- x, x_shape, y, y_shape, x_init_value=None, delta=1e-3, init_targets=None):
+def compute_gradient(x,
+ x_shape,
+ y,
+ y_shape,
+ x_init_value=None,
+ delta=1e-3,
+ init_targets=None):
"""Computes and returns the theoretical and numerical Jacobian.
Args:
@@ -219,20 +235,25 @@ def ComputeGradient(
number of elements in y. If x is a list, returns a list of two numpy arrays.
"""
if isinstance(x, list):
- return _ComputeGradientList(x, x_shape, y, y_shape, x_init_value,
- delta, init_targets)
+ return _compute_gradient_list(x, x_shape, y, y_shape, x_init_value, delta,
+ init_targets)
else:
if init_targets is not None:
assert isinstance(init_targets, (list, tuple))
for init in init_targets:
init.run()
- dx, dy = _ComputeDxAndDy(x, y, y_shape)
- ret = _ComputeGradient(x, x_shape, dx, y, y_shape, dy, x_init_value, delta)
+ dx, dy = _compute_dx_and_dy(x, y, y_shape)
+ ret = _compute_gradient(x, x_shape, dx, y, y_shape, dy, x_init_value, delta)
return ret
-def ComputeGradientError(
- x, x_shape, y, y_shape, x_init_value=None, delta=1e-3, init_targets=None):
+def compute_gradient_error(x,
+ x_shape,
+ y,
+ y_shape,
+ x_init_value=None,
+ delta=1e-3,
+ init_targets=None):
"""Computes the gradient error.
Computes the maximum error for dy/dx between the computed Jacobian and the
@@ -263,8 +284,8 @@ def ComputeGradientError(
Returns:
The maximum error in between the two Jacobians.
"""
- grad = ComputeGradient(x, x_shape, y, y_shape, x_init_value,
- delta, init_targets)
+ grad = compute_gradient(x, x_shape, y, y_shape, x_init_value, delta,
+ init_targets)
if isinstance(grad, tuple):
grad = [grad]
return max(np.fabs(j_t - j_n).max() for j_t, j_n in grad)
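The renamed helpers compare a theoretical Jacobian against a numeric one and report the largest absolute difference. The idea can be sketched in plain NumPy without a TensorFlow graph. This is an illustration under assumptions, not the code in gradient_checker.py: `numeric_jacobian` and `max_jacobian_error` are hypothetical names, the perturbation is assumed to be a central difference, and the "theoretical" Jacobian is written by hand for `np.square`. The `(x_size, y_size)` layout, the default `delta=1e-3`, and the `np.fabs(...).max()` error follow the diff above.

```python
import numpy as np


def numeric_jacobian(f, x, delta=1e-3):
    """Central-difference Jacobian of f at x, with shape (x.size, y.size)."""
    x = np.asarray(x, dtype=np.float64)
    y_size = np.asarray(f(x)).size
    jacobian = np.zeros((x.size, y_size), dtype=x.dtype)
    for row in range(x.size):
        # Perturb one input entry at a time, in both directions.
        x_pos = x.copy()
        x_neg = x.copy()
        x_pos.flat[row] += delta
        x_neg.flat[row] -= delta
        diff = (np.asarray(f(x_pos)) - np.asarray(f(x_neg))) / (2 * delta)
        jacobian[row, :] = diff.ravel()
    return jacobian


def max_jacobian_error(j_t, j_n):
    """Largest absolute elementwise difference between two Jacobians."""
    return np.fabs(j_t - j_n).max()


x = np.array([1.0, 2.0, 3.0])
j_n = numeric_jacobian(np.square, x)   # numeric estimate
j_t = np.diag(2 * x)                   # hand-derived Jacobian of x**2
error = max_jacobian_error(j_t, j_n)
```

For a quadratic function the central difference is exact up to floating-point rounding, so `error` should be tiny here; for other ops the acceptable threshold depends on `delta` and the dtype.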