From 02b7fa3dfe3e82ca61581bf3365788c8acaa2b19 Mon Sep 17 00:00:00 2001
From: Amit Patankar
Date: Wed, 6 Jun 2018 14:04:40 -0700
Subject: Adding a constraint for the setuptools version.

---
 tensorflow/tools/pip_package/setup.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py
index 78d955c637..97f625e7e9 100644
--- a/tensorflow/tools/pip_package/setup.py
+++ b/tensorflow/tools/pip_package/setup.py
@@ -54,6 +54,7 @@ REQUIRED_PACKAGES = [
     'numpy >= 1.13.3',
     'six >= 1.10.0',
     'protobuf >= 3.4.0',
+    'setuptools <= 39.1.0',
     'tensorboard >= 1.8.0, < 1.9.0',
     'termcolor >= 1.1.0',
 ]
--
cgit v1.2.3


From da3f4f86267a42f1a7780222143d79b167a75eb1 Mon Sep 17 00:00:00 2001
From: Amit Patankar
Date: Wed, 6 Jun 2018 14:27:59 -0700
Subject: Removing the force downgrade install.

---
 tensorflow/tools/ci_build/builds/pip.sh | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/tensorflow/tools/ci_build/builds/pip.sh b/tensorflow/tools/ci_build/builds/pip.sh
index 883bb93647..5fa75e1d61 100755
--- a/tensorflow/tools/ci_build/builds/pip.sh
+++ b/tensorflow/tools/ci_build/builds/pip.sh
@@ -322,10 +322,6 @@ create_activate_virtualenv_and_install_tensorflow() {
   pip install -v ${PIP_FLAGS} ${WHL_PATH} || \
     die "pip install (forcing to reinstall tensorflow) FAILED"
   echo "Successfully installed pip package ${TF_WHEEL_PATH}"
-
-  # Force downgrade setuptools.
-  pip install --upgrade setuptools==39.1.0
-
 }
--
cgit v1.2.3


From 60cb7f88afda606df2b700ce0bb662f22e1a7709 Mon Sep 17 00:00:00 2001
From: Derek Murray
Date: Thu, 7 Jun 2018 12:53:11 -0700
Subject: Consolidate `tf.data` release notes.

---
 RELEASE.md | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index c1ed69bd45..8f76e7efb4 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -14,8 +14,13 @@
 ## Bug Fixes and Other Changes
 * `tf.data`:
-  * The `DatasetBase::DebugString()` method is now `const`.
-  * Added the `tf.contrib.data.sample_from_datasets()` API for randomly sampling from multiple datasets.
+  * `Dataset.from_generator()` now accepts an `args` list, in order to create nested generators.
+  * `Dataset.list_files()` now produces deterministic results when `shuffle=False` or a `seed` is passed.
+  * `tf.contrib.data.sample_from_datasets()` and `tf.contrib.data.choose_from_datasets()` make it easier to sample or deterministically choose elements from multiple datasets.
+  * `tf.contrib.data.make_csv_dataset()` now supports line breaks in quoted strings, and two infrequently used arguments were removed.
+  * (C++) `DatasetBase::DebugString()` is now `const`.
+  * (C++) `DatasetBase::MakeIterator()` has been renamed to `DatasetBase::MakeIteratorInternal()`.
+  * (C++) The `IteratorBase::Initialize()` method was added to support raising errors during iterator construction.
 * Eager Execution:
 * `tf.keras`:
   * Move Keras code out of _impl folder and remove API files.
@@ -24,8 +29,6 @@
 * Accelerated Linear Algebra (XLA):
 * TensorFlow Debugger (tfdbg) CLI:
 * `tf.contrib`:
-  * Add `tf.contrib.data.choose_from_datasets()`.
-  * `tf.contrib.data.make_csv_dataset()` now supports line breaks in quoted strings. Two arguments were removed from `make_csv_dataset`.
   * `tf.contrib.framework.zero_initializer` supports ResourceVariable.
   * Adding "constrained_optimization" to tensorflow/contrib.
 * Other:
@@ -35,7 +38,6 @@
   * More consistent GcsFileSystem behavior for certain reads past EOF.
   * Update benchmark for tf.scan to match ranges across eager and graph modes.
   * Fixed bug in `tf.reduce_prod gradient` for complex dtypes.
-  * Add optional `args` argument to `Dataset.from_generator()`.
   * Allow the use of '.' in variables (e.g. "hparams.parse('a.b=1.0')"), which would previously raise an error. This will correspond to an attribute name with an embedded '.' symbol (e.g. 'a.b'), which can only be accessed indirectly (e.g. through getattr and setattr). To set this up the user will first need to explicitly add the variable to the hparam object (e.g. "hparams.add_hparam(name='a.b', value=0.0)").
   * Benchmark for tf.scan in graph and eager modes.
   * Added complex128 support to FFT, FFT2D, FFT3D, IFFT, IFFT2D, and IFFT3D.
@@ -45,7 +47,6 @@
   * LinearOperator[1D,2D,3D]Circulant added to `tensorflow.linalg`.
   * Conv3D, Conv3DBackpropInput, Conv3DBackpropFilter now supports arbitrary.
   * Added `tf.train.Checkpoint` for reading/writing object-based checkpoints.
-  * `Dataset.list_files()` now produces determinstic results when `shuffle=False` or a `seed` is passed.
   * Added LinearOperatorKronecker, a dense-free implementation of the Kronecker Product.
   * Allow LinearOperator to broadcast.
   * SavedModelBuilder will now deduplicate asset names that point to files with the same basename and the same contents. Note that this may result in new asset files included in SavedModels in cases where assets with the same name but different contents were previously overwriting each other.
--
cgit v1.2.3


From d3b482dadfa1b59ec04ee668ebd899e6bcb4b7b8 Mon Sep 17 00:00:00 2001
From: Shanqing Cai
Date: Fri, 8 Jun 2018 14:55:26 -0400
Subject: Update RELEASE.md (r1.9) for tfdbg and XLA

---
 RELEASE.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index 8f76e7efb4..879ce6e440 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -26,8 +26,7 @@
   * Move Keras code out of _impl folder and remove API files.
   * `tf.keras.Model.save_weights` now saves in TensorFlow format by default.
   * Enable dataset iterators to be passed to `tf.keras.Model` training/eval methods.
-* Accelerated Linear Algebra (XLA):
-* TensorFlow Debugger (tfdbg) CLI:
+* TensorFlow Debugger (tfdbg) CLI: fixed an issue in which the TensorBoard Debugger Plugin could not handle total source file size exceeding the gRPC message size limit (4 MB).
 * `tf.contrib`:
   * `tf.contrib.framework.zero_initializer` supports ResourceVariable.
   * Adding "constrained_optimization" to tensorflow/contrib.
--
cgit v1.2.3
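The consolidated `tf.data` notes above describe the API surface only. A minimal sketch of the two Python-level additions, assuming the TF 1.9 `tf.contrib.data` API (illustrative only, not part of any patch):

```python
import tensorflow as tf

# `Dataset.from_generator()` now accepts an `args` tuple, which is passed
# through to the generator when it is invoked; here it parameterizes the
# range limit.
def gen(limit):
  for i in range(limit):
    yield i

ds = tf.data.Dataset.from_generator(gen, output_types=tf.int64, args=(10,))

# `sample_from_datasets()` randomly interleaves elements from its inputs
# according to `weights`.
zeros = tf.data.Dataset.from_tensors(0).repeat()
ones = tf.data.Dataset.from_tensors(1).repeat()
mixed = tf.contrib.data.sample_from_datasets([zeros, ones], weights=[0.5, 0.5])
```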
From a08c8a79f3d0ea5a7fac74d8f5e9da5def89170b Mon Sep 17 00:00:00 2001
From: Mark Daoust
Date: Mon, 4 Jun 2018 11:11:06 -0700
Subject: Fix visibility for tf.keras.__version__

PiperOrigin-RevId: 199161696
---
 tensorflow/python/keras/__init__.py         | 4 ++++
 tensorflow/python/keras/integration_test.py | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/tensorflow/python/keras/__init__.py b/tensorflow/python/keras/__init__.py
index 197f306097..3493069a5b 100644
--- a/tensorflow/python/keras/__init__.py
+++ b/tensorflow/python/keras/__init__.py
@@ -41,8 +41,12 @@ from tensorflow.python.keras.layers import Input
 from tensorflow.python.keras.models import Model
 from tensorflow.python.keras.models import Sequential

+from tensorflow.python.util.tf_export import tf_export
+
 __version__ = '2.1.6-tf'

+tf_export('keras.__version__').export_constant(__name__, '__version__')
+
 del absolute_import
 del division
 del print_function

diff --git a/tensorflow/python/keras/integration_test.py b/tensorflow/python/keras/integration_test.py
index 2e83544d97..2a05699407 100644
--- a/tensorflow/python/keras/integration_test.py
+++ b/tensorflow/python/keras/integration_test.py
@@ -29,6 +29,9 @@ from tensorflow.python.platform import test

 class KerasIntegrationTest(test.TestCase):

+  def test_version(self):
+    self.assertTrue(keras.__version__.endswith('-tf'))
+
   def test_vector_classification_sequential(self):
     with self.test_session():
       np.random.seed(1337)
--
cgit v1.2.3
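As a quick illustration of what this export enables (a sketch, assuming a build containing the change), the version constant becomes reachable from the public namespace:

```python
import tensorflow as tf

# The constant exported above; its value is set to '2.1.6-tf' in this file.
print(tf.keras.__version__)
assert tf.keras.__version__.endswith('-tf')
```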
From 0eac1ebafc1e16e6440658d6b431998f3e682bbc Mon Sep 17 00:00:00 2001
From: Francois Chollet
Date: Mon, 4 Jun 2018 14:46:38 -0700
Subject: Add various missing aliases for symbols in tf.keras submodules.

PiperOrigin-RevId: 199198086
---
 tensorflow/python/keras/losses.py                  | 35 ++++++++++++++---
 tensorflow/python/ops/init_ops.py                  | 21 +++++++-----
 .../tensorflow.keras.initializers.constant.pbtxt   | 18 +++++++++
 .../tensorflow.keras.initializers.identity.pbtxt   | 18 +++++++++
 .../tensorflow.keras.initializers.normal.pbtxt     | 18 +++++++++
 .../tensorflow.keras.initializers.ones.pbtxt       | 18 +++++++++
 .../tensorflow.keras.initializers.orthogonal.pbtxt | 18 +++++++++
 .../api/golden/tensorflow.keras.initializers.pbtxt | 40 ++++++++++++++++++++
 ...nsorflow.keras.initializers.random_normal.pbtxt | 18 +++++++++
 ...sorflow.keras.initializers.random_uniform.pbtxt | 18 +++++++++
 ...rflow.keras.initializers.truncated_normal.pbtxt | 18 +++++++++
 .../tensorflow.keras.initializers.uniform.pbtxt    | 18 +++++++++
 .../tensorflow.keras.initializers.zeros.pbtxt      | 18 +++++++++
 .../tools/api/golden/tensorflow.keras.losses.pbtxt | 44 ++++++++++++++++++++++
 .../api/golden/tensorflow.keras.metrics.pbtxt      | 44 ++++++++++++++++++++++
 15 files changed, 350 insertions(+), 14 deletions(-)
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.constant.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.identity.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.normal.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.ones.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.orthogonal.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.random_normal.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.random_uniform.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.truncated_normal.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.uniform.pbtxt
 create mode 100644 tensorflow/tools/api/golden/tensorflow.keras.initializers.zeros.pbtxt

diff --git a/tensorflow/python/keras/losses.py b/tensorflow/python/keras/losses.py
index d82ebd9c31..9f548bfe04 100644
--- a/tensorflow/python/keras/losses.py
+++ b/tensorflow/python/keras/losses.py
@@ -30,19 +30,31 @@ from tensorflow.python.util.tf_export import tf_export


 @tf_export('keras.metrics.mean_squared_error',
-           'keras.losses.mean_squared_error')
+           'keras.metrics.mse',
+           'keras.metrics.MSE',
+           'keras.losses.mean_squared_error',
+           'keras.losses.mse',
+           'keras.losses.MSE')
 def mean_squared_error(y_true, y_pred):
   return K.mean(math_ops.square(y_pred - y_true), axis=-1)


 @tf_export('keras.metrics.mean_absolute_error',
-           'keras.losses.mean_absolute_error')
+           'keras.metrics.mae',
+           'keras.metrics.MAE',
+           'keras.losses.mean_absolute_error',
+           'keras.losses.mae',
+           'keras.losses.MAE')
 def mean_absolute_error(y_true, y_pred):
   return K.mean(math_ops.abs(y_pred - y_true), axis=-1)


 @tf_export('keras.metrics.mean_absolute_percentage_error',
-           'keras.losses.mean_absolute_percentage_error')
+           'keras.metrics.mape',
+           'keras.metrics.MAPE',
+           'keras.losses.mean_absolute_percentage_error',
+           'keras.losses.mape',
+           'keras.losses.MAPE')
 def mean_absolute_percentage_error(y_true, y_pred):
   diff = math_ops.abs(
       (y_true - y_pred) / K.clip(math_ops.abs(y_true), K.epsilon(), None))
@@ -50,7 +62,11 @@ def mean_absolute_percentage_error(y_true, y_pred):


 @tf_export('keras.metrics.mean_squared_logarithmic_error',
-           'keras.losses.mean_squared_logarithmic_error')
+           'keras.metrics.msle',
+           'keras.metrics.MSLE',
+           'keras.losses.mean_squared_logarithmic_error',
+           'keras.losses.msle',
+           'keras.losses.MSLE')
 def mean_squared_logarithmic_error(y_true, y_pred):
   first_log = math_ops.log(K.clip(y_pred, K.epsilon(), None) + 1.)
   second_log = math_ops.log(K.clip(y_true, K.epsilon(), None) + 1.)
@@ -117,7 +133,11 @@ def binary_crossentropy(y_true, y_pred):


 @tf_export('keras.metrics.kullback_leibler_divergence',
-           'keras.losses.kullback_leibler_divergence')
+           'keras.metrics.kld',
+           'keras.metrics.KLD',
+           'keras.losses.kullback_leibler_divergence',
+           'keras.losses.kld',
+           'keras.losses.KLD')
 def kullback_leibler_divergence(y_true, y_pred):
   y_true = K.clip(y_true, K.epsilon(), 1)
   y_pred = K.clip(y_pred, K.epsilon(), 1)
@@ -129,7 +149,10 @@ def poisson(y_true, y_pred):
   return K.mean(y_pred - y_true * math_ops.log(y_pred + K.epsilon()), axis=-1)


-@tf_export('keras.metrics.cosine_proximity', 'keras.losses.cosine_proximity')
+@tf_export('keras.metrics.cosine_proximity',
+           'keras.metrics.cosine',
+           'keras.losses.cosine_proximity',
+           'keras.losses.cosine')
 def cosine_proximity(y_true, y_pred):
   y_true = nn.l2_normalize(y_true, axis=-1)
   y_pred = nn.l2_normalize(y_pred, axis=-1)

diff --git a/tensorflow/python/ops/init_ops.py b/tensorflow/python/ops/init_ops.py
index 1f8d8dc4f3..2df230d470 100644
--- a/tensorflow/python/ops/init_ops.py
+++ b/tensorflow/python/ops/init_ops.py
@@ -86,7 +86,7 @@ class Initializer(object):


 @tf_export("keras.initializers.Zeros", "initializers.zeros",
-           "zeros_initializer")
+           "zeros_initializer", "keras.initializers.zeros")
 class Zeros(Initializer):
   """Initializer that generates tensors initialized to 0."""

@@ -102,7 +102,8 @@ class Zeros(Initializer):
     return {"dtype": self.dtype.name}


-@tf_export("keras.initializers.Ones", "initializers.ones", "ones_initializer")
+@tf_export("keras.initializers.Ones", "initializers.ones", "ones_initializer",
+           "keras.initializers.ones")
 class Ones(Initializer):
   """Initializer that generates tensors initialized to 1."""

@@ -119,7 +120,7 @@ class Ones(Initializer):


 @tf_export("keras.initializers.Constant", "initializers.constant",
-           "constant_initializer")
+           "constant_initializer", "keras.initializers.constant")
 class Constant(Initializer):
   """Initializer that generates tensors with constant values.

@@ -225,7 +226,8 @@ class Constant(Initializer):


 @tf_export("keras.initializers.RandomUniform", "initializers.random_uniform",
-           "random_uniform_initializer")
+           "random_uniform_initializer", "keras.initializers.uniform",
+           "keras.initializers.random_uniform")
 class RandomUniform(Initializer):
   """Initializer that generates tensors with a uniform distribution.

@@ -262,7 +264,8 @@ class RandomUniform(Initializer):


 @tf_export("keras.initializers.RandomNormal", "initializers.random_normal",
-           "random_normal_initializer")
+           "random_normal_initializer", "keras.initializers.normal",
+           "keras.initializers.random_normal")
 class RandomNormal(Initializer):
   """Initializer that generates tensors with a normal distribution.

@@ -299,7 +302,8 @@ class RandomNormal(Initializer):


 @tf_export("keras.initializers.TruncatedNormal",
-           "initializers.truncated_normal", "truncated_normal_initializer")
+           "initializers.truncated_normal", "truncated_normal_initializer",
+           "keras.initializers.truncated_normal")
 class TruncatedNormal(Initializer):
   """Initializer that generates a truncated normal distribution.

@@ -482,7 +486,7 @@ class VarianceScaling(Initializer):


 @tf_export("keras.initializers.Orthogonal", "initializers.orthogonal",
-           "orthogonal_initializer")
+           "orthogonal_initializer", "keras.initializers.orthogonal")
 class Orthogonal(Initializer):
   """Initializer that generates an orthogonal matrix.

@@ -1062,7 +1066,8 @@ class ConvolutionOrthogonal3D(ConvolutionOrthogonal):
     return self._dict_to_tensor(p, ksize, ksize, ksize)


-@tf_export("keras.initializers.Identity", "initializers.identity")
+@tf_export("keras.initializers.Identity", "initializers.identity",
+           "keras.initializers.identity")
 class Identity(Initializer):
   """Initializer that generates the identity matrix.

diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.constant.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.constant.pbtxt
new file mode 100644
index 0000000000..bddc37b907
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.constant.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.constant"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'value\', \'dtype\', \'verify_shape\'], varargs=None, keywords=None, defaults=[\'0\', \"\", \'False\'], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.identity.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.identity.pbtxt
new file mode 100644
index 0000000000..a4c5a61490
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.identity.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.identity"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'gain\', \'dtype\'], varargs=None, keywords=None, defaults=[\'1.0\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.normal.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.normal.pbtxt
new file mode 100644
index 0000000000..7485772784
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.normal.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.normal"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'mean\', \'stddev\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'0.0\', \'1.0\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.ones.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.ones.pbtxt
new file mode 100644
index 0000000000..a89f78d1e1
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.ones.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.ones"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'dtype\'], varargs=None, keywords=None, defaults=[\"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.orthogonal.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.orthogonal.pbtxt
new file mode 100644
index 0000000000..ee1e9bbae2
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.orthogonal.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.orthogonal"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'gain\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'1.0\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.pbtxt
index 093c56595b..14a667870d 100644
--- a/tensorflow/tools/api/golden/tensorflow.keras.initializers.pbtxt
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.pbtxt
@@ -40,6 +40,46 @@ tf_module {
     name: "Zeros"
     mtype: ""
   }
+  member {
+    name: "constant"
+    mtype: ""
+  }
+  member {
+    name: "identity"
+    mtype: ""
+  }
+  member {
+    name: "normal"
+    mtype: ""
+  }
+  member {
+    name: "ones"
+    mtype: ""
+  }
+  member {
+    name: "orthogonal"
+    mtype: ""
+  }
+  member {
+    name: "random_normal"
+    mtype: ""
+  }
+  member {
+    name: "random_uniform"
+    mtype: ""
+  }
+  member {
+    name: "truncated_normal"
+    mtype: ""
+  }
+  member {
+    name: "uniform"
+    mtype: ""
+  }
+  member {
+    name: "zeros"
+    mtype: ""
+  }
   member_method {
     name: "deserialize"
     argspec: "args=[\'config\', \'custom_objects\'], varargs=None, keywords=None, defaults=[\'None\'], "
   }
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_normal.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_normal.pbtxt
new file mode 100644
index 0000000000..a6df1e87a3
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_normal.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.random_normal"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'mean\', \'stddev\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'0.0\', \'1.0\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_uniform.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_uniform.pbtxt
new file mode 100644
index 0000000000..37a0fa0d55
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.random_uniform.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.random_uniform"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'minval\', \'maxval\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'0\', \'None\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.truncated_normal.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.truncated_normal.pbtxt
new file mode 100644
index 0000000000..f97e93f0b7
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.truncated_normal.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.truncated_normal"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'mean\', \'stddev\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'0.0\', \'1.0\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.uniform.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.uniform.pbtxt
new file mode 100644
index 0000000000..58186b1383
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.uniform.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.uniform"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'minval\', \'maxval\', \'seed\', \'dtype\'], varargs=None, keywords=None, defaults=[\'0\', \'None\', \'None\', \"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.initializers.zeros.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.initializers.zeros.pbtxt
new file mode 100644
index 0000000000..a262390687
--- /dev/null
+++ b/tensorflow/tools/api/golden/tensorflow.keras.initializers.zeros.pbtxt
@@ -0,0 +1,18 @@
+path: "tensorflow.keras.initializers.zeros"
+tf_class {
+  is_instance: ""
+  is_instance: ""
+  is_instance: ""
+  member_method {
+    name: "__init__"
+    argspec: "args=[\'self\', \'dtype\'], varargs=None, keywords=None, defaults=[\"\"], "
+  }
+  member_method {
+    name: "from_config"
+    argspec: "args=[\'cls\', \'config\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "get_config"
+    argspec: "args=[\'self\'], varargs=None, keywords=None, defaults=None"
+  }
+}
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.losses.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.losses.pbtxt
index ae5f6305b7..eca6b91538 100644
--- a/tensorflow/tools/api/golden/tensorflow.keras.losses.pbtxt
+++ b/tensorflow/tools/api/golden/tensorflow.keras.losses.pbtxt
@@ -1,5 +1,25 @@
 path: "tensorflow.keras.losses"
 tf_module {
+  member_method {
+    name: "KLD"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MAE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MAPE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MSE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MSLE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "binary_crossentropy"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -12,6 +32,10 @@
     name: "categorical_hinge"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "cosine"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "cosine_proximity"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -28,6 +52,10 @@
     name: "hinge"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "kld"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "kullback_leibler_divergence"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -36,6 +64,14 @@
     name: "logcosh"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "mae"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "mape"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "mean_absolute_error"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -52,6 +88,14 @@
     name: "mean_squared_logarithmic_error"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "mse"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "msle"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "poisson"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
diff --git a/tensorflow/tools/api/golden/tensorflow.keras.metrics.pbtxt b/tensorflow/tools/api/golden/tensorflow.keras.metrics.pbtxt
index 42729e4237..a97a9b5758 100644
--- a/tensorflow/tools/api/golden/tensorflow.keras.metrics.pbtxt
+++ b/tensorflow/tools/api/golden/tensorflow.keras.metrics.pbtxt
@@ -1,5 +1,25 @@
 path: "tensorflow.keras.metrics"
 tf_module {
+  member_method {
+    name: "KLD"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MAE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MAPE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MSE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "MSLE"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "binary_accuracy"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -16,6 +36,10 @@
     name: "categorical_crossentropy"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "cosine"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "cosine_proximity"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -32,10 +56,22 @@
     name: "hinge"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "kld"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "kullback_leibler_divergence"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "mae"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "mape"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "mean_absolute_error"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
@@ -52,6 +88,14 @@
     name: "mean_squared_logarithmic_error"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
+  member_method {
+    name: "mse"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
+  member_method {
+    name: "msle"
+    argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
+  }
   member_method {
     name: "poisson"
     argspec: "args=[\'y_true\', \'y_pred\'], varargs=None, keywords=None, defaults=None"
   }
--
cgit v1.2.3
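For illustration, a short sketch (not part of the patch) of how the new alias symbols behave once this change is in place; since `tf_export` registers the same function object under every listed name, the aliases are identical to the long names:

```python
import tensorflow as tf

# `MSE`/`mae` are registered above as aliases of the long-form functions.
assert tf.keras.losses.MSE is tf.keras.losses.mean_squared_error
assert tf.keras.metrics.mae is tf.keras.metrics.mean_absolute_error

y_true = tf.constant([[0., 1.], [1., 1.]])
y_pred = tf.constant([[0.1, 0.9], [0.8, 1.0]])
loss = tf.keras.losses.MSE(y_true, y_pred)  # per-sample mean squared error

with tf.Session() as sess:
  print(sess.run(loss))
```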
From 7c33a7751d77cfd70a5c441da369440f4f6b633a Mon Sep 17 00:00:00 2001
From: Pavithra Vijay
Date: Thu, 7 Jun 2018 09:20:57 -0700
Subject: Fix bug due to incorrect nesting of return statement in eager
 iterator evaluation.

PiperOrigin-RevId: 199645638
---
 tensorflow/python/keras/engine/training_eager.py      | 10 ++--
 .../python/keras/engine/training_eager_test.py        | 54 ++++++++++++++++++++++
 2 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/tensorflow/python/keras/engine/training_eager.py b/tensorflow/python/keras/engine/training_eager.py
index 081e46aa66..a70b488f25 100644
--- a/tensorflow/python/keras/engine/training_eager.py
+++ b/tensorflow/python/keras/engine/training_eager.py
@@ -501,11 +501,11 @@ def iterator_test_loop(model, inputs, steps, verbose=0):
     if verbose == 1:
       progbar.update(step_index + 1)
-    for i in range(len(outs)):
-      outs[i] /= num_samples
-    if len(outs) == 1:
-      return outs[0]
-    return outs
+  for i in range(len(outs)):
+    outs[i] /= num_samples
+  if len(outs) == 1:
+    return outs[0]
+  return outs


 def batch_test_loop(model,
diff --git a/tensorflow/python/keras/engine/training_eager_test.py b/tensorflow/python/keras/engine/training_eager_test.py
index d9446fd437..7906d208eb 100644
--- a/tensorflow/python/keras/engine/training_eager_test.py
+++ b/tensorflow/python/keras/engine/training_eager_test.py
@@ -20,6 +20,7 @@ from __future__ import print_function

 import numpy as np

+from tensorflow.python.data.ops import dataset_ops
 from tensorflow.python import keras
 from tensorflow.python.framework import ops
 from tensorflow.python.framework import test_util as tf_test_util
@@ -670,6 +671,59 @@ class CorrectnessTest(test.TestCase):
     outs = model.evaluate(x, y)
     self.assertEqual(outs[1], 0.)

+  @tf_test_util.run_in_graph_and_eager_modes()
+  def test_loss_correctness_with_iterator(self):
+    # Test that training loss is the same in eager and graph
+    # (by comparing it to a reference value in a deterministic case)
+    model = keras.Sequential()
+    model.add(
+        keras.layers.Dense(
+            3, activation='relu', input_dim=4, kernel_initializer='ones'))
+    model.add(
+        keras.layers.Dense(2, activation='softmax', kernel_initializer='ones'))
+    model.compile(
+        loss='sparse_categorical_crossentropy',
+        optimizer=RMSPropOptimizer(learning_rate=0.001))
+    x = np.ones((100, 4), dtype=np.float32)
+    np.random.seed(123)
+    y = np.random.randint(0, 1, size=(100, 1))
+    dataset = dataset_ops.Dataset.from_tensor_slices((x, y))
+    dataset = dataset.repeat(100)
+    dataset = dataset.batch(10)
+    iterator = dataset.make_one_shot_iterator()
+    history = model.fit(iterator, epochs=1, steps_per_epoch=10)
+    self.assertEqual(np.around(history.history['loss'][-1], decimals=4), 0.6173)
+
+  @tf_test_util.run_in_graph_and_eager_modes()
+  def test_metrics_correctness_with_iterator(self):
+    model = keras.Sequential()
+    model.add(
+        keras.layers.Dense(
+            8, activation='relu', input_dim=4, kernel_initializer='ones'))
+    model.add(
+        keras.layers.Dense(1, activation='sigmoid', kernel_initializer='ones'))
+    model.compile(
+        loss='binary_crossentropy',
+        metrics=['accuracy'],
+        optimizer=RMSPropOptimizer(learning_rate=0.001))
+    np.random.seed(123)
+    x = np.random.randint(10, size=(100, 4)).astype(np.float32)
+    y = np.random.randint(2, size=(100, 1)).astype(np.float32)
+    dataset = dataset_ops.Dataset.from_tensor_slices((x, y))
+    dataset = dataset.batch(10)
+    iterator = dataset.make_one_shot_iterator()
+    outs = model.evaluate(iterator, steps=10)
+    self.assertEqual(np.around(outs[1], decimals=1), 0.5)
+
+    y = np.zeros((100, 1), dtype=np.float32)
+    dataset = dataset_ops.Dataset.from_tensor_slices((x, y))
+    dataset = dataset.repeat(100)
+    dataset = dataset.batch(10)
+    iterator = dataset.make_one_shot_iterator()
+    outs = model.evaluate(iterator, steps=10)
+    self.assertEqual(outs[1], 0.)
+

 if __name__ == '__main__':
   ops.enable_eager_execution()
   test.main()
--
cgit v1.2.3


From 5177fd2f9acb9b46b9182ad782bb8b7b9386baeb Mon Sep 17 00:00:00 2001
From: "A. Unique TensorFlower"
Date: Tue, 5 Jun 2018 15:59:21 -0700
Subject: Only calls compare function if values were read from event file

PiperOrigin-RevId: 199373169
---
 tensorflow/python/estimator/exporter.py      |  7 +++---
 tensorflow/python/estimator/exporter_test.py | 34 ++++++++++++++++++++++
 2 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tensorflow/python/estimator/exporter.py b/tensorflow/python/estimator/exporter.py
index a7212bb83e..766ea23f2a 100644
--- a/tensorflow/python/estimator/exporter.py
+++ b/tensorflow/python/estimator/exporter.py
@@ -360,9 +360,10 @@ class BestExporter(Exporter):
         for value in event.summary.value:
           if value.HasField('simple_value'):
             event_eval_result[value.tag] = value.simple_value
-        if best_eval_result is None or self._compare_fn(
-            best_eval_result, event_eval_result):
-          best_eval_result = event_eval_result
+        if event_eval_result:
+          if best_eval_result is None or self._compare_fn(
+              best_eval_result, event_eval_result):
+            best_eval_result = event_eval_result
     return best_eval_result
diff --git a/tensorflow/python/estimator/exporter_test.py b/tensorflow/python/estimator/exporter_test.py
index 4cb4bffc8d..c4b006955c 100644
--- a/tensorflow/python/estimator/exporter_test.py
+++ b/tensorflow/python/estimator/exporter_test.py
@@ -148,6 +148,40 @@ class BestExporterTest(test.TestCase):
                                      "checkpoint_path", {"loss": 20}, False)
     self.assertEqual(None, export_result)

+  def test_best_exporter_with_empty_event(self):
+
+    def _serving_input_receiver_fn():
+      pass
+
+    export_dir_base = tempfile.mkdtemp()
+    gfile.MkDir(export_dir_base)
+    gfile.MkDir(export_dir_base + "/export")
+    gfile.MkDir(export_dir_base + "/eval")
+
+    eval_dir_base = os.path.join(export_dir_base, "eval_continuous")
+    estimator_lib._write_dict_to_summary(eval_dir_base, {}, 1)
+    estimator_lib._write_dict_to_summary(eval_dir_base, {"loss": 60}, 2)
+
+    exporter = exporter_lib.BestExporter(
+        name="best_exporter",
+        serving_input_receiver_fn=_serving_input_receiver_fn,
+        event_file_pattern="eval_continuous/*.tfevents.*",
+        assets_extra={"from/path": "to/path"},
+        as_text=False,
+        exports_to_keep=1)
+
+    estimator = test.mock.Mock(spec=estimator_lib.Estimator)
+    estimator.model_dir = export_dir_base
+    estimator.export_savedmodel.return_value = "export_result_path"
+
+    export_result = exporter.export(estimator, export_dir_base,
+                                    "checkpoint_path", {"loss": 100}, False)
+    self.assertEqual(None, export_result)
+
+    export_result = exporter.export(estimator, export_dir_base,
+                                    "checkpoint_path", {"loss": 10}, False)
+    self.assertEqual("export_result_path", export_result)
+
   def test_garbage_collect_exports(self):
     export_dir_base = tempfile.mkdtemp()
     gfile.MkDir(export_dir_base)
--
cgit v1.2.3


From 4fe8d4a14936dc38558a858283574993909c9895 Mon Sep 17 00:00:00 2001
From: "A. Unique TensorFlower"
Date: Sun, 27 May 2018 10:49:12 -0700
Subject: TPUEstimator.export_savedmodel() saves a SavedModel with both TPU and
 CPU graphs.

PiperOrigin-RevId: 198229550
---
 tensorflow/contrib/tpu/python/tpu/tpu_estimator.py | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
index 4465833f88..c8c08a5a63 100644
--- a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
+++ b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
@@ -1807,7 +1807,7 @@ class TPUEstimator(estimator_lib.Estimator):
           export_outputs['classes'] =
             export_output_lib.ClassificationOutput(classes=classes)

-      tpu.outside_compilation(host_call, logits)
+      tpu.outside_compilation(host_call, [logits])

       ...
     ```
@@ -1969,7 +1969,7 @@ class TPUEstimator(estimator_lib.Estimator):
                                  input_receiver_fn_map[mode]}
       export_tags = [tag_constants.SERVING, tag_constants.TPU]
       mode = _REWRITE_FOR_INFERENCE_MODE
-    try:
+    if self._export_to_tpu:
       (super(TPUEstimator, self).
        _add_meta_graph_for_mode(builder,
                                 input_receiver_fn_map,
@@ -1978,9 +1978,6 @@ class TPUEstimator(estimator_lib.Estimator):
                                 save_variables=False,
                                 mode=mode,
                                 export_tags=export_tags))
-    except Exception as error:  # pylint: disable=broad-except
-      logging.warning('Saving meta graph for TPU failed: {}.'
-                      .format(str(error)))

   def _call_model_fn(self, features, labels, mode, config):
     if mode == _REWRITE_FOR_INFERENCE_MODE:
--
cgit v1.2.3
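To see the effect of this change, one can inspect which metagraphs an export contains. A sketch (the export path is a hypothetical placeholder) that reads the SavedModel proto directly and prints each tag-set; after this patch, a TPUEstimator export should show both `['serve']` and `['serve', 'tpu']`:

```python
from tensorflow.core.protobuf import saved_model_pb2

EXPORT_DIR = '/tmp/tpu_export/1528000000'  # hypothetical export directory

sm = saved_model_pb2.SavedModel()
with open(EXPORT_DIR + '/saved_model.pb', 'rb') as f:
  sm.ParseFromString(f.read())

for meta_graph in sm.meta_graphs:
  # Expect one CPU graph tagged ['serve'] and one TPU graph
  # tagged ['serve', 'tpu'] after this change.
  print(sorted(meta_graph.meta_info_def.tags))
```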
From 982f3e3038f8d07964b2c58843a51bd9745a8990 Mon Sep 17 00:00:00 2001
From: "A. Unique TensorFlower"
Date: Fri, 1 Jun 2018 16:32:20 -0700
Subject: Allow user to opt out of saving metagraph for TPU with
 TPUEstimator.export_output().

PiperOrigin-RevId: 198944144
---
 tensorflow/contrib/tpu/python/tpu/tpu_estimator.py | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
index c8c08a5a63..7c770912b4 100644
--- a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
+++ b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py
@@ -1830,6 +1830,7 @@ class TPUEstimator(estimator_lib.Estimator):
                predict_batch_size=None,
                batch_axis=None,
                eval_on_tpu=True,
+               export_to_tpu=True,
                warm_start_from=None):
     """Constructs an `TPUEstimator` instance.
@@ -1872,6 +1873,8 @@ class TPUEstimator(estimator_lib.Estimator):
         False or `PER_HOST_V2`, batch_axis is ignored.
       eval_on_tpu: If False, evaluation runs on CPU or GPU. In this case, the
         model_fn must return `EstimatorSpec` when called with `mode` as `EVAL`.
+      export_to_tpu: If True, `export_savedmodel()` exports a metagraph for
+        serving on TPU, in addition to the one for CPU.
       warm_start_from: Optional string filepath to a checkpoint or SavedModel to
         warm-start from, or a `tf.estimator.WarmStartSettings`
        object to fully configure warm-starting. If the string
@@ -1943,6 +1946,8 @@ class TPUEstimator(estimator_lib.Estimator):
         use_tpu,
         eval_on_tpu)

+    self._export_to_tpu = export_to_tpu
+
     self._is_input_fn_invoked = None

   def _add_meta_graph_for_mode(self,
@@ -1965,11 +1970,11 @@ class TPUEstimator(estimator_lib.Estimator):
                                save_variables,
                                mode=mode)

-    input_receiver_fn_map = {_REWRITE_FOR_INFERENCE_MODE:
-                                 input_receiver_fn_map[mode]}
-    export_tags = [tag_constants.SERVING, tag_constants.TPU]
-    mode = _REWRITE_FOR_INFERENCE_MODE
     if self._export_to_tpu:
+      input_receiver_fn_map = {_REWRITE_FOR_INFERENCE_MODE:
+                                   input_receiver_fn_map[mode]}
+      export_tags = [tag_constants.SERVING, tag_constants.TPU]
+      mode = _REWRITE_FOR_INFERENCE_MODE
       (super(TPUEstimator, self).
        _add_meta_graph_for_mode(builder,
                                 input_receiver_fn_map,
--
cgit v1.2.3


From fd44596bc4b3ea8c67838b728b450a44e35c1b89 Mon Sep 17 00:00:00 2001
From: Anna R
Date: Mon, 11 Jun 2018 17:21:06 -0700
Subject: Merging

---
 tensorflow/tools/api/generator/BUILD               | 24 +++++++
 .../tools/api/generator/create_python_api.py       | 54 +++++++++++++--
 tensorflow/tools/api/generator/doc_srcs.py         | 65 ++++++++++++++++++
 tensorflow/tools/api/generator/doc_srcs_test.py    | 80 ++++++++++++++++++++++
 4 files changed, 217 insertions(+), 6 deletions(-)
 create mode 100644 tensorflow/tools/api/generator/doc_srcs.py
 create mode 100644 tensorflow/tools/api/generator/doc_srcs_test.py

diff --git a/tensorflow/tools/api/generator/BUILD b/tensorflow/tools/api/generator/BUILD
index f0c5877a90..3a28153e52 100644
--- a/tensorflow/tools/api/generator/BUILD
+++ b/tensorflow/tools/api/generator/BUILD
@@ -5,12 +5,21 @@ licenses(["notice"])  # Apache 2.0

 exports_files(["LICENSE"])

+load("//tensorflow/tools/api/generator:api_gen.bzl", "TENSORFLOW_API_INIT_FILES")
+
+py_library(
+    name = "doc_srcs",
+    srcs = ["doc_srcs.py"],
+    srcs_version = "PY2AND3",
+)
+
 py_binary(
     name = "create_python_api",
     srcs = ["create_python_api.py"],
     srcs_version = "PY2AND3",
     visibility = ["//visibility:public"],
     deps = [
+        ":doc_srcs",
         "//tensorflow/python:no_contrib",
     ],
 )
@@ -24,3 +33,18 @@ py_test(
         "//tensorflow/python:client_testlib",
     ],
 )
+
+py_test(
+    name = "tensorflow_doc_srcs_test",
+    srcs = ["doc_srcs_test.py"],
+    args = [
+        "--package=tensorflow.python",
+    ] + TENSORFLOW_API_INIT_FILES,
+    main = "doc_srcs_test.py",
+    srcs_version = "PY2AND3",
+    deps = [
+        ":doc_srcs",
+        "//tensorflow/python:client_testlib",
+        "//tensorflow/python:no_contrib",
+    ],
+)
diff --git a/tensorflow/tools/api/generator/create_python_api.py b/tensorflow/tools/api/generator/create_python_api.py
index 9f210ad42b..31f287b7fe 100644
--- a/tensorflow/tools/api/generator/create_python_api.py
+++ b/tensorflow/tools/api/generator/create_python_api.py
@@ -25,6 +25,8 @@ import os
 import sys

 from tensorflow.python.util import tf_decorator
+from tensorflow.python.util import tf_export
+from tensorflow.tools.api.generator import doc_srcs


 _API_CONSTANTS_ATTR = '_tf_api_constants'
@@ -36,10 +38,9 @@ _SYMBOLS_TO_SKIP_EXPLICITLY = {
     # would have side effects.
     'tensorflow.python.platform.flags.FLAGS'
 }
-_GENERATED_FILE_HEADER = """\"\"\"Imports for Python API.
-
-This file is MACHINE GENERATED! Do not edit.
-Generated by: tensorflow/tools/api/generator/create_python_api.py script.
+_GENERATED_FILE_HEADER = """# This file is MACHINE GENERATED! Do not edit.
+# Generated by: tensorflow/tools/api/generator/create_python_api.py script.
+\"\"\"%s
 \"\"\"
 from __future__ import print_function
@@ -254,6 +255,44 @@ def get_module(dir_path, relative_to_dir):
   return dir_path.replace('/', '.').strip('.')


+def get_module_docstring(module_name, package):
+  """Get docstring for the given module.
+
+  This method looks for docstring in the following order:
+  1. Checks if module has a docstring specified in doc_srcs.
+  2. Checks if module has a docstring source module specified
+     in doc_srcs. If it does, gets docstring from that module.
+  3. Checks if module with module_name exists under base package.
+     If it does, gets docstring from that module.
+  4. Returns a default docstring.
+
+  Args:
+    module_name: module name relative to tensorflow
+      (excluding 'tensorflow.' prefix) to get a docstring for.
+    package: Base python package containing python files with target tf_export
+      decorators.
+
+  Returns:
+    One-line docstring to describe the module.
+  """
+  # Module under base package to get a docstring from.
+  docstring_module_name = module_name
+
+  if module_name in doc_srcs.TENSORFLOW_DOC_SOURCES:
+    docsrc = doc_srcs.TENSORFLOW_DOC_SOURCES[module_name]
+    if docsrc.docstring:
+      return docsrc.docstring
+    if docsrc.docstring_module_name:
+      docstring_module_name = docsrc.docstring_module_name
+
+  docstring_module_name = package + '.' + docstring_module_name
+  if (docstring_module_name in sys.modules and
+      sys.modules[docstring_module_name].__doc__):
+    return sys.modules[docstring_module_name].__doc__
+
+  return 'Public API for tf.%s namespace.' % module_name
+
+
 def create_api_files(
     output_files, package, root_init_template, output_dir):
   """Creates __init__.py files for the Python API.
@@ -296,7 +335,10 @@ def create_api_files(
       continue
     contents = ''
     if module or not root_init_template:
-      contents = _GENERATED_FILE_HEADER + text + _GENERATED_FILE_FOOTER
+      contents = (
+          _GENERATED_FILE_HEADER %
+          get_module_docstring(module, package) + text +
+          _GENERATED_FILE_FOOTER)
     else:
       # Read base init file
       with open(root_init_template, 'r') as root_init_template_file:
@@ -309,7 +351,7 @@ def create_api_files(
     raise ValueError(
         'Missing outputs for python_api_gen genrule:\n%s.'
         'Make sure all required outputs are in the '
-        'tensorflow/tools/api/generator/BUILD file.' %
+        'tensorflow/tools/api/generator/api_gen.bzl file.' %
         ',\n'.join(sorted(missing_output_files)))
diff --git a/tensorflow/tools/api/generator/doc_srcs.py b/tensorflow/tools/api/generator/doc_srcs.py
new file mode 100644
index 0000000000..74f6db98fd
--- /dev/null
+++ b/tensorflow/tools/api/generator/doc_srcs.py
@@ -0,0 +1,65 @@
+# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# ==============================================================================
+"""Specifies sources of doc strings for API modules."""
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import collections
+
+
+# Specifies docstring source for a module.
+# Only one of docstring or docstring_module_name should be set.
+# * If docstring is set, then we will use this docstring
+#   for the module.
+# * If docstring_module_name is set, then we will copy the docstring
+#   from docstring source module.
+DocSource = collections.namedtuple(
+    'DocSource', ['docstring', 'docstring_module_name'])
+# Each attribute of DocSource is optional.
+DocSource.__new__.__defaults__ = (None,) * len(DocSource._fields)
+
+TENSORFLOW_DOC_SOURCES = {
+    'app': DocSource(docstring_module_name='platform.app'),
+    'compat': DocSource(docstring_module_name='util.compat'),
+    'distributions': DocSource(
+        docstring_module_name='ops.distributions.distributions'),
+    'bitwise': DocSource(docstring_module_name='ops.bitwise_ops'),
+    'errors': DocSource(docstring_module_name='framework.errors'),
+    'gfile': DocSource(docstring_module_name='platform.gfile'),
+    'graph_util': DocSource(docstring_module_name='framework.graph_util'),
+    'image': DocSource(docstring_module_name='ops.image_ops'),
+    'keras.estimator': DocSource(docstring_module_name='estimator.keras'),
+    'linalg': DocSource(docstring_module_name='ops.linalg_ops'),
+    'logging': DocSource(docstring_module_name='ops.logging_ops'),
+    'losses': DocSource(docstring_module_name='ops.losses.losses'),
+    'manip': DocSource(docstring_module_name='ops.manip_ops'),
+    'math': DocSource(docstring_module_name='ops.math_ops'),
+    'metrics': DocSource(docstring_module_name='ops.metrics'),
+    'nn': DocSource(docstring_module_name='ops.nn_ops'),
+    'nn.rnn_cell': DocSource(docstring_module_name='ops.rnn_cell'),
+    'python_io': DocSource(docstring_module_name='lib.io.python_io'),
+    'resource_loader': DocSource(
+        docstring_module_name='platform.resource_loader'),
+    'sets': DocSource(docstring_module_name='ops.sets'),
+    'sparse': DocSource(docstring_module_name='ops.sparse_ops'),
+    'spectral': DocSource(docstring_module_name='ops.spectral_ops'),
+    'strings': DocSource(docstring_module_name='ops.string_ops'),
+    'sysconfig': DocSource(docstring_module_name='platform.sysconfig'),
+    'test': DocSource(docstring_module_name='platform.test'),
+    'train': DocSource(docstring_module_name='training.training'),
+    'train.queue_runner': DocSource(
+        docstring_module_name='training.queue_runner'),
+}
diff --git a/tensorflow/tools/api/generator/doc_srcs_test.py b/tensorflow/tools/api/generator/doc_srcs_test.py
new file mode 100644
index 0000000000..9ba95a3439
--- /dev/null
+++ b/tensorflow/tools/api/generator/doc_srcs_test.py
@@ -0,0 +1,80 @@
+# Copyright 2018 The TensorFlow Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# =============================================================================
+"""Tests for tensorflow.tools.api.generator.doc_srcs."""
+
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import argparse
+import importlib
+import sys
+
+from tensorflow.python.platform import test
+from tensorflow.tools.api.generator import doc_srcs
+
+
+FLAGS = None
+
+
+class DocSrcsTest(test.TestCase):
+
+  def testModulesAreValidAPIModules(self):
+    for module_name in doc_srcs.TENSORFLOW_DOC_SOURCES:
+      # Convert module_name to corresponding __init__.py file path.
+      file_path = module_name.replace('.', '/')
+      if file_path:
+        file_path += '/'
+      file_path += '__init__.py'
+
+      if file_path not in FLAGS.outputs:
+        self.assertFalse('%s is not a valid API module' % module_name)
+
+  def testHaveDocstringOrDocstringModule(self):
+    for module_name, docsrc in doc_srcs.TENSORFLOW_DOC_SOURCES.items():
+      if docsrc.docstring and docsrc.docstring_module_name:
+        self.assertFalse(
+            '%s contains DocSource has both a docstring and a '
+            'docstring_module_name. '
+            'Only one of "docstring" or "docstring_module_name" should be set.'
+            % (module_name))
+
+  def testDocstringModulesAreValidModules(self):
+    for _, docsrc in doc_srcs.TENSORFLOW_DOC_SOURCES.items():
+      if docsrc.docstring_module_name:
+        doc_module_name = '.'.join([
+            FLAGS.package, docsrc.docstring_module_name])
+        if doc_module_name not in sys.modules:
+          self.assertFalse(
+              'docsources_module %s is not a valid module under %s.' %
+              (docsrc.docstring_module_name, FLAGS.package))
+
+
+if __name__ == '__main__':
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      'outputs', metavar='O', type=str, nargs='+',
+      help='create_python_api output files.')
+  parser.add_argument(
+      '--package', type=str,
+      help='Base package that imports modules containing the target tf_export '
+      'decorators.')
+  FLAGS, unparsed = parser.parse_known_args()
+
+  importlib.import_module(FLAGS.package)
+
+  # Now update argv, so that unittest library does not get confused.
+  sys.argv = [sys.argv[0]] + unparsed
+  test.main()
--
cgit v1.2.3


From e042e3e051d3bd6bfb63dfd4ad407a82f7d1dacc Mon Sep 17 00:00:00 2001
From: Anna R
Date: Tue, 12 Jun 2018 17:47:58 -0700
Subject: Remove unused tf_export import

---
 tensorflow/tools/api/generator/create_python_api.py | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tensorflow/tools/api/generator/create_python_api.py b/tensorflow/tools/api/generator/create_python_api.py
index 31f287b7fe..e3ab056efc 100644
--- a/tensorflow/tools/api/generator/create_python_api.py
+++ b/tensorflow/tools/api/generator/create_python_api.py
@@ -25,7 +25,6 @@ import os
 import sys

 from tensorflow.python.util import tf_decorator
-from tensorflow.python.util import tf_export
 from tensorflow.tools.api.generator import doc_srcs
--
cgit v1.2.3
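A small sketch of the docstring-resolution data added above (illustrative only): entries in `TENSORFLOW_DOC_SOURCES` either carry a literal docstring or point at the module whose docstring should be copied.

```python
from tensorflow.tools.api.generator import doc_srcs

src = doc_srcs.TENSORFLOW_DOC_SOURCES['math']
print(src.docstring)              # None: fall through to the source module
print(src.docstring_module_name)  # 'ops.math_ops'

# get_module_docstring('math', 'tensorflow.python') would therefore copy the
# docstring of tensorflow.python.ops.math_ops if that module is loaded, and
# otherwise fall back to 'Public API for tf.math namespace.'.
```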
From f055a9f2f21154140785b9da7c3b2eae88e65623 Mon Sep 17 00:00:00 2001
From: Brennan Saeta
Date: Tue, 12 Jun 2018 18:09:35 -0700
Subject: Check to ensure the Cloud TPU is ready before resolving.

Cherry picking this into the TF 1.9 release.

PiperOrigin-RevId: 200095692
Previous commit: 32c8013f0ab3feb139648ae759e2d0168fb5dc95
---
 .../python/training/tpu_cluster_resolver.py        |  3 ++
 .../python/training/tpu_cluster_resolver_test.py   | 44 ++++++++++++++++++++++
 2 files changed, 47 insertions(+)

diff --git a/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py b/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py
index 880fca4ea6..935ad5ff37 100644
--- a/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py
+++ b/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver.py
@@ -255,6 +255,9 @@ class TPUClusterResolver(ClusterResolver):
       request = self._service.projects().locations().nodes().get(name=full_name)
       response = request.execute()

+      if 'state' in response and response['state'] != 'READY':
+        raise RuntimeError('TPU "%s" is not yet ready; state: "%s"' %
+                           (self._tpu, response['state']))
       if 'health' in response and response['health'] != 'HEALTHY':
         raise RuntimeError('TPU "%s" is unhealthy: "%s"' %
                            (self._tpu, response['health']))
diff --git a/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver_test.py b/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver_test.py
index 5fac55fd02..7e002cc72f 100644
--- a/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver_test.py
+++ b/tensorflow/contrib/cluster_resolver/python/training/tpu_cluster_resolver_test.py
@@ -157,6 +157,50 @@ class TPUClusterResolverTest(test.TestCase):
       job { name: 'worker' tasks { key: 0 value: '10.1.2.3:8470' } }
     """
     self._verifyClusterSpecEquality(actual_cluster_spec, expected_proto)

+  @mock.patch.object(TPUClusterResolver, '_requestComputeMetadata',
+                     mock_request_compute_metadata)
+  def testUnhealthyCloudTpu(self):
+    tpu_map = {
+        'projects/test-project/locations/us-central1-c/nodes/test-tpu-1': {
+            'ipAddress': '10.1.2.3',
+            'port': '8470',
+            'health': 'UNHEALTHY'
+        }
+    }
+
+    tpu_cluster_resolver = TPUClusterResolver(
+        project=None,
+        zone=None,
+        tpu='test-tpu-1',
+        coordinator_name=None,
+        credentials=None,
+        service=self.mock_service_client(tpu_map=tpu_map))
+
+    with self.assertRaises(RuntimeError):
+      tpu_cluster_resolver.cluster_spec()
+
+  @mock.patch.object(TPUClusterResolver, '_requestComputeMetadata',
+                     mock_request_compute_metadata)
+  def testNotReadyCloudTpu(self):
+    tpu_map = {
+        'projects/test-project/locations/us-central1-c/nodes/test-tpu-1': {
+            'ipAddress': '10.1.2.3',
+            'port': '8470',
+            'state': 'CREATING'
+        }
+    }
+
+    tpu_cluster_resolver = TPUClusterResolver(
+        project=None,
+        zone=None,
+        tpu='test-tpu-1',
+        coordinator_name=None,
+        credentials=None,
+        service=self.mock_service_client(tpu_map=tpu_map))
+
+    with self.assertRaises(RuntimeError):
+      tpu_cluster_resolver.cluster_spec()
+
   def testSimpleSuccessfulRetrieval(self):
     tpu_map = {
--
cgit v1.2.3


From 9a087a42293be8342570039d2c6d329a0589b773 Mon Sep 17 00:00:00 2001
From: Nick Felt
Date: Wed, 13 Jun 2018 00:30:09 -0700
Subject: Update tensorboard dependency to 1.9.x

---
 tensorflow/tools/pip_package/setup.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py
index 97f625e7e9..92a1465cea 100644
--- a/tensorflow/tools/pip_package/setup.py
+++ b/tensorflow/tools/pip_package/setup.py
@@ -55,7 +55,7 @@ REQUIRED_PACKAGES = [
     'six >= 1.10.0',
     'protobuf >= 3.4.0',
     'setuptools <= 39.1.0',
-    'tensorboard >= 1.8.0, < 1.9.0',
+    'tensorboard >= 1.9.0, < 1.10.0',
     'termcolor >= 1.1.0',
 ]
--
cgit v1.2.3


From b1d0048f2be83d6c6f7e1be996ef9c8358922aa6 Mon Sep 17 00:00:00 2001
From: Pete Warden
Date: Wed, 13 Jun 2018 01:06:50 -0700
Subject: Documentation for Raspberry Pi installation

---
 tensorflow/docs_src/install/index.md            |   2 +
 tensorflow/docs_src/install/install_raspbian.md | 317 ++++++++++++++++++++++++
 2 files changed, 319 insertions(+)
 create mode 100644 tensorflow/docs_src/install/install_raspbian.md

diff --git a/tensorflow/docs_src/install/index.md b/tensorflow/docs_src/install/index.md
index 4f85383925..c2e5a991d4 100644
--- a/tensorflow/docs_src/install/index.md
+++ b/tensorflow/docs_src/install/index.md
@@ -6,6 +6,7 @@ operating systems:
   * macOS 10.12.6 (Sierra) or later.
   * Ubuntu 16.04 or later
   * Windows 7 or later.
+  * Raspbian 9.0 or later.

 Although you might be able to install TensorFlow on other laptop or desktop
 systems, we only support (and only fix issues in) the preceding configurations.
@@ -16,6 +17,7 @@ that enables you to write applications in Python:
   * @{$install_linux$Installing TensorFlow on Ubuntu}
   * @{$install_mac$Installing TensorFlow on macOS}
   * @{$install_windows$Installing TensorFlow on Windows}
+  * @{$install_raspbian$Installing TensorFlow on a Raspberry Pi}
   * @{$install_sources$Installing TensorFlow from Sources}

 Many aspects of the Python TensorFlow API changed from version 0.n to 1.0.
diff --git a/tensorflow/docs_src/install/install_raspbian.md b/tensorflow/docs_src/install/install_raspbian.md
new file mode 100644
index 0000000000..2f425162a1
--- /dev/null
+++ b/tensorflow/docs_src/install/install_raspbian.md
@@ -0,0 +1,317 @@
+# Installing TensorFlow on Raspbian
+
+This guide explains how to install TensorFlow on a Raspberry Pi running
+Raspbian. Although these instructions might also work on other Pi variants, we
+have only tested (and we only support) these instructions on machines meeting
+the following requirements:
+
+* Raspberry Pi devices running Raspbian 9.0 or higher
+
+## Determine how to install TensorFlow
+
+You must pick the mechanism by which you install TensorFlow. The supported
+choices are as follows:
+
+* "Native" pip.
+* Cross-compiling from sources.
+
+**We recommend pip installation.**
+
+## Installing with native pip
+
+We have uploaded the TensorFlow binaries to piwheels.org. Therefore, you can
+install TensorFlow through pip.
+
+The [REQUIRED_PACKAGES section of
+setup.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/setup.py)
+lists the packages that pip will install or upgrade.
+
+### Prerequisite: Python
+
+In order to install TensorFlow, your system must contain one of the following
+Python versions:
+
+* Python 2.7
+* Python 3.4+
+
+If your system does not already have one of the preceding Python versions,
+[install](https://wiki.python.org/moin/BeginnersGuide/Download) it now. It
+should already be included in your Raspbian installation, though, so no extra
+steps should be needed.
+
+### Prerequisite: pip
+
+[Pip](https://en.wikipedia.org/wiki/Pip_\(package_manager\)) installs and
+manages software packages written in Python. If you intend to install with
+native pip, then one of the following flavors of pip must be installed on your
+system:
+
+* `pip3`, for Python 3.n (preferred).
+* `pip`, for Python 2.7.
+
+`pip` or `pip3` was probably installed on your system when you installed
+Python. To determine whether pip or pip3 is actually installed on your system,
+issue one of the following commands:
+
$ pip3 -V # for Python 3.n
+$ pip -V  # for Python 2.7
+
+
+If it gives the error "Command not found", then the package has not been
+installed yet. To install it for the first time, run:
+
+
$ sudo apt-get install python3-pip # for Python 3.n
+sudo apt-get install python-pip # for Python 2.7
+ +You can find more help on installing and upgrading pip in +[the Raspberry Pi documentation](https://www.raspberrypi.org/documentation/linux/software/python.md). + +### Prerequisite: Atlas + +[Atlas](http://math-atlas.sourceforge.net/) is a linear algebra library that +numpy depends on, and so needs to be installed before TensorFlow. To add it to +your system, run the following command: + +
$ sudo apt install libatlas-base-dev
+ +### Install TensorFlow + +Assuming the prerequisite software is installed on your Pi, install TensorFlow +by invoking **one** of the following commands: + +
 $ pip3 install tensorflow     # Python 3.n
+     $ pip install tensorflow      # Python 2.7
+
+
+This can take some time on certain platforms like the Pi Zero, where some Python
+packages like scipy that TensorFlow depends on need to be compiled before the
+installation can complete. The Python 3 version will typically be faster to
+install because piwheels.org has pre-built versions of the dependencies
+available, so this is our recommended option.
+
+### Next Steps
+
+After installing TensorFlow, [validate your
+installation](#ValidateYourInstallation) to confirm that the installation worked
+properly.
+
+### Uninstalling TensorFlow
+
+To uninstall TensorFlow, issue one of the following commands:
+
$ pip uninstall tensorflow
+$ pip3 uninstall tensorflow 
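+# Run whichever command matches the pip you installed TensorFlow with.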
+
+
+## Cross-compiling from sources
+
+Cross-compilation means building on a different machine than the one you'll be
+deploying on. Since Raspberry Pis have only limited RAM and comparatively slow
+processors, and TensorFlow has a large amount of source code to compile, it's
+easier to use a macOS or Linux desktop or laptop to handle the build process.
+Because it can take over 24 hours to build on a Pi, and requires external swap
+space to cope with the memory shortage, we recommend using cross-compilation if
+you do need to compile TensorFlow from source. To make the dependency management
+process easier, we also recommend using Docker to help simplify building.
+
+Note that we provide well-tested, pre-built TensorFlow binaries for Raspbian
+systems. So, don't build a TensorFlow binary yourself unless you are very
+comfortable building complex packages from source and dealing with the
+inevitable aftermath should things not go exactly as documented.
+
+### Prerequisite: Docker
+
+Install Docker on your machine as described in the [Docker
+documentation](https://docs.docker.com/engine/installation/#/on-macos-and-windows).
+
+### Clone the TensorFlow repository
+
+Start the process of building TensorFlow by cloning a TensorFlow repository.
+
+To clone **the latest** TensorFlow repository, issue the following command:
+
$ git clone https://github.com/tensorflow/tensorflow 
+ +The preceding git clone command creates a subdirectory named +`tensorflow`. After cloning, you may optionally build a **specific branch** +(such as a release branch) by invoking the following commands: + +
+$ cd tensorflow
+$ git checkout Branch # where Branch is the desired branch
+
+ +For example, to work with the `r1.0` release instead of the master release, +issue the following command: + +
$ git checkout r1.0
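+# Release branches in the TensorFlow repository are named rX.Y.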
+ +### Build from source + +To compile TensorFlow and produce a binary pip can install, do the following: + +1. Start a terminal. +2. Navigate to the directory containing the tensorflow source code. +3. Run a command to cross-compile the library, for example: + +
$ CI_DOCKER_EXTRA_PARAMS="-e CI_BUILD_PYTHON=python3 -e CROSSTOOL_PYTHON_INCLUDE_PATH=/usr/include/python3.4" \
+tensorflow/tools/ci_build/ci_build.sh PI-PYTHON3 tensorflow/tools/ci_build/pi/build_raspberry_pi.sh
+ 
+ +This will build a pip .whl file for Python 3.4, with Arm v7 instructions that +will only work on the Pi models 2 or 3. These NEON instructions are required for +the fastest operation on those devices, but you can build a library that will +run across all Pi devices by passing `PI_ONE` at the end of the command line. +You can also target Python 2.7 by omitting the initial docker parameters. Here's +an example of building for Python 2.7 and Raspberry Pi model Zero or One +devices: + +
$ tensorflow/tools/ci_build/ci_build.sh PI tensorflow/tools/ci_build/pi/build_raspberry_pi.sh PI_ONE
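+# PI_ONE omits the NEON instructions, so the resulting binary also runs on the
+# Pi Zero and Pi One.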
+ +This will take some time to complete, typically twenty or thirty minutes, and +should produce a .whl file in an output-artifacts sub-folder inside your source +tree at the end. This wheel file can be installed through pip or pip3 (depending +on your Python version) by copying it to a Raspberry Pi and running a terminal +command like this (with the name of your actual file substituted): + +
$ pip3 install tensorflow-1.9.0-cp34-none-linux_armv7l.whl
+ +### Troubleshooting the build + +The build script uses Docker internally to create a Linux virtual machine to +handle the compilation. If you do have problems running the script, first check +that you're able to run Docker tests like `docker run hello-world` on your +system. + +If you're building from the latest development branch, try syncing to an older +version that's known to work, for example release 1.9, with a command like this: + +
$ git checkout r1.9
+
+
+<a name="ValidateYourInstallation"></a>
+## Validate your installation
+
+To validate your TensorFlow installation, do the following:
+
+1. Ensure that your environment is prepared to run TensorFlow programs.
+2. Run a short TensorFlow program.
+
+### Prepare your environment
+
+If you installed using native pip, do the following:
+
+1. Start a terminal.
+2. If you installed TensorFlow source code, navigate to any directory *except*
+   one containing TensorFlow source code.
+
+### Run a short TensorFlow program
+
+Invoke python from your shell as follows:
+
$ python
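+# If you installed TensorFlow for Python 3, invoke python3 instead.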
+ +Enter the following short program inside the python interactive shell: + +```python +# Python +import tensorflow as tf +hello = tf.constant('Hello, TensorFlow!') +sess = tf.Session() +print(sess.run(hello)) +``` + +If the system outputs the following, then you are ready to begin writing +TensorFlow programs: + +
Hello, TensorFlow!
+
+
+If you're running with Python 3.5, you may see a warning when you first import
+TensorFlow. This is not an error, and TensorFlow should continue to run with no
+problems, despite the log message.
+
+If the system outputs an error message instead of a greeting, see [Common
+installation problems](#common_installation_problems).
+
+If you are new to machine learning, we recommend the [Machine Learning Crash
+Course](https://developers.google.com/machine-learning/crash-course).
+
+If you are experienced with machine learning but new to TensorFlow, see
+@{$get_started/eager}.
+
+## Common installation problems
+
+We are relying on Stack Overflow to document TensorFlow installation problems
+and their remedies. The following table contains links to Stack Overflow answers
+for some common installation problems. If you encounter an error message or
+other installation problem not listed in the following table, search for it on
+Stack Overflow. If Stack Overflow doesn't show the error message, ask a new
+question about it on Stack Overflow and specify the `tensorflow` tag.
+
+<table>
+<tr> <th>Stack Overflow Link</th> <th>Error Message</th> </tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/42006320">42006320</a></td>
+  <td><pre>ImportError: Traceback (most recent call last):
+File ".../tensorflow/core/framework/graph_pb2.py", line 6, in &lt;module&gt;
+from google.protobuf import descriptor as _descriptor
+ImportError: cannot import name 'descriptor'</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/33623453">33623453</a></td>
+  <td><pre>IOError: [Errno 2] No such file or directory:
+  '/tmp/pip-o6Tpui-build/setup.py'</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/35190574">35190574</a></td>
+  <td><pre>SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify
+  failed</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/42009190">42009190</a></td>
+  <td><pre>
+  Installing collected packages: setuptools, protobuf, wheel, numpy, tensorflow
+  Found existing installation: setuptools 1.1.6
+  Uninstalling setuptools-1.1.6:
+  Exception:
+  ...
+  [Errno 1] Operation not permitted:
+  '/tmp/pip-a1DXRT-uninstall/.../lib/python/_markerlib'</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/33622019">33622019</a></td>
+  <td><pre>ImportError: No module named copyreg</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/37810228">37810228</a></td>
+  <td>During a pip install operation, the system returns:
+  <pre>OSError: [Errno 1] Operation not permitted</pre></td>
+</tr>
+
+<tr>
+  <td><a href="https://stackoverflow.com/q/33622842">33622842</a></td>
+  <td>An <tt>import tensorflow</tt> statement triggers an error such as the
+  following:
+  <pre>Traceback (most recent call last):
+  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
+  File "/usr/local/lib/python2.7/site-packages/tensorflow/__init__.py",
+    line 4, in &lt;module&gt;
+    from tensorflow.python import *
+    ...
+  File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/tensor_shape_pb2.py",
+    line 22, in &lt;module&gt;
+    serialized_pb=_b('\n,tensorflow/core/framework/tensor_shape.proto\x12\ntensorflow\"d\n\x10TensorShapeProto\x12-\n\x03\x64im\x18\x02
+      \x03(\x0b\x32
+      .tensorflow.TensorShapeProto.Dim\x1a!\n\x03\x44im\x12\x0c\n\x04size\x18\x01
+      \x01(\x03\x12\x0c\n\x04name\x18\x02 \x01(\tb\x06proto3')
+  TypeError: __init__() got an unexpected keyword argument 'syntax'</pre></td>
+</tr>
+</table>
-- cgit v1.2.3 From 76b8b01740233ff289d70a0d516c6e0ac0e6b042 Mon Sep 17 00:00:00 2001 From: Allen Lavoie Date: Mon, 11 Jun 2018 11:55:34 -0700 Subject: Use the Keras session for saving/loading in TensorFlow format Fixes issues when there's no default session PiperOrigin-RevId: 200088574 --- tensorflow/python/keras/engine/network.py | 10 ++++-- tensorflow/python/keras/engine/saving_test.py | 52 ++++++++++++++++++--------- 2 files changed, 44 insertions(+), 18 deletions(-) diff --git a/tensorflow/python/keras/engine/network.py b/tensorflow/python/keras/engine/network.py index 9dbf94a276..3d567b8378 100644 --- a/tensorflow/python/keras/engine/network.py +++ b/tensorflow/python/keras/engine/network.py @@ -20,6 +20,7 @@ from __future__ import division from __future__ import print_function import copy +import functools import json import os import weakref @@ -1264,7 +1265,11 @@ class Network(base_layer.Layer): with h5py.File(filepath, 'w') as f: saving.save_weights_to_hdf5_group(f, self.layers) else: - self._checkpointable_saver.save(filepath) + if context.executing_eagerly(): + session = None + else: + session = backend.get_session() + self._checkpointable_saver.save(filepath, session=session) def load_weights(self, filepath, by_name=False): """Loads all layer weights, either from a TensorFlow or an HDF5 weight file. @@ -1324,7 +1329,8 @@ class Network(base_layer.Layer): 'loading TensorFlow-formatted weights (got by_name=True to ' 'load_weights).') if not context.executing_eagerly(): - finalizer = status.run_restore_ops + session = backend.get_session() + finalizer = functools.partial(status.run_restore_ops, session=session) if self.built: finalizer() else: diff --git a/tensorflow/python/keras/engine/saving_test.py b/tensorflow/python/keras/engine/saving_test.py index 30bcd3d185..b5448a9be1 100644 --- a/tensorflow/python/keras/engine/saving_test.py +++ b/tensorflow/python/keras/engine/saving_test.py @@ -404,26 +404,27 @@ class TestWholeModelSaving(test.TestCase): os.remove(fname) def test_saving_lambda_numpy_array_arguments(self): - if h5py is None: - self.skipTest('h5py required to run this test') + with self.test_session(): + if h5py is None: + self.skipTest('h5py required to run this test') - mean = np.random.random((4, 2, 3)) - std = np.abs(np.random.random((4, 2, 3))) + 1e-5 - inputs = keras.layers.Input(shape=(4, 2, 3)) - output = keras.layers.Lambda(lambda image, mu, std: (image - mu) / std, - arguments={'mu': mean, 'std': std})(inputs) - model = keras.models.Model(inputs, output) - model.compile(loss='mse', optimizer='sgd', metrics=['acc']) + mean = np.random.random((4, 2, 3)) + std = np.abs(np.random.random((4, 2, 3))) + 1e-5 + inputs = keras.layers.Input(shape=(4, 2, 3)) + output = keras.layers.Lambda(lambda image, mu, std: (image - mu) / std, + arguments={'mu': mean, 'std': std})(inputs) + model = keras.models.Model(inputs, output) + model.compile(loss='mse', optimizer='sgd', metrics=['acc']) - fd, fname = tempfile.mkstemp('.h5') - keras.models.save_model(model, fname) + fd, fname = tempfile.mkstemp('.h5') + keras.models.save_model(model, fname) - model = keras.models.load_model(fname) - os.close(fd) - os.remove(fname) + model = keras.models.load_model(fname) + os.close(fd) + os.remove(fname) - self.assertAllClose(mean, model.layers[1].arguments['mu']) - self.assertAllClose(std, model.layers[1].arguments['std']) + self.assertAllClose(mean, model.layers[1].arguments['mu']) + self.assertAllClose(std, model.layers[1].arguments['std']) def test_saving_model_with_long_layer_names(self): 
if h5py is None: @@ -580,6 +581,25 @@ class TestWeightSavingAndLoadingTFFormat(test.TestCase): # Indirectly tests that the user is prompted model.save_weights(prefix, save_format='tensorflow', overwrite=False) + def test_no_default_session(self): + with ops.Graph().as_default(): + self.assertFalse(ops.get_default_session()) + data = np.random.random((1000, 32)).astype(np.float32) + labels = np.random.random((1000, 10)).astype(np.float32) + + model = keras.models.Sequential([ + keras.layers.Dense(10, activation='softmax'), + keras.layers.Dense(10, activation='softmax')]) + + model.compile(optimizer=training_module.RMSPropOptimizer(0.001), + loss='categorical_crossentropy', + metrics=['accuracy']) + + model.fit(data, labels) + fname = os.path.join(self.get_temp_dir(), 'weights', 'ckpt') + model.save_weights(fname) + model.load_weights(fname) + def test_no_graph_pollution(self): with context.graph_mode(): graph = ops.Graph() -- cgit v1.2.3 From 50ba6dd3a182c9578bc10cb2a21d7914a1e7bac1 Mon Sep 17 00:00:00 2001 From: Akshay Modi Date: Mon, 11 Jun 2018 10:42:15 -0700 Subject: Don't call back into python during insert (which will leave the set in a broken condition if the runtime decides to let another thread run). Thank you for finding the bug. The watched_variables_ set should not really require a lock since all our functions hold the GIL (verified by looking at the generated SWIG). The reason that there was a concurrent access to the set is that the insert was calling back into python (which might release the GIL and let another thread run, which will also attempt to insert a variable and break the set). I included the lock to be safe though, since its non-trivial to verify without looking at the generated swig wrappers that the GIL is held. PiperOrigin-RevId: 200074843 --- tensorflow/python/eager/pywrap_tfe_src.cc | 82 ++++++++++++++++--------------- 1 file changed, 43 insertions(+), 39 deletions(-) diff --git a/tensorflow/python/eager/pywrap_tfe_src.cc b/tensorflow/python/eager/pywrap_tfe_src.cc index e3ce0ef9d0..52b3268903 100644 --- a/tensorflow/python/eager/pywrap_tfe_src.cc +++ b/tensorflow/python/eager/pywrap_tfe_src.cc @@ -873,22 +873,6 @@ static tensorflow::DataType FastTensorDtype(PyObject* tensor) { return static_cast(id); } -static tensorflow::int64 FastHandleId(PyObject* variable) { - PyObject* handle = PyObject_GetAttrString(variable, "handle"); - if (handle == nullptr) { - return -1; - } - tensorflow::int64 id = FastTensorId(handle); - Py_DECREF(handle); - return id; -} - -struct CompareByHandleId { - bool operator()(PyObject* lhs, PyObject* rhs) { - return FastHandleId(lhs) < FastHandleId(rhs); - } -}; - class GradientTape : public tensorflow::eager::GradientTape { public: @@ -897,35 +881,63 @@ class GradientTape persistent) {} virtual ~GradientTape() { - for (PyObject* v : watched_variables_) { - Py_DECREF(v); + for (const IdAndVariable& v : watched_variables_) { + Py_DECREF(v.variable); } } void WatchVariable(PyObject* v) { - auto insert_result = watched_variables_.insert(v); - if (insert_result.second) { - // Only increment the reference count if we aren't already watching this - // variable. 
- Py_INCREF(v); - } - PyObject* handle = PyObject_GetAttrString(v, "handle"); + tensorflow::Safe_PyObjectPtr handle(PyObject_GetAttrString(v, "handle")); if (handle == nullptr) { return; } - tensorflow::int64 id = FastTensorId(handle); - Py_DECREF(handle); + tensorflow::int64 id = FastTensorId(handle.get()); + if (!PyErr_Occurred()) { this->Watch(id); } + + tensorflow::mutex_lock l(watched_variables_mu_); + auto insert_result = watched_variables_.emplace(id, v); + + if (insert_result.second) { + // Only increment the reference count if we aren't already watching this + // variable. + Py_INCREF(v); + } } - const std::set WatchedVariables() { - return watched_variables_; + PyObject* GetVariablesAsPyTuple() { + tensorflow::mutex_lock l(watched_variables_mu_); + PyObject* result = PyTuple_New(watched_variables_.size()); + Py_ssize_t pos = 0; + for (const IdAndVariable& id_and_variable : watched_variables_) { + PyTuple_SET_ITEM(result, pos++, id_and_variable.variable); + Py_INCREF(id_and_variable.variable); + } + return result; } private: - std::set watched_variables_; + // We store an IdAndVariable in the map since the map needs to be locked + // during insert, but should not call back into python during insert to avoid + // deadlocking with the GIL. + struct IdAndVariable { + tensorflow::int64 id; + PyObject* variable; + + IdAndVariable(tensorflow::int64 id, PyObject* variable) + : id(id), variable(variable) {} + }; + struct CompareById { + bool operator()(const IdAndVariable& lhs, const IdAndVariable& rhs) { + return lhs.id < rhs.id; + } + }; + + tensorflow::mutex watched_variables_mu_; + std::set watched_variables_ + GUARDED_BY(watched_variables_mu_); }; typedef struct { @@ -1217,15 +1229,7 @@ void TFE_Py_TapeSetWatchVariable(PyObject* variable) { } PyObject* TFE_Py_TapeWatchedVariables(PyObject* tape) { - const auto& watched_variables = - reinterpret_cast(tape)->tape->WatchedVariables(); - PyObject* result = PyTuple_New(watched_variables.size()); - Py_ssize_t pos = 0; - for (PyObject* variable : watched_variables) { - PyTuple_SET_ITEM(result, pos++, variable); - Py_INCREF(variable); - } - return result; + return reinterpret_cast(tape)->tape->GetVariablesAsPyTuple(); } namespace { -- cgit v1.2.3 From ec769c7ec368adf90aaa0b6d2a97525da14e1a37 Mon Sep 17 00:00:00 2001 From: Akshay Modi Date: Mon, 11 Jun 2018 16:27:12 -0700 Subject: Remove memory leak in read variable call, and record gradient call. Fix #19385 PiperOrigin-RevId: 200132949 --- tensorflow/python/eager/pywrap_tfe_src.cc | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tensorflow/python/eager/pywrap_tfe_src.cc b/tensorflow/python/eager/pywrap_tfe_src.cc index 52b3268903..6c9481c3af 100644 --- a/tensorflow/python/eager/pywrap_tfe_src.cc +++ b/tensorflow/python/eager/pywrap_tfe_src.cc @@ -1873,6 +1873,8 @@ PyObject* RecordGradient(PyObject* op_name, PyObject* inputs, PyObject* attrs, delete backward_function; }); + Py_DECREF(num_inputs); + Py_RETURN_NONE; } @@ -1931,8 +1933,10 @@ bool ReadVariableOp(const FastPathOpExecInfo& parent_op_exec_info, Py_INCREF(output->get()); // stay alive after since tuple steals. 
PyTuple_SET_ITEM(outputs.get(), 0, output->get()); - if (!RecordGradient(GetPythonObjectFromString("ReadVariableOp"), - inputs.get(), Py_None, outputs.get(), Py_None)) { + tensorflow::Safe_PyObjectPtr op_string( + GetPythonObjectFromString("ReadVariableOp")); + if (!RecordGradient(op_string.get(), inputs.get(), Py_None, outputs.get(), + Py_None)) { return false; } } -- cgit v1.2.3 From c77fead531bc3756d765ba90e2e549abd7adf320 Mon Sep 17 00:00:00 2001 From: Brennan Saeta Date: Wed, 13 Jun 2018 15:46:12 -0700 Subject: Make GCS ops work in open source --- tensorflow/contrib/cloud/__init__.py | 5 +++-- tensorflow/contrib/cloud/kernels/BUILD | 1 + tensorflow/core/platform/cloud/gcs_file_system.cc | 4 +++- tensorflow/core/platform/default/build_config.bzl | 2 ++ 4 files changed, 9 insertions(+), 3 deletions(-) diff --git a/tensorflow/contrib/cloud/__init__.py b/tensorflow/contrib/cloud/__init__.py index a6e13ea3ae..ef7aa7624c 100644 --- a/tensorflow/contrib/cloud/__init__.py +++ b/tensorflow/contrib/cloud/__init__.py @@ -27,8 +27,9 @@ from tensorflow.python.util.all_util import remove_undocumented _allowed_symbols = [ 'BigQueryReader', - 'ConfigureColabSession', - 'ConfigureGcs', + 'BlockCacheParams', + 'configure_colab_session', + 'configure_gcs', 'ConfigureGcsHook', ] remove_undocumented(__name__, _allowed_symbols) diff --git a/tensorflow/contrib/cloud/kernels/BUILD b/tensorflow/contrib/cloud/kernels/BUILD index 40160706f7..1311063ec0 100644 --- a/tensorflow/contrib/cloud/kernels/BUILD +++ b/tensorflow/contrib/cloud/kernels/BUILD @@ -79,6 +79,7 @@ tf_kernel_library( srcs = ["gcs_config_ops.cc"], visibility = ["//tensorflow:internal"], deps = [ + "//tensorflow/contrib/cloud:gcs_config_ops_op_lib", "//tensorflow/core:framework", "//tensorflow/core:lib", "//tensorflow/core/platform/cloud:curl_http_request", diff --git a/tensorflow/core/platform/cloud/gcs_file_system.cc b/tensorflow/core/platform/cloud/gcs_file_system.cc index 22ae6121e0..803b08f1a3 100644 --- a/tensorflow/core/platform/cloud/gcs_file_system.cc +++ b/tensorflow/core/platform/cloud/gcs_file_system.cc @@ -804,7 +804,9 @@ void GcsFileSystem::ResetFileBlockCache(size_t block_size_bytes, mutex_lock l(block_cache_lock_); file_block_cache_ = MakeFileBlockCache(block_size_bytes, max_bytes, max_staleness_secs); - stats_->Configure(this, &throttle_, file_block_cache_.get()); + if (stats_) { + stats_->Configure(this, &throttle_, file_block_cache_.get()); + } } // A helper function to build a FileBlockCache for GcsFileSystem. diff --git a/tensorflow/core/platform/default/build_config.bzl b/tensorflow/core/platform/default/build_config.bzl index 9e52ba344a..f12732b434 100644 --- a/tensorflow/core/platform/default/build_config.bzl +++ b/tensorflow/core/platform/default/build_config.bzl @@ -633,6 +633,7 @@ def tf_additional_cloud_op_deps(): "//tensorflow:with_gcp_support_ios_override": [], "//tensorflow:with_gcp_support": [ "//tensorflow/contrib/cloud:bigquery_reader_ops_op_lib", + "//tensorflow/contrib/cloud:gcs_config_ops_op_lib", ], "//conditions:default": [], }) @@ -645,6 +646,7 @@ def tf_additional_cloud_kernel_deps(): "//tensorflow:with_gcp_support_ios_override": [], "//tensorflow:with_gcp_support": [ "//tensorflow/contrib/cloud/kernels:bigquery_reader_ops", + "//tensorflow/contrib/cloud/kernels:gcs_config_ops", ], "//conditions:default": [], }) -- cgit v1.2.3 From f9a44a69c35dcf7f1c0f42e1ae9971bae0148099 Mon Sep 17 00:00:00 2001 From: Brennan Saeta Date: Wed, 13 Jun 2018 18:05:39 -0700 Subject: Update the docs and api_def. 
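For context, a hedged sketch of how these ops are driven from the Python
wrappers exported by `tf.contrib.cloud` (the exact `configure_gcs` signature
and `BlockCacheParams` fields are assumptions based on the symbols exported in
the previous commit):

```python
import tensorflow as tf

with tf.Session() as sess:
  # Reconfigure the GCS block cache; the size below is illustrative only.
  tf.contrib.cloud.configure_gcs(
      sess,
      block_cache=tf.contrib.cloud.BlockCacheParams(
          max_bytes=256 * 1024 * 1024))
```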
---
 tensorflow/contrib/cloud/ops/gcs_config_ops.cc     | 42 ++--------------------
 .../base_api/api_def_GcsConfigureBlockCache.pbtxt  |  9 +++++
 .../base_api/api_def_GcsConfigureCredentials.pbtxt | 33 +++++++++++++++++
 3 files changed, 44 insertions(+), 40 deletions(-)
 create mode 100644 tensorflow/core/api_def/base_api/api_def_GcsConfigureBlockCache.pbtxt
 create mode 100644 tensorflow/core/api_def/base_api/api_def_GcsConfigureCredentials.pbtxt

diff --git a/tensorflow/contrib/cloud/ops/gcs_config_ops.cc b/tensorflow/contrib/cloud/ops/gcs_config_ops.cc
index 9cf85f5f18..5e31a15498 100644
--- a/tensorflow/contrib/cloud/ops/gcs_config_ops.cc
+++ b/tensorflow/contrib/cloud/ops/gcs_config_ops.cc
@@ -21,50 +21,12 @@ namespace tensorflow {
 
 REGISTER_OP("GcsConfigureCredentials")
     .Input("json: string")
-    .SetShapeFn(shape_inference::NoOutputs)
-    .Doc(R"doc(
-Configures the credentials used by the GCS client of the local TF runtime.
-
-The json input can be of the format:
-
-1. Refresh Token:
-{
-  "client_id": "",
-  "client_secret": "",
-  "refresh_token: "",
-  "type": "authorized_user",
-}
-
-2. Service Account:
-{
-  "type": "service_account",
-  "project_id": "",
-  "private_key_id": "",
-  "private_key": "------BEGIN PRIVATE KEY-----\n\n-----END PRIVATE KEY------\n",
-  "client_email": "@.iam.gserviceaccount.com",
-  "client_id": "",
-  # Some additional fields elided
-}
-
-Note the credentials established through this method are shared across all
-sessions run on this runtime.
-
-Note be sure to feed the inputs to this op to ensure the credentials are not
-stored in a constant op within the graph that might accidentally be checkpointed
-or in other ways be persisted or exfiltrated.
-)doc");
+    .SetShapeFn(shape_inference::NoOutputs);
 
 REGISTER_OP("GcsConfigureBlockCache")
     .Input("max_cache_size: uint64")
    .Input("block_size: uint64")
    .Input("max_staleness: uint64")
-    .SetShapeFn(shape_inference::NoOutputs)
-    .Doc(R"doc(
-Re-configures the GCS block cache with the new configuration values.
-
-If the values are the same as already configured values, this op is a no-op. If
-they are different, the current contents of the block cache is dropped, and a
-new block cache is created fresh.
-)doc");
+    .SetShapeFn(shape_inference::NoOutputs);
 
 }  // namespace tensorflow
diff --git a/tensorflow/core/api_def/base_api/api_def_GcsConfigureBlockCache.pbtxt b/tensorflow/core/api_def/base_api/api_def_GcsConfigureBlockCache.pbtxt
new file mode 100644
index 0000000000..9d32940c64
--- /dev/null
+++ b/tensorflow/core/api_def/base_api/api_def_GcsConfigureBlockCache.pbtxt
@@ -0,0 +1,9 @@
+op {
+  graph_op_name: "GcsConfigureBlockCache"
+  summary: "Re-configures the GCS block cache with the new configuration values."
+  description: <<END
+If the values are the same as already configured values, this op is a no-op. If
+they are different, the current contents of the block cache is dropped, and a
+new block cache is created fresh.
+END
+}
diff --git a/tensorflow/core/api_def/base_api/api_def_GcsConfigureCredentials.pbtxt b/tensorflow/core/api_def/base_api/api_def_GcsConfigureCredentials.pbtxt
new file mode 100644
--- /dev/null
+++ b/tensorflow/core/api_def/base_api/api_def_GcsConfigureCredentials.pbtxt
@@ -0,0 +1,33 @@
+op {
+  graph_op_name: "GcsConfigureCredentials"
+  summary: "Configures the credentials used by the GCS client of the local TF runtime."
+  description: <<END0
+The json input can be of the format:
+
+1. Refresh Token:
+{
+  "client_id": "",
+  "client_secret": "",
+  "refresh_token: "",
+  "type": "authorized_user",
+}
+
+2. Service Account:
+{
+  "type": "service_account",
+  "project_id": "",
+  "private_key_id": "",
+  "private_key": "------BEGIN PRIVATE KEY-----\n\n-----END PRIVATE KEY------\n",
+  "client_email": "@.iam.gserviceaccount.com",
+  "client_id": "",
+  # Some additional fields elided
+}
+
+Note the credentials established through this method are shared across all
+sessions run on this runtime.
+
+Note be sure to feed the inputs to this op to ensure the credentials are not
+stored in a constant op within the graph that might accidentally be checkpointed
+or in other ways be persisted or exfiltrated.
+END0 +} -- cgit v1.2.3 From ea3bdbc7ea72e488566326aeb446681a557f4334 Mon Sep 17 00:00:00 2001 From: Michael Case Date: Thu, 14 Jun 2018 06:17:00 -0700 Subject: Update version strings for 1.9.0-rc1. --- tensorflow/core/public/version.h | 2 +- tensorflow/docs_src/install/install_c.md | 2 +- tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 22 +++++++++++----------- tensorflow/docs_src/install/install_linux.md | 18 +++++++++--------- tensorflow/docs_src/install/install_mac.md | 10 +++++----- tensorflow/docs_src/install/install_sources.md | 4 ++-- tensorflow/tools/pip_package/setup.py | 2 +- 8 files changed, 31 insertions(+), 31 deletions(-) diff --git a/tensorflow/core/public/version.h b/tensorflow/core/public/version.h index cb1fd09dbb..9e5e747557 100644 --- a/tensorflow/core/public/version.h +++ b/tensorflow/core/public/version.h @@ -24,7 +24,7 @@ limitations under the License. // TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1", // "-beta", "-rc", "-rc.1") -#define TF_VERSION_SUFFIX "-rc0" +#define TF_VERSION_SUFFIX "-rc1" #define TF_STR_HELPER(x) #x #define TF_STR(x) TF_STR_HELPER(x) diff --git a/tensorflow/docs_src/install/install_c.md b/tensorflow/docs_src/install/install_c.md index 2901848745..2f81ae0c40 100644 --- a/tensorflow/docs_src/install/install_c.md +++ b/tensorflow/docs_src/install/install_c.md @@ -38,7 +38,7 @@ enable TensorFlow for C: OS="linux" # Change to "darwin" for macOS TARGET_DIRECTORY="/usr/local" curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc1.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 55bc0f64e7..1c03dd223e 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -38,7 +38,7 @@ steps to install this library and enable TensorFlow for Go: TF_TYPE="cpu" # Change to "gpu" for GPU support TARGET_DIRECTORY='/usr/local' curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc1.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index b3b739212e..c73e2f4281 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -36,7 +36,7 @@ following to the project's `pom.xml` to use the TensorFlow Java APIs: org.tensorflow tensorflow - 1.9.0-rc0 + 1.9.0-rc1 ``` @@ -65,7 +65,7 @@ As an example, these steps will create a Maven project that uses TensorFlow: org.tensorflow tensorflow - 1.9.0-rc0 + 1.9.0-rc1 @@ -124,12 +124,12 @@ instead: org.tensorflow libtensorflow - 1.9.0-rc0 + 1.9.0-rc1 org.tensorflow libtensorflow_jni_gpu - 1.9.0-rc0 + 1.9.0-rc1 ``` @@ -148,7 +148,7 @@ refer to the simpler instructions above instead. Take the following steps to install TensorFlow for Java on Linux or macOS: 1. 
Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc1.jar), which is the TensorFlow Java Archive (JAR). 2. Decide whether you will run TensorFlow for Java on CPU(s) only or with @@ -167,7 +167,7 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: OS=$(uname -s | tr '[:upper:]' '[:lower:]') mkdir -p ./jni curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc1.tar.gz" | tar -xz -C ./jni ### Install on Windows @@ -175,10 +175,10 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: Take the following steps to install TensorFlow for Java on Windows: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc1.jar), which is the TensorFlow Java Archive (JAR). 2. Download the following Java Native Interface (JNI) file appropriate for - [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc0.zip). + [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc1.zip). 3. Extract this .zip file. @@ -227,7 +227,7 @@ must be part of your `classpath`. For example, you can include the downloaded `.jar` in your `classpath` by using the `-cp` compilation flag as follows: -
javac -cp libtensorflow-1.9.0-rc0.jar HelloTF.java
+
javac -cp libtensorflow-1.9.0-rc1.jar HelloTF.java
### Running @@ -241,11 +241,11 @@ two files are available to the JVM: For example, the following command line executes the `HelloTF` program on Linux and macOS X: -
java -cp libtensorflow-1.9.0-rc0.jar:. -Djava.library.path=./jni HelloTF
+
java -cp libtensorflow-1.9.0-rc1.jar:. -Djava.library.path=./jni HelloTF
And the following command line executes the `HelloTF` program on Windows: -
java -cp libtensorflow-1.9.0-rc0.jar;. -Djava.library.path=jni HelloTF
+
java -cp libtensorflow-1.9.0-rc1.jar;. -Djava.library.path=jni HelloTF
If the program prints Hello from version, you've successfully installed TensorFlow for Java and are ready to use the API. If the program diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index 2ecab808c4..9baf6870be 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -438,7 +438,7 @@ Take the following steps to install TensorFlow in an Anaconda environment:
      (tensorflow)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+ https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp34-cp34m-linux_x86_64.whl ## Validate your installation @@ -684,14 +684,14 @@ This section documents the relevant values for Linux installations. CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp27-none-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp27-none-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -703,14 +703,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp34-cp34m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp34-cp34m-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -722,14 +722,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp35-cp35m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp35-cp35m-linux_x86_64.whl
 
@@ -741,14 +741,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp36-cp36m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp36-cp36m-linux_x86_64.whl
 
diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index 9d01271c5a..693254f876 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -119,7 +119,7 @@ Take the following steps to install TensorFlow with Virtualenv: TensorFlow in the active Virtualenv is as follows:
 $ pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl If you encounter installation problems, see [Common Installation Problems](#common-installation-problems). @@ -242,7 +242,7 @@ take the following steps: issue the following command:
 $ sudo pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl 
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl If the preceding command fails, see [installation problems](#common-installation-problems). @@ -350,7 +350,7 @@ Take the following steps to install TensorFlow in an Anaconda environment: TensorFlow for Python 2.7:
 (targetDirectory)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py2-none-any.whl @@ -522,7 +522,7 @@ The value you specify depends on your Python version.
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py2-none-any.whl
 
@@ -530,5 +530,5 @@ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-a
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl
 
diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index d25e641cee..70e97cf556 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -328,10 +328,10 @@ Invoke `pip install` to install that pip package. The filename of the `.whl` file depends on your platform. For example, the following command will install the pip package -for TensorFlow 1.9.0rc0 on Linux: +for TensorFlow 1.9.0rc1 on Linux:
-$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc0-py2-none-any.whl
+$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc1-py2-none-any.whl
 
## Validate your installation diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py index 92a1465cea..eb2e359ee5 100644 --- a/tensorflow/tools/pip_package/setup.py +++ b/tensorflow/tools/pip_package/setup.py @@ -45,7 +45,7 @@ DOCLINES = __doc__.split('\n') # This version string is semver compatible, but incompatible with pip. # For pip, we will remove all '-' characters from this string, and use the # result for pip. -_VERSION = '1.9.0-rc0' +_VERSION = '1.9.0-rc1' REQUIRED_PACKAGES = [ 'absl-py >= 0.1.6', -- cgit v1.2.3 From f5ee4df50af4041dc0063d0adc31c7a6eebdbcd3 Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Fri, 8 Jun 2018 15:47:19 -0700 Subject: Copy edits to Keras guide, formatting, moving some things around. Make the right TOC nav more useful. PiperOrigin-RevId: 199863216 --- tensorflow/docs_src/programmers_guide/keras.md | 870 +++++++++++-------------- 1 file changed, 389 insertions(+), 481 deletions(-) diff --git a/tensorflow/docs_src/programmers_guide/keras.md b/tensorflow/docs_src/programmers_guide/keras.md index 6a9df12a25..c6aca7ebf4 100644 --- a/tensorflow/docs_src/programmers_guide/keras.md +++ b/tensorflow/docs_src/programmers_guide/keras.md @@ -1,334 +1,304 @@ # Keras -## What's Keras? - -Keras is a high-level API specification for building and training deep learning -models, suitable for fast prototyping, advanced research, and production. -It offers three key advantages: - -- **User friendliness.** Keras follows best practices for reducing - cognitive load: it offers consistent & simple interfaces, - it minimizes the number of user actions required for common use cases, - and it provides clear and actionable feedback upon user error. -- **Modularity and composability.** A Keras model is composed of - fully-configurable building blocks that can be plugged together - with as few restrictions as possible -- like Lego bricks. -- **Easy extensibility.** You can easily write your own building blocks - (such as new layers, new loss functions, new models where you write - the forward pass from scratch). This allows for total expressiveness, - making Keras suitable for advanced research. - - -## What's tf.keras? - -`tf.keras` is TensorFlow's implementation of the Keras API specification, that -serves as the TensorFlow high-level API: it's how you build models in TensorFlow. -`tf.keras` seamlessly integrates with the rest of the TensorFlow API -(such as `tf.data` input pipelines), bringing you the full power and flexibility -of TensorFlow through an easy-to-use interface. - -You can import `tf.keras` via: +Keras is a high-level API to build and train deep learning models. It's used for +fast prototyping, advanced research, and production, with three key advantages: + +- *User friendly*
+ Keras has a simple, consistent interface optimized for common use cases. It + provides clear and actionable feedback for user errors. +- *Modular and composable*
+ Keras models are made by connecting configurable building blocks together, + with few restrictions. +- *Easy to extend*
Write custom building blocks to express new ideas for + research. Create new layers, loss functions, and develop state-of-the-art + models. + +## Import tf.keras + +`tf.keras` is TensorFlow's implementation of the +[Keras API specification](https://keras.io){:.external}. This is a high-level +API to build and train models that includes first-class support for +TensorFlow-specific functionality, such as [eager execution](#eager_execution), +`tf.data` pipelines, and [Estimators](/programmers_guide/estimators). +`tf.keras` makes TensorFlow easier to use without sacrificing flexibility and +performance. + +To get started, import `tf.keras` as part of your TensorFlow program setup: ```python +import tensorflow as tf from tensorflow import keras ``` -What follows is a quick introduction to the basics of `tf.keras`. +`tf.keras` can run any Keras-compatible code, but keep in mind: +* The `tf.keras` version in the latest TensorFlow release might not be the same + as the latest `keras` version from PyPI. Check `tf.keras.__version__`. +* When [saving a model's weights](#weights_only), `tf.keras` defaults to the + [checkpoint format](/get_started/checkpoints). Pass `save_format='h5'` to use + HDF5. -## Table of contents +## Build a simple model -- [Getting started: the Sequential model](#getting-started-the-sequential-model) -- [Configuring layers](#configuring-layers) -- [Configuring training](#configuring-training) -- [Training and evaluation](#training-and-evaluation) -- [Building advanced models: the functional API](#building-advanced-models-the-functional-api) -- [Building fully-customizable research models: the Model subclassing API](#building-fully-customizable-research-models-the-model-subclassing-api) -- [Callbacks](#callbacks) -- [Saving and serialization](#saving-and-serialization) -- [Developing custom layers](#developing-custom-layers) -- [Eager execution](#eager-execution) -- [Further reading](#further-reading) -- [FAQ](#faq) +### Sequential model +In Keras, you assemble *layers* to build *models*. A model is (usually) a graph +of layers. The most common type of model is a stack of layers: the +`tf.keras.Sequential` model. ---- - -## Getting started: the Sequential model - -In `tf.keras`, you're assembling together **layers** to build **models**. -A model is generally a graph of layers. -The most common type of model is just a stack of layers: the `Sequential` class. - -Here's how to build a simple fully-connected network (multi-layer perceptron): +To build a simple, fully-connected network (i.e. multi-layer perceptron): ```python -from tensorflow import keras -from tensorflow.keras import layers - model = keras.Sequential() -# This adds to the model a densely-connected layer with 64 units: -model.add(Dense(64, activation='relu')) -# Another one: -model.add(Dense(64, activation='relu')) -# This adds a softmax layer with 10 output units: -model.add(Dense(10, activation='softmax')) +# Adds a densely-connected layer with 64 units to the model: +model.add(keras.layers.Dense(64, activation='relu')) +# Add another: +model.add(keras.layers.Dense(64, activation='relu')) +# Add a softmax layer with 10 output units: +model.add(keras.layers.Dense(10, activation='softmax')) ``` ---- - -## Configuring layers - -Each layer may have unique constructor arguments, but some common arguments include: +### Configure the layers -- `activation`: the activation function to be used. - It could be specified by name, as a string (for built-in functions) - or as a callable object. 
By default, no activation is applied. -- `kernel_initializer` and `bias_initializer`: the initialization schemes to use - to create the layer's weights (kernel and bias). - Likewise, they may be passed either by name or by specifying a callable. - By default, the "Glorot uniform" initializer is used. -- `kernel_regularizer` and `bias_regularizer`: the regularization schemes to - apply to the layer's weights (kernel and bias), such as L1 - or L2 regularization. By default, no regularization is applied. +There are many `tf.keras.layers` available with some common constructor +parameters: +* `activation`: Set the activation function for the layer. This parameter is + specified by the name of a built-in function or as a callable object. By + default, no activation is applied. +* `kernel_initializer` and `bias_initializer`: The initialization schemes + that create the layer's weights (kernel and bias). This parameter is a name or + a callable object. This defaults to the `"Glorot uniform"` initializer. +* `kernel_regularizer` and `bias_regularizer`: The regularization schemes + that apply the layer's weights (kernel and bias), such as L1 or L2 + regularization. By default, no regularization is applied. -### Examples +The following instantiates `tf.keras.layers.Dense` layers using constructor +arguments: ```python -import tensorflow as tf -from tensorflow.keras.layers import Dense -from tensorflow.keras import regularizers -from tensorflow.keras import initializers - -# A sigmoid layer: -Dense(64, activation='sigmoid') -# Another way to define the same sigmoid layer: -Dense(64, activation=tf.sigmoid) - -# A linear layer with L1 regularization of factor 0.01 -# applied to the kernel matrix: -Dense(64, kernel_regularizer=regularizers.l1(0.01)) -# A linear layer with L2 regularization of factor 0.01 -# applied to the bias vector: -Dense(64, bias_regularizer=regularizers.l2(0.01)) +# Create a sigmoid layer: +layers.Dense(64, activation='sigmoid') +# Or: +layers.Dense(64, activation=tf.sigmoid) + +# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix: +layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01)) +# A linear layer with L2 regularization of factor 0.01 applied to the bias vector: +layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01)) # A linear layer with a kernel initialized to a random orthogonal matrix: -Dense(64, kernel_initializer='orthogonal') +layers.Dense(64, kernel_initializer='orthogonal') # A linear layer with a bias vector initialized to 2.0s: -Dense(64, bias_initializer=initializers.constant(2.0)) +layers.Dense(64, bias_initializer=keras.initializers.constant(2.0)) ``` ---- +## Train and evaluate -## Configuring training +### Set up training -Once your model looks good, configure its learning process by calling `compile`: +After the model is constructed, configure its learning process by calling the +`compile` method: ```python -import tensorflow as tf - model.compile(optimizer=tf.train.AdamOptimizer(0.001), loss='categorical_crossentropy', metrics=['accuracy']) ``` -There are three key arguments that you need to specify: +`tf.keras.Model.compile` takes three important arguments: -- An `optimizer`: this object specifies the training procedure. 
- We recommend that you pass instances of optimizers from the `tf.train` module - (such as [`AdamOptimizer`](https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer), - [`RMSPropOptimizer`](https://www.tensorflow.org/api_docs/python/tf/train/RMSPropOptimizer), - or [`GradientDescentOptimizer`](https://www.tensorflow.org/api_docs/python/tf/train/GradientDescentOptimizer)). -- A `loss` function to minimize: this specifies the optimization objective. - Common choices include mean square error (`mse`), `categorical_crossentropy` - and `binary_crossentropy`. Loss functions may be specified by name - or by passing a callable (e.g. from the `tf.keras.losses` module). -- Some `metrics` to monitor during training: again, you can pass these as either - string names or callables (e.g. from the `tf.keras.metrics` module). +* `optimizer`: This object specifies the training procedure. Pass it optimizer + instances from the `tf.train` module, such as + [`AdamOptimizer`](/api_docs/python/tf/train/AdamOptimizer), + [`RMSPropOptimizer`](/api_docs/python/tf/train/RMSPropOptimizer), or + [`GradientDescentOptimizer`](/api_docs/python/tf/train/GradientDescentOptimizer). +* `loss`: The function to minimize during optimization. Common choices include + mean square error (`mse`), `categorical_crossentropy`, and + `binary_crossentropy`. Loss functions are specified by name or by + passing a callable object from the `tf.keras.losses` module. +* `metrics`: Used to monitor training. These are string names or callables from + the `tf.keras.metrics` module. - -### Examples +The following shows a few examples of configuring a model for training: ```python -# Configures a model to do mean-squared error regression. +# Configure a model for mean-squared error regression. model.compile(optimizer=tf.train.AdamOptimizer(0.01), - loss='mse', # mean squared error + loss='mse', # mean squared error metrics=['mae']) # mean absolute error -``` -```python -# Configures a model to do categorical classification. + +# Configure a model for categorical classification. model.compile(optimizer=tf.train.RMSPropOptimizer(0.01), - loss=tf.keras.losses.categorical_crossentropy, - metrics=[tf.keras.metrics.categorical_accuracy]) + loss=keras.losses.categorical_crossentropy, + metrics=[keras.metrics.categorical_accuracy]) ``` ---- - -## Training and evaluation +### Input NumPy data -### From Numpy data - -When running locally on small datasets, the easiest way to do training and -evaluation is to pass data to your model as Numpy arrays of inputs and targets. -You can "fit" your model to some training data using the `model.fit()` method: +For small datasets, use in-memory [NumPy](https://www.numpy.org/){:.external} +arrays to train and evaluate a model. The model is "fit" to the training data +using the `fit` method: ```python import numpy as np -data = np.random.random(shape=(1000, 32)) -targets = np.random.random(shape=(1000, 10)) +data = np.random.random((1000, 32)) +labels = np.random.random((1000, 10)) -model.fit(data, targets, epochs=10, batch_size=32) +model.fit(data, labels, epochs=10, batch_size=32) ``` -Here are some key arguments you can pass to the `fit` method: - -- `epochs`: Training is structured into **epochs**. An epoch is one iteration - over the entire input data (which is done in smaller batches). -- `batch_size`: when passing Numpy data, the model will slice the data into - smaller batches and iterate over these batches during training. 
- This integer specifies the size of each batch - (the last batch may be smaller if the total number of samples is not - divisible by the batch size). -- `validation_data`: when prototyping a model, you want to be able to quickly - monitor its performance on some validation data. - When you pass this argument (it expects a tuple of inputs and targets), - the model will display the loss and metrics in inference mode on the data - you passed, at the end of each epoch. +`tf.keras.Model.fit` takes three important arguments: + +* `epochs`: Training is structured into *epochs*. An epoch is one iteration over + the entire input data (this is done in smaller batches). +* `batch_size`: When passed NumPy data, the model slices the data into smaller + batches and iterates over these batches during training. This integer + specifies the size of each batch. Be aware that the last batch may be smaller + if the total number of samples is not divisible by the batch size. +* `validation_data`: When prototyping a model, you want to easily monitor its + performance on some validation data. Passing this argument—a tuple of inputs + and labels—allows the model to display the loss and metrics in inference mode + for the passed data, at the end of each epoch. Here's an example using `validation_data`: ```python import numpy as np -data = np.random.random(shape=(1000, 32)) -targets = np.random.random(shape=(1000, 10)) +data = np.random.random((1000, 32)) +labels = np.random.random((1000, 10)) -val_data = np.random.random(shape=(100, 32)) -val_targets = np.random.random(shape=(100, 10)) +val_data = np.random.random((100, 32)) +val_labels = np.random.random((100, 10)) -model.fit(data, targets, epochs=10, batch_size=32, - validation_data=(val_data, val_targets)) +model.fit(data, labels, epochs=10, batch_size=32, + validation_data=(val_data, val_labels)) ``` -### From tf.data datasets +### Input tf.data datasets -When you need to scale to large datasets or multi-device training, -training from Numpy arrays in memory will not be ideal. -In such cases, you should use [the `tf.data` API](https://www.tensorflow.org/programmers_guide/datasets). -You can pass a `tf.data.Dataset` instance to the `fit` method: +Use the [Datasets API](/programmers_guide/datasets) to scale to large datasets +or multi-device training. Pass a `tf.data.Dataset` instance to the `fit` +method: ```python -import tensorflow as tf - # Instantiates a toy dataset instance: -dataset = tf.data.Dataset.from_tensor_slices((data, targets)).batch(32) +dataset = tf.data.Dataset.from_tensor_slices((data, labels)) +dataset = dataset.batch(32) +dataset = dataset.repeat() # Don't forget to specify `steps_per_epoch` when calling `fit` on a dataset. model.fit(dataset, epochs=10, steps_per_epoch=30) ``` -When doing so, the dataset itself will yield batches of data, -so the model does not need to be passed `batch_size` information. -Instead, the model needs to know for how many steps (or batches of data) -it should run at each epoch. -You specify this with the `steps_per_epoch` argument: it's the number of -training steps the model will run before moving on the next epoch. +Here, the `fit` method uses the `steps_per_epoch` argument—this is the number of +training steps the model runs before it moves to the next epoch. Since the +`Dataset` yields batches of data, this snippet does not require a `batch_size`. 
-You can also pass datasets for validation:
+Datasets can also be used for validation:

```python
-dataset = tf.data.Dataset.from_tensor_slices((data, targets)).batch(32)
-val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_targets)).batch(32)
+dataset = tf.data.Dataset.from_tensor_slices((data, labels))
+dataset = dataset.batch(32).repeat()

-model.fit(dataset, epochs=10, steps_per_epoch=30, validation_data=val_dataset, validation_steps=3)
+val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
+val_dataset = val_dataset.batch(32).repeat()
+
+model.fit(dataset, epochs=10, steps_per_epoch=30,
+          validation_data=val_dataset,
+          validation_steps=3)
```

### Evaluate and predict

-In addition, you get access to the following methods
-(both with Numpy data and dataset instances):
+The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods can use NumPy
+data and a `tf.data.Dataset`.

-- `model.evaluate(x, y, batch_size=32)` or `model.evaluate(dataset, steps=30)`
-  will return the inference-mode loss and metrics for the data provided.
-- `model.predict(x, y, batch_size=32)` or `model.predict(dataset, steps=30)`
-  will return the output(s) of the last layer(s) in inference on the data
-  provided, as Numpy array(s).
+To *evaluate* the inference-mode loss and metrics for the data provided:

----

+```python
+model.evaluate(x, y, batch_size=32)

-## Building advanced models: the functional API
+model.evaluate(dataset, steps=30)
+```

-The `Sequential` model cannot represent arbitrary models -- only simple stacks
-of layers. If you need to use more complex model topologies,
-such as multi-input models, multi-output models,
-models with a same layer called several times (shared layers),
-or models with non-sequential data flows (e.g. residual connections),
-you can use the 'functional API'.
+And to *predict* the output of the last layer in inference for the data provided,
+as a NumPy array:

-Here's how it works:
+```python
+model.predict(x, batch_size=32)

-- A layer instance is callable (on a tensor), and it returns a tensor.
-- Input tensor(s) and output tensor(s) can then be used to define a `Model` instance.
-- Such a model can be trained just like the `Sequential` model.
+model.predict(dataset, steps=30)
+```

-Here's a basic example showing the same model we previously defined,
-built using the functional API:
+## Build advanced models

```python
-from tensorflow import keras
-from tensorflow.keras import layers
+### Functional API

-# This returns a placeholder tensor:
-inputs = keras.Input(shape=(784,))
+The `tf.keras.Sequential` model is a simple stack of layers that cannot
+represent arbitrary models. Use the
+[Keras functional API](https://keras.io/getting-started/functional-api-guide/){:.external}
+to build complex model topologies such as:
+
+* Multi-input models,
+* Multi-output models,
+* Models with shared layers (the same layer called several times),
+* Models with non-sequential data flows (e.g. residual connections).
+
+Building a model with the functional API works like this:
+
+1. A layer instance is callable and returns a tensor.
+2. Input tensors and output tensors are used to define a `tf.keras.Model`
+   instance.
+3. This model is trained just like the `Sequential` model.
+
+The following example uses the functional API to build a simple, fully-connected
+network:
+
+```python
+inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
-x = layers.Dense(64, activation='relu')(inputs) -x = layers.Dense(64, activation='relu')(x) -predictions = layers.Dense(10, activation='softmax')(x) +x = keras.layers.Dense(64, activation='relu')(inputs) +x = keras.layers.Dense(64, activation='relu')(x) +predictions = keras.layers.Dense(10, activation='softmax')(x) -# Instantiates the model given inputs and outputs. +# Instantiate the model given inputs and outputs. model = keras.Model(inputs=inputs, outputs=predictions) -# The "compile" step specifies the training configuration. -model.compile(optimizer='rmsprop', +# The compile step specifies the training configuration. +model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), loss='categorical_crossentropy', metrics=['accuracy']) -# Trains for 5 epochs. +# Trains for 5 epochs model.fit(data, labels, batch_size=32, epochs=5) ``` -This API enables you to create models with multiple inputs and outputs, -and to "share" layers across different inputs -(i.e. to reuse a same instance multiple times). -For examples of these use cases, -please see [this guide to the functional API in Keras](https://keras.io/getting-started/functional-api-guide/). +### Model subclassing ---- +Build a fully-customizable model by subclassing `tf.keras.Model` and defining +your own forward pass. Create layers in the `__init__` method and set them as +attributes of the class instance. Define the forward pass in the `call` method. -## Building fully-customizable research models: the Model subclassing API +Model subclassing is particularly useful when +[eager execution](/programmers_guide/eager) is enabled since the forward pass +can be written imperatively. -Besides `Sequential` and the functional API, one last, more flexible way to -define models is to directly subclass the `Model` class and define your own -forward pass manually. +Key Point: Use the right API for the job. While model subclassing offers +flexibility, it comes at a cost of greater complexity and more opportunities for +user errors. If possible, prefer the functional API. -In this API, you instante layers in `__init__` and set them as attribute of the -class instance. Then you specify the forward pass in `call`. -This API is particularly valuable when using TensorFlow with [eager execution](https://www.tensorflow.org/programmers_guide/eager), -since eager execution allows you to write your forward pass in an -imperative fashion (as if you were writing Numpy code, for instance). +The following example shows a subclassed `tf.keras.Model` using a custom forward +pass: ```python -import tensorflow as tf -from tensorflow import keras - - class MyModel(keras.Model): - def __init__(self, num_classes=2): + def __init__(self, num_classes=10): super(MyModel, self).__init__(name='my_model') self.num_classes = num_classes # Define your layers here. @@ -351,10 +321,10 @@ class MyModel(keras.Model): # Instantiates the subclassed model. -model = MyModel(num_classes=2) +model = MyModel(num_classes=10) -# The "compile" step specifies the training configuration. -model.compile(optimizer='rmsprop', +# The compile step specifies the training configuration. +model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), loss='categorical_crossentropy', metrics=['accuracy']) @@ -362,353 +332,291 @@ model.compile(optimizer='rmsprop', model.fit(data, labels, batch_size=32, epochs=5) ``` -**Remember:** use the right API for the right job. -Using the `Model` subclassing API offers more flexibility, -but at the cost of greater complexity and a larger potential user error surface. 
-Prefer using the functional API when possible.

---

+### Custom layers

-## Callbacks

+Create a custom layer by subclassing `tf.keras.layers.Layer` and implementing
+the following methods:

-Callbacks are objects that you can pass to your model that customize and extend
-its behavior during training.
-There are callbacks for saving checkpoints of your model at regular intervals
-(`tf.keras.callbacks.ModelCheckpoint`),
-to dynamically change the learning rate (`tf.keras.callbacks.LearningRateScheduler`)
-or to interrupt training when validation performance has stopped improving
-(`tf.keras.callbacks.EarlyStopping`).
-You can also use a callback to monitor your model's behavior using
-[TensorBoard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard)
-(`tf.keras.callbacks.TensorBoard`).
-You can also write your own custom callbacks.
-
-Different built-in callbacks are found in `tf.keras.callbacks`.
-You use them by passing a `Callback` instance to `fit`:
+* `build`: Create the weights of the layer. Add weights with the `add_weight`
+  method.
+* `call`: Define the forward pass.
+* `compute_output_shape`: Specify how to compute the output shape of the layer
+  given the input shape.
+* Optionally, a layer can be serialized by implementing the `get_config` method
+  and the `from_config` class method.
+
+Here's an example of a custom layer that implements a `matmul` of an input with
+a kernel matrix:

```python
-from tensorflow import keras
+class MyLayer(keras.layers.Layer):
+
+  def __init__(self, output_dim, **kwargs):
+    self.output_dim = output_dim
+    super(MyLayer, self).__init__(**kwargs)
+
+  def build(self, input_shape):
+    shape = tf.TensorShape((input_shape[1], self.output_dim))
+    # Create a trainable weight variable for this layer.
+    self.kernel = self.add_weight(name='kernel',
+                                  shape=shape,
+                                  initializer='uniform',
+                                  trainable=True)
+    # Be sure to call this at the end
+    super(MyLayer, self).build(input_shape)

-callbacks = [
-  # Interrupt training if `val_loss` stops improving for over 2 epochs
-  keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
-  # Write TensorBoard logs to `./logs` directory
-  keras.callbacks.TensorBoard(log_dir='./logs')
-]
-model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks)
-```

+  def call(self, inputs):
+    return tf.matmul(inputs, self.kernel)

----

+  def compute_output_shape(self, input_shape):
+    shape = tf.TensorShape(input_shape).as_list()
+    shape[-1] = self.output_dim
+    return tf.TensorShape(shape)

-## Saving and serialization
+  def get_config(self):
+    base_config = super(MyLayer, self).get_config()
+    base_config['output_dim'] = self.output_dim
+    return base_config

-### Weights-only saving
+  @classmethod
+  def from_config(cls, config):
+    return cls(**config)

-You can save the weight values of a model via `model.save_weights(filepath)`:
-```python
-# Saves weights to a SavedModel file.
-model.save_weights('my_model')
+# Create a model using the custom layer
+model = keras.Sequential([MyLayer(10),
+                          keras.layers.Activation('softmax')])

-# Restores the model's state
-# (this requires a model that has the same architecture).
-model.load_weights('my_model')
+# The compile step specifies the training configuration
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+
+# Trains for 5 epochs.
+model.fit(data, labels, batch_size=32, epochs=5)
```
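A minimal sketch of exercising the serialization hooks, assuming the `MyLayer` class defined above (with `get_config` returning its dictionary); `get_config` and `from_config` should round-trip the layer:

```python
layer = MyLayer(10)
config = layer.get_config()  # 'output_dim' plus the base Layer fields
restored = MyLayer.from_config(config)
assert restored.output_dim == layer.output_dim
```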
-By default, this saves the weight in the TensorFlow
-[`SavedModel`](https://www.tensorflow.org/programmers_guide/saved_model) format.
-You could also save them in the Keras HDF5 format
-(which is the default in the multi-backend implementation of Keras):

```python
-# Saves weights to a HDF5 file.
-model.save_weights('my_model.h5', format='h5')
+## Callbacks

-# Restores the model's state.
-model.load_weights('my_model.h5')
-```
+A callback is an object passed to a model to customize and extend its behavior
+during training. You can write your own custom callback, or use the built-in
+`tf.keras.callbacks` that include:

-### Configuration-only saving (serialization)
+* `tf.keras.callbacks.ModelCheckpoint`: Save checkpoints of your model at
+  regular intervals.
+* `tf.keras.callbacks.LearningRateScheduler`: Dynamically change the learning
+  rate.
+* `tf.keras.callbacks.EarlyStopping`: Interrupt training when validation
+  performance has stopped improving.
+* `tf.keras.callbacks.TensorBoard`: Monitor the model's behavior using
+  [TensorBoard](/programmers_guide/summaries_and_tensorboard).

-You can also save the model's configuration
-(its architecture, without any weight values),
-which allows you to recreate the same model later (freshly initialized) even if
-you don't have the code that defined it anymore.
-Two possible serialization formats are JSON and YAML:
+To use a `tf.keras.callbacks.Callback`, pass it to the model's `fit` method:

```python
-from tensorflow.keras import models
-
-# Serializes a model to JSON.
-json_string = model.to_json()
-# Recreates the model (freshly initialized).
-fresh_model = models.from_json(json_string)
-
-# Serializes a model to YAML.
-yaml_string = model.to_yaml()
-# Recreates the model.
-fresh_model = models.from_yaml(yaml_string)
+callbacks = [
+  # Interrupt training if `val_loss` stops improving for over 2 epochs
+  keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
+  # Write TensorBoard logs to `./logs` directory
+  keras.callbacks.TensorBoard(log_dir='./logs')
+]
+model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
+          validation_data=(val_data, val_labels))
```
-Note that this feature is not available with subclassed models,
-because they are simply not serializable:
-their architecture is defined as Python code
-(the body of the `call` method of the model).

-### Whole-model saving
+## Save and restore

-Finally, you can also save a model wholesale, to a file that will contain both
-the weight values, the model's configuration,
-and even the optimizer's configuration.
-The allows you to checkpoint a model and resume training later --
-from the exact same state -- even if you don't have access to the original code.
+### Weights only

-```python
-from tensorflow.keras import models
+Save and load the weights of a model using `tf.keras.Model.save_weights`:

-model.save('my_model.h5')
+```python
+# Save weights to a TensorFlow Checkpoint file
+model.save_weights('./my_model')

-# Recreates the exact same model, complete with weights and optimizer.
-model = models.load_model('my_model.h5')
+# Restore the model's state;
+# this requires a model with the same architecture.
+model.load_weights('./my_model')
```

----
-
-## Developing custom layers
-
-You can write your own custom layers by subclassing the class
-`tf.keras.layers.Layer`. You will need to implement the following three methods:
-
-- `build`: Creates the weights of the layer.
-  Weights should be added via the `add_weight` method.
-- `call`: Specifies the forward pass.
-- `compute_output_shape`: Specifies how to compute the output shape of the layer
-  given the input shape.
-
-Optionally, you may also implement the method `get_config()` and the
-class method `from_config()` if you want your layer to be serializable.
-
-Here's a simple example of a custom layer that implements a `matmul`
-of an input with a kernel matrix:
+By default, this saves the model's weights in the
+[TensorFlow checkpoint](/get_started/checkpoints) file format. Weights can also
+be saved to the Keras HDF5 format (the default for the multi-backend
+implementation of Keras):

```python
-import tensorflow as tf
-from tensorflow.keras import layers
-
-class MyLayer(layers.Layer):
-
-  def __init__(self, output_dim, **kwargs):
-    self.output_dim = output_dim
-    super(MyLayer, self).__init__(**kwargs)
-
-  def build(self, input_shape):
-    # Create a trainable weight variable for this layer.
-    self.kernel = self.add_weight(name='kernel',
-                                  shape=(input_shape[1], self.output_dim),
-                                  initializer='uniform',
-                                  trainable=True)
-    # Be sure to call this at the end
-    super(MyLayer, self).build(input_shape)
-
-  def call(self, inputs):
-    return tf.matmul(inputs, self.kernel)
-
-  def compute_output_shape(self, input_shape):
-    shape = tf.TensorShape(input_shape).as_list()
-    shape[-1] = self.output_dim
-    return tf.TensorShape(shape)
-
-  def get_config(self):
-    base_config = super(MyLayer, self).get_config()
-    base_config['output_dim'] = self.output_dim
-
-  @classmethod
-  def from_config(cls, config):
-    return cls(**config)
-```
+# Save weights to an HDF5 file
+model.save_weights('my_model.h5', save_format='h5')

----

-## Eager execution
+# Restore the model's state
+model.load_weights('my_model.h5')
+```

-[Eager execution](https://www.tensorflow.org/programmers_guide/eager)
-is a way to write TensorFlow code imperatively.
-All three `tf.keras` model-building APIs
-(`Sequential`, the functional API `Model(inputs, outputs)`,
-and the subclassing API `MyModel(Model)`) are compatible with eager execution.
-When using `Sequential` or the functional API, it makes no difference to the
-user experience whether the model is executing eagerly or not.
-Eager execution is most beneficial when used with the `Model` subclassing API,
-or when prototyping a custom layer -- that is to say, in APIs that require you
-to *write a forward pass as code*, rather than in APIs that allow you to create
-models by assembling together existing layers.
+### Configuration only

-While the same training and evaluating APIs presented in this guide work
-as usual with eager execution, you can in addition
-write custom training loops using the eager `GradientTape`
-and define-by-run autodifferentiation:
+A model's configuration can be saved—this serializes the model architecture
+without any weights. A saved configuration can recreate and initialize the same
+model, even without the code that defined the original model. Keras supports
+JSON and YAML serialization formats:

```python
-import tensorflow as tf
-from tensorflow.contrib import eager as tfe
-
-# This call begins the eager execution session.
-tf.enable_eager_execution()
-
-model = ...  # Defines a Keras model (we recommend Model subclassing in this case).
-dataset = ...  # Defines a `tf.data` dataset.
+# Serialize a model to JSON format
+json_string = model.to_json()

-optimizer = tf.train.AdamOptimizer(0.01)
+# Recreate the model (freshly initialized)
+fresh_model = keras.models.model_from_json(json_string)

-for data, labels in dataset:
-  # Runs the forward pass and loss computation under a `GradientTape` scope,
-  # which will record all operations in order to prepare for the backward pass.
-  with tfe.GradientTape() as tape:
-    predictions = model(data)
-    loss = loss_function(labels, predictions)
+# Serialize a model to YAML format
+yaml_string = model.to_yaml()

-  # Runs the backward pass manually using the operations recorded
-  # by the gradient tape.
-  grads = tape.gradient(loss, model.trainable_weights)
-  optimizer.apply_gradients(zip(grads, model.trainable_weights),
-                            global_step=tf.train.get_or_create_global_step())
+# Recreate the model
+fresh_model = keras.models.model_from_yaml(yaml_string)
```

----
+Caution: Subclassed models are not serializable because their architecture is
+defined by the Python code in the body of the `call` method.

-## Further reading
-### Documentation

+### Entire model

-- [tf.keras documentation](https://www.tensorflow.org/api_docs/python/tf/keras)
-- [keras.io](https://keras.io/)
+The entire model can be saved to a file that contains the weight values, the
+model's configuration, and even the optimizer's configuration. This allows you
+to checkpoint a model and resume training later—from the exact same
+state—without access to the original code.

-### tf.keras tutorials and examples
-
-- [Fashion-MNIST with tf.Keras](https://medium.com/tensorflow/hello-deep-learning-fashion-mnist-with-keras-50fcff8cd74a)
-- [Predicting the price of wine with the Keras Functional API and TensorFlow](
-  https://medium.com/tensorflow/predicting-the-price-of-wine-with-the-keras-functional-api-and-tensorflow-a95d1c2c1b03)
+```python
+# Create a trivial model
+model = keras.Sequential([
+  keras.layers.Dense(10, activation='softmax', input_shape=(32,)),
+  keras.layers.Dense(10, activation='softmax')
+])
+model.compile(optimizer='rmsprop',
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+model.fit(data, labels, batch_size=32, epochs=5)

----
+# Save entire model to an HDF5 file
+model.save('my_model.h5')

-## FAQ
+# Recreate the exact same model, including weights and optimizer.
+model = keras.models.load_model('my_model.h5')
+```

-### What are the differences between tf.keras and the multi-backend Keras implementation?
+## Eager execution

-`tf.keras` includes first-class support for important TensorFlow-specific
-functionality not found in other Keras implementations, in particular:
+[Eager execution](/programmers_guide/eager) is an imperative programming
+environment that evaluates operations immediately. This is not required for
+Keras, but is supported by `tf.keras` and useful for inspecting your program and
+debugging.

-- Support for eager execution.
-- Support for the `tf.data` API.
-- Integration with the
-  [`tf.estimator` API](https://www.tensorflow.org/programmers_guide/estimators),
-  via `tf.keras.estimator.model_to_estimator`.
+All of the `tf.keras` model-building APIs are compatible with eager execution.
+And while the `Sequential` and functional APIs can be used, eager execution
+especially benefits *model subclassing* and building *custom layers*—the APIs
+that require you to write the forward pass as code (instead of the APIs that
+create models by assembling existing layers).

-In terms of API differences: `tf.keras` is a full implementation of the
-Keras API, so any code targeting the Keras API will run on `tf.keras`.
-However, keep in mind that:
+See the [eager execution guide](/programmers_guide/eager#build_a_model) for
+examples of using Keras models with custom training loops and `tf.GradientTape`.
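For reference, a custom training loop of the kind described above might look like the following. This is a minimal sketch under eager execution, using a small synthetic dataset; the `features` and `labels` names are illustrative:

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

tf.enable_eager_execution()

# A toy dataset of (features, labels) pairs.
dataset = tf.data.Dataset.from_tensor_slices(
    (np.random.random((64, 8)).astype(np.float32),
     np.random.random((64, 10)).astype(np.float32))).batch(16)

model = keras.Sequential([keras.layers.Dense(10)])
optimizer = tf.train.GradientDescentOptimizer(0.1)

for features, labels in dataset:
  with tf.GradientTape() as tape:
    loss = tf.losses.mean_squared_error(labels, model(features))
  # Gradients recorded by the tape are computed and applied manually.
  grads = tape.gradient(loss, model.trainable_weights)
  optimizer.apply_gradients(zip(grads, model.trainable_weights))
```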
-- The `tf.keras` API version in the latest TensorFlow release might not be the
-  same as the latest `keras` version from PyPI.
-  Check out `tf.keras.__version__` if in doubt.
-- In `tf.keras`, the default file format saved by `model.save_weights` is the
-  TensorFlow `SavedModel` format.
-  To use HDF5, you can pass the `format='h5'` argument.

-### What is the relationship between tf.keras and tf.estimator?
+## Distribution

-The [`tf.estimator` API](https://www.tensorflow.org/programmers_guide/estimators)
-is a high-level TensorFlow API for training "estimator" models,
-in particular in distributed settings.
-This API targets industry use cases, such as distributed training
-on large datasets with a focus on eventually exporting a production model.
+### Estimators

-If you have a `tf.keras` model that would like to train with the `tf.estimator`
-API, you can convert your model to an `Estimator` object via the
-`model_to_estimator` utility](https://www.tensorflow.org/programmers_guide/estimators#creating_estimators_from_keras_models):
+The [Estimators](/programmers_guide/estimators) API is used for training models
+for distributed environments. It targets industry use cases such as
+distributed training on large datasets, with a focus on exporting a model for
+production.
+A `tf.keras.Model` can be trained with the `tf.estimator` API by converting the
+model to a `tf.estimator.Estimator` object with
+`tf.keras.estimator.model_to_estimator`. See
+[Creating Estimators from Keras models](/programmers_guide/estimators#creating_estimators_from_keras_models).

```python
-estimator = tf.keras.estimator.model_to_estimator(model)
-```
+model = keras.Sequential([keras.layers.Dense(10, activation='softmax'),
+                          keras.layers.Dense(10, activation='softmax')])

-When using `model_to_estimator`, enabling eager execution is helpful for
-developing and debugging your `input_fn`
-(as it allows you to easily print your data).
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+
+estimator = keras.estimator.model_to_estimator(model)
+```
+Note: Enable [eager execution](/programmers_guide/eager) for debugging
+[Estimator input functions](/programmers_guide/premade_estimators#create_input_functions)
+and inspecting data.

-### How can I run tf.keras models on multiple GPUs?
+### Multiple GPUs

-You can run tf.keras models on multiple GPUs using the
-[`DistributionStrategy API`](https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/distribute/DistributionStrategy).
-The `DistributionStrategy` API allows you to distribute training on multiple GPUs
-with almost no changes to your existing code.
+`tf.keras` models can run on multiple GPUs using
+`tf.contrib.distribute.DistributionStrategy`. This API provides distributed
+training on multiple GPUs with almost no changes to existing code.

-Currently [`MirroredStrategy`](https://www.tensorflow.org/versions/master/api_docs/python/tf/contrib/distribute/MirroredStrategy)
-is the only supported strategy.
-`MirroredStrategy` allows you to do in-graph replication with synchronous
-training using all-reduce on a single machine.
-To use `DistributionStrategy` with a `tf.keras` model,
-you can use the `model_to_estimator` utility to convert a `tf.keras` model to
-an `Estimator` and then train the estimator.
+Currently, `tf.contrib.distribute.MirroredStrategy` is the only supported
+distribution strategy. `MirroredStrategy` does in-graph replication with
+synchronous training using all-reduce on a single machine. To use
+`DistributionStrategy` with Keras, convert the `tf.keras.Model` to a
+`tf.estimator.Estimator` with `tf.keras.estimator.model_to_estimator`, then
+train the estimator.

-Here is a simple example of distributing a `tf.keras` model across multiple GPUs
-on a single machine.
+The following example distributes a `tf.keras.Model` across multiple GPUs on a
+single machine.

-Let's first define a simple model:
+First, define a simple model:

```python
-model = tf.keras.Sequential()
-model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)))
-model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
+model = keras.Sequential()
+model.add(keras.layers.Dense(16, activation='relu', input_shape=(10,)))
+model.add(keras.layers.Dense(1, activation='sigmoid'))
+
 optimizer = tf.train.GradientDescentOptimizer(0.2)
+
 model.compile(loss='binary_crossentropy', optimizer=optimizer)
 model.summary()
```

-Let's use `model_to_estimator` to create an `Estimator` instance from the
-`tf.keras` model defined above.
+Convert the Keras model to a `tf.estimator.Estimator` instance:

```python
-keras_estimator = tf.keras.estimator.model_to_estimator(
-  keras_model=model,
-  config=config,
-  model_dir='/tmp/model_dir')
+keras_estimator = keras.estimator.model_to_estimator(
+    keras_model=model,
+    config=config,
+    model_dir='/tmp/model_dir')
```

-We'll use `tf.data.Datasets` to define our input pipeline.
-Our `input_fn` returns a `tf.data.Dataset` object that we then use to distribute
-the data across multiple devices with each device processing
+Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
+used to distribute the data across multiple devices—with each device processing
 a slice of the input batch.

```python
 def input_fn():
-  x = np.random.random((1024, 10))
-  y = np.random.randint(2, size=(1024, 1))
-  x = tf.cast(x, tf.float32)
-  dataset = tf.data.Dataset.from_tensor_slices((x, y))
-  dataset = dataset.repeat(10)
-  dataset = dataset.batch(32)
-  return dataset
+  x = np.random.random((1024, 10))
+  y = np.random.randint(2, size=(1024, 1))
+  x = tf.cast(x, tf.float32)
+  dataset = tf.data.Dataset.from_tensor_slices((x, y))
+  dataset = dataset.repeat(10)
+  dataset = dataset.batch(32)
+  return dataset
```

-The next step is to create a `RunConfig` and set the train_distribute argument
-to the new `MirroredStrategy` instance.
-You can specify a list of devices or the `num_gpus` argument when creating
-a `MirroredStrategy` instance.
-Not specifying any arguments defaults to using all the available GPUs like we do
-in this example.
+Next, create a `tf.estimator.RunConfig` and set the `train_distribute` argument
+to the `tf.contrib.distribute.MirroredStrategy` instance. When creating
+`MirroredStrategy`, you can specify a list of devices or set the `num_gpus`
+argument. The default uses all available GPUs, like the following:

```python
 strategy = tf.contrib.distribute.MirroredStrategy()
 config = tf.estimator.RunConfig(train_distribute=strategy)
```

-Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
-arguments:

```python
 keras_estimator.train(input_fn=input_fn, steps=10)
-- cgit v1.2.3


From 7e859ebc65bf7d77ed89f736c7fd6fede0a93c92 Mon Sep 17 00:00:00 2001
From: Michael Case
Date: Mon, 18 Jun 2018 11:07:48 -0700
Subject: Add missing Eager relnotes for TensorFlow 1.9.
(#20101) --- RELEASE.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/RELEASE.md b/RELEASE.md index 879ce6e440..510eca5467 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -22,6 +22,8 @@ * (C++) `DatasetBase::MakeIterator()` has been renamed to `DatasetBase::MakeIteratorInternal()`. * (C++) `IteratorBase::Initialize()` method was added to support raising errors during iterator construction. * Eager Execution: + * Added the ability to pause recording operations for gradient computation via `tf.GradientTape.stop_recording`. + * Updated documentation, introductory notebooks. * `tf.keras`: * Move Keras code out of _impl folder and remove API files. * `tf.keras.Model.save_weights` now saves in TensorFlow format by default. -- cgit v1.2.3 From 86a6b0d7efbe5a3fa1f511237b85c926a6aef3a5 Mon Sep 17 00:00:00 2001 From: Brennan Saeta Date: Tue, 19 Jun 2018 17:47:37 -0700 Subject: [GCS] Typo in ConfigureGcsHook. This commit fixes a typo on ConfigureGcsHook that prevented its correct operation. --- tensorflow/contrib/cloud/python/ops/gcs_config_ops.py | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/tensorflow/contrib/cloud/python/ops/gcs_config_ops.py b/tensorflow/contrib/cloud/python/ops/gcs_config_ops.py index 8c8c5acb31..4f7300fd1f 100644 --- a/tensorflow/contrib/cloud/python/ops/gcs_config_ops.py +++ b/tensorflow/contrib/cloud/python/ops/gcs_config_ops.py @@ -120,13 +120,17 @@ class ConfigureGcsHook(training.SessionRunHook): def begin(self): if self._credentials: self._credentials_placeholder = array_ops.placeholder(dtypes.string) - self._credentials_ops = gen_gcs_config_ops.gcs_configure_credentials( + self._credentials_op = gen_gcs_config_ops.gcs_configure_credentials( self._credentials_placeholder) + else: + self._credentials_op = None if self._block_cache: self._block_cache_op = gen_gcs_config_ops.gcs_configure_block_cache( max_cache_size=self._block_cache.max_bytes, block_size=self._block_cache.block_size, max_staleness=self._block_cache.max_staleness) + else: + self._block_cache_op = None def after_create_session(self, session, coord): del coord -- cgit v1.2.3 From fdbb80f217d3a153b4eda66c766df921b3f73ab4 Mon Sep 17 00:00:00 2001 From: Michael Case Date: Wed, 20 Jun 2018 14:08:57 -0700 Subject: Move external/ directory in pip package. Moving external/ directory in the pip packages (which is currently installed directly into site-packages directory). Moving the directory to tensorflow/include/external/. Also, removing all python files from external (since it should really only contain headers and license files.) --- tensorflow/tools/pip_package/build_pip_package.sh | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/tensorflow/tools/pip_package/build_pip_package.sh b/tensorflow/tools/pip_package/build_pip_package.sh index f7e42ce536..9e41514cfa 100755 --- a/tensorflow/tools/pip_package/build_pip_package.sh +++ b/tensorflow/tools/pip_package/build_pip_package.sh @@ -24,9 +24,15 @@ function real_path() { function cp_external() { local src_dir=$1 local dest_dir=$2 - for f in `find "$src_dir" -maxdepth 1 -mindepth 1 ! -name '*local_config_cuda*' ! -name '*local_config_tensorrt*' ! -name '*org_tensorflow*'`; do - cp -R "$f" "$dest_dir" + + pushd . + cd "$src_dir" + for f in `find . ! -type d ! -name '*.py' ! -name '*local_config_cuda*' ! -name '*local_config_tensorrt*' ! 
-name '*org_tensorflow*'`; do + mkdir -p "${dest_dir}/$(dirname ${f})" + cp "${f}" "${dest_dir}/$(dirname ${f})/" done + popd + mkdir -p "${dest_dir}/local_config_cuda/cuda/cuda/" cp "${src_dir}/local_config_cuda/cuda/cuda/cuda_config.h" "${dest_dir}/local_config_cuda/cuda/cuda/" } @@ -49,6 +55,8 @@ function prepare_src() { TMPDIR="$1" mkdir -p "$TMPDIR" + EXTERNAL_INCLUDES="${TMPDIR}/tensorflow/include/external" + echo $(date) : "=== Preparing sources in dir: ${TMPDIR}" if [ ! -d bazel-bin/tensorflow ]; then @@ -66,10 +74,9 @@ function prepare_src() { cp -R \ bazel-bin/tensorflow/tools/pip_package/simple_console_for_window_unzip/runfiles/org_tensorflow/tensorflow \ "${TMPDIR}" - mkdir "${TMPDIR}/external" cp_external \ bazel-bin/tensorflow/tools/pip_package/simple_console_for_window_unzip/runfiles \ - "${TMPDIR}/external" + "${EXTERNAL_INCLUDES}/" RUNFILES=bazel-bin/tensorflow/tools/pip_package/simple_console_for_window_unzip/runfiles/org_tensorflow else RUNFILES=bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow @@ -78,10 +85,9 @@ function prepare_src() { cp -R \ bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/tensorflow \ "${TMPDIR}" - mkdir "${TMPDIR}/external" cp_external \ bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/external \ - "${TMPDIR}/external" + "${EXTERNAL_INCLUDES}" # Copy MKL libs over so they can be loaded at runtime so_lib_dir=$(ls $RUNFILES | grep solib) || true if [ -n "${so_lib_dir}" ]; then @@ -96,10 +102,9 @@ function prepare_src() { cp -R \ bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles/org_tensorflow/tensorflow \ "${TMPDIR}" - mkdir "${TMPDIR}/external" cp_external \ bazel-bin/tensorflow/tools/pip_package/build_pip_package.runfiles \ - "${TMPDIR}/external" + "${EXTERNAL_INCLUDES}" # Copy MKL libs over so they can be loaded at runtime so_lib_dir=$(ls $RUNFILES | grep solib) || true if [ -n "${so_lib_dir}" ]; then -- cgit v1.2.3 From 1adbc5aa6927d1a5d7151c31aea1da6e73a1b53c Mon Sep 17 00:00:00 2001 From: Allen Lavoie Date: Thu, 31 May 2018 19:03:21 -0700 Subject: Add a single positional argument mode for shape inference in subclassed Models. Allows fit() when call's signature looks something like call(x, training=True). Calling conventions are "inputs", single positional, and multiple positional. Right now the distinction between "inputs" and single positional calling conventions is the text of one error message. Both support shape inference (which just hasn't been implemented for multiple positional input arguments yet). PiperOrigin-RevId: 198815483 --- tensorflow/python/keras/engine/base_layer.py | 45 ++++++++++++++++---- tensorflow/python/keras/engine/network.py | 50 +++++++++++++++++++---- tensorflow/python/keras/engine/training.py | 27 +++++++----- tensorflow/python/keras/model_subclassing_test.py | 4 +- 4 files changed, 98 insertions(+), 28 deletions(-) diff --git a/tensorflow/python/keras/engine/base_layer.py b/tensorflow/python/keras/engine/base_layer.py index 24716cfbe4..4814275fd5 100644 --- a/tensorflow/python/keras/engine/base_layer.py +++ b/tensorflow/python/keras/engine/base_layer.py @@ -19,6 +19,7 @@ from __future__ import division from __future__ import print_function import collections +import enum # pylint: disable=g-bad-import-order import inspect # Necessary supplement to tf_inspect to deal with variadic args. 
import numpy as np @@ -50,6 +51,20 @@ from tensorflow.python.util import tf_inspect from tensorflow.python.util.tf_export import tf_export +class CallConvention(enum.Enum): + """Calling conventions for passing `Layer` inputs to `Layer.call`.""" + # The Layer takes inputs as its first argument, named "inputs" for + # compatibility with the signature of Layer.__call__. This is the mode assumed + # for Layers which are not subclassed Models. + EXPLICIT_INPUTS_ARGUMENT = 1 + # The Layer takes a single positional argument, not named "inputs". It's + # treated like an "inputs" argument. + SINGLE_POSITIONAL_ARGUMENT = 2 + # The Layer has multiple positional arguments to which its inputs should be + # bound. + POSITIONAL_ARGUMENTS_ARE_INPUTS = 3 + + @tf_export('keras.layers.Layer') class Layer(checkpointable.CheckpointableBase): """Base layer class. @@ -149,7 +164,7 @@ class Layer(checkpointable.CheckpointableBase): self._call_fn_args = function_utils.fn_args(self.call) self._compute_previous_mask = ('mask' in self._call_fn_args or hasattr(self, 'compute_mask')) - self._uses_inputs_arg = True + self._call_convention = CallConvention.EXPLICIT_INPUTS_ARGUMENT # These lists will be filled via successive calls # to self._add_inbound_node(). @@ -793,12 +808,22 @@ class Layer(checkpointable.CheckpointableBase): pass # C type such as dict. Masking not supported in this case. def _set_connectivity_metadata_(self, inputs, outputs, args, kwargs): - if args and getattr(self, '_uses_inputs_arg', True): - raise TypeError( - 'This Layer takes an `inputs` argument to call(), and only the ' - '`inputs` argument may be specified as a positional argument. ' - 'Pass everything else as a keyword argument (those arguments will' - ' not be tracked as inputs to the Layer).') + call_convention = getattr(self, '_call_convention', + CallConvention.EXPLICIT_INPUTS_ARGUMENT) + if args: + if call_convention == CallConvention.EXPLICIT_INPUTS_ARGUMENT: + raise TypeError( + 'This Layer takes an `inputs` argument to call(), and only the ' + '`inputs` argument may be specified as a positional argument. ' + 'Pass everything else as a keyword argument (those arguments will' + ' not be tracked as inputs to the Layer).') + elif call_convention == CallConvention.SINGLE_POSITIONAL_ARGUMENT: + raise TypeError( + 'This Layer takes a single positional argument to call(), which is ' + 'by convention the inputs argument, and only this argument may be ' + 'specified as a positional argument. Pass everything else as a ' + 'keyword argument (those arguments will not be tracked as inputs ' + 'to the Layer).') # If the layer returns tensors from its inputs, unmodified, # we copy them to avoid loss of tensor metadata. @@ -834,7 +859,11 @@ class Layer(checkpointable.CheckpointableBase): A tuple of (inputs, non_input_kwargs). These may be the same objects as were passed in (call_args and call_kwargs). """ - if getattr(self, '_uses_inputs_arg', True): + call_convention = getattr(self, '_call_convention', + CallConvention.EXPLICIT_INPUTS_ARGUMENT) + if (call_convention in ( + CallConvention.EXPLICIT_INPUTS_ARGUMENT, + CallConvention.SINGLE_POSITIONAL_ARGUMENT)): assert len(call_args) == 1 # TypeError raised earlier in __call__. 
return call_args[0], call_kwargs else: diff --git a/tensorflow/python/keras/engine/network.py b/tensorflow/python/keras/engine/network.py index 3d567b8378..6f27eea1e7 100644 --- a/tensorflow/python/keras/engine/network.py +++ b/tensorflow/python/keras/engine/network.py @@ -135,7 +135,7 @@ class Network(base_layer.Layer): self._in_progress_restore_finalizer = None def _init_graph_network(self, inputs, outputs, name=None): - self._uses_inputs_arg = True + self._call_convention = base_layer.CallConvention.EXPLICIT_INPUTS_ARGUMENT # Normalize and set self.inputs, self.outputs. if isinstance(inputs, (list, tuple)): self.inputs = list(inputs) # Tensor or list of tensors. @@ -295,19 +295,55 @@ class Network(base_layer.Layer): def _init_subclassed_network(self, name=None): self._base_init(name=name) self._is_graph_network = False - call_args = tf_inspect.getargspec(self.call).args - if 'training' in call_args: + call_argspec = tf_inspect.getargspec(self.call) + if 'training' in call_argspec.args: self._expects_training_arg = True else: self._expects_training_arg = False - if 'inputs' in call_args: - self._uses_inputs_arg = True - else: - self._uses_inputs_arg = False + self._call_convention = self._determine_call_convention(call_argspec) self.outputs = None self.inputs = None self.built = False + def _determine_call_convention(self, call_argspec): + """Decides how `self.call()` is invoked. See base_layer.CallConvention.""" + if call_argspec.varargs: + may_take_single_argument = False + else: + try: + # Note: tf_inspect doesn't raise a TypeError when regular inspect would, + # so we need to keep in mind that "getcallargs" may have returned + # something even though we under-specified positional arguments. + all_args = tf_inspect.getcallargs(self.call, None) + self_args = set() + for arg_name, obj in all_args.items(): + if obj is self: + self_args.add(arg_name) + may_take_single_argument = True + except TypeError: + may_take_single_argument = False + if may_take_single_argument: + # A single positional argument (plus "self") is considered equivalent to + # an "inputs" argument. + all_positional_args = len(call_argspec.args) + if call_argspec.defaults is not None: + all_positional_args -= len(call_argspec.defaults) + non_self_positional_args = all_positional_args + for positional_arg_name in call_argspec.args[:all_positional_args]: + if positional_arg_name in self_args: + non_self_positional_args -= 1 + if non_self_positional_args == 1: + if 'inputs' in call_argspec.args[all_positional_args:]: + raise TypeError( + "Model.call() takes a single positional argument (to which " + "inputs are passed by convention) and a separate 'inputs' " + "argument. 
Unable to determine which arguments are inputs.") + return base_layer.CallConvention.SINGLE_POSITIONAL_ARGUMENT + if 'inputs' in call_argspec.args: + return base_layer.CallConvention.EXPLICIT_INPUTS_ARGUMENT + else: + return base_layer.CallConvention.POSITIONAL_ARGUMENTS_ARE_INPUTS + def _track_layers(self, layers): """Add Checkpointable dependencies on a list of Layers.""" weight_layer_index = 0 diff --git a/tensorflow/python/keras/engine/training.py b/tensorflow/python/keras/engine/training.py index 6d625f16c2..04a2aa7664 100644 --- a/tensorflow/python/keras/engine/training.py +++ b/tensorflow/python/keras/engine/training.py @@ -31,12 +31,11 @@ from tensorflow.python.keras import backend as K from tensorflow.python.keras import losses from tensorflow.python.keras import metrics as metrics_module from tensorflow.python.keras import optimizers +from tensorflow.python.keras.engine import base_layer from tensorflow.python.keras.engine import training_arrays from tensorflow.python.keras.engine import training_eager from tensorflow.python.keras.engine import training_generator from tensorflow.python.keras.engine import training_utils -from tensorflow.python.keras.engine.base_layer import DeferredTensor -from tensorflow.python.keras.engine.base_layer import Layer from tensorflow.python.keras.engine.network import Network from tensorflow.python.keras.utils.generic_utils import slice_arrays from tensorflow.python.ops import array_ops @@ -523,7 +522,7 @@ class Model(Network): # Keep track of state updates created by # stateful metrics (i.e. metrics layers). - if isinstance(metric_fn, Layer) and metric_fn.stateful: + if isinstance(metric_fn, base_layer.Layer) and metric_fn.stateful: self.stateful_metric_names.append(metric_name) self.stateful_metric_functions.append(metric_fn) self.metrics_updates += metric_fn.updates @@ -959,11 +958,17 @@ class Model(Network): whether to build the model's graph in inference mode (False), training mode (True), or using the Keras learning phase (None). """ - if not getattr(self, '_uses_inputs_arg', True): + call_convention = getattr( + self, + '_call_convention', + base_layer.CallConvention.EXPLICIT_INPUTS_ARGUMENT) + if call_convention not in ( + base_layer.CallConvention.EXPLICIT_INPUTS_ARGUMENT, + base_layer.CallConvention.SINGLE_POSITIONAL_ARGUMENT): raise NotImplementedError( - 'Subclassed Models without "inputs" in their call() signatures do ' - 'not yet support shape inference. File a feature request if this ' - 'limitation bothers you.') + 'Subclassed Models without "inputs" (or single positional arguments) ' + 'in their call() signatures do not yet support shape inference. File ' + 'a feature request if this limitation bothers you.') if self.__class__.__name__ == 'Sequential': # Note: we can't test whether the model is `Sequential` via `isinstance` # since `Sequential` depends on `Model`. 
@@ -1020,11 +1025,11 @@ class Model(Network): else: dummy_output_values = [dummy_output_values] self.outputs = [ - DeferredTensor(shape=(None for _ in v.shape), - dtype=v.dtype) for v in dummy_output_values] + base_layer.DeferredTensor(shape=(None for _ in v.shape), + dtype=v.dtype) for v in dummy_output_values] self.inputs = [ - DeferredTensor(shape=(None for _ in v.shape), - dtype=v.dtype) for v in dummy_input_values] + base_layer.DeferredTensor(shape=(None for _ in v.shape), + dtype=v.dtype) for v in dummy_input_values] self.input_names = [ 'input_%d' % (i + 1) for i in range(len(dummy_input_values))] self.output_names = [ diff --git a/tensorflow/python/keras/model_subclassing_test.py b/tensorflow/python/keras/model_subclassing_test.py index 86f7e20bec..8fb957da43 100644 --- a/tensorflow/python/keras/model_subclassing_test.py +++ b/tensorflow/python/keras/model_subclassing_test.py @@ -56,8 +56,8 @@ class SimpleTestModel(keras.Model): if self.use_bn: self.bn = keras.layers.BatchNormalization(axis=-1) - def call(self, inputs): - x = self.dense1(inputs) + def call(self, x): + x = self.dense1(x) if self.use_dp: x = self.dp(x) if self.use_bn: -- cgit v1.2.3 From 6fb75293ec2cb5cd8d815cf98ec33aa953442b34 Mon Sep 17 00:00:00 2001 From: Derek Murray Date: Wed, 20 Jun 2018 10:10:55 -0700 Subject: [tf.data] Properly export `tf.contrib.data.choose_from_datasets()` PiperOrigin-RevId: 201371642 --- tensorflow/contrib/data/__init__.py | 1 + 1 file changed, 1 insertion(+) diff --git a/tensorflow/contrib/data/__init__.py b/tensorflow/contrib/data/__init__.py index 1af1ed08b5..9c6a13333e 100644 --- a/tensorflow/contrib/data/__init__.py +++ b/tensorflow/contrib/data/__init__.py @@ -72,6 +72,7 @@ from tensorflow.contrib.data.python.ops.error_ops import ignore_errors from tensorflow.contrib.data.python.ops.get_single_element import get_single_element from tensorflow.contrib.data.python.ops.grouping import bucket_by_sequence_length from tensorflow.contrib.data.python.ops.grouping import group_by_window +from tensorflow.contrib.data.python.ops.interleave_ops import choose_from_datasets from tensorflow.contrib.data.python.ops.interleave_ops import parallel_interleave from tensorflow.contrib.data.python.ops.interleave_ops import sample_from_datasets from tensorflow.contrib.data.python.ops.interleave_ops import sloppy_interleave -- cgit v1.2.3 From f283e65a1bdb797070be9b84a69ef323268f7c3c Mon Sep 17 00:00:00 2001 From: Tom Hennigan Date: Tue, 5 Jun 2018 03:56:47 -0700 Subject: Handle scalar input to assert_equal in eager. 
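To illustrate the behavior being fixed (a hypothetical snippet, assuming eager mode; the `message` text is arbitrary): comparing unequal scalars should raise a clean `InvalidArgumentError` rather than failing while formatting the error message:

```python
import tensorflow as tf
tf.enable_eager_execution()

try:
  tf.assert_equal(tf.constant(True), tf.constant(False), message="fail")
except tf.errors.InvalidArgumentError as e:
  print(e.message)  # includes "fail"; no crash while indexing scalar values
```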
PiperOrigin-RevId: 199274329 --- tensorflow/python/kernel_tests/check_ops_test.py | 7 +++++++ tensorflow/python/ops/check_ops.py | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/tensorflow/python/kernel_tests/check_ops_test.py b/tensorflow/python/kernel_tests/check_ops_test.py index 5a83ec8d30..7ef841c96b 100644 --- a/tensorflow/python/kernel_tests/check_ops_test.py +++ b/tensorflow/python/kernel_tests/check_ops_test.py @@ -88,6 +88,13 @@ class AssertEqualTest(test.TestCase): out = array_ops.identity(small) self.evaluate(out) + @test_util.run_in_graph_and_eager_modes() + def test_scalar_comparison(self): + const_true = constant_op.constant(True, name="true") + const_false = constant_op.constant(False, name="false") + with self.assertRaisesRegexp(errors.InvalidArgumentError, "fail"): + check_ops.assert_equal(const_true, const_false, message="fail") + def test_returns_none_with_eager(self): with context.eager_mode(): small = constant_op.constant([1, 2], name="small") diff --git a/tensorflow/python/ops/check_ops.py b/tensorflow/python/ops/check_ops.py index cabc1e724c..375a5ec2c3 100644 --- a/tensorflow/python/ops/check_ops.py +++ b/tensorflow/python/ops/check_ops.py @@ -341,8 +341,8 @@ def assert_equal(x, y, data=None, summarize=None, message=None, name=None): y_sum, y_np[:y_sum])) index_and_values_str = '' - if x.shape == y.shape: - # If the shapes of x and y are the same, + if x.shape == y.shape and x.shape.as_list(): + # If the shapes of x and y are the same (and not scalars), # Get the values that actually differed and their indices. # If shapes are different this information is more confusing # than useful. -- cgit v1.2.3 From e71f9b863097086c91b2a3f5aea1e081f275ceca Mon Sep 17 00:00:00 2001 From: "A. Unique TensorFlower" Date: Thu, 21 Jun 2018 18:53:05 -0700 Subject: Update Eigen version to commit e5e305a158a029f5b5f837bf821411a51439a970. 
PiperOrigin-RevId: 201624024 --- .../kernel_tests/distributions/dirichlet_multinomial_test.py | 6 +++--- tensorflow/workspace.bzl | 8 ++++---- 2 files changed, 7 insertions(+), 7 deletions(-) diff --git a/tensorflow/python/kernel_tests/distributions/dirichlet_multinomial_test.py b/tensorflow/python/kernel_tests/distributions/dirichlet_multinomial_test.py index 7922fb0606..daea699514 100644 --- a/tensorflow/python/kernel_tests/distributions/dirichlet_multinomial_test.py +++ b/tensorflow/python/kernel_tests/distributions/dirichlet_multinomial_test.py @@ -250,9 +250,9 @@ class DirichletMultinomialTest(test.TestCase): dist.variance(), dist.stddev(), ]) - self.assertAllClose(sample_mean_, analytic_mean, atol=0., rtol=0.04) - self.assertAllClose(sample_cov_, analytic_cov, atol=0., rtol=0.05) - self.assertAllClose(sample_var_, analytic_var, atol=0., rtol=0.05) + self.assertAllClose(sample_mean_, analytic_mean, atol=0., rtol=0.06) + self.assertAllClose(sample_cov_, analytic_cov, atol=0., rtol=0.07) + self.assertAllClose(sample_var_, analytic_var, atol=0., rtol=0.07) self.assertAllClose(sample_stddev_, analytic_stddev, atol=0., rtol=0.02) def testCovariance(self): diff --git a/tensorflow/workspace.bzl b/tensorflow/workspace.bzl index 50a69598a1..43152c88cf 100644 --- a/tensorflow/workspace.bzl +++ b/tensorflow/workspace.bzl @@ -107,11 +107,11 @@ def tf_workspace(path_prefix="", tf_repo_name=""): tf_http_archive( name = "eigen_archive", urls = [ - "https://mirror.bazel.build/bitbucket.org/eigen/eigen/get/6913f0cf7d06.tar.gz", - "https://bitbucket.org/eigen/eigen/get/6913f0cf7d06.tar.gz", + "https://mirror.bazel.build/bitbucket.org/eigen/eigen/get/e5e305a158a0.tar.gz", + "https://bitbucket.org/eigen/eigen/get/e5e305a158a0.tar.gz", ], - sha256 = "791b836cacd03e20bae5bdd25f1c4a5505a0a9975ba94a61eb4e2631fbd1d53a", - strip_prefix = "eigen-eigen-6913f0cf7d06", + sha256 = "8bbe676d69e7f59070c83a949454b8b6344034e0ebbf686b337528e5dc04c7de", + strip_prefix = "eigen-eigen-e5e305a158a0", build_file = clean_dep("//third_party:eigen.BUILD"), patch_file = clean_dep("//third_party:eigen_fix_cuda_compilation.patch") ) -- cgit v1.2.3 From 5c450d2e1d0d3a1abae4997df0da1b8d73684e01 Mon Sep 17 00:00:00 2001 From: Rasmus Munk Larsen Date: Fri, 22 Jun 2018 13:36:57 -0700 Subject: Update workspace.bzl --- tensorflow/workspace.bzl | 1 - 1 file changed, 1 deletion(-) diff --git a/tensorflow/workspace.bzl b/tensorflow/workspace.bzl index 43152c88cf..3c657c4a5b 100644 --- a/tensorflow/workspace.bzl +++ b/tensorflow/workspace.bzl @@ -113,7 +113,6 @@ def tf_workspace(path_prefix="", tf_repo_name=""): sha256 = "8bbe676d69e7f59070c83a949454b8b6344034e0ebbf686b337528e5dc04c7de", strip_prefix = "eigen-eigen-e5e305a158a0", build_file = clean_dep("//third_party:eigen.BUILD"), - patch_file = clean_dep("//third_party:eigen_fix_cuda_compilation.patch") ) tf_http_archive( -- cgit v1.2.3 From df2c8315211895afab0d7ba1ff64e831d9d3ce3b Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 19 Jun 2018 23:11:00 -0700 Subject: Get started landing page. Move "Datasets Quickstart" to "Datasets for Estimators" under guide. 
PiperOrigin-RevId: 201301717 --- tensorflow/docs_src/get_started/_index.yaml | 255 ++++++++++++++ .../docs_src/get_started/basic_classification.md | 3 + .../docs_src/get_started/basic_regression.md | 3 + .../get_started/basic_text_classification.md | 3 + .../docs_src/get_started/datasets_quickstart.md | 387 --------------------- tensorflow/docs_src/get_started/eager.md | 2 +- tensorflow/docs_src/get_started/index.md | 29 -- tensorflow/docs_src/get_started/leftnav_files | 12 +- tensorflow/docs_src/get_started/next_steps.md | 36 ++ .../docs_src/get_started/overfit_and_underfit.md | 3 + .../get_started/save_and_restore_models.md | 3 + tensorflow/docs_src/install/install_linux.md | 8 +- tensorflow/docs_src/install/install_mac.md | 6 +- tensorflow/docs_src/install/install_raspbian.md | 6 +- tensorflow/docs_src/install/install_sources.md | 2 +- tensorflow/docs_src/install/install_windows.md | 7 +- .../programmers_guide/datasets_for_estimators.md | 387 +++++++++++++++++++++ tensorflow/docs_src/programmers_guide/index.md | 1 + .../docs_src/programmers_guide/leftnav_files | 1 + .../programmers_guide/premade_estimators.md | 8 +- tensorflow/docs_src/tutorials/index.md | 5 +- 21 files changed, 715 insertions(+), 452 deletions(-) create mode 100644 tensorflow/docs_src/get_started/_index.yaml create mode 100644 tensorflow/docs_src/get_started/basic_classification.md create mode 100644 tensorflow/docs_src/get_started/basic_regression.md create mode 100644 tensorflow/docs_src/get_started/basic_text_classification.md delete mode 100644 tensorflow/docs_src/get_started/datasets_quickstart.md delete mode 100644 tensorflow/docs_src/get_started/index.md create mode 100644 tensorflow/docs_src/get_started/next_steps.md create mode 100644 tensorflow/docs_src/get_started/overfit_and_underfit.md create mode 100644 tensorflow/docs_src/get_started/save_and_restore_models.md create mode 100644 tensorflow/docs_src/programmers_guide/datasets_for_estimators.md diff --git a/tensorflow/docs_src/get_started/_index.yaml b/tensorflow/docs_src/get_started/_index.yaml new file mode 100644 index 0000000000..af255a482d --- /dev/null +++ b/tensorflow/docs_src/get_started/_index.yaml @@ -0,0 +1,255 @@ +project_path: /_project.yaml +book_path: /_book.yaml +description: +landing_page: + show_side_navs: True + rows: + - description: > +

Get Started with TensorFlow

+

+ TensorFlow is an open-source machine learning library for research and + production. TensorFlow offers APIs for beginners and experts to develop + for desktop, mobile, web, and cloud. See the sections below to get + started. +

+ items: + - custom_html: > + +
+ +

Learn and use ML

+
+
+

+ The high-level Keras API provides building blocks to create and + train deep learning models. Start with these beginner-friendly + notebook examples, then read the + TensorFlow Keras guide. +

+
    +
  1. Basic classification
  2. +
  3. Text classification
  4. +
  5. Regression
  6. +
  7. Overfitting and underfitting
  8. +
  9. Save and load
  10. +
+
+ +
+ - classname: tfo-landing-row-item-code-block + code_block: | +
+        import tensorflow as tf
+        mnist = tf.keras.datasets.mnist
+
+        (x_train, y_train),(x_test, y_test) = mnist.load_data()
+        x_train, x_test = x_train / 255.0, x_test / 255.0
+
+        model = tf.keras.models.Sequential([
+          tf.keras.layers.Flatten(),
+          tf.keras.layers.Dense(512, activation=tf.nn.relu),
+          tf.keras.layers.Dropout(0.2),
+          tf.keras.layers.Dense(10, activation=tf.nn.softmax)
+        ])
+        model.compile(optimizer='adam',
+                      loss='sparse_categorical_crossentropy',
+                      metrics=['accuracy'])
+
+        model.fit(x_train, y_train, epochs=5)
+        model.evaluate(x_test, y_test)
+        
+ {% dynamic if request.tld != 'cn' %} + Run in a Notebook + {% dynamic endif %} + + - items: + - custom_html: > +
+ +

Research and experimentation

+
+
+

+ Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with + these notebooks, then read the eager execution guide. +

+
    +
  1. + {% dynamic if request.tld == 'cn' %} + Eager execution basics + {% dynamic else %} + Eager execution basics + {% dynamic endif %} +
  2. +
  3. + {% dynamic if request.tld == 'cn' %} + Automatic differentiation and gradient tapes + {% dynamic else %} + Automatic differentiation and gradient tapes + {% dynamic endif %} +
  4. +
  5. + {% dynamic if request.tld == 'cn' %} + Variables, models, and training + {% dynamic else %} + Variables, models, and training + {% dynamic endif %} +
  6. +
  7. + {% dynamic if request.tld == 'cn' %} + Custom layers + {% dynamic else %} + Custom layers + {% dynamic endif %} +
  8. +
  9. Custom training walkthrough
  10. +
  11. + {% dynamic if request.tld == 'cn' %} + Example: Neural machine translation w/ attention + {% dynamic else %} + Example: Neural machine translation w/ attention + {% dynamic endif %} +
  12. +
+
+ +
+ - custom_html: > +
+ +

ML at production scale

+
+
+

+ Estimators can train large models on multiple machines in a + production environment. Try the examples below and read the + Estimators guide. +

+
    +
  1. How to build a simple text classifier with TF-Hub
  2. +
  3. Classifying Higgs boson processes
  4. +
  5. Wide and deep learning using estimators
  6. +
+
+ +
+ + - description: > +

Google Colab: An easy way to learn and use TensorFlow

+

+ Colaboratory + is a Google research project created to help disseminate machine learning + education and research. It's a Jupyter notebook environment that requires + no setup to use and runs entirely in the cloud. + Read the blog post. +

+ + - description: > +

+      Build your first ML app
+
+      Create and deploy TensorFlow models on web and mobile.

+ background: grey + items: + - custom_html: > +
+ +

+        Web developers
+
+        TensorFlow.js is a WebGL-accelerated JavaScript library to train and
+        deploy ML models in the browser and on Node.js.
+
+ - custom_html: > +
+ +

+        Mobile developers
+
+        TensorFlow Lite is TensorFlow's lightweight solution for mobile and
+        embedded devices.
+
+ + - description: > +

+      Videos and updates
+
+      Subscribe to the TensorFlow
+      YouTube channel
+      and blog for
+      the latest videos and updates.

+ items: + - description: > +

+      Get started with TensorFlow's High-Level APIs

+ youtube_id: tjsHSIG8I08 + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=tjsHSIG8I08 + - description: > +

+      Eager execution

+ youtube_id: T8AW0fKP0Hs + background: grey + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=T8AW0fKP0Hs + - description: > +

+      tf.data: Fast, flexible, and easy-to-use input pipelines

+ youtube_id: uIcqeP7MFH0 + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=uIcqeP7MFH0 diff --git a/tensorflow/docs_src/get_started/basic_classification.md b/tensorflow/docs_src/get_started/basic_classification.md new file mode 100644 index 0000000000..91bbd85b24 --- /dev/null +++ b/tensorflow/docs_src/get_started/basic_classification.md @@ -0,0 +1,3 @@ +# Basic Classification + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_classification.ipynb) diff --git a/tensorflow/docs_src/get_started/basic_regression.md b/tensorflow/docs_src/get_started/basic_regression.md new file mode 100644 index 0000000000..a535f22f5a --- /dev/null +++ b/tensorflow/docs_src/get_started/basic_regression.md @@ -0,0 +1,3 @@ +# Basic Regression + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_regression.ipynb) diff --git a/tensorflow/docs_src/get_started/basic_text_classification.md b/tensorflow/docs_src/get_started/basic_text_classification.md new file mode 100644 index 0000000000..7c5d4f7896 --- /dev/null +++ b/tensorflow/docs_src/get_started/basic_text_classification.md @@ -0,0 +1,3 @@ +# Basic Text Classification + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_text_classification.ipynb) diff --git a/tensorflow/docs_src/get_started/datasets_quickstart.md b/tensorflow/docs_src/get_started/datasets_quickstart.md deleted file mode 100644 index 020e40dd3b..0000000000 --- a/tensorflow/docs_src/get_started/datasets_quickstart.md +++ /dev/null @@ -1,387 +0,0 @@ -# Datasets Quick Start - -The @{tf.data} module contains a collection of classes that allows you to -easily load data, manipulate it, and pipe it into your model. This document -introduces the API by walking through two simple examples: - -* Reading in-memory data from numpy arrays. -* Reading lines from a csv file. - - - -## Basic input - -Taking slices from an array is the simplest way to get started with `tf.data`. - -The @{$premade_estimators$Premade Estimators} chapter describes -the following `train_input_fn`, from -[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py), -to pipe the data into the Estimator: - -``` python -def train_input_fn(features, labels, batch_size): - """An input function for training""" - # Convert the inputs to a Dataset. - dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) - - # Shuffle, repeat, and batch the examples. - dataset = dataset.shuffle(1000).repeat().batch(batch_size) - - # Return the dataset. - return dataset -``` - -Let's look at this more closely. - -### Arguments - -This function expects three arguments. Arguments expecting an "array" can -accept nearly anything that can be converted to an array with `numpy.array`. -One exception is -[`tuple`](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) -which, as we will see, has special meaning for `Datasets`. - -* `features`: A `{'feature_name':array}` dictionary (or - [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)) - containing the raw input features. -* `labels` : An array containing the - [label](https://developers.google.com/machine-learning/glossary/#label) - for each example. -* `batch_size` : An integer indicating the desired batch size. 
- -In [`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) -we retrieved the Iris data using the `iris_data.load_data()` function. -You can run it, and unpack the results as follows: - -``` python -import iris_data - -# Fetch the data -train, test = iris_data.load_data() -features, labels = train -``` - -Then we passed this data to the input function, with a line similar to this: - -``` python -batch_size=100 -iris_data.train_input_fn(features, labels, batch_size) -``` - -Let's walk through the `train_input_fn()`. - -### Slices - -The function starts by using the @{tf.data.Dataset.from_tensor_slices} function -to create a @{tf.data.Dataset} representing slices of the array. The array is -sliced across the first dimension. For example, an array containing the -@{$tutorials/layers$mnist training data} has a shape of `(60000, 28, 28)`. -Passing this to `from_tensor_slices` returns a `Dataset` object containing -60000 slices, each one a 28x28 image. - -The code that returns this `Dataset` is as follows: - -``` python -train, test = tf.keras.datasets.mnist.load_data() -mnist_x, mnist_y = train - -mnist_ds = tf.data.Dataset.from_tensor_slices(mnist_x) -print(mnist_ds) -``` - -This will print the following line, showing the -@{$programmers_guide/tensors#shapes$shapes} and -@{$programmers_guide/tensors#data_types$types} of the items in -the dataset. Note that a `Dataset` does not know how many items it contains. - -``` None - -``` - -The `Dataset` above represents a simple collection of arrays, but datasets are -much more powerful than this. A `Dataset` can transparently handle any nested -combination of dictionaries or tuples (or -[`namedtuple`](https://docs.python.org/2/library/collections.html#collections.namedtuple) -). - -For example after converting the iris `features` -to a standard python dictionary, you can then convert the dictionary of arrays -to a `Dataset` of dictionaries as follows: - -``` python -dataset = tf.data.Dataset.from_tensor_slices(dict(features)) -print(dataset) -``` -``` None - -``` - -Here we see that when a `Dataset` contains structured elements, the `shapes` -and `types` of the `Dataset` take on the same structure. This dataset contains -dictionaries of @{$programmers_guide/tensors#rank$scalars}, all of type -`tf.float64`. - -The first line of the iris `train_input_fn` uses the same functionality, but -adds another level of structure. It creates a dataset containing -`(features_dict, label)` pairs. - -The following code shows that the label is a scalar with type `int64`: - -``` python -# Convert the inputs to a Dataset. -dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) -print(dataset) -``` -``` - -``` - -### Manipulation - -Currently the `Dataset` would iterate over the data once, in a fixed order, and -only produce a single element at a time. It needs further processing before it -can be used for training. Fortunately, the `tf.data.Dataset` class provides -methods to better prepare the data for training. The next line of the input -function takes advantage of several of these methods: - -``` python -# Shuffle, repeat, and batch the examples. -dataset = dataset.shuffle(1000).repeat().batch(batch_size) -``` - -The @{tf.data.Dataset.shuffle$`shuffle`} method uses a fixed-size buffer to -shuffle the items as they pass through. 
In this case the `buffer_size` is -greater than the number of examples in the `Dataset`, ensuring that the data is -completely shuffled (The Iris data set only contains 150 examples). - -The @{tf.data.Dataset.repeat$`repeat`} method restarts the `Dataset` when -it reaches the end. To limit the number of epochs, set the `count` argument. - -The @{tf.data.Dataset.batch$`batch`} method collects a number of examples and -stacks them, to create batches. This adds a dimension to their shape. The new -dimension is added as the first dimension. The following code uses -the `batch` method on the MNIST `Dataset`, from earlier. This results in a -`Dataset` containing 3D arrays representing stacks of `(28,28)` images: - -``` python -print(mnist_ds.batch(100)) -``` - -``` none - -``` -Note that the dataset has an unknown batch size because the last batch will -have fewer elements. - -In `train_input_fn`, after batching the `Dataset` contains 1D vectors of -elements where each scalar was previously: - -```python -print(dataset) -``` -``` - -``` - - -### Return - -At this point the `Dataset` contains `(features_dict, labels)` pairs. -This is the format expected by the `train` and `evaluate` methods, so the -`input_fn` returns the dataset. - -The `labels` can/should be omitted when using the `predict` method. - - - - -## Reading a CSV File - -The most common real-world use case for the `Dataset` class is to stream data -from files on disk. The @{tf.data} module includes a variety of -file readers. Let's see how parsing the Iris dataset from the csv file looks -using a `Dataset`. - -The following call to the `iris_data.maybe_download` function downloads the -data if necessary, and returns the pathnames of the resulting files: - -``` python -import iris_data -train_path, test_path = iris_data.maybe_download() -``` - -The [`iris_data.csv_input_fn`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py) -function contains an alternative implementation that parses the csv files using -a `Dataset`. - -Let's look at how to build an Estimator-compatible input function that reads -from the local files. - -### Build the `Dataset` - -We start by building a @{tf.data.TextLineDataset$`TextLineDataset`} object to -read the file one line at a time. Then, we call the -@{tf.data.Dataset.skip$`skip`} method to skip over the first line of the file, which contains a header, not an example: - -``` python -ds = tf.data.TextLineDataset(train_path).skip(1) -``` - -### Build a csv line parser - -We will start by building a function to parse a single line. - -The following `iris_data.parse_line` function accomplishes this task using the -@{tf.decode_csv} function, and some simple python code: - -We must parse each of the lines in the dataset in order to generate the -necessary `(features, label)` pairs. The following `_parse_line` function -calls @{tf.decode_csv} to parse a single line into its features -and the label. Since Estimators require that features be represented as a -dictionary, we rely on Python's built-in `dict` and `zip` functions to build -that dictionary. The feature names are the keys of that dictionary. 
-We then call the dictionary's `pop` method to remove the label field from -the features dictionary: - -``` python -# Metadata describing the text columns -COLUMNS = ['SepalLength', 'SepalWidth', - 'PetalLength', 'PetalWidth', - 'label'] -FIELD_DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0]] -def _parse_line(line): - # Decode the line into its fields - fields = tf.decode_csv(line, FIELD_DEFAULTS) - - # Pack the result into a dictionary - features = dict(zip(COLUMNS,fields)) - - # Separate the label from the features - label = features.pop('label') - - return features, label -``` - -### Parse the lines - -Datasets have many methods for manipulating the data while it is being piped -to a model. The most heavily-used method is @{tf.data.Dataset.map$`map`}, which -applies a transformation to each element of the `Dataset`. - -The `map` method takes a `map_func` argument that describes how each item in the -`Dataset` should be transformed. - -
-
-The @{tf.data.Dataset.map$`map`} method applies the `map_func` to
-transform each item in the Dataset.
-
- -So to parse the lines as they are streamed out of the csv file, we pass our -`_parse_line` function to the `map` method: - -``` python -ds = ds.map(_parse_line) -print(ds) -``` -``` None - -``` - -Now instead of simple scalar strings, the dataset contains `(features, label)` -pairs. - -the remainder of the `iris_data.csv_input_fn` function is identical -to `iris_data.train_input_fn` which was covered in the in the -[Basic input](#basic_input) section. - -### Try it out - -This function can be used as a replacement for -`iris_data.train_input_fn`. It can be used to feed an estimator as follows: - -``` python -train_path, test_path = iris_data.maybe_download() - -# All the inputs are numeric -feature_columns = [ - tf.feature_column.numeric_column(name) - for name in iris_data.CSV_COLUMN_NAMES[:-1]] - -# Build the estimator -est = tf.estimator.LinearClassifier(feature_columns, - n_classes=3) -# Train the estimator -batch_size = 100 -est.train( - steps=1000, - input_fn=lambda : iris_data.csv_input_fn(train_path, batch_size)) -``` - -Estimators expect an `input_fn` to take no arguments. To work around this -restriction, we use `lambda` to capture the arguments and provide the expected -interface. - -## Summary - -The `tf.data` module provides a collection of classes and functions for easily -reading data from a variety of sources. Furthermore, `tf.data` has simple -powerful methods for applying a wide variety of standard and custom -transformations. - -Now you have the basic idea of how to efficiently load data into an -Estimator. Consider the following documents next: - - -* @{$custom_estimators}, which demonstrates how to build your own - custom `Estimator` model. -* The @{$low_level_intro#datasets$Low Level Introduction}, which demonstrates - how to experiment directly with `tf.data.Datasets` using TensorFlow's low - level APIs. -* @{$programmers_guide/datasets} which goes into great detail about additional - functionality of `Datasets`. - diff --git a/tensorflow/docs_src/get_started/eager.md b/tensorflow/docs_src/get_started/eager.md index bbb25e20c6..ddf239485a 100644 --- a/tensorflow/docs_src/get_started/eager.md +++ b/tensorflow/docs_src/get_started/eager.md @@ -1,3 +1,3 @@ -# Get Started with Eager Execution +# Custom Training Walkthrough [Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/r1.9.0/samples/core/get_started/eager.ipynb) diff --git a/tensorflow/docs_src/get_started/index.md b/tensorflow/docs_src/get_started/index.md deleted file mode 100644 index 232d2f1547..0000000000 --- a/tensorflow/docs_src/get_started/index.md +++ /dev/null @@ -1,29 +0,0 @@ -# Get Started - -If you are new to machine learning, we recommend taking the following online -course prior to diving into TensorFlow documentation: - - * [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/), - which introduces machine learning concepts and encourages experimentation - with existing TensorFlow code. - -TensorFlow is a tool for machine learning. While it contains a wide range of -functionality, TensorFlow is mainly designed for deep neural network models. - -The easiest way to get started with TensorFlow is by using Eager Execution. - - * @{$get_started/eager}, is for anyone new to machine learning or TensorFlow. - -TensorFlow provides many APIs. The remainder of this section focuses on the -Estimator API which provide scalable, high-performance models. See the -@{$estimators} guide. 
-
-For more advanced users:
-
- * The @{$low_level_intro$Low Level Introduction} demonstrates how to use
-   TensorFlow outside of the Estimator framework, for debugging and
-   experimentation.
- * The @{$programmers_guide$Programmer's Guide} details major
-   TensorFlow components.
- * The @{$tutorials$Tutorials} provide walkthroughs of a variety of
-   TensorFlow models.
diff --git a/tensorflow/docs_src/get_started/leftnav_files b/tensorflow/docs_src/get_started/leftnav_files
index e6cc8d5658..9a60496cb5 100644
--- a/tensorflow/docs_src/get_started/leftnav_files
+++ b/tensorflow/docs_src/get_started/leftnav_files
@@ -1,4 +1,10 @@
-index.md
+### Learn and use ML
+basic_classification.md
+basic_text_classification.md
+basic_regression.md
+overfit_and_underfit.md
+save_and_restore_models.md
+next_steps.md
 
-eager.md
-datasets_quickstart.md
+### Research and experimentation
+custom_training_walkthrough.md
diff --git a/tensorflow/docs_src/get_started/next_steps.md b/tensorflow/docs_src/get_started/next_steps.md
new file mode 100644
index 0000000000..79c0ef3346
--- /dev/null
+++ b/tensorflow/docs_src/get_started/next_steps.md
@@ -0,0 +1,36 @@
+# Next Steps
+
+## Learn more about TensorFlow
+
+* The [TensorFlow Guide](/programmers_guide) includes usage guides for the
+  high-level APIs, as well as advanced TensorFlow operations.
+* [Premade Estimators](/programmers_guide/premade_estimators) are designed to
+  get results out of the box. Use TensorFlow without building your own models.
+* [TensorFlow.js](https://js.tensorflow.org/) allows web developers to train and
+  deploy ML models in the browser and using Node.js.
+* [TFLite](/mobile/tflite) allows mobile developers to do inference efficiently
+  on mobile devices.
+* [TensorFlow Serving](/serving) is an open-source project that can put
+  TensorFlow models in production quickly.
+* The [ecosystem](/ecosystem) contains more projects, including
+  [Magenta](https://magenta.tensorflow.org/), [TFX](/tfx),
+  [Swift for TensorFlow](https://github.com/tensorflow/swift), and more.
+
+## Learn more about machine learning
+
+Recommended resources include:
+
+* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/),
+  a course from Google that introduces machine learning concepts.
+* [CS 20: Tensorflow for Deep Learning Research](http://web.stanford.edu/class/cs20si/),
+  notes from an intro course at Stanford.
+* [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/),
+  a course that teaches how convolutional networks work.
+* [Machine Learning Recipes](https://www.youtube.com/watch?v=cKxRvEZd3Mw&list=PLOU2XLYxmsIIuiBfYad6rFYQU_jL2ryal),
+  a video series that introduces basic machine learning concepts with few prerequisites.
+* [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python),
+  a book by François Chollet about the Keras API, as well as an excellent hands-on intro to deep learning.
+* [Hands-on Machine Learning with Scikit-Learn and TensorFlow](https://github.com/ageron/handson-ml),
+  a book by Aurélien Géron that is a clear getting-started guide to data science and deep learning.
+* [Deep Learning](https://www.deeplearningbook.org/), a book by Ian Goodfellow et al.
+  that provides a technical dive into learning machine learning.
diff --git a/tensorflow/docs_src/get_started/overfit_and_underfit.md b/tensorflow/docs_src/get_started/overfit_and_underfit.md new file mode 100644 index 0000000000..e5b5ae7b5a --- /dev/null +++ b/tensorflow/docs_src/get_started/overfit_and_underfit.md @@ -0,0 +1,3 @@ +# Overfitting and Underfitting + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/overfit_and_underfit.ipynb) diff --git a/tensorflow/docs_src/get_started/save_and_restore_models.md b/tensorflow/docs_src/get_started/save_and_restore_models.md new file mode 100644 index 0000000000..44b3772945 --- /dev/null +++ b/tensorflow/docs_src/get_started/save_and_restore_models.md @@ -0,0 +1,3 @@ +# Save and restore Models + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/save_and_restore_models.ipynb) diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index 9baf6870be..41619ca230 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -491,13 +491,7 @@ TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -If you are new to machine learning, we recommend the following: - -* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course) -* @{$get_started/eager} - -If you are experienced with machine learning but new to TensorFlow, see -@{$get_started/eager}. +To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). ## TensorFlow GPU support diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index 693254f876..eeca389617 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -403,11 +403,7 @@ writing TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -If you are new to machine learning, we recommend the -[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course). - -If you are experienced with machine learning but new to TensorFlow, see -@{$get_started/eager}. +To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). ## Common installation problems diff --git a/tensorflow/docs_src/install/install_raspbian.md b/tensorflow/docs_src/install/install_raspbian.md index 2f425162a1..0caab6d335 100644 --- a/tensorflow/docs_src/install/install_raspbian.md +++ b/tensorflow/docs_src/install/install_raspbian.md @@ -230,11 +230,7 @@ problems, despite the log message. If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -If you are new to machine learning, we recommend the [Machine Learning Crash -Course](https://developers.google.com/machine-learning/crash-course). - -If you are experienced with machine learning but new to TensorFlow, see -@{$get_started/eager}. +To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). 
## Common installation problems diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index 70e97cf556..7afcd340aa 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -362,7 +362,7 @@ TensorFlow programs:
Hello, TensorFlow!
-If you are new to TensorFlow, see @{$get_started/eager}. +To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). diff --git a/tensorflow/docs_src/install/install_windows.md b/tensorflow/docs_src/install/install_windows.md index 6c4f5b85ab..7fe94f0bc3 100644 --- a/tensorflow/docs_src/install/install_windows.md +++ b/tensorflow/docs_src/install/install_windows.md @@ -157,12 +157,7 @@ TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -If you are new to machine learning, we recommend the -[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course). - -If you are experienced with machine learning but new to TensorFlow, see -@{$get_started/eager}. - +To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). ## Common installation problems diff --git a/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md b/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md new file mode 100644 index 0000000000..345a31b985 --- /dev/null +++ b/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md @@ -0,0 +1,387 @@ +# Datasets for Estimators + +The @{tf.data} module contains a collection of classes that allows you to +easily load data, manipulate it, and pipe it into your model. This document +introduces the API by walking through two simple examples: + +* Reading in-memory data from numpy arrays. +* Reading lines from a csv file. + + + +## Basic input + +Taking slices from an array is the simplest way to get started with `tf.data`. + +The @{$premade_estimators$Premade Estimators} chapter describes +the following `train_input_fn`, from +[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py), +to pipe the data into the Estimator: + +``` python +def train_input_fn(features, labels, batch_size): + """An input function for training""" + # Convert the inputs to a Dataset. + dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) + + # Shuffle, repeat, and batch the examples. + dataset = dataset.shuffle(1000).repeat().batch(batch_size) + + # Return the dataset. + return dataset +``` + +Let's look at this more closely. + +### Arguments + +This function expects three arguments. Arguments expecting an "array" can +accept nearly anything that can be converted to an array with `numpy.array`. +One exception is +[`tuple`](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) +which, as we will see, has special meaning for `Datasets`. + +* `features`: A `{'feature_name':array}` dictionary (or + [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)) + containing the raw input features. +* `labels` : An array containing the + [label](https://developers.google.com/machine-learning/glossary/#label) + for each example. +* `batch_size` : An integer indicating the desired batch size. + +In [`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) +we retrieved the Iris data using the `iris_data.load_data()` function. 
+You can run it, and unpack the results as follows: + +``` python +import iris_data + +# Fetch the data +train, test = iris_data.load_data() +features, labels = train +``` + +Then we passed this data to the input function, with a line similar to this: + +``` python +batch_size=100 +iris_data.train_input_fn(features, labels, batch_size) +``` + +Let's walk through the `train_input_fn()`. + +### Slices + +The function starts by using the @{tf.data.Dataset.from_tensor_slices} function +to create a @{tf.data.Dataset} representing slices of the array. The array is +sliced across the first dimension. For example, an array containing the +@{$tutorials/layers$mnist training data} has a shape of `(60000, 28, 28)`. +Passing this to `from_tensor_slices` returns a `Dataset` object containing +60000 slices, each one a 28x28 image. + +The code that returns this `Dataset` is as follows: + +``` python +train, test = tf.keras.datasets.mnist.load_data() +mnist_x, mnist_y = train + +mnist_ds = tf.data.Dataset.from_tensor_slices(mnist_x) +print(mnist_ds) +``` + +This will print the following line, showing the +@{$programmers_guide/tensors#shapes$shapes} and +@{$programmers_guide/tensors#data_types$types} of the items in +the dataset. Note that a `Dataset` does not know how many items it contains. + +``` None + +``` + +The `Dataset` above represents a simple collection of arrays, but datasets are +much more powerful than this. A `Dataset` can transparently handle any nested +combination of dictionaries or tuples (or +[`namedtuple`](https://docs.python.org/2/library/collections.html#collections.namedtuple) +). + +For example after converting the iris `features` +to a standard python dictionary, you can then convert the dictionary of arrays +to a `Dataset` of dictionaries as follows: + +``` python +dataset = tf.data.Dataset.from_tensor_slices(dict(features)) +print(dataset) +``` +``` None + +``` + +Here we see that when a `Dataset` contains structured elements, the `shapes` +and `types` of the `Dataset` take on the same structure. This dataset contains +dictionaries of @{$programmers_guide/tensors#rank$scalars}, all of type +`tf.float64`. + +The first line of the iris `train_input_fn` uses the same functionality, but +adds another level of structure. It creates a dataset containing +`(features_dict, label)` pairs. + +The following code shows that the label is a scalar with type `int64`: + +``` python +# Convert the inputs to a Dataset. +dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) +print(dataset) +``` +``` + +``` + +### Manipulation + +Currently the `Dataset` would iterate over the data once, in a fixed order, and +only produce a single element at a time. It needs further processing before it +can be used for training. Fortunately, the `tf.data.Dataset` class provides +methods to better prepare the data for training. The next line of the input +function takes advantage of several of these methods: + +``` python +# Shuffle, repeat, and batch the examples. +dataset = dataset.shuffle(1000).repeat().batch(batch_size) +``` + +The @{tf.data.Dataset.shuffle$`shuffle`} method uses a fixed-size buffer to +shuffle the items as they pass through. In this case the `buffer_size` is +greater than the number of examples in the `Dataset`, ensuring that the data is +completely shuffled (The Iris data set only contains 150 examples). + +The @{tf.data.Dataset.repeat$`repeat`} method restarts the `Dataset` when +it reaches the end. To limit the number of epochs, set the `count` argument. 
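+
+For example (a minimal sketch, separate from the Iris input function above),
+an input pipeline limited to exactly ten epochs could be written as:
+
+``` python
+# Shuffle within a 1000-element buffer, produce ten epochs, then stop.
+dataset = dataset.shuffle(1000).repeat(count=10).batch(batch_size)
+```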
+ +The @{tf.data.Dataset.batch$`batch`} method collects a number of examples and +stacks them, to create batches. This adds a dimension to their shape. The new +dimension is added as the first dimension. The following code uses +the `batch` method on the MNIST `Dataset`, from earlier. This results in a +`Dataset` containing 3D arrays representing stacks of `(28,28)` images: + +``` python +print(mnist_ds.batch(100)) +``` + +``` none + +``` +Note that the dataset has an unknown batch size because the last batch will +have fewer elements. + +In `train_input_fn`, after batching the `Dataset` contains 1D vectors of +elements where each scalar was previously: + +```python +print(dataset) +``` +``` + +``` + + +### Return + +At this point the `Dataset` contains `(features_dict, labels)` pairs. +This is the format expected by the `train` and `evaluate` methods, so the +`input_fn` returns the dataset. + +The `labels` can/should be omitted when using the `predict` method. + + + + +## Reading a CSV File + +The most common real-world use case for the `Dataset` class is to stream data +from files on disk. The @{tf.data} module includes a variety of +file readers. Let's see how parsing the Iris dataset from the csv file looks +using a `Dataset`. + +The following call to the `iris_data.maybe_download` function downloads the +data if necessary, and returns the pathnames of the resulting files: + +``` python +import iris_data +train_path, test_path = iris_data.maybe_download() +``` + +The [`iris_data.csv_input_fn`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py) +function contains an alternative implementation that parses the csv files using +a `Dataset`. + +Let's look at how to build an Estimator-compatible input function that reads +from the local files. + +### Build the `Dataset` + +We start by building a @{tf.data.TextLineDataset$`TextLineDataset`} object to +read the file one line at a time. Then, we call the +@{tf.data.Dataset.skip$`skip`} method to skip over the first line of the file, which contains a header, not an example: + +``` python +ds = tf.data.TextLineDataset(train_path).skip(1) +``` + +### Build a csv line parser + +We will start by building a function to parse a single line. + +The following `iris_data.parse_line` function accomplishes this task using the +@{tf.decode_csv} function, and some simple python code: + +We must parse each of the lines in the dataset in order to generate the +necessary `(features, label)` pairs. The following `_parse_line` function +calls @{tf.decode_csv} to parse a single line into its features +and the label. Since Estimators require that features be represented as a +dictionary, we rely on Python's built-in `dict` and `zip` functions to build +that dictionary. The feature names are the keys of that dictionary. +We then call the dictionary's `pop` method to remove the label field from +the features dictionary: + +``` python +# Metadata describing the text columns +COLUMNS = ['SepalLength', 'SepalWidth', + 'PetalLength', 'PetalWidth', + 'label'] +FIELD_DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0]] +def _parse_line(line): + # Decode the line into its fields + fields = tf.decode_csv(line, FIELD_DEFAULTS) + + # Pack the result into a dictionary + features = dict(zip(COLUMNS,fields)) + + # Separate the label from the features + label = features.pop('label') + + return features, label +``` + +### Parse the lines + +Datasets have many methods for manipulating the data while it is being piped +to a model. 
The most heavily-used method is @{tf.data.Dataset.map$`map`}, which +applies a transformation to each element of the `Dataset`. + +The `map` method takes a `map_func` argument that describes how each item in the +`Dataset` should be transformed. + +
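+
+As a small illustration (a sketch, independent of the Iris example), the
+`map_func` can be any function that accepts the dataset's element structure:
+
+``` python
+# Square each scalar element of a toy dataset.
+toy_ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])
+toy_ds = toy_ds.map(lambda x: x * x)
+```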
+
+The @{tf.data.Dataset.map$`map`} method applies the `map_func` to
+transform each item in the Dataset.
+
+ +So to parse the lines as they are streamed out of the csv file, we pass our +`_parse_line` function to the `map` method: + +``` python +ds = ds.map(_parse_line) +print(ds) +``` +``` None + +``` + +Now instead of simple scalar strings, the dataset contains `(features, label)` +pairs. + +the remainder of the `iris_data.csv_input_fn` function is identical +to `iris_data.train_input_fn` which was covered in the in the +[Basic input](#basic_input) section. + +### Try it out + +This function can be used as a replacement for +`iris_data.train_input_fn`. It can be used to feed an estimator as follows: + +``` python +train_path, test_path = iris_data.maybe_download() + +# All the inputs are numeric +feature_columns = [ + tf.feature_column.numeric_column(name) + for name in iris_data.CSV_COLUMN_NAMES[:-1]] + +# Build the estimator +est = tf.estimator.LinearClassifier(feature_columns, + n_classes=3) +# Train the estimator +batch_size = 100 +est.train( + steps=1000, + input_fn=lambda : iris_data.csv_input_fn(train_path, batch_size)) +``` + +Estimators expect an `input_fn` to take no arguments. To work around this +restriction, we use `lambda` to capture the arguments and provide the expected +interface. + +## Summary + +The `tf.data` module provides a collection of classes and functions for easily +reading data from a variety of sources. Furthermore, `tf.data` has simple +powerful methods for applying a wide variety of standard and custom +transformations. + +Now you have the basic idea of how to efficiently load data into an +Estimator. Consider the following documents next: + + +* @{$custom_estimators}, which demonstrates how to build your own + custom `Estimator` model. +* The @{$low_level_intro#datasets$Low Level Introduction}, which demonstrates + how to experiment directly with `tf.data.Datasets` using TensorFlow's low + level APIs. +* @{$programmers_guide/datasets} which goes into great detail about additional + functionality of `Datasets`. + diff --git a/tensorflow/docs_src/programmers_guide/index.md b/tensorflow/docs_src/programmers_guide/index.md index 0c2d4afb11..9c58a3b45e 100644 --- a/tensorflow/docs_src/programmers_guide/index.md +++ b/tensorflow/docs_src/programmers_guide/index.md @@ -22,6 +22,7 @@ works. The units are as follows: design yourself. * @{$feature_columns}, which shows how an Estimator can handle a variety of input data types without changes to the model. +* @{$datasets_for_estimators} describes using tf.data with estimators. * @{$checkpoints}, which explains how to save training progress and resume where you left off. diff --git a/tensorflow/docs_src/programmers_guide/leftnav_files b/tensorflow/docs_src/programmers_guide/leftnav_files index 3bcf864e13..357a2a1cb9 100644 --- a/tensorflow/docs_src/programmers_guide/leftnav_files +++ b/tensorflow/docs_src/programmers_guide/leftnav_files @@ -10,6 +10,7 @@ estimators.md: Introduction to Estimators premade_estimators.md custom_estimators.md feature_columns.md +datasets_for_estimators.md checkpoints.md ### Accelerators diff --git a/tensorflow/docs_src/programmers_guide/premade_estimators.md b/tensorflow/docs_src/programmers_guide/premade_estimators.md index f6dd75eaca..02e2caf64b 100644 --- a/tensorflow/docs_src/programmers_guide/premade_estimators.md +++ b/tensorflow/docs_src/programmers_guide/premade_estimators.md @@ -81,7 +81,7 @@ We strongly recommend writing TensorFlow programs with the following APIs: * @{$programmers_guide/estimators$Estimators}, which represent a complete model. 
The Estimator API provides methods to train the model, to judge the model's accuracy, and to generate predictions. -* @{$get_started/datasets_quickstart$Datasets}, which build a data input +* @{$programmers_guide/datasets_for_estimators}, which build a data input pipeline. The Dataset API has methods to load and manipulate data, and feed it into your model. The Dataset API meshes well with the Estimators API. @@ -424,9 +424,7 @@ Now that you've gotten started writing TensorFlow programs, consider the following material: * @{$checkpoints$Checkpoints} to learn how to save and restore models. -* @{$get_started/datasets_quickstart$Datasets} to learn more about importing - data into your - model. +* @{$programmers_guide/datasets_for_estimators} to learn more about importing + data into your model. * @{$custom_estimators$Creating Custom Estimators} to learn how to write your own Estimator, customized for a particular problem. - diff --git a/tensorflow/docs_src/tutorials/index.md b/tensorflow/docs_src/tutorials/index.md index af01d3eaa1..6bd3a3a897 100644 --- a/tensorflow/docs_src/tutorials/index.md +++ b/tensorflow/docs_src/tutorials/index.md @@ -2,9 +2,8 @@ This section contains tutorials demonstrating how to do specific tasks -in TensorFlow. If you are new to TensorFlow, we recommend reading the -documents in the "@{$get_started$Get Started}" section before reading -these tutorials. +in TensorFlow. If you are new to TensorFlow, we recommend reading +[Get Started with TensorFlow](/get_started/). ## Images -- cgit v1.2.3 From d9e006e80990e54913c25de70a1f8e7db2f22bc8 Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Wed, 20 Jun 2018 10:02:25 -0700 Subject: Fix eager path in get_started leftnav PiperOrigin-RevId: 201370156 --- tensorflow/docs_src/get_started/leftnav_files | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tensorflow/docs_src/get_started/leftnav_files b/tensorflow/docs_src/get_started/leftnav_files index 9a60496cb5..5c400a67f0 100644 --- a/tensorflow/docs_src/get_started/leftnav_files +++ b/tensorflow/docs_src/get_started/leftnav_files @@ -7,4 +7,4 @@ save_and_restore_models.md next_steps.md ### Research and experimentation -custom_training_walkthrough.md +eager.md -- cgit v1.2.3 From 4e0b1612e0a71b0e14da2bc37c49e3d65744342c Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Fri, 22 Jun 2018 15:37:58 -0700 Subject: Add Install Raspbian to leftnav. PiperOrigin-RevId: 201752380 --- tensorflow/docs_src/install/leftnav_files | 1 + 1 file changed, 1 insertion(+) diff --git a/tensorflow/docs_src/install/leftnav_files b/tensorflow/docs_src/install/leftnav_files index e523e06f67..ace275c0e8 100644 --- a/tensorflow/docs_src/install/leftnav_files +++ b/tensorflow/docs_src/install/leftnav_files @@ -4,6 +4,7 @@ index.md install_linux.md: Ubuntu install_mac.md: MacOS install_windows.md: Windows +install_raspbian.md: Raspbian install_sources.md: From source >>> migration.md -- cgit v1.2.3 From 2897538b938dcd6d9c63a97f0870232ac9e4819e Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 25 Jun 2018 12:48:40 -0700 Subject: Update r1.9 release notes. - link to new get_started. - Add keras CuDNN layers. - Links for gradient boosted estimators. - Added new contrib-estimators and string-processing. 
- Bumped some minor sounding things down from "Major" to "Bugfix+Other"
---
 RELEASE.md | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/RELEASE.md b/RELEASE.md
index 510eca5467..bfe0da8739 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -1,18 +1,37 @@
 # Release 1.9.0

 ## Major Features And Improvements
-* Update tf.keras to the Keras 2.1.6 API.
+* New `tf.keras` based [get_started](http://tensorflow.org/versions/r1.9/get_started)
+* Update `tf.keras` to the Keras 2.1.6 API.
+* Added [`tf.keras.layers.CuDNNGRU`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNGRU) and [`tf.keras.layers.CuDNNLSTM`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNLSTM) layers. [Try it](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb?linkId=53292082).
+* Added support for core [feature columns](https://www.tensorflow.org/get_started/feature_columns) and [losses](https://www.tensorflow.org/api_docs/python/tf/losses) to [gradient boosted trees estimators](https://github.com/tensorflow/models/tree/master/official/boosted_trees).
+* The [Python interface](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/lite)
+  for the [TFLite Optimizing Converter](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/toco/README.md)
+  has been expanded, and the command line interface (aka `toco`, `tflite_convert`) is once again
+  included in the standard `pip` installation.
+* Improved data-loading and text processing with:
+  * [`tf.decode_compressed`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/decode_compressed)
+  * [`tf.string_strip`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/string_strip)
+  * [`tf.strings.regex_full_match`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/strings/regex_full_match)
+* Added experimental support for new pre-made Estimators:
+  * [`tf.contrib.estimator.BaselineEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/BaselineEstimator)
+  * [`tf.contrib.estimator.RNNClassifier`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNClassifier)
+  * [`tf.contrib.estimator.RNNEstimator`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/estimator/RNNEstimator)
+* The [distributions.Bijector](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/distributions/bijectors/Bijector)
+  API supports broadcasting for Bijectors with new API changes.
+
+## Breaking Changes
+  * If you're opening empty variable scopes; replace `variable_scope('', ...)` by
+    `variable_scope(tf.get_variable_scope(), ...)`.
+
+## Bug Fixes and Other Changes
+
 * `tfe.Network` is deprecated. Please inherit from `tf.keras.Model`.
-* Adding support of core feature columns and losses to gradient boosted trees estimators.
-* The distributions.Bijector API supports broadcasting for Bijectors with new API changes. See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/distributions/bijectors/Bijector) for more details.
+
 * Layered variable names have changed in the following conditions:
   * Using `tf.keras.layers` with custom variable scopes.
See [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details + * Using `tf.layers` in a subclassed `tf.keras.Model` class. See + [here](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/layers) for more details -## Breaking Chances - * If you're opening empty variable scopes; replace `variable_scope`('', ...) by `variable_scope`(`tf.get_variable_scope()`, ...). - -## Bug Fixes and Other Changes * `tf.data`: * `Dataset.from_generator()` now accepts an `args` list, in order to create nested generators. * `Dataset.list_files()` now produces determinstic results when `shuffle=False` or a `seed` is passed. -- cgit v1.2.3 From 56fba15b868145f87109bd5cb155527b0c0640d1 Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 25 Jun 2018 13:16:52 -0700 Subject: Update RELEASE.md --- RELEASE.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index bfe0da8739..f6a52a2951 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,7 +1,7 @@ # Release 1.9.0 ## Major Features And Improvements -* New `tf.keras` based [get_started](http://tensorflow.org/versions/r1.9/get_started) +* New `tf.keras` based [get_started](http://tensorflow.org/versions/r1.9/get_started), and [programmers_guide](http://tensorflow.org/versions/r1.9/programmers_guide/keras). * Update `tf.keras` to the Keras 2.1.6 API. * Added [`tf.keras.layers.CuDNNGRU`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNGRU) and [`tf.keras.layers.CuDNNLSTM`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNLSTM) layers. [Try it](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb?linkId=53292082). * Adding support of core [feature columns](https://www.tensorflow.org/get_started/feature_columns) and [losses](https://www.tensorflow.org/api_docs/python/tf/losses) to [gradient boosted trees estimators](https://github.com/tensorflow/models/tree/master/official/boosted_trees). -- cgit v1.2.3 From ce03a10d70884d2b6d8134b30ad3c5d181877403 Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 25 Jun 2018 13:22:14 -0700 Subject: Update RELEASE.md --- RELEASE.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/RELEASE.md b/RELEASE.md index f6a52a2951..5c79ebec34 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -1,7 +1,8 @@ # Release 1.9.0 ## Major Features And Improvements -* New `tf.keras` based [get_started](http://tensorflow.org/versions/r1.9/get_started), and [programmers_guide](http://tensorflow.org/versions/r1.9/programmers_guide/keras). +* Updated docs for `tf.keras`: New Keras-based [get started](http://tensorflow.org/versions/r1.9/get_started), + and [programmers guide page](http://tensorflow.org/versions/r1.9/programmers_guide/keras). * Update `tf.keras` to the Keras 2.1.6 API. * Added [`tf.keras.layers.CuDNNGRU`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNGRU) and [`tf.keras.layers.CuDNNLSTM`](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/keras/layers/CuDNNLSTM) layers. [Try it](https://colab.sandbox.google.com/github/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb?linkId=53292082). 
* Adding support of core [feature columns](https://www.tensorflow.org/get_started/feature_columns) and [losses](https://www.tensorflow.org/api_docs/python/tf/losses) to [gradient boosted trees estimators](https://github.com/tensorflow/models/tree/master/official/boosted_trees). -- cgit v1.2.3 From b0f2eee339a041de4e7837b68a9ff4fc77ca7c4a Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Fri, 22 Jun 2018 17:40:27 -0700 Subject: Rename programmers_guide/ directory to guide/. Update references in source files and docs in tensorflow and related projects. PiperOrigin-RevId: 201766994 --- README.md | 2 +- RELEASE.md | 6 +- tensorflow/contrib/autograph/README.md | 2 +- tensorflow/contrib/data/__init__.py | 2 +- tensorflow/contrib/eager/README.md | 2 +- .../python/examples/notebooks/3_datasets.ipynb | 6 +- tensorflow/contrib/eager/python/g3doc/guide.md | 4 +- tensorflow/contrib/lite/toco/README.md | 2 +- tensorflow/contrib/tpu/python/tpu/tpu_estimator.py | 2 +- tensorflow/core/protobuf/config.proto | 6 +- tensorflow/docs_src/api_guides/python/client.md | 2 +- .../docs_src/api_guides/python/input_dataset.md | 3 +- .../docs_src/api_guides/python/reading_data.md | 8 +- tensorflow/docs_src/deploy/distributed.md | 2 +- tensorflow/docs_src/extend/architecture.md | 5 +- tensorflow/docs_src/get_started/_index.yaml | 12 +- tensorflow/docs_src/get_started/next_steps.md | 4 +- tensorflow/docs_src/guide/checkpoints.md | 238 +++++ tensorflow/docs_src/guide/custom_estimators.md | 602 +++++++++++++ tensorflow/docs_src/guide/datasets.md | 823 +++++++++++++++++ .../docs_src/guide/datasets_for_estimators.md | 387 ++++++++ tensorflow/docs_src/guide/debugger.md | 804 +++++++++++++++++ tensorflow/docs_src/guide/eager.md | 849 +++++++++++++++++ tensorflow/docs_src/guide/embedding.md | 262 ++++++ tensorflow/docs_src/guide/estimators.md | 193 ++++ tensorflow/docs_src/guide/faq.md | 297 ++++++ tensorflow/docs_src/guide/feature_columns.md | 572 ++++++++++++ tensorflow/docs_src/guide/graph_viz.md | 316 +++++++ tensorflow/docs_src/guide/graphs.md | 558 ++++++++++++ tensorflow/docs_src/guide/index.md | 86 ++ tensorflow/docs_src/guide/keras.md | 623 +++++++++++++ tensorflow/docs_src/guide/leftnav_files | 40 + tensorflow/docs_src/guide/low_level_intro.md | 604 +++++++++++++ tensorflow/docs_src/guide/premade_estimators.md | 430 +++++++++ tensorflow/docs_src/guide/saved_model.md | 999 +++++++++++++++++++++ .../docs_src/guide/summaries_and_tensorboard.md | 225 +++++ .../docs_src/guide/tensorboard_histograms.md | 245 +++++ tensorflow/docs_src/guide/tensors.md | 330 +++++++ tensorflow/docs_src/guide/using_gpu.md | 215 +++++ tensorflow/docs_src/guide/using_tpu.md | 395 ++++++++ tensorflow/docs_src/guide/variables.md | 319 +++++++ tensorflow/docs_src/guide/version_compat.md | 319 +++++++ tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 2 +- .../docs_src/programmers_guide/checkpoints.md | 240 ----- .../programmers_guide/custom_estimators.md | 602 ------------- tensorflow/docs_src/programmers_guide/datasets.md | 823 ----------------- .../programmers_guide/datasets_for_estimators.md | 387 -------- tensorflow/docs_src/programmers_guide/debugger.md | 804 ----------------- tensorflow/docs_src/programmers_guide/eager.md | 849 ----------------- tensorflow/docs_src/programmers_guide/embedding.md | 262 ------ .../docs_src/programmers_guide/estimators.md | 193 ---- tensorflow/docs_src/programmers_guide/faq.md | 297 ------ .../docs_src/programmers_guide/feature_columns.md | 572 ------------ 
tensorflow/docs_src/programmers_guide/graph_viz.md | 316 ------- tensorflow/docs_src/programmers_guide/graphs.md | 558 ------------ tensorflow/docs_src/programmers_guide/index.md | 86 -- tensorflow/docs_src/programmers_guide/keras.md | 623 ------------- .../docs_src/programmers_guide/leftnav_files | 40 - .../docs_src/programmers_guide/low_level_intro.md | 604 ------------- .../programmers_guide/premade_estimators.md | 430 --------- .../docs_src/programmers_guide/saved_model.md | 999 --------------------- .../programmers_guide/summaries_and_tensorboard.md | 225 ----- .../programmers_guide/tensorboard_histograms.md | 245 ----- tensorflow/docs_src/programmers_guide/tensors.md | 330 ------- tensorflow/docs_src/programmers_guide/using_gpu.md | 215 ----- tensorflow/docs_src/programmers_guide/using_tpu.md | 395 -------- tensorflow/docs_src/programmers_guide/variables.md | 319 ------- .../docs_src/programmers_guide/version_compat.md | 319 ------- tensorflow/docs_src/tutorials/deep_cnn.md | 2 +- tensorflow/docs_src/tutorials/layers.md | 2 +- .../how_tos/reading_data/fully_connected_reader.py | 2 +- tensorflow/java/README.md | 5 +- .../src/main/java/org/tensorflow/package-info.java | 2 +- tensorflow/python/data/__init__.py | 2 +- tensorflow/python/data/ops/dataset_ops.py | 14 + tensorflow/python/debug/BUILD | 2 +- tensorflow/python/debug/README.md | 4 +- tensorflow/python/debug/examples/README.md | 4 +- tensorflow/python/estimator/keras.py | 2 +- tensorflow/python/ops/script_ops.py | 2 +- tensorflow/python/tools/saved_model_cli.py | 4 +- third_party/examples/eager/spinn/README.md | 2 +- 83 files changed, 10798 insertions(+), 10789 deletions(-) create mode 100644 tensorflow/docs_src/guide/checkpoints.md create mode 100644 tensorflow/docs_src/guide/custom_estimators.md create mode 100644 tensorflow/docs_src/guide/datasets.md create mode 100644 tensorflow/docs_src/guide/datasets_for_estimators.md create mode 100644 tensorflow/docs_src/guide/debugger.md create mode 100644 tensorflow/docs_src/guide/eager.md create mode 100644 tensorflow/docs_src/guide/embedding.md create mode 100644 tensorflow/docs_src/guide/estimators.md create mode 100644 tensorflow/docs_src/guide/faq.md create mode 100644 tensorflow/docs_src/guide/feature_columns.md create mode 100644 tensorflow/docs_src/guide/graph_viz.md create mode 100644 tensorflow/docs_src/guide/graphs.md create mode 100644 tensorflow/docs_src/guide/index.md create mode 100644 tensorflow/docs_src/guide/keras.md create mode 100644 tensorflow/docs_src/guide/leftnav_files create mode 100644 tensorflow/docs_src/guide/low_level_intro.md create mode 100644 tensorflow/docs_src/guide/premade_estimators.md create mode 100644 tensorflow/docs_src/guide/saved_model.md create mode 100644 tensorflow/docs_src/guide/summaries_and_tensorboard.md create mode 100644 tensorflow/docs_src/guide/tensorboard_histograms.md create mode 100644 tensorflow/docs_src/guide/tensors.md create mode 100644 tensorflow/docs_src/guide/using_gpu.md create mode 100644 tensorflow/docs_src/guide/using_tpu.md create mode 100644 tensorflow/docs_src/guide/variables.md create mode 100644 tensorflow/docs_src/guide/version_compat.md delete mode 100644 tensorflow/docs_src/programmers_guide/checkpoints.md delete mode 100644 tensorflow/docs_src/programmers_guide/custom_estimators.md delete mode 100644 tensorflow/docs_src/programmers_guide/datasets.md delete mode 100644 tensorflow/docs_src/programmers_guide/datasets_for_estimators.md delete mode 100644 tensorflow/docs_src/programmers_guide/debugger.md delete mode 
100644 tensorflow/docs_src/programmers_guide/eager.md delete mode 100644 tensorflow/docs_src/programmers_guide/embedding.md delete mode 100644 tensorflow/docs_src/programmers_guide/estimators.md delete mode 100644 tensorflow/docs_src/programmers_guide/faq.md delete mode 100644 tensorflow/docs_src/programmers_guide/feature_columns.md delete mode 100644 tensorflow/docs_src/programmers_guide/graph_viz.md delete mode 100644 tensorflow/docs_src/programmers_guide/graphs.md delete mode 100644 tensorflow/docs_src/programmers_guide/index.md delete mode 100644 tensorflow/docs_src/programmers_guide/keras.md delete mode 100644 tensorflow/docs_src/programmers_guide/leftnav_files delete mode 100644 tensorflow/docs_src/programmers_guide/low_level_intro.md delete mode 100644 tensorflow/docs_src/programmers_guide/premade_estimators.md delete mode 100644 tensorflow/docs_src/programmers_guide/saved_model.md delete mode 100644 tensorflow/docs_src/programmers_guide/summaries_and_tensorboard.md delete mode 100644 tensorflow/docs_src/programmers_guide/tensorboard_histograms.md delete mode 100644 tensorflow/docs_src/programmers_guide/tensors.md delete mode 100644 tensorflow/docs_src/programmers_guide/using_gpu.md delete mode 100644 tensorflow/docs_src/programmers_guide/using_tpu.md delete mode 100644 tensorflow/docs_src/programmers_guide/variables.md delete mode 100644 tensorflow/docs_src/programmers_guide/version_compat.md diff --git a/README.md b/README.md index 6fb4486d0d..4e4d139bd1 100644 --- a/README.md +++ b/README.md @@ -14,7 +14,7 @@ data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture enables you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting -code. TensorFlow also includes [TensorBoard](https://www.tensorflow.org/programmers_guide/summaries_and_tensorboard), a data visualization toolkit. +code. TensorFlow also includes [TensorBoard](https://www.tensorflow.org/guide/summaries_and_tensorboard), a data visualization toolkit. TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research diff --git a/RELEASE.md b/RELEASE.md index 510eca5467..5fec61af7e 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -467,7 +467,7 @@ answered questions, and were part of inspiring discussions. ## Major Features And Improvements * `tf.keras` is now part of the core TensorFlow API. -* [`tf.data`](http://tensorflow.org/programmers_guide/datasets) is now part of +* [`tf.data`](http://tensorflow.org/guide/datasets) is now part of the core TensorFlow API. * The API is now subject to backwards compatibility guarantees. @@ -495,7 +495,7 @@ answered questions, and were part of inspiring discussions. * TensorFlow Debugger (tfdbg): * Add `eval` command to allow evaluation of arbitrary Python/numpy expressions in tfdbg command-line interface. See - [Debugging TensorFlow Programs](https://www.tensorflow.org/programmers_guide/debugger) + [Debugging TensorFlow Programs](https://www.tensorflow.org/guide/debugger) for more details. * Usability improvement: The frequently used tensor filter `has_inf_or_nan` is now added to `Session` wrappers and hooks by default. So there is no need @@ -782,7 +782,7 @@ answered questions, and were part of inspiring discussions. 
* Support client-provided ClusterSpec's and propagate them to all workers to enable the creation of dynamic TensorFlow clusters. * TensorFlow C library now available for Windows. * We released a new open-source version of TensorBoard. -* [`SavedModel CLI`](https://www.tensorflow.org/versions/master/programmers_guide/saved_model_cli) tool available to inspect and execute MetaGraph in SavedModel +* [`SavedModel CLI`](https://www.tensorflow.org/versions/master/guide/saved_model_cli) tool available to inspect and execute MetaGraph in SavedModel * Android releases of TensorFlow are now pushed to jcenter for easier integration into apps. See https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/android/README.md diff --git a/tensorflow/contrib/autograph/README.md b/tensorflow/contrib/autograph/README.md index 674859bed4..47b1d4a99a 100644 --- a/tensorflow/contrib/autograph/README.md +++ b/tensorflow/contrib/autograph/README.md @@ -4,7 +4,7 @@ IMPORTANT: AutoGraph is alpha software, and under active development. Expect rou AutoGraph is a Python to TensorFlow compiler. -With AutoGraph, you can write [Eager style](https://www.tensorflow.org/programmers_guide/eager) code in a concise manner, and run it as a TensorFlow graph. AutoGraph uses source code transformation and partial evaluation to generate Python code that builds an equivalent TensorFlow subgraph. The result is code that behaves like ops and can be freely combined with other TensorFlow ops. +With AutoGraph, you can write [Eager style](https://www.tensorflow.org/guide/eager) code in a concise manner, and run it as a TensorFlow graph. AutoGraph uses source code transformation and partial evaluation to generate Python code that builds an equivalent TensorFlow subgraph. The result is code that behaves like ops and can be freely combined with other TensorFlow ops. For example, this Python function: diff --git a/tensorflow/contrib/data/__init__.py b/tensorflow/contrib/data/__init__.py index 9c6a13333e..3510e7b1ad 100644 --- a/tensorflow/contrib/data/__init__.py +++ b/tensorflow/contrib/data/__init__.py @@ -20,7 +20,7 @@ be used in conjunction with the @{tf.data.Dataset} API. Note that the guarantees as `tf.data`, but we will provide deprecation advice in advance of removing existing functionality. -See the @{$datasets$Importing Data} Programmer's Guide for an overview. +See @{$guide/datasets$Importing Data} for an overview. 
@@Counter @@CheckpointInputPipelineHook diff --git a/tensorflow/contrib/eager/README.md b/tensorflow/contrib/eager/README.md index 4384431e7b..86d203452e 100644 --- a/tensorflow/contrib/eager/README.md +++ b/tensorflow/contrib/eager/README.md @@ -44,7 +44,7 @@ Installation instructions at https://www.tensorflow.org/install/ For an introduction to eager execution in TensorFlow, see: -- [User Guide](https://www.tensorflow.org/programmers_guide/eager) ([source](../../docs_src/programmers_guide/eager.md)) +- [User Guide](https://www.tensorflow.org/guide/eager) ([source](../../docs_src/guide/eager.md)) - Notebook: [Basic Usage](python/examples/notebooks/1_basics.ipynb) - Notebook: [Gradients](python/examples/notebooks/2_gradients.ipynb) - Notebook: [Importing Data](python/examples/notebooks/3_datasets.ipynb) diff --git a/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb index bfcc7feb07..d268cbcd91 100644 --- a/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb +++ b/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb @@ -9,7 +9,7 @@ "source": [ "# Eager Execution Tutorial: Importing Data\n", "\n", - "This notebook demonstrates the use of the [`tf.data.Dataset` API](https://www.tensorflow.org/programmers_guide/datasets) to build pipelines to feed data to your program. It covers:\n", + "This notebook demonstrates the use of the [`tf.data.Dataset` API](https://www.tensorflow.org/guide/datasets) to build pipelines to feed data to your program. It covers:\n", "\n", "* Creating a `Dataset`.\n", "* Iteration over a `Dataset` with eager execution enabled.\n", @@ -18,7 +18,7 @@ "\n", "If you're familiar with TensorFlow graphs, the API for constructing the `Dataset` object remains exactly the same when eager execution is enabled, but the process of iterating over elements of the dataset is slightly simpler.\n", "You can use Python iteration over the `tf.data.Dataset` object and do not need to explicitly create an `tf.data.Iterator` object.\n", - "As a result, the discussion on iterators in the [Programmer's Guide](https://www.tensorflow.org/programmers_guide/datasets) is not relevant when eager execution is enabled." + "As a result, the discussion on iterators in the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets) is not relevant when eager execution is enabled." ] }, { @@ -63,7 +63,7 @@ "source": [ "# Step 1: Create a source `Dataset`\n", "\n", - "Create a _source_ dataset using one of the factory functions like [`Dataset.from_tensors`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensors), [`Dataset.from_tensor_slices`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensor_slices) or using objects that read from files like [`TextLineDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset) or [`TFRecordDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). See the [Programmer's Guide](https://www.google.com/url?sa=D\u0026q=https%3A%2F%2Fwww.tensorflow.org%2Fprogrammers_guide%2Fdatasets%23reading_input_data) for more information." 
+ "Create a _source_ dataset using one of the factory functions like [`Dataset.from_tensors`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensors), [`Dataset.from_tensor_slices`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensor_slices) or using objects that read from files like [`TextLineDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset) or [`TFRecordDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). See the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets#reading_input_data) for more information." ] }, { diff --git a/tensorflow/contrib/eager/python/g3doc/guide.md b/tensorflow/contrib/eager/python/g3doc/guide.md index 2d2aba6908..23f33d0230 100644 --- a/tensorflow/contrib/eager/python/g3doc/guide.md +++ b/tensorflow/contrib/eager/python/g3doc/guide.md @@ -4,8 +4,8 @@ Eager execution is a feature that makes TensorFlow execute operations immediately: concrete values are returned, instead of creating a computational graph that is executed later. -A user guide is available: https://www.tensorflow.org/programmers_guide/eager -([source file](../../../../docs_src/programmers_guide/eager.md)) +A user guide is available: https://www.tensorflow.org/guide/eager +([source file](../../../../docs_src/guide/eager.md)) We welcome feedback through [GitHub issues](https://github.com/tensorflow/tensorflow/labels/comp:eager). diff --git a/tensorflow/contrib/lite/toco/README.md b/tensorflow/contrib/lite/toco/README.md index 522e260ad2..ee83c7a6e3 100644 --- a/tensorflow/contrib/lite/toco/README.md +++ b/tensorflow/contrib/lite/toco/README.md @@ -17,7 +17,7 @@ Usage information is given in these documents: Once an application developer has a trained TensorFlow model, TOCO will accept that model and generate a TensorFlow Lite [FlatBuffer](https://google.github.io/flatbuffers/) file. TOCO currently supports -[SavedModels](https://www.tensorflow.org/programmers_guide/saved_model#using_savedmodel_with_estimators) +[SavedModels](https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators) and frozen graphs (models generated via [freeze_graph.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/freeze_graph.py)). The TensorFlow Lite FlatBuffer file can be shipped to client devices, generally diff --git a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py index 7c770912b4..c57acd0a2d 100644 --- a/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py +++ b/tensorflow/contrib/tpu/python/tpu/tpu_estimator.py @@ -1103,7 +1103,7 @@ class _InputPipeline(object): err_msg = ('Input pipeline contains one or more QueueRunners. ' 'It could be slow and not scalable. Please consider ' 'converting your input pipeline to use `tf.data` instead (see ' - 'https://www.tensorflow.org/programmers_guide/datasets for ' + 'https://www.tensorflow.org/guide/datasets for ' 'instructions.') if _WRAP_INPUT_FN_INTO_WHILE_LOOP: raise RuntimeError(err_msg) diff --git a/tensorflow/core/protobuf/config.proto b/tensorflow/core/protobuf/config.proto index 9a48f43a63..d83215d5c2 100644 --- a/tensorflow/core/protobuf/config.proto +++ b/tensorflow/core/protobuf/config.proto @@ -147,7 +147,7 @@ message GPUOptions { // Everything inside experimental is subject to change and is not subject // to API stability guarantees in - // https://www.tensorflow.org/programmers_guide/version_compat. + // https://www.tensorflow.org/guide/version_compat. 
Experimental experimental = 9;
 };
@@ -381,7 +381,7 @@ message ConfigProto {
 
   // Everything inside Experimental is subject to change and is not subject
   // to API stability guarantees in
-  // https://www.tensorflow.org/programmers_guide/version_compat.
+  // https://www.tensorflow.org/guide/version_compat.
   message Experimental {
     // Task name for group resolution.
     string collective_group_leader = 1;
@@ -426,7 +426,7 @@ message RunOptions {
 
   // Everything inside Experimental is subject to change and is not subject
   // to API stability guarantees in
-  // https://www.tensorflow.org/programmers_guide/version_compat.
+  // https://www.tensorflow.org/guide/version_compat.
   message Experimental {
     // If non-zero, declares that this graph is going to use collective
     // ops and must synchronize step_ids with any other graph with this
diff --git a/tensorflow/docs_src/api_guides/python/client.md b/tensorflow/docs_src/api_guides/python/client.md
index eef23696db..27fc8610bf 100644
--- a/tensorflow/docs_src/api_guides/python/client.md
+++ b/tensorflow/docs_src/api_guides/python/client.md
@@ -3,7 +3,7 @@
 
 This library contains classes for launching graphs and executing operations.
 
-@{$programmers_guide/low_level_intro$This guide} has examples of how a graph
+@{$guide/low_level_intro$This guide} has examples of how a graph
 is launched in a @{tf.Session}.
 
 ## Session management
diff --git a/tensorflow/docs_src/api_guides/python/input_dataset.md b/tensorflow/docs_src/api_guides/python/input_dataset.md
index a6e2fc48e0..a6612d1bf7 100644
--- a/tensorflow/docs_src/api_guides/python/input_dataset.md
+++ b/tensorflow/docs_src/api_guides/python/input_dataset.md
@@ -2,8 +2,7 @@
 [TOC]
 
 @{tf.data.Dataset} allows you to build complex input pipelines. See the
-@{$datasets$programmer's guide} for an in-depth explanation of how to use this
-API.
+@{$guide/datasets} for an in-depth explanation of how to use this API.
 
 ## Reader classes
diff --git a/tensorflow/docs_src/api_guides/python/reading_data.md b/tensorflow/docs_src/api_guides/python/reading_data.md
index 5bbbfd3216..d7d0904ae2 100644
--- a/tensorflow/docs_src/api_guides/python/reading_data.md
+++ b/tensorflow/docs_src/api_guides/python/reading_data.md
@@ -16,8 +16,8 @@ There are four methods of getting data into a TensorFlow program:
 
 ## `tf.data` API
 
-See the @{$datasets$programmer's guide} for an in-depth explanation of
-@{tf.data.Dataset}. The `tf.data` API enables you to extract and preprocess data
+See @{$guide/datasets} for an in-depth explanation of @{tf.data.Dataset}.
+The `tf.data` API enables you to extract and preprocess data
 from different input/file formats, and apply transformations such as batching,
 shuffling, and mapping functions over the dataset. This is an improved version
 of the old input methods---feeding and `QueueRunner`---which are described
@@ -511,8 +511,8 @@ You can have the train and eval in the same graph in the same process, and
 share their trained variables or layers. See @{$variables$the shared variables
 tutorial}. To support the single-graph approach
-@{$programmers_guide/datasets$`tf.data`} also supplies
-@{$programmers_guide/datasets#creating_an_iterator$advanced iterator types} that
+@{$guide/datasets$`tf.data`} also supplies
+@{$guide/datasets#creating_an_iterator$advanced iterator types}
 that allow the user to change the input pipeline without rebuilding the graph
 or session.
diff --git a/tensorflow/docs_src/deploy/distributed.md b/tensorflow/docs_src/deploy/distributed.md index d7ed6b1deb..8e2c818e39 100644 --- a/tensorflow/docs_src/deploy/distributed.md +++ b/tensorflow/docs_src/deploy/distributed.md @@ -2,7 +2,7 @@ This document shows how to create a cluster of TensorFlow servers, and how to distribute a computation graph across that cluster. We assume that you are -familiar with the @{$programmers_guide/low_level_intro$basic concepts} of +familiar with the @{$guide/low_level_intro$basic concepts} of writing low level TensorFlow programs. ## Hello distributed TensorFlow! diff --git a/tensorflow/docs_src/extend/architecture.md b/tensorflow/docs_src/extend/architecture.md index c8f522a03a..84435a57f2 100644 --- a/tensorflow/docs_src/extend/architecture.md +++ b/tensorflow/docs_src/extend/architecture.md @@ -7,9 +7,8 @@ learning models and system-level optimizations. This document describes the system architecture that makes this combination of scale and flexibility possible. It assumes that you have basic familiarity with TensorFlow programming concepts such as the computation graph, operations, -and sessions. See @{$programmers_guide/low_level_intro$this document} -for an introduction to these topics. Some familiarity -with @{$distributed$distributed TensorFlow} +and sessions. See @{$guide/low_level_intro$this document} for an introduction to +these topics. Some familiarity with @{$distributed$distributed TensorFlow} will also be helpful. This document is for developers who want to extend TensorFlow in some way not diff --git a/tensorflow/docs_src/get_started/_index.yaml b/tensorflow/docs_src/get_started/_index.yaml index af255a482d..277fc852fb 100644 --- a/tensorflow/docs_src/get_started/_index.yaml +++ b/tensorflow/docs_src/get_started/_index.yaml @@ -74,7 +74,7 @@ landing_page: The high-level Keras API provides building blocks to create and train deep learning models. Start with these beginner-friendly notebook examples, then read the - TensorFlow Keras guide. + TensorFlow Keras guide.

  1. Basic classification
  2. @@ -85,7 +85,7 @@ landing_page:
- classname: tfo-landing-row-item-code-block @@ -123,7 +123,7 @@ landing_page:

Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with - these notebooks, then read the eager execution guide. + these notebooks, then read the eager execution guide.

  1. @@ -165,7 +165,7 @@ landing_page:
- custom_html: > @@ -177,7 +177,7 @@ landing_page:

Estimators can train large models on multiple machines in a production environment. Try the examples below and read the - Estimators guide. + Estimators guide.

  1. How to build a simple text classifier with TF-Hub
  2. @@ -186,7 +186,7 @@ landing_page:
diff --git a/tensorflow/docs_src/get_started/next_steps.md b/tensorflow/docs_src/get_started/next_steps.md index 79c0ef3346..6318a39c6c 100644 --- a/tensorflow/docs_src/get_started/next_steps.md +++ b/tensorflow/docs_src/get_started/next_steps.md @@ -2,9 +2,9 @@ ## Learn more about TensorFlow -* The [TensorFlow Guide](/programmers_guide) includes usage guides for the +* The [TensorFlow Guide](/guide) includes usage guides for the high-level APIs, as well as advanced TensorFlow operations. -* [Premade Estimators](/programmers_guide/premade_estimators) are designed to +* [Premade Estimators](/guide/premade_estimators) are designed to get results out of the box. Use TensorFlow without building your own models. * [TensorFlow.js](https://js.tensorflow.org/) allows web developers to train and deploy ML models in the browser and using Node.js. diff --git a/tensorflow/docs_src/guide/checkpoints.md b/tensorflow/docs_src/guide/checkpoints.md new file mode 100644 index 0000000000..dfb2626b86 --- /dev/null +++ b/tensorflow/docs_src/guide/checkpoints.md @@ -0,0 +1,238 @@ +# Checkpoints + +This document examines how to save and restore TensorFlow models built with +Estimators. TensorFlow provides two model formats: + +* checkpoints, which is a format dependent on the code that created + the model. +* SavedModel, which is a format independent of the code that created + the model. + +This document focuses on checkpoints. For details on `SavedModel`, see the +@{$saved_model$Saving and Restoring} guide. + + +## Sample code + +This document relies on the same +[Iris classification example](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) detailed in @{$premade_estimators$Getting Started with TensorFlow}. +To download and access the example, invoke the following two commands: + +```shell +git clone https://github.com/tensorflow/models/ +cd models/samples/core/get_started +``` + +Most of the code snippets in this document are minor variations +on `premade_estimator.py`. + + +## Saving partially-trained models + +Estimators automatically write the following to disk: + +* **checkpoints**, which are versions of the model created during training. +* **event files**, which contain information that + [TensorBoard](https://developers.google.com/machine-learning/glossary/#TensorBoard) + uses to create visualizations. + +To specify the top-level directory in which the Estimator stores its +information, assign a value to the optional `model_dir` argument of *any* +`Estimator`'s constructor. +Taking `DNNClassifier` as an example, +the following code sets the `model_dir` +argument to the `models/iris` directory: + +```python +classifier = tf.estimator.DNNClassifier( + feature_columns=my_feature_columns, + hidden_units=[10, 10], + n_classes=3, + model_dir='models/iris') +``` + +Suppose you call the Estimator's `train` method. For example: + + +```python +classifier.train( + input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100), + steps=200) +``` + +As suggested by the following diagrams, the first call to `train` +adds checkpoints and other files to the `model_dir` directory: + +
+[Figure: The first call to train().]
+ + +To see the objects in the created `model_dir` directory on a +UNIX-based system, just call `ls` as follows: + +```none +$ ls -1 models/iris +checkpoint +events.out.tfevents.timestamp.hostname +graph.pbtxt +model.ckpt-1.data-00000-of-00001 +model.ckpt-1.index +model.ckpt-1.meta +model.ckpt-200.data-00000-of-00001 +model.ckpt-200.index +model.ckpt-200.meta +``` + +The preceding `ls` command shows that the Estimator created checkpoints +at steps 1 (the start of training) and 200 (the end of training). + + +### Default checkpoint directory + +If you don't specify `model_dir` in an Estimator's constructor, the Estimator +writes checkpoint files to a temporary directory chosen by Python's +[tempfile.mkdtemp](https://docs.python.org/3/library/tempfile.html#tempfile.mkdtemp) +function. For example, the following Estimator constructor does *not* specify +the `model_dir` argument: + +```python +classifier = tf.estimator.DNNClassifier( + feature_columns=my_feature_columns, + hidden_units=[10, 10], + n_classes=3) + +print(classifier.model_dir) +``` + +The `tempfile.mkdtemp` function picks a secure, temporary directory +appropriate for your operating system. For example, a typical temporary +directory on macOS might be something like the following: + +```None +/var/folders/0s/5q9kfzfj3gx2knj0vj8p68yc00dhcr/T/tmpYm1Rwa +``` + +### Checkpointing Frequency + +By default, the Estimator saves +[checkpoints](https://developers.google.com/machine-learning/glossary/#checkpoint) +in the `model_dir` according to the following schedule: + +* Writes a checkpoint every 10 minutes (600 seconds). +* Writes a checkpoint when the `train` method starts (first iteration) + and completes (final iteration). +* Retains only the 5 most recent checkpoints in the directory. + +You may alter the default schedule by taking the following steps: + +1. Create a @{tf.estimator.RunConfig$`RunConfig`} object that defines the + desired schedule. +2. When instantiating the Estimator, pass that `RunConfig` object to the + Estimator's `config` argument. + +For example, the following code changes the checkpointing schedule to every +20 minutes and retains the 10 most recent checkpoints: + +```python +my_checkpointing_config = tf.estimator.RunConfig( + save_checkpoints_secs = 20*60, # Save checkpoints every 20 minutes. + keep_checkpoint_max = 10, # Retain the 10 most recent checkpoints. +) + +classifier = tf.estimator.DNNClassifier( + feature_columns=my_feature_columns, + hidden_units=[10, 10], + n_classes=3, + model_dir='models/iris', + config=my_checkpointing_config) +``` + +## Restoring your model + +The first time you call an Estimator's `train` method, TensorFlow saves a +checkpoint to the `model_dir`. Each subsequent call to the Estimator's +`train`, `evaluate`, or `predict` method causes the following: + +1. The Estimator builds the model's + [graph](https://developers.google.com/machine-learning/glossary/#graph) + by running the `model_fn()`. (For details on the `model_fn()`, see + @{$custom_estimators$Creating Custom Estimators.}) +2. The Estimator initializes the weights of the new model from the data + stored in the most recent checkpoint. + +In other words, as the following illustration suggests, once checkpoints +exist, TensorFlow rebuilds the model each time you call `train()`, +`evaluate()`, or `predict()`. + +
+[Figure: Subsequent calls to train(), evaluate(), or predict().]
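+For instance, here is a minimal sketch reusing the Iris classifier from above
+(`eval_input_fn`, `test_x`, and `test_y` are assumed to be defined as in
+`premade_estimator.py`). In a fresh process, `evaluate` rebuilds the graph and
+restores the weights before computing any metrics:
+
+```python
+classifier = tf.estimator.DNNClassifier(
+    feature_columns=my_feature_columns,
+    hidden_units=[10, 10],
+    n_classes=3,
+    model_dir='models/iris')
+
+# No prior call to train() is needed in this process: evaluate() runs
+# model_fn to rebuild the graph, then initializes the weights from the
+# most recent checkpoint in 'models/iris'.
+eval_result = classifier.evaluate(
+    input_fn=lambda: eval_input_fn(test_x, test_y, batch_size=100))
+print(eval_result)
+```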
+
+
+### Avoiding a bad restoration
+
+Restoring a model's state from a checkpoint only works if the model
+and checkpoint are compatible. For example, suppose you trained a
+`DNNClassifier` Estimator containing two hidden layers,
+each having 10 nodes:
+
+```python
+classifier = tf.estimator.DNNClassifier(
+    feature_columns=feature_columns,
+    hidden_units=[10, 10],
+    n_classes=3,
+    model_dir='models/iris')
+
+classifier.train(
+    input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100),
+    steps=200)
+```
+
+After training (and, therefore, after creating checkpoints in `models/iris`),
+imagine that you changed the number of neurons in each hidden layer from 10 to
+20 and then attempted to retrain the model:
+
+``` python
+classifier2 = tf.estimator.DNNClassifier(
+    feature_columns=my_feature_columns,
+    hidden_units=[20, 20],  # Change the number of neurons in the model.
+    n_classes=3,
+    model_dir='models/iris')
+
+classifier2.train(
+    input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100),
+    steps=200)
+```
+
+Since the state in the checkpoint is incompatible with the model described
+in `classifier2`, retraining fails with the following error:
+
+```None
+...
+InvalidArgumentError (see above for traceback): tensor_name =
+dnn/hiddenlayer_1/bias/t_0/Adagrad; shape in shape_and_slice spec [10]
+does not match the shape stored in checkpoint: [20]
+```
+
+To run experiments in which you train and compare slightly different
+versions of a model, save a copy of the code that created each
+`model_dir`, possibly by creating a separate git branch for each version.
+This separation will keep your checkpoints recoverable.
+
+## Summary
+
+Checkpoints provide an easy automatic mechanism for saving and restoring
+models created by Estimators.
+
+See the @{$saved_model$Saving and Restoring} guide for details about:
+
+* Saving and restoring models using low-level TensorFlow APIs.
+* Exporting and importing models in the SavedModel format, which is a
+  language-neutral, recoverable, serialization format.
diff --git a/tensorflow/docs_src/guide/custom_estimators.md b/tensorflow/docs_src/guide/custom_estimators.md
new file mode 100644
index 0000000000..fb20b35c12
--- /dev/null
+++ b/tensorflow/docs_src/guide/custom_estimators.md
@@ -0,0 +1,602 @@
+
+# Creating Custom Estimators
+
+This document introduces custom Estimators. In particular, this document
+demonstrates how to create a custom @{tf.estimator.Estimator$Estimator} that
+mimics the behavior of the pre-made Estimator
+@{tf.estimator.DNNClassifier$`DNNClassifier`} in solving the Iris problem. See
+the @{$premade_estimators$Pre-Made Estimators chapter} for details
+on the Iris problem.
+
+To download and access the example code, invoke the following two commands:
+
+```shell
+git clone https://github.com/tensorflow/models/
+cd models/samples/core/get_started
+```
+
+In this document we will be looking at
+[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py).
+You can run it with the following command:
+
+```bsh
+python custom_estimator.py
+```
+
+If you are feeling impatient, feel free to compare and contrast
+[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
+with
+[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py)
+(which is in the same directory).
+
+## Pre-made vs. custom
+
+As the following figure shows, pre-made Estimators are subclasses of the
+@{tf.estimator.Estimator} base class, while custom Estimators are usually
+direct instances of @{tf.estimator.Estimator}:
+
+[Figure: Pre-made and custom Estimators are all Estimators. Pre-made Estimators are subclasses of `Estimator`, while custom Estimators are usually direct instances of `Estimator`.]
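+To make the relationship concrete, here is a minimal sketch (the full
+versions of both constructor calls appear later in this document; the names
+`my_feature_columns` and `my_model_fn` match the ones used below):
+
+```python
+# Pre-made Estimator: the model function is already written for you.
+premade = tf.estimator.DNNClassifier(
+    feature_columns=my_feature_columns,
+    hidden_units=[10, 10],
+    n_classes=3)
+
+# Custom Estimator: you supply the model function yourself.
+custom = tf.estimator.Estimator(
+    model_fn=my_model_fn,
+    params={
+        'feature_columns': my_feature_columns,
+        'hidden_units': [10, 10],
+        'n_classes': 3,
+    })
+```
+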
+ +Pre-made Estimators are fully baked. Sometimes though, you need more control +over an Estimator's behavior. That's where custom Estimators come in. You can +create a custom Estimator to do just about anything. If you want hidden layers +connected in some unusual fashion, write a custom Estimator. If you want to +calculate a unique +[metric](https://developers.google.com/machine-learning/glossary/#metric) +for your model, write a custom Estimator. Basically, if you want an Estimator +optimized for your specific problem, write a custom Estimator. + +A model function (or `model_fn`) implements the ML algorithm. The +only difference between working with pre-made Estimators and custom Estimators +is: + +* With pre-made Estimators, someone already wrote the model function for you. +* With custom Estimators, you must write the model function. + +Your model function could implement a wide range of algorithms, defining all +sorts of hidden layers and metrics. Like input functions, all model functions +must accept a standard group of input parameters and return a standard group of +output values. Just as input functions can leverage the Dataset API, model +functions can leverage the Layers API and the Metrics API. + +Let's see how to solve the Iris problem with a custom Estimator. A quick +reminder--here's the organization of the Iris model that we're trying to mimic: + +
+[Figure: A diagram of the network architecture: inputs, two hidden layers, and outputs. Our implementation of Iris contains four features, two hidden layers, and a logits output layer.]
+
+## Write an Input function
+
+Our custom Estimator implementation uses the same input function as our
+@{$premade_estimators$pre-made Estimator implementation}, from
+[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py).
+Namely:
+
+```python
+def train_input_fn(features, labels, batch_size):
+    """An input function for training"""
+    # Convert the inputs to a Dataset.
+    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
+
+    # Shuffle, repeat, and batch the examples.
+    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
+
+    # Return the read end of the pipeline.
+    return dataset.make_one_shot_iterator().get_next()
+```
+
+This input function builds an input pipeline that yields batches of
+`(features, labels)` pairs, where `features` is a dictionary of features.
+
+## Create feature columns
+
+As detailed in the @{$premade_estimators$Premade Estimators} and
+@{$feature_columns$Feature Columns} chapters, you must define
+your model's feature columns to specify how the model should use each feature.
+Whether working with pre-made Estimators or custom Estimators, you define
+feature columns in the same fashion.
+
+The following code creates a simple `numeric_column` for each input feature,
+indicating that the value of the input feature should be used directly as an
+input to the model:
+
+```python
+# Feature columns describe how to use the input.
+my_feature_columns = []
+for key in train_x.keys():
+    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
+```
+
+## Write a model function
+
+The model function we'll use has the following call signature:
+
+```python
+def my_model_fn(
+    features,  # This is batch_features from input_fn
+    labels,    # This is batch_labels from input_fn
+    mode,      # An instance of tf.estimator.ModeKeys
+    params):   # Additional configuration
+```
+
+The first two arguments are the batches of features and labels returned from
+the input function; that is, `features` and `labels` are the handles to the
+data your model will use. The `mode` argument indicates whether the caller is
+requesting training, predicting, or evaluation.
+
+The caller may pass `params` to an Estimator's constructor. Any `params` passed
+to the constructor are in turn passed on to the `model_fn`. In
+[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
+the following lines create the estimator and set the params to configure the
+model. This configuration step is similar to how we configured the
+@{tf.estimator.DNNClassifier} in @{$premade_estimators}.
+
+```python
+classifier = tf.estimator.Estimator(
+    model_fn=my_model,
+    params={
+        'feature_columns': my_feature_columns,
+        # Two hidden layers of 10 nodes each.
+        'hidden_units': [10, 10],
+        # The model must choose between 3 classes.
+        'n_classes': 3,
+    })
+```
+
+To implement a typical model function, you must do the following:
+
+* [Define the model](#define_the_model).
+* Specify additional calculations for each of + the [three different modes](#modes): + * [Predict](#predict) + * [Evaluate](#evaluate) + * [Train](#train) + +## Define the model + +The basic deep neural network model must define the following three sections: + +* An [input layer](https://developers.google.com/machine-learning/glossary/#input_layer) +* One or more [hidden layers](https://developers.google.com/machine-learning/glossary/#hidden_layer) +* An [output layer](https://developers.google.com/machine-learning/glossary/#output_layer) + +### Define the input layer + +The first line of the `model_fn` calls @{tf.feature_column.input_layer} to +convert the feature dictionary and `feature_columns` into input for your model, +as follows: + +```python + # Use `input_layer` to apply the feature columns. + net = tf.feature_column.input_layer(features, params['feature_columns']) +``` + +The preceding line applies the transformations defined by your feature columns, +creating the model's input layer. + +
+[Figure: A diagram of the input layer; in this case a 1:1 mapping from raw inputs to features.]
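+As an illustration of what `input_layer` produces, here is a minimal sketch
+(the feature values are hypothetical stand-ins for the Iris columns):
+
+```python
+# Two numeric features for a batch of two examples.
+features = {'SepalLength': tf.constant([[5.1], [6.4]]),
+            'PetalLength': tf.constant([[1.4], [4.5]])}
+columns = [tf.feature_column.numeric_column('SepalLength'),
+           tf.feature_column.numeric_column('PetalLength')]
+
+# `net` is a dense [batch_size, 2] float tensor; feature columns are
+# concatenated in name-sorted order (PetalLength before SepalLength).
+net = tf.feature_column.input_layer(features, columns)
+
+with tf.Session() as sess:
+    print(sess.run(net))  # ==> [[1.4 5.1]
+                          #      [4.5 6.4]]
+```
+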
+
+
+### Hidden Layers
+
+If you are creating a deep neural network, you must define one or more hidden
+layers. The Layers API provides a rich set of functions to define all types of
+hidden layers, including convolutional, pooling, and dropout layers. For Iris,
+we're simply going to call @{tf.layers.dense} to create hidden layers, with
+dimensions defined by `params['hidden_units']`. In a `dense` layer each node
+is connected to every node in the preceding layer. Here's the relevant code:
+
+``` python
+    # Build the hidden layers, sized according to the 'hidden_units' param.
+    for units in params['hidden_units']:
+        net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
+```
+
+* The `units` parameter defines the number of output neurons in a given layer.
+* The `activation` parameter defines the [activation function](https://developers.google.com/machine-learning/glossary/#activation_function) —
+  [Relu](https://developers.google.com/machine-learning/glossary/#ReLU) in this
+  case.
+
+The variable `net` here signifies the current top layer of the network. During
+the first iteration, `net` signifies the input layer. On each loop iteration
+`tf.layers.dense` creates a new layer, which takes the previous layer's output
+as its input, using the variable `net`.
+
+After creating two hidden layers, our network looks as follows. For
+simplicity, the figure does not show all the units in each layer.
+
+[Figure: The input layer with two hidden layers added.]
+ +Note that @{tf.layers.dense} provides many additional capabilities, including +the ability to set a multitude of regularization parameters. For the sake of +simplicity, though, we're going to simply accept the default values of the +other parameters. + +### Output Layer + +We'll define the output layer by calling @{tf.layers.dense} yet again, this +time without an activation function: + +```python + # Compute logits (1 per class). + logits = tf.layers.dense(net, params['n_classes'], activation=None) +``` + +Here, `net` signifies the final hidden layer. Therefore, the full set of layers +is now connected as follows: + +
+[Figure: A logits output layer connected to the top hidden layer. The final hidden layer feeds into the output layer.]
+ +When defining an output layer, the `units` parameter specifies the number of +outputs. So, by setting `units` to `params['n_classes']`, the model produces +one output value per class. Each element of the output vector will contain the +score, or "logit", calculated for the associated class of Iris: Setosa, +Versicolor, or Virginica, respectively. + +Later on, these logits will be transformed into probabilities by the +@{tf.nn.softmax} function. + +## Implement training, evaluation, and prediction {#modes} + +The final step in creating a model function is to write branching code that +implements prediction, evaluation, and training. + +The model function gets invoked whenever someone calls the Estimator's `train`, +`evaluate`, or `predict` methods. Recall that the signature for the model +function looks like this: + +``` python +def my_model_fn( + features, # This is batch_features from input_fn + labels, # This is batch_labels from input_fn + mode, # An instance of tf.estimator.ModeKeys, see below + params): # Additional configuration +``` + +Focus on that third argument, mode. As the following table shows, when someone +calls `train`, `evaluate`, or `predict`, the Estimator framework invokes your model +function with the mode parameter set as follows: + +| Estimator method | Estimator Mode | +|:---------------------------------|:------------------| +|@{tf.estimator.Estimator.train$`train()`} |@{tf.estimator.ModeKeys.TRAIN$`ModeKeys.TRAIN`} | +|@{tf.estimator.Estimator.evaluate$`evaluate()`} |@{tf.estimator.ModeKeys.EVAL$`ModeKeys.EVAL`} | +|@{tf.estimator.Estimator.predict$`predict()`}|@{tf.estimator.ModeKeys.PREDICT$`ModeKeys.PREDICT`} | + +For example, suppose you instantiate a custom Estimator to generate an object +named `classifier`. Then, you make the following call: + +``` python +classifier = tf.estimator.Estimator(...) +classifier.train(input_fn=lambda: my_input_fn(FILE_TRAIN, True, 500)) +``` +The Estimator framework then calls your model function with mode set to +`ModeKeys.TRAIN`. + +Your model function must provide code to handle all three of the mode values. +For each mode value, your code must return an instance of +`tf.estimator.EstimatorSpec`, which contains the information the caller +requires. Let's examine each mode. + +### Predict + +When the Estimator's `predict` method is called, the `model_fn` receives +`mode = ModeKeys.PREDICT`. In this case, the model function must return a +`tf.estimator.EstimatorSpec` containing the prediction. + +The model must have been trained prior to making a prediction. The trained model +is stored on disk in the `model_dir` directory established when you +instantiated the Estimator. + +The code to generate the prediction for this model looks as follows: + +```python +# Compute predictions. +predicted_classes = tf.argmax(logits, 1) +if mode == tf.estimator.ModeKeys.PREDICT: + predictions = { + 'class_ids': predicted_classes[:, tf.newaxis], + 'probabilities': tf.nn.softmax(logits), + 'logits': logits, + } + return tf.estimator.EstimatorSpec(mode, predictions=predictions) +``` +The prediction dictionary contains everything that your model returns when run +in prediction mode. + +
+[Figure: Additional outputs added to the output layer.]
+
+The `predictions` dictionary holds the following three key/value pairs:
+
+* `class_ids` holds the class id (0, 1, or 2) representing the model's
+  prediction of the most likely species for this example.
+* `probabilities` holds the three probabilities (in this example, 0.02, 0.95,
+  and 0.03).
+* `logits` holds the raw logit values (in this example, -1.3, 2.6, and -0.9).
+
+We return that dictionary to the caller via the `predictions` parameter of the
+@{tf.estimator.EstimatorSpec}. The Estimator's
+@{tf.estimator.Estimator.predict$`predict`} method will yield these
+dictionaries.
+
+### Calculate the loss
+
+For both [training](#train) and [evaluation](#evaluate) we need to calculate the
+model's loss. This is the
+[objective](https://developers.google.com/machine-learning/glossary/#objective)
+that will be optimized.
+
+We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}.
+The value returned by this function will be lowest, approximately 0, when the
+probability of the correct class (at index `label`) is near 1.0. The loss value
+returned is progressively larger as the probability of the correct class
+decreases.
+
+This function returns the average over the whole batch.
+
+```python
+# Compute loss.
+loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
+```
+
+### Evaluate
+
+When the Estimator's `evaluate` method is called, the `model_fn` receives
+`mode = ModeKeys.EVAL`. In this case, the model function must return a
+`tf.estimator.EstimatorSpec` containing the model's loss and optionally one
+or more metrics.
+
+Although returning metrics is optional, most custom Estimators do return at
+least one metric. TensorFlow provides a Metrics module @{tf.metrics} to
+calculate common metrics. For brevity's sake, we'll only return accuracy. The
+@{tf.metrics.accuracy} function compares our predictions against the
+true values, that is, against the labels provided by the input function. The
+@{tf.metrics.accuracy} function requires the labels and predictions to have the
+same shape. Here's the call to @{tf.metrics.accuracy}:
+
+``` python
+# Compute evaluation metrics.
+accuracy = tf.metrics.accuracy(labels=labels,
+                               predictions=predicted_classes,
+                               name='acc_op')
+```
+
+The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for evaluation
+typically contains the following information:
+
+* `loss`, which is the model's loss.
+* `eval_metric_ops`, which is an optional dictionary of metrics.
+
+So, we'll create a dictionary containing our sole metric. If we had calculated
+other metrics, we would have added them as additional key/value pairs to that
+same dictionary. Then, we'll pass that dictionary in the `eval_metric_ops`
+argument of `tf.estimator.EstimatorSpec`. Here's the code:
+
+```python
+metrics = {'accuracy': accuracy}
+tf.summary.scalar('accuracy', accuracy[1])
+
+if mode == tf.estimator.ModeKeys.EVAL:
+    return tf.estimator.EstimatorSpec(
+        mode, loss=loss, eval_metric_ops=metrics)
+```
+
+The @{tf.summary.scalar} will make accuracy available to TensorBoard
+in both `TRAIN` and `EVAL` modes. (More on this later).
+
+### Train
+
+When the Estimator's `train` method is called, the `model_fn` is called
+with `mode = ModeKeys.TRAIN`. In this case, the model function must return an
+`EstimatorSpec` that contains the loss and a training operation.
+
+Building the training operation will require an optimizer. We will use
+@{tf.train.AdagradOptimizer} because we're mimicking the `DNNClassifier`, which
+also uses `Adagrad` by default.
The `tf.train` package provides many other +optimizers—feel free to experiment with them. + +Here is the code that builds the optimizer: + +``` python +optimizer = tf.train.AdagradOptimizer(learning_rate=0.1) +``` + +Next, we build the training operation using the optimizer's +@{tf.train.Optimizer.minimize$`minimize`} method on the loss we calculated +earlier. + +The `minimize` method also takes a `global_step` parameter. TensorFlow uses this +parameter to count the number of training steps that have been processed +(to know when to end a training run). Furthermore, the `global_step` is +essential for TensorBoard graphs to work correctly. Simply call +@{tf.train.get_global_step} and pass the result to the `global_step` +argument of `minimize`. + +Here's the code to train the model: + +``` python +train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step()) +``` + +The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for training +must have the following fields set: + +* `loss`, which contains the value of the loss function. +* `train_op`, which executes a training step. + +Here's our code to call `EstimatorSpec`: + +```python +return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op) +``` + +The model function is now complete. + +## The custom Estimator + +Instantiate the custom Estimator through the Estimator base class as follows: + +```python + # Build 2 hidden layer DNN with 10, 10 units respectively. + classifier = tf.estimator.Estimator( + model_fn=my_model, + params={ + 'feature_columns': my_feature_columns, + # Two hidden layers of 10 nodes each. + 'hidden_units': [10, 10], + # The model must choose between 3 classes. + 'n_classes': 3, + }) +``` +Here the `params` dictionary serves the same purpose as the key-word +arguments of `DNNClassifier`; that is, the `params` dictionary lets you +configure your Estimator without modifying the code in the `model_fn`. + +The rest of the code to train, evaluate, and generate predictions using our +Estimator is the same as in the +@{$premade_estimators$Premade Estimators} chapter. For +example, the following line will train the model: + +```python +# Train the Model. +classifier.train( + input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size), + steps=args.train_steps) +``` + +## TensorBoard + +You can view training results for your custom Estimator in TensorBoard. To see +this reporting, start TensorBoard from your command line as follows: + +```bsh +# Replace PATH with the actual path passed as model_dir +tensorboard --logdir=PATH +``` + +Then, open TensorBoard by browsing to: [http://localhost:6006](http://localhost:6006) + +All the pre-made Estimators automatically log a lot of information to +TensorBoard. With custom Estimators, however, TensorBoard only provides one +default log (a graph of the loss) plus the information you explicitly tell +TensorBoard to log. For the custom Estimator you just created, TensorBoard +generates the following: + +
+[Figure: TensorBoard displays three scalar graphs: accuracy, loss, and steps/second.]
+
+
+
+In brief, here's what the three graphs tell you:
+
+* global_step/sec: A performance indicator showing how many batches (gradient
+  updates) we processed per second as the model trains.
+
+* loss: The loss reported.
+
+* accuracy: The accuracy is recorded by the following two lines:
+
+    * `eval_metric_ops={'my_accuracy': accuracy}`, during evaluation.
+    * `tf.summary.scalar('accuracy', accuracy[1])`, during training.
+
+These TensorBoard graphs are one of the main reasons it's important to pass a
+`global_step` to your optimizer's `minimize` method. The model can't record
+the x-coordinate for these graphs without it.
+
+Note the following in the `my_accuracy` and `loss` graphs:
+
+* The orange line represents training.
+* The blue dot represents evaluation.
+
+During training, summaries (the orange line) are recorded periodically as
+batches are processed, which is why the line spans the full x-axis range.
+
+By contrast, evaluation produces only a single point on the graph for each call
+to `evaluate`. This point contains the average over the entire evaluation call.
+This has no width on the graph because it is evaluated entirely from the model
+state at a particular training step (from a single checkpoint).
+
+As suggested in the following figure, you can selectively enable or disable
+reporting using the controls on the left side.
+[Figure: Check-boxes that let the user select which runs are shown; enable or disable reporting per run.]
+ + +## Summary + +Although pre-made Estimators can be an effective way to quickly create new +models, you will often need the additional flexibility that custom Estimators +provide. Fortunately, pre-made and custom Estimators follow the same +programming model. The only practical difference is that you must write a model +function for custom Estimators; everything else is the same. + +For more details, be sure to check out: + +* The + [official TensorFlow implementation of MNIST](https://github.com/tensorflow/models/tree/master/official/mnist), + which uses a custom estimator. +* The TensorFlow + [official models repository](https://github.com/tensorflow/models/tree/master/official), + which contains more curated examples using custom estimators. +* This [TensorBoard video](https://youtu.be/eBbEDRsCmv4), which introduces + TensorBoard. +* The @{$low_level_intro$Low Level Introduction}, which demonstrates + how to experiment directly with TensorFlow's low level APIs, making debugging + easier. diff --git a/tensorflow/docs_src/guide/datasets.md b/tensorflow/docs_src/guide/datasets.md new file mode 100644 index 0000000000..8b69860a68 --- /dev/null +++ b/tensorflow/docs_src/guide/datasets.md @@ -0,0 +1,823 @@ +# Importing Data + +The @{tf.data} API enables you to build complex input pipelines from +simple, reusable pieces. For example, the pipeline for an image model might +aggregate data from files in a distributed file system, apply random +perturbations to each image, and merge randomly selected images into a batch +for training. The pipeline for a text model might involve extracting symbols +from raw text data, converting them to embedding identifiers with a lookup +table, and batching together sequences of different lengths. The `tf.data` API +makes it easy to deal with large amounts of data, different data formats, and +complicated transformations. + +The `tf.data` API introduces two new abstractions to TensorFlow: + +* A `tf.data.Dataset` represents a sequence of elements, in which + each element contains one or more `Tensor` objects. For example, in an image + pipeline, an element might be a single training example, with a pair of + tensors representing the image data and a label. There are two distinct + ways to create a dataset: + + * Creating a **source** (e.g. `Dataset.from_tensor_slices()`) constructs a + dataset from + one or more `tf.Tensor` objects. + + * Applying a **transformation** (e.g. `Dataset.batch()`) constructs a dataset + from one or more `tf.data.Dataset` objects. + +* A `tf.data.Iterator` provides the main way to extract elements from a + dataset. The operation returned by `Iterator.get_next()` yields the next + element of a `Dataset` when executed, and typically acts as the interface + between input pipeline code and your model. The simplest iterator is a + "one-shot iterator", which is associated with a particular `Dataset` and + iterates through it once. For more sophisticated uses, the + `Iterator.initializer` operation enables you to reinitialize and parameterize + an iterator with different datasets, so that you can, for example, iterate + over training and validation data multiple times in the same program. + +## Basic mechanics + +This section of the guide describes the fundamentals of creating different kinds +of `Dataset` and `Iterator` objects, and how to extract data from them. + +To start an input pipeline, you must define a *source*. 
For example, to
+construct a `Dataset` from some tensors in memory, you can use
+`tf.data.Dataset.from_tensors()` or
+`tf.data.Dataset.from_tensor_slices()`. Alternatively, if your input
+data are on disk in the recommended TFRecord format, you can construct a
+`tf.data.TFRecordDataset`.
+
+Once you have a `Dataset` object, you can *transform* it into a new `Dataset` by
+chaining method calls on the `tf.data.Dataset` object. For example, you
+can apply per-element transformations such as `Dataset.map()` (to apply a
+function to each element), and multi-element transformations such as
+`Dataset.batch()`. See the documentation for @{tf.data.Dataset}
+for a complete list of transformations.
+
+The most common way to consume values from a `Dataset` is to make an
+**iterator** object that provides access to one element of the dataset at a time
+(for example, by calling `Dataset.make_one_shot_iterator()`). A
+`tf.data.Iterator` provides two operations: `Iterator.initializer`,
+which enables you to (re)initialize the iterator's state; and
+`Iterator.get_next()`, which returns `tf.Tensor` objects that correspond to the
+symbolic next element. Depending on your use case, you might choose a different
+type of iterator, and the options are outlined below.
+
+### Dataset structure
+
+A dataset comprises elements that each have the same structure. An element
+contains one or more `tf.Tensor` objects, called *components*. Each component
+has a `tf.DType` representing the type of elements in the tensor, and a
+`tf.TensorShape` representing the (possibly partially specified) static shape of
+each element. The `Dataset.output_types` and `Dataset.output_shapes` properties
+allow you to inspect the inferred types and shapes of each component of a
+dataset element. The *nested structure* of these properties maps to the structure
+of an element, which may be a single tensor, a tuple of tensors, or a nested
+tuple of tensors. For example:
+
+```python
+dataset1 = tf.data.Dataset.from_tensor_slices(tf.random_uniform([4, 10]))
+print(dataset1.output_types)  # ==> "tf.float32"
+print(dataset1.output_shapes)  # ==> "(10,)"
+
+dataset2 = tf.data.Dataset.from_tensor_slices(
+   (tf.random_uniform([4]),
+    tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)))
+print(dataset2.output_types)  # ==> "(tf.float32, tf.int32)"
+print(dataset2.output_shapes)  # ==> "((), (100,))"
+
+dataset3 = tf.data.Dataset.zip((dataset1, dataset2))
+print(dataset3.output_types)  # ==> (tf.float32, (tf.float32, tf.int32))
+print(dataset3.output_shapes)  # ==> "(10, ((), (100,)))"
+```
+
+It is often convenient to give names to each component of an element, for
+example if they represent different features of a training example. In addition
+to tuples, you can use `collections.namedtuple` or a dictionary mapping strings
+to tensors to represent a single element of a `Dataset`.
+
+```python
+dataset = tf.data.Dataset.from_tensor_slices(
+   {"a": tf.random_uniform([4]),
+    "b": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)})
+print(dataset.output_types)  # ==> "{'a': tf.float32, 'b': tf.int32}"
+print(dataset.output_shapes)  # ==> "{'a': (), 'b': (100,)}"
+```
+
+The `Dataset` transformations support datasets of any structure. When using the
+`Dataset.map()`, `Dataset.flat_map()`, and `Dataset.filter()` transformations,
+which apply a function to each element, the element structure determines the
+arguments of the function:
+
+```python
+dataset1 = dataset1.map(lambda x: ...)
+
+dataset2 = dataset2.flat_map(lambda x, y: ...)
+ +# Note: Argument destructuring is not available in Python 3. +dataset3 = dataset3.filter(lambda x, (y, z): ...) +``` + +### Creating an iterator + +Once you have built a `Dataset` to represent your input data, the next step is to +create an `Iterator` to access elements from that dataset. The `tf.data` API +currently supports the following iterators, in increasing level of +sophistication: + +* **one-shot**, +* **initializable**, +* **reinitializable**, and +* **feedable**. + +A **one-shot** iterator is the simplest form of iterator, which only supports +iterating once through a dataset, with no need for explicit initialization. +One-shot iterators handle almost all of the cases that the existing queue-based +input pipelines support, but they do not support parameterization. Using the +example of `Dataset.range()`: + +```python +dataset = tf.data.Dataset.range(100) +iterator = dataset.make_one_shot_iterator() +next_element = iterator.get_next() + +for i in range(100): + value = sess.run(next_element) + assert i == value +``` + +Note: Currently, one-shot iterators are the only type that is easily usable +with an `Estimator`. + +An **initializable** iterator requires you to run an explicit +`iterator.initializer` operation before using it. In exchange for this +inconvenience, it enables you to *parameterize* the definition of the dataset, +using one or more `tf.placeholder()` tensors that can be fed when you +initialize the iterator. Continuing the `Dataset.range()` example: + +```python +max_value = tf.placeholder(tf.int64, shape=[]) +dataset = tf.data.Dataset.range(max_value) +iterator = dataset.make_initializable_iterator() +next_element = iterator.get_next() + +# Initialize an iterator over a dataset with 10 elements. +sess.run(iterator.initializer, feed_dict={max_value: 10}) +for i in range(10): + value = sess.run(next_element) + assert i == value + +# Initialize the same iterator over a dataset with 100 elements. +sess.run(iterator.initializer, feed_dict={max_value: 100}) +for i in range(100): + value = sess.run(next_element) + assert i == value +``` + +A **reinitializable** iterator can be initialized from multiple different +`Dataset` objects. For example, you might have a training input pipeline that +uses random perturbations to the input images to improve generalization, and +a validation input pipeline that evaluates predictions on unmodified data. These +pipelines will typically use different `Dataset` objects that have the same +structure (i.e. the same types and compatible shapes for each component). + +```python +# Define training and validation datasets with the same structure. +training_dataset = tf.data.Dataset.range(100).map( + lambda x: x + tf.random_uniform([], -10, 10, tf.int64)) +validation_dataset = tf.data.Dataset.range(50) + +# A reinitializable iterator is defined by its structure. We could use the +# `output_types` and `output_shapes` properties of either `training_dataset` +# or `validation_dataset` here, because they are compatible. +iterator = tf.data.Iterator.from_structure(training_dataset.output_types, + training_dataset.output_shapes) +next_element = iterator.get_next() + +training_init_op = iterator.make_initializer(training_dataset) +validation_init_op = iterator.make_initializer(validation_dataset) + +# Run 20 epochs in which the training dataset is traversed, followed by the +# validation dataset. +for _ in range(20): + # Initialize an iterator over the training dataset. 
+ sess.run(training_init_op) + for _ in range(100): + sess.run(next_element) + + # Initialize an iterator over the validation dataset. + sess.run(validation_init_op) + for _ in range(50): + sess.run(next_element) +``` + +A **feedable** iterator can be used together with @{tf.placeholder} to select +what `Iterator` to use in each call to @{tf.Session.run}, via the familiar +`feed_dict` mechanism. It offers the same functionality as a reinitializable +iterator, but it does not require you to initialize the iterator from the start +of a dataset when you switch between iterators. For example, using the same +training and validation example from above, you can use +@{tf.data.Iterator.from_string_handle} to define a feedable iterator +that allows you to switch between the two datasets: + +```python +# Define training and validation datasets with the same structure. +training_dataset = tf.data.Dataset.range(100).map( + lambda x: x + tf.random_uniform([], -10, 10, tf.int64)).repeat() +validation_dataset = tf.data.Dataset.range(50) + +# A feedable iterator is defined by a handle placeholder and its structure. We +# could use the `output_types` and `output_shapes` properties of either +# `training_dataset` or `validation_dataset` here, because they have +# identical structure. +handle = tf.placeholder(tf.string, shape=[]) +iterator = tf.data.Iterator.from_string_handle( + handle, training_dataset.output_types, training_dataset.output_shapes) +next_element = iterator.get_next() + +# You can use feedable iterators with a variety of different kinds of iterator +# (such as one-shot and initializable iterators). +training_iterator = training_dataset.make_one_shot_iterator() +validation_iterator = validation_dataset.make_initializable_iterator() + +# The `Iterator.string_handle()` method returns a tensor that can be evaluated +# and used to feed the `handle` placeholder. +training_handle = sess.run(training_iterator.string_handle()) +validation_handle = sess.run(validation_iterator.string_handle()) + +# Loop forever, alternating between training and validation. +while True: + # Run 200 steps using the training dataset. Note that the training dataset is + # infinite, and we resume from where we left off in the previous `while` loop + # iteration. + for _ in range(200): + sess.run(next_element, feed_dict={handle: training_handle}) + + # Run one pass over the validation dataset. + sess.run(validation_iterator.initializer) + for _ in range(50): + sess.run(next_element, feed_dict={handle: validation_handle}) +``` + +### Consuming values from an iterator + +The `Iterator.get_next()` method returns one or more `tf.Tensor` objects that +correspond to the symbolic next element of an iterator. Each time these tensors +are evaluated, they take the value of the next element in the underlying +dataset. (Note that, like other stateful objects in TensorFlow, calling +`Iterator.get_next()` does not immediately advance the iterator. Instead you +must use the returned `tf.Tensor` objects in a TensorFlow expression, and pass +the result of that expression to `tf.Session.run()` to get the next elements and +advance the iterator.) + +If the iterator reaches the end of the dataset, executing +the `Iterator.get_next()` operation will raise a `tf.errors.OutOfRangeError`. +After this point the iterator will be in an unusable state, and you must +initialize it again if you want to use it further. 
+
+```python
+dataset = tf.data.Dataset.range(5)
+iterator = dataset.make_initializable_iterator()
+next_element = iterator.get_next()
+
+# Typically `result` will be the output of a model, or an optimizer's
+# training operation.
+result = tf.add(next_element, next_element)
+
+sess.run(iterator.initializer)
+print(sess.run(result))  # ==> "0"
+print(sess.run(result))  # ==> "2"
+print(sess.run(result))  # ==> "4"
+print(sess.run(result))  # ==> "6"
+print(sess.run(result))  # ==> "8"
+try:
+  sess.run(result)
+except tf.errors.OutOfRangeError:
+  print("End of dataset")  # ==> "End of dataset"
+```
+
+A common pattern is to wrap the "training loop" in a `try`-`except` block:
+
+```python
+sess.run(iterator.initializer)
+while True:
+  try:
+    sess.run(result)
+  except tf.errors.OutOfRangeError:
+    break
+```
+
+If each element of the dataset has a nested structure, the return value of
+`Iterator.get_next()` will be one or more `tf.Tensor` objects in the same
+nested structure:
+
+```python
+dataset1 = tf.data.Dataset.from_tensor_slices(tf.random_uniform([4, 10]))
+dataset2 = tf.data.Dataset.from_tensor_slices((tf.random_uniform([4]), tf.random_uniform([4, 100])))
+dataset3 = tf.data.Dataset.zip((dataset1, dataset2))
+
+iterator = dataset3.make_initializable_iterator()
+
+sess.run(iterator.initializer)
+next1, (next2, next3) = iterator.get_next()
+```
+
+Note that `next1`, `next2`, and `next3` are tensors produced by the
+same op/node (created by `Iterator.get_next()`). Therefore, evaluating *any* of
+these tensors will advance the iterator for all components. A typical consumer
+of an iterator will include all components in a single expression.
+
+### Saving iterator state
+
+The @{tf.contrib.data.make_saveable_from_iterator} function creates a
+`SaveableObject` from an iterator, which can be used to save and
+restore the current state of the iterator (and, effectively, the whole input
+pipeline). A saveable object thus created can be added to the @{tf.train.Saver}
+variables list or the `tf.GraphKeys.SAVEABLE_OBJECTS` collection for saving and
+restoring in the same manner as a @{tf.Variable}. Refer to
+@{$saved_model$Saving and Restoring} for details on how to save and restore
+variables.
+
+```python
+# Create saveable object from iterator.
+saveable = tf.contrib.data.make_saveable_from_iterator(iterator)
+
+# Save the iterator state by adding it to the saveable objects collection.
+tf.add_to_collection(tf.GraphKeys.SAVEABLE_OBJECTS, saveable)
+saver = tf.train.Saver()
+
+with tf.Session() as sess:
+
+  if should_checkpoint:
+    saver.save(sess, path_to_checkpoint)
+
+# Restore the iterator state.
+with tf.Session() as sess:
+  saver.restore(sess, path_to_checkpoint)
+```
+
+## Reading input data
+
+### Consuming NumPy arrays
+
+If all of your input data fit in memory, the simplest way to create a `Dataset`
+from them is to convert them to `tf.Tensor` objects and use
+`Dataset.from_tensor_slices()`.
+
+```python
+# Load the training data into two NumPy arrays, for example using `np.load()`.
+with np.load("/var/data/training_data.npy") as data:
+  features = data["features"]
+  labels = data["labels"]
+
+# Assume that each row of `features` corresponds to the same row as `labels`.
+assert features.shape[0] == labels.shape[0]
+
+dataset = tf.data.Dataset.from_tensor_slices((features, labels))
+```
+
+Note that the above code snippet will embed the `features` and `labels` arrays
+in your TensorFlow graph as `tf.constant()` operations.
This works well for a +small dataset, but wastes memory---because the contents of the array will be +copied multiple times---and can run into the 2GB limit for the `tf.GraphDef` +protocol buffer. + +As an alternative, you can define the `Dataset` in terms of `tf.placeholder()` +tensors, and *feed* the NumPy arrays when you initialize an `Iterator` over the +dataset. + +```python +# Load the training data into two NumPy arrays, for example using `np.load()`. +with np.load("/var/data/training_data.npy") as data: + features = data["features"] + labels = data["labels"] + +# Assume that each row of `features` corresponds to the same row as `labels`. +assert features.shape[0] == labels.shape[0] + +features_placeholder = tf.placeholder(features.dtype, features.shape) +labels_placeholder = tf.placeholder(labels.dtype, labels.shape) + +dataset = tf.data.Dataset.from_tensor_slices((features_placeholder, labels_placeholder)) +# [Other transformations on `dataset`...] +dataset = ... +iterator = dataset.make_initializable_iterator() + +sess.run(iterator.initializer, feed_dict={features_placeholder: features, + labels_placeholder: labels}) +``` + +### Consuming TFRecord data + +The `tf.data` API supports a variety of file formats so that you can process +large datasets that do not fit in memory. For example, the TFRecord file format +is a simple record-oriented binary format that many TensorFlow applications use +for training data. The `tf.data.TFRecordDataset` class enables you to +stream over the contents of one or more TFRecord files as part of an input +pipeline. + +```python +# Creates a dataset that reads all of the examples from two files. +filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"] +dataset = tf.data.TFRecordDataset(filenames) +``` + +The `filenames` argument to the `TFRecordDataset` initializer can either be a +string, a list of strings, or a `tf.Tensor` of strings. Therefore if you have +two sets of files for training and validation purposes, you can use a +`tf.placeholder(tf.string)` to represent the filenames, and initialize an +iterator from the appropriate filenames: + +```python +filenames = tf.placeholder(tf.string, shape=[None]) +dataset = tf.data.TFRecordDataset(filenames) +dataset = dataset.map(...) # Parse the record into tensors. +dataset = dataset.repeat() # Repeat the input indefinitely. +dataset = dataset.batch(32) +iterator = dataset.make_initializable_iterator() + +# You can feed the initializer with the appropriate filenames for the current +# phase of execution, e.g. training vs. validation. + +# Initialize `iterator` with training data. +training_filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"] +sess.run(iterator.initializer, feed_dict={filenames: training_filenames}) + +# Initialize `iterator` with validation data. +validation_filenames = ["/var/data/validation1.tfrecord", ...] +sess.run(iterator.initializer, feed_dict={filenames: validation_filenames}) +``` + +### Consuming text data + +Many datasets are distributed as one or more text files. The +`tf.data.TextLineDataset` provides an easy way to extract lines from +one or more text files. Given one or more filenames, a `TextLineDataset` will +produce one string-valued element per line of those files. Like a +`TFRecordDataset`, `TextLineDataset` accepts `filenames` as a `tf.Tensor`, so +you can parameterize it by passing a `tf.placeholder(tf.string)`. 
+
+```python
+filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
+dataset = tf.data.TextLineDataset(filenames)
+```
+
+By default, a `TextLineDataset` yields *every* line of each file, which may
+not be desirable, for example if the file starts with a header line, or contains
+comments. These lines can be removed using the `Dataset.skip()` and
+`Dataset.filter()` transformations. To apply these transformations to each
+file separately, we use `Dataset.flat_map()` to create a nested `Dataset` for
+each file.
+
+```python
+filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
+
+dataset = tf.data.Dataset.from_tensor_slices(filenames)
+
+# Use `Dataset.flat_map()` to transform each file as a separate nested dataset,
+# and then concatenate their contents sequentially into a single "flat" dataset.
+# * Skip the first line (header row).
+# * Filter out lines beginning with "#" (comments).
+dataset = dataset.flat_map(
+    lambda filename: (
+        tf.data.TextLineDataset(filename)
+        .skip(1)
+        .filter(lambda line: tf.not_equal(tf.substr(line, 0, 1), "#"))))
+```
+
+### Consuming CSV data
+
+The CSV file format is a popular format for storing tabular data in plain text.
+The @{tf.contrib.data.CsvDataset} class provides a way to extract records from
+one or more CSV files that comply with [RFC 4180](https://tools.ietf.org/html/rfc4180).
+Given one or more filenames and a list of defaults, a `CsvDataset` will produce
+a tuple of elements whose types correspond to the types of the defaults
+provided, per CSV record. Like `TFRecordDataset` and `TextLineDataset`,
+`CsvDataset` accepts `filenames` as a `tf.Tensor`, so you can parameterize it
+by passing a `tf.placeholder(tf.string)`.
+
+```
+# Creates a dataset that reads all of the records from two CSV files, each with
+# eight float columns
+filenames = ["/var/data/file1.csv", "/var/data/file2.csv"]
+record_defaults = [tf.float32] * 8   # Eight required float columns
+dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)
+```
+
+If some columns are empty, you can provide defaults instead of types.
+
+```
+# Creates a dataset that reads all of the records from two CSV files, each with
+# eight float columns, which may have missing values
+record_defaults = [[0.0]] * 8
+dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)
+```
+
+By default, a `CsvDataset` yields *every* column of *every* line of the file,
+which may not be desirable, for example if the file starts with a header line
+that should be ignored, or if some columns are not required in the input.
+These lines and fields can be removed with the `header` and `select_cols`
+arguments respectively.
+
+```
+# Creates a dataset that reads all of the records from two CSV files with
+# headers, extracting float data from columns 2 and 4.
+record_defaults = [[0.0]] * 2  # Only provide defaults for the selected columns
+dataset = tf.contrib.data.CsvDataset(filenames, record_defaults, header=True, select_cols=[2, 4])
+```
+
+
+## Preprocessing data with `Dataset.map()`
+
+The `Dataset.map(f)` transformation produces a new dataset by applying a given
+function `f` to each element of the input dataset. It is based on
+the
+[`map()` function](https://en.wikipedia.org/wiki/Map_(higher-order_function))
+that is commonly applied to lists (and other structures) in functional
+programming languages. The function `f` takes the `tf.Tensor` objects that
+represent a single element in the input, and returns the `tf.Tensor` objects
+that will represent a single element in the new dataset.
Its implementation uses
+standard TensorFlow operations to transform one element into another.
+
+This section covers common examples of how to use `Dataset.map()`.
+
+### Parsing `tf.Example` protocol buffer messages
+
+Many input pipelines extract `tf.train.Example` protocol buffer messages from a
+TFRecord-format file (written, for example, using
+`tf.python_io.TFRecordWriter`). Each `tf.train.Example` record contains one or
+more "features", and the input pipeline typically converts these features into
+tensors.
+
+```python
+# Transforms a scalar string `example_proto` into a pair of a scalar string and
+# a scalar integer, representing an image and its label, respectively.
+def _parse_function(example_proto):
+  features = {"image": tf.FixedLenFeature((), tf.string, default_value=""),
+              "label": tf.FixedLenFeature((), tf.int64, default_value=0)}
+  parsed_features = tf.parse_single_example(example_proto, features)
+  return parsed_features["image"], parsed_features["label"]
+
+# Creates a dataset that reads all of the examples from two files, and extracts
+# the image and label features.
+filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+dataset = tf.data.TFRecordDataset(filenames)
+dataset = dataset.map(_parse_function)
+```
+
+### Decoding image data and resizing it
+
+When training a neural network on real-world image data, it is often necessary
+to convert images of different sizes to a common size, so that they may be
+batched into a fixed size.
+
+```python
+# Reads an image from a file, decodes it into a dense tensor, and resizes it
+# to a fixed shape.
+def _parse_function(filename, label):
+  image_string = tf.read_file(filename)
+  image_decoded = tf.image.decode_jpeg(image_string)
+  image_resized = tf.image.resize_images(image_decoded, [28, 28])
+  return image_resized, label
+
+# A vector of filenames.
+filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])
+
+# `labels[i]` is the label for the image in `filenames[i]`.
+labels = tf.constant([0, 37, ...])
+
+dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
+dataset = dataset.map(_parse_function)
+```
+
+### Applying arbitrary Python logic with `tf.py_func()`
+
+For performance reasons, we encourage you to use TensorFlow operations for
+preprocessing your data whenever possible. However, it is sometimes useful to
+call upon external Python libraries when parsing your input data. To do so,
+invoke the `tf.py_func()` operation in a `Dataset.map()` transformation.
+
+```python
+import cv2
+
+# Use a custom OpenCV function to read the image, instead of the standard
+# TensorFlow `tf.read_file()` operation.
+def _read_py_function(filename, label):
+  image_decoded = cv2.imread(filename.decode(), cv2.IMREAD_GRAYSCALE)
+  return image_decoded, label
+
+# Use standard TensorFlow operations to resize the image to a fixed shape.
+def _resize_function(image_decoded, label):
+  image_decoded.set_shape([None, None, None])
+  image_resized = tf.image.resize_images(image_decoded, [28, 28])
+  return image_resized, label
+
+filenames = ["/var/data/image1.jpg", "/var/data/image2.jpg", ...]
+labels = [0, 37, 29, 1, ...]
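+
+# Note: `tf.py_func()` executes `_read_py_function` as ordinary Python code, so
+# its output types (`[tf.uint8, label.dtype]` below) must be declared
+# explicitly; TensorFlow cannot infer them from the Python function.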
+
+dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
+dataset = dataset.map(
+    lambda filename, label: tuple(tf.py_func(
+        _read_py_function, [filename, label], [tf.uint8, label.dtype])))
+dataset = dataset.map(_resize_function)
+```
+
+
+
+## Batching dataset elements
+
+### Simple batching
+
+The simplest form of batching stacks `n` consecutive elements of a dataset into
+a single element. The `Dataset.batch()` transformation does exactly this, with
+the same constraints as the `tf.stack()` operator, applied to each component
+of the elements: i.e. for each component *i*, all elements must have a tensor
+of the exact same shape.
+
+```python
+inc_dataset = tf.data.Dataset.range(100)
+dec_dataset = tf.data.Dataset.range(0, -100, -1)
+dataset = tf.data.Dataset.zip((inc_dataset, dec_dataset))
+batched_dataset = dataset.batch(4)
+
+iterator = batched_dataset.make_one_shot_iterator()
+next_element = iterator.get_next()
+
+print(sess.run(next_element))  # ==> ([0, 1, 2, 3],   [ 0, -1,  -2,  -3])
+print(sess.run(next_element))  # ==> ([4, 5, 6, 7],   [-4, -5,  -6,  -7])
+print(sess.run(next_element))  # ==> ([8, 9, 10, 11], [-8, -9, -10, -11])
+```
+
+### Batching tensors with padding
+
+The above recipe works for tensors that all have the same size. However, many
+models (e.g. sequence models) work with input data that can have varying size
+(e.g. sequences of different lengths). To handle this case, the
+`Dataset.padded_batch()` transformation enables you to batch tensors of
+different shape by specifying one or more dimensions in which they may be
+padded.
+
+```python
+dataset = tf.data.Dataset.range(100)
+dataset = dataset.map(lambda x: tf.fill([tf.cast(x, tf.int32)], x))
+dataset = dataset.padded_batch(4, padded_shapes=[None])
+
+iterator = dataset.make_one_shot_iterator()
+next_element = iterator.get_next()
+
+print(sess.run(next_element))  # ==> [[0, 0, 0], [1, 0, 0], [2, 2, 0], [3, 3, 3]]
+print(sess.run(next_element))  # ==> [[4, 4, 4, 4, 0, 0, 0],
+                               #      [5, 5, 5, 5, 5, 0, 0],
+                               #      [6, 6, 6, 6, 6, 6, 0],
+                               #      [7, 7, 7, 7, 7, 7, 7]]
+```
+
+The `Dataset.padded_batch()` transformation allows you to set different padding
+for each dimension of each component, and the padding for a dimension may be
+variable length (signified by `None` in the example above) or constant length.
+It is also possible to override the padding value, which defaults to 0.
+
+
+
+## Training workflows
+
+### Processing multiple epochs
+
+The `tf.data` API offers two main ways to process multiple epochs of the same
+data.
+
+The simplest way to iterate over a dataset in multiple epochs is to use the
+`Dataset.repeat()` transformation. For example, to create a dataset that repeats
+its input for 10 epochs:
+
+```python
+filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+dataset = tf.data.TFRecordDataset(filenames)
+dataset = dataset.map(...)
+dataset = dataset.repeat(10)
+dataset = dataset.batch(32)
+```
+
+Applying the `Dataset.repeat()` transformation with no arguments will repeat
+the input indefinitely. The `Dataset.repeat()` transformation concatenates the
+successive repetitions of its input, without signaling the end of one epoch and
+the beginning of the next epoch.
+
+If you want to receive a signal at the end of each epoch, you can write a
+training loop that catches the `tf.errors.OutOfRangeError` at the end of a
+dataset. At that point you might collect some statistics (e.g. the validation
+error) for the epoch.
+
+```python
+filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+dataset = tf.data.TFRecordDataset(filenames)
+dataset = dataset.map(...)
+dataset = dataset.batch(32)
+iterator = dataset.make_initializable_iterator()
+next_element = iterator.get_next()
+
+# Compute for 100 epochs.
+for _ in range(100):
+  sess.run(iterator.initializer)
+  while True:
+    try:
+      sess.run(next_element)
+    except tf.errors.OutOfRangeError:
+      break
+
+  # [Perform end-of-epoch calculations here.]
+```
+
+### Randomly shuffling input data
+
+The `Dataset.shuffle()` transformation randomly shuffles the input dataset
+using a similar algorithm to `tf.RandomShuffleQueue`: it maintains a fixed-size
+buffer and chooses the next element uniformly at random from that buffer.
+
+```python
+filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+dataset = tf.data.TFRecordDataset(filenames)
+dataset = dataset.map(...)
+dataset = dataset.shuffle(buffer_size=10000)
+dataset = dataset.batch(32)
+dataset = dataset.repeat()
+```
+
+### Using high-level APIs
+
+The @{tf.train.MonitoredTrainingSession} API simplifies many aspects of running
+TensorFlow in a distributed setting. `MonitoredTrainingSession` uses the
+@{tf.errors.OutOfRangeError} to signal that training has completed, so to use it
+with the `tf.data` API, we recommend using
+`Dataset.make_one_shot_iterator()`. For example:
+
+```python
+filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+dataset = tf.data.TFRecordDataset(filenames)
+dataset = dataset.map(...)
+dataset = dataset.shuffle(buffer_size=10000)
+dataset = dataset.batch(32)
+dataset = dataset.repeat(num_epochs)
+iterator = dataset.make_one_shot_iterator()
+
+next_example, next_label = iterator.get_next()
+loss = model_function(next_example, next_label)
+
+training_op = tf.train.AdagradOptimizer(...).minimize(loss)
+
+with tf.train.MonitoredTrainingSession(...) as sess:
+  while not sess.should_stop():
+    sess.run(training_op)
+```
+
+To use a `Dataset` in the `input_fn` of a @{tf.estimator.Estimator}, we also
+recommend using `Dataset.make_one_shot_iterator()`. For example:
+
+```python
+def dataset_input_fn():
+  filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
+  dataset = tf.data.TFRecordDataset(filenames)
+
+  # Use `tf.parse_single_example()` to extract data from a `tf.Example`
+  # protocol buffer, and perform any additional per-record preprocessing.
+  def parser(record):
+    keys_to_features = {
+        "image_data": tf.FixedLenFeature((), tf.string, default_value=""),
+        "date_time": tf.FixedLenFeature((), tf.int64, default_value=0),
+        "label": tf.FixedLenFeature((), tf.int64,
+                                    default_value=tf.zeros([], dtype=tf.int64)),
+    }
+    parsed = tf.parse_single_example(record, keys_to_features)
+
+    # Perform additional preprocessing on the parsed data.
+    image = tf.image.decode_jpeg(parsed["image_data"])
+    image = tf.reshape(image, [299, 299, 1])
+    label = tf.cast(parsed["label"], tf.int32)
+
+    return {"image_data": image, "date_time": parsed["date_time"]}, label
+
+  # Use `Dataset.map()` to build a pair of a feature dictionary and a label
+  # tensor for each example.
+  dataset = dataset.map(parser)
+  dataset = dataset.shuffle(buffer_size=10000)
+  dataset = dataset.batch(32)
+  dataset = dataset.repeat(num_epochs)
+  iterator = dataset.make_one_shot_iterator()
+
+  # `features` is a dictionary in which each value is a batch of values for
+  # that feature; `labels` is a batch of labels.
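+  # An `Estimator` calls `input_fn` with no arguments and expects it to
+  # return exactly such a `(features, labels)` pair.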
+ features, labels = iterator.get_next() + return features, labels +``` diff --git a/tensorflow/docs_src/guide/datasets_for_estimators.md b/tensorflow/docs_src/guide/datasets_for_estimators.md new file mode 100644 index 0000000000..b04af78cd8 --- /dev/null +++ b/tensorflow/docs_src/guide/datasets_for_estimators.md @@ -0,0 +1,387 @@ +# Datasets for Estimators + +The @{tf.data} module contains a collection of classes that allows you to +easily load data, manipulate it, and pipe it into your model. This document +introduces the API by walking through two simple examples: + +* Reading in-memory data from numpy arrays. +* Reading lines from a csv file. + + + +## Basic input + +Taking slices from an array is the simplest way to get started with `tf.data`. + +The @{$premade_estimators$Premade Estimators} chapter describes +the following `train_input_fn`, from +[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py), +to pipe the data into the Estimator: + +``` python +def train_input_fn(features, labels, batch_size): + """An input function for training""" + # Convert the inputs to a Dataset. + dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) + + # Shuffle, repeat, and batch the examples. + dataset = dataset.shuffle(1000).repeat().batch(batch_size) + + # Return the dataset. + return dataset +``` + +Let's look at this more closely. + +### Arguments + +This function expects three arguments. Arguments expecting an "array" can +accept nearly anything that can be converted to an array with `numpy.array`. +One exception is +[`tuple`](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) +which, as we will see, has special meaning for `Datasets`. + +* `features`: A `{'feature_name':array}` dictionary (or + [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)) + containing the raw input features. +* `labels` : An array containing the + [label](https://developers.google.com/machine-learning/glossary/#label) + for each example. +* `batch_size` : An integer indicating the desired batch size. + +In [`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) +we retrieved the Iris data using the `iris_data.load_data()` function. +You can run it, and unpack the results as follows: + +``` python +import iris_data + +# Fetch the data +train, test = iris_data.load_data() +features, labels = train +``` + +Then we passed this data to the input function, with a line similar to this: + +``` python +batch_size=100 +iris_data.train_input_fn(features, labels, batch_size) +``` + +Let's walk through the `train_input_fn()`. + +### Slices + +The function starts by using the @{tf.data.Dataset.from_tensor_slices} function +to create a @{tf.data.Dataset} representing slices of the array. The array is +sliced across the first dimension. For example, an array containing the +@{$tutorials/layers$mnist training data} has a shape of `(60000, 28, 28)`. +Passing this to `from_tensor_slices` returns a `Dataset` object containing +60000 slices, each one a 28x28 image. + +The code that returns this `Dataset` is as follows: + +``` python +train, test = tf.keras.datasets.mnist.load_data() +mnist_x, mnist_y = train + +mnist_ds = tf.data.Dataset.from_tensor_slices(mnist_x) +print(mnist_ds) +``` + +This will print the following line, showing the +@{$guide/tensors#shapes$shapes} and +@{$guide/tensors#data_types$types} of the items in +the dataset. 
Note that a `Dataset` does not know how many items it contains.
+
+``` None
+<TensorSliceDataset shapes: (28,28), types: tf.uint8>
+```
+
+The `Dataset` above represents a simple collection of arrays, but datasets are
+much more powerful than this. A `Dataset` can transparently handle any nested
+combination of dictionaries or tuples (or
+[`namedtuple`](https://docs.python.org/2/library/collections.html#collections.namedtuple)
+).
+
+For example, after converting the iris `features`
+to a standard python dictionary, you can then convert the dictionary of arrays
+to a `Dataset` of dictionaries as follows:
+
+``` python
+dataset = tf.data.Dataset.from_tensor_slices(dict(features))
+print(dataset)
+```
+``` None
+<TensorSliceDataset
+    shapes: {SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()},
+    types: {SepalLength: tf.float64, SepalWidth: tf.float64, PetalLength: tf.float64, PetalWidth: tf.float64}>
+```
+
+Here we see that when a `Dataset` contains structured elements, the `shapes`
+and `types` of the `Dataset` take on the same structure. This dataset contains
+dictionaries of @{$guide/tensors#rank$scalars}, all of type
+`tf.float64`.
+
+The first line of the iris `train_input_fn` uses the same functionality, but
+adds another level of structure. It creates a dataset containing
+`(features_dict, label)` pairs.
+
+The following code shows that the label is a scalar with type `int64`:
+
+``` python
+# Convert the inputs to a Dataset.
+dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
+print(dataset)
+```
+```
+<TensorSliceDataset
+    shapes: ({SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()}, ()),
+    types: ({SepalLength: tf.float64, SepalWidth: tf.float64, PetalLength: tf.float64, PetalWidth: tf.float64}, tf.int64)>
+```
+
+### Manipulation
+
+Currently the `Dataset` would iterate over the data once, in a fixed order, and
+only produce a single element at a time. It needs further processing before it
+can be used for training. Fortunately, the `tf.data.Dataset` class provides
+methods to better prepare the data for training. The next line of the input
+function takes advantage of several of these methods:
+
+``` python
+# Shuffle, repeat, and batch the examples.
+dataset = dataset.shuffle(1000).repeat().batch(batch_size)
+```
+
+The @{tf.data.Dataset.shuffle$`shuffle`} method uses a fixed-size buffer to
+shuffle the items as they pass through. In this case the `buffer_size` is
+greater than the number of examples in the `Dataset`, ensuring that the data is
+completely shuffled (the Iris data set only contains 150 examples).
+
+The @{tf.data.Dataset.repeat$`repeat`} method restarts the `Dataset` when
+it reaches the end. To limit the number of epochs, set the `count` argument.
+
+The @{tf.data.Dataset.batch$`batch`} method collects a number of examples and
+stacks them, to create batches. This adds a dimension to their shape. The new
+dimension is added as the first dimension. The following code uses
+the `batch` method on the MNIST `Dataset`, from earlier. This results in a
+`Dataset` containing 3D arrays representing stacks of `(28,28)` images:
+
+``` python
+print(mnist_ds.batch(100))
+```
+
+``` none
+<BatchDataset shapes: (?, 28, 28), types: tf.uint8>
+```
+Note that the dataset has an unknown batch size because the last batch will
+have fewer elements.
+
+In `train_input_fn`, after batching the `Dataset` contains 1D vectors of
+elements where previously each element was a scalar:
+
+```python
+print(dataset)
+```
+```
+<BatchDataset
+    shapes: ({SepalLength: (?,), SepalWidth: (?,), PetalLength: (?,), PetalWidth: (?,)}, (?,)),
+    types: ({SepalLength: tf.float64, SepalWidth: tf.float64, PetalLength: tf.float64, PetalWidth: tf.float64}, tf.int64)>
+```
+
+
+### Return
+
+At this point the `Dataset` contains `(features_dict, labels)` pairs.
+This is the format expected by the `train` and `evaluate` methods, so the
+`input_fn` returns the dataset.
+
+The `labels` should be omitted when using the `predict` method.
+
+
+
+
+## Reading a CSV File
+
+The most common real-world use case for the `Dataset` class is to stream data
+from files on disk. The @{tf.data} module includes a variety of
+file readers. Let's see how parsing the Iris dataset from the csv file looks
+using a `Dataset`.
+
+The following call to the `iris_data.maybe_download` function downloads the
+data if necessary, and returns the pathnames of the resulting files:
+
+``` python
+import iris_data
+train_path, test_path = iris_data.maybe_download()
+```
+
+The [`iris_data.csv_input_fn`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
+function contains an alternative implementation that parses the csv files using
+a `Dataset`.
+
+Let's look at how to build an Estimator-compatible input function that reads
+from the local files.
+
+### Build the `Dataset`
+
+We start by building a @{tf.data.TextLineDataset$`TextLineDataset`} object to
+read the file one line at a time. Then, we call the
+@{tf.data.Dataset.skip$`skip`} method to skip over the first line of the file, which contains a header, not an example:
+
+``` python
+ds = tf.data.TextLineDataset(train_path).skip(1)
+```
+
+### Build a csv line parser
+
+We must parse each of the lines in the dataset in order to generate the
+necessary `(features, label)` pairs. The following `_parse_line` function, from
+[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py),
+uses @{tf.decode_csv} and some simple Python code to parse a single line into
+its features and the label. Since Estimators require that features be
+represented as a dictionary, we rely on Python's built-in `dict` and `zip`
+functions to build that dictionary. The feature names are the keys of that
+dictionary. We then call the dictionary's `pop` method to remove the label
+field from the features dictionary:
+
+``` python
+# Metadata describing the text columns
+COLUMNS = ['SepalLength', 'SepalWidth',
+           'PetalLength', 'PetalWidth',
+           'label']
+FIELD_DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0]]
+def _parse_line(line):
+    # Decode the line into its fields
+    fields = tf.decode_csv(line, FIELD_DEFAULTS)
+
+    # Pack the result into a dictionary
+    features = dict(zip(COLUMNS, fields))
+
+    # Separate the label from the features
+    label = features.pop('label')
+
+    return features, label
+```
+
+### Parse the lines
+
+Datasets have many methods for manipulating the data while it is being piped
+to a model. The most heavily-used method is @{tf.data.Dataset.map$`map`}, which
+applies a transformation to each element of the `Dataset`.
+
+The `map` method takes a `map_func` argument that describes how each item in the
+`Dataset` should be transformed.
+
+*The @{tf.data.Dataset.map$`map`} method applies the `map_func` to transform each item in the Dataset.*
+
+So to parse the lines as they are streamed out of the csv file, we pass our
+`_parse_line` function to the `map` method:
+
+``` python
+ds = ds.map(_parse_line)
+print(ds)
+```
+``` None
+<MapDataset
+    shapes: ({SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()}, ()),
+    types: ({SepalLength: tf.float32, SepalWidth: tf.float32, PetalLength: tf.float32, PetalWidth: tf.float32}, tf.int32)>
+```
+
+Now instead of simple scalar strings, the dataset contains `(features, label)`
+pairs.
+
+The remainder of the `iris_data.csv_input_fn` function is identical
+to `iris_data.train_input_fn`, which was covered in the
+[Basic input](#basic_input) section.
+
+### Try it out
+
+This function can be used as a replacement for
+`iris_data.train_input_fn`. It can be used to feed an estimator as follows:
+
+``` python
+train_path, test_path = iris_data.maybe_download()
+
+# All the inputs are numeric
+feature_columns = [
+    tf.feature_column.numeric_column(name)
+    for name in iris_data.CSV_COLUMN_NAMES[:-1]]
+
+# Build the estimator
+est = tf.estimator.LinearClassifier(feature_columns,
+                                    n_classes=3)
+# Train the estimator
+batch_size = 100
+est.train(
+    steps=1000,
+    input_fn=lambda: iris_data.csv_input_fn(train_path, batch_size))
+```
+
+Estimators expect an `input_fn` to take no arguments. To work around this
+restriction, we use `lambda` to capture the arguments and provide the expected
+interface.
+
+## Summary
+
+The `tf.data` module provides a collection of classes and functions for easily
+reading data from a variety of sources. Furthermore, `tf.data` has simple,
+powerful methods for applying a wide variety of standard and custom
+transformations.
+
+Now you have the basic idea of how to efficiently load data into an
+Estimator. Consider the following documents next:
+
+
+* @{$custom_estimators}, which demonstrates how to build your own
+  custom `Estimator` model.
+* The @{$low_level_intro#datasets$Low Level Introduction}, which demonstrates
+  how to experiment directly with `tf.data.Datasets` using TensorFlow's low
+  level APIs.
+* @{$guide/datasets}, which goes into great detail about additional
+  functionality of `Datasets`.
+
diff --git a/tensorflow/docs_src/guide/debugger.md b/tensorflow/docs_src/guide/debugger.md
new file mode 100644
index 0000000000..6bd941886d
--- /dev/null
+++ b/tensorflow/docs_src/guide/debugger.md
@@ -0,0 +1,804 @@
+# TensorFlow Debugger
+
+
+
+[TOC]
+
+`tfdbg` is a specialized debugger for TensorFlow. It lets you view the internal
+structure and states of running TensorFlow graphs during training and inference,
+which is difficult to debug with general-purpose debuggers such as Python's `pdb`
+due to TensorFlow's computation-graph paradigm.
+
+This guide focuses on the command-line interface (CLI) of `tfdbg`. For a guide
+on how to use the graphical user interface (GUI) of tfdbg, i.e., the
+**TensorBoard Debugger Plugin**, please visit
+[its README](https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md).
+
+Note: The TensorFlow debugger uses a
+[curses](https://en.wikipedia.org/wiki/Curses_\(programming_library\))-based text
+user interface. On Mac OS X, the `ncurses` library is required and can be
+installed with `brew install homebrew/dupes/ncurses`. On Windows, curses isn't as
+well supported, so a [readline](https://en.wikipedia.org/wiki/GNU_Readline)-based
+interface can be used with tfdbg by installing `pyreadline` with `pip`. If you
+use Anaconda3, you can install it with a command such as
+`"C:\Program Files\Anaconda3\Scripts\pip.exe" install pyreadline`.
Unofficial
+Windows curses packages can be downloaded
+[here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#curses), then subsequently
+installed using `pip install <downloaded .whl file>`; however, curses on
+Windows may not work as reliably as curses on Linux or Mac.
+
+This tutorial demonstrates how to use the **tfdbg** CLI to debug the appearance
+of [`nan`s](https://en.wikipedia.org/wiki/NaN)
+and [`inf`s](https://en.wikipedia.org/wiki/Infinity), a frequently-encountered
+type of bug in TensorFlow model development.
+The following example is for users who use the low-level
+[`Session`](https://www.tensorflow.org/api_docs/python/tf/Session) API of
+TensorFlow. A later section of this document describes how to use **tfdbg**
+with a higher-level API, namely `Estimator`s.
+To *observe* such an issue, run the following command without the debugger (the
+source code can be found
+[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py)):
+
+```none
+python -m tensorflow.python.debug.examples.debug_mnist
+```
+
+This code trains a simple neural network for MNIST digit image recognition.
+Notice that the accuracy increases slightly after the first training step, but
+then gets stuck at a low (near-chance) level:
+
+```none
+Accuracy at step 0: 0.1113
+Accuracy at step 1: 0.3183
+Accuracy at step 2: 0.098
+Accuracy at step 3: 0.098
+Accuracy at step 4: 0.098
+```
+
+Wondering what might have gone wrong, you suspect that certain nodes in the
+training graph generated bad numeric values such as `inf`s and `nan`s, because
+this is a common cause of this type of training failure.
+Let's use tfdbg to debug this issue and pinpoint the exact graph node where this
+numeric problem first surfaced.
+
+## Wrapping TensorFlow Sessions with tfdbg
+
+To add support for tfdbg in our example, all that is needed is to add the
+following lines of code and wrap the Session object with a debugger wrapper.
+This code is already added in
+[debug_mnist.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py),
+so you can activate tfdbg CLI with the `--debug` flag at the command line.
+
+```python
+# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
+# (You don't need to worry about the BUILD dependency if you are using a pip
+# install of open-source TensorFlow.)
+from tensorflow.python import debug as tf_debug
+
+sess = tf_debug.LocalCLIDebugWrapperSession(sess)
+```
+
+This wrapper has the same interface as Session, so enabling debugging requires
+no other changes to the code. The wrapper provides additional features,
+including:
+
+* Bringing up a CLI before and after `Session.run()` calls, to let you
+control the execution and inspect the graph's internal state.
+* Allowing you to register special `filters` for tensor values, to facilitate
+the diagnosis of issues.
+
+In this example, we have already registered a tensor filter called
+@{tfdbg.has_inf_or_nan},
+which simply determines if there are any `nan` or `inf` values in any
+intermediate tensors (tensors that are neither inputs nor outputs of the
+`Session.run()` call, but are in the path leading from the inputs to the
+outputs). This filter for `nan`s and `inf`s is a common enough use case that
+we ship it with the
+@{$python/tfdbg#Classes_for_debug_dump_data_and_directories$`debug_data`}
+module.
+
+Note: You can also write your own custom filters. See
+the @{tfdbg.DebugDumpDir.find$API documentation}
+of `DebugDumpDir.find()` for additional information.
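+
+For instance, the following is a minimal sketch of a custom filter that flags
+any tensor containing elements of large magnitude (the `has_large_values` name
+and the `1e3` threshold are illustrative choices, not part of tfdbg). It is
+registered on the wrapped session in the same way as `has_inf_or_nan`:
+
+```python
+import numpy as np
+
+def has_large_values(datum, tensor):
+  # `tensor` arrives as a numpy array; skip uninitialized or non-numeric
+  # dumps, then flag any element whose magnitude exceeds the (illustrative)
+  # threshold of 1e3.
+  return (isinstance(tensor, np.ndarray) and
+          np.issubdtype(tensor.dtype, np.floating) and
+          np.any(np.abs(tensor) > 1e3))
+
+sess.add_tensor_filter('has_large_values', has_large_values)
+```
+
+Once registered, such a filter can be used with the `lt -f` and `run -f`
+commands described below (e.g., `run -f has_large_values`).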
+
+## Debugging Model Training with tfdbg
+
+
+Let's try training the model again, but with the `--debug` flag added this time:
+
+```none
+python -m tensorflow.python.debug.examples.debug_mnist --debug
+```
+
+The debug wrapper session will prompt you when it is about to execute the first
+`Session.run()` call, with information regarding the fetched tensor and feed
+dictionaries displayed on the screen.
+
+![tfdbg run-start UI](https://www.tensorflow.org/images/tfdbg_screenshot_run_start.png)
+
+This is what we refer to as the *run-start CLI*. It lists the feeds and fetches
+to the current `Session.run` call, before executing anything.
+
+If the screen size is too small to display the content of the message in its
+entirety, you can resize it.
+
+Use the **PageUp** / **PageDown** / **Home** / **End** keys to navigate the
+screen output. On most keyboards lacking those keys **Fn + Up** /
+**Fn + Down** / **Fn + Right** / **Fn + Left** will work.
+
+Enter the `run` command (or just `r`) at the command prompt:
+
+```
+tfdbg> run
+```
+
+The `run` command causes tfdbg to execute until the end of the next
+`Session.run()` call, which calculates the model's accuracy using a test data
+set. tfdbg augments the runtime Graph to dump all intermediate tensors.
+After the run ends, tfdbg displays all the dumped tensors' values in the
+*run-end CLI*. For example:
+
+![tfdbg run-end UI: accuracy](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_accuracy.png)
+
+This list of tensors can also be obtained by running the command `lt` after you
+have executed `run`.
+
+### tfdbg CLI Frequently-Used Commands
+
+Try the following commands at the `tfdbg>` prompt (referencing the code at
+`tensorflow/python/debug/examples/debug_mnist.py`):
+
+| Command | Syntax or Option | Explanation | Example |
+|:-------------------|:---------------- |:------------ |:------------------------- |
+| **`lt`** | | **List dumped tensors.** | `lt` |
+| | `-n <name_pattern>` | List dumped tensors with names matching given regular-expression pattern. | `lt -n Softmax.*` |
+| | `-t <op_pattern>` | List dumped tensors with op types matching given regular-expression pattern. | `lt -t MatMul` |
+| | `-f <filter_name>` | List only the tensors that pass a registered tensor filter. | `lt -f has_inf_or_nan` |
+| | `-f <filter_name> -fenn <regex>` | List only the tensors that pass a registered tensor filter, excluding nodes with names matching the regular expression. | `lt -f has_inf_or_nan` `-fenn .*Sqrt.*` |
+| | `-s <sort_key>` | Sort the output by given `sort_key`, whose possible values are `timestamp` (default), `dump_size`, `op_type` and `tensor_name`. | `lt -s dump_size` |
+| | `-r` | Sort in reverse order. | `lt -r -s dump_size` |
+| **`pt`** | | **Print value of a dumped tensor.** | |
+| | `pt <tensor>` | Print tensor value. | `pt hidden/Relu:0` |
+| | `pt <tensor>[slicing]` | Print a subarray of tensor, using [numpy](http://www.numpy.org/)-style array slicing. | `pt hidden/Relu:0[0:50,:]` |
+| | `-a` | Print the entirety of a large tensor, without using ellipses. (May take a long time for large tensors.) | `pt -a hidden/Relu:0[0:50,:]` |
+| | `-r <range>` | Highlight elements falling into specified numerical range. Multiple ranges can be used in conjunction. | `pt hidden/Relu:0 -a -r [[-inf,-1],[1,inf]]` |
+| | `-n <number>` | Print dump corresponding to specified 0-based dump number. Required for tensors with multiple dumps. | `pt -n 0 hidden/Relu:0` |
+| | `-s` | Include a summary of the numeric values of the tensor (applicable only to non-empty tensors with Boolean and numeric types such as `int*` and `float*`.) | `pt -s hidden/Relu:0[0:50,:]` |
+| | `-w` | Write the value of the tensor (possibly sliced) to a Numpy file using [`numpy.save()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.save.html) | `pt -s hidden/Relu:0 -w /tmp/relu.npy` |
+| **`@[coordinates]`** | | Navigate to specified element in `pt` output. | `@[10,0]` or `@10,0` |
+| **`/regex`** | |  [less](https://linux.die.net/man/1/less)-style search for given regular expression. | `/inf` |
+| **`/`** | | Scroll to the next line with matches to the searched regex (if any). | `/` |
+| **`pf`** | | **Print a value in the feed_dict to `Session.run`.** | |
+| | `pf <feed_tensor_name>` | Print the value of the feed. Also note that the `pf` command has the `-a`, `-r` and `-s` flags (not listed below), which have the same syntax and semantics as the identically-named flags of `pt`. | `pf input_xs:0` |
+| **eval** | | **Evaluate arbitrary Python and numpy expression.** | |
+| | `eval <expression>` | Evaluate a Python / numpy expression, with numpy available as `np` and debug tensor names enclosed in backticks. | ``eval "np.matmul((`output/Identity:0` / `Softmax:0`).T, `Softmax:0`)"`` |
+| | `-a` | Print a large-sized evaluation result in its entirety, i.e., without using ellipses. | ``eval -a 'np.sum(`Softmax:0`, axis=1)'`` |
+| | `-w` | Write the result of the evaluation to a Numpy file using [`numpy.save()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.save.html) | ``eval -a 'np.sum(`Softmax:0`, axis=1)' -w /tmp/softmax_sum.npy`` |
+| **`ni`** | | **Display node information.** | |
+| | `-a` | Include node attributes in the output. | `ni -a hidden/Relu` |
+| | `-d` | List the debug dumps available from the node. | `ni -d hidden/Relu` |
+| | `-t` | Display the Python stack trace of the node's creation. | `ni -t hidden/Relu` |
+| **`li`** | | **List inputs to node** | |
+| | `-r` | List the inputs to node, recursively (the input tree.) | `li -r hidden/Relu:0` |
+| | `-d <max_depth>` | Limit recursion depth under the `-r` mode. | `li -r -d 3 hidden/Relu:0` |
+| | `-c` | Include control inputs. | `li -c -r hidden/Relu:0` |
+| | `-t` | Show op types of input nodes. | `li -t -r hidden/Relu:0` |
+| **`lo`** | | **List output recipients of node** | |
+| | `-r` | List the output recipients of node, recursively (the output tree.) | `lo -r hidden/Relu:0` |
+| | `-d <max_depth>` | Limit recursion depth under the `-r` mode. | `lo -r -d 3 hidden/Relu:0` |
+| | `-c` | Include recipients via control edges. | `lo -c -r hidden/Relu:0` |
+| | `-t` | Show op types of recipient nodes. | `lo -t -r hidden/Relu:0` |
+| **`ls`** | | **List Python source files involved in node creation.** | |
+| | `-p <path_pattern>` | Limit output to source files matching given regular-expression path pattern. | `ls -p .*debug_mnist.*` |
+| | `-n` | Limit output to node names matching given regular-expression pattern. | `ls -n Softmax.*` |
+| **`ps`** | | **Print Python source file.** | |
+| | `ps <file_path>` | Print given Python source file source.py, with the lines annotated with the nodes created at each of them (if any). | `ps /path/to/source.py` |
+| | `-t` | Perform annotation with respect to Tensors, instead of the default, nodes. | `ps -t /path/to/source.py` |
+| | `-b <line_number>` | Annotate source.py beginning at given line. | `ps -b 30 /path/to/source.py` |
+| | `-m <max_elements>` | Limit the number of elements in the annotation for each line. | `ps -m 100 /path/to/source.py` |
+| **`run`** | | **Proceed to the next Session.run()** | `run` |
+| | `-n` | Execute through the next `Session.run` without debugging, and drop to CLI right before the run after that. | `run -n` |
+| | `-t <T>` | Execute `Session.run` `T - 1` times without debugging, followed by a run with debugging. Then drop to CLI right after the debugged run. | `run -t 10` |
+| | `-f <filter_name>` | Continue executing `Session.run` until any intermediate tensor triggers the specified Tensor filter (causes the filter to return `True`). | `run -f has_inf_or_nan` |
+| | `-f <filter_name> -fenn <regex>` | Continue executing `Session.run` until any intermediate tensor whose node name doesn't match the regular expression triggers the specified Tensor filter (causes the filter to return `True`). | `run -f has_inf_or_nan -fenn .*Sqrt.*` |
+| | `--node_name_filter <pattern>` | Execute the next `Session.run`, watching only nodes with names matching the given regular-expression pattern. | `run --node_name_filter Softmax.*` |
+| | `--op_type_filter <pattern>` | Execute the next `Session.run`, watching only nodes with op types matching the given regular-expression pattern. | `run --op_type_filter Variable.*` |
+| | `--tensor_dtype_filter <pattern>` | Execute the next `Session.run`, dumping only Tensors with data types (`dtype`s) matching the given regular-expression pattern. | `run --tensor_dtype_filter int.*` |
+| | `-p` | Execute the next `Session.run` call in profiling mode. | `run -p` |
+| **`ri`** | | **Display information about the current run, including fetches and feeds.** | `ri` |
+| **`config`** | | **Set or show persistent TFDBG UI configuration.** | |
+| | `set` | Set the value of a config item: {`graph_recursion_depth`, `mouse_mode`}. | `config set graph_recursion_depth 3` |
+| | `show` | Show current persistent UI configuration. | `config show` |
+| **`help`** | | **Print general help information** | `help` |
+| | `help <command>` | Print help for given command. | `help lt` |
+
+Note that each time you enter a command, a new screen output
+will appear. This is somewhat analogous to web pages in a browser. You can
+navigate between these screens by clicking the `<--` and
+`-->` text arrows near the top-left corner of the CLI.
+
+### Other Features of the tfdbg CLI
+
+In addition to the commands listed above, the tfdbg CLI provides the following
+additional features:
+
+* To navigate through previous tfdbg commands, type in a few characters
+  followed by the Up or Down arrow keys. tfdbg will show you the history of
+  commands that started with those characters.
+* To navigate through the history of screen outputs, do either of the
+  following:
+  * Use the `prev` and `next` commands.
+  * Click underlined `<--` and `-->` links near the top left corner of the
+    screen.
+* Tab completion of commands and some command arguments.
+* To redirect the screen output to a file instead of the screen, end the
+  command with bash-style redirection. For example, the following command
+  redirects the output of the pt command to the `/tmp/xent_value_slices.txt`
+  file:
+
+  ```none
+  tfdbg> pt cross_entropy/Log:0[:, 0:10] > /tmp/xent_value_slices.txt
+  ```
+
+### Finding `nan`s and `inf`s
+
+In this first `Session.run()` call, there happen to be no problematic numerical
+values. You can move on to the next run by using the command `run` or its
+shorthand `r`.
+
+> TIP: If you enter `run` or `r` repeatedly, you will be able to move through
+> the `Session.run()` calls in a sequential manner.
+>
+> You can also use the `-t` flag to move ahead a number of `Session.run()` calls
+> at a time, for example:
+>
+> ```
+> tfdbg> run -t 10
+> ```
+
+Instead of entering `run` repeatedly and manually searching for `nan`s and
+`inf`s in the run-end UI after every `Session.run()` call (for example, by using
+the `pt` command shown in the table above), you can use the following
+command to let the debugger repeatedly execute `Session.run()` calls without
+stopping at the run-start or run-end prompt, until the first `nan` or `inf`
+value shows up in the graph. This is analogous to *conditional breakpoints* in
+some procedural-language debuggers:
+
+```none
+tfdbg> run -f has_inf_or_nan
+```
+
+> NOTE: The preceding command works properly because a tensor filter called
+> `has_inf_or_nan` has been registered for you when the wrapped session is
+> created. This filter detects `nan`s and `inf`s (as explained previously).
+> If you have registered any other filters, you can
+> use "run -f" to have tfdbg run until any tensor triggers that filter (causes
+> the filter to return True).
+>
+> ``` python
+> def my_filter_callable(datum, tensor):
+>   # A filter that detects zero-valued scalars.
+>   return len(tensor.shape) == 0 and tensor == 0.0
+>
+> sess.add_tensor_filter('my_filter', my_filter_callable)
+> ```
+>
+> Then at the tfdbg run-start prompt run until your filter is triggered:
+>
+> ```
+> tfdbg> run -f my_filter
+> ```
+
+See [this API document](https://www.tensorflow.org/api_docs/python/tfdbg/DebugDumpDir#find)
+for more information on the expected signature and return value of the predicate
+`Callable` used with `add_tensor_filter()`.
+
+![tfdbg run-end UI: infs and nans](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_inf_nan.png)
+
+As the screen display indicates on the first line, the `has_inf_or_nan` filter is first triggered
+during the fourth `Session.run()` call: an
+[Adam optimizer](https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer)
+forward-backward training pass on the graph. In this run, 36 (out of the total
+95) intermediate tensors contain `nan` or `inf` values. These tensors are listed
+in chronological order, with their timestamps displayed on the left. At the top
+of the list, you can see the first tensor in which the bad numerical values
+first surfaced: `cross_entropy/Log:0`.
+
+To view the value of the tensor, click the underlined tensor name
+`cross_entropy/Log:0` or enter the equivalent command:
+
+```none
+tfdbg> pt cross_entropy/Log:0
+```
+
+Scroll down a little and you will notice some scattered `inf` values. If the
+instances of `inf` and `nan` are difficult to spot by eye, you can use the
+following command to perform a regex search and highlight the output:
+
+```none
+tfdbg> /inf
+```
+
+Or, alternatively:
+
+```none
+tfdbg> /(inf|nan)
+```
+
+You can also use the `-s` or `--numeric_summary` command to get a quick summary
+of the types of numeric values in the tensor:
+
+``` none
+tfdbg> pt -s cross_entropy/Log:0
+```
+
+From the summary, you can see that several of the 1000 elements of the
+`cross_entropy/Log:0` tensor are `-inf`s (negative infinities).
+
+Why did these infinities appear?
To further debug, display more information
+about the node `cross_entropy/Log` by clicking the underlined `node_info` menu
+item on the top or entering the equivalent node_info (`ni`) command:
+
+```none
+tfdbg> ni cross_entropy/Log
+```
+
+![tfdbg run-end UI: infs and nans](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_node_info.png)
+
+You can see that this node has the op type `Log`
+and that its input is the node `Softmax`. Run the following command to
+take a closer look at the input tensor:
+
+```none
+tfdbg> pt Softmax:0
+```
+
+Examine the values in the input tensor, searching for zeros:
+
+```none
+tfdbg> /0\.000
+```
+
+Indeed, there are zeros. Now it is clear that the origin of the bad numerical
+values is the node `cross_entropy/Log` taking logs of zeros. To find out the
+culprit line in the Python source code, use the `-t` flag of the `ni` command
+to show the traceback of the node's construction:
+
+```none
+tfdbg> ni -t cross_entropy/Log
+```
+
+If you click "node_info" at the top of the screen, tfdbg automatically shows the
+traceback of the node's construction.
+
+From the traceback, you can see that the op is constructed at the following
+line:
+[`debug_mnist.py`](https://www.tensorflow.org/code/tensorflow/python/debug/examples/debug_mnist.py):
+
+```python
+diff = -(y_ * tf.log(y))
+```
+
+**tfdbg** has a feature that makes it easy to trace Tensors and ops back to
+lines in Python source files. It can annotate lines of a Python file with
+the ops or Tensors created by them. To use this feature,
+simply click the underlined line numbers in the stack trace output of the
+`ni -t <node_name>` commands, or use the `ps` (or `print_source`) command such as:
+`ps /path/to/source.py`. For example, the following screenshot shows the output
+of a `ps` command.
+
+![tfdbg run-end UI: annotated Python source file](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_annotated_source.png)
+
+### Fixing the problem
+
+To fix the problem, edit `debug_mnist.py`, changing the original line:
+
+```python
+diff = -(y_ * tf.log(y))
+```
+
+to the built-in, numerically-stable implementation of softmax cross-entropy:
+
+```python
+diff = tf.losses.softmax_cross_entropy(labels=y_, logits=logits)
+```
+
+Rerun with the `--debug` flag as follows:
+
+```none
+python -m tensorflow.python.debug.examples.debug_mnist --debug
+```
+
+At the `tfdbg>` prompt, enter the following command:
+
+```none
+run -f has_inf_or_nan
+```
+
+Confirm that no tensors are flagged as containing `nan` or `inf` values, and
+accuracy now continues to rise rather than getting stuck. Success!
+
+## Debugging TensorFlow Estimators
+
+This section explains how to debug TensorFlow programs that use the `Estimator`
+APIs. Part of the convenience provided by these APIs is that
+they manage `Session`s internally. This makes the `LocalCLIDebugWrapperSession`
+described in the preceding sections inapplicable. Fortunately, you can still
+debug them by using special `hook`s provided by `tfdbg`.
+
+`tfdbg` can debug the
+@{tf.estimator.Estimator.train$`train()`},
+@{tf.estimator.Estimator.evaluate$`evaluate()`} and
+@{tf.estimator.Estimator.predict$`predict()`}
+methods of tf-learn `Estimator`s. To debug `Estimator.train()`,
+create a `LocalCLIDebugHook` and supply it in the `hooks` argument. For example:
+
+```python
+# First, let your BUILD target depend on "//tensorflow/python/debug:debug_py"
+# (You don't need to worry about the BUILD dependency if you are using a pip
+# install of open-source TensorFlow.)
+
+from tensorflow.python import debug as tf_debug
+
+# Create a LocalCLIDebugHook and use it as a monitor when calling train().
+hooks = [tf_debug.LocalCLIDebugHook()]
+
+# To debug `train`:
+classifier.train(input_fn,
+                 steps=1000,
+                 hooks=hooks)
+```
+
+Similarly, to debug `Estimator.evaluate()` and `Estimator.predict()`, assign
+hooks to the `hooks` parameter, as in the following example:
+
+```python
+# To debug `evaluate`:
+accuracy_score = classifier.evaluate(eval_input_fn,
+                                     hooks=hooks)["accuracy"]
+
+# To debug `predict`:
+predict_results = classifier.predict(predict_input_fn, hooks=hooks)
+```
+
+[debug_tflearn_iris.py](https://www.tensorflow.org/code/tensorflow/python/debug/examples/debug_tflearn_iris.py),
+based on [tf-learn's iris tutorial](https://www.tensorflow.org/versions/r1.8/get_started/tflearn),
+contains a full example of how to use tfdbg with `Estimator`s.
+To run this example, do:
+
+```none
+python -m tensorflow.python.debug.examples.debug_tflearn_iris --debug
+```
+
+The `LocalCLIDebugHook` also allows you to configure a `watch_fn` that can be
+used to flexibly specify what `Tensor`s to watch on different `Session.run()`
+calls, as a function of the `fetches` and `feed_dict` and other states. See
+@{tfdbg.DumpingDebugWrapperSession.__init__$this API doc}
+for more details.
+
+## Debugging Keras Models with TFDBG
+
+To use TFDBG with [Keras](https://keras.io/), let the Keras backend use
+a TFDBG-wrapped Session object. For example, to use the CLI wrapper:
+
+``` python
+import tensorflow as tf
+from keras import backend as keras_backend
+from tensorflow.python import debug as tf_debug
+
+keras_backend.set_session(tf_debug.LocalCLIDebugWrapperSession(tf.Session()))
+
+# Define your keras model, called "model".
+model.fit(...)  # This will break into the TFDBG CLI.
+```
+
+## Debugging tf-slim with TFDBG
+
+TFDBG supports debugging of training and evaluation with
+[tf-slim](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim).
+As detailed below, training and evaluation require slightly different debugging
+workflows.
+
+### Debugging training in tf-slim
+To debug the training process, provide `LocalCLIDebugWrapperSession` to the
+`session_wrapper` argument of `slim.learning.train()`. For example:
+
+``` python
+import tensorflow as tf
+from tensorflow.python import debug as tf_debug
+
+# ... Code that creates the graph and the train_op ...
+tf.contrib.slim.learning.train(
+    train_op,
+    logdir,
+    number_of_steps=10,
+    session_wrapper=tf_debug.LocalCLIDebugWrapperSession)
+```
+
+### Debugging evaluation in tf-slim
+To debug the evaluation process, provide `LocalCLIDebugHook` to the
+`hooks` argument of `slim.evaluation.evaluate_once()`. For example:
+
+``` python
+import tensorflow as tf
+from tensorflow.python import debug as tf_debug
+
+# ... Code that creates the graph and the eval and final ops ...
+tf.contrib.slim.evaluation.evaluate_once(
+    '',
+    checkpoint_path,
+    logdir,
+    eval_op=my_eval_op,
+    final_op=my_value_op,
+    hooks=[tf_debug.LocalCLIDebugHook()])
+```
+
+## Offline Debugging of Remotely-Running Sessions
+
+Often, your model is running on a remote machine or a process that you don't
+have terminal access to. To perform model debugging in such cases, you can use
+the `offline_analyzer` binary of `tfdbg` (described below). It operates on
+dumped data directories. This works with both the lower-level `Session` API
+and the higher-level `Estimator` API.
+
+### Debugging Remote tf.Sessions
+
+If you interact directly with the `tf.Session` API in Python, you can
+configure the `RunOptions` proto that you pass to your `Session.run()` call
+by using the method @{tfdbg.watch_graph}.
+This will cause the intermediate tensors and runtime graphs to be dumped to a
+shared storage location of your choice when the `Session.run()` call occurs
+(at the cost of slower performance). For example:
+
+```python
+from tensorflow.python import debug as tf_debug
+
+# ... Code where your session and graph are set up...
+
+run_options = tf.RunOptions()
+tf_debug.watch_graph(
+    run_options,
+    session.graph,
+    debug_urls=["file:///shared/storage/location/tfdbg_dumps_1"])
+# Be sure to specify different directories for different run() calls.
+
+session.run(fetches, feed_dict=feeds, options=run_options)
+```
+
+Later, in an environment that you have terminal access to (for example, a local
+computer that can access the shared storage location specified in the code
+above), you can load and inspect the data in the dump directory on the shared
+storage by using the `offline_analyzer` binary of `tfdbg`. For example:
+
+```none
+python -m tensorflow.python.debug.cli.offline_analyzer \
+    --dump_dir=/shared/storage/location/tfdbg_dumps_1
+```
+
+The `Session` wrapper `DumpingDebugWrapperSession` offers an easier and more
+flexible way to generate file-system dumps that can be analyzed offline.
+To use it, simply wrap your session in a `tf_debug.DumpingDebugWrapperSession`.
+For example:
+
+```python
+# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
+# (You don't need to worry about the BUILD dependency if you are using a pip
+# install of open-source TensorFlow.)
+from tensorflow.python import debug as tf_debug
+
+sess = tf_debug.DumpingDebugWrapperSession(
+    sess, "/shared/storage/location/tfdbg_dumps_1/", watch_fn=my_watch_fn)
+```
+
+The `watch_fn` argument accepts a `Callable` that allows you to configure what
+`Tensor`s to watch on different `Session.run()` calls, as a function of the
+`fetches` and `feed_dict` to the `run()` call and other state.
+
+### C++ and other languages
+
+If your model code is written in C++ or other languages, you can also
+modify the `debug_options` field of `RunOptions` to generate debug dumps that
+can be inspected offline. See
+[the proto definition](https://www.tensorflow.org/code/tensorflow/core/protobuf/debug.proto)
+for more details.
+
+### Debugging Remotely-Running Estimators
+
+If your remote TensorFlow server runs `Estimator`s,
+you can use the non-interactive `DumpingDebugHook`. For example:
+
+```python
+# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
+# (You don't need to worry about the BUILD dependency if you are using a pip
+# install of open-source TensorFlow.)
+from tensorflow.python import debug as tf_debug
+
+hooks = [tf_debug.DumpingDebugHook("/shared/storage/location/tfdbg_dumps_1")]
+```
+
+Then this `hook` can be used in the same way as the `LocalCLIDebugHook` examples
+described earlier in this document.
+As the training, evaluation or prediction happens with `Estimator`,
+tfdbg creates directories having the following name pattern:
+`/shared/storage/location/tfdbg_dumps_1/run_<epoch_timestamp_microsec>_<uuid>`.
+Each directory corresponds to a `Session.run()` call that underlies
+the `train()`, `evaluate()` or `predict()` call. You can load these directories and inspect
+them in a command-line interface in an offline manner using the
+`offline_analyzer` offered by tfdbg. 
For example:
+
+```bash
+python -m tensorflow.python.debug.cli.offline_analyzer \
+    --dump_dir="/shared/storage/location/tfdbg_dumps_1/run_<epoch_timestamp_microsec>_<uuid>"
+```
+
+## Frequently Asked Questions
+
+**Q**: _Do the timestamps on the left side of the `lt` output reflect actual
+       performance in a non-debugging session?_
+
+**A**: No. The debugger inserts additional special-purpose debug nodes to the
+       graph to record the values of intermediate tensors. These nodes
+       slow down the graph execution. If you are interested in profiling your
+       model, check out
+
+   1. The profiling mode of tfdbg: `tfdbg> run -p`.
+   2. [tfprof](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/profiler)
+      and other profiling tools for TensorFlow.
+
+**Q**: _How do I link tfdbg against my `Session` in Bazel? Why do I see an
+       error such as "ImportError: cannot import name debug"?_
+
+**A**: In your BUILD rule, declare dependencies:
+       `"//tensorflow:tensorflow_py"` and `"//tensorflow/python/debug:debug_py"`.
+       The first is the dependency that you include to use TensorFlow even
+       without debugger support; the second enables the debugger.
+       Then, in your Python file, add:
+
+```python
+from tensorflow.python import debug as tf_debug
+
+# Then wrap your TensorFlow Session with the local-CLI wrapper.
+sess = tf_debug.LocalCLIDebugWrapperSession(sess)
+```
+
+**Q**: _Does tfdbg help debug runtime errors such as shape mismatches?_
+
+**A**: Yes. tfdbg intercepts errors generated by ops during runtime and presents
+       the errors with some debug instructions to the user in the CLI.
+       See examples:
+
+```none
+# Debugging shape mismatch during matrix multiplication.
+python -m tensorflow.python.debug.examples.debug_errors \
+    --error shape_mismatch --debug
+
+# Debugging uninitialized variable.
+python -m tensorflow.python.debug.examples.debug_errors \
+    --error uninitialized_variable --debug
+```
+
+**Q**: _How can I let my tfdbg-wrapped Sessions or Hooks run the debug mode
+only from the main thread?_
+
+**A**:
+This is a common use case, in which the `Session` object is used from multiple
+threads concurrently. Typically, the child threads take care of background tasks
+such as running enqueue operations. Often, you want to debug only the main
+thread (or, less frequently, only one of the child threads). You can use the
+`thread_name_filter` keyword argument of `LocalCLIDebugWrapperSession` to
+achieve this type of thread-selective debugging. For example, to debug from the
+main thread only, construct a wrapped `Session` as follows:
+
+```python
+sess = tf_debug.LocalCLIDebugWrapperSession(sess, thread_name_filter="MainThread$")
+```
+
+The above example relies on the fact that main threads in Python have the
+default name `MainThread`.
+
+**Q**: _The model I am debugging is very large. The data dumped by tfdbg
+fills up the free space of my disk. What can I do?_
+
+**A**:
+You might encounter this problem in any of the following situations:
+
+* models with many intermediate tensors
+* very large intermediate tensors
+* many @{tf.while_loop} iterations
+
+There are three possible workarounds or solutions:
+
+* The constructors of `LocalCLIDebugWrapperSession` and `LocalCLIDebugHook`
+  provide a keyword argument, `dump_root`, to specify the path
+  to which tfdbg dumps the debug data. You can use it to let tfdbg dump the
+  debug data on a disk with larger free space. 
For example:
+
+```python
+# For LocalCLIDebugWrapperSession
+sess = tf_debug.LocalCLIDebugWrapperSession(dump_root="/with/lots/of/space")
+
+# For LocalCLIDebugHook
+hooks = [tf_debug.LocalCLIDebugHook(dump_root="/with/lots/of/space")]
+```
+  Make sure that the directory pointed to by `dump_root` is empty or nonexistent.
+  `tfdbg` cleans up the dump directories before exiting.
+
+* Reduce the batch size used during the runs.
+* Use the filtering options of tfdbg's `run` command to watch only specific
+  nodes in the graph. For example:
+
+  ```
+  tfdbg> run --node_name_filter .*hidden.*
+  tfdbg> run --op_type_filter Variable.*
+  tfdbg> run --tensor_dtype_filter int.*
+  ```
+
+  The first command above watches only nodes whose names match the
+  regular-expression pattern `.*hidden.*`. The second command watches only
+  nodes whose op types match the pattern `Variable.*`. The third one watches
+  only the tensors whose dtypes match the pattern `int.*` (e.g., `int32`).
+
+
+**Q**: _Why can't I select text in the tfdbg CLI?_
+
+**A**: This is because the tfdbg CLI enables mouse events in the terminal by
+       default. This [mouse-mask](https://linux.die.net/man/3/mousemask) mode
+       overrides default terminal interactions, including text selection. You
+       can re-enable text selection by using the command `mouse off` or
+       `m off`.
+
+**Q**: _Why does the tfdbg CLI show no dumped tensors when I debug code like the following?_
+
+``` python
+a = tf.ones([10], name="a")
+b = tf.add(a, a, name="b")
+sess = tf.Session()
+sess = tf_debug.LocalCLIDebugWrapperSession(sess)
+sess.run(b)
+```
+
+**A**: You see no dumped data because every node in the
+       executed TensorFlow graph is constant-folded by the TensorFlow runtime.
+       In this example, `a` is a constant tensor; therefore, the fetched
+       tensor `b` is effectively also a constant tensor. TensorFlow's graph
+       optimization folds the graph that contains `a` and `b` into a single
+       node to speed up future runs of the graph, which is why `tfdbg` does
+       not generate any intermediate tensor dumps. However, if `a` were a
+       @{tf.Variable}, as in the following example:
+
+``` python
+import numpy as np
+
+a = tf.Variable(np.ones(10), name="a")
+b = tf.add(a, a, name="b")
+sess = tf.Session()
+sess.run(tf.global_variables_initializer())
+sess = tf_debug.LocalCLIDebugWrapperSession(sess)
+sess.run(b)
+```
+
+the constant-folding would not occur and `tfdbg` should show the intermediate
+tensor dumps.
+
+
+**Q**: _I am debugging a model that generates unwanted infinities or NaNs. But
+there are some nodes in my model that are known to generate infinities
+or NaNs in their output tensors even under completely normal conditions.
+How can I skip those nodes during my `run -f has_inf_or_nan` actions?_
+
+**A**: Use the `--filter_exclude_node_names` (`-fenn` for short) flag. For
+       example, if you know you have a node whose name matches the regular
+       expression `.*Sqrt.*` that generates infinities or NaNs regardless
+       of whether the model is behaving correctly, you can exclude those nodes
+       from the infinity/NaN-finding runs with the command
+       `run -f has_inf_or_nan -fenn .*Sqrt.*`.
+
+
+**Q**: _Is there a GUI for tfdbg?_
+
+**A**: Yes, the **TensorBoard Debugger Plugin** is the GUI of tfdbg.
+       It offers features such as inspection of the computation graph,
+       real-time visualization of tensor values, continuation to tensor
+       and conditional breakpoints, and tying tensors to their
+       graph-construction source code, all in the browser environment. 
+ To get started, please visit + [its README](https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md). diff --git a/tensorflow/docs_src/guide/eager.md b/tensorflow/docs_src/guide/eager.md new file mode 100644 index 0000000000..00d02b4455 --- /dev/null +++ b/tensorflow/docs_src/guide/eager.md @@ -0,0 +1,849 @@ +# Eager Execution + +TensorFlow's eager execution is an imperative programming environment that +evaluates operations immediately, without building graphs: operations return +concrete values instead of constructing a computational graph to run later. This +makes it easy to get started with TensorFlow and debug models, and it +reduces boilerplate as well. To follow along with this guide, run the code +samples below in an interactive `python` interpreter. + +Eager execution is a flexible machine learning platform for research and +experimentation, providing: + +* *An intuitive interface*—Structure your code naturally and use Python data + structures. Quickly iterate on small models and small data. +* *Easier debugging*—Call ops directly to inspect running models and test + changes. Use standard Python debugging tools for immediate error reporting. +* *Natural control flow*—Use Python control flow instead of graph control + flow, simplifying the specification of dynamic models. + +Eager execution supports most TensorFlow operations and GPU acceleration. For a +collection of examples running in eager execution, see: +[tensorflow/contrib/eager/python/examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples). + +Note: Some models may experience increased overhead with eager execution +enabled. Performance improvements are ongoing, but please +[file a bug](https://github.com/tensorflow/tensorflow/issues) if you find a +problem and share your benchmarks. + +## Setup and basic usage + +Upgrade to the latest version of TensorFlow: + +``` +$ pip install --upgrade tensorflow +``` + +To start eager execution, add `tf.enable_eager_execution()` to the beginning of +the program or console session. Do not add this operation to other modules that +the program calls. + +```py +from __future__ import absolute_import, division, print_function + +import tensorflow as tf + +tf.enable_eager_execution() +``` + +Now you can run TensorFlow operations and the results will return immediately: + +```py +tf.executing_eagerly() # => True + +x = [[2.]] +m = tf.matmul(x, x) +print("hello, {}".format(m)) # => "hello, [[4.]]" +``` + +Enabling eager execution changes how TensorFlow operations behave—now they +immediately evaluate and return their values to Python. `tf.Tensor` objects +reference concrete values instead of symbolic handles to nodes in a computational +graph. Since there isn't a computational graph to build and run later in a +session, it's easy to inspect results using `print()` or a debugger. Evaluating, +printing, and checking tensor values does not break the flow for computing +gradients. + +Eager execution works nicely with [NumPy](http://www.numpy.org/). NumPy +operations accept `tf.Tensor` arguments. TensorFlow +[math operations](https://www.tensorflow.org/api_guides/python/math_ops) convert +Python objects and NumPy arrays to `tf.Tensor` objects. The +`tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`. 
+
+```py
+a = tf.constant([[1, 2],
+                 [3, 4]])
+print(a)
+# => tf.Tensor([[1 2]
+#               [3 4]], shape=(2, 2), dtype=int32)
+
+# Broadcasting support
+b = tf.add(a, 1)
+print(b)
+# => tf.Tensor([[2 3]
+#               [4 5]], shape=(2, 2), dtype=int32)
+
+# Operator overloading is supported
+print(a * b)
+# => tf.Tensor([[ 2  6]
+#               [12 20]], shape=(2, 2), dtype=int32)
+
+# Use NumPy values
+import numpy as np
+
+c = np.multiply(a, b)
+print(c)
+# => [[ 2  6]
+#     [12 20]]
+
+# Obtain numpy value from a tensor:
+print(a.numpy())
+# => [[1 2]
+#     [3 4]]
+```
+
+The `tf.contrib.eager` module contains symbols available to both eager and graph execution
+environments and is useful for writing code to [work with graphs](#work_with_graphs):
+
+```py
+tfe = tf.contrib.eager
+```
+
+## Dynamic control flow
+
+A major benefit of eager execution is that all the functionality of the host
+language is available while your model is executing. So, for example,
+it is easy to write [fizzbuzz](https://en.wikipedia.org/wiki/Fizz_buzz):
+
+```py
+def fizzbuzz(max_num):
+  counter = tf.constant(0)
+  max_num = tf.convert_to_tensor(max_num)
+  for num in range(max_num.numpy()):
+    num = tf.constant(num)
+    if int(num % 3) == 0 and int(num % 5) == 0:
+      print('FizzBuzz')
+    elif int(num % 3) == 0:
+      print('Fizz')
+    elif int(num % 5) == 0:
+      print('Buzz')
+    else:
+      print(num)
+    counter += 1
+  return counter
+```
+
+This has conditionals that depend on tensor values and it prints these values
+at runtime.
+
+## Build a model
+
+Many machine learning models are represented by composing layers. When
+using TensorFlow with eager execution you can either write your own layers or
+use a layer provided in the `tf.keras.layers` package.
+
+While you can use any Python object to represent a layer,
+TensorFlow has `tf.keras.layers.Layer` as a convenient base class. Inherit from
+it to implement your own layer:
+
+```py
+class MySimpleLayer(tf.keras.layers.Layer):
+  def __init__(self, output_units):
+    super(MySimpleLayer, self).__init__()
+    self.output_units = output_units
+
+  def build(self, input_shape):
+    # The build method gets called the first time your layer is used.
+    # Creating variables on build() allows you to make their shape depend
+    # on the input shape and hence removes the need for the user to specify
+    # full shapes. It is possible to create variables during __init__() if
+    # you already know their full shapes.
+    self.kernel = self.add_variable(
+        "kernel", [input_shape[-1], self.output_units])
+
+  def call(self, input):
+    # Override call() instead of __call__ so we can perform some bookkeeping.
+    return tf.matmul(input, self.kernel)
+```
+
+Use the `tf.keras.layers.Dense` layer instead of `MySimpleLayer` above, as it
+has a superset of its functionality (it can also add a bias).
+
+When composing layers into models you can use `tf.keras.Sequential` to represent
+models that are a linear stack of layers. It is easy to use for basic models:
+
+```py
+model = tf.keras.Sequential([
+  tf.keras.layers.Dense(10, input_shape=(784,)),  # must declare input shape
+  tf.keras.layers.Dense(10)
+])
+```
+
+Alternatively, organize models in classes by inheriting from `tf.keras.Model`.
+This is a container for layers that is a layer itself, allowing `tf.keras.Model`
+objects to contain other `tf.keras.Model` objects.
+
+```py
+class MNISTModel(tf.keras.Model):
+  def __init__(self):
+    super(MNISTModel, self).__init__()
+    self.dense1 = tf.keras.layers.Dense(units=10)
+    self.dense2 = tf.keras.layers.Dense(units=10)
+
+  def call(self, input):
+    """Run the model."""
+    result = self.dense1(input)
+    result = self.dense2(result)
+    result = self.dense2(result)  # reuse variables from dense2 layer
+    return result
+
+model = MNISTModel()
+```
+
+It's not required to set an input shape for the `tf.keras.Model` class since
+the parameters are set the first time input is passed to the layer.
+
+`tf.keras.layers` classes create and contain their own model variables that
+are tied to the lifetime of their layer objects. To share layer variables, share
+their objects.
+
+
+## Eager training
+
+### Computing gradients
+
+[Automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation)
+is useful for implementing machine learning algorithms such as
+[backpropagation](https://en.wikipedia.org/wiki/Backpropagation) for training
+neural networks. During eager execution, use `tf.GradientTape` to trace
+operations for computing gradients later.
+
+`tf.GradientTape` is an opt-in feature to provide maximal performance when
+not tracing. Since different operations can occur during each call, all
+forward-pass operations get recorded to a "tape". To compute the gradient, play
+the tape backwards and then discard it. A particular `tf.GradientTape` can only
+compute one gradient; subsequent calls throw a runtime error.
+
+```py
+w = tfe.Variable([[1.0]])
+with tf.GradientTape() as tape:
+  loss = w * w
+
+grad = tape.gradient(loss, w)
+print(grad)  # => tf.Tensor([[ 2.]], shape=(1, 1), dtype=float32)
+```
+
+Here's an example of `tf.GradientTape` that records forward-pass operations
+to train a simple model:
+
+```py
+# A toy dataset of points around 3 * x + 2
+NUM_EXAMPLES = 1000
+training_inputs = tf.random_normal([NUM_EXAMPLES])
+noise = tf.random_normal([NUM_EXAMPLES])
+training_outputs = training_inputs * 3 + 2 + noise
+
+def prediction(input, weight, bias):
+  return input * weight + bias
+
+# A loss function using mean-squared error
+def loss(weights, biases):
+  error = prediction(training_inputs, weights, biases) - training_outputs
+  return tf.reduce_mean(tf.square(error))
+
+# Return the derivative of loss with respect to weight and bias
+def grad(weights, biases):
+  with tf.GradientTape() as tape:
+    loss_value = loss(weights, biases)
+  return tape.gradient(loss_value, [weights, biases])
+
+train_steps = 200
+learning_rate = 0.01
+# Start with arbitrary values for W and B on the same batch of data
+W = tfe.Variable(5.)
+B = tfe.Variable(10.)
+
+print("Initial loss: {:.3f}".format(loss(W, B)))
+
+for i in range(train_steps):
+  dW, dB = grad(W, B)
+  W.assign_sub(dW * learning_rate)
+  B.assign_sub(dB * learning_rate)
+  if i % 20 == 0:
+    print("Loss at step {:03d}: {:.3f}".format(i, loss(W, B)))
+
+print("Final loss: {:.3f}".format(loss(W, B)))
+print("W = {}, B = {}".format(W.numpy(), B.numpy()))
+```
+
+Output (exact numbers may vary):
+
+```
+Initial loss: 71.204
+Loss at step 000: 68.333
+Loss at step 020: 30.222
+Loss at step 040: 13.691
+Loss at step 060: 6.508
+Loss at step 080: 3.382
+Loss at step 100: 2.018
+Loss at step 120: 1.422
+Loss at step 140: 1.161
+Loss at step 160: 1.046
+Loss at step 180: 0.996
+Final loss: 0.974
+W = 3.01582956314, B = 2.1191945076
+```
+
+Replay the `tf.GradientTape` to compute the gradients and apply them in a
+training loop. 
This is demonstrated in an excerpt from the
+[mnist_eager.py](https://github.com/tensorflow/models/blob/master/official/mnist/mnist_eager.py)
+example:
+
+```py
+dataset = tf.data.Dataset.from_tensor_slices((data.train.images,
+                                              data.train.labels))
+...
+for (batch, (images, labels)) in enumerate(dataset):
+  ...
+  with tf.GradientTape() as tape:
+    logits = model(images, training=True)
+    loss_value = loss(logits, labels)
+  ...
+  grads = tape.gradient(loss_value, model.variables)
+  optimizer.apply_gradients(zip(grads, model.variables),
+                            global_step=tf.train.get_or_create_global_step())
+```
+
+
+The following example creates a multi-layer model that classifies the standard
+[MNIST handwritten digits](https://www.tensorflow.org/tutorials/layers). It
+demonstrates the optimizer and layer APIs to build trainable graphs in an eager
+execution environment.
+
+### Train a model
+
+Even without training, call the model and inspect the output in eager execution:
+
+```py
+# Create a tensor representing a blank image
+batch = tf.zeros([1, 1, 784])
+print(batch.shape)  # => (1, 1, 784)
+
+result = model(batch)
+# => tf.Tensor([[[ 0. 0., ..., 0.]]], shape=(1, 1, 10), dtype=float32)
+```
+
+This example uses the
+[dataset.py module](https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py)
+from the
+[TensorFlow MNIST example](https://github.com/tensorflow/models/tree/master/official/mnist);
+download this file to your local directory. Run the following to download the
+MNIST data files to your working directory and prepare a `tf.data.Dataset`
+for training:
+
+```py
+import dataset  # download dataset.py file
+dataset_train = dataset.train('./datasets').shuffle(60000).repeat(4).batch(32)
+```
+
+To train a model, define a loss function to optimize and then calculate
+gradients. Use an optimizer to update the variables:
+
+```py
+def loss(model, x, y):
+  prediction = model(x)
+  return tf.losses.sparse_softmax_cross_entropy(labels=y, logits=prediction)
+
+def grad(model, inputs, targets):
+  with tf.GradientTape() as tape:
+    loss_value = loss(model, inputs, targets)
+  return tape.gradient(loss_value, model.variables)
+
+optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
+
+x, y = next(iter(dataset_train))
+print("Initial loss: {:.3f}".format(loss(model, x, y)))
+
+# Training loop
+for (i, (x, y)) in enumerate(dataset_train):
+  # Calculate derivatives of the input function with respect to its parameters.
+  grads = grad(model, x, y)
+  # Apply the gradient to the model
+  optimizer.apply_gradients(zip(grads, model.variables),
+                            global_step=tf.train.get_or_create_global_step())
+  if i % 200 == 0:
+    print("Loss at step {:04d}: {:.3f}".format(i, loss(model, x, y)))
+
+print("Final loss: {:.3f}".format(loss(model, x, y)))
+```
+
+Output (exact numbers may vary):
+
+```
+Initial loss: 2.674
+Loss at step 0000: 2.593
+Loss at step 0200: 2.143
+Loss at step 0400: 2.009
+Loss at step 0600: 2.103
+Loss at step 0800: 1.621
+Loss at step 1000: 1.695
+...
+Loss at step 6600: 0.602
+Loss at step 6800: 0.557
+Loss at step 7000: 0.499
+Loss at step 7200: 0.744
+Loss at step 7400: 0.681
+Final loss: 0.670
+```
+
+And for faster training, move the computation to a GPU:
+
+```py
+with tf.device("/gpu:0"):
+  for (i, (x, y)) in enumerate(dataset_train):
+    # minimize() is equivalent to the grad() and apply_gradients() calls.
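+    # It computes the gradients with an implicit GradientTape and applies
+    # them in a single step, so no explicit tape is needed here.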
+    optimizer.minimize(lambda: loss(model, x, y),
+                       global_step=tf.train.get_or_create_global_step())
+```
+
+### Variables and optimizers
+
+`tfe.Variable` objects store mutable `tf.Tensor` values accessed during
+training to make automatic differentiation easier. The parameters of a model can
+be encapsulated in classes as variables.
+
+You can better encapsulate model parameters by using `tfe.Variable` with
+`tf.GradientTape`. For example, the automatic differentiation example above
+can be rewritten:
+
+```py
+class Model(tf.keras.Model):
+  def __init__(self):
+    super(Model, self).__init__()
+    self.W = tfe.Variable(5., name='weight')
+    self.B = tfe.Variable(10., name='bias')
+  def predict(self, inputs):
+    return inputs * self.W + self.B
+
+# A toy dataset of points around 3 * x + 2
+NUM_EXAMPLES = 2000
+training_inputs = tf.random_normal([NUM_EXAMPLES])
+noise = tf.random_normal([NUM_EXAMPLES])
+training_outputs = training_inputs * 3 + 2 + noise
+
+# The loss function to be optimized
+def loss(model, inputs, targets):
+  error = model.predict(inputs) - targets
+  return tf.reduce_mean(tf.square(error))
+
+def grad(model, inputs, targets):
+  with tf.GradientTape() as tape:
+    loss_value = loss(model, inputs, targets)
+  return tape.gradient(loss_value, [model.W, model.B])
+
+# Define:
+# 1. A model.
+# 2. Derivatives of a loss function with respect to model parameters.
+# 3. A strategy for updating the variables based on the derivatives.
+model = Model()
+optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
+
+print("Initial loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
+
+# Training loop
+for i in range(300):
+  grads = grad(model, training_inputs, training_outputs)
+  optimizer.apply_gradients(zip(grads, [model.W, model.B]),
+                            global_step=tf.train.get_or_create_global_step())
+  if i % 20 == 0:
+    print("Loss at step {:03d}: {:.3f}".format(i, loss(model, training_inputs, training_outputs)))
+
+print("Final loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
+print("W = {}, B = {}".format(model.W.numpy(), model.B.numpy()))
+```
+
+Output (exact numbers may vary):
+
+```
+Initial loss: 69.066
+Loss at step 000: 66.368
+Loss at step 020: 30.107
+Loss at step 040: 13.959
+Loss at step 060: 6.769
+Loss at step 080: 3.567
+Loss at step 100: 2.141
+Loss at step 120: 1.506
+Loss at step 140: 1.223
+Loss at step 160: 1.097
+Loss at step 180: 1.041
+Loss at step 200: 1.016
+Loss at step 220: 1.005
+Loss at step 240: 1.000
+Loss at step 260: 0.998
+Loss at step 280: 0.997
+Final loss: 0.996
+W = 2.99431324005, B = 2.02129220963
+```
+
+## Use objects for state during eager execution
+
+With graph execution, program state (such as the variables) is stored in global
+collections and their lifetime is managed by the `tf.Session` object. In
+contrast, during eager execution the lifetime of state objects is determined by
+the lifetime of their corresponding Python object.
+
+### Variables are objects
+
+During eager execution, variables persist until the last reference to the object
+is removed, at which point the variable is deleted.
+
+```py
+with tf.device("gpu:0"):
+  v = tfe.Variable(tf.random_normal([1000, 1000]))
+  v = None  # v no longer takes up GPU memory
+```
+
+### Object-based saving
+
+`tfe.Checkpoint` can save and restore `tfe.Variable`s to and from
+checkpoints:
+
+```py
+x = tfe.Variable(10.)
+
+checkpoint = tfe.Checkpoint(x=x)  # save as "x"
+
+x.assign(2.)  # Assign a new value to the variable and save.
+save_path = checkpoint.save('./ckpt/')
+
+x.assign(11.)
  # Change the variable after saving.
+
+# Restore values from the checkpoint
+checkpoint.restore(save_path)
+
+print(x)  # => 2.0
+```
+
+To save and load models, `tfe.Checkpoint` stores the internal state of objects,
+without requiring hidden variables. To record the state of a `model`,
+an `optimizer`, and a global step, pass them to a `tfe.Checkpoint`:
+
+```py
+import os
+
+model = MyModel()
+optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
+checkpoint_dir = '/path/to/model_dir'
+checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
+root = tfe.Checkpoint(optimizer=optimizer,
+                      model=model,
+                      optimizer_step=tf.train.get_or_create_global_step())
+
+root.save(file_prefix=checkpoint_prefix)
+# or
+root.restore(tf.train.latest_checkpoint(checkpoint_dir))
+```
+
+### Object-oriented metrics
+
+`tfe.metrics` are stored as objects. Update a metric by passing the new data to
+the callable, and retrieve the result using the metric's `result` method,
+for example:
+
+```py
+m = tfe.metrics.Mean("loss")
+m(0)
+m(5)
+m.result()  # => 2.5
+m([8, 9])
+m.result()  # => 5.5
+```
+
+#### Summaries and TensorBoard
+
+@{$summaries_and_tensorboard$TensorBoard} is a visualization tool for
+understanding, debugging and optimizing the model training process. It uses
+summary events that are written while executing the program.
+
+`tf.contrib.summary` is compatible with both eager and graph execution
+environments. Summary operations, such as `tf.contrib.summary.scalar`, are
+inserted during model construction. For example, to record summaries once every
+100 global steps:
+
+```py
+writer = tf.contrib.summary.create_file_writer(logdir)
+global_step = tf.train.get_or_create_global_step()  # return global step var
+
+writer.set_as_default()
+
+for _ in range(iterations):
+  global_step.assign_add(1)
+  # Must include a record_summaries method
+  with tf.contrib.summary.record_summaries_every_n_global_steps(100):
+    # your model code goes here
+    tf.contrib.summary.scalar('loss', loss)
+    ...
+```
+
+## Advanced automatic differentiation topics
+
+### Dynamic models
+
+`tf.GradientTape` can also be used in dynamic models. This example for a
+[backtracking line search](https://wikipedia.org/wiki/Backtracking_line_search)
+algorithm looks like normal NumPy code, except that there are gradients and it
+is differentiable, despite the complex control flow:
+
+```py
+def line_search_step(fn, init_x, rate=1.0):
+  with tf.GradientTape() as tape:
+    # Variables are automatically recorded, but manually watch a tensor
+    tape.watch(init_x)
+    value = fn(init_x)
+  grad = tape.gradient(value, init_x)
+  grad_norm = tf.reduce_sum(grad * grad)
+  init_value = value
+  while value > init_value - rate * grad_norm:
+    x = init_x - rate * grad
+    value = fn(x)
+    rate /= 2.0
+  return x, value
+```
+
+### Additional functions to compute gradients
+
+`tf.GradientTape` is a powerful interface for computing gradients, but there
+is another [Autograd](https://github.com/HIPS/autograd)-style API available for
+automatic differentiation. These functions are useful if you are writing math
+code with only tensors and gradient functions, and without `tfe.Variable`s:
+
+* `tfe.gradients_function` —Returns a function that computes the derivatives
+  of its input function parameter with respect to its arguments. The input
+  function parameter must return a scalar value. When the returned function is
+  invoked, it returns a list of `tf.Tensor` objects: one element for each
+  argument of the input function. 
 Since anything of interest must be passed as a
+  function parameter, this becomes unwieldy if there's a dependency on many
+  trainable parameters.
+* `tfe.value_and_gradients_function` —Similar to
+  `tfe.gradients_function`, but when the returned function is invoked, it
+  returns the value from the input function in addition to the list of
+  derivatives of the input function with respect to its arguments.
+
+In the following example, `tfe.gradients_function` takes the `square`
+function as an argument and returns a function that computes the partial
+derivatives of `square` with respect to its inputs. To calculate the derivative
+of `square` at `3`, `grad(3.0)` returns `6`.
+
+```py
+def square(x):
+  return tf.multiply(x, x)
+
+grad = tfe.gradients_function(square)
+
+square(3.)  # => 9.0
+grad(3.)    # => [6.0]
+
+# The second-order derivative of square:
+gradgrad = tfe.gradients_function(lambda x: grad(x)[0])
+gradgrad(3.)  # => [2.0]
+
+# The third-order derivative is None:
+gradgradgrad = tfe.gradients_function(lambda x: gradgrad(x)[0])
+gradgradgrad(3.)  # => [None]
+
+
+# With flow control:
+def abs(x):
+  return x if x > 0. else -x
+
+grad = tfe.gradients_function(abs)
+
+grad(3.)   # => [1.0]
+grad(-3.)  # => [-1.0]
+```
+
+### Custom gradients
+
+Custom gradients are an easy way to override gradients in eager and graph
+execution. Within the forward function, define the gradient with respect to the
+inputs, outputs, or intermediate results. For example, here's an easy way to clip
+the norm of the gradients in the backward pass:
+
+```py
+@tf.custom_gradient
+def clip_gradient_by_norm(x, norm):
+  y = tf.identity(x)
+  def grad_fn(dresult):
+    return [tf.clip_by_norm(dresult, norm), None]
+  return y, grad_fn
+```
+
+Custom gradients are commonly used to provide a numerically stable gradient for a
+sequence of operations:
+
+```py
+def log1pexp(x):
+  return tf.log(1 + tf.exp(x))
+grad_log1pexp = tfe.gradients_function(log1pexp)
+
+# The gradient computation works fine at x = 0.
+grad_log1pexp(0.)    # => [0.5]
+
+# However, x = 100 fails because of numerical instability.
+grad_log1pexp(100.)  # => [nan]
+```
+
+Here, the `log1pexp` function can be analytically simplified with a custom
+gradient. The implementation below reuses the value for `tf.exp(x)` that is
+computed during the forward pass—making it more efficient by eliminating
+redundant calculations:
+
+```py
+@tf.custom_gradient
+def log1pexp(x):
+  e = tf.exp(x)
+  def grad(dy):
+    return dy * (1 - 1 / (1 + e))
+  return tf.log(1 + e), grad
+
+grad_log1pexp = tfe.gradients_function(log1pexp)
+
+# As before, the gradient computation works fine at x = 0.
+grad_log1pexp(0.)    # => [0.5]
+
+# And the gradient computation also works at x = 100.
+grad_log1pexp(100.)  # => [1.0]
+```
+
+## Performance
+
+Computation is automatically offloaded to GPUs during eager execution. If you
+want control over where a computation runs, you can enclose it in a
+`tf.device('/gpu:0')` block (or the CPU equivalent):
+
+```py
+import time
+
+def measure(x, steps):
+  # TensorFlow initializes a GPU the first time it's used, exclude from timing.
+  tf.matmul(x, x)
+  start = time.time()
+  for i in range(steps):
+    x = tf.matmul(x, x)
+  _ = x.numpy()  # Make sure to execute op and not just enqueue it
+  end = time.time()
+  return end - start
+
+shape = (1000, 1000)
+steps = 200
+print("Time to multiply a {} matrix by itself {} times:".format(shape, steps))
+
+# Run on CPU:
+with tf.device("/cpu:0"):
+  print("CPU: {} secs".format(measure(tf.random_normal(shape), steps)))
+
+# Run on GPU, if available:
+if tfe.num_gpus() > 0:
+  with tf.device("/gpu:0"):
+    print("GPU: {} secs".format(measure(tf.random_normal(shape), steps)))
+else:
+  print("GPU: not found")
+```
+
+Output (exact numbers depend on hardware):
+
+```
+Time to multiply a (1000, 1000) matrix by itself 200 times:
+CPU: 4.614904403686523 secs
+GPU: 0.5581181049346924 secs
+```
+
+A `tf.Tensor` object can be copied to a different device to execute its
+operations:
+
+```py
+x = tf.random_normal([10, 10])
+
+x_gpu0 = x.gpu()
+x_cpu = x.cpu()
+
+_ = tf.matmul(x_cpu, x_cpu)    # Runs on CPU
+_ = tf.matmul(x_gpu0, x_gpu0)  # Runs on GPU:0
+
+if tfe.num_gpus() > 1:
+  x_gpu1 = x.gpu(1)
+  _ = tf.matmul(x_gpu1, x_gpu1)  # Runs on GPU:1
+```
+
+### Benchmarks
+
+For compute-heavy models, such as
+[ResNet50](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/resnet50)
+training on a GPU, eager execution performance is comparable to graph execution.
+But this gap grows larger for models with less computation, and there is work to
+be done to optimize hot code paths for models with many small operations.
+
+
+## Work with graphs
+
+While eager execution makes development and debugging more interactive,
+TensorFlow graph execution has advantages for distributed training, performance
+optimizations, and production deployment. However, writing graph code can feel
+different from writing regular Python code and can be more difficult to debug.
+
+For building and training graph-constructed models, the Python program first
+builds a graph representing the computation, then invokes `Session.run` to send
+the graph for execution on the C++-based runtime. This provides:
+
+* Automatic differentiation using static autodiff.
+* Simple deployment to a platform-independent server.
+* Graph-based optimizations (common subexpression elimination, constant-folding, etc.).
+* Compilation and kernel fusion.
+* Automatic distribution and replication (placing nodes on the distributed system).
+
+Deploying code written for eager execution is more difficult: either generate a
+graph from the model, or run the Python runtime and code directly on the server.
+
+### Write compatible code
+
+The same code written for eager execution will also build a graph during graph
+execution. Do this by simply running the same code in a new Python session where
+eager execution is not enabled.
+
+Most TensorFlow operations work during eager execution, but there are some things
+to keep in mind (a short sketch of compatible code follows this list):
+
+* Use `tf.data` for input processing instead of queues. It's faster and easier.
+* Use object-oriented layer APIs—like `tf.keras.layers` and
+  `tf.keras.Model`—since they have explicit storage for variables.
+* Most model code works the same during eager and graph execution, but there are
+  exceptions. (For example, dynamic models using Python control flow to change the
+  computation based on inputs.)
+* Once eager execution is enabled with `tf.enable_eager_execution`, it
+  cannot be turned off. Start a new Python session to return to graph execution.
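+
+For example, a minimal sketch of code that runs unchanged in both modes
+(`compute` is just an illustrative helper name, not an API):
+
+```py
+import tensorflow as tf
+
+def compute(x):
+  # Under eager execution this returns a concrete value immediately;
+  # under graph execution it returns a symbolic tensor instead.
+  return tf.reduce_sum(tf.square(x))
+
+result = compute(tf.constant([1.0, 2.0]))
+# Eager mode: print(result) shows the value directly.
+# Graph mode: evaluate explicitly instead:
+#   with tf.Session() as sess:
+#     print(sess.run(result))
+```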
+
+It's best to write code for both eager execution *and* graph execution. This
+gives you eager's interactive experimentation and debuggability with the
+distributed performance benefits of graph execution.
+
+Write, debug, and iterate in eager execution, then import the model graph for
+production deployment. Use `tfe.Checkpoint` to save and restore model
+variables; this allows movement between eager and graph execution environments.
+See the examples in:
+[tensorflow/contrib/eager/python/examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples).
+
+### Use eager execution in a graph environment
+
+Selectively enable eager execution in a TensorFlow graph environment using
+`tfe.py_func`. This is used when `tf.enable_eager_execution()` has *not*
+been called.
+
+```py
+def my_py_func(x):
+  x = tf.matmul(x, x)  # You can use tf ops
+  print(x)  # but it's eager!
+  return x
+
+with tf.Session() as sess:
+  x = tf.placeholder(dtype=tf.float32)
+  # Call eager function in graph!
+  pf = tfe.py_func(my_py_func, [x], tf.float32)
+  sess.run(pf, feed_dict={x: [[2.0]]})  # [[4.0]]
+```
diff --git a/tensorflow/docs_src/guide/embedding.md b/tensorflow/docs_src/guide/embedding.md
new file mode 100644
index 0000000000..8a98367dfb
--- /dev/null
+++ b/tensorflow/docs_src/guide/embedding.md
@@ -0,0 +1,262 @@
+# Embeddings
+
+This document introduces the concept of embeddings, gives a simple example of
+how to train an embedding in TensorFlow, and explains how to view embeddings
+with the TensorBoard Embedding Projector
+([live example](http://projector.tensorflow.org)). The first two parts target
+newcomers to machine learning or TensorFlow, and the Embedding Projector how-to
+is for users at all levels.
+
+An alternative tutorial on these concepts is available in the
+[Embeddings section of Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/embeddings/video-lecture).
+
+[TOC]
+
+An **embedding** is a mapping from discrete objects, such as words, to vectors
+of real numbers. For example, a 300-dimensional embedding for English words
+could include:
+
+```
+blue: (0.01359, 0.00075997, 0.24608, ..., -0.2524, 1.0048, 0.06259)
+blues: (0.01396, 0.11887, -0.48963, ..., 0.033483, -0.10007, 0.1158)
+orange: (-0.24776, -0.12359, 0.20986, ..., 0.079717, 0.23865, -0.014213)
+oranges: (-0.35609, 0.21854, 0.080944, ..., -0.35413, 0.38511, -0.070976)
+```
+
+The individual dimensions in these vectors typically have no inherent meaning.
+Instead, it's the overall patterns of location and distance between vectors
+that machine learning takes advantage of.
+
+Embeddings are important as inputs to machine learning. Classifiers, and neural
+networks more generally, work on vectors of real numbers. They train best on
+dense vectors, where all values contribute to define an object. However, many
+important inputs to machine learning, such as words of text, do not have a
+natural vector representation. Embedding functions are the standard and
+effective way to transform such discrete input objects into useful
+continuous vectors.
+
+Embeddings are also valuable as outputs of machine learning. Because embeddings
+map objects to vectors, applications can use similarity in vector space (for
+instance, Euclidean distance or the angle between vectors) as a robust and
+flexible measure of object similarity. One common use is to find nearest
+neighbors. 
Using the same word embeddings as above, for instance, here are the
+three nearest neighbors for each word and the corresponding angles:
+
+```
+blue: (red, 47.6°), (yellow, 51.9°), (purple, 52.4°)
+blues: (jazz, 53.3°), (folk, 59.1°), (bluegrass, 60.6°)
+orange: (yellow, 53.5°), (colored, 58.0°), (bright, 59.9°)
+oranges: (apples, 45.3°), (lemons, 48.3°), (mangoes, 50.4°)
+```
+
+This would tell an application that apples and oranges are in some way more
+similar (45.3° apart) than lemons and oranges (48.3° apart).
+
+## Embeddings in TensorFlow
+
+To create word embeddings in TensorFlow, we first split the text into words
+and then assign an integer to every word in the vocabulary. Let us assume that
+this has already been done, and that `word_ids` is a vector of these integers.
+For example, the sentence "I have a cat." could be split into
+`["I", "have", "a", "cat", "."]` and then the corresponding `word_ids` tensor
+would have shape `[5]` and consist of 5 integers. To map these word ids
+to vectors, we need to create the embedding variable and use the
+`tf.nn.embedding_lookup` function as follows:
+
+```
+word_embeddings = tf.get_variable("word_embeddings",
+                                  [vocabulary_size, embedding_size])
+embedded_word_ids = tf.nn.embedding_lookup(word_embeddings, word_ids)
+```
+
+After this, the tensor `embedded_word_ids` will have shape `[5, embedding_size]`
+in our example and contain the embeddings (dense vectors) for each of the 5
+words. At the end of training, `word_embeddings` will contain the embeddings
+for all words in the vocabulary.
+
+Embeddings can be trained in many network types, and with various loss
+functions and data sets. For example, one could use a recurrent neural network
+to predict the next word from the previous one given a large corpus of
+sentences, or one could train two networks to do multi-lingual translation.
+These methods are described in the @{$word2vec$Vector Representations of Words}
+tutorial.
+
+## Visualizing Embeddings
+
+TensorBoard includes the **Embedding Projector**, a tool that lets you
+interactively visualize embeddings. This tool can read embeddings from your
+model and render them in two or three dimensions.
+
+The Embedding Projector has three panels:
+
+- *Data panel* on the top left, where you can choose the run, the embedding
+  variable and data columns to color and label points by.
+- *Projections panel* on the bottom left, where you can choose the type of
+  projection.
+- *Inspector panel* on the right side, where you can search for particular
+  points and see a list of nearest neighbors.
+
+### Projections
+The Embedding Projector provides three ways to reduce the dimensionality of a
+data set.
+
+- *[t-SNE](https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding)*:
+  a nonlinear nondeterministic algorithm (T-distributed stochastic neighbor
+  embedding) that tries to preserve local neighborhoods in the data, often at
+  the expense of distorting global structure. You can choose whether to compute
+  two- or three-dimensional projections.
+
+- *[PCA](https://en.wikipedia.org/wiki/Principal_component_analysis)*:
+  a linear deterministic algorithm (principal component analysis) that tries to
+  capture as much of the data variability in as few dimensions as possible. PCA
+  tends to highlight large-scale structure in the data, but can distort local
+  neighborhoods. The Embedding Projector computes the top 10 principal
+  components, from which you can choose two or three to view.
+
+- *Custom*: a linear projection onto horizontal and vertical axes that you
+  specify using labels in the data. You define the horizontal axis, for
+  instance, by giving text patterns for "Left" and "Right". The Embedding
+  Projector finds all points whose label matches the "Left" pattern and
+  computes the centroid of that set; similarly for "Right". The line passing
+  through these two centroids defines the horizontal axis. The vertical axis is
+  likewise computed from the centroids for points matching the "Up" and "Down"
+  text patterns.
+
+Further useful articles are
+[How to Use t-SNE Effectively](https://distill.pub/2016/misread-tsne/) and
+[Principal Component Analysis Explained Visually](http://setosa.io/ev/principal-component-analysis/).
+
+### Exploration
+
+You can explore visually by zooming, rotating, and panning using natural
+click-and-drag gestures. Hovering your mouse over a point will show any
+[metadata](#metadata) for that point. You can also inspect nearest-neighbor
+subsets. Clicking on a point causes the right pane to list the nearest
+neighbors, along with distances to the current point. The nearest-neighbor
+points are also highlighted in the projection.
+
+It is sometimes useful to restrict the view to a subset of points and perform
+projections only on those points. To do so, you can select points in multiple
+ways:
+
+- After clicking on a point, its nearest neighbors are also selected.
+- After a search, the points matching the query are selected.
+- Enabling selection, clicking on a point and dragging defines a selection
+  sphere.
+
+Then click the "Isolate *nnn* points" button at the top of the Inspector pane
+on the right hand side. The following image shows 101 points selected and ready
+for the user to click "Isolate 101 points":
+
+![Selection of nearest neighbors](https://www.tensorflow.org/images/embedding-nearest-points.png "Selection of nearest neighbors")
+
+*Selection of the nearest neighbors of “important” in a word embedding dataset.*
+
+Advanced tip: filtering with custom projection can be powerful. Below, we
+filtered the 100 nearest neighbors of “politics” and projected them onto the
+“worst” - “best” vector as an x axis. The y axis is random. As a result, one
+finds on the right side “ideas”, “science”, “perspective”, “journalism” but on
+the left “crisis”, “violence” and “conflict”.
+
+*Custom projection controls, and the resulting custom projection of the
+neighbors of "politics" onto the "best" - "worst" vector.*
+
+To share your findings, you can use the bookmark panel in the bottom right
+corner and save the current state (including computed coordinates of any
+projection) as a small file. The Projector can then be pointed to a set of one
+or more of these files, producing the panel below. Other users can then walk
+through a sequence of bookmarks.
+
+*Bookmark panel.*
+
+### Metadata
+
+If you are working with an embedding, you'll probably want to attach
+labels/images to the data points. You can do this by generating a metadata file
+containing the labels for each point and clicking "Load data" in the data panel
+of the Embedding Projector.
+
+The metadata can be either labels or images, which are
+stored in a separate file. For labels, the format should
+be a [TSV file](https://en.wikipedia.org/wiki/Tab-separated_values)
+whose first line contains column headers and whose subsequent lines contain
+the metadata values (`\t` below stands for the tab character). For example:
+
+```
+Word\tFrequency
+Airplane\t345
+Car\t241
+...
+```
+
+The order of lines in the metadata file is assumed to match the order of
+vectors in the embedding variable, except for the header. Consequently, the
+(i+1)-th line in the metadata file corresponds to the i-th row of the embedding
+variable. If the TSV metadata file has only a single column, then we don't
+expect a header row, and assume each row is the label of the embedding. We
+include this exception because it matches the commonly-used "vocab file"
+format.
+
+To use images as metadata, you must produce a single
+[sprite image](https://www.google.com/webhp#q=what+is+a+sprite+image),
+consisting of small thumbnails, one for each vector in the embedding. The
+sprite should store thumbnails in row-first order: the first data point placed
+in the top left and the last data point in the bottom right, though the last
+row doesn't have to be filled, as shown below.
+
+```
+0 1 2
+3 4 5
+6 7
+```
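+
+To make the layout concrete, here is a minimal sketch (an illustrative helper,
+not part of the Projector itself) that tiles a batch of grayscale thumbnails
+into a sprite in this row-first order using NumPy:
+
+```
+import numpy as np
+
+def make_sprite(images):
+  """Tile [n, h, w] grayscale thumbnails into one square sprite, row-first."""
+  n, h, w = images.shape
+  grid = int(np.ceil(np.sqrt(n)))  # smallest square grid that fits n thumbnails
+  sprite = np.zeros((grid * h, grid * w), dtype=images.dtype)
+  for i, thumb in enumerate(images):
+    row, col = divmod(i, grid)  # row-first: left to right, then top to bottom
+    sprite[row * h:(row + 1) * h, col * w:(col + 1) * w] = thumb
+  return sprite
+```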
+
+Follow [this link](https://www.tensorflow.org/images/embedding-mnist.mp4)
+to see a fun example of thumbnail images in the Embedding Projector.
+
+
+## Mini-FAQ
+
+**Is "embedding" an action or a thing?**
+Both. People talk about embedding words in a vector space (action) and about
+producing word embeddings (things). Common to both is the notion of embedding
+as a mapping from discrete objects to vectors. Creating or applying that
+mapping is an action, but the mapping itself is a thing.
+
+**Are embeddings high-dimensional or low-dimensional?**
+It depends. A 300-dimensional vector space of words and phrases, for instance,
+is often called low-dimensional (and dense) when compared to the millions of
+words and phrases it can contain. But mathematically it is high-dimensional,
+displaying many properties that are dramatically different from what our human
+intuition has learned about 2- and 3-dimensional spaces.
+
+**Is an embedding the same as an embedding layer?**
+No. An *embedding layer* is a part of a neural network, but an *embedding* is a
+more general concept.
diff --git a/tensorflow/docs_src/guide/estimators.md b/tensorflow/docs_src/guide/estimators.md
new file mode 100644
index 0000000000..78b30c3040
--- /dev/null
+++ b/tensorflow/docs_src/guide/estimators.md
@@ -0,0 +1,193 @@
+# Estimators
+
+This document introduces @{tf.estimator$**Estimators**}--a high-level TensorFlow
+API that greatly simplifies machine learning programming. Estimators encapsulate
+the following actions:
+
+* training
+* evaluation
+* prediction
+* export for serving
+
+You may either use the pre-made Estimators we provide or write your
+own custom Estimators. All Estimators--whether pre-made or custom--are
+classes based on the @{tf.estimator.Estimator} class.
+
+Note: TensorFlow also includes a deprecated `Estimator` class at
+@{tf.contrib.learn.Estimator}, which you should not use.
+
+
+## Advantages of Estimators
+
+Estimators provide the following benefits:
+
+* You can run Estimator-based models on a local host or on a
+  distributed multi-server environment without changing your model.
+  Furthermore, you can run Estimator-based models on CPUs, GPUs,
+  or TPUs without recoding your model.
+* Estimators simplify sharing implementations between model developers.
+* You can develop a state-of-the-art model with high-level intuitive code.
+  In short, it is generally much easier to create models with Estimators
+  than with the low-level TensorFlow APIs.
+* Estimators are themselves built on @{tf.layers}, which
+  simplifies customization.
+* Estimators build the graph for you.
+* Estimators provide a safe distributed training loop that controls how and
+  when to:
+  * build the graph
+  * initialize variables
+  * start queues
+  * handle exceptions
+  * create checkpoint files and recover from failures
+  * save summaries for TensorBoard
+
+When writing an application with Estimators, you must separate the data input
+pipeline from the model. This separation simplifies experiments with
+different data sets.
+
+
+## Pre-made Estimators
+
+Pre-made Estimators enable you to work at a much higher conceptual level
+than the base TensorFlow APIs. You no longer have to worry about creating
+the computational graph or sessions since Estimators handle all
+the "plumbing" for you. That is, pre-made Estimators create and manage
+@{tf.Graph$`Graph`} and @{tf.Session$`Session`} objects for you. Furthermore,
+pre-made Estimators let you experiment with different model architectures by
+making only minimal code changes. 
@{tf.estimator.DNNClassifier$`DNNClassifier`}, +for example, is a pre-made Estimator class that trains classification models +based on dense, feed-forward neural networks. + + +### Structure of a pre-made Estimators program + +A TensorFlow program relying on a pre-made Estimator typically consists +of the following four steps: + +1. **Write one or more dataset importing functions.** For example, you might + create one function to import the training set and another function to + import the test set. Each dataset importing function must return two + objects: + + * a dictionary in which the keys are feature names and the + values are Tensors (or SparseTensors) containing the corresponding + feature data + * a Tensor containing one or more labels + + For example, the following code illustrates the basic skeleton for + an input function: + + def input_fn(dataset): + ... # manipulate dataset, extracting the feature dict and the label + return feature_dict, label + + (See @{$guide/datasets} for full details.) + +2. **Define the feature columns.** Each @{tf.feature_column} + identifies a feature name, its type, and any input pre-processing. + For example, the following snippet creates three feature + columns that hold integer or floating-point data. The first two + feature columns simply identify the feature's name and type. The + third feature column also specifies a lambda the program will invoke + to scale the raw data: + + # Define three numeric feature columns. + population = tf.feature_column.numeric_column('population') + crime_rate = tf.feature_column.numeric_column('crime_rate') + median_education = tf.feature_column.numeric_column('median_education', + normalizer_fn=lambda x: x - global_education_mean) + +3. **Instantiate the relevant pre-made Estimator.** For example, here's + a sample instantiation of a pre-made Estimator named `LinearClassifier`: + + # Instantiate an estimator, passing the feature columns. + estimator = tf.estimator.LinearClassifier( + feature_columns=[population, crime_rate, median_education], + ) + +4. **Call a training, evaluation, or inference method.** + For example, all Estimators provide a `train` method, which trains a model. + + # my_training_set is the function created in Step 1 + estimator.train(input_fn=my_training_set, steps=2000) + + +### Benefits of pre-made Estimators + +Pre-made Estimators encode best practices, providing the following benefits: + +* Best practices for determining where different parts of the computational + graph should run, implementing strategies on a single machine or on a + cluster. +* Best practices for event (summary) writing and universally useful + summaries. + +If you don't use pre-made Estimators, you must implement the preceding +features yourself. + + +## Custom Estimators + +The heart of every Estimator--whether pre-made or custom--is its +**model function**, which is a method that builds graphs for training, +evaluation, and prediction. When you are using a pre-made Estimator, +someone else has already implemented the model function. When relying +on a custom Estimator, you must write the model function yourself. A +@{$custom_estimators$companion document} +explains how to write the model function. + + +## Recommended workflow + +We recommend the following workflow: + +1. Assuming a suitable pre-made Estimator exists, use it to build your + first model and use its results to establish a baseline. +2. 
Build and test your overall pipeline, including the integrity and
+   reliability of your data, with this pre-made Estimator.
+3. If suitable alternative pre-made Estimators are available, run
+   experiments to determine which pre-made Estimator produces the
+   best results.
+4. Possibly, further improve your model by building your own custom Estimator.
+
+
+## Creating Estimators from Keras models
+
+You can convert existing Keras models to Estimators. Doing so enables your Keras
+model to access Estimator's strengths, such as distributed training. Call
+@{tf.keras.estimator.model_to_estimator} as in the
+following sample:
+
+```python
+# Instantiate a Keras inception v3 model.
+keras_inception_v3 = tf.keras.applications.inception_v3.InceptionV3(weights=None)
+# Compile model with the optimizer, loss, and metrics you'd like to train with.
+keras_inception_v3.compile(optimizer=tf.keras.optimizers.SGD(lr=0.0001, momentum=0.9),
+                           loss='categorical_crossentropy',
+                           metrics=['accuracy'])
+# Create an Estimator from the compiled Keras model. Note the initial model
+# state of the keras model is preserved in the created Estimator.
+est_inception_v3 = tf.keras.estimator.model_to_estimator(keras_model=keras_inception_v3)
+
+# Treat the derived Estimator as you would with any other Estimator.
+# First, recover the input name(s) of the Keras model, so we can use them as the
+# feature column name(s) of the Estimator input function:
+keras_inception_v3.input_names  # print out: ['input_1']
+# Once we have the input name(s), we can create the input function, for example,
+# for input(s) in the format of numpy ndarray:
+train_input_fn = tf.estimator.inputs.numpy_input_fn(
+    x={"input_1": train_data},
+    y=train_labels,
+    num_epochs=1,
+    shuffle=False)
+# To train, we call Estimator's train function:
+est_inception_v3.train(input_fn=train_input_fn, steps=2000)
+```
+Note that the names of feature columns and labels of a Keras estimator come from
+the corresponding compiled Keras model. For example, the input key names for
+`train_input_fn` above can be obtained from `keras_inception_v3.input_names`,
+and similarly, the predicted output names can be obtained from
+`keras_inception_v3.output_names`.
+
+For more details, please refer to the documentation for
+@{tf.keras.estimator.model_to_estimator}.
diff --git a/tensorflow/docs_src/guide/faq.md b/tensorflow/docs_src/guide/faq.md
new file mode 100644
index 0000000000..b6291a9ffa
--- /dev/null
+++ b/tensorflow/docs_src/guide/faq.md
@@ -0,0 +1,297 @@
+# Frequently Asked Questions
+
+This document provides answers to some of the frequently asked questions about
+TensorFlow. If you have a question that is not covered here, you might find an
+answer on one of the TensorFlow @{$about$community resources}.
+
+[TOC]
+
+## Features and Compatibility
+
+#### Can I run distributed training on multiple computers?
+
+Yes! TensorFlow gained
+@{$distributed$support for distributed computation} in
+version 0.8. TensorFlow now supports multiple devices (CPUs and GPUs) in one or
+more computers.
+
+#### Does TensorFlow work with Python 3?
+
+As of the 0.6.0 release timeframe (early December 2015), we do support Python
+3.3+.
+
+## Building a TensorFlow graph
+
+See also the
+@{$python/framework$API documentation on building graphs}.
+
+#### Why does `c = tf.matmul(a, b)` not execute the matrix multiplication immediately?
+
+In the TensorFlow Python API, `a`, `b`, and `c` are
+@{tf.Tensor} objects. 
+
+#### How are devices named?
+
+The supported device names are `"/device:CPU:0"` (or `"/cpu:0"`) for the CPU
+device, and `"/device:GPU:i"` (or `"/gpu:i"`) for the *i*th GPU device.
+
+#### How do I place operations on a particular device?
+
+To place a group of operations on a device, create them within a
+@{tf.device$`with tf.device(name):`} context. See
+the how-to documentation on
+@{$using_gpu$using GPUs with TensorFlow} for details of how
+TensorFlow assigns operations to devices, and the
+@{$deep_cnn$CIFAR-10 tutorial} for an example model that
+uses multiple GPUs.
+
+
+## Running a TensorFlow computation
+
+See also the
+@{$python/client$API documentation on running graphs}.
+
+#### What's the deal with feeding and placeholders?
+
+Feeding is a mechanism in the TensorFlow Session API that allows you to
+substitute different values for one or more tensors at run time. The `feed_dict`
+argument to @{tf.Session.run} is a
+dictionary that maps @{tf.Tensor} objects to
+numpy arrays (and some other types), which will be used as the values of those
+tensors in the execution of a step.
+
+#### What is the difference between `Session.run()` and `Tensor.eval()`?
+
+If `t` is a @{tf.Tensor} object,
+@{tf.Tensor.eval} is shorthand for
+@{tf.Session.run}, where `sess` is the current
+default session (see @{tf.get_default_session}). The
+two following snippets of code are equivalent:
+
+```python
+# Using `Session.run()`.
+sess = tf.Session()
+c = tf.constant(5.0)
+print(sess.run(c))
+
+# Using `Tensor.eval()`.
+c = tf.constant(5.0)
+with tf.Session():
+  print(c.eval())
+```
+
+In the second example, the session acts as a
+[context manager](https://docs.python.org/2.7/reference/compound_stmts.html#with),
+which has the effect of installing it as the default session for the lifetime of
+the `with` block. The context manager approach can lead to more concise code for
+simple use cases (like unit tests); if your code deals with multiple graphs and
+sessions, it may be more straightforward to make explicit calls to
+`Session.run()`.
+
+#### Do Sessions have a lifetime? What about intermediate tensors?
+
+Sessions can own resources, such as
+@{tf.Variable},
+@{tf.QueueBase}, and
+@{tf.ReaderBase}. These resources can sometimes use
+a significant amount of memory, and can be released when the session is closed by calling
+@{tf.Session.close}.
+
+The intermediate tensors that are created as part of a call to
+@{$python/client$`Session.run()`} will be freed at or before the
+end of the call.
+
+#### Does the runtime parallelize parts of graph execution?
+
+The TensorFlow runtime parallelizes graph execution across many different
+dimensions:
+
+* The individual ops have parallel implementations, using multiple cores in a
+  CPU, or multiple threads in a GPU.
+* Independent nodes in a TensorFlow graph can run in parallel on multiple
+  devices, which makes it possible to speed up
+  @{$deep_cnn$CIFAR-10 training using multiple GPUs}.
+* The Session API allows multiple concurrent steps (i.e. calls to
+  @{tf.Session.run} in parallel). This
+  enables the runtime to get higher throughput, if a single step does not use
+  all of the resources in your computer.
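+
+For example, a minimal sketch of issuing steps concurrently (the tiny graph
+here is purely illustrative):
+
+```python
+import threading
+import tensorflow as tf
+
+x = tf.placeholder(tf.float32)
+y = x * 2.0
+
+with tf.Session() as sess:
+  # `Session.run()` is thread-safe, so these two steps may run concurrently.
+  t1 = threading.Thread(target=lambda: print(sess.run(y, {x: 1.0})))
+  t2 = threading.Thread(target=lambda: print(sess.run(y, {x: 2.0})))
+  t1.start()
+  t2.start()
+  t1.join()
+  t2.join()
+```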
+
+#### Which client languages are supported in TensorFlow?
+
+TensorFlow is designed to support multiple client languages.
+Currently, the best-supported client language is [Python](../api_docs/python/index.md). Experimental interfaces for
+executing and constructing graphs are also available for
+[C++](../api_docs/cc/index.md), [Java](../api_docs/java/reference/org/tensorflow/package-summary.html) and [Go](https://godoc.org/github.com/tensorflow/tensorflow/tensorflow/go).
+
+TensorFlow also has a
+[C-based client API](https://www.tensorflow.org/code/tensorflow/c/c_api.h)
+to help build support for more client languages. We invite contributions of new
+language bindings.
+
+Bindings for various other languages (such as [C#](https://github.com/migueldeicaza/TensorFlowSharp), [Julia](https://github.com/malmaud/TensorFlow.jl), [Ruby](https://github.com/somaticio/tensorflow.rb) and [Scala](https://github.com/eaplatanios/tensorflow_scala)) created and supported by the open source community build on top of the C API supported by the TensorFlow maintainers.
+
+#### Does TensorFlow make use of all the devices (GPUs and CPUs) available on my machine?
+
+TensorFlow supports multiple GPUs and CPUs. See the how-to documentation on
+@{$using_gpu$using GPUs with TensorFlow} for details of how
+TensorFlow assigns operations to devices, and the
+@{$deep_cnn$CIFAR-10 tutorial} for an example model that
+uses multiple GPUs.
+
+Note that TensorFlow only uses GPU devices with a compute capability of 3.5 or
+higher.
+
+#### Why does `Session.run()` hang when using a reader or a queue?
+
+The @{tf.ReaderBase} and
+@{tf.QueueBase} classes provide special operations that
+can *block* until input (or free space in a bounded queue) becomes
+available. These operations allow you to build sophisticated
+@{$reading_data$input pipelines}, at the cost of making the
+TensorFlow computation somewhat more complicated. See the how-to documentation
+for
+@{$reading_data#creating_threads_to_prefetch_using_queuerunner_objects$using `QueueRunner` objects to drive queues and readers}
+for more information on how to use them.
+
+## Variables
+
+See also the how-to documentation on @{$variables$variables} and
+@{$python/state_ops$the API documentation for variables}.
+
+#### What is the lifetime of a variable?
+
+A variable is created when you first run the
+@{tf.Variable.initializer}
+operation for that variable in a session. It is destroyed when that session is
+closed (see @{tf.Session.close}).
+
+#### How do variables behave when they are concurrently accessed?
+
+Variables allow concurrent read and write operations. The value read from a
+variable may change if it is concurrently updated. By default, concurrent
+assignment operations to a variable are allowed to run with no mutual exclusion.
+To acquire a lock when assigning to a variable, pass `use_locking=True` to
+@{tf.Variable.assign}.
+
+## Tensor shapes
+
+See also @{tf.TensorShape}.
+
+#### How can I determine the shape of a tensor in Python?
+
+In TensorFlow, a tensor has both a static (inferred) shape and a dynamic (true)
+shape. The static shape can be read using the
+@{tf.Tensor.get_shape}
+method: this shape is inferred from the operations that were used to create the
+tensor, and may be
+@{tf.TensorShape$partially complete}.
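+
+For example, a short sketch (the shapes here are illustrative):
+
+```python
+import tensorflow as tf
+
+x = tf.placeholder(tf.float32, shape=[None, 28])
+print(x.get_shape())  # ==> "(?, 28)" -- a partially defined static shape
+
+y = tf.reshape(x, [-1, 4, 7])
+print(y.get_shape())  # ==> "(?, 4, 7)"
+```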
+If the static shape is not fully defined, the dynamic shape of a `Tensor` `t`
+can be determined by evaluating @{tf.shape$`tf.shape(t)`}.
+
+#### What is the difference between `x.set_shape()` and `x = tf.reshape(x)`?
+
+The @{tf.Tensor.set_shape} method updates
+the static shape of a `Tensor` object, and it is typically used to provide
+additional shape information when this cannot be inferred directly. It does not
+change the dynamic shape of the tensor.
+
+The @{tf.reshape} operation creates
+a new tensor with a different dynamic shape.
+
+#### How do I build a graph that works with variable batch sizes?
+
+It is often useful to build a graph that works with variable batch sizes
+so that the same code can be used for (mini-)batch training, and
+single-instance inference. The resulting graph can be
+@{tf.Graph.as_graph_def$saved as a protocol buffer}
+and
+@{tf.import_graph_def$imported into another program}.
+
+When building a variable-size graph, the most important thing to remember is not
+to encode the batch size as a Python constant, but instead to use a symbolic
+`Tensor` to represent it. The following tips may be useful:
+
+* Use [`batch_size = tf.shape(input)[0]`](../api_docs/python/array_ops.md#shape)
+  to extract the batch dimension from a `Tensor` called `input`, and store it in
+  a `Tensor` called `batch_size`.
+
+* Use @{tf.reduce_mean} instead
+  of `tf.reduce_sum(...) / batch_size`.
+
+
+## TensorBoard
+
+#### How can I visualize a TensorFlow graph?
+
+See the @{$graph_viz$graph visualization tutorial}.
+
+#### What is the simplest way to send data to TensorBoard?
+
+Add summary ops to your TensorFlow graph, and write
+these summaries to a log directory. Then, start TensorBoard using
+
+    python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory
+
+For more details, see the
+@{$summaries_and_tensorboard$Summaries and TensorBoard tutorial}.
+
+#### Every time I launch TensorBoard, I get a network security popup!
+
+You can change TensorBoard to serve on localhost rather than '0.0.0.0' by
+passing the flag `--host=localhost`. This should quiet any security warnings.
+
+## Extending TensorFlow
+
+See the how-to documentation for
+@{$adding_an_op$adding a new operation to TensorFlow}.
+
+#### My data is in a custom format. How do I read it using TensorFlow?
+
+There are three main options for dealing with data in a custom format.
+
+The easiest option is to write parsing code in Python that transforms the data
+into a numpy array. Then, use @{tf.data.Dataset.from_tensor_slices} to
+create an input pipeline from the in-memory data.
+
+If your data doesn't fit in memory, try doing the parsing in the Dataset
+pipeline. Start with an appropriate file reader, like
+@{tf.data.TextLineDataset}. Then convert the dataset by
+@{tf.data.Dataset.map$mapping} appropriate operations over it.
+Prefer predefined TensorFlow operations such as @{tf.decode_raw},
+@{tf.decode_csv}, @{tf.parse_example}, or @{tf.image.decode_png}.
+
+If your data is not easily parsable with the built-in TensorFlow operations,
+consider converting it, offline, to a format that is easily parsable, such
+as @{tf.python_io.TFRecordWriter$`TFRecord`} format.
+
+The most efficient method to customize the parsing behavior is to
+@{$adding_an_op$add a new op written in C++} that parses your
+data format. The @{$new_data_formats$guide to handling new data formats} has
+more information about the steps for doing this.
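+
+For example, a minimal sketch of the second option (the file name and the
+"label,value" record layout here are hypothetical):
+
+```python
+import tensorflow as tf
+
+# Parse each line of a comma-separated text file inside the Dataset pipeline,
+# using only predefined TensorFlow operations.
+dataset = tf.data.TextLineDataset(["my_data.txt"])
+dataset = dataset.map(
+    lambda line: tf.decode_csv(line, record_defaults=[[0], [0.0]]))
+```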
+
+## Miscellaneous
+
+#### What is TensorFlow's coding style convention?
+
+The TensorFlow Python API adheres to the
+[PEP8](https://www.python.org/dev/peps/pep-0008/) conventions.* In
+particular, we use `CamelCase` names for classes, and `snake_case` names for
+functions, methods, and properties. We also adhere to the
+[Google Python style guide](https://google.github.io/styleguide/pyguide.html).
+
+The TensorFlow C++ code base adheres to the
+[Google C++ style guide](https://google.github.io/styleguide/cppguide.html).
+
+(* With one exception: we use 2-space indentation instead of 4-space
+indentation.)
+
diff --git a/tensorflow/docs_src/guide/feature_columns.md b/tensorflow/docs_src/guide/feature_columns.md
new file mode 100644
index 0000000000..1013ec910c
--- /dev/null
+++ b/tensorflow/docs_src/guide/feature_columns.md
@@ -0,0 +1,572 @@
+# Feature Columns
+
+This document details feature columns. Think of **feature columns** as the
+intermediaries between raw data and Estimators. Feature columns are very rich,
+enabling you to transform a diverse range of raw data into formats that
+Estimators can use, allowing easy experimentation.
+
+In @{$premade_estimators$Premade Estimators}, we used the premade
+@{tf.estimator.DNNClassifier$`DNNClassifier`} Estimator to train a model to
+predict different types of Iris flowers from four input features. That example
+created only numerical feature columns (of type
+@{tf.feature_column.numeric_column}). Although numerical feature columns model
+the lengths of petals and sepals effectively, real-world data sets contain all
+kinds of features, many of which are non-numerical.
+
+*Some real-world features (such as longitude) are numerical, but many are not.*
+ +## Input to a Deep Neural Network + +What kind of data can a deep neural network operate on? The answer +is, of course, numbers (for example, `tf.float32`). After all, every neuron in +a neural network performs multiplication and addition operations on weights and +input data. Real-life input data, however, often contains non-numerical +(categorical) data. For example, consider a `product_class` feature that can +contain the following three non-numerical values: + +* `kitchenware` +* `electronics` +* `sports` + +ML models generally represent categorical values as simple vectors in which a +1 represents the presence of a value and a 0 represents the absence of a value. +For example, when `product_class` is set to `sports`, an ML model would usually +represent `product_class` as `[0, 0, 1]`, meaning: + +* `0`: `kitchenware` is absent +* `0`: `electronics` is absent +* `1`: `sports` is present + +So, although raw data can be numerical or categorical, an ML model represents +all features as numbers. + +## Feature Columns + +As the following figure suggests, you specify the input to a model through the +`feature_columns` argument of an Estimator (`DNNClassifier` for Iris). +Feature Columns bridge input data (as returned by `input_fn`) with your model. + +
+*Feature columns bridge raw data with the data your model needs.*
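+
+For example, a minimal sketch of this wiring for the Iris model (the
+`hidden_units` value here is illustrative):
+
+```python
+my_feature_columns = [
+    tf.feature_column.numeric_column(key='SepalLength'),
+    tf.feature_column.numeric_column(key='SepalWidth'),
+    tf.feature_column.numeric_column(key='PetalLength'),
+    tf.feature_column.numeric_column(key='PetalWidth'),
+]
+
+# The feature columns tell the Estimator how to map the features returned by
+# `input_fn` to the model's input layer.
+classifier = tf.estimator.DNNClassifier(
+    feature_columns=my_feature_columns,
+    hidden_units=[10, 10],
+    n_classes=3)
+```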
+ +To create feature columns, call functions from the +@{tf.feature_column} module. This document explains nine of the functions in +that module. As the following figure shows, all nine functions return either a +Categorical-Column or a Dense-Column object, except `bucketized_column`, which +inherits from both classes: + +
+*Feature column methods fall into two main categories and one hybrid category.*
+
+Let's look at these functions in more detail.
+
+### Numeric column
+
+The Iris classifier calls the @{tf.feature_column.numeric_column} function for
+all input features:
+
+  * `SepalLength`
+  * `SepalWidth`
+  * `PetalLength`
+  * `PetalWidth`
+
+Although `tf.feature_column.numeric_column` provides optional arguments,
+calling it without any arguments, as follows, is a fine way to specify a
+numerical value with the default data type (`tf.float32`) as input to your
+model:
+
+```python
+# Defaults to a tf.float32 scalar.
+numeric_feature_column = tf.feature_column.numeric_column(key="SepalLength")
+```
+
+To specify a non-default numerical data type, use the `dtype` argument. For
+example:
+
+``` python
+# Represent a tf.float64 scalar.
+numeric_feature_column = tf.feature_column.numeric_column(key="SepalLength",
+                                                          dtype=tf.float64)
+```
+
+By default, a numeric column creates a single value (scalar). Use the `shape`
+argument to specify another shape. For example:
+
+```python
+# Represent a 10-element vector in which each cell contains a tf.float32.
+vector_feature_column = tf.feature_column.numeric_column(key="Bowling",
+                                                         shape=10)
+
+# Represent a 10x5 matrix in which each cell contains a tf.float32.
+matrix_feature_column = tf.feature_column.numeric_column(key="MyMatrix",
+                                                         shape=[10,5])
+```
+
+### Bucketized column
+
+Often, you don't want to feed a number directly into the model, but instead
+split its value into different categories based on numerical ranges. To do so,
+create a @{tf.feature_column.bucketized_column$bucketized column}. For
+example, consider raw data that represents the year a house was built. Instead
+of representing that year as a scalar numeric column, we could split the year
+into the following four buckets:
+
+*Dividing year data into four buckets.*
+
+The model will represent the buckets as follows:
+
+|Date Range |Represented as... |
+|:----------|:-----------------|
+|< 1960 | [1, 0, 0, 0] |
+|>= 1960 but < 1980 | [0, 1, 0, 0] |
+|>= 1980 but < 2000 | [0, 0, 1, 0] |
+|>= 2000 | [0, 0, 0, 1] |
+
+Why would you want to split a number—a perfectly valid input to your
+model—into a categorical value? Well, notice that the categorization splits a
+single input number into a four-element vector. Therefore, the model can now
+learn _four individual weights_ rather than just one; four weights create a
+richer model than one weight. More importantly, bucketizing enables the model
+to clearly distinguish between different year categories, since only one of the
+elements is set (1) and the other three elements are cleared (0). For example,
+when we just use a single number (a year) as input, a linear model can only
+learn a linear relationship. So, bucketing provides the model with additional
+flexibility that the model can use to learn.
+
+The following code demonstrates how to create a bucketized feature:
+
+```python
+# First, convert the raw input to a numeric column.
+numeric_feature_column = tf.feature_column.numeric_column("Year")
+
+# Then, bucketize the numeric column on the years 1960, 1980, and 2000.
+bucketized_feature_column = tf.feature_column.bucketized_column(
+    source_column = numeric_feature_column,
+    boundaries = [1960, 1980, 2000])
+```
+Note that specifying a _three_-element boundaries vector creates a
+_four_-element bucketized vector.
+
+
+### Categorical identity column
+
+**Categorical identity columns** can be seen as a special case of bucketized
+columns. In traditional bucketized columns, each bucket represents a range of
+values (for example, from 1960 to 1979). In a categorical identity column, each
+bucket represents a single, unique integer. For example, let's say you want to
+represent the integer range `[0, 4)`. That is, you want to represent the
+integers 0, 1, 2, or 3. In this case, the categorical identity mapping looks
+like this:
+
+*A categorical identity column mapping. Note that this is a one-hot encoding, not a binary numerical encoding.*
+
+As with bucketized columns, a model can learn a separate weight for each class
+in a categorical identity column. For example, instead of using a string to
+represent the `product_class`, let's represent each class with a unique integer
+value. That is:
+
+* `0="kitchenware"`
+* `1="electronics"`
+* `2="sports"`
+
+Call @{tf.feature_column.categorical_column_with_identity} to implement a
+categorical identity column. For example:
+
+``` python
+# Create categorical output for an integer feature named "my_feature_b".
+# The values of my_feature_b must be >= 0 and < num_buckets.
+identity_feature_column = tf.feature_column.categorical_column_with_identity(
+    key='my_feature_b',
+    num_buckets=4)  # Values [0, 4)
+
+# In order for the preceding call to work, the input_fn() must return
+# a dictionary containing 'my_feature_b' as a key. Furthermore, the values
+# assigned to 'my_feature_b' must belong to the set [0, 4).
+def input_fn():
+    ...
+    return ({ 'my_feature_a':[7, 9, 5, 2], 'my_feature_b':[3, 1, 2, 2] },
+            [Label_values])
+```
+
+### Categorical vocabulary column
+
+We cannot input strings directly to a model. Instead, we must first map strings
+to numeric or categorical values. Categorical vocabulary columns provide a good
+way to represent strings as a one-hot vector. For example:
+
+*Mapping string values to vocabulary columns.*
+
+As you can see, categorical vocabulary columns are kind of an enum version of
+categorical identity columns. TensorFlow provides two different functions to
+create categorical vocabulary columns:
+
+* @{tf.feature_column.categorical_column_with_vocabulary_list}
+* @{tf.feature_column.categorical_column_with_vocabulary_file}
+
+`categorical_column_with_vocabulary_list` maps each string to an integer based
+on an explicit vocabulary list. For example:
+
+```python
+# Given input "feature_name_from_input_fn" which is a string,
+# create a categorical feature by mapping the input to one of
+# the elements in the vocabulary list.
+vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_list(
+    key=feature_name_from_input_fn,
+    vocabulary_list=["kitchenware", "electronics", "sports"])
+```
+
+The preceding function is pretty straightforward, but it has a significant
+drawback. Namely, there's way too much typing when the vocabulary list is long.
+For these cases, call
+`tf.feature_column.categorical_column_with_vocabulary_file` instead, which lets
+you place the vocabulary words in a separate file. For example:
+
+```python
+# Given input "feature_name_from_input_fn" which is a string,
+# create a categorical feature for our model by mapping the input to one of
+# the elements in the vocabulary file.
+vocabulary_feature_column = tf.feature_column.categorical_column_with_vocabulary_file(
+    key=feature_name_from_input_fn,
+    vocabulary_file="product_class.txt",
+    vocabulary_size=3)
+```
+
+`product_class.txt` should contain one line for each vocabulary element. In our
+case:
+
+```None
+kitchenware
+electronics
+sports
+```
+
+### Hashed Column
+
+So far, we've worked with a naively small number of categories. For example,
+our product_class example has only 3 categories. Often though, the number of
+categories can be so big that it's not possible to have individual categories
+for each vocabulary word or integer, because that would consume too much memory.
+For these cases, we can instead turn the question around and ask, "How many
+categories am I willing to have for my input?" In fact, the
+@{tf.feature_column.categorical_column_with_hash_bucket} function enables you
+to specify the number of categories. For this type of feature column the model
+calculates a hash value of the input, then puts it into one of
+the `hash_bucket_size` categories using the modulo operator, as in the following
+pseudocode:
+
+```python
+# pseudocode
+feature_id = hash(raw_feature) % hash_bucket_size
+```
+
+The code to create the `feature_column` might look something like this:
+
+``` python
+hashed_feature_column = tf.feature_column.categorical_column_with_hash_bucket(
+    key = "some_feature",
+    hash_bucket_size = 100)  # The number of categories
+```
+At this point, you might rightfully think: "This is crazy!" After all, we are
+forcing the different input values to a smaller set of categories. This means
+that two probably unrelated inputs will be mapped to the same
+category, and consequently mean the same thing to the neural network. The
+following figure illustrates this dilemma, showing that kitchenware and sports
+both get assigned to category (hash bucket) 12:
+
+*Representing data with hash buckets.*
+ +As with many counterintuitive phenomena in machine learning, it turns out that +hashing often works well in practice. That's because hash categories provide +the model with some separation. The model can use additional features to further +separate kitchenware from sports. + +### Crossed column + +Combining features into a single feature, better known as +[feature crosses](https://developers.google.com/machine-learning/glossary/#feature_cross), +enables the model to learn separate weights for each combination of +features. + +More concretely, suppose we want our model to calculate real estate prices in +Atlanta, GA. Real-estate prices within this city vary greatly depending on +location. Representing latitude and longitude as separate features isn't very +useful in identifying real-estate location dependencies; however, crossing +latitude and longitude into a single feature can pinpoint locations. Suppose we +represent Atlanta as a grid of 100x100 rectangular sections, identifying each +of the 10,000 sections by a feature cross of latitude and longitude. This +feature cross enables the model to train on pricing conditions related to each +individual section, which is a much stronger signal than latitude and longitude +alone. + +The following figure shows our plan, with the latitude & longitude values for +the corners of the city in red text: + +
+*Map of Atlanta. Imagine this map divided into 10,000 sections of equal size.*
+
+For the solution, we used a combination of the `bucketized_column` we looked at
+earlier and the @{tf.feature_column.crossed_column} function.
+
+``` python
+def make_dataset(latitude, longitude, labels):
+    assert latitude.shape == longitude.shape == labels.shape
+
+    features = {'latitude': latitude.flatten(),
+                'longitude': longitude.flatten()}
+    labels = labels.flatten()
+
+    return tf.data.Dataset.from_tensor_slices((features, labels))
+
+
+# Bucketize the latitude and longitude using the `edges`.
+latitude_bucket_fc = tf.feature_column.bucketized_column(
+    tf.feature_column.numeric_column('latitude'),
+    list(atlanta.latitude.edges))
+
+longitude_bucket_fc = tf.feature_column.bucketized_column(
+    tf.feature_column.numeric_column('longitude'),
+    list(atlanta.longitude.edges))
+
+# Cross the bucketized columns, using 5000 hash bins.
+crossed_lat_lon_fc = tf.feature_column.crossed_column(
+    [latitude_bucket_fc, longitude_bucket_fc], 5000)
+
+fc = [
+    latitude_bucket_fc,
+    longitude_bucket_fc,
+    crossed_lat_lon_fc]
+
+# Build and train the Estimator.
+est = tf.estimator.LinearRegressor(fc, ...)
+```
+
+You may create a feature cross from either of the following:
+
+* Feature names; that is, names from the `dict` returned from `input_fn`.
+* Any categorical column, except `categorical_column_with_hash_bucket`
+  (since `crossed_column` hashes the input).
+
+When the feature columns `latitude_bucket_fc` and `longitude_bucket_fc` are
+crossed, TensorFlow will create `(latitude_fc, longitude_fc)` pairs for each
+example. This would produce a full grid of possibilities as follows:
+
+``` None
+ (0,0),  (0,1)...  (0,99)
+ (1,0),  (1,1)...  (1,99)
+   ...     ...       ...
+(99,0), (99,1)...(99, 99)
+```
+
+Except that a full grid would only be tractable for inputs with limited
+vocabularies. Instead of building this potentially huge table of inputs,
+the `crossed_column` only builds the number requested by the `hash_bucket_size`
+argument. The feature column assigns an example to an index by running a hash
+function on the tuple of inputs, followed by a modulo operation with
+`hash_bucket_size`.
+
+As discussed earlier, applying the hash and modulo operations limits the number
+of categories, but can cause category
+collisions; that is, multiple (latitude, longitude) feature crosses will end
+up in the same hash bucket. In practice though, performing feature crosses
+still adds significant value to the learning capability of your models.
+
+Somewhat counterintuitively, when creating feature crosses, you typically still
+should include the original (uncrossed) features in your model (as in the
+preceding code snippet). The independent latitude and longitude features help the
+model distinguish between examples where a hash collision has occurred in the
+crossed feature.
+
+## Indicator and embedding columns
+
+Indicator columns and embedding columns never work on features directly, but
+instead take categorical columns as input.
+
+When using an indicator column, we're telling TensorFlow to do exactly what
+we've seen in our categorical product_class example. That is, an
+**indicator column** treats each category as an element in a one-hot vector,
+where the matching category has value 1 and the rest have 0s:
+
+*Representing data in indicator columns.*
+
+Here's how you create an indicator column by calling
+@{tf.feature_column.indicator_column}:
+
+``` python
+categorical_column = ... # Create any type of categorical column.
+
+# Represent the categorical column as an indicator column.
+indicator_column = tf.feature_column.indicator_column(categorical_column)
+```
+
+Now, suppose instead of having just three possible classes, we have a million.
+Or maybe a billion. For a number of reasons, as the number of categories grows
+large, it becomes infeasible to train a neural network using indicator columns.
+
+We can use an embedding column to overcome this limitation. Instead of
+representing the data as a one-hot vector of many dimensions, an
+**embedding column** represents that data as a lower-dimensional, ordinary
+vector in which each cell can contain any number, not just 0 or 1. By
+permitting a richer palette of numbers for every cell, an embedding column
+contains far fewer cells than an indicator column.
+
+Let's look at an example comparing indicator and embedding columns. Suppose our
+input examples consist of different words from a limited palette of only 81
+words. Further suppose that the data set provides the following input
+words in 4 separate examples:
+
+* `"dog"`
+* `"spoon"`
+* `"scissors"`
+* `"guitar"`
+
+In that case, the following figure illustrates the processing path for
+embedding columns or indicator columns.
+
+*An embedding column stores categorical data in a lower-dimensional vector than an indicator column. (We just placed random numbers into the embedding vectors; training determines the actual numbers.)*
+
+When an example is processed, one of the `categorical_column_with...` functions
+maps the example string to a numerical categorical value. For example, a
+function maps "spoon" to `[32]`. (The 32 comes from our imagination—the actual
+values depend on the mapping function.) You may then represent these numerical
+categorical values in either of the following two ways:
+
+* As an indicator column. A function converts each numeric categorical value
+  into an 81-element vector (because our palette consists of 81 words), placing
+  a 1 in the index of the categorical value (0, 32, 79, 80) and a 0 in all the
+  other positions.
+
+* As an embedding column. A function uses the numerical categorical values
+  `(0, 32, 79, 80)` as indices to a lookup table. Each slot in that lookup table
+  contains a 3-element vector.
+
+How do the values in the embedding vectors magically get assigned? Actually,
+the assignments happen during training. That is, the model learns the best way
+to map your input numeric categorical values to the embedding vector values in
+order to solve your problem. Embedding columns increase your model's
+capabilities, since an embedding vector learns new relationships between
+categories from the training data.
+
+Why is the embedding vector size 3 in our example? Well, the following "formula"
+provides a general rule of thumb about the number of embedding dimensions:
+
+```python
+embedding_dimensions = number_of_categories**0.25
+```
+
+That is, the embedding vector dimension should be the 4th root of the number of
+categories. Since our vocabulary size in this example is 81, the recommended
+number of dimensions is 3:
+
+``` python
+3 = 81**0.25
+```
+Note that this is just a general guideline; you can set the number of embedding
+dimensions as you please.
+
+Call @{tf.feature_column.embedding_column} to create an `embedding_column` as
+suggested by the following snippet:
+
+``` python
+categorical_column = ... # Create any categorical column
+
+# Represent the categorical column as an embedding column.
+# This means creating an embedding vector lookup table with one element for
+# each category.
+embedding_column = tf.feature_column.embedding_column(
+    categorical_column=categorical_column,
+    dimension=embedding_dimensions)
+```
+
+@{$guide/embedding$Embeddings} is a significant topic within machine
+learning. This information was just to get you started using them as feature
+columns.
+
+## Passing feature columns to Estimators
+
+As the following list indicates, not all Estimators permit all types of
+`feature_columns` argument(s):
+
+* @{tf.estimator.LinearClassifier$`LinearClassifier`} and
+  @{tf.estimator.LinearRegressor$`LinearRegressor`}: Accept all types of
+  feature column.
+* @{tf.estimator.DNNClassifier$`DNNClassifier`} and
+  @{tf.estimator.DNNRegressor$`DNNRegressor`}: Only accept dense columns. Other
+  column types must be wrapped in either an `indicator_column` or
+  `embedding_column`.
+* @{tf.estimator.DNNLinearCombinedClassifier$`DNNLinearCombinedClassifier`} and
+  @{tf.estimator.DNNLinearCombinedRegressor$`DNNLinearCombinedRegressor`}:
+    * The `linear_feature_columns` argument accepts any feature column type.
+    * The `dnn_feature_columns` argument only accepts dense columns.
+
+## Other Sources
+
+For more examples on feature columns, view the following:
+
+* The @{$low_level_intro#feature_columns$Low Level Introduction} demonstrates how
+  to experiment directly with `feature_columns` using TensorFlow's low level APIs.
+* The @{$wide$wide} and @{$wide_and_deep$Wide & Deep} Tutorials solve a + binary classification problem using `feature_columns` on a variety of input + data types. + +To learn more about embeddings, see the following: + +* [Deep Learning, NLP, and representations](http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/) + (Chris Olah's blog) +* The TensorFlow [Embedding Projector](http://projector.tensorflow.org) diff --git a/tensorflow/docs_src/guide/graph_viz.md b/tensorflow/docs_src/guide/graph_viz.md new file mode 100644 index 0000000000..f581ae56da --- /dev/null +++ b/tensorflow/docs_src/guide/graph_viz.md @@ -0,0 +1,316 @@ +# TensorBoard: Graph Visualization + +TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here's an example of the visualization at work. + +![Visualization of a TensorFlow graph](https://www.tensorflow.org/images/graph_vis_animation.gif "Visualization of a TensorFlow graph") +*Visualization of a TensorFlow graph.* + +To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab on the top pane and select the appropriate run using the menu at the upper left corner. For in depth information on how to run TensorBoard and make sure you are logging all the necessary information, see @{$summaries_and_tensorboard$TensorBoard: Visualizing Learning}. + +## Name scoping and nodes + +Typical TensorFlow graphs can have many thousands of nodes--far too many to see +easily all at once, or even to lay out using standard graph tools. To simplify, +variable names can be scoped and the visualization uses this information to +define a hierarchy on the nodes in the graph. By default, only the top of this +hierarchy is shown. Here is an example that defines three operations under the +`hidden` name scope using +@{tf.name_scope}: + +```python +import tensorflow as tf + +with tf.name_scope('hidden') as scope: + a = tf.constant(5, name='alpha') + W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights') + b = tf.Variable(tf.zeros([1]), name='biases') +``` + +This results in the following three op names: + +* `hidden/alpha` +* `hidden/weights` +* `hidden/biases` + +By default, the visualization will collapse all three into a node labeled `hidden`. +The extra detail isn't lost. You can double-click, or click +on the orange `+` sign in the top right to expand the node, and then you'll see +three subnodes for `alpha`, `weights` and `biases`. + +Here's a real-life example of a more complicated node in its initial and +expanded states. + + + + + + + + + + +
+*Left: initial view of top-level name scope `pool_1`. Clicking on the orange `+` button on the top right or double-clicking on the node itself will expand it. Right: expanded view of the `pool_1` name scope. Clicking on the orange `-` button on the top right or double-clicking on the node itself will collapse the name scope.*
+ +Grouping nodes by name scopes is critical to making a legible graph. If you're +building a model, name scopes give you control over the resulting visualization. +**The better your name scopes, the better your visualization.** + +The figure above illustrates a second aspect of the visualization. TensorFlow +graphs have two kinds of connections: data dependencies and control +dependencies. Data dependencies show the flow of tensors between two ops and +are shown as solid arrows, while control dependencies use dotted lines. In the +expanded view (right side of the figure above) all the connections are data +dependencies with the exception of the dotted line connecting `CheckNumerics` +and `control_dependency`. + +There's a second trick to simplifying the layout. Most TensorFlow graphs have a +few nodes with many connections to other nodes. For example, many nodes might +have a control dependency on an initialization step. Drawing all edges between +the `init` node and its dependencies would create a very cluttered view. + +To reduce clutter, the visualization separates out all high-degree nodes to an +*auxiliary* area on the right and doesn't draw lines to represent their edges. +Instead of lines, we draw small *node icons* to indicate the connections. +Separating out the auxiliary nodes typically doesn't remove critical +information since these nodes are usually related to bookkeeping functions. +See [Interaction](#interaction) for how to move nodes between the main graph +and the auxiliary area. + + + + + + + + + + +
+*Left: node `conv_1` is connected to `save`; note the little `save` node icon on its right. Right: `save` has a high degree and appears as an auxiliary node; the connection with `conv_1` is shown as a node icon on its left. To further reduce clutter, since `save` has a lot of connections, the first 5 are shown and the others are abbreviated as `... 12 more`.*
+ +One last structural simplification is *series collapsing*. Sequential +motifs--that is, nodes whose names differ by a number at the end and have +isomorphic structures--are collapsed into a single *stack* of nodes, as shown +below. For networks with long sequences, this greatly simplifies the view. As +with hierarchical nodes, double-clicking expands the series. See +[Interaction](#interaction) for how to disable/enable series collapsing for a +specific set of nodes. + + + + + + + + + + +
+*Left: a collapsed view of a node sequence. Right: a small piece of the expanded view, after double-click.*
+ +Finally, as one last aid to legibility, the visualization uses special icons +for constants and summary nodes. To summarize, here's a table of node symbols: + +Symbol | Meaning +--- | --- +![Name scope](https://www.tensorflow.org/images/namespace_node.png "Name scope") | *High-level* node representing a name scope. Double-click to expand a high-level node. +![Sequence of unconnected nodes](https://www.tensorflow.org/images/horizontal_stack.png "Sequence of unconnected nodes") | Sequence of numbered nodes that are not connected to each other. +![Sequence of connected nodes](https://www.tensorflow.org/images/vertical_stack.png "Sequence of connected nodes") | Sequence of numbered nodes that are connected to each other. +![Operation node](https://www.tensorflow.org/images/op_node.png "Operation node") | An individual operation node. +![Constant node](https://www.tensorflow.org/images/constant.png "Constant node") | A constant. +![Summary node](https://www.tensorflow.org/images/summary.png "Summary node") | A summary node. +![Data flow edge](https://www.tensorflow.org/images/dataflow_edge.png "Data flow edge") | Edge showing the data flow between operations. +![Control dependency edge](https://www.tensorflow.org/images/control_edge.png "Control dependency edge") | Edge showing the control dependency between operations. +![Reference edge](https://www.tensorflow.org/images/reference_edge.png "Reference edge") | A reference edge showing that the outgoing operation node can mutate the incoming tensor. + +## Interaction {#interaction} + +Navigate the graph by panning and zooming. Click and drag to pan, and use a +scroll gesture to zoom. Double-click on a node, or click on its `+` button, to +expand a name scope that represents a group of operations. To easily keep +track of the current viewpoint when zooming and panning, there is a minimap in +the bottom right corner. + +To close an open node, double-click it again or click its `-` button. You can +also click once to select a node. It will turn a darker color, and details +about it and the nodes it connects to will appear in the info card at upper +right corner of the visualization. + + + + + + + + + + +
+*Left: info card showing detailed information for the `conv2` name scope. The inputs and outputs are combined from the inputs and outputs of the operation nodes inside the name scope. For name scopes no attributes are shown. Right: info card showing detailed information for the `DecodeRaw` operation node. In addition to inputs and outputs, the card shows the device and the attributes associated with the current operation.*
+ +TensorBoard provides several ways to change the visual layout of the graph. This +doesn't change the graph's computational semantics, but it can bring some +clarity to the network's structure. By right clicking on a node or pressing +buttons on the bottom of that node's info card, you can make the following +changes to its layout: + +* Nodes can be moved between the main graph and the auxiliary area. +* A series of nodes can be ungrouped so that the nodes in the series do not +appear grouped together. Ungrouped series can likewise be regrouped. + +Selection can also be helpful in understanding high-degree nodes. Select any +high-degree node, and the corresponding node icons for its other connections +will be selected as well. This makes it easy, for example, to see which nodes +are being saved--and which aren't. + +Clicking on a node name in the info card will select it. If necessary, the +viewpoint will automatically pan so that the node is visible. + +Finally, you can choose two color schemes for your graph, using the color menu +above the legend. The default *Structure View* shows structure: when two +high-level nodes have the same structure, they appear in the same color of the +rainbow. Uniquely structured nodes are gray. There's a second view, which shows +what device the different operations run on. Name scopes are colored +proportionally to the fraction of devices for the operations inside them. + +The images below give an illustration for a piece of a real-life graph. + + + + + + + + + + +
+*Left (structure view): the gray nodes have unique structure. The orange `conv1` and `conv2` nodes have the same structure, and analogously for nodes with other colors. Right (device view): name scopes are colored proportionally to the fraction of devices of the operation nodes inside them. Here, purple means GPU and green means CPU.*
+
+## Tensor shape information
+
+When the serialized `GraphDef` includes tensor shapes, the graph visualizer
+labels edges with tensor dimensions, and edge thickness reflects total tensor
+size. To include tensor shapes in the `GraphDef`, pass the actual graph object
+(as in `sess.graph`) to the `FileWriter` when serializing the graph.
+The images below show the CIFAR-10 model with tensor shape information:
+
+*CIFAR-10 model with tensor shape information.*
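+
+For example, a minimal sketch of serializing the graph with shape information
+(the log directory here is illustrative):
+
+```python
+with tf.Session() as sess:
+  # Passing `sess.graph` (rather than no graph at all) is what makes the
+  # tensor shape information available to the visualizer.
+  writer = tf.summary.FileWriter("/tmp/logdir", sess.graph)
+```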
+
+## Runtime statistics
+
+Often it is useful to collect runtime metadata for a run, such as total memory
+usage, total compute time, and tensor shapes for nodes. The code example below
+is a snippet from the train and test section of a modification of the
+@{$layers$simple MNIST tutorial}, in which we have recorded summaries and
+runtime statistics. See the
+@{$summaries_and_tensorboard#serializing-the-data$Summaries Tutorial}
+for details on how to record summaries.
+Full source is [here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).
+
+```python
+  # Train the model, and also write summaries.
+  # Every 10th step, measure test-set accuracy, and write test summaries.
+  # All other steps, run train_step on training data, & add training summaries.
+
+  def feed_dict(train):
+    """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
+    if train or FLAGS.fake_data:
+      xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
+      k = FLAGS.dropout
+    else:
+      xs, ys = mnist.test.images, mnist.test.labels
+      k = 1.0
+    return {x: xs, y_: ys, keep_prob: k}
+
+  for i in range(FLAGS.max_steps):
+    if i % 10 == 0:  # Record summaries and test-set accuracy
+      summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
+      test_writer.add_summary(summary, i)
+      print('Accuracy at step %s: %s' % (i, acc))
+    else:  # Record train set summaries, and train
+      if i % 100 == 99:  # Record execution stats
+        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
+        run_metadata = tf.RunMetadata()
+        summary, _ = sess.run([merged, train_step],
+                              feed_dict=feed_dict(True),
+                              options=run_options,
+                              run_metadata=run_metadata)
+        train_writer.add_run_metadata(run_metadata, 'step%d' % i)
+        train_writer.add_summary(summary, i)
+        print('Adding run metadata for', i)
+      else:  # Record a summary
+        summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
+        train_writer.add_summary(summary, i)
+```
+
+This code will emit runtime statistics for every 100th step, starting at
+step 99.
+
+When you launch TensorBoard and go to the Graph tab, you will now see options
+under "Session runs" which correspond to the steps where run metadata was added.
+Selecting one of these runs will show you the snapshot of the network at that
+step, fading out unused nodes. In the controls on the left-hand side, you will
+be able to color the nodes by total memory or total compute time. Additionally,
+clicking on a node will display the exact total memory, compute time, and
+tensor output sizes.
+
+*From left to right: color by compute time, the run metadata graph, and the run metadata info card.*
+
diff --git a/tensorflow/docs_src/guide/graphs.md b/tensorflow/docs_src/guide/graphs.md new file mode 100644 index 0000000000..e6246ef148 --- /dev/null +++ b/tensorflow/docs_src/guide/graphs.md @@ -0,0 +1,558 @@ +# Graphs and Sessions + +TensorFlow uses a **dataflow graph** to represent your computation in terms of +the dependencies between individual operations. This leads to a low-level +programming model in which you first define the dataflow graph, then create a +TensorFlow **session** to run parts of the graph across a set of local and +remote devices. + +This guide will be most useful if you intend to use the low-level programming +model directly. Higher-level APIs such as @{tf.estimator.Estimator} and Keras +hide the details of graphs and sessions from the end user, but this guide may +also be useful if you want to understand how these APIs are implemented. + +## Why dataflow graphs? + +![](../images/tensors_flowing.gif) + +[Dataflow](https://en.wikipedia.org/wiki/Dataflow_programming) is a common +programming model for parallel computing. In a dataflow graph, the nodes +represent units of computation, and the edges represent the data consumed or +produced by a computation. For example, in a TensorFlow graph, the @{tf.matmul} +operation would correspond to a single node with two incoming edges (the +matrices to be multiplied) and one outgoing edge (the result of the +multiplication). + + + +Dataflow has several advantages that TensorFlow leverages when executing your +programs: + +* **Parallelism.** By using explicit edges to represent dependencies between + operations, it is easy for the system to identify operations that can execute + in parallel. + +* **Distributed execution.** By using explicit edges to represent the values + that flow between operations, it is possible for TensorFlow to partition your + program across multiple devices (CPUs, GPUs, and TPUs) attached to different + machines. TensorFlow inserts the necessary communication and coordination + between devices. + +* **Compilation.** TensorFlow's @{$performance/xla$XLA compiler} can + use the information in your dataflow graph to generate faster code, for + example, by fusing together adjacent operations. + +* **Portability.** The dataflow graph is a language-independent representation + of the code in your model. You can build a dataflow graph in Python, store it + in a @{$saved_model$SavedModel}, and restore it in a C++ program for + low-latency inference. + + +## What is a @{tf.Graph}? + +A @{tf.Graph} contains two relevant kinds of information: + +* **Graph structure.** The nodes and edges of the graph, indicating how + individual operations are composed together, but not prescribing how they + should be used. The graph structure is like assembly code: inspecting it can + convey some useful information, but it does not contain all of the useful + context that source code conveys. + +* **Graph collections.** TensorFlow provides a general mechanism for storing + collections of metadata in a @{tf.Graph}. The @{tf.add_to_collection} function + enables you to associate a list of objects with a key (where @{tf.GraphKeys} + defines some of the standard keys), and @{tf.get_collection} enables you to + look up all objects associated with a key. Many parts of the TensorFlow + library use this facility: for example, when you create a @{tf.Variable}, it + is added by default to collections representing "global variables" and + "trainable variables". 
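+
+For example, a small sketch of this mechanism (the key `"my_collection"` is
+just an illustrative name):
+
+```python
+import tensorflow as tf
+
+v = tf.Variable(tf.zeros([10]), name="v")
+
+# `v` was added to these standard collections automatically.
+global_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES)
+trainable_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
+
+# You can also maintain collections under your own keys.
+tf.add_to_collection("my_collection", v)
+assert v in tf.get_collection("my_collection")
+```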
+When you later come to create a @{tf.train.Saver} or
+@{tf.train.Optimizer}, the variables in these collections are used as the
+default arguments.
+
+
+## Building a @{tf.Graph}
+
+Most TensorFlow programs start with a dataflow graph construction phase. In this
+phase, you invoke TensorFlow API functions that construct new @{tf.Operation}
+(node) and @{tf.Tensor} (edge) objects and add them to a @{tf.Graph}
+instance. TensorFlow provides a **default graph** that is an implicit argument
+to all API functions in the same context. For example:
+
+* Calling `tf.constant(42.0)` creates a single @{tf.Operation} that produces the
+  value `42.0`, adds it to the default graph, and returns a @{tf.Tensor} that
+  represents the value of the constant.
+
+* Calling `tf.matmul(x, y)` creates a single @{tf.Operation} that multiplies
+  the values of @{tf.Tensor} objects `x` and `y`, adds it to the default graph,
+  and returns a @{tf.Tensor} that represents the result of the multiplication.
+
+* Executing `v = tf.Variable(0)` adds to the graph a @{tf.Operation} that will
+  store a writeable tensor value that persists between @{tf.Session.run} calls.
+  The @{tf.Variable} object wraps this operation, and can be used [like a
+  tensor](#tensor-like_objects), which will read the current value of the
+  stored variable. The @{tf.Variable} object also has methods such as
+  @{tf.Variable.assign$`assign`} and @{tf.Variable.assign_add$`assign_add`} that
+  create @{tf.Operation} objects that, when executed, update the stored value.
+  (See @{$guide/variables} for more information about variables.)
+
+* Calling @{tf.train.Optimizer.minimize} will add operations and tensors that
+  calculate gradients to the default graph, and return a @{tf.Operation} that,
+  when run, will apply those gradients to a set of variables.
+
+Most programs rely solely on the default graph. However,
+see [Dealing with multiple graphs](#programming_with_multiple_graphs) for more
+advanced use cases. High-level APIs such as the @{tf.estimator.Estimator} API
+manage the default graph on your behalf, and--for example--may create different
+graphs for training and evaluation.
+
+Note: Calling most functions in the TensorFlow API merely adds operations
+and tensors to the default graph, but **does not** perform the actual
+computation. Instead, you compose these functions until you have a @{tf.Tensor}
+or @{tf.Operation} that represents the overall computation--such as performing
+one step of gradient descent--and then pass that object to a @{tf.Session} to
+perform the computation. See the section "Executing a graph in a @{tf.Session}"
+for more details.
+
+## Naming operations
+
+A @{tf.Graph} object defines a **namespace** for the @{tf.Operation} objects it
+contains. TensorFlow automatically chooses a unique name for each operation in
+your graph, but giving operations descriptive names can make your program easier
+to read and debug. The TensorFlow API provides two ways to override the name of
+an operation:
+
+* Each API function that creates a new @{tf.Operation} or returns a new
+  @{tf.Tensor} accepts an optional `name` argument. For example,
+  `tf.constant(42.0, name="answer")` creates a new @{tf.Operation} named
+  `"answer"` and returns a @{tf.Tensor} named `"answer:0"`. If the default graph
+  already contains an operation named `"answer"`, then TensorFlow would append
+  `"_1"`, `"_2"`, and so on to the name, in order to make it unique.
+
+* The @{tf.name_scope} function makes it possible to add a **name scope** prefix
+  to all operations created in a particular context. The current name scope
+  prefix is a `"/"`-delimited list of the names of all active @{tf.name_scope}
+  context managers. If a name scope has already been used in the current
+  context, TensorFlow appends `"_1"`, `"_2"`, and so on. For example:
+
+  ```python
+  c_0 = tf.constant(0, name="c")  # => operation named "c"
+
+  # Already-used names will be "uniquified".
+  c_1 = tf.constant(2, name="c")  # => operation named "c_1"
+
+  # Name scopes add a prefix to all operations created in the same context.
+  with tf.name_scope("outer"):
+    c_2 = tf.constant(2, name="c")  # => operation named "outer/c"
+
+    # Name scopes nest like paths in a hierarchical file system.
+    with tf.name_scope("inner"):
+      c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"
+
+    # Exiting a name scope context will return to the previous prefix.
+    c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"
+
+    # Already-used name scopes will be "uniquified".
+    with tf.name_scope("inner"):
+      c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"
+  ```
+
+The graph visualizer uses name scopes to group operations and reduce the visual
+complexity of a graph. See [Visualizing your graph](#visualizing-your-graph) for
+more information.
+
+Note that @{tf.Tensor} objects are implicitly named after the @{tf.Operation}
+that produces the tensor as output. A tensor name has the form `"<OP_NAME>:<i>"`
+where:
+
+* `"<OP_NAME>"` is the name of the operation that produces it.
+* `"<i>"` is an integer representing the index of that tensor among the
+  operation's outputs.
+
+## Placing operations on different devices
+
+If you want your TensorFlow program to use multiple different devices, the
+@{tf.device} function provides a convenient way to request that all operations
+created in a particular context are placed on the same device (or type of
+device).
+
+A **device specification** has the following form:
+
+```
+/job:<JOB_NAME>/task:<TASK_INDEX>/device:<DEVICE_TYPE>:<DEVICE_INDEX>
+```
+
+where:
+
+* `<JOB_NAME>` is an alpha-numeric string that does not start with a number.
+* `<DEVICE_TYPE>` is a registered device type (such as `GPU` or `CPU`).
+* `<TASK_INDEX>` is a non-negative integer representing the index of the task
+  in the job named `<JOB_NAME>`. See @{tf.train.ClusterSpec} for an explanation
+  of jobs and tasks.
+* `<DEVICE_INDEX>` is a non-negative integer representing the index of the
+  device, for example, to distinguish between different GPU devices used in the
+  same process.
+
+You do not need to specify every part of a device specification. For example,
+if you are running in a single-machine configuration with a single GPU, you
+might use @{tf.device} to pin some operations to the CPU and GPU:
+
+```python
+# Operations created outside either context will run on the "best possible"
+# device. For example, if you have a GPU and a CPU available, and the operation
+# has a GPU implementation, TensorFlow will choose the GPU.
+weights = tf.random_normal(...)
+
+with tf.device("/device:CPU:0"):
+  # Operations created in this context will be pinned to the CPU.
+  img = tf.decode_jpeg(tf.read_file("img.jpg"))
+
+with tf.device("/device:GPU:0"):
+  # Operations created in this context will be pinned to the GPU.
+  result = tf.matmul(weights, img)
+```
+
+If you are deploying TensorFlow in a @{$distributed$typical distributed configuration},
+you might specify the job name and task ID to place variables on
+a task in the parameter server job (`"/job:ps"`), and the other operations on
+tasks in the worker job (`"/job:worker"`):
+
+```python
+with tf.device("/job:ps/task:0"):
+  weights_1 = tf.Variable(tf.truncated_normal([784, 100]))
+  biases_1 = tf.Variable(tf.zeros([100]))
+
+with tf.device("/job:ps/task:1"):
+  weights_2 = tf.Variable(tf.truncated_normal([100, 10]))
+  biases_2 = tf.Variable(tf.zeros([10]))
+
+with tf.device("/job:worker"):
+  layer_1 = tf.matmul(train_batch, weights_1) + biases_1
+  layer_2 = tf.matmul(layer_1, weights_2) + biases_2
+```
+
+@{tf.device} gives you a lot of flexibility to choose placements for individual
+operations or broad regions of a TensorFlow graph. In many cases, there are
+simple heuristics that work well. For example, the
+@{tf.train.replica_device_setter} API can be used with @{tf.device} to place
+operations for **data-parallel distributed training**. For example, the
+following code fragment shows how @{tf.train.replica_device_setter} applies
+different placement policies to @{tf.Variable} objects and other operations:
+
+```python
+with tf.device(tf.train.replica_device_setter(ps_tasks=3)):
+  # tf.Variable objects are, by default, placed on tasks in "/job:ps" in a
+  # round-robin fashion.
+  w_0 = tf.Variable(...)  # placed on "/job:ps/task:0"
+  b_0 = tf.Variable(...)  # placed on "/job:ps/task:1"
+  w_1 = tf.Variable(...)  # placed on "/job:ps/task:2"
+  b_1 = tf.Variable(...)  # placed on "/job:ps/task:0"
+
+  input_data = tf.placeholder(tf.float32)     # placed on "/job:worker"
+  layer_0 = tf.matmul(input_data, w_0) + b_0  # placed on "/job:worker"
+  layer_1 = tf.matmul(layer_0, w_1) + b_1     # placed on "/job:worker"
+```
+
+## Tensor-like objects
+
+Many TensorFlow operations take one or more @{tf.Tensor} objects as arguments.
+For example, @{tf.matmul} takes two @{tf.Tensor} objects, and @{tf.add_n} takes
+a list of `n` @{tf.Tensor} objects. For convenience, these functions will accept
+a **tensor-like object** in place of a @{tf.Tensor}, and implicitly convert it
+to a @{tf.Tensor} using the @{tf.convert_to_tensor} method. Tensor-like objects
+include elements of the following types:
+
+* @{tf.Tensor}
+* @{tf.Variable}
+* [`numpy.ndarray`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html)
+* `list` (and lists of tensor-like objects)
+* Scalar Python types: `bool`, `float`, `int`, `str`
+
+You can register additional tensor-like types using
+@{tf.register_tensor_conversion_function}.
+
+Note: By default, TensorFlow will create a new @{tf.Tensor} each time you use
+the same tensor-like object. If the tensor-like object is large (e.g. a
+`numpy.ndarray` containing a set of training examples) and you use it multiple
+times, you may run out of memory. To avoid this, manually call
+@{tf.convert_to_tensor} on the tensor-like object once and use the returned
+@{tf.Tensor} instead.
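+
+For example, a sketch of that pattern (the array here stands in for a large
+in-memory data set):
+
+```python
+import numpy as np
+import tensorflow as tf
+
+big_array = np.random.rand(10000, 128)
+
+# Convert once, then reuse the resulting tf.Tensor, rather than passing
+# `big_array` itself (and re-converting it) in several places.
+big_tensor = tf.convert_to_tensor(big_array, dtype=tf.float32)
+first_rows = tf.gather(big_tensor, [0, 1, 2])
+scaled = big_tensor * 2.0
+```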
+
+## Executing a graph in a @{tf.Session}
+
+TensorFlow uses the @{tf.Session} class to represent a connection between the
+client program---typically a Python program, although a similar interface is
+available in other languages---and the C++ runtime. A @{tf.Session} object
+provides access to devices in the local machine, and remote devices using the
+distributed TensorFlow runtime. It also caches information about your
+@{tf.Graph} so that you can efficiently run the same computation multiple times.
+
+### Creating a @{tf.Session}
+
+If you are using the low-level TensorFlow API, you can create a @{tf.Session}
+for the current default graph as follows:
+
+```python
+# Create a default in-process session.
+with tf.Session() as sess:
+  # ...
+
+# Create a remote session.
+with tf.Session("grpc://example.org:2222"):
+  # ...
+```
+
+Since a @{tf.Session} owns physical resources (such as GPUs and
+network connections), it is typically used as a context manager (in a `with`
+block) that automatically closes the session when you exit the block. It is
+also possible to create a session without using a `with` block, but you should
+explicitly call @{tf.Session.close} when you are finished with it to free the
+resources.
+
+Note: Higher-level APIs such as @{tf.train.MonitoredTrainingSession} or
+@{tf.estimator.Estimator} will create and manage a @{tf.Session} for you. These
+APIs accept optional `target` and `config` arguments (either directly, or as
+part of a @{tf.estimator.RunConfig} object), with the same meaning as
+described below.
+
+@{tf.Session.__init__} accepts three optional arguments:
+
+* **`target`.** If this argument is left empty (the default), the session will
+  only use devices in the local machine. However, you may also specify a
+  `grpc://` URL to specify the address of a TensorFlow server, which gives the
+  session access to all devices on machines that this server controls. See
+  @{tf.train.Server} for details of how to create a TensorFlow
+  server. For example, in the common **between-graph replication**
+  configuration, the @{tf.Session} connects to a @{tf.train.Server} in the same
+  process as the client. The [distributed TensorFlow](../deploy/distributed.md)
+  deployment guide describes other common scenarios.
+
+* **`graph`.** By default, a new @{tf.Session} will be bound to---and only able
+  to run operations in---the current default graph. If you are using multiple
+  graphs in your program (see [Programming with multiple
+  graphs](#programming_with_multiple_graphs) for more details), you can specify
+  an explicit @{tf.Graph} when you construct the session.
+
+* **`config`.** This argument allows you to specify a @{tf.ConfigProto} that
+  controls the behavior of the session. For example, some of the configuration
+  options include:
+
+  * `allow_soft_placement`. Set this to `True` to enable a "soft" device
+    placement algorithm, which ignores @{tf.device} annotations that attempt
+    to place CPU-only operations on a GPU device, and places them on the CPU
+    instead.
+
+  * `cluster_def`. When using distributed TensorFlow, this option allows you
+    to specify what machines to use in the computation, and provide a mapping
+    between job names, task indices, and network addresses. See
+    @{tf.train.ClusterSpec.as_cluster_def} for details.
+
+  * `graph_options.optimizer_options`. Provides control over the optimizations
+    that TensorFlow performs on your graph before executing it.
+
+  * `gpu_options.allow_growth`. Set this to `True` to change the GPU memory
+    allocator so that it gradually increases the amount of memory allocated,
+    rather than allocating most of the memory at startup.
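+
+For example, a session that combines several of these options might be
+configured as in the following sketch (the option values are illustrative, not
+recommendations):
+
+```python
+config = tf.ConfigProto()
+config.allow_soft_placement = True      # Fall back to the CPU for CPU-only ops.
+config.gpu_options.allow_growth = True  # Allocate GPU memory gradually.
+
+with tf.Session(config=config) as sess:
+  # ...
+```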
+
+### Using @{tf.Session.run} to execute operations
+
+The @{tf.Session.run} method is the main mechanism for running a @{tf.Operation}
+or evaluating a @{tf.Tensor}. You can pass one or more @{tf.Operation} or
+@{tf.Tensor} objects to @{tf.Session.run}, and TensorFlow will execute the
+operations that are needed to compute the result.
+
+@{tf.Session.run} requires you to specify a list of **fetches**, which determine
+the return values, and may be a @{tf.Operation}, a @{tf.Tensor}, or
+a [tensor-like type](#tensor-like_objects) such as @{tf.Variable}. These fetches
+determine what **subgraph** of the overall @{tf.Graph} must be executed to
+produce the result: this is the subgraph that contains all operations named in
+the fetch list, plus all operations whose outputs are used to compute the value
+of the fetches. For example, the following code fragment shows how different
+arguments to @{tf.Session.run} cause different subgraphs to be executed:
+
+```python
+x = tf.constant([[37.0, -23.0], [1.0, 4.0]])
+w = tf.Variable(tf.random_uniform([2, 2]))
+y = tf.matmul(x, w)
+output = tf.nn.softmax(y)
+init_op = w.initializer
+
+with tf.Session() as sess:
+  # Run the initializer on `w`.
+  sess.run(init_op)
+
+  # Evaluate `output`. `sess.run(output)` will return a NumPy array containing
+  # the result of the computation.
+  print(sess.run(output))
+
+  # Evaluate `y` and `output`. Note that `y` will only be computed once, and its
+  # result used both to return `y_val` and as an input to the `tf.nn.softmax()`
+  # op. Both `y_val` and `output_val` will be NumPy arrays.
+  y_val, output_val = sess.run([y, output])
+```
+
+@{tf.Session.run} also optionally takes a dictionary of **feeds**, which is a
+mapping from @{tf.Tensor} objects (typically @{tf.placeholder} tensors) to
+values (typically Python scalars, lists, or NumPy arrays) that will be
+substituted for those tensors in the execution. For example:
+
+```python
+# Define a placeholder that expects a vector of three floating-point values,
+# and a computation that depends on it.
+x = tf.placeholder(tf.float32, shape=[3])
+y = tf.square(x)
+
+with tf.Session() as sess:
+  # Feeding a value changes the result that is returned when you evaluate `y`.
+  print(sess.run(y, {x: [1.0, 2.0, 3.0]}))  # => "[1.0, 4.0, 9.0]"
+  print(sess.run(y, {x: [0.0, 0.0, 5.0]}))  # => "[0.0, 0.0, 25.0]"
+
+  # Raises `tf.errors.InvalidArgumentError`, because you must feed a value for
+  # a `tf.placeholder()` when evaluating a tensor that depends on it.
+  sess.run(y)
+
+  # Raises `ValueError`, because the shape of `37.0` does not match the shape
+  # of placeholder `x`.
+  sess.run(y, {x: 37.0})
+```
+
+@{tf.Session.run} also accepts an optional `options` argument that enables you
+to specify options about the call, and an optional `run_metadata` argument that
+enables you to collect metadata about the execution. For example, you can use
+these options together to collect tracing information about the execution:
+
+```python
+y = tf.matmul([[37.0, -23.0], [1.0, 4.0]], tf.random_uniform([2, 2]))
+
+with tf.Session() as sess:
+  # Define options for the `sess.run()` call.
+  options = tf.RunOptions()
+  options.output_partition_graphs = True
+  options.trace_level = tf.RunOptions.FULL_TRACE
+
+  # Define a container for the returned metadata.
+  metadata = tf.RunMetadata()
+
+  sess.run(y, options=options, run_metadata=metadata)
+
+  # Print the subgraphs that executed on each device.
+  print(metadata.partition_graphs)
+
+  # Print the timings of each operation that executed.
+  print(metadata.step_stats)
+```
+
+
+## Visualizing your graph
+
+TensorFlow includes tools that can help you to understand the code in a graph.
+The **graph visualizer** is a component of TensorBoard that renders the +structure of your graph visually in a browser. The easiest way to create a +visualization is to pass a @{tf.Graph} when creating the +@{tf.summary.FileWriter}: + +```python +# Build your graph. +x = tf.constant([[37.0, -23.0], [1.0, 4.0]]) +w = tf.Variable(tf.random_uniform([2, 2])) +y = tf.matmul(x, w) +# ... +loss = ... +train_op = tf.train.AdagradOptimizer(0.01).minimize(loss) + +with tf.Session() as sess: + # `sess.graph` provides access to the graph used in a `tf.Session`. + writer = tf.summary.FileWriter("/tmp/log/...", sess.graph) + + # Perform your computation... + for i in range(1000): + sess.run(train_op) + # ... + + writer.close() +``` + +Note: If you are using a @{tf.estimator.Estimator}, the graph (and any +summaries) will be logged automatically to the `model_dir` that you specified +when creating the estimator. + +You can then open the log in `tensorboard`, navigate to the "Graph" tab, and +see a high-level visualization of your graph's structure. Note that a typical +TensorFlow graph---especially training graphs with automatically computed +gradients---has too many nodes to visualize at once. The graph visualizer makes +use of name scopes to group related operations into "super" nodes. You can +click on the orange "+" button on any of these super nodes to expand the +subgraph inside. + +![](../images/mnist_deep.png) + +For more information about visualizing your TensorFlow application with +TensorBoard, see the [TensorBoard tutorial](../get_started/summaries_and_tensorboard.md). + +## Programming with multiple graphs + +Note: When training a model, a common way of organizing your code is to use one +graph for training your model, and a separate graph for evaluating or performing +inference with a trained model. In many cases, the inference graph will be +different from the training graph: for example, techniques like dropout and +batch normalization use different operations in each case. Furthermore, by +default utilities like @{tf.train.Saver} use the names of @{tf.Variable} objects +(which have names based on an underlying @{tf.Operation}) to identify each +variable in a saved checkpoint. When programming this way, you can either use +completely separate Python processes to build and execute the graphs, or you can +use multiple graphs in the same process. This section describes how to use +multiple graphs in the same process. + +As noted above, TensorFlow provides a "default graph" that is implicitly passed +to all API functions in the same context. For many applications, a single graph +is sufficient. However, TensorFlow also provides methods for manipulating +the default graph, which can be useful in more advanced use cases. For example: + +* A @{tf.Graph} defines the namespace for @{tf.Operation} objects: each + operation in a single graph must have a unique name. TensorFlow will + "uniquify" the names of operations by appending `"_1"`, `"_2"`, and so on to + their names if the requested name is already taken. Using multiple explicitly + created graphs gives you more control over what name is given to each + operation. + +* The default graph stores information about every @{tf.Operation} and + @{tf.Tensor} that was ever added to it. If your program creates a large number + of unconnected subgraphs, it may be more efficient to use a different + @{tf.Graph} to build each subgraph, so that unrelated state can be garbage + collected. 
+
+You can install a different @{tf.Graph} as the default graph, using the
+@{tf.Graph.as_default} context manager:
+
+```python
+g_1 = tf.Graph()
+with g_1.as_default():
+  # Operations created in this scope will be added to `g_1`.
+  c = tf.constant("Node in g_1")
+
+  # Sessions created in this scope will run operations from `g_1`.
+  sess_1 = tf.Session()
+
+g_2 = tf.Graph()
+with g_2.as_default():
+  # Operations created in this scope will be added to `g_2`.
+  d = tf.constant("Node in g_2")
+
+# Alternatively, you can pass a graph when constructing a `tf.Session`:
+# `sess_2` will run operations from `g_2`.
+sess_2 = tf.Session(graph=g_2)
+
+assert c.graph is g_1
+assert sess_1.graph is g_1
+
+assert d.graph is g_2
+assert sess_2.graph is g_2
+```
+
+To inspect the current default graph, call @{tf.get_default_graph}, which
+returns a @{tf.Graph} object:
+
+```python
+# Print all of the operations in the default graph.
+g = tf.get_default_graph()
+print(g.get_operations())
+```
diff --git a/tensorflow/docs_src/guide/index.md b/tensorflow/docs_src/guide/index.md
new file mode 100644
index 0000000000..eefdb9ceae
--- /dev/null
+++ b/tensorflow/docs_src/guide/index.md
@@ -0,0 +1,86 @@
+# TensorFlow Guide
+
+The documents in this unit dive into the details of how TensorFlow
+works. The units are as follows:
+
+## High Level APIs
+
+  * @{$guide/keras}, TensorFlow's high-level API for building and
+    training deep learning models.
+  * @{$guide/eager}, an API for writing TensorFlow code
+    imperatively, like you would use NumPy.
+  * @{$guide/estimators}, a high-level API that provides
+    fully-packaged models ready for large-scale training and production.
+  * @{$guide/datasets}, easy input pipelines to bring your data into
+    your TensorFlow program.
+
+## Estimators
+
+* @{$estimators} provides an introduction.
+* @{$premade_estimators}, which introduces Estimators for machine learning.
+* @{$custom_estimators}, which demonstrates how to build and train models you
+  design yourself.
+* @{$feature_columns}, which shows how an Estimator can handle a variety of input
+  data types without changes to the model.
+* @{$datasets_for_estimators} describes using tf.data with estimators.
+* @{$checkpoints}, which explains how to save training progress and resume where
+  you left off.
+
+## Accelerators
+
+  * @{$using_gpu} explains how TensorFlow assigns operations to
+    devices and how you can change the arrangement manually.
+  * @{$using_tpu} explains how to modify `Estimator` programs to run on a TPU.
+
+## Low Level APIs
+
+  * @{$guide/low_level_intro}, which introduces the
+    basics of how you can use TensorFlow outside of the high-level APIs.
+  * @{$guide/tensors}, which explains how to create,
+    manipulate, and access Tensors--the fundamental object in TensorFlow.
+  * @{$guide/variables}, which details how
+    to represent shared, persistent state in your program.
+  * @{$guide/graphs}, which explains:
+    * dataflow graphs, which are TensorFlow's representation of computations
+      as dependencies between operations.
+    * sessions, which are TensorFlow's mechanism for running dataflow graphs
+      across one or more local or remote devices.
+    If you are programming with the low-level TensorFlow API, this unit
+    is essential. If you are programming with a high-level TensorFlow API
+    such as Estimators or Keras, the high-level API creates and manages
+    graphs and sessions for you, but understanding graphs and sessions
+    can still be helpful.
+
+  * @{$guide/saved_model}, which
+    explains how to save and restore variables and models.
+
+## ML Concepts
+
+  * @{$guide/embedding}, which introduces the concept
+    of embeddings, provides a simple example of training an embedding in
+    TensorFlow, and explains how to view embeddings with the TensorBoard
+    Embedding Projector.
+
+## Debugging
+
+  * @{$guide/debugger}, which
+    explains how to use the TensorFlow debugger (tfdbg).
+
+## TensorBoard
+
+TensorBoard is a utility to visualize different aspects of machine learning.
+The following guides explain how to use TensorBoard:
+
+  * @{$guide/summaries_and_tensorboard},
+    which introduces TensorBoard.
+  * @{$guide/graph_viz}, which
+    explains how to visualize the computational graph.
+  * @{$guide/tensorboard_histograms}, which demonstrates how to
+    use TensorBoard's histogram dashboard.
+
+
+## Misc
+
+  * @{$guide/version_compat},
+    which explains backward compatibility guarantees and non-guarantees.
+  * @{$guide/faq}, which contains frequently asked
+    questions about TensorFlow.
diff --git a/tensorflow/docs_src/guide/keras.md b/tensorflow/docs_src/guide/keras.md
new file mode 100644
index 0000000000..83172dab7f
--- /dev/null
+++ b/tensorflow/docs_src/guide/keras.md
@@ -0,0 +1,623 @@
+# Keras
+
+Keras is a high-level API to build and train deep learning models. It's used for
+fast prototyping, advanced research, and production, with three key advantages:
+
+- *User friendly*
+  Keras has a simple, consistent interface optimized for common use cases. It
+  provides clear and actionable feedback for user errors.
+- *Modular and composable*
+  Keras models are made by connecting configurable building blocks together,
+  with few restrictions.
+- *Easy to extend*
+  Write custom building blocks to express new ideas for
+  research. Create new layers, loss functions, and develop state-of-the-art
+  models.
+
+## Import tf.keras
+
+`tf.keras` is TensorFlow's implementation of the
+[Keras API specification](https://keras.io){:.external}. This is a high-level
+API to build and train models that includes first-class support for
+TensorFlow-specific functionality, such as [eager execution](#eager_execution),
+`tf.data` pipelines, and [Estimators](./estimators.md).
+`tf.keras` makes TensorFlow easier to use without sacrificing flexibility and
+performance.
+
+To get started, import `tf.keras` as part of your TensorFlow program setup:
+
+```python
+import tensorflow as tf
+from tensorflow import keras
+```
+
+`tf.keras` can run any Keras-compatible code, but keep in mind:
+
+* The `tf.keras` version in the latest TensorFlow release might not be the same
+  as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
+* When [saving a model's weights](#weights_only), `tf.keras` defaults to the
+  [checkpoint format](../get_started/checkpoints.md). Pass `save_format='h5'` to
+  use HDF5.
+
+## Build a simple model
+
+### Sequential model
+
+In Keras, you assemble *layers* to build *models*. A model is (usually) a graph
+of layers. The most common type of model is a stack of layers: the
+`tf.keras.Sequential` model.
+
+To build a simple, fully-connected network (i.e. multi-layer perceptron):
+
+```python
+model = keras.Sequential()
+# Adds a densely-connected layer with 64 units to the model:
+model.add(keras.layers.Dense(64, activation='relu'))
+# Add another:
+model.add(keras.layers.Dense(64, activation='relu'))
+# Add a softmax layer with 10 output units:
+model.add(keras.layers.Dense(10, activation='softmax'))
+```
+
+### Configure the layers
+
+There are many `tf.keras.layers` available with some common constructor
+parameters:
+
+* `activation`: Set the activation function for the layer. This parameter is
+  specified by the name of a built-in function or as a callable object. By
+  default, no activation is applied.
+* `kernel_initializer` and `bias_initializer`: The initialization schemes
+  that create the layer's weights (kernel and bias). This parameter is a name or
+  a callable object. This defaults to the `"Glorot uniform"` initializer.
+* `kernel_regularizer` and `bias_regularizer`: The regularization schemes
+  that apply to the layer's weights (kernel and bias), such as L1 or L2
+  regularization. By default, no regularization is applied.
+
+The following instantiates `tf.keras.layers.Dense` layers using constructor
+arguments:
+
+```python
+# Create a sigmoid layer:
+keras.layers.Dense(64, activation='sigmoid')
+# Or:
+keras.layers.Dense(64, activation=tf.sigmoid)
+
+# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
+keras.layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01))
+# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
+keras.layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01))
+
+# A linear layer with a kernel initialized to a random orthogonal matrix:
+keras.layers.Dense(64, kernel_initializer='orthogonal')
+# A linear layer with a bias vector initialized to 2.0s:
+keras.layers.Dense(64, bias_initializer=keras.initializers.constant(2.0))
+```
+
+## Train and evaluate
+
+### Set up training
+
+After the model is constructed, configure its learning process by calling the
+`compile` method:
+
+```python
+model.compile(optimizer=tf.train.AdamOptimizer(0.001),
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+```
+
+`tf.keras.Model.compile` takes three important arguments:
+
+* `optimizer`: This object specifies the training procedure. Pass it optimizer
+  instances from the `tf.train` module, such as
+  [`AdamOptimizer`](/api_docs/python/tf/train/AdamOptimizer),
+  [`RMSPropOptimizer`](/api_docs/python/tf/train/RMSPropOptimizer), or
+  [`GradientDescentOptimizer`](/api_docs/python/tf/train/GradientDescentOptimizer).
+* `loss`: The function to minimize during optimization. Common choices include
+  mean square error (`mse`), `categorical_crossentropy`, and
+  `binary_crossentropy`. Loss functions are specified by name or by
+  passing a callable object from the `tf.keras.losses` module.
+* `metrics`: Used to monitor training. These are string names or callables from
+  the `tf.keras.metrics` module.
+
+The following shows a few examples of configuring a model for training:
+
+```python
+# Configure a model for mean-squared error regression.
+model.compile(optimizer=tf.train.AdamOptimizer(0.01),
+              loss='mse',       # mean squared error
+              metrics=['mae'])  # mean absolute error
+
+# Configure a model for categorical classification.
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.01),
+              loss=keras.losses.categorical_crossentropy,
+              metrics=[keras.metrics.categorical_accuracy])
+```
+
+### Input NumPy data
+
+For small datasets, use in-memory [NumPy](https://www.numpy.org/){:.external}
+arrays to train and evaluate a model. The model is "fit" to the training data
+using the `fit` method:
+
+```python
+import numpy as np
+
+data = np.random.random((1000, 32))
+labels = np.random.random((1000, 10))
+
+model.fit(data, labels, epochs=10, batch_size=32)
+```
+
+`tf.keras.Model.fit` takes three important arguments:
+
+* `epochs`: Training is structured into *epochs*. An epoch is one iteration over
+  the entire input data (this is done in smaller batches).
+* `batch_size`: When passed NumPy data, the model slices the data into smaller
+  batches and iterates over these batches during training. This integer
+  specifies the size of each batch. Be aware that the last batch may be smaller
+  if the total number of samples is not divisible by the batch size.
+* `validation_data`: When prototyping a model, you want to easily monitor its
+  performance on some validation data. Passing this argument—a tuple of inputs
+  and labels—allows the model to display the loss and metrics in inference mode
+  for the passed data, at the end of each epoch.
+
+Here's an example using `validation_data`:
+
+```python
+import numpy as np
+
+data = np.random.random((1000, 32))
+labels = np.random.random((1000, 10))
+
+val_data = np.random.random((100, 32))
+val_labels = np.random.random((100, 10))
+
+model.fit(data, labels, epochs=10, batch_size=32,
+          validation_data=(val_data, val_labels))
+```
+
+### Input tf.data datasets
+
+Use the [Datasets API](./datasets.md) to scale to large datasets
+or multi-device training. Pass a `tf.data.Dataset` instance to the `fit`
+method:
+
+```python
+# Instantiates a toy dataset instance:
+dataset = tf.data.Dataset.from_tensor_slices((data, labels))
+dataset = dataset.batch(32)
+dataset = dataset.repeat()
+
+# Don't forget to specify `steps_per_epoch` when calling `fit` on a dataset.
+model.fit(dataset, epochs=10, steps_per_epoch=30)
+```
+
+Here, the `fit` method uses the `steps_per_epoch` argument—this is the number of
+training steps the model runs before it moves to the next epoch. Since the
+`Dataset` yields batches of data, this snippet does not require a `batch_size`.
+
+Datasets can also be used for validation:
+
+```python
+dataset = tf.data.Dataset.from_tensor_slices((data, labels))
+dataset = dataset.batch(32).repeat()
+
+val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
+val_dataset = val_dataset.batch(32).repeat()
+
+model.fit(dataset, epochs=10, steps_per_epoch=30,
+          validation_data=val_dataset,
+          validation_steps=3)
+```
+
+### Evaluate and predict
+
+The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods can use NumPy
+data and a `tf.data.Dataset`.
+
+To *evaluate* the inference-mode loss and metrics for the data provided:
+
+```python
+model.evaluate(x, y, batch_size=32)
+
+model.evaluate(dataset, steps=30)
+```
+
+And to *predict* the output of the last layer in inference for the data provided,
+as a NumPy array:
+
+```python
+model.predict(x, batch_size=32)
+
+model.predict(dataset, steps=30)
+```
+
+
+## Build advanced models
+
+### Functional API
+
+The `tf.keras.Sequential` model is a simple stack of layers that cannot
+represent arbitrary models. Use the
+[Keras functional API](https://keras.io/getting-started/functional-api-guide/){:.external}
+to build complex model topologies such as:
+
+* Multi-input models,
+* Multi-output models,
+* Models with shared layers (the same layer called several times),
+* Models with non-sequential data flows (e.g. residual connections).
+
+Building a model with the functional API works like this:
+
+1. A layer instance is callable and returns a tensor.
+2. Input tensors and output tensors are used to define a `tf.keras.Model`
+   instance.
+3. This model is trained just like the `Sequential` model.
+
+The following example uses the functional API to build a simple, fully-connected
+network:
+
+```python
+inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor
+
+# A layer instance is callable on a tensor, and returns a tensor.
+x = keras.layers.Dense(64, activation='relu')(inputs)
+x = keras.layers.Dense(64, activation='relu')(x)
+predictions = keras.layers.Dense(10, activation='softmax')(x)
+
+# Instantiate the model given inputs and outputs.
+model = keras.Model(inputs=inputs, outputs=predictions)
+
+# The compile step specifies the training configuration.
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), + loss='categorical_crossentropy', + metrics=['accuracy']) + +# Trains for 5 epochs +model.fit(data, labels, batch_size=32, epochs=5) +``` + +### Model subclassing + +Build a fully-customizable model by subclassing `tf.keras.Model` and defining +your own forward pass. Create layers in the `__init__` method and set them as +attributes of the class instance. Define the forward pass in the `call` method. + +Model subclassing is particularly useful when +[eager execution](./eager.md) is enabled since the forward pass +can be written imperatively. + +Key Point: Use the right API for the job. While model subclassing offers +flexibility, it comes at a cost of greater complexity and more opportunities for +user errors. If possible, prefer the functional API. + +The following example shows a subclassed `tf.keras.Model` using a custom forward +pass: + +```python +class MyModel(keras.Model): + + def __init__(self, num_classes=10): + super(MyModel, self).__init__(name='my_model') + self.num_classes = num_classes + # Define your layers here. + self.dense_1 = keras.layers.Dense(32, activation='relu') + self.dense_2 = keras.layers.Dense(num_classes, activation='sigmoid') + + def call(self, inputs): + # Define your forward pass here, + # using layers you previously defined (in `__init__`). + x = self.dense_1(inputs) + return self.dense_2(x) + + def compute_output_shape(self, input_shape): + # You need to override this function if you want to use the subclassed model + # as part of a functional-style model. + # Otherwise, this method is optional. + shape = tf.TensorShape(input_shape).as_list() + shape[-1] = self.num_classes + return tf.TensorShape(shape) + + +# Instantiates the subclassed model. +model = MyModel(num_classes=10) + +# The compile step specifies the training configuration. +model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), + loss='categorical_crossentropy', + metrics=['accuracy']) + +# Trains for 5 epochs. +model.fit(data, labels, batch_size=32, epochs=5) +``` + + +### Custom layers + +Create a custom layer by subclassing `tf.keras.layers.Layer` and implementing +the following methods: + +* `build`: Create the weights of the layer. Add weights with the `add_weight` + method. +* `call`: Define the forward pass. +* `compute_output_shape`: Specify how to compute the output shape of the layer + given the input shape. +* Optionally, a layer can be serialized by implementing the `get_config` method + and the `from_config` class method. + +Here's an example of a custom layer that implements a `matmul` of an input with +a kernel matrix: + +```python +class MyLayer(keras.layers.Layer): + + def __init__(self, output_dim, **kwargs): + self.output_dim = output_dim + super(MyLayer, self).__init__(**kwargs) + + def build(self, input_shape): + shape = tf.TensorShape((input_shape[1], self.output_dim)) + # Create a trainable weight variable for this layer. 
+    self.kernel = self.add_weight(name='kernel',
+                                  shape=shape,
+                                  initializer='uniform',
+                                  trainable=True)
+    # Be sure to call this at the end
+    super(MyLayer, self).build(input_shape)
+
+  def call(self, inputs):
+    return tf.matmul(inputs, self.kernel)
+
+  def compute_output_shape(self, input_shape):
+    shape = tf.TensorShape(input_shape).as_list()
+    shape[-1] = self.output_dim
+    return tf.TensorShape(shape)
+
+  def get_config(self):
+    base_config = super(MyLayer, self).get_config()
+    base_config['output_dim'] = self.output_dim
+    return base_config
+
+  @classmethod
+  def from_config(cls, config):
+    return cls(**config)
+
+
+# Create a model using the custom layer
+model = keras.Sequential([MyLayer(10),
+                          keras.layers.Activation('softmax')])
+
+# The compile step specifies the training configuration
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+
+# Trains for 5 epochs.
+model.fit(data, targets, batch_size=32, epochs=5)
+```
+
+
+## Callbacks
+
+A callback is an object passed to a model to customize and extend its behavior
+during training. You can write your own custom callback, or use the built-in
+`tf.keras.callbacks` that include:
+
+* `tf.keras.callbacks.ModelCheckpoint`: Save checkpoints of your model at
+  regular intervals.
+* `tf.keras.callbacks.LearningRateScheduler`: Dynamically change the learning
+  rate.
+* `tf.keras.callbacks.EarlyStopping`: Interrupt training when validation
+  performance has stopped improving.
+* `tf.keras.callbacks.TensorBoard`: Monitor the model's behavior using
+  [TensorBoard](./summaries_and_tensorboard.md).
+
+To use a `tf.keras.callbacks.Callback`, pass it to the model's `fit` method:
+
+```python
+callbacks = [
+  # Interrupt training if `val_loss` stops improving for over 2 epochs
+  keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
+  # Write TensorBoard logs to `./logs` directory
+  keras.callbacks.TensorBoard(log_dir='./logs')
+]
+model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
+          validation_data=(val_data, val_targets))
+```
+
+
+## Save and restore
+
+### Weights only
+
+Save and load the weights of a model using `tf.keras.Model.save_weights`:
+
+```python
+# Save weights to a TensorFlow Checkpoint file
+model.save_weights('./my_model')
+
+# Restore the model's state;
+# this requires a model with the same architecture.
+model.load_weights('./my_model')
+```
+
+By default, this saves the model's weights in the
+[TensorFlow checkpoint](../get_started/checkpoints.md) file format. Weights can
+also be saved to the Keras HDF5 format (the default for the multi-backend
+implementation of Keras):
+
+```python
+# Save weights to an HDF5 file
+model.save_weights('my_model.h5', save_format='h5')
+
+# Restore the model's state
+model.load_weights('my_model.h5')
+```
+
+
+### Configuration only
+
+A model's configuration can be saved—this serializes the model architecture
+without any weights. A saved configuration can recreate and initialize the same
+model, even without the code that defined the original model.
+Keras supports
+JSON and YAML serialization formats:
+
+```python
+# Serialize a model to JSON format
+json_string = model.to_json()
+
+# Recreate the model (freshly initialized)
+fresh_model = keras.models.model_from_json(json_string)
+
+# Serialize a model to YAML format
+yaml_string = model.to_yaml()
+
+# Recreate the model
+fresh_model = keras.models.model_from_yaml(yaml_string)
+```
+
+Caution: Subclassed models are not serializable because their architecture is
+defined by the Python code in the body of the `call` method.
+
+
+### Entire model
+
+The entire model can be saved to a file that contains the weight values, the
+model's configuration, and even the optimizer's configuration. This allows you
+to checkpoint a model and resume training later—from the exact same
+state—without access to the original code.
+
+```python
+# Create a trivial model
+model = keras.Sequential([
+  keras.layers.Dense(10, activation='softmax', input_shape=(32,)),
+  keras.layers.Dense(10, activation='softmax')
+])
+model.compile(optimizer='rmsprop',
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+model.fit(data, targets, batch_size=32, epochs=5)
+
+
+# Save entire model to an HDF5 file
+model.save('my_model.h5')
+
+# Recreate the exact same model, including weights and optimizer.
+model = keras.models.load_model('my_model.h5')
+```
+
+
+## Eager execution
+
+[Eager execution](./eager.md) is an imperative programming
+environment that evaluates operations immediately. This is not required for
+Keras, but is supported by `tf.keras` and useful for inspecting your program and
+debugging.
+
+All of the `tf.keras` model-building APIs are compatible with eager execution.
+And while the `Sequential` and functional APIs can be used, eager execution
+especially benefits *model subclassing* and building *custom layers*—the APIs
+that require you to write the forward pass as code (instead of the APIs that
+create models by assembling existing layers).
+
+See the [eager execution guide](./eager.md#build_a_model) for
+examples of using Keras models with custom training loops and `tf.GradientTape`.
+
+
+## Distribution
+
+### Estimators
+
+The [Estimators](./estimators.md) API is used for training models
+for distributed environments. This targets industry use cases such as
+distributed training on large datasets, where the trained model can be
+exported for production.
+
+A `tf.keras.Model` can be trained with the `tf.estimator` API by converting the
+model to a `tf.estimator.Estimator` object with
+`tf.keras.estimator.model_to_estimator`. See
+[Creating Estimators from Keras models](./estimators.md#creating_estimators_from_keras_models).
+
+```python
+model = keras.Sequential([keras.layers.Dense(10, activation='softmax'),
+                          keras.layers.Dense(10, activation='softmax')])
+
+model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
+              loss='categorical_crossentropy',
+              metrics=['accuracy'])
+
+estimator = keras.estimator.model_to_estimator(model)
+```
+
+Note: Enable [eager execution](./eager.md) for debugging
+[Estimator input functions](./premade_estimators.md#create_input_functions)
+and inspecting data.
+
+### Multiple GPUs
+
+`tf.keras` models can run on multiple GPUs using
+`tf.contrib.distribute.DistributionStrategy`. This API provides distributed
+training on multiple GPUs with almost no changes to existing code.
+
+Currently, `tf.contrib.distribute.MirroredStrategy` is the only supported
+distribution strategy. `MirroredStrategy` does in-graph replication with
+synchronous training using all-reduce on a single machine.
+To use `DistributionStrategy` with Keras, convert the `tf.keras.Model` to a
+`tf.estimator.Estimator` with `tf.keras.estimator.model_to_estimator`, then
+train the estimator.
+
+The following example distributes a `tf.keras.Model` across multiple GPUs on a
+single machine.
+
+First, define a simple model:
+
+```python
+model = keras.Sequential()
+model.add(keras.layers.Dense(16, activation='relu', input_shape=(10,)))
+model.add(keras.layers.Dense(1, activation='sigmoid'))
+
+optimizer = tf.train.GradientDescentOptimizer(0.2)
+
+model.compile(loss='binary_crossentropy', optimizer=optimizer)
+model.summary()
+```
+
+Convert the Keras model to a `tf.estimator.Estimator` instance:
+
+```python
+# `config` is the `tf.estimator.RunConfig` created in a later step below.
+keras_estimator = keras.estimator.model_to_estimator(
+  keras_model=model,
+  config=config,
+  model_dir='/tmp/model_dir')
+```
+
+Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
+used to distribute the data across multiple devices—with each device processing
+a slice of the input batch.
+
+```python
+def input_fn():
+  x = np.random.random((1024, 10))
+  y = np.random.randint(2, size=(1024, 1))
+  x = tf.cast(x, tf.float32)
+  dataset = tf.data.Dataset.from_tensor_slices((x, y))
+  dataset = dataset.repeat(10)
+  dataset = dataset.batch(32)
+  return dataset
+```
+
+Next, create a `tf.estimator.RunConfig` and set the `train_distribute` argument
+to the `tf.contrib.distribute.MirroredStrategy` instance. When creating
+`MirroredStrategy`, you can specify a list of devices or set the `num_gpus`
+argument. The default uses all available GPUs, like the following:
+
+```python
+strategy = tf.contrib.distribute.MirroredStrategy()
+config = tf.estimator.RunConfig(train_distribute=strategy)
+```
+
+Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
+arguments:
+
+```python
+keras_estimator.train(input_fn=input_fn, steps=10)
+```
diff --git a/tensorflow/docs_src/guide/leftnav_files b/tensorflow/docs_src/guide/leftnav_files
new file mode 100644
index 0000000000..357a2a1cb9
--- /dev/null
+++ b/tensorflow/docs_src/guide/leftnav_files
@@ -0,0 +1,40 @@
+index.md
+
+### High Level APIs
+keras.md
+eager.md
+datasets.md
+
+### Estimators
+estimators.md: Introduction to Estimators
+premade_estimators.md
+custom_estimators.md
+feature_columns.md
+datasets_for_estimators.md
+checkpoints.md
+
+### Accelerators
+using_gpu.md
+using_tpu.md
+
+### Low Level APIs
+low_level_intro.md
+tensors.md
+variables.md
+graphs.md
+saved_model.md
+
+### ML Concepts
+embedding.md
+
+### Debugging
+debugger.md
+
+### TensorBoard
+summaries_and_tensorboard.md: Visualizing Learning
+graph_viz.md: Graphs
+tensorboard_histograms.md: Histograms
+
+### Misc
+version_compat.md
+faq.md
diff --git a/tensorflow/docs_src/guide/low_level_intro.md b/tensorflow/docs_src/guide/low_level_intro.md
new file mode 100644
index 0000000000..665a5568b4
--- /dev/null
+++ b/tensorflow/docs_src/guide/low_level_intro.md
@@ -0,0 +1,604 @@
+# Introduction
+
+This guide gets you started programming in the low-level TensorFlow APIs
+(TensorFlow Core), showing you how to:
+
+  * Manage your own TensorFlow program (a `tf.Graph`) and TensorFlow
+    runtime (a `tf.Session`), instead of relying on Estimators to manage them.
+  * Run TensorFlow operations, using a `tf.Session`.
+  * Use high-level components ([datasets](#datasets), [layers](#layers), and
+    [feature_columns](#feature_columns)) in this low-level environment.
+  * Build your own training loop, instead of using the one
+    @{$premade_estimators$provided by Estimators}.
+
+We recommend using the higher-level APIs to build models when possible.
+Knowing TensorFlow Core is valuable for the following reasons:
+
+  * Experimentation and debugging are both more straightforward
+    when you can use low-level TensorFlow operations directly.
+  * It gives you a mental model of how things work internally when
+    using the higher-level APIs.
+
+## Setup
+
+Before using this guide, @{$install$install TensorFlow}.
+
+To get the most out of this guide, you should know the following:
+
+* How to program in Python.
+* At least a little bit about arrays.
+* Ideally, something about machine learning.
+
+Feel free to launch `python` and follow along with this walkthrough.
+Run the following lines to set up your Python environment:
+
+```python
+from __future__ import absolute_import
+from __future__ import division
+from __future__ import print_function
+
+import numpy as np
+import tensorflow as tf
+```
+
+## Tensor Values
+
+The central unit of data in TensorFlow is the **tensor**. A tensor consists of a
+set of primitive values shaped into an array of any number of dimensions. A
+tensor's **rank** is its number of dimensions, while its **shape** is a tuple
+of integers specifying the array's length along each dimension. Here are some
+examples of tensor values:
+
+```python
+3. # a rank 0 tensor; a scalar with shape []
+[1., 2., 3.] # a rank 1 tensor; a vector with shape [3]
+[[1., 2., 3.], [4., 5., 6.]] # a rank 2 tensor; a matrix with shape [2, 3]
+[[[1., 2., 3.]], [[7., 8., 9.]]] # a rank 3 tensor with shape [2, 1, 3]
+```
+
+TensorFlow uses NumPy arrays to represent tensor **values**.
+
+## TensorFlow Core Walkthrough
+
+You might think of TensorFlow Core programs as consisting of two discrete
+sections:
+
+1. Building the computational graph (a @{tf.Graph}).
+2. Running the computational graph (using a @{tf.Session}).
+
+### Graph
+
+A **computational graph** is a series of TensorFlow operations arranged into a
+graph. The graph is composed of two types of objects.
+
+  * @{tf.Operation$Operations} (or "ops"): The nodes of the graph.
+    Operations describe calculations that consume and produce tensors.
+  * @{tf.Tensor$Tensors}: The edges in the graph. These represent the values
+    that will flow through the graph. Most TensorFlow functions return
+    `tf.Tensors`.
+
+Important: `tf.Tensors` do not have values; they are just handles to elements
+in the computation graph.
+
+Let's build a simple computational graph. The most basic operation is a
+constant. The Python function that builds the operation takes a tensor value as
+input. The resulting operation takes no inputs. When run, it outputs the
+value that was passed to the constructor. We can create two floating point
+constants `a` and `b` as follows:
+
+```python
+a = tf.constant(3.0, dtype=tf.float32)
+b = tf.constant(4.0) # also tf.float32 implicitly
+total = a + b
+print(a)
+print(b)
+print(total)
+```
+
+The print statements produce:
+
+```
+Tensor("Const:0", shape=(), dtype=float32)
+Tensor("Const_1:0", shape=(), dtype=float32)
+Tensor("add:0", shape=(), dtype=float32)
+```
+
+Notice that printing the tensors does not output the values `3.0`, `4.0`, and
+`7.0` as you might expect. The above statements only build the computation
+graph. These `tf.Tensor` objects just represent the results of the operations
+that will be run.
+
+Each operation in a graph is given a unique name. This name is independent of
+the names the objects are assigned to in Python.
+Tensors are named after the
+operation that produces them, followed by an output index, as in
+`"add:0"` above.
+
+### TensorBoard
+
+TensorFlow provides a utility called TensorBoard. One of TensorBoard's many
+capabilities is visualizing a computation graph. You can easily do this with
+a few simple commands.
+
+First you save the computation graph to a TensorBoard summary file as
+follows:
+
+```
+writer = tf.summary.FileWriter('.')
+writer.add_graph(tf.get_default_graph())
+```
+
+This will produce an `event` file in the current directory with a name in the
+following format:
+
+```
+events.out.tfevents.{timestamp}.{hostname}
+```
+
+Now, in a new terminal, launch TensorBoard with the following shell command:
+
+```bsh
+tensorboard --logdir .
+```
+
+Then open TensorBoard's [graphs page](http://localhost:6006/#graphs) in your
+browser, and you should see a graph similar to the following:
+
+![TensorBoard screenshot](https://www.tensorflow.org/images/getting_started_add.png)
+
+For more about TensorBoard's graph visualization tools see @{$graph_viz}.
+
+### Session
+
+To evaluate tensors, instantiate a @{tf.Session} object, informally known as a
+**session**. A session encapsulates the state of the TensorFlow runtime, and
+runs TensorFlow operations. If a `tf.Graph` is like a `.py` file, a `tf.Session`
+is like the `python` executable.
+
+The following code creates a `tf.Session` object and then invokes its `run`
+method to evaluate the `total` tensor we created above:
+
+```python
+sess = tf.Session()
+print(sess.run(total))
+```
+
+When you request the output of a node with `Session.run`, TensorFlow backtracks
+through the graph and runs all the nodes that provide input to the requested
+output node. So this prints the expected value of 7.0:
+
+```
+7.0
+```
+
+You can pass multiple tensors to `tf.Session.run`. The `run` method
+transparently handles any combination of tuples or dictionaries, as in the
+following example:
+
+```python
+print(sess.run({'ab':(a, b), 'total':total}))
+```
+
+which returns the results in a structure of the same layout:
+
+``` None
+{'total': 7.0, 'ab': (3.0, 4.0)}
+```
+
+During a call to `tf.Session.run` any `tf.Tensor` only has a single value.
+For example, the following code calls `tf.random_uniform` to produce a
+`tf.Tensor` that generates a random 3-element vector (with values in `[0,1)`):
+
+```python
+vec = tf.random_uniform(shape=(3,))
+out1 = vec + 1
+out2 = vec + 2
+print(sess.run(vec))
+print(sess.run(vec))
+print(sess.run((out1, out2)))
+```
+
+The result shows a different random value on each call to `run`, but
+a consistent value during a single `run` (`out1` and `out2` receive the same
+random input):
+
+```
+[ 0.52917576  0.64076328  0.68353939]
+[ 0.66192627  0.89126778  0.06254101]
+(
+  array([ 1.88408756,  1.87149239,  1.84057522], dtype=float32),
+  array([ 2.88408756,  2.87149239,  2.84057522], dtype=float32)
+)
+```
+
+Some TensorFlow functions return `tf.Operations` instead of `tf.Tensors`.
+The result of calling `run` on an Operation is `None`. You run an operation
+to cause a side-effect, not to retrieve a value. Examples of this include the
+[initialization](#initializing_layers) and [training](#training) ops
+demonstrated later.
+
+### Feeding
+
+As it stands, this graph is not especially interesting because it always
+produces a constant result. A graph can be parameterized to accept external
+inputs, known as **placeholders**. A **placeholder** is a promise to provide a
+value later, like a function argument.
+ +```python +x = tf.placeholder(tf.float32) +y = tf.placeholder(tf.float32) +z = x + y +``` + +The preceding three lines are a bit like a function in which we +define two input parameters (`x` and `y`) and then an operation on them. We can +evaluate this graph with multiple inputs by using the `feed_dict` argument of +the @{tf.Session.run$run method} to feed concrete values to the placeholders: + +```python +print(sess.run(z, feed_dict={x: 3, y: 4.5})) +print(sess.run(z, feed_dict={x: [1, 3], y: [2, 4]})) +``` +This results in the following output: + +``` +7.5 +[ 3. 7.] +``` + +Also note that the `feed_dict` argument can be used to overwrite any tensor in +the graph. The only difference between placeholders and other `tf.Tensors` is +that placeholders throw an error if no value is fed to them. + +## Datasets + +Placeholders work for simple experiments, but @{tf.data$Datasets} are the +preferred method of streaming data into a model. + +To get a runnable `tf.Tensor` from a Dataset you must first convert it to a +@{tf.data.Iterator}, and then call the Iterator's +@{tf.data.Iterator.get_next$`get_next`} method. + +The simplest way to create an Iterator is with the +@{tf.data.Dataset.make_one_shot_iterator$`make_one_shot_iterator`} method. +For example, in the following code the `next_item` tensor will return a row from +the `my_data` array on each `run` call: + +``` python +my_data = [ + [0, 1,], + [2, 3,], + [4, 5,], + [6, 7,], +] +slices = tf.data.Dataset.from_tensor_slices(my_data) +next_item = slices.make_one_shot_iterator().get_next() +``` + +Reaching the end of the data stream causes `Dataset` to throw an +@{tf.errors.OutOfRangeError$`OutOfRangeError`}. For example, the following code +reads the `next_item` until there is no more data to read: + +``` python +while True: + try: + print(sess.run(next_item)) + except tf.errors.OutOfRangeError: + break +``` + +If the `Dataset` depends on stateful operations you may need to +initialize the iterator before using it, as shown below: + +``` python +r = tf.random_normal([10,3]) +dataset = tf.data.Dataset.from_tensor_slices(r) +iterator = dataset.make_initializable_iterator() +next_row = iterator.get_next() + +sess.run(iterator.initializer) +while True: + try: + print(sess.run(next_row)) + except tf.errors.OutOfRangeError: + break +``` + +For more details on Datasets and Iterators see: @{$guide/datasets}. + +## Layers + +A trainable model must modify the values in the graph to get new outputs with +the same input. @{tf.layers$Layers} are the preferred way to add trainable +parameters to a graph. + +Layers package together both the variables and the operations that act +on them. For example a +[densely-connected layer](https://developers.google.com/machine-learning/glossary/#fully_connected_layer) +performs a weighted sum across all inputs +for each output and applies an optional +[activation function](https://developers.google.com/machine-learning/glossary/#activation_function). +The connection weights and biases are managed by the layer object. + +### Creating Layers + +The following code creates a @{tf.layers.Dense$`Dense`} layer that takes a +batch of input vectors, and produces a single output value for each. To apply a +layer to an input, call the layer as if it were a function. For example: + +```python +x = tf.placeholder(tf.float32, shape=[None, 3]) +linear_model = tf.layers.Dense(units=1) +y = linear_model(x) +``` + +The layer inspects its input to determine sizes for its internal variables. 
So +here we must set the shape of the `x` placeholder so that the layer can +build a weight matrix of the correct size. + +Now that we have defined the calculation of the output, `y`, there is one more +detail we need to take care of before we run the calculation. + +### Initializing Layers + +The layer contains variables that must be **initialized** before they can be +used. While it is possible to initialize variables individually, you can easily +initialize all the variables in a TensorFlow graph as follows: + +```python +init = tf.global_variables_initializer() +sess.run(init) +``` + +Important: Calling `tf.global_variables_initializer` only +creates and returns a handle to a TensorFlow operation. That op +will initialize all the global variables when we run it with `tf.Session.run`. + +Also note that this `global_variables_initializer` only initializes variables +that existed in the graph when the initializer was created. So the initializer +should be one of the last things added during graph construction. + +### Executing Layers + +Now that the layer is initialized, we can evaluate the `linear_model`'s output +tensor as we would any other tensor. For example, the following code: + +```python +print(sess.run(y, {x: [[1, 2, 3],[4, 5, 6]]})) +``` + +will generate a two-element output vector such as the following: + +``` +[[-3.41378999] + [-9.14999008]] +``` + +### Layer Function shortcuts + +For each layer class (like @{tf.layers.Dense}) TensorFlow also supplies a +shortcut function (like @{tf.layers.dense}). The only difference is that the +shortcut function versions create and run the layer in a single call. For +example, the following code is equivalent to the earlier version: + +```python +x = tf.placeholder(tf.float32, shape=[None, 3]) +y = tf.layers.dense(x, units=1) + +init = tf.global_variables_initializer() +sess.run(init) + +print(sess.run(y, {x: [[1, 2, 3], [4, 5, 6]]})) +``` + +While convenient, this approach allows no access to the @{tf.layers.Layer} +object. This makes introspection and debugging more difficult, +and layer reuse impossible. + +## Feature columns + +The easiest way to experiment with feature columns is using the +@{tf.feature_column.input_layer} function. This function only accepts +@{$feature_columns$dense columns} as inputs, so to view the result +of a categorical column you must wrap it in an +@{tf.feature_column.indicator_column}. For example: + +``` python +features = { + 'sales' : [[5], [10], [8], [9]], + 'department': ['sports', 'sports', 'gardening', 'gardening']} + +department_column = tf.feature_column.categorical_column_with_vocabulary_list( + 'department', ['sports', 'gardening']) +department_column = tf.feature_column.indicator_column(department_column) + +columns = [ + tf.feature_column.numeric_column('sales'), + department_column +] + +inputs = tf.feature_column.input_layer(features, columns) +``` + +Running the `inputs` tensor will parse the `features` into a batch of vectors. + +Feature columns can have internal state, like layers, so they often need to be +initialized. Categorical columns use @{tf.contrib.lookup$lookup tables} +internally and these require a separate initialization op, +@{tf.tables_initializer}. 
+ +``` python +var_init = tf.global_variables_initializer() +table_init = tf.tables_initializer() +sess = tf.Session() +sess.run((var_init, table_init)) +``` + +Once the internal state has been initialized you can run `inputs` like any +other `tf.Tensor`: + +```python +print(sess.run(inputs)) +``` + +This shows how the feature columns have packed the input vectors, with the +one-hot "department" as the first two indices and "sales" as the third. + +```None +[[ 1. 0. 5.] + [ 1. 0. 10.] + [ 0. 1. 8.] + [ 0. 1. 9.]] +``` + +## Training + +Now that you're familiar with the basics of core TensorFlow, let's train a +small regression model manually. + +### Define the data + +First let's define some inputs, `x`, and the expected output for each input, +`y_true`: + +```python +x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32) +y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32) +``` + +### Define the model + +Next, build a simple linear model, with 1 output: + +``` python +linear_model = tf.layers.Dense(units=1) + +y_pred = linear_model(x) +``` + +You can evaluate the predictions as follows: + +``` python +sess = tf.Session() +init = tf.global_variables_initializer() +sess.run(init) + +print(sess.run(y_pred)) +``` + +The model hasn't yet been trained, so the four "predicted" values aren't very +good. Here's what we got; your own output will almost certainly differ: + +``` None +[[ 0.02631879] + [ 0.05263758] + [ 0.07895637] + [ 0.10527515]] +``` + +### Loss + +To optimize a model, you first need to define the loss. We'll use the mean +square error, a standard loss for regression problems. + +While you could do this manually with lower level math operations, +the @{tf.losses} module provides a set of common loss functions. You can use it +to calculate the mean square error as follows: + +``` python +loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred) + +print(sess.run(loss)) +``` +This will produce a loss value, something like: + +``` None +2.23962 +``` + +### Training + +TensorFlow provides +[**optimizers**](https://developers.google.com/machine-learning/glossary/#optimizer) +implementing standard optimization algorithms. These are implemented as +sub-classes of @{tf.train.Optimizer}. They incrementally change each +variable in order to minimize the loss. The simplest optimization algorithm is +[**gradient descent**](https://developers.google.com/machine-learning/glossary/#gradient_descent), +implemented by @{tf.train.GradientDescentOptimizer}. It modifies each +variable according to the magnitude of the derivative of loss with respect to +that variable. For example: + +```python +optimizer = tf.train.GradientDescentOptimizer(0.01) +train = optimizer.minimize(loss) +``` + +This code builds all the graph components necessary for the optimization, and +returns a training operation. When run, the training op will update variables +in the graph. You might run it as follows: + +```python +for i in range(100): + _, loss_value = sess.run((train, loss)) + print(loss_value) +``` + +Since `train` is an op, not a tensor, it doesn't return a value when run. +To see the progression of the loss during training, we run the loss tensor at +the same time, producing output like the following: + +``` None +1.35659 +1.00412 +0.759167 +0.588829 +0.470264 +0.387626 +0.329918 +0.289511 +0.261112 +0.241046 +... 
+```
+
+### Complete program
+
+```python
+import tensorflow as tf
+
+x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)
+y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)
+
+linear_model = tf.layers.Dense(units=1)
+
+y_pred = linear_model(x)
+loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred)
+
+optimizer = tf.train.GradientDescentOptimizer(0.01)
+train = optimizer.minimize(loss)
+
+init = tf.global_variables_initializer()
+
+sess = tf.Session()
+sess.run(init)
+for i in range(100):
+  _, loss_value = sess.run((train, loss))
+  print(loss_value)
+
+print(sess.run(y_pred))
+```
+
+## Next steps
+
+To learn more about building models with TensorFlow consider the following:
+
+* @{$custom_estimators$Custom Estimators}, to learn how to build
+  customized models with TensorFlow. Your knowledge of TensorFlow Core will
+  help you understand and debug your own models.
+
+If you want to learn more about the inner workings of TensorFlow consider the
+following documents, which go into more depth on many of the topics discussed
+here:
+
+* @{$graphs}
+* @{$tensors}
+* @{$variables}
+
+
diff --git a/tensorflow/docs_src/guide/premade_estimators.md b/tensorflow/docs_src/guide/premade_estimators.md
new file mode 100644
index 0000000000..3e910c1fe2
--- /dev/null
+++ b/tensorflow/docs_src/guide/premade_estimators.md
@@ -0,0 +1,430 @@
+# Premade Estimators
+
+This document introduces the TensorFlow programming environment and shows you
+how to solve the Iris classification problem in TensorFlow.
+
+## Prerequisites
+
+Prior to using the sample code in this document, you'll need to do the
+following:
+
+* @{$install$Install TensorFlow}.
+* If you installed TensorFlow with virtualenv or Anaconda, activate your
+  TensorFlow environment.
+* Install or upgrade pandas by issuing the following command:
+
+        pip install pandas
+
+## Getting the sample code
+
+Take the following steps to get the sample code we'll be going through:
+
+1. Clone the TensorFlow Models repository from GitHub by entering the following
+   command:
+
+        git clone https://github.com/tensorflow/models
+
+1. Change directory within the cloned repository to the location containing the
+   examples used in this document:
+
+        cd models/samples/core/get_started/
+
+The program described in this document is
+[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py).
+This program uses
+[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
+to fetch its training data.
+
+### Running the program
+
+You run TensorFlow programs as you would run any Python program. For example:
+
+``` bsh
+python premade_estimator.py
+```
+
+The program should output training logs followed by some predictions against
+the test set. For example, the first line in the following output shows that
+the model thinks there is a 99.6% chance that the first example in the test
+set is a Setosa. Since the test set expected Setosa, this appears to be
+a good prediction.
+
+``` None
+...
+Prediction is "Setosa" (99.6%), expected "Setosa"
+
+Prediction is "Versicolor" (99.8%), expected "Versicolor"
+
+Prediction is "Virginica" (97.9%), expected "Virginica"
+```
+
+If the program generates errors instead of answers, ask yourself the following
+questions:
+
+* Did you install TensorFlow properly?
+* Are you using the correct version of TensorFlow?
+* Did you activate the environment you installed TensorFlow in? (This is
+  only relevant in certain installation mechanisms.)
+ +## The programming stack + +Before getting into the details of the program itself, let's investigate the +programming environment. As the following illustration shows, TensorFlow +provides a programming stack consisting of multiple API layers: + +
+
+*Figure: the TensorFlow programming stack.*
+
+ +We strongly recommend writing TensorFlow programs with the following APIs: + +* @{$guide/estimators$Estimators}, which represent a complete model. + The Estimator API provides methods to train the model, to judge the model's + accuracy, and to generate predictions. +* @{$guide/datasets_for_estimators}, which build a data input + pipeline. The Dataset API has methods to load and manipulate data, and feed + it into your model. The Dataset API meshes well with the Estimators API. + +## Classifying irises: an overview + +The sample program in this document builds and tests a model that +classifies Iris flowers into three different species based on the size of their +[sepals](https://en.wikipedia.org/wiki/Sepal) and +[petals](https://en.wikipedia.org/wiki/Petal). + +
+
+*Figure: Petal geometry compared for three iris species: Iris setosa, Iris virginica, and Iris versicolor.*
+
+ +**From left to right, +[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by +[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0), +[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by +[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0), +and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862) +(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA +2.0).** + +### The data set + +The Iris data set contains four features and one +[label](https://developers.google.com/machine-learning/glossary/#label). +The four features identify the following botanical characteristics of +individual Iris flowers: + +* sepal length +* sepal width +* petal length +* petal width + +Our model will represent these features as `float32` numerical data. + +The label identifies the Iris species, which must be one of the following: + +* Iris setosa (0) +* Iris versicolor (1) +* Iris virginica (2) + +Our model will represent the label as `int32` categorical data. + +The following table shows three examples in the data set: + +|sepal length | sepal width | petal length | petal width| species (label) | +|------------:|------------:|-------------:|-----------:|:---------------:| +| 5.1 | 3.3 | 1.7 | 0.5 | 0 (Setosa) | +| 5.0 | 2.3 | 3.3 | 1.0 | 1 (versicolor)| +| 6.4 | 2.8 | 5.6 | 2.2 | 2 (virginica) | + +### The algorithm + +The program trains a Deep Neural Network classifier model having the following +topology: + +* 2 hidden layers. +* Each hidden layer contains 10 nodes. + +The following figure illustrates the features, hidden layers, and predictions +(not all of the nodes in the hidden layers are shown): + +
+
+*Figure: A diagram of the network architecture: inputs, 2 hidden layers, and outputs.*
+
+ +### Inference + +Running the trained model on an unlabeled example yields three predictions, +namely, the likelihood that this flower is the given Iris species. The sum of +those output predictions will be 1.0. For example, the prediction on an +unlabeled example might be something like the following: + +* 0.03 for Iris Setosa +* 0.95 for Iris Versicolor +* 0.02 for Iris Virginica + +The preceding prediction indicates a 95% probability that the given unlabeled +example is an Iris Versicolor. + +## Overview of programming with Estimators + +An Estimator is TensorFlow's high-level representation of a complete model. It +handles the details of initialization, logging, saving and restoring, and many +other features so you can concentrate on your model. For more details see +@{$guide/estimators}. + +An Estimator is any class derived from @{tf.estimator.Estimator}. TensorFlow +provides a collection of +@{tf.estimator$pre-made Estimators} +(for example, `LinearRegressor`) to implement common ML algorithms. Beyond +those, you may write your own +@{$custom_estimators$custom Estimators}. +We recommend using pre-made Estimators when just getting started. + +To write a TensorFlow program based on pre-made Estimators, you must perform the +following tasks: + +* Create one or more input functions. +* Define the model's feature columns. +* Instantiate an Estimator, specifying the feature columns and various + hyperparameters. +* Call one or more methods on the Estimator object, passing the appropriate + input function as the source of the data. + +Let's see how those tasks are implemented for Iris classification. + +## Create input functions + +You must create input functions to supply data for training, +evaluating, and prediction. + +An **input function** is a function that returns a @{tf.data.Dataset} object +which outputs the following two-element tuple: + +* [`features`](https://developers.google.com/machine-learning/glossary/#feature) - A Python dictionary in which: + * Each key is the name of a feature. + * Each value is an array containing all of that feature's values. +* `label` - An array containing the values of the + [label](https://developers.google.com/machine-learning/glossary/#label) for + every example. + +Just to demonstrate the format of the input function, here's a simple +implementation: + +```python +def input_evaluation_set(): + features = {'SepalLength': np.array([6.4, 5.0]), + 'SepalWidth': np.array([2.8, 2.3]), + 'PetalLength': np.array([5.6, 3.3]), + 'PetalWidth': np.array([2.2, 1.0])} + labels = np.array([2, 1]) + return features, labels +``` + +Your input function may generate the `features` dictionary and `label` list any +way you like. However, we recommend using TensorFlow's Dataset API, which can +parse all sorts of data. At a high level, the Dataset API consists of the +following classes: + +
+
+*Figure: A diagram showing subclasses of the Dataset class.*
+
+ +Where the individual members are: + +* `Dataset` - Base class containing methods to create and transform + datasets. Also allows you to initialize a dataset from data in memory, or from + a Python generator. +* `TextLineDataset` - Reads lines from text files. +* `TFRecordDataset` - Reads records from TFRecord files. +* `FixedLengthRecordDataset` - Reads fixed size records from binary files. +* `Iterator` - Provides a way to access one data set element at a time. + +The Dataset API can handle a lot of common cases for you. For example, +using the Dataset API, you can easily read in records from a large collection +of files in parallel and join them into a single stream. + +To keep things simple in this example we are going to load the data with +[pandas](https://pandas.pydata.org/), and build our input pipeline from this +in-memory data. + +Here is the input function used for training in this program, which is available +in [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py): + +``` python +def train_input_fn(features, labels, batch_size): + """An input function for training""" + # Convert the inputs to a Dataset. + dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) + + # Shuffle, repeat, and batch the examples. + return dataset.shuffle(1000).repeat().batch(batch_size) +``` + +## Define the feature columns + +A [**feature column**](https://developers.google.com/machine-learning/glossary/#feature_columns) +is an object describing how the model should use raw input data from the +features dictionary. When you build an Estimator model, you pass it a list of +feature columns that describes each of the features you want the model to use. +The @{tf.feature_column} module provides many options for representing data +to the model. + +For Iris, the 4 raw features are numeric values, so we'll build a list of +feature columns to tell the Estimator model to represent each of the four +features as 32-bit floating-point values. Therefore, the code to create the +feature column is: + +```python +# Feature columns describe how to use the input. +my_feature_columns = [] +for key in train_x.keys(): + my_feature_columns.append(tf.feature_column.numeric_column(key=key)) +``` + +Feature columns can be far more sophisticated than those we're showing here. We +detail feature columns @{$feature_columns$later on} in our Getting +Started guide. + +Now that we have the description of how we want the model to represent the raw +features, we can build the estimator. + + +## Instantiate an estimator + +The Iris problem is a classic classification problem. Fortunately, TensorFlow +provides several pre-made classifier Estimators, including: + +* @{tf.estimator.DNNClassifier} for deep models that perform multi-class + classification. +* @{tf.estimator.DNNLinearCombinedClassifier} for wide & deep models. +* @{tf.estimator.LinearClassifier} for classifiers based on linear models. + +For the Iris problem, `tf.estimator.DNNClassifier` seems like the best choice. +Here's how we instantiated this Estimator: + +```python +# Build a DNN with 2 hidden layers and 10 nodes in each hidden layer. +classifier = tf.estimator.DNNClassifier( + feature_columns=my_feature_columns, + # Two hidden layers of 10 nodes each. + hidden_units=[10, 10], + # The model must choose between 3 classes. + n_classes=3) +``` + +## Train, Evaluate, and Predict + +Now that we have an Estimator object, we can call methods to do the following: + +* Train the model. 
+* Evaluate the trained model. +* Use the trained model to make predictions. + +### Train the model + +Train the model by calling the Estimator's `train` method as follows: + +```python +# Train the Model. +classifier.train( + input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size), + steps=args.train_steps) +``` + +Here we wrap up our `input_fn` call in a +[`lambda`](https://docs.python.org/3/tutorial/controlflow.html) +to capture the arguments while providing an input function that takes no +arguments, as expected by the Estimator. The `steps` argument tells the method +to stop training after a number of training steps. + +### Evaluate the trained model + +Now that the model has been trained, we can get some statistics on its +performance. The following code block evaluates the accuracy of the trained +model on the test data: + +```python +# Evaluate the model. +eval_result = classifier.evaluate( + input_fn=lambda:iris_data.eval_input_fn(test_x, test_y, args.batch_size)) + +print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result)) +``` + +Unlike our call to the `train` method, we did not pass the `steps` +argument to evaluate. Our `eval_input_fn` only yields a single +[epoch](https://developers.google.com/machine-learning/glossary/#epoch) of data. + +Running this code yields the following output (or something similar): + +```none +Test set accuracy: 0.967 +``` + +### Making predictions (inferring) from the trained model + +We now have a trained model that produces good evaluation results. +We can now use the trained model to predict the species of an Iris flower +based on some unlabeled measurements. As with training and evaluation, we make +predictions using a single function call: + +```python +# Generate predictions from the model +expected = ['Setosa', 'Versicolor', 'Virginica'] +predict_x = { + 'SepalLength': [5.1, 5.9, 6.9], + 'SepalWidth': [3.3, 3.0, 3.1], + 'PetalLength': [1.7, 4.2, 5.4], + 'PetalWidth': [0.5, 1.5, 2.1], +} + +predictions = classifier.predict( + input_fn=lambda:iris_data.eval_input_fn(predict_x, + batch_size=args.batch_size)) +``` + +The `predict` method returns a Python iterable, yielding a dictionary of +prediction results for each example. The following code prints a few +predictions and their probabilities: + + +``` python +template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"') + +for pred_dict, expec in zip(predictions, expected): + class_id = pred_dict['class_ids'][0] + probability = pred_dict['probabilities'][class_id] + + print(template.format(iris_data.SPECIES[class_id], + 100 * probability, expec)) +``` + +Running the preceding code yields the following output: + +``` None +... +Prediction is "Setosa" (99.6%), expected "Setosa" + +Prediction is "Versicolor" (99.8%), expected "Versicolor" + +Prediction is "Virginica" (97.9%), expected "Virginica" +``` + + +## Summary + +Pre-made Estimators are an effective way to quickly create standard models. + +Now that you've gotten started writing TensorFlow programs, consider the +following material: + +* @{$checkpoints$Checkpoints} to learn how to save and restore models. +* @{$guide/datasets_for_estimators} to learn more about importing + data into your model. +* @{$custom_estimators$Creating Custom Estimators} to learn how to + write your own Estimator, customized for a particular problem. 
diff --git a/tensorflow/docs_src/guide/saved_model.md b/tensorflow/docs_src/guide/saved_model.md new file mode 100644 index 0000000000..27ef7bb0da --- /dev/null +++ b/tensorflow/docs_src/guide/saved_model.md @@ -0,0 +1,999 @@ +# Save and Restore + +The @{tf.train.Saver} class provides methods to save and restore models. The +@{tf.saved_model.simple_save} function is an easy way to build a +@{tf.saved_model$saved model} suitable for serving. +[Estimators](@{$guide/estimators}) automatically save and restore +variables in the `model_dir`. + +## Save and restore variables + +TensorFlow @{$variables} are the best way to represent shared, persistent state +manipulated by your program. The `tf.train.Saver` constructor adds `save` and +`restore` ops to the graph for all, or a specified list, of the variables in the +graph. The `Saver` object provides methods to run these ops, specifying paths +for the checkpoint files to write to or read from. + +`Saver` restores all variables already defined in your model. If you're +loading a model without knowing how to build its graph (for example, if you're +writing a generic program to load models), then read the +[Overview of saving and restoring models](#models) section +later in this document. + +TensorFlow saves variables in binary *checkpoint files* that map variable +names to tensor values. + +Caution: TensorFlow model files are code. Be careful with untrusted code. +See [Using TensorFlow Securely](https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md) +for details. + +### Save variables + +Create a `Saver` with `tf.train.Saver()` to manage all variables in the +model. For example, the following snippet demonstrates how to call the +`tf.train.Saver.save` method to save variables to checkpoint files: + +```python +# Create some variables. +v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer) +v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer) + +inc_v1 = v1.assign(v1+1) +dec_v2 = v2.assign(v2-1) + +# Add an op to initialize the variables. +init_op = tf.global_variables_initializer() + +# Add ops to save and restore all the variables. +saver = tf.train.Saver() + +# Later, launch the model, initialize the variables, do some work, and save the +# variables to disk. +with tf.Session() as sess: + sess.run(init_op) + # Do some work with the model. + inc_v1.op.run() + dec_v2.op.run() + # Save the variables to disk. + save_path = saver.save(sess, "/tmp/model.ckpt") + print("Model saved in path: %s" % save_path) +``` + +### Restore variables + +The `tf.train.Saver` object not only saves variables to checkpoint files, it +also restores variables. Note that when you restore variables you do not have +to initialize them beforehand. For example, the following snippet demonstrates +how to call the `tf.train.Saver.restore` method to restore variables from the +checkpoint files: + +```python +tf.reset_default_graph() + +# Create some variables. +v1 = tf.get_variable("v1", shape=[3]) +v2 = tf.get_variable("v2", shape=[5]) + +# Add ops to save and restore all the variables. +saver = tf.train.Saver() + +# Later, launch the model, use the saver to restore variables from disk, and +# do some work with the model. +with tf.Session() as sess: + # Restore variables from disk. + saver.restore(sess, "/tmp/model.ckpt") + print("Model restored.") + # Check the values of the variables + print("v1 : %s" % v1.eval()) + print("v2 : %s" % v2.eval()) +``` + +Note: There is not a physical file called `/tmp/model.ckpt`. 
It is the *prefix* of +filenames created for the checkpoint. Users only interact with the prefix +instead of physical checkpoint files. + +### Choose variables to save and restore + +If you do not pass any arguments to `tf.train.Saver()`, the saver handles all +variables in the graph. Each variable is saved under the name that was passed +when the variable was created. + +It is sometimes useful to explicitly specify names for variables in the +checkpoint files. For example, you may have trained a model with a variable +named `"weights"` whose value you want to restore into a variable named +`"params"`. + +It is also sometimes useful to only save or restore a subset of the variables +used by a model. For example, you may have trained a neural net with five +layers, and you now want to train a new model with six layers that reuses the +existing weights of the five trained layers. You can use the saver to restore +the weights of just the first five layers. + +You can easily specify the names and variables to save or load by passing to the +`tf.train.Saver()` constructor either of the following: + +* A list of variables (which will be stored under their own names). +* A Python dictionary in which keys are the names to use and the values are the +variables to manage. + +Continuing from the save/restore examples shown earlier: + +```python +tf.reset_default_graph() +# Create some variables. +v1 = tf.get_variable("v1", [3], initializer = tf.zeros_initializer) +v2 = tf.get_variable("v2", [5], initializer = tf.zeros_initializer) + +# Add ops to save and restore only `v2` using the name "v2" +saver = tf.train.Saver({"v2": v2}) + +# Use the saver object normally after that. +with tf.Session() as sess: + # Initialize v1 since the saver will not. + v1.initializer.run() + saver.restore(sess, "/tmp/model.ckpt") + + print("v1 : %s" % v1.eval()) + print("v2 : %s" % v2.eval()) +``` + +Notes: + +* You can create as many `Saver` objects as you want if you need to save and + restore different subsets of the model variables. The same variable can be + listed in multiple saver objects; its value is only changed when the + `Saver.restore()` method is run. + +* If you only restore a subset of the model variables at the start of a + session, you have to run an initialize op for the other variables. See + @{tf.variables_initializer} for more information. + +* To inspect the variables in a checkpoint, you can use the + [`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py) + library, particularly the `print_tensors_in_checkpoint_file` function. + +* By default, `Saver` uses the value of the @{tf.Variable.name} property + for each variable. However, when you create a `Saver` object, you may + optionally choose names for the variables in the checkpoint files. + + +### Inspect variables in a checkpoint + +We can quickly inspect variables in a checkpoint with the +[`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py) library. + +Continuing from the save/restore examples shown earlier: + +```python +# import the inspect_checkpoint library +from tensorflow.python.tools import inspect_checkpoint as chkp + +# print all tensors in checkpoint file +chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='', all_tensors=True) + +# tensor_name: v1 +# [ 1. 1. 1.] +# tensor_name: v2 +# [-1. -1. -1. -1. -1.] 
+
+# print only tensor v1 in checkpoint file
+chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='v1', all_tensors=False)
+
+# tensor_name: v1
+# [ 1. 1. 1.]
+
+# print only tensor v2 in checkpoint file
+chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='v2', all_tensors=False)
+
+# tensor_name: v2
+# [-1. -1. -1. -1. -1.]
+```
+
+
+
+## Save and restore models
+
+Use `SavedModel` to save and load your model: variables, the graph, and the
+graph's metadata. This is a language-neutral, recoverable, hermetic
+serialization format that enables higher-level systems and tools to produce,
+consume, and transform TensorFlow models. TensorFlow provides several ways to
+interact with `SavedModel`, including the @{tf.saved_model} APIs,
+@{tf.estimator.Estimator}, and a command-line interface.
+
+
+## Build and load a SavedModel
+
+### Simple save
+
+The easiest way to create a `SavedModel` is to use the
+@{tf.saved_model.simple_save} function:
+
+```python
+tf.saved_model.simple_save(session,
+                           export_dir,
+                           inputs={"x": x, "y": y},
+                           outputs={"z": z})
+```
+
+This configures the `SavedModel` so it can be loaded by
+[TensorFlow serving](/serving/serving_basic) and supports the
+[Predict API](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/predict.proto).
+To access the classify, regress, or multi-inference APIs, use the manual
+`SavedModel` builder APIs or an @{tf.estimator.Estimator}.
+
+### Manually build a SavedModel
+
+If your use case isn't covered by @{tf.saved_model.simple_save}, use the manual
+@{tf.saved_model.builder$builder APIs} to create a `SavedModel`.
+
+The @{tf.saved_model.builder.SavedModelBuilder} class provides functionality to
+save multiple `MetaGraphDef`s. A **MetaGraph** is a dataflow graph, plus
+its associated variables, assets, and signatures. A **`MetaGraphDef`**
+is the protocol buffer representation of a MetaGraph. A **signature** is
+the set of inputs to and outputs from a graph.
+
+If assets need to be saved and written or copied to disk, they can be provided
+when the first `MetaGraphDef` is added. If multiple `MetaGraphDef`s are
+associated with an asset of the same name, only the first version is retained.
+
+Each `MetaGraphDef` added to the SavedModel must be annotated with
+user-specified tags. The tags provide a means to identify the specific
+`MetaGraphDef` to load and restore, along with the shared set of variables
+and assets. These tags typically annotate a `MetaGraphDef` with its
+functionality (for example, serving or training), and optionally with
+hardware-specific aspects (for example, GPU).
+
+For example, the following code illustrates a typical way to use
+`SavedModelBuilder` to build a SavedModel:
+
+```python
+export_dir = ...
+...
+builder = tf.saved_model.builder.SavedModelBuilder(export_dir)
+with tf.Session(graph=tf.Graph()) as sess:
+  ...
+  builder.add_meta_graph_and_variables(sess,
+                                       [tag_constants.TRAINING],
+                                       signature_def_map=foo_signatures,
+                                       assets_collection=foo_assets,
+                                       strip_default_attrs=True)
+...
+# Add a second MetaGraphDef for inference.
+with tf.Session(graph=tf.Graph()) as sess:
+  ...
+  builder.add_meta_graph([tag_constants.SERVING], strip_default_attrs=True)
+...
+builder.save()
+```
+
+
+#### Forward compatibility via `strip_default_attrs=True`
+
+Following the guidance below gives you forward compatibility only if the set of
+Ops has not changed.
+ +The @{tf.saved_model.builder.SavedModelBuilder$`SavedModelBuilder`} class allows +users to control whether default-valued attributes must be stripped from the +@{$extend/tool_developers#nodes$`NodeDefs`} +while adding a meta graph to the SavedModel bundle. Both +@{tf.saved_model.builder.SavedModelBuilder.add_meta_graph_and_variables$`SavedModelBuilder.add_meta_graph_and_variables`} +and @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph$`SavedModelBuilder.add_meta_graph`} +methods accept a Boolean flag `strip_default_attrs` that controls this behavior. + +If `strip_default_attrs` is `False`, the exported @{tf.MetaGraphDef} will have +the default valued attributes in all its @{tf.NodeDef} instances. +This can break forward compatibility with a sequence of events such as the +following: + +* An existing Op (`Foo`) is updated to include a new attribute (`T`) with a + default (`bool`) at version 101. +* A model producer such as a "trainer binary" picks up this change (version 101) + to the `OpDef` and re-exports an existing model that uses Op `Foo`. +* A model consumer (such as [Tensorflow Serving](/serving)) running an older + binary (version 100) doesn't have attribute `T` for Op `Foo`, but tries to + import this model. The model consumer doesn't recognize attribute `T` in a + `NodeDef` that uses Op `Foo` and therefore fails to load the model. +* By setting `strip_default_attrs` to True, the model producers can strip away + any default valued attributes in the `NodeDefs`. This helps ensure that newly + added attributes with defaults don't cause older model consumers to fail + loading models regenerated with newer training binaries. + +See [compatibility guidance](./version_compat.md) +for more information. + +### Loading a SavedModel in Python + +The Python version of the SavedModel +@{tf.saved_model.loader$loader} +provides load and restore capability for a SavedModel. The `load` operation +requires the following information: + +* The session in which to restore the graph definition and variables. +* The tags used to identify the MetaGraphDef to load. +* The location (directory) of the SavedModel. + +Upon a load, the subset of variables, assets, and signatures supplied as part of +the specific MetaGraphDef will be restored into the supplied session. + + +```python +export_dir = ... +... +with tf.Session(graph=tf.Graph()) as sess: + tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir) + ... +``` + + +### Load a SavedModel in C++ + +The C++ version of the SavedModel +[loader](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/loader.h) +provides an API to load a SavedModel from a path, while allowing +`SessionOptions` and `RunOptions`. +You have to specify the tags associated with the graph to be loaded. +The loaded version of SavedModel is referred to as `SavedModelBundle` +and contains the MetaGraphDef and the session within which it is loaded. + +```c++ +const string export_dir = ... +SavedModelBundle bundle; +... +LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain}, + &bundle); +``` + +### Load and serve a SavedModel in TensorFlow serving + +You can easily load and serve a SavedModel with the TensorFlow Serving Model +Server binary. See [instructions](https://www.tensorflow.org/serving/setup#installing_using_apt-get) +on how to install the server, or build it if you wish. 
+
+Once you have the Model Server, run it with:
+
+```
+tensorflow_model_server --port=port-numbers --model_name=your-model-name --model_base_path=your_model_base_path
+```
+
+Set the `--port` and `--model_name` flags to values of your choosing. The
+`--model_base_path` flag expects a base directory, with each version of your
+model residing in a numerically named subdirectory. For example, suppose the
+base directory is `/tmp/model`. If you have only one version of your model,
+store it in `/tmp/model/0001`. If you have two versions of your model, store
+the second version in `/tmp/model/0002`, and so on. Set the
+`--model_base_path` flag to the base directory (`/tmp/model`, in this
+example). TensorFlow Model Server will serve the model in the highest
+numbered subdirectory of that base directory.
+
+### Standard constants
+
+SavedModel offers the flexibility to build and load TensorFlow graphs for a
+variety of use-cases. For the most common use-cases, SavedModel's APIs
+provide a set of constants in Python and C++ that are easy to
+reuse and share across tools consistently.
+
+#### Standard MetaGraphDef tags
+
+You may use sets of tags to uniquely identify a `MetaGraphDef` saved in a
+SavedModel. A subset of commonly used tags is specified in:
+
+* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/tag_constants.py)
+* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h)
+
+
+#### Standard SignatureDef constants
+
+A [**SignatureDef**](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/meta_graph.proto)
+is a protocol buffer that defines the signature of a computation
+supported by a graph.
+Commonly used input keys, output keys, and method names are
+defined in:
+
+* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/signature_constants.py)
+* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/signature_constants.h)
+
+## Using SavedModel with Estimators
+
+After training an `Estimator` model, you may want to create a service
+from that model that takes requests and returns a result. You can run such a
+service locally on your machine or deploy it in the cloud.
+
+To prepare a trained Estimator for serving, you must export it in the standard
+SavedModel format. This section explains how to:
+
+* Specify the output nodes and the corresponding
+  [APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto)
+  that can be served (Classify, Regress, or Predict).
+* Export your model to the SavedModel format.
+* Serve the model from a local server and request predictions.
+
+
+### Prepare serving inputs
+
+During training, an @{$premade_estimators#input_fn$`input_fn()`} ingests data
+and prepares it for use by the model. At serving time, similarly, a
+`serving_input_receiver_fn()` accepts inference requests and prepares them for
+the model. This function has the following purposes:
+
+* To add placeholders to the graph that the serving system will feed
+  with inference requests.
+* To add any additional ops needed to convert data from the input format + into the feature `Tensor`s expected by the model. + +The function returns a @{tf.estimator.export.ServingInputReceiver} object, +which packages the placeholders and the resulting feature `Tensor`s together. + +A typical pattern is that inference requests arrive in the form of serialized +`tf.Example`s, so the `serving_input_receiver_fn()` creates a single string +placeholder to receive them. The `serving_input_receiver_fn()` is then also +responsible for parsing the `tf.Example`s by adding a @{tf.parse_example} op to +the graph. + +When writing such a `serving_input_receiver_fn()`, you must pass a parsing +specification to @{tf.parse_example} to tell the parser what feature names to +expect and how to map them to `Tensor`s. A parsing specification takes the +form of a dict from feature names to @{tf.FixedLenFeature}, @{tf.VarLenFeature}, +and @{tf.SparseFeature}. Note this parsing specification should not include +any label or weight columns, since those will not be available at serving +time—in contrast to a parsing specification used in the `input_fn()` at +training time. + +In combination, then: + +```py +feature_spec = {'foo': tf.FixedLenFeature(...), + 'bar': tf.VarLenFeature(...)} + +def serving_input_receiver_fn(): + """An input receiver that expects a serialized tf.Example.""" + serialized_tf_example = tf.placeholder(dtype=tf.string, + shape=[default_batch_size], + name='input_example_tensor') + receiver_tensors = {'examples': serialized_tf_example} + features = tf.parse_example(serialized_tf_example, feature_spec) + return tf.estimator.export.ServingInputReceiver(features, receiver_tensors) +``` + +The @{tf.estimator.export.build_parsing_serving_input_receiver_fn} utility +function provides that input receiver for the common case. + +> Note: when training a model to be served using the Predict API with a local +> server, the parsing step is not needed because the model will receive raw +> feature data. + +Even if you require no parsing or other input processing—that is, if the +serving system will feed feature `Tensor`s directly—you must still provide +a `serving_input_receiver_fn()` that creates placeholders for the feature +`Tensor`s and passes them through. The +@{tf.estimator.export.build_raw_serving_input_receiver_fn} utility provides for +this. + +If these utilities do not meet your needs, you are free to write your own +`serving_input_receiver_fn()`. One case where this may be needed is if your +training `input_fn()` incorporates some preprocessing logic that must be +recapitulated at serving time. To reduce the risk of training-serving skew, we +recommend encapsulating such processing in a function which is then called +from both `input_fn()` and `serving_input_receiver_fn()`. + +Note that the `serving_input_receiver_fn()` also determines the *input* +portion of the signature. That is, when writing a +`serving_input_receiver_fn()`, you must tell the parser what signatures +to expect and how to map them to your model's expected inputs. +By contrast, the *output* portion of the signature is determined by the model. + + +### Specify the outputs of a custom model + +When writing a custom `model_fn`, you must populate the `export_outputs` element +of the @{tf.estimator.EstimatorSpec} return value. This is a dict of +`{name: output}` describing the output signatures to be exported and used during +serving. 
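+
+As a rough illustration, here is a minimal sketch (not taken from this
+document's sample code) of the prediction branch of a custom `model_fn`
+populating `export_outputs`; the signature name `"predict_output"` and the
+`predictions` dict are hypothetical:
+
+```python
+# Inside a custom model_fn(features, labels, mode, ...):
+# `predictions` is assumed to be a dict of output Tensors computed above.
+export_outputs = {
+    'predict_output': tf.estimator.export.PredictOutput(predictions)
+}
+return tf.estimator.EstimatorSpec(
+    mode=mode,
+    predictions=predictions,
+    export_outputs=export_outputs)
+```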
+
+In the usual case of making a single prediction, this dict contains
+one element, and the `name` is immaterial. In a multi-headed model, each head
+is represented by an entry in this dict. In this case the `name` is a string
+of your choice that can be used to request a specific head at serving time.
+
+Each `output` value must be an `ExportOutput` object such as
+@{tf.estimator.export.ClassificationOutput},
+@{tf.estimator.export.RegressionOutput}, or
+@{tf.estimator.export.PredictOutput}.
+
+These output types map straightforwardly to the
+[TensorFlow Serving APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto),
+and so determine which request types will be honored.
+
+Note: In the multi-headed case, a `SignatureDef` will be generated for each
+element of the `export_outputs` dict returned from the model_fn, named using
+the same keys. These `SignatureDef`s differ only in their outputs, as
+provided by the corresponding `ExportOutput` entry. The inputs are always
+those provided by the `serving_input_receiver_fn`.
+An inference request may specify the head by name. One head must be named
+using [`signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`](https://www.tensorflow.org/code/tensorflow/python/saved_model/signature_constants.py),
+indicating which `SignatureDef` will be served when an inference request
+does not specify one.
+
+
+### Perform the export
+
+To export your trained Estimator, call
+@{tf.estimator.Estimator.export_savedmodel} with the export base path and
+the `serving_input_receiver_fn`.
+
+```py
+estimator.export_savedmodel(export_dir_base, serving_input_receiver_fn,
+                            strip_default_attrs=True)
+```
+
+This method builds a new graph by first calling the
+`serving_input_receiver_fn()` to obtain feature `Tensor`s, and then calling
+this `Estimator`'s `model_fn()` to generate the model graph based on those
+features. It starts a fresh `Session`, and, by default, restores the most recent
+checkpoint into it. (A different checkpoint may be passed, if needed.)
+Finally it creates a time-stamped export directory below the given
+`export_dir_base` (i.e., `export_dir_base/<timestamp>`), and writes a
+SavedModel into it containing a single `MetaGraphDef` saved from this
+Session.
+
+> Note: It is your responsibility to garbage-collect old exports.
+> Otherwise, successive exports will accumulate under `export_dir_base`.
+
+### Serve the exported model locally
+
+For local deployment, you can serve your model using
+[TensorFlow Serving](https://github.com/tensorflow/serving), an open-source project that loads a
+SavedModel and exposes it as a [gRPC](https://www.grpc.io/) service.
+
+First, [install TensorFlow Serving](https://github.com/tensorflow/serving).
+
+Then build and run the local model server, substituting `$export_dir_base` with
+the path to the SavedModel you exported above:
+
+```sh
+bazel build //tensorflow_serving/model_servers:tensorflow_model_server
+bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_base_path=$export_dir_base
+```
+
+Now you have a server listening for inference requests via gRPC on port 9000!
+
+
+### Request predictions from a local server
+
+The server responds to gRPC requests according to the
+[PredictionService](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto#L15)
+gRPC API service definition.
(The nested protocol buffers are defined in +various [neighboring files](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis)). + +From the API service definition, the gRPC framework generates client libraries +in various languages providing remote access to the API. In a project using the +Bazel build tool, these libraries are built automatically and provided via +dependencies like these (using Python for example): + +```build + deps = [ + "//tensorflow_serving/apis:classification_proto_py_pb2", + "//tensorflow_serving/apis:regression_proto_py_pb2", + "//tensorflow_serving/apis:predict_proto_py_pb2", + "//tensorflow_serving/apis:prediction_service_proto_py_pb2" + ] +``` + +Python client code can then import the libraries thus: + +```py +from tensorflow_serving.apis import classification_pb2 +from tensorflow_serving.apis import regression_pb2 +from tensorflow_serving.apis import predict_pb2 +from tensorflow_serving.apis import prediction_service_pb2 +``` + +> Note: `prediction_service_pb2` defines the service as a whole and so +> is always required. However a typical client will need only one of +> `classification_pb2`, `regression_pb2`, and `predict_pb2`, depending on the +> type of requests being made. + +Sending a gRPC request is then accomplished by assembling a protocol buffer +containing the request data and passing it to the service stub. Note how the +request protocol buffer is created empty and then populated via the +[generated protocol buffer API](https://developers.google.com/protocol-buffers/docs/reference/python-generated). + +```py +from grpc.beta import implementations + +channel = implementations.insecure_channel(host, int(port)) +stub = prediction_service_pb2.beta_create_PredictionService_stub(channel) + +request = classification_pb2.ClassificationRequest() +example = request.input.example_list.examples.add() +example.features.feature['x'].float_list.value.extend(image[0].astype(float)) + +result = stub.Classify(request, 10.0) # 10 secs timeout +``` + +The returned result in this example is a `ClassificationResponse` protocol +buffer. + +This is a skeletal example; please see the @{$deploy$Tensorflow Serving} +documentation and [examples](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/example) +for more details. + +> Note: `ClassificationRequest` and `RegressionRequest` contain a +> `tensorflow.serving.Input` protocol buffer, which in turn contains a list of +> `tensorflow.Example` protocol buffers. `PredictRequest`, by contrast, +> contains a mapping from feature names to values encoded via `TensorProto`. +> Correspondingly: When using the `Classify` and `Regress` APIs, TensorFlow +> Serving feeds serialized `tf.Example`s to the graph, so your +> `serving_input_receiver_fn()` should include a `tf.parse_example()` Op. +> When using the generic `Predict` API, however, TensorFlow Serving feeds raw +> feature data to the graph, so a pass through `serving_input_receiver_fn()` +> should be used. + + + + + + + + + +## CLI to inspect and execute SavedModel + +You can use the SavedModel Command Line Interface (CLI) to inspect and +execute a SavedModel. +For example, you can use the CLI to inspect the model's `SignatureDef`s. +The CLI enables you to quickly confirm that the input +@{$tensors$Tensor dtype and shape} match the model. Moreover, if you +want to test your model, you can use the CLI to do a sanity check by +passing in sample inputs in various formats (for example, Python +expressions) and then fetching the output. 
+
+
+### Install the SavedModel CLI
+
+Broadly speaking, you can install TensorFlow in either of the following
+two ways:
+
+* By installing a pre-built TensorFlow binary.
+* By building TensorFlow from source code.
+
+If you installed TensorFlow through a pre-built TensorFlow binary,
+then the SavedModel CLI is already installed on your system
+at pathname `bin/saved_model_cli`.
+
+If you built TensorFlow from source code, you must run the following
+additional command to build `saved_model_cli`:
+
+```
+$ bazel build tensorflow/python/tools:saved_model_cli
+```
+
+### Overview of commands
+
+The SavedModel CLI supports the following two commands on a
+`MetaGraphDef` in a SavedModel:
+
+* `show`, which shows the computations available in a SavedModel (its
+  tag-sets, `SignatureDef`s, and their inputs and outputs).
+* `run`, which runs a computation on a `MetaGraphDef`.
+
+
+### `show` command
+
+A SavedModel contains one or more `MetaGraphDef`s, identified by their tag-sets.
+To serve a model, you
+might wonder what kind of `SignatureDef`s are in each model, and what their
+inputs and outputs are. The `show` command lets you examine the contents of the
+SavedModel in hierarchical order. Here's the syntax:
+
+```
+usage: saved_model_cli show [-h] --dir DIR [--all]
+[--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY]
+```
+
+For example, the following command shows all available
+MetaGraphDef tag-sets in the SavedModel:
+
+```
+$ saved_model_cli show --dir /tmp/saved_model_dir
+The given SavedModel contains the following tag-sets:
+serve
+serve, gpu
+```
+
+The following command shows all available `SignatureDef` keys in
+a `MetaGraphDef`:
+
+```
+$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve
+The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the
+following keys:
+SignatureDef key: "classify_x2_to_y3"
+SignatureDef key: "classify_x_to_y"
+SignatureDef key: "regress_x2_to_y3"
+SignatureDef key: "regress_x_to_y"
+SignatureDef key: "regress_x_to_y2"
+SignatureDef key: "serving_default"
+```
+
+If a `MetaGraphDef` has *multiple* tags in the tag-set, you must specify
+all tags, each tag separated by a comma. For example:
+
+```none
+$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu
+```
+
+To show the `TensorInfo` for all inputs and outputs of a specific
+`SignatureDef`, pass the `SignatureDef` key to the `--signature_def` option.
+This is very useful when you want to know the tensor key, dtype, and shape of
+the input tensors before executing the computation graph. For example:
+
+```
+$ saved_model_cli show --dir \
+/tmp/saved_model_dir --tag_set serve --signature_def serving_default
+The given SavedModel SignatureDef contains the following input(s):
+  inputs['x'] tensor_info:
+      dtype: DT_FLOAT
+      shape: (-1, 1)
+      name: x:0
+The given SavedModel SignatureDef contains the following output(s):
+  outputs['y'] tensor_info:
+      dtype: DT_FLOAT
+      shape: (-1, 1)
+      name: y:0
+Method name is: tensorflow/serving/predict
+```
+
+To show all available information in the SavedModel, use the `--all` option.
+
+For example:
+
+```none
+$ saved_model_cli show --dir /tmp/saved_model_dir --all
+MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
+
+signature_def['classify_x2_to_y3']:
+  The given SavedModel SignatureDef contains the following input(s):
+    inputs['inputs'] tensor_info:
+        dtype: DT_FLOAT
+        shape: (-1, 1)
+        name: x2:0
+  The given SavedModel SignatureDef contains the following output(s):
+    outputs['scores'] tensor_info:
+        dtype: DT_FLOAT
+        shape: (-1, 1)
+        name: y3:0
+  Method name is: tensorflow/serving/classify
+
+...
+
+signature_def['serving_default']:
+  The given SavedModel SignatureDef contains the following input(s):
+    inputs['x'] tensor_info:
+        dtype: DT_FLOAT
+        shape: (-1, 1)
+        name: x:0
+  The given SavedModel SignatureDef contains the following output(s):
+    outputs['y'] tensor_info:
+        dtype: DT_FLOAT
+        shape: (-1, 1)
+        name: y:0
+  Method name is: tensorflow/serving/predict
+```
+
+
+### `run` command
+
+Invoke the `run` command to run a graph computation, passing
+inputs and then displaying (and optionally saving) the outputs.
+Here's the syntax:
+
+```
+usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
+                           SIGNATURE_DEF_KEY [--inputs INPUTS]
+                           [--input_exprs INPUT_EXPRS] [--outdir OUTDIR]
+                           [--overwrite] [--tf_debug]
+```
+
+The `run` command provides the following three ways to pass inputs to the model:
+
+* The `--inputs` option enables you to pass numpy ndarrays stored in files.
+* The `--input_exprs` option enables you to pass Python expressions.
+* The `--input_examples` option enables you to pass `tf.train.Example` protos.
+
+
+#### `--inputs`
+
+To pass input data in files, specify the `--inputs` option, which takes the
+following general format:
+
+```bsh
+--inputs <INPUTS>
+```
+
+where *INPUTS* is either of the following formats:
+
+* `<input_key>=<filename>`
+* `<input_key>=<filename>[<variable_name>]`
+
+You may pass multiple *INPUTS*. If you do pass multiple inputs, use a semicolon
+to separate each of the *INPUTS*.
+
+`saved_model_cli` uses `numpy.load` to load the *filename*.
+The *filename* may be in any of the following formats:
+
+* `.npy`
+* `.npz`
+* pickle format
+
+A `.npy` file always contains a numpy ndarray. Therefore, when loading from
+a `.npy` file, the content will be directly assigned to the specified input
+tensor. If you specify a *variable_name* with that `.npy` file, the
+*variable_name* will be ignored and a warning will be issued.
+
+When loading from a `.npz` (zip) file, you may optionally specify a
+*variable_name* to identify the variable within the zip file to load for
+the input tensor key. If you don't specify a *variable_name*, the SavedModel
+CLI will check that only one file is included in the zip file and load it
+for the specified input tensor key.
+
+When loading from a pickle file, if no *variable_name* is specified in the
+square brackets, whatever is inside the pickle file will be passed to the
+specified input tensor key. Otherwise, the SavedModel CLI will assume a
+dictionary is stored in the pickle file and the value corresponding to
+the *variable_name* will be used.
+
+
+#### `--input_exprs`
+
+To pass inputs through Python expressions, specify the `--input_exprs` option.
+This can be useful when you don't have data
+files lying around, but still want to sanity check the model with some simple
+inputs that match the dtype and shape of the model's `SignatureDef`s.
+For example:
+
+```bsh
+`<input_key>=[[1],[2],[3]]`
+```
+
+In addition to Python expressions, you may also pass numpy functions.
For
+example:
+
+```bsh
+`<input_key>=np.ones((32,32,3))`
+```
+
+(Note that the `numpy` module is already available to you as `np`.)
+
+
+#### `--input_examples`
+
+To pass `tf.train.Example` as inputs, specify the `--input_examples` option.
+For each input key, it takes a list of dictionaries, where each dictionary is an
+instance of `tf.train.Example`. The dictionary keys are the features and the
+values are the value lists for each feature.
+For example:
+
+```bsh
+`<input_key>=[{"age":[22,24],"education":["BS","MS"]}]`
+```
+
+#### Save output
+
+By default, the SavedModel CLI writes output to stdout. If a directory is
+passed to the `--outdir` option, the outputs will be saved as npy files named
+after output tensor keys under the given directory.
+
+Use `--overwrite` to overwrite existing output files.
+
+
+#### TensorFlow debugger (tfdbg) integration
+
+If the `--tf_debug` option is set, the SavedModel CLI will use the
+TensorFlow Debugger (tfdbg) to watch the intermediate Tensors and runtime
+graphs or subgraphs while running the SavedModel.
+
+
+#### Full examples of `run`
+
+Given:
+
+* Your model simply adds `x1` and `x2` to get output `y`.
+* All tensors in the model have shape `(-1, 1)`.
+* You have two `npy` files:
+  * `/tmp/my_data1.npy`, which contains a numpy ndarray `[[1], [2], [3]]`.
+  * `/tmp/my_data2.npy`, which contains another numpy
+    ndarray `[[0.5], [0.5], [0.5]]`.
+
+To run these two `npy` files through the model to get output `y`, issue
+the following command:
+
+```
+$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
+--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npy;x2=/tmp/my_data2.npy \
+--outdir /tmp/out
+Result for output key y:
+[[ 1.5]
+ [ 2.5]
+ [ 3.5]]
+```
+
+Let's change the preceding example slightly. This time, instead of two
+`.npy` files, you now have an `.npz` file and a pickle file. Furthermore,
+you want to overwrite any existing output file. Here's the command:
+
+```
+$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
+--signature_def x1_x2_to_y \
+--inputs x1=/tmp/my_data1.npz[x];x2=/tmp/my_data2.pkl --outdir /tmp/out \
+--overwrite
+Result for output key y:
+[[ 1.5]
+ [ 2.5]
+ [ 3.5]]
+```
+
+You may specify a Python expression instead of an input file. For example,
+the following command replaces input `x2` with a Python expression:
+
+```
+$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
+--signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npz[x] \
+--input_exprs 'x2=np.ones((3,1))'
+Result for output key y:
+[[ 2]
+ [ 3]
+ [ 4]]
+```
+
+To run the model with the TensorFlow Debugger on, issue the
+following command:
+
+```
+$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
+--signature_def serving_default --inputs x=/tmp/data.npz[x] --tf_debug
+```
+
+
+
+## Structure of a SavedModel directory
+
+When you save a model in SavedModel format, TensorFlow creates
+a SavedModel directory consisting of the following subdirectories
+and files:
+
+```bsh
+assets/
+assets.extra/
+variables/
+    variables.data-?????-of-?????
+    variables.index
+saved_model.pb|saved_model.pbtxt
+```
+
+where:
+
+* `assets` is a subfolder containing auxiliary (external) files,
+  such as vocabularies. Assets are copied to the SavedModel location
+  and can be read when loading a specific `MetaGraphDef`.
+* `assets.extra` is a subfolder where higher-level libraries and users can
+  add their own assets that co-exist with the model, but are not loaded by
+  the graph. This subfolder is not managed by the SavedModel libraries.
+* `variables` is a subfolder that includes output from + `tf.train.Saver`. +* `saved_model.pb` or `saved_model.pbtxt` is the SavedModel protocol buffer. + It includes the graph definitions as `MetaGraphDef` protocol buffers. + +A single SavedModel can represent multiple graphs. In this case, all the +graphs in the SavedModel share a *single* set of checkpoints (variables) +and assets. For example, the following diagram shows one SavedModel +containing three `MetaGraphDef`s, all three of which share the same set +of checkpoints and assets: + +![SavedModel represents checkpoints, assets, and one or more MetaGraphDefs](../images/SavedModel.svg) + +Each graph is associated with a specific set of tags, which enables +identification during a load or restore operation. diff --git a/tensorflow/docs_src/guide/summaries_and_tensorboard.md b/tensorflow/docs_src/guide/summaries_and_tensorboard.md new file mode 100644 index 0000000000..fadfa03e78 --- /dev/null +++ b/tensorflow/docs_src/guide/summaries_and_tensorboard.md @@ -0,0 +1,225 @@ +# TensorBoard: Visualizing Learning + +The computations you'll use TensorFlow for - like training a massive +deep neural network - can be complex and confusing. To make it easier to +understand, debug, and optimize TensorFlow programs, we've included a suite of +visualization tools called TensorBoard. You can use TensorBoard to visualize +your TensorFlow graph, plot quantitative metrics about the execution of your +graph, and show additional data like images that pass through it. When +TensorBoard is fully configured, it looks like this: + +![MNIST TensorBoard](https://www.tensorflow.org/images/mnist_tensorboard.png "MNIST TensorBoard") + +
+ +This 30-minute tutorial is intended to get you started with simple TensorBoard +usage. It assumes a basic understanding of TensorFlow. + +There are other resources available as well! The [TensorBoard GitHub](https://github.com/tensorflow/tensorboard) +has a lot more information on using individual dashboards within TensorBoard +including tips & tricks and debugging information. + +## Setup + +[Install TensorFlow](https://www.tensorflow.org/install/). Installing TensorFlow +via pip should also automatically install TensorBoard. + +## Serializing the data + +TensorBoard operates by reading TensorFlow events files, which contain summary +data that you can generate when running TensorFlow. Here's the general +lifecycle for summary data within TensorBoard. + +First, create the TensorFlow graph that you'd like to collect summary +data from, and decide which nodes you would like to annotate with +@{$python/summary$summary operations}. + +For example, suppose you are training a convolutional neural network for +recognizing MNIST digits. You'd like to record how the learning rate +varies over time, and how the objective function is changing. Collect these by +attaching @{tf.summary.scalar} ops +to the nodes that output the learning rate and loss respectively. Then, give +each `scalar_summary` a meaningful `tag`, like `'learning rate'` or `'loss +function'`. + +Perhaps you'd also like to visualize the distributions of activations coming +off a particular layer, or the distribution of gradients or weights. Collect +this data by attaching +@{tf.summary.histogram} ops to +the gradient outputs and to the variable that holds your weights, respectively. + +For details on all of the summary operations available, check out the docs on +@{$python/summary$summary operations}. + +Operations in TensorFlow don't do anything until you run them, or an op that +depends on their output. And the summary nodes that we've just created are +peripheral to your graph: none of the ops you are currently running depend on +them. So, to generate summaries, we need to run all of these summary nodes. +Managing them by hand would be tedious, so use +@{tf.summary.merge_all} +to combine them into a single op that generates all the summary data. + +Then, you can just run the merged summary op, which will generate a serialized +`Summary` protobuf object with all of your summary data at a given step. +Finally, to write this summary data to disk, pass the summary protobuf to a +@{tf.summary.FileWriter}. + +The `FileWriter` takes a logdir in its constructor - this logdir is quite +important, it's the directory where all of the events will be written out. +Also, the `FileWriter` can optionally take a `Graph` in its constructor. +If it receives a `Graph` object, then TensorBoard will visualize your graph +along with tensor shape information. This will give you a much better sense of +what flows through the graph: see +@{$graph_viz#tensor-shape-information$Tensor shape information}. + +Now that you've modified your graph and have a `FileWriter`, you're ready to +start running your network! If you want, you could run the merged summary op +every single step, and record a ton of training data. That's likely to be more +data than you need, though. Instead, consider running the merged summary op +every `n` steps. 
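+
+As a minimal sketch of that lifecycle (the toy graph and logdir here are
+illustrative, not part of the MNIST example below), the pieces fit together
+like this:
+
+```python
+import tensorflow as tf
+
+# A toy graph with a single scalar summary attached.
+loss = tf.reduce_mean(tf.square(tf.random_normal([100])))
+tf.summary.scalar('loss', loss)
+
+# Merge every summary node in the graph into one op.
+merged = tf.summary.merge_all()
+
+with tf.Session() as sess:
+  # The FileWriter serializes Summary protobufs into the logdir.
+  writer = tf.summary.FileWriter('/tmp/summary_demo', sess.graph)
+  for step in range(100):
+    summ = sess.run(merged)
+    writer.add_summary(summ, global_step=step)
+  writer.close()
+```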
+ +The code example below is a modification of the +[simple MNIST tutorial](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist.py), +in which we have added some summary ops, and run them every ten steps. If you +run this and then launch `tensorboard --logdir=/tmp/tensorflow/mnist`, you'll be able +to visualize statistics, such as how the weights or accuracy varied during +training. The code below is an excerpt; full source is +[here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py). + +```python +def variable_summaries(var): + """Attach a lot of summaries to a Tensor (for TensorBoard visualization).""" + with tf.name_scope('summaries'): + mean = tf.reduce_mean(var) + tf.summary.scalar('mean', mean) + with tf.name_scope('stddev'): + stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean))) + tf.summary.scalar('stddev', stddev) + tf.summary.scalar('max', tf.reduce_max(var)) + tf.summary.scalar('min', tf.reduce_min(var)) + tf.summary.histogram('histogram', var) + +def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu): + """Reusable code for making a simple neural net layer. + + It does a matrix multiply, bias add, and then uses relu to nonlinearize. + It also sets up name scoping so that the resultant graph is easy to read, + and adds a number of summary ops. + """ + # Adding a name scope ensures logical grouping of the layers in the graph. + with tf.name_scope(layer_name): + # This Variable will hold the state of the weights for the layer + with tf.name_scope('weights'): + weights = weight_variable([input_dim, output_dim]) + variable_summaries(weights) + with tf.name_scope('biases'): + biases = bias_variable([output_dim]) + variable_summaries(biases) + with tf.name_scope('Wx_plus_b'): + preactivate = tf.matmul(input_tensor, weights) + biases + tf.summary.histogram('pre_activations', preactivate) + activations = act(preactivate, name='activation') + tf.summary.histogram('activations', activations) + return activations + +hidden1 = nn_layer(x, 784, 500, 'layer1') + +with tf.name_scope('dropout'): + keep_prob = tf.placeholder(tf.float32) + tf.summary.scalar('dropout_keep_probability', keep_prob) + dropped = tf.nn.dropout(hidden1, keep_prob) + +# Do not apply softmax activation yet, see below. +y = nn_layer(dropped, 500, 10, 'layer2', act=tf.identity) + +with tf.name_scope('cross_entropy'): + # The raw formulation of cross-entropy, + # + # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)), + # reduction_indices=[1])) + # + # can be numerically unstable. + # + # So here we use tf.losses.sparse_softmax_cross_entropy on the + # raw logit outputs of the nn_layer above. 
+  with tf.name_scope('total'):
+    cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
+tf.summary.scalar('cross_entropy', cross_entropy)
+
+with tf.name_scope('train'):
+  train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(
+      cross_entropy)
+
+with tf.name_scope('accuracy'):
+  with tf.name_scope('correct_prediction'):
+    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
+  with tf.name_scope('accuracy'):
+    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
+tf.summary.scalar('accuracy', accuracy)
+
+# Merge all the summaries and write them out to /tmp/mnist_logs (by default)
+merged = tf.summary.merge_all()
+train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
+                                     sess.graph)
+test_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/test')
+tf.global_variables_initializer().run()
+```
+
+After we've initialized the `FileWriter`s, we have to add summaries to the
+`FileWriter`s as we train and test the model.
+
+```python
+# Train the model, and also write summaries.
+# Every 10th step, measure test-set accuracy, and write test summaries.
+# All other steps, run train_step on training data, and add training summaries.
+
+def feed_dict(train):
+  """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
+  if train or FLAGS.fake_data:
+    xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
+    k = FLAGS.dropout
+  else:
+    xs, ys = mnist.test.images, mnist.test.labels
+    k = 1.0
+  return {x: xs, y_: ys, keep_prob: k}
+
+for i in range(FLAGS.max_steps):
+  if i % 10 == 0:  # Record summaries and test-set accuracy
+    summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
+    test_writer.add_summary(summary, i)
+    print('Accuracy at step %s: %s' % (i, acc))
+  else:  # Record train set summaries, and train
+    summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
+    train_writer.add_summary(summary, i)
+```
+
+You're now all set to visualize this data using TensorBoard.
+
+
+## Launching TensorBoard
+
+To run TensorBoard, use the following command (alternatively `python -m
+tensorboard.main`):
+
+```bash
+tensorboard --logdir=path/to/log-directory
+```
+
+where `logdir` points to the directory where the `FileWriter` serialized its
+data. If this `logdir` directory contains subdirectories which contain
+serialized data from separate runs, then TensorBoard will visualize the data
+from all of those runs. Once TensorBoard is running, navigate your web browser
+to `localhost:6006` to view TensorBoard.
+
+When looking at TensorBoard, you will see the navigation tabs in the top right
+corner. Each tab represents a set of serialized data that can be visualized.
+
+For in-depth information on how to use the *graph* tab to visualize your graph,
+see @{$graph_viz$TensorBoard: Graph Visualization}.
+
+For more usage information on TensorBoard in general, see the
+[TensorBoard GitHub](https://github.com/tensorflow/tensorboard).
diff --git a/tensorflow/docs_src/guide/tensorboard_histograms.md b/tensorflow/docs_src/guide/tensorboard_histograms.md
new file mode 100644
index 0000000000..918deda190
--- /dev/null
+++ b/tensorflow/docs_src/guide/tensorboard_histograms.md
@@ -0,0 +1,245 @@
+# TensorBoard Histogram Dashboard
+
+The TensorBoard Histogram Dashboard displays how the distribution of some
+`Tensor` in your TensorFlow graph has changed over time. It does this by
+showing many histogram visualizations of your tensor at different points in
+time.
+
+## A Basic Example
+
+Let's start with a simple case: a normally-distributed variable, where the mean
+shifts over time.
+TensorFlow has an op
+[`tf.random_normal`](https://www.tensorflow.org/api_docs/python/tf/random_normal)
+which is perfect for this purpose. As is usually the case with TensorBoard, we
+will ingest data using a summary op; in this case,
+[`tf.summary.histogram`](https://www.tensorflow.org/api_docs/python/tf/summary/histogram).
+For a primer on how summaries work, please see the general
+[TensorBoard tutorial](https://www.tensorflow.org/get_started/summaries_and_tensorboard).
+
+Here is a code snippet that will generate some histogram summaries containing
+normally distributed data, where the mean of the distribution increases over
+time.
+
+```python
+import tensorflow as tf
+
+k = tf.placeholder(tf.float32)
+
+# Make a normal distribution, with a shifting mean
+mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
+# Record that distribution into a histogram summary
+tf.summary.histogram("normal/moving_mean", mean_moving_normal)
+
+# Setup a session and summary writer
+sess = tf.Session()
+writer = tf.summary.FileWriter("/tmp/histogram_example")
+
+summaries = tf.summary.merge_all()
+
+# Setup a loop and write the summaries to disk
+N = 400
+for step in range(N):
+  k_val = step/float(N)
+  summ = sess.run(summaries, feed_dict={k: k_val})
+  writer.add_summary(summ, global_step=step)
+```
+
+Once that code runs, we can load the data into TensorBoard via the command line:
+
+```sh
+tensorboard --logdir=/tmp/histogram_example
+```
+
+Once TensorBoard is running, load it in Chrome or Firefox and navigate to the
+Histogram Dashboard. Then we can see a histogram visualization for our normally
+distributed data.
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/1_moving_mean.png)
+
+`tf.summary.histogram` takes an arbitrarily sized and shaped Tensor, and
+compresses it into a histogram data structure consisting of many bins with
+widths and counts. For example, let's say we want to organize the numbers
+`[0.5, 1.1, 1.3, 2.2, 2.9, 2.99]` into bins. We could make three bins:
+
+* a bin containing everything from 0 to 1 (it would contain one element, 0.5),
+* a bin containing everything from 1 to 2 (it would contain two elements, 1.1
+  and 1.3),
+* a bin containing everything from 2 to 3 (it would contain three elements:
+  2.2, 2.9 and 2.99).
+
+TensorFlow uses a similar approach to create bins, but unlike in our example, it
+doesn't create integer bins. For large, sparse datasets, that might result in
+many thousands of bins.
+Instead, [the bins are exponentially distributed, with many bins close to 0 and
+comparatively few bins for very large numbers](https://github.com/tensorflow/tensorflow/blob/c8b59c046895fa5b6d79f73e0b5817330fcfbfc1/tensorflow/core/lib/histogram/histogram.cc#L28).
+However, visualizing exponentially-distributed bins is tricky; if height is used
+to encode count, then wider bins take more space, even if they have the same
+number of elements. Conversely, encoding count in the area makes height
+comparisons impossible. Instead, the histograms [resample the data](https://github.com/tensorflow/tensorflow/blob/17c47804b86e340203d451125a721310033710f1/tensorflow/tensorboard/components/tf_backend/backend.ts#L400)
+into uniform bins. This can lead to unfortunate artifacts in some cases.
+
+Each slice in the histogram visualizer displays a single histogram.
+The slices are organized by step;
+older slices (e.g. step 0) are further "back" and darker, while newer slices
+(e.g. step 400) are close to the foreground, and lighter in color.
+The y-axis on the right shows the step number.
+
+You can mouse over the histogram to see tooltips with some more detailed
+information. For example, in the following image we can see that the histogram
+at timestep 176 has a bin centered at 2.25 with 177 elements in that bin.
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/2_moving_mean_tooltip.png)
+
+Also, you may note that the histogram slices are not always evenly spaced in
+step count or time. This is because TensorBoard uses
+[reservoir sampling](https://en.wikipedia.org/wiki/Reservoir_sampling) to keep a
+subset of all the histograms, to save on memory. Reservoir sampling guarantees
+that every sample has an equal likelihood of being included, but because it is
+a randomized algorithm, the samples chosen don't occur at even steps.
+
+## Overlay Mode
+
+There is a control on the left of the dashboard that allows you to toggle the
+histogram mode from "offset" to "overlay":
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/3_overlay_offset.png)
+
+In "overlay" mode, the visualization rotates 45 degrees, so that the individual
+histogram slices are no longer spread out in time, but instead are all plotted
+on the same y-axis.
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/4_overlay.png)
+
+Now, each slice is a separate line on the chart, and the y-axis shows the item
+count within each bucket. Darker lines are older, earlier steps, and lighter
+lines are more recent, later steps. Once again, you can mouse over the chart to
+see some additional information.
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/5_overlay_tooltips.png)
+
+In general, the overlay visualization is useful if you want to directly compare
+the counts of different histograms.
+
+## Multimodal Distributions
+
+The Histogram Dashboard is great for visualizing multimodal
+distributions. Let's construct a simple bimodal distribution by concatenating
+the outputs from two different normal distributions. The code will look like
+this:
+
+```python
+import tensorflow as tf
+
+k = tf.placeholder(tf.float32)
+
+# Make a normal distribution, with a shifting mean
+mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
+# Record that distribution into a histogram summary
+tf.summary.histogram("normal/moving_mean", mean_moving_normal)
+
+# Make a normal distribution with shrinking variance
+variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k))
+# Record that distribution too
+tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal)
+
+# Let's combine both of those distributions into one dataset
+normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0)
+# We add another histogram summary to record the combined distribution
+tf.summary.histogram("normal/bimodal", normal_combined)
+
+summaries = tf.summary.merge_all()
+
+# Setup a session and summary writer
+sess = tf.Session()
+writer = tf.summary.FileWriter("/tmp/histogram_example")
+
+# Setup a loop and write the summaries to disk
+N = 400
+for step in range(N):
+  k_val = step/float(N)
+  summ = sess.run(summaries, feed_dict={k: k_val})
+  writer.add_summary(summ, global_step=step)
+```
+
+You may remember the "moving mean" normal distribution from the example
+above. Now we also have a "shrinking variance" distribution.
+Side-by-side, they look like this:
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/6_two_distributions.png)
+
+When we concatenate them, we get a chart that clearly reveals the divergent,
+bimodal structure:
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/7_bimodal.png)
+
+## Some More Distributions
+
+Just for fun, let's generate and visualize a few more distributions, and then
+combine them all into one chart. Here's the code we'll use:
+
+```python
+import tensorflow as tf
+
+k = tf.placeholder(tf.float32)
+
+# Make a normal distribution, with a shifting mean
+mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
+# Record that distribution into a histogram summary
+tf.summary.histogram("normal/moving_mean", mean_moving_normal)
+
+# Make a normal distribution with shrinking variance
+variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k))
+# Record that distribution too
+tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal)
+
+# Let's combine both of those distributions into one dataset
+normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0)
+# We add another histogram summary to record the combined distribution
+tf.summary.histogram("normal/bimodal", normal_combined)
+
+# Add a gamma distribution
+gamma = tf.random_gamma(shape=[1000], alpha=k)
+tf.summary.histogram("gamma", gamma)
+
+# And a poisson distribution
+poisson = tf.random_poisson(shape=[1000], lam=k)
+tf.summary.histogram("poisson", poisson)
+
+# And a uniform distribution
+uniform = tf.random_uniform(shape=[1000], maxval=k*10)
+tf.summary.histogram("uniform", uniform)
+
+# Finally, combine everything together!
+all_distributions = [mean_moving_normal, variance_shrinking_normal,
+                     gamma, poisson, uniform]
+all_combined = tf.concat(all_distributions, 0)
+tf.summary.histogram("all_combined", all_combined)
+
+summaries = tf.summary.merge_all()
+
+# Setup a session and summary writer
+sess = tf.Session()
+writer = tf.summary.FileWriter("/tmp/histogram_example")
+
+# Setup a loop and write the summaries to disk
+N = 400
+for step in range(N):
+  k_val = step/float(N)
+  summ = sess.run(summaries, feed_dict={k: k_val})
+  writer.add_summary(summ, global_step=step)
+```
+
+### Gamma Distribution
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/8_gamma.png)
+
+### Uniform Distribution
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/9_uniform.png)
+
+### Poisson Distribution
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/10_poisson.png)
+
+The Poisson distribution is defined over the integers. So, all of the values
+being generated are perfect integers. The histogram compression moves the data
+into floating-point bins, causing the visualization to show little
+bumps over the integer values rather than perfect spikes.
+
+### All Together Now
+
+Finally, we can concatenate all of the data into one funny-looking curve.
+
+![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/11_all_combined.png)
+
diff --git a/tensorflow/docs_src/guide/tensors.md b/tensorflow/docs_src/guide/tensors.md
new file mode 100644
index 0000000000..7227260f1a
--- /dev/null
+++ b/tensorflow/docs_src/guide/tensors.md
@@ -0,0 +1,330 @@
+# Tensors
+
+TensorFlow, as the name indicates, is a framework to define and run
+computations involving tensors. A **tensor** is a generalization of vectors
+and matrices to potentially higher dimensions.
+Internally, TensorFlow represents tensors as
+n-dimensional arrays of base datatypes.
+
+When writing a TensorFlow program, the main object you manipulate and pass
+around is the `tf.Tensor`. A `tf.Tensor` object represents a partially defined
+computation that will eventually produce a value. TensorFlow programs work by
+first building a graph of `tf.Tensor` objects, detailing how each tensor is
+computed based on the other available tensors, and then by running parts of
+this graph to achieve the desired results.
+
+A `tf.Tensor` has the following properties:
+
+ * a data type (`float32`, `int32`, or `string`, for example)
+ * a shape
+
+Each element in the Tensor has the same data type, and the data type is always
+known. The shape (that is, the number of dimensions it has and the size of each
+dimension) might be only partially known. Most operations produce tensors of
+fully-known shapes if the shapes of their inputs are also fully known, but in
+some cases it's only possible to find the shape of a tensor at graph execution
+time.
+
+Some types of tensors are special, and these will be covered in other
+units of the TensorFlow guide. The main ones are:
+
+ * `tf.Variable`
+ * `tf.constant`
+ * `tf.placeholder`
+ * `tf.SparseTensor`
+
+With the exception of `tf.Variable`, the value of a tensor is immutable, which
+means that in the context of a single execution tensors only have a single
+value. However, evaluating the same tensor twice can return different values;
+for example, that tensor can be the result of reading data from disk, or
+generating a random number.
+
+## Rank
+
+The **rank** of a `tf.Tensor` object is its number of dimensions. Synonyms for
+rank include **order** or **degree** or **n-dimension**.
+Note that rank in TensorFlow is not the same as matrix rank in mathematics.
+As the following table shows, each rank in TensorFlow corresponds to a
+different mathematical entity:
+
+Rank | Math entity
+--- | ---
+0 | Scalar (magnitude only)
+1 | Vector (magnitude and direction)
+2 | Matrix (table of numbers)
+3 | 3-Tensor (cube of numbers)
+n | n-Tensor (you get the idea)
+
+### Rank 0
+
+The following snippet demonstrates creating a few rank 0 variables (note that
+`dtype` is passed as a keyword argument: the second positional argument of
+`tf.Variable` is `trainable`, not the `dtype`):
+
+```python
+mammal = tf.Variable("Elephant", dtype=tf.string)
+ignition = tf.Variable(451, dtype=tf.int16)
+floating = tf.Variable(3.14159265359, dtype=tf.float64)
+its_complicated = tf.Variable(12.3 - 4.85j, dtype=tf.complex64)
+```
+
+Note: A string is treated as a single item in TensorFlow, not as a sequence of
+characters. It is possible to have scalar strings, vectors of strings, etc.
+
+### Rank 1
+
+To create a rank 1 `tf.Tensor` object, you can pass a list of items as the
+initial value. For example:
+
+```python
+mystr = tf.Variable(["Hello"], dtype=tf.string)
+cool_numbers = tf.Variable([3.14159, 2.71828], dtype=tf.float32)
+first_primes = tf.Variable([2, 3, 5, 7, 11], dtype=tf.int32)
+its_very_complicated = tf.Variable([12.3 - 4.85j, 7.5 - 6.23j], dtype=tf.complex64)
+```
+
+### Higher ranks
+
+A rank 2 `tf.Tensor` object consists of at least one row and at least
+one column:
+
+```python
+mymat = tf.Variable([[7],[11]], dtype=tf.int16)
+myxor = tf.Variable([[False, True],[True, False]], dtype=tf.bool)
+linear_squares = tf.Variable([[4], [9], [16], [25]], dtype=tf.int32)
+squarish_squares = tf.Variable([ [4, 9], [16, 25] ], dtype=tf.int32)
+rank_of_squares = tf.rank(squarish_squares)
+mymatC = tf.Variable([[7],[11]], dtype=tf.int32)
+```
+
+Higher-rank Tensors, similarly, consist of an n-dimensional array.
+For example, during image processing, many tensors of rank 4 are used, with
+dimensions corresponding to example-in-batch, image height, image width, and
+color channel.
+
+``` python
+my_image = tf.zeros([10, 299, 299, 3])  # batch x height x width x color
+```
+
+### Getting a `tf.Tensor` object's rank
+
+To determine the rank of a `tf.Tensor` object, call the `tf.rank` operation.
+For example, the following snippet programmatically determines the rank
+of the `tf.Tensor` defined in the previous section:
+
+```python
+r = tf.rank(my_image)
+# After the graph runs, r will hold the value 4.
+```
+
+### Referring to `tf.Tensor` slices
+
+Since a `tf.Tensor` is an n-dimensional array of cells, to access a single cell
+in a `tf.Tensor` you need to specify n indices.
+
+For a rank 0 tensor (a scalar), no indices are necessary, since it is already a
+single number.
+
+For a rank 1 tensor (a vector), passing a single index allows you to access a
+number:
+
+```python
+my_scalar = my_vector[2]
+```
+
+Note that the index passed inside the `[]` can itself be a scalar `tf.Tensor`, if
+you want to dynamically choose an element from the vector.
+
+For tensors of rank 2 or higher, the situation is more interesting. For a
+`tf.Tensor` of rank 2, passing two numbers returns a scalar, as expected:
+
+```python
+my_scalar = my_matrix[1, 2]
+```
+
+Passing a single number, however, returns a subvector of a matrix, as follows:
+
+```python
+my_row_vector = my_matrix[2]
+my_column_vector = my_matrix[:, 3]
+```
+
+The `:` notation is Python slicing syntax for "leave this dimension alone". This
+is useful in higher-rank Tensors, as it allows you to access its subvectors,
+submatrices, and even other subtensors.
+
+## Shape
+
+The **shape** of a tensor is the number of elements in each dimension.
+TensorFlow automatically infers shapes during graph construction. These inferred
+shapes might have known or unknown rank. If the rank is known, the sizes of each
+dimension might be known or unknown.
+
+The TensorFlow documentation uses three notational conventions to describe
+tensor dimensionality: rank, shape, and dimension number. The following table
+shows how these relate to one another:
+
+Rank | Shape | Dimension number | Example
+--- | --- | --- | ---
+0 | [] | 0-D | A 0-D tensor. A scalar.
+1 | [D0] | 1-D | A 1-D tensor with shape [5].
+2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4].
+3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3].
+n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... Dn-1].
+
+Shapes can be represented via Python lists / tuples of ints, or with a
+@{tf.TensorShape}.
+
+### Getting a `tf.Tensor` object's shape
+
+There are two ways of accessing the shape of a `tf.Tensor`. While building the
+graph, it is often useful to ask what is already known about a tensor's
+shape. This can be done by reading the `shape` property of a `tf.Tensor`
+object. Reading this property returns a `TensorShape` object, which is a
+convenient way of representing partially-specified shapes (since, when building
+the graph, not all shapes will be fully known).
+
+It is also possible to get a `tf.Tensor` that will represent the fully-defined
+shape of another `tf.Tensor` at runtime. This is done by calling the `tf.shape`
+operation. This way, you can build a graph that manipulates the shapes of
+tensors by building other tensors that depend on the dynamic shape of the input
+`tf.Tensor`.
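+
+As a short sketch of the difference between the two (the placeholder here is
+purely illustrative), the static `shape` property is available while the graph
+is being built, whereas `tf.shape` produces a tensor whose value is only known
+at runtime:
+
+``` python
+import tensorflow as tf
+
+my_matrix = tf.placeholder(tf.float32, shape=[None, 3])
+
+# Static shape: partially known at graph-construction time.
+print(my_matrix.shape)  # (?, 3)
+
+# Dynamic shape: a tf.Tensor whose value is known only when the graph runs.
+num_rows = tf.shape(my_matrix)[0]
+
+with tf.Session() as sess:
+  print(sess.run(num_rows, feed_dict={my_matrix: [[1., 2., 3.]] * 5}))  # 5
+```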
+
+For example, here is how to make a vector of zeros with the same size as the
+number of columns in a given matrix:
+
+``` python
+zeros = tf.zeros(my_matrix.shape[1])
+```
+
+### Changing the shape of a `tf.Tensor`
+
+The **number of elements** of a tensor is the product of the sizes of all its
+shapes. The number of elements of a scalar is always `1`. Since there are often
+many different shapes that have the same number of elements, it's often
+convenient to be able to change the shape of a `tf.Tensor`, keeping its elements
+fixed. This can be done with `tf.reshape`.
+
+The following examples demonstrate how to reshape tensors:
+
+```python
+rank_three_tensor = tf.ones([3, 4, 5])
+matrix = tf.reshape(rank_three_tensor, [6, 10])  # Reshape existing content into
+                                                 # a 6x10 matrix
+matrixB = tf.reshape(matrix, [3, -1])  # Reshape existing content into a 3x20
+                                       # matrix. -1 tells reshape to calculate
+                                       # the size of this dimension.
+matrixAlt = tf.reshape(matrixB, [4, 3, -1])  # Reshape existing content into a
+                                             # 4x3x5 tensor
+
+# Note that the number of elements of the reshaped Tensors has to match the
+# original number of elements. Therefore, the following example generates an
+# error because no possible value for the last dimension will match the number
+# of elements.
+yet_another = tf.reshape(matrixAlt, [13, 2, -1])  # ERROR!
+```
+
+## Data types
+
+In addition to dimensionality, Tensors have a data type. Refer to the
+`tf.DType` page for a complete list of the data types.
+
+It is not possible to have a `tf.Tensor` with more than one data type. It is
+possible, however, to serialize arbitrary data structures as `string`s and store
+those in `tf.Tensor`s.
+
+It is possible to cast `tf.Tensor`s from one datatype to another using
+`tf.cast`:
+
+``` python
+# Cast a constant integer tensor into floating point.
+float_tensor = tf.cast(tf.constant([1, 2, 3]), dtype=tf.float32)
+```
+
+To inspect a `tf.Tensor`'s data type, use the `Tensor.dtype` property.
+
+When creating a `tf.Tensor` from a Python object, you may optionally specify
+the datatype. If you don't, TensorFlow chooses a datatype that can represent
+your data. TensorFlow converts Python integers to `tf.int32` and Python
+floating-point numbers to `tf.float32`. Otherwise TensorFlow uses the same
+rules NumPy uses when converting to arrays.
+
+## Evaluating Tensors
+
+Once the computation graph has been built, you can run the computation that
+produces a particular `tf.Tensor` and fetch the value assigned to it. This is
+often useful for debugging as well as being required for much of TensorFlow to
+work.
+
+The simplest way to evaluate a Tensor is using the `Tensor.eval` method. For
+example:
+
+```python
+constant = tf.constant([1, 2, 3])
+tensor = constant * constant
+print(tensor.eval())
+```
+
+The `eval` method only works when a default `tf.Session` is active (see
+Graphs and Sessions for more information).
+
+`Tensor.eval` returns a NumPy array with the same contents as the tensor.
+
+Sometimes it is not possible to evaluate a `tf.Tensor` with no context because
+its value might depend on dynamic information that is not available. For
+example, tensors that depend on `placeholder`s can't be evaluated without
+providing a value for the `placeholder`.
+
+``` python
+p = tf.placeholder(tf.float32)
+t = p + 1.0
+t.eval()  # This will fail, since the placeholder did not get a value.
+t.eval(feed_dict={p: 2.0})  # This will succeed because we're feeding a value
+                            # to the placeholder.
+```
+
+Note that it is possible to feed any `tf.Tensor`, not just placeholders.
+
+Other model constructs might make evaluating a `tf.Tensor`
+complicated. TensorFlow can't directly evaluate `tf.Tensor`s defined inside
+functions or inside control flow constructs. If a `tf.Tensor` depends on a value
+from a queue, evaluating the `tf.Tensor` will only work once something has been
+enqueued; otherwise, evaluating it will hang. When working with queues, remember
+to call `tf.train.start_queue_runners` before evaluating any `tf.Tensor`s.
+
+## Printing Tensors
+
+For debugging purposes you might want to print the value of a `tf.Tensor`. While
+@{$debugger$tfdbg} provides advanced debugging support, TensorFlow also has an
+operation to directly print the value of a `tf.Tensor`.
+
+Note that you rarely want to use the following pattern when printing a
+`tf.Tensor`:
+
+``` python
+t = <<some tensorflow operation>>
+print(t)  # This will print the symbolic tensor when the graph is being built.
+          # This tensor does not have a value in this context.
+```
+
+This code prints the `tf.Tensor` object (which represents deferred computation)
+and not its value. Instead, TensorFlow provides the `tf.Print` operation, which
+returns its first tensor argument unchanged while printing the set of
+`tf.Tensor`s it is passed as the second argument.
+
+To correctly use `tf.Print`, its return value must be used. See the example
+below:
+
+``` python
+t = <<some tensorflow operation>>
+tf.Print(t, [t])  # This does nothing
+t = tf.Print(t, [t])  # Here we are using the value returned by tf.Print
+result = t + 1  # Now when result is evaluated the value of `t` will be printed.
+```
+
+When you evaluate `result` you will evaluate everything `result` depends
+upon. Since `result` depends upon `t`, and evaluating `t` has the side effect of
+printing its input (the old value of `t`), `t` gets printed.
+
diff --git a/tensorflow/docs_src/guide/using_gpu.md b/tensorflow/docs_src/guide/using_gpu.md
new file mode 100644
index 0000000000..c429ca4750
--- /dev/null
+++ b/tensorflow/docs_src/guide/using_gpu.md
@@ -0,0 +1,215 @@
+# Using GPUs
+
+## Supported devices
+
+On a typical system, there are multiple computing devices. In TensorFlow, the
+supported device types are `CPU` and `GPU`. They are represented as `strings`.
+For example:
+
+* `"/cpu:0"`: The CPU of your machine.
+* `"/device:GPU:0"`: The GPU of your machine, if you have one.
+* `"/device:GPU:1"`: The second GPU of your machine, etc.
+
+If a TensorFlow operation has both CPU and GPU implementations, the GPU devices
+will be given priority when the operation is assigned to a device. For example,
+`matmul` has both CPU and GPU kernels. On a system with devices `cpu:0` and
+`gpu:0`, `gpu:0` will be selected to run `matmul`.
+
+## Logging device placement
+
+To find out which devices your operations and tensors are assigned to, create
+the session with the `log_device_placement` configuration option set to `True`.
+
+```python
+# Creates a graph.
+a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
+b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
+c = tf.matmul(a, b)
+# Creates a session with log_device_placement set to True.
+sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
+# Runs the op.
+print(sess.run(c))
+```
+
+You should see the following output:
+
+```
+Device mapping:
+/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus
+id: 0000:05:00.0
+b: /job:localhost/replica:0/task:0/device:GPU:0
+a: /job:localhost/replica:0/task:0/device:GPU:0
+MatMul: /job:localhost/replica:0/task:0/device:GPU:0
+[[ 22.  28.]
+ [ 49.  64.]]
+
+```
+
+## Manual device placement
+
+If you would like a particular operation to run on a device of your choice
+instead of what's automatically selected for you, you can use `with tf.device`
+to create a device context such that all the operations within that context will
+have the same device assignment.
+
+```python
+# Creates a graph.
+with tf.device('/cpu:0'):
+  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
+  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
+c = tf.matmul(a, b)
+# Creates a session with log_device_placement set to True.
+sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
+# Runs the op.
+print(sess.run(c))
+```
+
+You will see that now `a` and `b` are assigned to `cpu:0`. Since a device was
+not explicitly specified for the `MatMul` operation, the TensorFlow runtime will
+choose one based on the operation and available devices (`gpu:0` in this
+example) and automatically copy tensors between devices if required.
+
+```
+Device mapping:
+/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus
+id: 0000:05:00.0
+b: /job:localhost/replica:0/task:0/cpu:0
+a: /job:localhost/replica:0/task:0/cpu:0
+MatMul: /job:localhost/replica:0/task:0/device:GPU:0
+[[ 22.  28.]
+ [ 49.  64.]]
+```
+
+## Allowing GPU memory growth
+
+By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to
+[`CUDA_VISIBLE_DEVICES`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars))
+visible to the process. This is done to more efficiently use the relatively
+precious GPU memory resources on the devices by reducing [memory
+fragmentation](https://en.wikipedia.org/wiki/Fragmentation_\(computing\)).
+
+In some cases it is desirable for the process to only allocate a subset of the
+available memory, or to only grow the memory usage as is needed by the process.
+TensorFlow provides two Config options on the Session to control this.
+
+The first is the `allow_growth` option, which attempts to allocate only as much
+GPU memory as is needed by runtime allocations: it starts out allocating very
+little memory, and as Sessions get run and more GPU memory is needed, we extend
+the GPU memory region needed by the TensorFlow process. Note that we do not
+release memory, since that can lead to even worse memory fragmentation. To turn
+this option on, set the option in the ConfigProto by:
+
+```python
+config = tf.ConfigProto()
+config.gpu_options.allow_growth = True
+session = tf.Session(config=config, ...)
+```
+
+The second method is the `per_process_gpu_memory_fraction` option, which
+determines the fraction of the overall amount of memory that each visible GPU
+should be allocated. For example, you can tell TensorFlow to only allocate 40%
+of the total memory of each GPU by:
+
+```python
+config = tf.ConfigProto()
+config.gpu_options.per_process_gpu_memory_fraction = 0.4
+session = tf.Session(config=config, ...)
+```
+
+This is useful if you want to truly bound the amount of GPU memory available to
+the TensorFlow process.
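+
+Relatedly, if you want to bound which GPUs TensorFlow can see at all (rather
+than how much memory it maps on each), you can set the `CUDA_VISIBLE_DEVICES`
+environment variable mentioned above before the GPUs are initialized. A
+minimal sketch, assuming you want to expose only the second physical GPU:
+
+```python
+import os
+
+# Expose only the second physical GPU; TensorFlow will see it as
+# /device:GPU:0. This must happen before TensorFlow initializes the GPUs.
+os.environ['CUDA_VISIBLE_DEVICES'] = '1'
+
+import tensorflow as tf
+```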
+ +## Using a single GPU on a multi-GPU system + +If you have more than one GPU in your system, the GPU with the lowest ID will be +selected by default. If you would like to run on a different GPU, you will need +to specify the preference explicitly: + +```python +# Creates a graph. +with tf.device('/device:GPU:2'): + a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') + b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') + c = tf.matmul(a, b) +# Creates a session with log_device_placement set to True. +sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) +# Runs the op. +print(sess.run(c)) +``` + +If the device you have specified does not exist, you will get +`InvalidArgumentError`: + +``` +InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b': +Could not satisfy explicit device specification '/device:GPU:2' + [[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/device:GPU:2"]()]] +``` + +If you would like TensorFlow to automatically choose an existing and supported +device to run the operations in case the specified one doesn't exist, you can +set `allow_soft_placement` to `True` in the configuration option when creating +the session. + +```python +# Creates a graph. +with tf.device('/device:GPU:2'): + a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') + b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') + c = tf.matmul(a, b) +# Creates a session with allow_soft_placement and log_device_placement set +# to True. +sess = tf.Session(config=tf.ConfigProto( + allow_soft_placement=True, log_device_placement=True)) +# Runs the op. +print(sess.run(c)) +``` + +## Using multiple GPUs + +If you would like to run TensorFlow on multiple GPUs, you can construct your +model in a multi-tower fashion where each tower is assigned to a different GPU. +For example: + +``` python +# Creates a graph. +c = [] +for d in ['/device:GPU:2', '/device:GPU:3']: + with tf.device(d): + a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3]) + b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2]) + c.append(tf.matmul(a, b)) +with tf.device('/cpu:0'): + sum = tf.add_n(c) +# Creates a session with log_device_placement set to True. +sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) +# Runs the op. +print(sess.run(sum)) +``` + +You will see the following output. + +``` +Device mapping: +/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus +id: 0000:02:00.0 +/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla K20m, pci bus +id: 0000:03:00.0 +/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: Tesla K20m, pci bus +id: 0000:83:00.0 +/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: Tesla K20m, pci bus +id: 0000:84:00.0 +Const_3: /job:localhost/replica:0/task:0/device:GPU:3 +Const_2: /job:localhost/replica:0/task:0/device:GPU:3 +MatMul_1: /job:localhost/replica:0/task:0/device:GPU:3 +Const_1: /job:localhost/replica:0/task:0/device:GPU:2 +Const: /job:localhost/replica:0/task:0/device:GPU:2 +MatMul: /job:localhost/replica:0/task:0/device:GPU:2 +AddN: /job:localhost/replica:0/task:0/cpu:0 +[[ 44. 56.] + [ 98. 128.]] +``` + +The @{$deep_cnn$cifar10 tutorial} is a good example +demonstrating how to do training with multiple GPUs. 
diff --git a/tensorflow/docs_src/guide/using_tpu.md b/tensorflow/docs_src/guide/using_tpu.md
new file mode 100644
index 0000000000..41d80d9d60
--- /dev/null
+++ b/tensorflow/docs_src/guide/using_tpu.md
@@ -0,0 +1,395 @@
+# Using TPUs
+
+This document walks through the principal TensorFlow APIs necessary to make
+effective use of a [Cloud TPU](https://cloud.google.com/tpu/), and highlights
+the differences between regular TensorFlow usage, and usage on a TPU.
+
+This doc is aimed at users who:
+
+* Are familiar with TensorFlow's `Estimator` and `Dataset` APIs.
+* Have maybe [tried out a Cloud TPU](https://cloud.google.com/tpu/docs/quickstart)
+  using an existing model.
+* Have, perhaps, skimmed the code of an example TPU model
+  [[1]](https://github.com/tensorflow/models/blob/master/official/mnist/mnist_tpu.py)
+  [[2]](https://github.com/tensorflow/tpu/tree/master/models).
+* Are interested in porting an existing `Estimator` model to
+  run on Cloud TPUs.
+
+## TPUEstimator
+
+@{tf.estimator.Estimator$Estimators} are TensorFlow's model-level abstraction.
+Standard `Estimators` can drive models on CPU and GPUs. You must use
+@{tf.contrib.tpu.TPUEstimator} to drive a model on TPUs.
+
+Refer to TensorFlow's Getting Started section for an introduction to the basics
+of using a @{$premade_estimators$pre-made `Estimator`}, and
+@{$custom_estimators$custom `Estimator`s}.
+
+The `TPUEstimator` class differs somewhat from the `Estimator` class.
+
+The simplest way to maintain a model that can be run both on CPU/GPU or on a
+Cloud TPU is to define the model's inference phase (from inputs to predictions)
+outside of the `model_fn`. Then maintain separate implementations of the
+`Estimator` setup and `model_fn`, both wrapping this inference step. For an
+example of this pattern, compare the `mnist.py` and `mnist_tpu.py`
+implementations in
+[tensorflow/models](https://github.com/tensorflow/models/tree/master/official/mnist).
+
+### Running a `TPUEstimator` locally
+
+To create a standard `Estimator` you call the constructor, and pass it a
+`model_fn`, for example:
+
+```
+my_estimator = tf.estimator.Estimator(
+  model_fn=my_model_fn)
+```
+
+The changes required to use a @{tf.contrib.tpu.TPUEstimator} on your local
+machine are relatively minor. The constructor requires two additional arguments.
+You should set the `use_tpu` argument to `False`, and pass a
+@{tf.contrib.tpu.RunConfig} as the `config` argument, as shown below:
+
+``` python
+my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
+    model_fn=my_model_fn,
+    config=tf.contrib.tpu.RunConfig(),
+    use_tpu=False)
+```
+
+Just this simple change will allow you to run a `TPUEstimator` locally.
+The majority of example TPU models can be run in this local mode,
+by setting the command line flags as follows:
+
+```
+$> python mnist_tpu.py --use_tpu=false --master=''
+```
+
+Note: This `use_tpu=False` argument is useful for trying out the `TPUEstimator`
+API. It is not meant to be a complete TPU compatibility test. Successfully
+running a model locally in a `TPUEstimator` does not guarantee that it will
+work on a TPU.
+
+### Building a `tpu.RunConfig`
+
+While the default `RunConfig` is sufficient for local training, these settings
+cannot be ignored in real usage.
+
+A more typical setup for a `RunConfig` that can be switched to use a Cloud
+TPU might be as follows:
+
+``` python
+import tempfile
+import subprocess
+
+class FLAGS(object):
+  use_tpu=False
+  tpu_name=None
+  # Use a local temporary path for the `model_dir`
+  model_dir = tempfile.mkdtemp()
+  # Number of training steps to run on the Cloud TPU before returning control.
+  iterations = 50
+  # A single Cloud TPU has 8 shards.
+  num_shards = 8
+
+if FLAGS.use_tpu:
+  my_project_name = subprocess.check_output([
+      'gcloud','config','get-value','project'])
+  my_zone = subprocess.check_output([
+      'gcloud','config','get-value','compute/zone'])
+  cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(
+      tpu_names=[FLAGS.tpu_name],
+      zone=my_zone,
+      project=my_project_name)
+  master = cluster_resolver.get_master()
+else:
+  master = ''
+
+my_tpu_run_config = tf.contrib.tpu.RunConfig(
+    master=master,
+    evaluation_master=master,
+    model_dir=FLAGS.model_dir,
+    session_config=tf.ConfigProto(
+        allow_soft_placement=True, log_device_placement=True),
+    tpu_config=tf.contrib.tpu.TPUConfig(FLAGS.iterations,
+                                        FLAGS.num_shards),
+)
+```
+
+Then you must pass the @{tf.contrib.tpu.RunConfig} to the constructor:
+
+``` python
+my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
+    model_fn=my_model_fn,
+    config=my_tpu_run_config,
+    use_tpu=FLAGS.use_tpu)
+```
+
+Typically the `FLAGS` would be set by command line arguments. To switch from
+training locally to training on a Cloud TPU you would need to:
+
+* Set `FLAGS.use_tpu` to `True`
+* Set `FLAGS.tpu_name` so the `tf.contrib.cluster_resolver.TPUClusterResolver` can find it
+* Set `FLAGS.model_dir` to a Google Cloud Storage bucket URL (`gs://`).
+
+## Optimizer
+
+When training on a Cloud TPU you **must** wrap the optimizer in a
+@{tf.contrib.tpu.CrossShardOptimizer}, which uses an `allreduce` to aggregate
+gradients and broadcast the result to each shard (each TPU core).
+
+The `CrossShardOptimizer` is not compatible with local training. So, to have
+the same code run both locally and on a Cloud TPU, add lines like the following:
+
+``` python
+optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
+if FLAGS.use_tpu:
+  optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)
+```
+
+If you prefer to avoid a global `FLAGS` variable in your model code, one
+approach is to set the optimizer as one of the `Estimator`'s params,
+as follows:
+
+``` python
+my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
+    model_fn=my_model_fn,
+    config=my_tpu_run_config,
+    use_tpu=FLAGS.use_tpu,
+    params={'optimizer': optimizer})
+```
+
+## Model Function
+
+This section details the changes you must make to the model function
+(`model_fn()`) to make it `TPUEstimator` compatible.
+
+### Static shapes
+
+During regular usage TensorFlow attempts to determine the shape of each
+`tf.Tensor` during graph construction. During execution any unknown shape
+dimensions are determined dynamically;
+see @{$guide/tensors#shape$Tensor Shapes} for more details.
+
+To run on Cloud TPUs, TensorFlow models are compiled using @{$xla$XLA}.
+XLA uses a similar system for determining shapes at compile time. XLA requires
+that all tensor dimensions be statically defined at compile time. All shapes
+must evaluate to a constant, and not depend on external data, or stateful
+operations like variables or a random number generator.
+
+### Summaries
+
+Remove any use of `tf.summary` from your model.
+
+@{$summaries_and_tensorboard$TensorBoard summaries} are a great way to see
+inside your model. A minimal set of basic summaries is automatically recorded
+by the `TPUEstimator`, to `event` files in the `model_dir`. Custom summaries,
+however, are currently unsupported when training on a Cloud TPU. So while the
+`TPUEstimator` will still run locally with summaries, it will fail if used on
+a TPU.
+
+### Metrics
+
+Build your evaluation metrics dictionary in a stand-alone `metric_fn`.
+
+Evaluation metrics are an essential part of training a model. These are fully
+supported on Cloud TPUs, but with a slightly different syntax.
+
+A standard @{tf.metrics} function returns two tensors. The first is the
+running average of the metric value, while the second updates the running
+average and returns the value for this batch:
+
+```
+running_average, current_batch = tf.metrics.accuracy(labels, predictions)
+```
+
+In a standard `Estimator` you create a dictionary of these pairs, and return it
+as part of the `EstimatorSpec`.
+
+```python
+my_metrics = {'accuracy': tf.metrics.accuracy(labels, predictions)}
+
+return tf.estimator.EstimatorSpec(
+  ...
+  eval_metric_ops=my_metrics
+)
+```
+
+In a `TPUEstimator` you instead pass a function (which returns a metrics
+dictionary) and a list of argument tensors, as shown below:
+
+```python
+def my_metric_fn(labels, predictions):
+  return {'accuracy': tf.metrics.accuracy(labels, predictions)}
+
+return tf.contrib.tpu.TPUEstimatorSpec(
+  ...
+  eval_metrics=(my_metric_fn, [labels, predictions])
+)
+```
+
+### Use `TPUEstimatorSpec`
+
+`TPUEstimatorSpec` does not support hooks, and requires function wrappers for
+some fields.
+
+An `Estimator`'s `model_fn` must return an `EstimatorSpec`. An `EstimatorSpec`
+is a simple structure of named fields containing all the `tf.Tensors` of the
+model that the `Estimator` may need to interact with.
+
+`TPUEstimators` use a @{tf.contrib.tpu.TPUEstimatorSpec}. There are a few
+differences between it and a standard @{tf.estimator.EstimatorSpec}:
+
+* The `eval_metric_ops` must be wrapped into a `metrics_fn`; this field is
+  renamed `eval_metrics` ([see above](#metrics)).
+* The @{tf.train.SessionRunHook$hooks} are unsupported, so these fields are
+  omitted.
+* The @{tf.train.Scaffold$`scaffold`}, if used, must also be wrapped in a
+  function. This field is renamed to `scaffold_fn`.
+
+`Scaffold` and `Hooks` are for advanced usage, and can typically be omitted.
+
+## Input functions
+
+Input functions work mostly unchanged, as they run on the host computer, not
+the Cloud TPU itself. This section explains the two necessary adjustments.
+
+### Params argument
+
+The `input_fn` for a standard `Estimator` *can* include a
+`params` argument; the `input_fn` for a `TPUEstimator` *must* include a
+`params` argument. This is necessary to allow the estimator to set the batch
+size for each replica of the input stream. So the minimum signature for an
+`input_fn` for a `TPUEstimator` is:
+
+```
+def my_input_fn(params):
+  pass
+```
+
+where `params['batch_size']` will contain the batch size.
+
+### Static shapes and batch size
+
+The input pipeline generated by your `input_fn` is run on CPU. So it is mostly
+free from the strict static shape requirements imposed by the XLA/TPU
+environment. The one requirement is that the batches of data fed from your
+input pipeline to the TPU have a static shape, as determined by the standard
+TensorFlow shape inference algorithm. Intermediate tensors are free to have
+dynamic shapes.
+If shape inference has failed, but the shape is known, it is possible to
+impose the correct shape using the `Tensor.set_shape()` method.
+
+In the example below the shape inference algorithm fails, but the shape is
+correctly imposed using `set_shape`:
+
+```
+>>> x = tf.zeros(tf.constant([1,2,3])+1)
+>>> x.shape
+
+TensorShape([Dimension(None), Dimension(None), Dimension(None)])
+
+>>> x.set_shape([2,3,4])
+```
+
+In many cases the batch size is the only unknown dimension.
+
+A typical input pipeline, using `tf.data`, will usually produce batches of a
+fixed size. The last batch of a finite `Dataset`, however, is typically smaller,
+containing just the remaining elements. Since a `Dataset` does not know its own
+length or finiteness, the standard @{tf.data.Dataset.batch$`batch`} method
+cannot determine on its own whether all batches will have a fixed size:
+
+```
+>>> params = {'batch_size':32}
+>>> ds = tf.data.Dataset.from_tensors([0, 1, 2])
+>>> ds = ds.repeat().batch(params['batch_size'])
+>>> ds
+
+<BatchDataset shapes: (?, 3), types: tf.int32>
+```
+
+The most straightforward fix is to
+@{tf.data.Dataset.apply$apply} @{tf.contrib.data.batch_and_drop_remainder}
+as follows:
+
+```
+>>> params = {'batch_size':32}
+>>> ds = tf.data.Dataset.from_tensors([0, 1, 2])
+>>> ds = ds.repeat().apply(
+...     tf.contrib.data.batch_and_drop_remainder(params['batch_size']))
+>>> ds
+
+<_RestructuredDataset shapes: (32, 3), types: tf.int32>
+```
+
+The one downside to this approach is that, as the name implies, this batching
+method throws out any fractional batch at the end of the dataset. This is fine
+for an infinitely repeating dataset being used for training, but could be a
+problem if you want to train for an exact number of epochs.
+
+To do exactly one epoch of _evaluation_ you can work around this by manually
+padding the length of the batches, and setting the padding entries to have zero
+weight when creating your `tf.metrics`.
+
+## Datasets
+
+Efficient use of the `tf.data.Dataset` API is critical when using a Cloud
+TPU, as it is impossible to use Cloud TPUs unless you can feed them data
+quickly enough. See @{$datasets_performance} for details on dataset performance.
+
+For all but the simplest experimentation (using
+@{tf.data.Dataset.from_tensor_slices} or other in-graph data) you will need to
+store all data files read by the `TPUEstimator`'s `Dataset` in Google Cloud
+Storage Buckets.
+
+For most use-cases, we recommend converting your data into `TFRecord`
+format and using a @{tf.data.TFRecordDataset} to read it. This, however, is not
+a hard requirement and you can use other dataset readers
+(`FixedLengthRecordDataset` or `TextLineDataset`) if you prefer.
+
+Small datasets can be loaded entirely into memory using
+@{tf.data.Dataset.cache}.
+
+Regardless of the data format used, it is strongly recommended that you
+@{$performance_guide#use_large_files$use large files}, on the order of
+100MB. This is especially important in this networked setting as the overhead
+of opening a file is significantly higher.
+
+It is also important, regardless of the type of reader used, to enable buffering
+using the `buffer_size` argument to the constructor. This argument is specified
+in bytes. A minimum of a few MB (`buffer_size=8*1024*1024`) is recommended so
+that data is available when needed.
+
+The TPU-demos repo includes
+[a script](https://github.com/tensorflow/tpu/blob/master/tools/datasets/imagenet_to_gcs.py)
+for downloading the ImageNet dataset and converting it to an appropriate
+format.
+This, together with the ImageNet
+[models](https://github.com/tensorflow/tpu/tree/master/models)
+included in the repo, demonstrates all of these best practices.
+
+## What Next
+
+For details on how to actually set up and run a Cloud TPU see:
+
+ * [Google Cloud TPU Documentation](https://cloud.google.com/tpu/docs/)
+
+This document is by no means exhaustive. The best source of more detail on how
+to make a Cloud TPU compatible model is the set of example models published in:
+
+ * The [TPU Demos Repository.](https://github.com/tensorflow/tpu)
+
+For more information about tuning TensorFlow code for performance see:
+
+ * The @{$performance$Performance Section.}
+
diff --git a/tensorflow/docs_src/guide/variables.md b/tensorflow/docs_src/guide/variables.md
new file mode 100644
index 0000000000..cd8c4b5b9a
--- /dev/null
+++ b/tensorflow/docs_src/guide/variables.md
@@ -0,0 +1,319 @@
+# Variables
+
+A TensorFlow **variable** is the best way to represent shared, persistent state
+manipulated by your program.
+
+Variables are manipulated via the `tf.Variable` class. A `tf.Variable`
+represents a tensor whose value can be changed by running ops on it. Unlike
+`tf.Tensor` objects, a `tf.Variable` exists outside the context of a single
+`session.run` call.
+
+Internally, a `tf.Variable` stores a persistent tensor. Specific ops allow you
+to read and modify the values of this tensor. These modifications are visible
+across multiple `tf.Session`s, so multiple workers can see the same values for a
+`tf.Variable`.
+
+## Creating a Variable
+
+The best way to create a variable is to call the `tf.get_variable`
+function. This function requires you to specify the Variable's name. This name
+will be used by other replicas to access the same variable, as well as to name
+this variable's value when checkpointing and exporting models. `tf.get_variable`
+also allows you to reuse a previously created variable of the same name, making it
+easy to define models which reuse layers.
+
+To create a variable with `tf.get_variable`, simply provide the name and shape:
+
+``` python
+my_variable = tf.get_variable("my_variable", [1, 2, 3])
+```
+
+This creates a variable named "my_variable" which is a three-dimensional tensor
+with shape `[1, 2, 3]`. This variable will, by default, have the `dtype`
+`tf.float32` and its initial value will be randomized via
+`tf.glorot_uniform_initializer`.
+
+You may optionally specify the `dtype` and initializer to `tf.get_variable`. For
+example:
+
+``` python
+my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32,
+                                  initializer=tf.zeros_initializer)
+```
+
+TensorFlow provides many convenient initializers. Alternatively, you may
+initialize a `tf.Variable` to have the value of a `tf.Tensor`. For example:
+
+``` python
+other_variable = tf.get_variable("other_variable", dtype=tf.int32,
+                                 initializer=tf.constant([23, 42]))
+```
+
+Note that when the initializer is a `tf.Tensor` you should not specify the
+variable's shape, as the shape of the initializer tensor will be used.
+
+### Variable collections
+
+Because disconnected parts of a TensorFlow program might want to create
+variables, it is sometimes useful to have a single way to access all of
+them. For this reason TensorFlow provides **collections**, which are named lists
+of tensors or other objects, such as `tf.Variable` instances.
+
+By default every `tf.Variable` gets placed in the following two collections:
+
+ * `tf.GraphKeys.GLOBAL_VARIABLES` --- variables that can be shared across
+   multiple devices,
+ * `tf.GraphKeys.TRAINABLE_VARIABLES` --- variables for which TensorFlow will
+   calculate gradients.
+
+If you don't want a variable to be trainable, add it to the
+`tf.GraphKeys.LOCAL_VARIABLES` collection instead. For example, the following
+snippet demonstrates how to add a variable named `my_local` to this collection:
+
+``` python
+my_local = tf.get_variable("my_local", shape=(),
+                           collections=[tf.GraphKeys.LOCAL_VARIABLES])
+```
+
+Alternatively, you can specify `trainable=False` as an argument to
+`tf.get_variable`:
+
+``` python
+my_non_trainable = tf.get_variable("my_non_trainable",
+                                   shape=(),
+                                   trainable=False)
+```
+
+You can also use your own collections. Any string is a valid collection name,
+and there is no need to explicitly create a collection. To add a variable (or
+any other object) to a collection after creating the variable, call
+`tf.add_to_collection`. For example, the following code adds an existing
+variable named `my_local` to a collection named `my_collection_name`:
+
+``` python
+tf.add_to_collection("my_collection_name", my_local)
+```
+
+And to retrieve a list of all the variables (or other objects) you've placed in
+a collection you can use:
+
+``` python
+tf.get_collection("my_collection_name")
+```
+
+### Device placement
+
+Just like any other TensorFlow operation, you can place variables on particular
+devices. For example, the following snippet creates a variable named `v` and
+places it on the second GPU device:
+
+``` python
+with tf.device("/device:GPU:1"):
+  v = tf.get_variable("v", [1])
+```
+
+It is particularly important for variables to be on the correct device in
+distributed settings. Accidentally putting variables on workers instead of
+parameter servers, for example, can severely slow down training or, in the worst
+case, let each worker blithely forge ahead with its own independent copy of each
+variable. For this reason we provide @{tf.train.replica_device_setter}, which
+can automatically place variables in parameter servers. For example:
+
+``` python
+cluster_spec = {
+    "ps": ["ps0:2222", "ps1:2222"],
+    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}
+with tf.device(tf.train.replica_device_setter(cluster=cluster_spec)):
+  v = tf.get_variable("v", shape=[20, 20])  # this variable is placed
+                                            # in the parameter server
+                                            # by the replica_device_setter
+```
+
+## Initializing variables
+
+Before you can use a variable, it must be initialized. If you are programming in
+the low-level TensorFlow API (that is, you are explicitly creating your own
+graphs and sessions), you must explicitly initialize the variables. Most
+high-level frameworks such as `tf.contrib.slim`, `tf.estimator.Estimator` and
+`Keras` automatically initialize variables for you before training a model.
+
+Explicit initialization is otherwise useful because it allows you not to rerun
+potentially expensive initializers when reloading a model from a checkpoint as
+well as allowing determinism when randomly-initialized variables are shared in a
+distributed setting.
+
+To initialize all variables in one go, before training starts, call
+`tf.global_variables_initializer()`. This function returns a single operation
+responsible for initializing all variables in the
+`tf.GraphKeys.GLOBAL_VARIABLES` collection. Running this operation initializes
+all variables.
For example: + +``` python +session.run(tf.global_variables_initializer()) +# Now all variables are initialized. +``` + +If you do need to initialize variables yourself, you can run the variable's +initializer operation. For example: + +``` python +session.run(my_variable.initializer) +``` + + +You can also ask which variables have still not been initialized. For example, +the following code prints the names of all variables which have not yet been +initialized: + +``` python +print(session.run(tf.report_uninitialized_variables())) +``` + + +Note that by default `tf.global_variables_initializer` does not specify the +order in which variables are initialized. Therefore, if the initial value of a +variable depends on another variable's value, it's likely that you'll get an +error. Any time you use the value of a variable in a context in which not all +variables are initialized (say, if you use a variable's value while initializing +another variable), it is best to use `variable.initialized_value()` instead of +`variable`: + +``` python +v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) +w = tf.get_variable("w", initializer=v.initialized_value() + 1) +``` + +## Using variables + +To use the value of a `tf.Variable` in a TensorFlow graph, simply treat it like +a normal `tf.Tensor`: + +``` python +v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) +w = v + 1 # w is a tf.Tensor which is computed based on the value of v. + # Any time a variable is used in an expression it gets automatically + # converted to a tf.Tensor representing its value. +``` + +To assign a value to a variable, use the methods `assign`, `assign_add`, and +friends in the `tf.Variable` class. For example, here is how you can call these +methods: + +``` python +v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) +assignment = v.assign_add(1) +tf.global_variables_initializer().run() +sess.run(assignment) # or assignment.op.run(), or assignment.eval() +``` + +Most TensorFlow optimizers have specialized ops that efficiently update the +values of variables according to some gradient descent-like algorithm. See +@{tf.train.Optimizer} for an explanation of how to use optimizers. + +Because variables are mutable it's sometimes useful to know what version of a +variable's value is being used at any point in time. To force a re-read of the +value of a variable after something has happened, you can use +`tf.Variable.read_value`. For example: + +``` python +v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) +assignment = v.assign_add(1) +with tf.control_dependencies([assignment]): + w = v.read_value() # w is guaranteed to reflect v's value after the + # assign_add operation. +``` + + +## Sharing variables + +TensorFlow supports two ways of sharing variables: + + * Explicitly passing `tf.Variable` objects around. + * Implicitly wrapping `tf.Variable` objects within `tf.variable_scope` objects. + +While code which explicitly passes variables around is very clear, it is +sometimes convenient to write TensorFlow functions that implicitly use +variables in their implementations. Most of the functional layers from +`tf.layers` use this approach, as well as all `tf.metrics`, and a few other +library utilities. + +Variable scopes allow you to control variable reuse when calling functions which +implicitly create and use variables. They also allow you to name your variables +in a hierarchical and understandable way. 
+ +For example, let's say we write a function to create a convolutional / relu +layer: + +```python +def conv_relu(input, kernel_shape, bias_shape): + # Create variable named "weights". + weights = tf.get_variable("weights", kernel_shape, + initializer=tf.random_normal_initializer()) + # Create variable named "biases". + biases = tf.get_variable("biases", bias_shape, + initializer=tf.constant_initializer(0.0)) + conv = tf.nn.conv2d(input, weights, + strides=[1, 1, 1, 1], padding='SAME') + return tf.nn.relu(conv + biases) +``` + +This function uses short names `weights` and `biases`, which is good for +clarity. In a real model, however, we want many such convolutional layers, and +calling this function repeatedly would not work: + +``` python +input1 = tf.random_normal([1,10,10,32]) +input2 = tf.random_normal([1,20,20,32]) +x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32]) +x = conv_relu(x, kernel_shape=[5, 5, 32, 32], bias_shape = [32]) # This fails. +``` + +Since the desired behavior is unclear (create new variables or reuse the +existing ones?) TensorFlow will fail. Calling `conv_relu` in different scopes, +however, clarifies that we want to create new variables: + +```python +def my_image_filter(input_images): + with tf.variable_scope("conv1"): + # Variables created here will be named "conv1/weights", "conv1/biases". + relu1 = conv_relu(input_images, [5, 5, 32, 32], [32]) + with tf.variable_scope("conv2"): + # Variables created here will be named "conv2/weights", "conv2/biases". + return conv_relu(relu1, [5, 5, 32, 32], [32]) +``` + +If you do want the variables to be shared, you have two options. First, you can +create a scope with the same name using `reuse=True`: + +``` python +with tf.variable_scope("model"): + output1 = my_image_filter(input1) +with tf.variable_scope("model", reuse=True): + output2 = my_image_filter(input2) + +``` + +You can also call `scope.reuse_variables()` to trigger a reuse: + +``` python +with tf.variable_scope("model") as scope: + output1 = my_image_filter(input1) + scope.reuse_variables() + output2 = my_image_filter(input2) + +``` + +Since depending on exact string names of scopes can feel dangerous, it's also +possible to initialize a variable scope based on another one: + +``` python +with tf.variable_scope("model") as scope: + output1 = my_image_filter(input1) +with tf.variable_scope(scope, reuse=True): + output2 = my_image_filter(input2) + +``` + diff --git a/tensorflow/docs_src/guide/version_compat.md b/tensorflow/docs_src/guide/version_compat.md new file mode 100644 index 0000000000..72e427c5f8 --- /dev/null +++ b/tensorflow/docs_src/guide/version_compat.md @@ -0,0 +1,319 @@ +# TensorFlow Version Compatibility + +This document is for users who need backwards compatibility across different +versions of TensorFlow (either for code or data), and for developers who want +to modify TensorFlow while preserving compatibility. + +## Semantic Versioning 2.0 + +TensorFlow follows Semantic Versioning 2.0 ([semver](http://semver.org)) for its +public API. Each release version of TensorFlow has the form `MAJOR.MINOR.PATCH`. +For example, TensorFlow version 1.2.3 has `MAJOR` version 1, `MINOR` version 2, +and `PATCH` version 3. Changes to each number have the following meaning: + +* **MAJOR**: Potentially backwards incompatible changes. Code and data that + worked with a previous major release will not necessarily work with the new + release. 
However, in some cases existing TensorFlow graphs and checkpoints + may be migratable to the newer release; see + [Compatibility of graphs and checkpoints](#compatibility_of_graphs_and_checkpoints) + for details on data compatibility. + +* **MINOR**: Backwards compatible features, speed improvements, etc. Code and + data that worked with a previous minor release *and* which depends only on the + public API will continue to work unchanged. For details on what is and is + not the public API, see [What is covered](#what_is_covered). + +* **PATCH**: Backwards compatible bug fixes. + +For example, release 1.0.0 introduced backwards *incompatible* changes from +release 0.12.1. However, release 1.1.1 was backwards *compatible* with release +1.0.0. + +## What is covered + +Only the public APIs of TensorFlow are backwards compatible across minor and +patch versions. The public APIs consist of + +* All the documented [Python](../api_docs/python) functions and classes in the + `tensorflow` module and its submodules, except for + * functions and classes in `tf.contrib` + * functions and classes whose names start with `_` (as these are private) + Note that the code in the `examples/` and `tools/` directories is not + reachable through the `tensorflow` Python module and is thus not covered by + the compatibility guarantee. + + If a symbol is available through the `tensorflow` Python module or its + submodules, but is not documented, then it is **not** considered part of the + public API. + +* The [C API](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/c_api.h). + +* The following protocol buffer files: + * [`attr_value`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto) + * [`config`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto) + * [`event`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto) + * [`graph`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto) + * [`op_def`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto) + * [`reader_base`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/reader_base.proto) + * [`summary`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto) + * [`tensor`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto) + * [`tensor_shape`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor_shape.proto) + * [`types`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/types.proto) + + +## What is *not* covered + +Some API functions are explicitly marked as "experimental" and can change in +backward incompatible ways between minor releases. These include: + +* **Experimental APIs**: The @{tf.contrib} module and its submodules in Python + and any functions in the C API or fields in protocol buffers that are + explicitly commented as being experimental. In particular, any field in a + protocol buffer which is called "experimental" and all its fields and + submessages can change at any time. + +* **Other languages**: TensorFlow APIs in languages other than Python and C, + such as: + + - @{$cc/guide$C++} (exposed through header files in + [`tensorflow/cc`](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/cc)). 
+
+  - [Java](../api_docs/java/reference/org/tensorflow/package-summary),
+  - [Go](https://godoc.org/github.com/tensorflow/tensorflow/tensorflow/go)
+
+* **Details of composite ops:** Many public functions in Python expand to
+  several primitive ops in the graph, and these details will be part of any
+  graphs saved to disk as `GraphDef`s. These details may change for
+  minor releases. In particular, regression tests that check for exact
+  matching between graphs are likely to break across minor releases, even
+  though the behavior of the graph should be unchanged and existing
+  checkpoints will still work.
+
+* **Floating point numerical details:** The specific floating point values
+  computed by ops may change at any time. Users should rely only on
+  approximate accuracy and numerical stability, not on the specific bits
+  computed. Changes to numerical formulas in minor and patch releases should
+  result in comparable or improved accuracy, with the caveat that in machine
+  learning improved accuracy of specific formulas may result in decreased
+  accuracy for the overall system.
+
+* **Random numbers:** The specific random numbers computed by the
+  @{$python/constant_op#Random_Tensors$random ops} may change at any time.
+  Users should rely only on approximately correct distributions and
+  statistical strength, not the specific bits computed. However, we will make
+  changes to random bits rarely (or perhaps never) for patch releases. We
+  will, of course, document all such changes.
+
+* **Version skew in distributed TensorFlow:** Running two different versions
+  of TensorFlow in a single cluster is unsupported. There are no guarantees
+  about backwards compatibility of the wire protocol.
+
+* **Bugs:** We reserve the right to make backwards incompatible behavior
+  (though not API) changes if the current implementation is clearly broken,
+  that is, if it contradicts the documentation or if a well-known and
+  well-defined intended behavior is not properly implemented due to a bug.
+  For example, if an optimizer claims to implement a well-known optimization
+  algorithm but does not match that algorithm due to a bug, then we will fix
+  the optimizer. Our fix may break code relying on the wrong behavior for
+  convergence. We will note such changes in the release notes.
+
+* **Error messages:** We reserve the right to change the text of error
+  messages. In addition, the type of an error may change unless the type is
+  specified in the documentation. For example, a function documented to
+  raise an `InvalidArgument` exception will continue to
+  raise `InvalidArgument`, but the human-readable message contents can change.
+
+## Compatibility of graphs and checkpoints
+
+You'll sometimes need to preserve graphs and checkpoints.
+Graphs describe the data flow of ops to be run during training and
+inference, and checkpoints contain the saved tensor values of variables in a
+graph.
+
+Many TensorFlow users save graphs and trained models to disk for
+later evaluation or additional training, but end up running their saved graphs
+or models on a later release. In compliance with semver, any graph or
+checkpoint written out with one version of TensorFlow can be loaded and
+evaluated with a later version of TensorFlow within the same major release.
+However, we will endeavor to preserve backwards compatibility even across
+major releases when possible, so that the serialized files are usable over
+long periods of time.
+
+Graphs are serialized via the `GraphDef` protocol buffer.
To facilitate (rare) +backwards incompatible changes to graphs, each `GraphDef` has a version number +separate from the TensorFlow version. For example, `GraphDef` version 17 +deprecated the `inv` op in favor of `reciprocal`. The semantics are: + +* Each version of TensorFlow supports an interval of `GraphDef` versions. This + interval will be constant across patch releases, and will only grow across + minor releases. Dropping support for a `GraphDef` version will only occur + for a major release of TensorFlow. + +* Newly created graphs are assigned the latest `GraphDef` version number. + +* If a given version of TensorFlow supports the `GraphDef` version of a graph, + it will load and evaluate with the same behavior as the TensorFlow version + used to generate it (except for floating point numerical details and random + numbers), regardless of the major version of TensorFlow. In particular, all + checkpoint files will be compatible. + +* If the `GraphDef` *upper* bound is increased to X in a (minor) release, there + will be at least six months before the *lower* bound is increased to X. For + example (we're using hypothetical version numbers here): + * TensorFlow 1.2 might support `GraphDef` versions 4 to 7. + * TensorFlow 1.3 could add `GraphDef` version 8 and support versions 4 to 8. + * At least six months later, TensorFlow 2.0.0 could drop support for + versions 4 to 7, leaving version 8 only. + +Finally, when support for a `GraphDef` version is dropped, we will attempt to +provide tools for automatically converting graphs to a newer supported +`GraphDef` version. + +## Graph and checkpoint compatibility when extending TensorFlow + +This section is relevant only when making incompatible changes to the `GraphDef` +format, such as when adding ops, removing ops, or changing the functionality +of existing ops. The previous section should suffice for most users. + +### Backward and partial forward compatibility + +Our versioning scheme has three requirements: + +* **Backward compatibility** to support loading graphs and checkpoints + created with older versions of TensorFlow. +* **Forward compatibility** to support scenarios where the producer of a + graph or checkpoint is upgraded to a newer version of TensorFlow before + the consumer. +* Enable evolving TensorFlow in incompatible ways. For example, removing ops, + adding attributes, and removing attributes. + +Note that while the `GraphDef` version mechanism is separate from the TensorFlow +version, backwards incompatible changes to the `GraphDef` format are still +restricted by Semantic Versioning. This means functionality can only be removed +or changed between `MAJOR` versions of TensorFlow (such as `1.7` to `2.0`). +Additionally, forward compatibility is enforced within Patch releases (`1.x.1` +to `1.x.2` for example). + +To achieve backward and forward compatibility and to know when to enforce changes +in formats, graphs and checkpoints have metadata that describes when they +were produced. The sections below detail the TensorFlow implementation and +guidelines for evolving `GraphDef` versions. + +### Independent data version schemes + +There are different data versions for graphs and checkpoints. The two data +formats evolve at different rates from each other and also at different rates +from TensorFlow. Both versioning systems are defined in +[`core/public/version.h`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/public/version.h). 
+
+Whenever a new version is added, a note is added to the header detailing what
+changed and the date.
+
+### Data, producers, and consumers
+
+We distinguish between the following kinds of data version information:
+
+* **producers**: binaries that produce data. Producers have a version
+  (`producer`) and a minimum consumer version that they are compatible with
+  (`min_consumer`).
+* **consumers**: binaries that consume data. Consumers have a version
+  (`consumer`) and a minimum producer version that they are compatible with
+  (`min_producer`).
+
+Each piece of versioned data has a [`VersionDef
+versions`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/versions.proto)
+field which records the `producer` that made the data, the `min_consumer`
+that it is compatible with, and a list of `bad_consumers` versions that are
+disallowed.
+
+By default, when a producer makes some data, the data inherits the producer's
+`producer` and `min_consumer` versions. `bad_consumers` can be set if specific
+consumer versions are known to contain bugs and must be avoided. A consumer can
+accept a piece of data if the following are all true:
+
+* `consumer` >= data's `min_consumer`
+* data's `producer` >= consumer's `min_producer`
+* `consumer` not in data's `bad_consumers`
+
+Since both producers and consumers come from the same TensorFlow code base,
+[`core/public/version.h`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/public/version.h)
+contains a main data version which is treated as either `producer` or
+`consumer` depending on context, as well as both `min_consumer` and
+`min_producer` (needed by producers and consumers, respectively). Specifically,
+
+* For `GraphDef` versions, we have `TF_GRAPH_DEF_VERSION`,
+  `TF_GRAPH_DEF_VERSION_MIN_CONSUMER`, and
+  `TF_GRAPH_DEF_VERSION_MIN_PRODUCER`.
+* For checkpoint versions, we have `TF_CHECKPOINT_VERSION`,
+  `TF_CHECKPOINT_VERSION_MIN_CONSUMER`, and
+  `TF_CHECKPOINT_VERSION_MIN_PRODUCER`.
+
+### Add a new attribute with default to an existing op
+
+Following the guidance below gives you forward compatibility only if the set of
+ops has not changed:
+
+1. If forward compatibility is desired, set `strip_default_attrs` to `True`
+   while exporting the model, using the
+   @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph_and_variables$`add_meta_graph_and_variables`}
+   and @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph$`add_meta_graph`}
+   methods of the `SavedModelBuilder` class, or
+   @{tf.estimator.Estimator.export_savedmodel$`Estimator.export_savedmodel`}.
+2. This strips off the default-valued attributes at the time of
+   producing/exporting the models, and makes sure that the exported
+   @{tf.MetaGraphDef} does not contain the new op attribute when the default
+   value is used.
+3. Having this control allows out-of-date consumers (for example, serving
+   binaries that lag behind training binaries) to continue loading the models,
+   and prevents interruptions in model serving.
+
+### Evolving GraphDef versions
+
+This section explains how to use this versioning mechanism to make different
+types of changes to the `GraphDef` format.
+
+#### Add an op
+
+Add the new op to both consumers and producers at the same time, and do not
+change any `GraphDef` versions. This type of change is automatically
+backward compatible, and does not impact the forward compatibility plan, since
+existing producer scripts will not suddenly use the new functionality.
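+
+To make the three acceptance conditions above concrete, here is a small,
+purely illustrative Python sketch of the check a consumer performs. The
+`Versions` record below stands in for the `VersionDef` proto; none of this is
+TensorFlow API. Adding an op as just described passes the check unchanged,
+because no version numbers move:
+
+```python
+import collections
+
+# Stand-in for the VersionDef proto described above (illustrative only).
+Versions = collections.namedtuple(
+    "Versions", ["producer", "min_consumer", "bad_consumers"])
+
+def consumer_can_accept(consumer, min_producer, data):
+  """Restates the three acceptance rules as a single predicate."""
+  return (consumer >= data.min_consumer        # consumer is new enough
+          and data.producer >= min_producer    # data is new enough
+          and consumer not in data.bad_consumers)  # consumer is not banned
+```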
+ +#### Add an op and switch existing Python wrappers to use it + +1. Implement new consumer functionality and increment the `GraphDef` version. +2. If it is possible to make the wrappers use the new functionality only in + cases that did not work before, the wrappers can be updated now. +3. Change Python wrappers to use the new functionality. Do not increment + `min_consumer`, since models that do not use this op should not break. + +#### Remove or restrict an op's functionality + +1. Fix all producer scripts (not TensorFlow itself) to not use the banned op or + functionality. +2. Increment the `GraphDef` version and implement new consumer functionality + that bans the removed op or functionality for GraphDefs at the new version + and above. If possible, make TensorFlow stop producing `GraphDefs` with the + banned functionality. To do so, add the + [`REGISTER_OP(...).Deprecated(deprecated_at_version, + message)`](https://github.com/tensorflow/tensorflow/blob/b289bc7a50fc0254970c60aaeba01c33de61a728/tensorflow/core/ops/array_ops.cc#L1009). +3. Wait for a major release for backward compatibility purposes. +4. Increase `min_producer` to the GraphDef version from (2) and remove the + functionality entirely. + +#### Change an op's functionality + +1. Add a new similar op named `SomethingV2` or similar and go through the + process of adding it and switching existing Python wrappers to use it, which + may take three weeks if forward compatibility is desired. +2. Remove the old op (Can only take place with a major version change due to + backward compatibility). +3. Increase `min_consumer` to rule out consumers with the old op, add back the + old op as an alias for `SomethingV2`, and go through the process to switch + existing Python wrappers to use it. +4. Go through the process to remove `SomethingV2`. + +#### Ban a single unsafe consumer version + +1. Bump the `GraphDef` version and add the bad version to `bad_consumers` for + all new GraphDefs. If possible, add to `bad_consumers` only for GraphDefs + which contain a certain op or similar. +2. If existing consumers have the bad version, push them out as soon as + possible. diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 1c03dd223e..5451e1b319 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -6,7 +6,7 @@ a Go application. This guide explains how to install and set up the [TensorFlow Go package](https://godoc.org/github.com/tensorflow/tensorflow/tensorflow/go). Warning: The TensorFlow Go API is *not* covered by the TensorFlow -[API stability guarantees](https://www.tensorflow.org/programmers_guide/version_semantics). +[API stability guarantees](../guide/version_semantics.md). ## Supported Platforms diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index c73e2f4281..ad3544b595 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -7,7 +7,7 @@ Java application. This guide explains how to install and use it in a Java application. Warning: The TensorFlow Java API is *not* covered by the TensorFlow -[API stability guarantees](https://www.tensorflow.org/programmers_guide/version_semantics). +[API stability guarantees](../guide/version_semantics.md). 
## Supported Platforms diff --git a/tensorflow/docs_src/programmers_guide/checkpoints.md b/tensorflow/docs_src/programmers_guide/checkpoints.md deleted file mode 100644 index 8dfd91e3c8..0000000000 --- a/tensorflow/docs_src/programmers_guide/checkpoints.md +++ /dev/null @@ -1,240 +0,0 @@ -# Checkpoints - -This document examines how to save and restore TensorFlow models built with -Estimators. TensorFlow provides two model formats: - -* checkpoints, which is a format dependent on the code that created - the model. -* SavedModel, which is a format independent of the code that created - the model. - -This document focuses on checkpoints. For details on SavedModel, see the -@{$saved_model$Saving and Restoring} chapter of the -*TensorFlow Programmer's Guide*. - - -## Sample code - -This document relies on the same -[Iris classification example](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) detailed in @{$premade_estimators$Getting Started with TensorFlow}. -To download and access the example, invoke the following two commands: - -```shell -git clone https://github.com/tensorflow/models/ -cd models/samples/core/get_started -``` - -Most of the code snippets in this document are minor variations -on `premade_estimator.py`. - - -## Saving partially-trained models - -Estimators automatically write the following to disk: - -* **checkpoints**, which are versions of the model created during training. -* **event files**, which contain information that - [TensorBoard](https://developers.google.com/machine-learning/glossary/#TensorBoard) - uses to create visualizations. - -To specify the top-level directory in which the Estimator stores its -information, assign a value to the optional `model_dir` argument of *any* -`Estimator`'s constructor. -Taking `DNNClassifier` as an example, -the following code sets the `model_dir` -argument to the `models/iris` directory: - -```python -classifier = tf.estimator.DNNClassifier( - feature_columns=my_feature_columns, - hidden_units=[10, 10], - n_classes=3, - model_dir='models/iris') -``` - -Suppose you call the Estimator's `train` method. For example: - - -```python -classifier.train( - input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100), - steps=200) -``` - -As suggested by the following diagrams, the first call to `train` -adds checkpoints and other files to the `model_dir` directory: - -
-[Figure: The first call to train().]
- - -To see the objects in the created `model_dir` directory on a -UNIX-based system, just call `ls` as follows: - -```none -$ ls -1 models/iris -checkpoint -events.out.tfevents.timestamp.hostname -graph.pbtxt -model.ckpt-1.data-00000-of-00001 -model.ckpt-1.index -model.ckpt-1.meta -model.ckpt-200.data-00000-of-00001 -model.ckpt-200.index -model.ckpt-200.meta -``` - -The preceding `ls` command shows that the Estimator created checkpoints -at steps 1 (the start of training) and 200 (the end of training). - - -### Default checkpoint directory - -If you don't specify `model_dir` in an Estimator's constructor, the Estimator -writes checkpoint files to a temporary directory chosen by Python's -[tempfile.mkdtemp](https://docs.python.org/3/library/tempfile.html#tempfile.mkdtemp) -function. For example, the following Estimator constructor does *not* specify -the `model_dir` argument: - -```python -classifier = tf.estimator.DNNClassifier( - feature_columns=my_feature_columns, - hidden_units=[10, 10], - n_classes=3) - -print(classifier.model_dir) -``` - -The `tempfile.mkdtemp` function picks a secure, temporary directory -appropriate for your operating system. For example, a typical temporary -directory on macOS might be something like the following: - -```None -/var/folders/0s/5q9kfzfj3gx2knj0vj8p68yc00dhcr/T/tmpYm1Rwa -``` - -### Checkpointing Frequency - -By default, the Estimator saves -[checkpoints](https://developers.google.com/machine-learning/glossary/#checkpoint) -in the `model_dir` according to the following schedule: - -* Writes a checkpoint every 10 minutes (600 seconds). -* Writes a checkpoint when the `train` method starts (first iteration) - and completes (final iteration). -* Retains only the 5 most recent checkpoints in the directory. - -You may alter the default schedule by taking the following steps: - -1. Create a @{tf.estimator.RunConfig$`RunConfig`} object that defines the - desired schedule. -2. When instantiating the Estimator, pass that `RunConfig` object to the - Estimator's `config` argument. - -For example, the following code changes the checkpointing schedule to every -20 minutes and retains the 10 most recent checkpoints: - -```python -my_checkpointing_config = tf.estimator.RunConfig( - save_checkpoints_secs = 20*60, # Save checkpoints every 20 minutes. - keep_checkpoint_max = 10, # Retain the 10 most recent checkpoints. -) - -classifier = tf.estimator.DNNClassifier( - feature_columns=my_feature_columns, - hidden_units=[10, 10], - n_classes=3, - model_dir='models/iris', - config=my_checkpointing_config) -``` - -## Restoring your model - -The first time you call an Estimator's `train` method, TensorFlow saves a -checkpoint to the `model_dir`. Each subsequent call to the Estimator's -`train`, `evaluate`, or `predict` method causes the following: - -1. The Estimator builds the model's - [graph](https://developers.google.com/machine-learning/glossary/#graph) - by running the `model_fn()`. (For details on the `model_fn()`, see - @{$custom_estimators$Creating Custom Estimators.}) -2. The Estimator initializes the weights of the new model from the data - stored in the most recent checkpoint. - -In other words, as the following illustration suggests, once checkpoints -exist, TensorFlow rebuilds the model each time you call `train()`, -`evaluate()`, or `predict()`. - -
-[Figure: Subsequent calls to train(), evaluate(), or predict().]
-
-
-### Avoiding a bad restoration
-
-Restoring a model's state from a checkpoint only works if the model
-and checkpoint are compatible. For example, suppose you trained a
-`DNNClassifier` Estimator containing two hidden layers,
-each having 10 nodes:
-
-```python
-classifier = tf.estimator.DNNClassifier(
-    feature_columns=feature_columns,
-    hidden_units=[10, 10],
-    n_classes=3,
-    model_dir='models/iris')
-
-classifier.train(
-    input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100),
-    steps=200)
-```
-
-After training (and, therefore, after creating checkpoints in `models/iris`),
-imagine that you changed the number of neurons in each hidden layer from 10 to
-20 and then attempted to retrain the model:
-
-```python
-classifier2 = tf.estimator.DNNClassifier(
-    feature_columns=my_feature_columns,
-    hidden_units=[20, 20],  # Change the number of neurons in the model.
-    n_classes=3,
-    model_dir='models/iris')
-
-classifier2.train(
-    input_fn=lambda:train_input_fn(train_x, train_y, batch_size=100),
-    steps=200)
-```
-
-Since the state in the checkpoint is incompatible with the model described
-in `classifier2`, retraining fails with the following error:
-
-```None
-...
-InvalidArgumentError (see above for traceback): tensor_name =
-dnn/hiddenlayer_1/bias/t_0/Adagrad; shape in shape_and_slice spec [10]
-does not match the shape stored in checkpoint: [20]
-```
-
-To run experiments in which you train and compare slightly different
-versions of a model, save a copy of the code that created each
-`model_dir`, possibly by creating a separate git branch for each version.
-This separation will keep your checkpoints recoverable.
-
-## Summary
-
-Checkpoints provide an easy automatic mechanism for saving and restoring
-models created by Estimators.
-
-See the @{$saved_model$Saving and Restoring}
-chapter of the *TensorFlow Programmer's Guide* for details on:
-
-* Saving and restoring models using low-level TensorFlow APIs.
-* Exporting and importing models in the SavedModel format, which is a
-  language-neutral, recoverable serialization format.
diff --git a/tensorflow/docs_src/programmers_guide/custom_estimators.md b/tensorflow/docs_src/programmers_guide/custom_estimators.md
deleted file mode 100644
index fb20b35c12..0000000000
--- a/tensorflow/docs_src/programmers_guide/custom_estimators.md
+++ /dev/null
@@ -1,602 +0,0 @@
-
-# Creating Custom Estimators
-
-This document introduces custom Estimators. In particular, this document
-demonstrates how to create a custom @{tf.estimator.Estimator$Estimator} that
-mimics the behavior of the pre-made Estimator
-@{tf.estimator.DNNClassifier$`DNNClassifier`} in solving the Iris problem. See
-the @{$premade_estimators$Pre-Made Estimators chapter} for details
-on the Iris problem.
-
-To download and access the example code, invoke the following two commands:
-
-```shell
-git clone https://github.com/tensorflow/models/
-cd models/samples/core/get_started
-```
-
-In this document we will be looking at
-[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py).
-You can run it with the following command:
-
-```bsh
-python custom_estimator.py
-```
-
-If you are feeling impatient, feel free to compare and contrast
-[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
-with
-[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py)
-(which is in the same directory).
-
-
-
-## Pre-made vs. custom
-
-As the following figure shows, pre-made Estimators are subclasses of the
-@{tf.estimator.Estimator} base class, while custom Estimators are direct
-instances of `tf.estimator.Estimator`:
-
-[Figure: Pre-made and custom Estimators are all Estimators. Pre-made
-Estimators are subclasses of `Estimator`; custom Estimators are usually
-direct instances of `Estimator`.]
- -Pre-made Estimators are fully baked. Sometimes though, you need more control -over an Estimator's behavior. That's where custom Estimators come in. You can -create a custom Estimator to do just about anything. If you want hidden layers -connected in some unusual fashion, write a custom Estimator. If you want to -calculate a unique -[metric](https://developers.google.com/machine-learning/glossary/#metric) -for your model, write a custom Estimator. Basically, if you want an Estimator -optimized for your specific problem, write a custom Estimator. - -A model function (or `model_fn`) implements the ML algorithm. The -only difference between working with pre-made Estimators and custom Estimators -is: - -* With pre-made Estimators, someone already wrote the model function for you. -* With custom Estimators, you must write the model function. - -Your model function could implement a wide range of algorithms, defining all -sorts of hidden layers and metrics. Like input functions, all model functions -must accept a standard group of input parameters and return a standard group of -output values. Just as input functions can leverage the Dataset API, model -functions can leverage the Layers API and the Metrics API. - -Let's see how to solve the Iris problem with a custom Estimator. A quick -reminder--here's the organization of the Iris model that we're trying to mimic: - -
-[Figure: Network architecture: inputs, two hidden layers, and outputs. Our
-implementation of Iris contains four features, two hidden layers, and a
-logits output layer.]
-
-
-## Write an Input function
-
-Our custom Estimator implementation uses the same input function as our
-@{$premade_estimators$pre-made Estimator implementation}, from
-[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py).
-Namely:
-
-```python
-def train_input_fn(features, labels, batch_size):
-    """An input function for training"""
-    # Convert the inputs to a Dataset.
-    dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
-
-    # Shuffle, repeat, and batch the examples.
-    dataset = dataset.shuffle(1000).repeat().batch(batch_size)
-
-    # Return the read end of the pipeline.
-    return dataset.make_one_shot_iterator().get_next()
-```
-
-This input function builds an input pipeline that yields batches of
-`(features, labels)` pairs, where `features` is a dictionary of features.
-
-## Create feature columns
-
-As detailed in the @{$premade_estimators$Premade Estimators} and
-@{$feature_columns$Feature Columns} chapters, you must define
-your model's feature columns to specify how the model should use each feature.
-Whether working with pre-made Estimators or custom Estimators, you define
-feature columns in the same fashion.
-
-The following code creates a simple `numeric_column` for each input feature,
-indicating that the value of the input feature should be used directly as an
-input to the model:
-
-```python
-# Feature columns describe how to use the input.
-my_feature_columns = []
-for key in train_x.keys():
-    my_feature_columns.append(tf.feature_column.numeric_column(key=key))
-```
-
-## Write a model function
-
-The model function we'll use has the following call signature:
-
-```python
-def my_model_fn(
-    features,  # This is batch_features from input_fn
-    labels,    # This is batch_labels from input_fn
-    mode,      # An instance of tf.estimator.ModeKeys
-    params):   # Additional configuration
-```
-
-The first two arguments are the batches of features and labels returned from
-the input function; that is, `features` and `labels` are the handles to the
-data your model will use. The `mode` argument indicates whether the caller is
-requesting training, predicting, or evaluation.
-
-The caller may pass `params` to an Estimator's constructor. Any `params` passed
-to the constructor are in turn passed on to the `model_fn`. In
-[`custom_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/custom_estimator.py)
-the following lines create the estimator and set the params to configure the
-model. This configuration step is similar to how we configured the
-@{tf.estimator.DNNClassifier} in @{$premade_estimators}.
-
-```python
-classifier = tf.estimator.Estimator(
-    model_fn=my_model,
-    params={
-        'feature_columns': my_feature_columns,
-        # Two hidden layers of 10 nodes each.
-        'hidden_units': [10, 10],
-        # The model must choose between 3 classes.
-        'n_classes': 3,
-    })
-```
-
-To implement a typical model function, you must do the following:
-
-* [Define the model](#define_the_model).
-* Specify additional calculations for each of - the [three different modes](#modes): - * [Predict](#predict) - * [Evaluate](#evaluate) - * [Train](#train) - -## Define the model - -The basic deep neural network model must define the following three sections: - -* An [input layer](https://developers.google.com/machine-learning/glossary/#input_layer) -* One or more [hidden layers](https://developers.google.com/machine-learning/glossary/#hidden_layer) -* An [output layer](https://developers.google.com/machine-learning/glossary/#output_layer) - -### Define the input layer - -The first line of the `model_fn` calls @{tf.feature_column.input_layer} to -convert the feature dictionary and `feature_columns` into input for your model, -as follows: - -```python - # Use `input_layer` to apply the feature columns. - net = tf.feature_column.input_layer(features, params['feature_columns']) -``` - -The preceding line applies the transformations defined by your feature columns, -creating the model's input layer. - -
-[Figure: The input layer, a 1:1 mapping from raw inputs to features.]
-
-
-### Hidden Layers
-
-If you are creating a deep neural network, you must define one or more hidden
-layers. The Layers API provides a rich set of functions to define all types of
-hidden layers, including convolutional, pooling, and dropout layers. For Iris,
-we're simply going to call @{tf.layers.dense} to create hidden layers, with
-dimensions defined by `params['hidden_units']`. In a `dense` layer each node
-is connected to every node in the preceding layer. Here's the relevant code:
-
-``` python
-  # Build the hidden layers, sized according to the 'hidden_units' param.
-  for units in params['hidden_units']:
-    net = tf.layers.dense(net, units=units, activation=tf.nn.relu)
-```
-
-* The `units` parameter defines the number of output neurons in a given layer.
-* The `activation` parameter defines the [activation function](https://developers.google.com/machine-learning/glossary/#activation_function) —
-  [Relu](https://developers.google.com/machine-learning/glossary/#ReLU) in this
-  case.
-
-The variable `net` here signifies the current top layer of the network. During
-the first iteration, `net` signifies the input layer. On each loop iteration
-`tf.layers.dense` creates a new layer, which takes the previous layer's output
-as its input, using the variable `net`.
-
-After creating two hidden layers, our network looks as follows. For
-simplicity, the figure does not show all the units in each layer.
-
-[Figure: The input layer with two hidden layers added.]
- -Note that @{tf.layers.dense} provides many additional capabilities, including -the ability to set a multitude of regularization parameters. For the sake of -simplicity, though, we're going to simply accept the default values of the -other parameters. - -### Output Layer - -We'll define the output layer by calling @{tf.layers.dense} yet again, this -time without an activation function: - -```python - # Compute logits (1 per class). - logits = tf.layers.dense(net, params['n_classes'], activation=None) -``` - -Here, `net` signifies the final hidden layer. Therefore, the full set of layers -is now connected as follows: - -
-[Figure: A logits output layer connected to the top hidden layer; the final
-hidden layer feeds into the output layer.]
- -When defining an output layer, the `units` parameter specifies the number of -outputs. So, by setting `units` to `params['n_classes']`, the model produces -one output value per class. Each element of the output vector will contain the -score, or "logit", calculated for the associated class of Iris: Setosa, -Versicolor, or Virginica, respectively. - -Later on, these logits will be transformed into probabilities by the -@{tf.nn.softmax} function. - -## Implement training, evaluation, and prediction {#modes} - -The final step in creating a model function is to write branching code that -implements prediction, evaluation, and training. - -The model function gets invoked whenever someone calls the Estimator's `train`, -`evaluate`, or `predict` methods. Recall that the signature for the model -function looks like this: - -``` python -def my_model_fn( - features, # This is batch_features from input_fn - labels, # This is batch_labels from input_fn - mode, # An instance of tf.estimator.ModeKeys, see below - params): # Additional configuration -``` - -Focus on that third argument, mode. As the following table shows, when someone -calls `train`, `evaluate`, or `predict`, the Estimator framework invokes your model -function with the mode parameter set as follows: - -| Estimator method | Estimator Mode | -|:---------------------------------|:------------------| -|@{tf.estimator.Estimator.train$`train()`} |@{tf.estimator.ModeKeys.TRAIN$`ModeKeys.TRAIN`} | -|@{tf.estimator.Estimator.evaluate$`evaluate()`} |@{tf.estimator.ModeKeys.EVAL$`ModeKeys.EVAL`} | -|@{tf.estimator.Estimator.predict$`predict()`}|@{tf.estimator.ModeKeys.PREDICT$`ModeKeys.PREDICT`} | - -For example, suppose you instantiate a custom Estimator to generate an object -named `classifier`. Then, you make the following call: - -``` python -classifier = tf.estimator.Estimator(...) -classifier.train(input_fn=lambda: my_input_fn(FILE_TRAIN, True, 500)) -``` -The Estimator framework then calls your model function with mode set to -`ModeKeys.TRAIN`. - -Your model function must provide code to handle all three of the mode values. -For each mode value, your code must return an instance of -`tf.estimator.EstimatorSpec`, which contains the information the caller -requires. Let's examine each mode. - -### Predict - -When the Estimator's `predict` method is called, the `model_fn` receives -`mode = ModeKeys.PREDICT`. In this case, the model function must return a -`tf.estimator.EstimatorSpec` containing the prediction. - -The model must have been trained prior to making a prediction. The trained model -is stored on disk in the `model_dir` directory established when you -instantiated the Estimator. - -The code to generate the prediction for this model looks as follows: - -```python -# Compute predictions. -predicted_classes = tf.argmax(logits, 1) -if mode == tf.estimator.ModeKeys.PREDICT: - predictions = { - 'class_ids': predicted_classes[:, tf.newaxis], - 'probabilities': tf.nn.softmax(logits), - 'logits': logits, - } - return tf.estimator.EstimatorSpec(mode, predictions=predictions) -``` -The prediction dictionary contains everything that your model returns when run -in prediction mode. - -
-[Figure: Additional outputs added to the output layer.]
-
-
-The `predictions` dictionary holds the following three key/value pairs:
-
-* `class_ids` holds the class id (0, 1, or 2) representing the model's
-  prediction of the most likely species for this example.
-* `probabilities` holds the three probabilities (in this example, 0.02, 0.95,
-  and 0.03).
-* `logits` holds the raw logit values (in this example, -1.3, 2.6, and -0.9).
-
-We return that dictionary to the caller via the `predictions` parameter of the
-@{tf.estimator.EstimatorSpec}. The Estimator's
-@{tf.estimator.Estimator.predict$`predict`} method will yield these
-dictionaries.
-
-### Calculate the loss
-
-For both [training](#train) and [evaluation](#evaluate) we need to calculate the
-model's loss. This is the
-[objective](https://developers.google.com/machine-learning/glossary/#objective)
-that will be optimized.
-
-We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}.
-The value returned by this function will be lowest, approximately 0, when the
-probability of the correct class (at index `label`) is near 1.0. The loss value
-returned is progressively larger as the probability of the correct class
-decreases.
-
-This function returns the average over the whole batch.
-
-```python
-# Compute loss.
-loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
-```
-
-### Evaluate
-
-When the Estimator's `evaluate` method is called, the `model_fn` receives
-`mode = ModeKeys.EVAL`. In this case, the model function must return a
-`tf.estimator.EstimatorSpec` containing the model's loss and optionally one
-or more metrics.
-
-Although returning metrics is optional, most custom Estimators do return at
-least one metric. TensorFlow provides a Metrics module @{tf.metrics} to
-calculate common metrics. For brevity's sake, we'll only return accuracy. The
-@{tf.metrics.accuracy} function compares our predictions against the
-true values, that is, against the labels provided by the input function. The
-@{tf.metrics.accuracy} function requires the labels and predictions to have the
-same shape. Here's the call to @{tf.metrics.accuracy}:
-
-``` python
-# Compute evaluation metrics.
-accuracy = tf.metrics.accuracy(labels=labels,
-                               predictions=predicted_classes,
-                               name='acc_op')
-```
-
-The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for evaluation
-typically contains the following information:
-
-* `loss`, which is the model's loss.
-* `eval_metric_ops`, which is an optional dictionary of metrics.
-
-So, we'll create a dictionary containing our sole metric. If we had calculated
-other metrics, we would have added them as additional key/value pairs to that
-same dictionary. Then, we'll pass that dictionary in the `eval_metric_ops`
-argument of `tf.estimator.EstimatorSpec`. Here's the code:
-
-```python
-metrics = {'accuracy': accuracy}
-tf.summary.scalar('accuracy', accuracy[1])
-
-if mode == tf.estimator.ModeKeys.EVAL:
-    return tf.estimator.EstimatorSpec(
-        mode, loss=loss, eval_metric_ops=metrics)
-```
-
-The @{tf.summary.scalar} will make accuracy available to TensorBoard
-in both `TRAIN` and `EVAL` modes. (More on this later.)
-
-### Train
-
-When the Estimator's `train` method is called, the `model_fn` is called
-with `mode = ModeKeys.TRAIN`. In this case, the model function must return an
-`EstimatorSpec` that contains the loss and a training operation.
-
-Building the training operation will require an optimizer. We will use
-@{tf.train.AdagradOptimizer} because we're mimicking the `DNNClassifier`, which
-also uses `Adagrad` by default.
The `tf.train` package provides many other -optimizers—feel free to experiment with them. - -Here is the code that builds the optimizer: - -``` python -optimizer = tf.train.AdagradOptimizer(learning_rate=0.1) -``` - -Next, we build the training operation using the optimizer's -@{tf.train.Optimizer.minimize$`minimize`} method on the loss we calculated -earlier. - -The `minimize` method also takes a `global_step` parameter. TensorFlow uses this -parameter to count the number of training steps that have been processed -(to know when to end a training run). Furthermore, the `global_step` is -essential for TensorBoard graphs to work correctly. Simply call -@{tf.train.get_global_step} and pass the result to the `global_step` -argument of `minimize`. - -Here's the code to train the model: - -``` python -train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step()) -``` - -The @{tf.estimator.EstimatorSpec$`EstimatorSpec`} returned for training -must have the following fields set: - -* `loss`, which contains the value of the loss function. -* `train_op`, which executes a training step. - -Here's our code to call `EstimatorSpec`: - -```python -return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op) -``` - -The model function is now complete. - -## The custom Estimator - -Instantiate the custom Estimator through the Estimator base class as follows: - -```python - # Build 2 hidden layer DNN with 10, 10 units respectively. - classifier = tf.estimator.Estimator( - model_fn=my_model, - params={ - 'feature_columns': my_feature_columns, - # Two hidden layers of 10 nodes each. - 'hidden_units': [10, 10], - # The model must choose between 3 classes. - 'n_classes': 3, - }) -``` -Here the `params` dictionary serves the same purpose as the key-word -arguments of `DNNClassifier`; that is, the `params` dictionary lets you -configure your Estimator without modifying the code in the `model_fn`. - -The rest of the code to train, evaluate, and generate predictions using our -Estimator is the same as in the -@{$premade_estimators$Premade Estimators} chapter. For -example, the following line will train the model: - -```python -# Train the Model. -classifier.train( - input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size), - steps=args.train_steps) -``` - -## TensorBoard - -You can view training results for your custom Estimator in TensorBoard. To see -this reporting, start TensorBoard from your command line as follows: - -```bsh -# Replace PATH with the actual path passed as model_dir -tensorboard --logdir=PATH -``` - -Then, open TensorBoard by browsing to: [http://localhost:6006](http://localhost:6006) - -All the pre-made Estimators automatically log a lot of information to -TensorBoard. With custom Estimators, however, TensorBoard only provides one -default log (a graph of the loss) plus the information you explicitly tell -TensorBoard to log. For the custom Estimator you just created, TensorBoard -generates the following: - -
-[Figure: TensorBoard displays three scalar graphs: accuracy, loss, and
-steps/second.]
-
-
-In brief, here's what the three graphs tell you:
-
-* global_step/sec: A performance indicator showing how many batches (gradient
-  updates) we processed per second as the model trains.
-
-* loss: The loss reported.
-
-* accuracy: The accuracy is recorded by the following two lines:
-
-  * `eval_metric_ops={'accuracy': accuracy}`, during evaluation.
-  * `tf.summary.scalar('accuracy', accuracy[1])`, during training.
-
-These TensorBoard graphs are one of the main reasons it's important to pass a
-`global_step` to your optimizer's `minimize` method. The model can't record
-the x-coordinate for these graphs without it.
-
-Note the following in the `accuracy` and `loss` graphs:
-
-* The orange line represents training.
-* The blue dot represents evaluation.
-
-During training, summaries (the orange line) are recorded periodically as
-batches are processed, which is why the training curve spans the x-axis range.
-
-By contrast, evaluation produces only a single point on the graph for each call
-to `evaluate`. This point contains the average over the entire evaluation call.
-It has no width on the graph because it is evaluated entirely from the model
-state at a particular training step (from a single checkpoint).
-
-As suggested in the following figure, you can use the controls on the left
-side to selectively enable or disable reporting.
-
-[Figure: Checkboxes on the left side let you select which runs are shown
-(enable or disable reporting).]
- - -## Summary - -Although pre-made Estimators can be an effective way to quickly create new -models, you will often need the additional flexibility that custom Estimators -provide. Fortunately, pre-made and custom Estimators follow the same -programming model. The only practical difference is that you must write a model -function for custom Estimators; everything else is the same. - -For more details, be sure to check out: - -* The - [official TensorFlow implementation of MNIST](https://github.com/tensorflow/models/tree/master/official/mnist), - which uses a custom estimator. -* The TensorFlow - [official models repository](https://github.com/tensorflow/models/tree/master/official), - which contains more curated examples using custom estimators. -* This [TensorBoard video](https://youtu.be/eBbEDRsCmv4), which introduces - TensorBoard. -* The @{$low_level_intro$Low Level Introduction}, which demonstrates - how to experiment directly with TensorFlow's low level APIs, making debugging - easier. diff --git a/tensorflow/docs_src/programmers_guide/datasets.md b/tensorflow/docs_src/programmers_guide/datasets.md deleted file mode 100644 index 8b69860a68..0000000000 --- a/tensorflow/docs_src/programmers_guide/datasets.md +++ /dev/null @@ -1,823 +0,0 @@ -# Importing Data - -The @{tf.data} API enables you to build complex input pipelines from -simple, reusable pieces. For example, the pipeline for an image model might -aggregate data from files in a distributed file system, apply random -perturbations to each image, and merge randomly selected images into a batch -for training. The pipeline for a text model might involve extracting symbols -from raw text data, converting them to embedding identifiers with a lookup -table, and batching together sequences of different lengths. The `tf.data` API -makes it easy to deal with large amounts of data, different data formats, and -complicated transformations. - -The `tf.data` API introduces two new abstractions to TensorFlow: - -* A `tf.data.Dataset` represents a sequence of elements, in which - each element contains one or more `Tensor` objects. For example, in an image - pipeline, an element might be a single training example, with a pair of - tensors representing the image data and a label. There are two distinct - ways to create a dataset: - - * Creating a **source** (e.g. `Dataset.from_tensor_slices()`) constructs a - dataset from - one or more `tf.Tensor` objects. - - * Applying a **transformation** (e.g. `Dataset.batch()`) constructs a dataset - from one or more `tf.data.Dataset` objects. - -* A `tf.data.Iterator` provides the main way to extract elements from a - dataset. The operation returned by `Iterator.get_next()` yields the next - element of a `Dataset` when executed, and typically acts as the interface - between input pipeline code and your model. The simplest iterator is a - "one-shot iterator", which is associated with a particular `Dataset` and - iterates through it once. For more sophisticated uses, the - `Iterator.initializer` operation enables you to reinitialize and parameterize - an iterator with different datasets, so that you can, for example, iterate - over training and validation data multiple times in the same program. - -## Basic mechanics - -This section of the guide describes the fundamentals of creating different kinds -of `Dataset` and `Iterator` objects, and how to extract data from them. - -To start an input pipeline, you must define a *source*. 
For example, to -construct a `Dataset` from some tensors in memory, you can use -`tf.data.Dataset.from_tensors()` or -`tf.data.Dataset.from_tensor_slices()`. Alternatively, if your input -data are on disk in the recommended TFRecord format, you can construct a -`tf.data.TFRecordDataset`. - -Once you have a `Dataset` object, you can *transform* it into a new `Dataset` by -chaining method calls on the `tf.data.Dataset` object. For example, you -can apply per-element transformations such as `Dataset.map()` (to apply a -function to each element), and multi-element transformations such as -`Dataset.batch()`. See the documentation for @{tf.data.Dataset} -for a complete list of transformations. - -The most common way to consume values from a `Dataset` is to make an -**iterator** object that provides access to one element of the dataset at a time -(for example, by calling `Dataset.make_one_shot_iterator()`). A -`tf.data.Iterator` provides two operations: `Iterator.initializer`, -which enables you to (re)initialize the iterator's state; and -`Iterator.get_next()`, which returns `tf.Tensor` objects that correspond to the -symbolic next element. Depending on your use case, you might choose a different -type of iterator, and the options are outlined below. - -### Dataset structure - -A dataset comprises elements that each have the same structure. An element -contains one or more `tf.Tensor` objects, called *components*. Each component -has a `tf.DType` representing the type of elements in the tensor, and a -`tf.TensorShape` representing the (possibly partially specified) static shape of -each element. The `Dataset.output_types` and `Dataset.output_shapes` properties -allow you to inspect the inferred types and shapes of each component of a -dataset element. The *nested structure* of these properties map to the structure -of an element, which may be a single tensor, a tuple of tensors, or a nested -tuple of tensors. For example: - -```python -dataset1 = tf.data.Dataset.from_tensor_slices(tf.random_uniform([4, 10])) -print(dataset1.output_types) # ==> "tf.float32" -print(dataset1.output_shapes) # ==> "(10,)" - -dataset2 = tf.data.Dataset.from_tensor_slices( - (tf.random_uniform([4]), - tf.random_uniform([4, 100], maxval=100, dtype=tf.int32))) -print(dataset2.output_types) # ==> "(tf.float32, tf.int32)" -print(dataset2.output_shapes) # ==> "((), (100,))" - -dataset3 = tf.data.Dataset.zip((dataset1, dataset2)) -print(dataset3.output_types) # ==> (tf.float32, (tf.float32, tf.int32)) -print(dataset3.output_shapes) # ==> "(10, ((), (100,)))" -``` - -It is often convenient to give names to each component of an element, for -example if they represent different features of a training example. In addition -to tuples, you can use `collections.namedtuple` or a dictionary mapping strings -to tensors to represent a single element of a `Dataset`. - -```python -dataset = tf.data.Dataset.from_tensor_slices( - {"a": tf.random_uniform([4]), - "b": tf.random_uniform([4, 100], maxval=100, dtype=tf.int32)}) -print(dataset.output_types) # ==> "{'a': tf.float32, 'b': tf.int32}" -print(dataset.output_shapes) # ==> "{'a': (), 'b': (100,)}" -``` - -The `Dataset` transformations support datasets of any structure. When using the -`Dataset.map()`, `Dataset.flat_map()`, and `Dataset.filter()` transformations, -which apply a function to each element, the element structure determines the -arguments of the function: - -```python -dataset1 = dataset1.map(lambda x: ...) - -dataset2 = dataset2.flat_map(lambda x, y: ...) 
- -# Note: Argument destructuring is not available in Python 3. -dataset3 = dataset3.filter(lambda x, (y, z): ...) -``` - -### Creating an iterator - -Once you have built a `Dataset` to represent your input data, the next step is to -create an `Iterator` to access elements from that dataset. The `tf.data` API -currently supports the following iterators, in increasing level of -sophistication: - -* **one-shot**, -* **initializable**, -* **reinitializable**, and -* **feedable**. - -A **one-shot** iterator is the simplest form of iterator, which only supports -iterating once through a dataset, with no need for explicit initialization. -One-shot iterators handle almost all of the cases that the existing queue-based -input pipelines support, but they do not support parameterization. Using the -example of `Dataset.range()`: - -```python -dataset = tf.data.Dataset.range(100) -iterator = dataset.make_one_shot_iterator() -next_element = iterator.get_next() - -for i in range(100): - value = sess.run(next_element) - assert i == value -``` - -Note: Currently, one-shot iterators are the only type that is easily usable -with an `Estimator`. - -An **initializable** iterator requires you to run an explicit -`iterator.initializer` operation before using it. In exchange for this -inconvenience, it enables you to *parameterize* the definition of the dataset, -using one or more `tf.placeholder()` tensors that can be fed when you -initialize the iterator. Continuing the `Dataset.range()` example: - -```python -max_value = tf.placeholder(tf.int64, shape=[]) -dataset = tf.data.Dataset.range(max_value) -iterator = dataset.make_initializable_iterator() -next_element = iterator.get_next() - -# Initialize an iterator over a dataset with 10 elements. -sess.run(iterator.initializer, feed_dict={max_value: 10}) -for i in range(10): - value = sess.run(next_element) - assert i == value - -# Initialize the same iterator over a dataset with 100 elements. -sess.run(iterator.initializer, feed_dict={max_value: 100}) -for i in range(100): - value = sess.run(next_element) - assert i == value -``` - -A **reinitializable** iterator can be initialized from multiple different -`Dataset` objects. For example, you might have a training input pipeline that -uses random perturbations to the input images to improve generalization, and -a validation input pipeline that evaluates predictions on unmodified data. These -pipelines will typically use different `Dataset` objects that have the same -structure (i.e. the same types and compatible shapes for each component). - -```python -# Define training and validation datasets with the same structure. -training_dataset = tf.data.Dataset.range(100).map( - lambda x: x + tf.random_uniform([], -10, 10, tf.int64)) -validation_dataset = tf.data.Dataset.range(50) - -# A reinitializable iterator is defined by its structure. We could use the -# `output_types` and `output_shapes` properties of either `training_dataset` -# or `validation_dataset` here, because they are compatible. -iterator = tf.data.Iterator.from_structure(training_dataset.output_types, - training_dataset.output_shapes) -next_element = iterator.get_next() - -training_init_op = iterator.make_initializer(training_dataset) -validation_init_op = iterator.make_initializer(validation_dataset) - -# Run 20 epochs in which the training dataset is traversed, followed by the -# validation dataset. -for _ in range(20): - # Initialize an iterator over the training dataset. 
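-  # Re-running an initializer op rewinds the iterator to the start of the
-  # corresponding dataset, so the same `next_element` op serves both pipelines.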
- sess.run(training_init_op) - for _ in range(100): - sess.run(next_element) - - # Initialize an iterator over the validation dataset. - sess.run(validation_init_op) - for _ in range(50): - sess.run(next_element) -``` - -A **feedable** iterator can be used together with @{tf.placeholder} to select -what `Iterator` to use in each call to @{tf.Session.run}, via the familiar -`feed_dict` mechanism. It offers the same functionality as a reinitializable -iterator, but it does not require you to initialize the iterator from the start -of a dataset when you switch between iterators. For example, using the same -training and validation example from above, you can use -@{tf.data.Iterator.from_string_handle} to define a feedable iterator -that allows you to switch between the two datasets: - -```python -# Define training and validation datasets with the same structure. -training_dataset = tf.data.Dataset.range(100).map( - lambda x: x + tf.random_uniform([], -10, 10, tf.int64)).repeat() -validation_dataset = tf.data.Dataset.range(50) - -# A feedable iterator is defined by a handle placeholder and its structure. We -# could use the `output_types` and `output_shapes` properties of either -# `training_dataset` or `validation_dataset` here, because they have -# identical structure. -handle = tf.placeholder(tf.string, shape=[]) -iterator = tf.data.Iterator.from_string_handle( - handle, training_dataset.output_types, training_dataset.output_shapes) -next_element = iterator.get_next() - -# You can use feedable iterators with a variety of different kinds of iterator -# (such as one-shot and initializable iterators). -training_iterator = training_dataset.make_one_shot_iterator() -validation_iterator = validation_dataset.make_initializable_iterator() - -# The `Iterator.string_handle()` method returns a tensor that can be evaluated -# and used to feed the `handle` placeholder. -training_handle = sess.run(training_iterator.string_handle()) -validation_handle = sess.run(validation_iterator.string_handle()) - -# Loop forever, alternating between training and validation. -while True: - # Run 200 steps using the training dataset. Note that the training dataset is - # infinite, and we resume from where we left off in the previous `while` loop - # iteration. - for _ in range(200): - sess.run(next_element, feed_dict={handle: training_handle}) - - # Run one pass over the validation dataset. - sess.run(validation_iterator.initializer) - for _ in range(50): - sess.run(next_element, feed_dict={handle: validation_handle}) -``` - -### Consuming values from an iterator - -The `Iterator.get_next()` method returns one or more `tf.Tensor` objects that -correspond to the symbolic next element of an iterator. Each time these tensors -are evaluated, they take the value of the next element in the underlying -dataset. (Note that, like other stateful objects in TensorFlow, calling -`Iterator.get_next()` does not immediately advance the iterator. Instead you -must use the returned `tf.Tensor` objects in a TensorFlow expression, and pass -the result of that expression to `tf.Session.run()` to get the next elements and -advance the iterator.) - -If the iterator reaches the end of the dataset, executing -the `Iterator.get_next()` operation will raise a `tf.errors.OutOfRangeError`. -After this point the iterator will be in an unusable state, and you must -initialize it again if you want to use it further. 
-
-```python
-dataset = tf.data.Dataset.range(5)
-iterator = dataset.make_initializable_iterator()
-next_element = iterator.get_next()
-
-# Typically `result` will be the output of a model, or an optimizer's
-# training operation.
-result = tf.add(next_element, next_element)
-
-sess.run(iterator.initializer)
-print(sess.run(result))  # ==> "0"
-print(sess.run(result))  # ==> "2"
-print(sess.run(result))  # ==> "4"
-print(sess.run(result))  # ==> "6"
-print(sess.run(result))  # ==> "8"
-try:
-  sess.run(result)
-except tf.errors.OutOfRangeError:
-  print("End of dataset")  # ==> "End of dataset"
-```
-
-A common pattern is to wrap the "training loop" in a `try`-`except` block:
-
-```python
-sess.run(iterator.initializer)
-while True:
-  try:
-    sess.run(result)
-  except tf.errors.OutOfRangeError:
-    break
-```
-
-If each element of the dataset has a nested structure, the return value of
-`Iterator.get_next()` will be one or more `tf.Tensor` objects in the same
-nested structure:
-
-```python
-dataset1 = tf.data.Dataset.from_tensor_slices(tf.random_uniform([4, 10]))
-dataset2 = tf.data.Dataset.from_tensor_slices((tf.random_uniform([4]), tf.random_uniform([4, 100])))
-dataset3 = tf.data.Dataset.zip((dataset1, dataset2))
-
-iterator = dataset3.make_initializable_iterator()
-
-sess.run(iterator.initializer)
-next1, (next2, next3) = iterator.get_next()
-```
-
-Note that `next1`, `next2`, and `next3` are tensors produced by the
-same op/node (created by `Iterator.get_next()`). Therefore, evaluating *any* of
-these tensors will advance the iterator for all components. A typical consumer
-of an iterator will include all components in a single expression.
-
-### Saving iterator state
-
-The @{tf.contrib.data.make_saveable_from_iterator} function creates a
-`SaveableObject` from an iterator, which can be used to save and
-restore the current state of the iterator (and, effectively, the whole input
-pipeline). A saveable object thus created can be added to the @{tf.train.Saver}
-variables list or the `tf.GraphKeys.SAVEABLE_OBJECTS` collection for saving and
-restoring in the same manner as a @{tf.Variable}. Refer to
-@{$saved_model$Saving and Restoring} for details on how to save and restore
-variables.
-
-```python
-# Create saveable object from iterator.
-saveable = tf.contrib.data.make_saveable_from_iterator(iterator)
-
-# Save the iterator state by adding it to the saveable objects collection.
-tf.add_to_collection(tf.GraphKeys.SAVEABLE_OBJECTS, saveable)
-saver = tf.train.Saver()
-
-with tf.Session() as sess:
-
-  if should_checkpoint:
-    saver.save(sess, path_to_checkpoint)
-
-# Restore the iterator state.
-with tf.Session() as sess:
-  saver.restore(sess, path_to_checkpoint)
-```
-
-## Reading input data
-
-### Consuming NumPy arrays
-
-If all of your input data fit in memory, the simplest way to create a `Dataset`
-from them is to convert them to `tf.Tensor` objects and use
-`Dataset.from_tensor_slices()`.
-
-```python
-# Load the training data into two NumPy arrays, for example using `np.load()`.
-with np.load("/var/data/training_data.npy") as data:
-  features = data["features"]
-  labels = data["labels"]
-
-# Assume that each row of `features` corresponds to the same row as `labels`.
-assert features.shape[0] == labels.shape[0]
-
-dataset = tf.data.Dataset.from_tensor_slices((features, labels))
-```
-
-Note that the above code snippet will embed the `features` and `labels` arrays
-in your TensorFlow graph as `tf.constant()` operations.
This works well for a -small dataset, but wastes memory---because the contents of the array will be -copied multiple times---and can run into the 2GB limit for the `tf.GraphDef` -protocol buffer. - -As an alternative, you can define the `Dataset` in terms of `tf.placeholder()` -tensors, and *feed* the NumPy arrays when you initialize an `Iterator` over the -dataset. - -```python -# Load the training data into two NumPy arrays, for example using `np.load()`. -with np.load("/var/data/training_data.npy") as data: - features = data["features"] - labels = data["labels"] - -# Assume that each row of `features` corresponds to the same row as `labels`. -assert features.shape[0] == labels.shape[0] - -features_placeholder = tf.placeholder(features.dtype, features.shape) -labels_placeholder = tf.placeholder(labels.dtype, labels.shape) - -dataset = tf.data.Dataset.from_tensor_slices((features_placeholder, labels_placeholder)) -# [Other transformations on `dataset`...] -dataset = ... -iterator = dataset.make_initializable_iterator() - -sess.run(iterator.initializer, feed_dict={features_placeholder: features, - labels_placeholder: labels}) -``` - -### Consuming TFRecord data - -The `tf.data` API supports a variety of file formats so that you can process -large datasets that do not fit in memory. For example, the TFRecord file format -is a simple record-oriented binary format that many TensorFlow applications use -for training data. The `tf.data.TFRecordDataset` class enables you to -stream over the contents of one or more TFRecord files as part of an input -pipeline. - -```python -# Creates a dataset that reads all of the examples from two files. -filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"] -dataset = tf.data.TFRecordDataset(filenames) -``` - -The `filenames` argument to the `TFRecordDataset` initializer can either be a -string, a list of strings, or a `tf.Tensor` of strings. Therefore if you have -two sets of files for training and validation purposes, you can use a -`tf.placeholder(tf.string)` to represent the filenames, and initialize an -iterator from the appropriate filenames: - -```python -filenames = tf.placeholder(tf.string, shape=[None]) -dataset = tf.data.TFRecordDataset(filenames) -dataset = dataset.map(...) # Parse the record into tensors. -dataset = dataset.repeat() # Repeat the input indefinitely. -dataset = dataset.batch(32) -iterator = dataset.make_initializable_iterator() - -# You can feed the initializer with the appropriate filenames for the current -# phase of execution, e.g. training vs. validation. - -# Initialize `iterator` with training data. -training_filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"] -sess.run(iterator.initializer, feed_dict={filenames: training_filenames}) - -# Initialize `iterator` with validation data. -validation_filenames = ["/var/data/validation1.tfrecord", ...] -sess.run(iterator.initializer, feed_dict={filenames: validation_filenames}) -``` - -### Consuming text data - -Many datasets are distributed as one or more text files. The -`tf.data.TextLineDataset` provides an easy way to extract lines from -one or more text files. Given one or more filenames, a `TextLineDataset` will -produce one string-valued element per line of those files. Like a -`TFRecordDataset`, `TextLineDataset` accepts `filenames` as a `tf.Tensor`, so -you can parameterize it by passing a `tf.placeholder(tf.string)`. 
-
-```python
-filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
-dataset = tf.data.TextLineDataset(filenames)
-```
-
-By default, a `TextLineDataset` yields *every* line of each file, which may
-not be desirable, for example if the file starts with a header line, or contains
-comments. These lines can be removed using the `Dataset.skip()` and
-`Dataset.filter()` transformations. To apply these transformations to each
-file separately, we use `Dataset.flat_map()` to create a nested `Dataset` for
-each file.
-
-```python
-filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
-
-dataset = tf.data.Dataset.from_tensor_slices(filenames)
-
-# Use `Dataset.flat_map()` to transform each file as a separate nested dataset,
-# and then concatenate their contents sequentially into a single "flat" dataset.
-# * Skip the first line (header row).
-# * Filter out lines beginning with "#" (comments).
-dataset = dataset.flat_map(
-    lambda filename: (
-        tf.data.TextLineDataset(filename)
-        .skip(1)
-        .filter(lambda line: tf.not_equal(tf.substr(line, 0, 1), "#"))))
-```
-
-### Consuming CSV data
-
-The CSV file format is a popular format for storing tabular data in plain text.
-The @{tf.contrib.data.CsvDataset} class provides a way to extract records from
-one or more CSV files that comply with [RFC 4180](https://tools.ietf.org/html/rfc4180).
-Given one or more filenames and a list of defaults, a `CsvDataset` will produce
-a tuple of elements whose types correspond to the types of the defaults
-provided, per CSV record. Like `TFRecordDataset` and `TextLineDataset`,
-`CsvDataset` accepts `filenames` as a `tf.Tensor`, so you can parameterize it
-by passing a `tf.placeholder(tf.string)`.
-
-```
-# Creates a dataset that reads all of the records from two CSV files, each with
-# eight float columns
-filenames = ["/var/data/file1.csv", "/var/data/file2.csv"]
-record_defaults = [tf.float32] * 8   # Eight required float columns
-dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)
-```
-
-If some columns are empty, you can provide defaults instead of types.
-
-```
-# Creates a dataset that reads all of the records from two CSV files, each with
-# eight float columns which may have missing values
-record_defaults = [[0.0]] * 8
-dataset = tf.contrib.data.CsvDataset(filenames, record_defaults)
-```
-
-By default, a `CsvDataset` yields *every* column of *every* line of the file,
-which may not be desirable, for example if the file starts with a header line
-that should be ignored, or if some columns are not required in the input.
-These lines and fields can be removed with the `header` and `select_cols`
-arguments respectively.
-
-```
-# Creates a dataset that reads all of the records from two CSV files with
-# headers, extracting float data from columns 2 and 4.
-record_defaults = [[0.0]] * 2  # Only provide defaults for the selected columns
-dataset = tf.contrib.data.CsvDataset(filenames, record_defaults, header=True, select_cols=[2,4])
-```
-
-
-## Preprocessing data with `Dataset.map()`
-
-The `Dataset.map(f)` transformation produces a new dataset by applying a given
-function `f` to each element of the input dataset. It is based on
-the
-[`map()` function](https://en.wikipedia.org/wiki/Map_(higher-order_function))
-that is commonly applied to lists (and other structures) in functional
-programming languages. The function `f` takes the `tf.Tensor` objects that
-represent a single element in the input, and returns the `tf.Tensor` objects
-that will represent a single element in the new dataset. Its implementation uses
-standard TensorFlow operations to transform one element into another.
-
-This section covers common examples of how to use `Dataset.map()`.
-
-### Parsing `tf.Example` protocol buffer messages
-
-Many input pipelines extract `tf.train.Example` protocol buffer messages from a
-TFRecord-format file (written, for example, using
-`tf.python_io.TFRecordWriter`). Each `tf.train.Example` record contains one or
-more "features", and the input pipeline typically converts these features into
-tensors.
-
-```python
-# Transforms a scalar string `example_proto` into a pair of a scalar string and
-# a scalar integer, representing an image and its label, respectively.
-def _parse_function(example_proto):
-  features = {"image": tf.FixedLenFeature((), tf.string, default_value=""),
-              "label": tf.FixedLenFeature((), tf.int64, default_value=0)}
-  parsed_features = tf.parse_single_example(example_proto, features)
-  return parsed_features["image"], parsed_features["label"]
-
-# Creates a dataset that reads all of the examples from two files, and extracts
-# the image and label features.
-filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-dataset = tf.data.TFRecordDataset(filenames)
-dataset = dataset.map(_parse_function)
-```
-
-### Decoding image data and resizing it
-
-When training a neural network on real-world image data, it is often necessary
-to convert images of different sizes to a common size, so that they may be
-batched into a fixed size.
-
-```python
-# Reads an image from a file, decodes it into a dense tensor, and resizes it
-# to a fixed shape.
-def _parse_function(filename, label):
-  image_string = tf.read_file(filename)
-  image_decoded = tf.image.decode_jpeg(image_string)
-  image_resized = tf.image.resize_images(image_decoded, [28, 28])
-  return image_resized, label
-
-# A vector of filenames.
-filenames = tf.constant(["/var/data/image1.jpg", "/var/data/image2.jpg", ...])
-
-# `labels[i]` is the label for the image in `filenames[i]`.
-labels = tf.constant([0, 37, ...])
-
-dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
-dataset = dataset.map(_parse_function)
-```
-
-### Applying arbitrary Python logic with `tf.py_func()`
-
-For performance reasons, we encourage you to use TensorFlow operations for
-preprocessing your data whenever possible. However, it is sometimes useful to
-call upon external Python libraries when parsing your input data. To do so,
-invoke the `tf.py_func()` operation in a `Dataset.map()` transformation.
-
-```python
-import cv2
-
-# Use a custom OpenCV function to read the image, instead of the standard
-# TensorFlow `tf.read_file()` operation.
-def _read_py_function(filename, label):
-  image_decoded = cv2.imread(filename.decode(), cv2.IMREAD_GRAYSCALE)
-  return image_decoded, label
-
-# Use standard TensorFlow operations to resize the image to a fixed shape.
-def _resize_function(image_decoded, label):
-  image_decoded.set_shape([None, None, None])
-  image_resized = tf.image.resize_images(image_decoded, [28, 28])
-  return image_resized, label
-
-filenames = ["/var/data/image1.jpg", "/var/data/image2.jpg", ...]
-labels = [0, 37, 29, 1, ...]
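-
-# `tf.py_func` runs `_read_py_function` as ordinary Python code, so TensorFlow
-# cannot statically infer the shapes of its outputs; this is why
-# `_resize_function` calls `set_shape` before resizing.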
-
-dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
-dataset = dataset.map(
-    lambda filename, label: tuple(tf.py_func(
-        _read_py_function, [filename, label], [tf.uint8, label.dtype])))
-dataset = dataset.map(_resize_function)
-```
-
-
-
-## Batching dataset elements
-
-### Simple batching
-
-The simplest form of batching stacks `n` consecutive elements of a dataset into
-a single element. The `Dataset.batch()` transformation does exactly this, with
-the same constraints as the `tf.stack()` operator, applied to each component
-of the elements: i.e. for each component *i*, all elements must have a tensor
-of the exact same shape.
-
-```python
-inc_dataset = tf.data.Dataset.range(100)
-dec_dataset = tf.data.Dataset.range(0, -100, -1)
-dataset = tf.data.Dataset.zip((inc_dataset, dec_dataset))
-batched_dataset = dataset.batch(4)
-
-iterator = batched_dataset.make_one_shot_iterator()
-next_element = iterator.get_next()
-
-print(sess.run(next_element))  # ==> ([0, 1, 2, 3],   [ 0, -1,  -2,  -3])
-print(sess.run(next_element))  # ==> ([4, 5, 6, 7],   [-4, -5,  -6,  -7])
-print(sess.run(next_element))  # ==> ([8, 9, 10, 11], [-8, -9, -10, -11])
-```
-
-### Batching tensors with padding
-
-The above recipe works for tensors that all have the same size. However, many
-models (e.g. sequence models) work with input data that can have varying size
-(e.g. sequences of different lengths). To handle this case, the
-`Dataset.padded_batch()` transformation enables you to batch tensors of
-different shape by specifying one or more dimensions in which they may be
-padded.
-
-```python
-dataset = tf.data.Dataset.range(100)
-dataset = dataset.map(lambda x: tf.fill([tf.cast(x, tf.int32)], x))
-dataset = dataset.padded_batch(4, padded_shapes=[None])
-
-iterator = dataset.make_one_shot_iterator()
-next_element = iterator.get_next()
-
-print(sess.run(next_element))  # ==> [[0, 0, 0], [1, 0, 0], [2, 2, 0], [3, 3, 3]]
-print(sess.run(next_element))  # ==> [[4, 4, 4, 4, 0, 0, 0],
-                               #      [5, 5, 5, 5, 5, 0, 0],
-                               #      [6, 6, 6, 6, 6, 6, 0],
-                               #      [7, 7, 7, 7, 7, 7, 7]]
-```
-
-The `Dataset.padded_batch()` transformation allows you to set different padding
-for each dimension of each component, and it may be variable-length (signified
-by `None` in the example above) or constant-length. It is also possible to
-override the padding value, which defaults to 0.
-
-
-
-## Training workflows
-
-### Processing multiple epochs
-
-The `tf.data` API offers two main ways to process multiple epochs of the same
-data.
-
-The simplest way to iterate over a dataset in multiple epochs is to use the
-`Dataset.repeat()` transformation. For example, to create a dataset that repeats
-its input for 10 epochs:
-
-```python
-filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-dataset = tf.data.TFRecordDataset(filenames)
-dataset = dataset.map(...)
-dataset = dataset.repeat(10)
-dataset = dataset.batch(32)
-```
-
-Applying the `Dataset.repeat()` transformation with no arguments will repeat
-the input indefinitely. The `Dataset.repeat()` transformation concatenates the
-repeated epochs end-to-end, without signaling the end of one epoch and the
-beginning of the next.
-
-If you want to receive a signal at the end of each epoch, you can write a
-training loop that catches the `tf.errors.OutOfRangeError` at the end of a
-dataset. At that point you might collect some statistics (e.g. the validation
-error) for the epoch.
-
-```python
-filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-dataset = tf.data.TFRecordDataset(filenames)
-dataset = dataset.map(...)
-dataset = dataset.batch(32)
-iterator = dataset.make_initializable_iterator()
-next_element = iterator.get_next()
-
-# Compute for 100 epochs.
-for _ in range(100):
-  sess.run(iterator.initializer)
-  while True:
-    try:
-      sess.run(next_element)
-    except tf.errors.OutOfRangeError:
-      break
-
-  # [Perform end-of-epoch calculations here.]
-```
-
-### Randomly shuffling input data
-
-The `Dataset.shuffle()` transformation randomly shuffles the input dataset
-using a similar algorithm to `tf.RandomShuffleQueue`: it maintains a fixed-size
-buffer and chooses the next element uniformly at random from that buffer.
-
-```python
-filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-dataset = tf.data.TFRecordDataset(filenames)
-dataset = dataset.map(...)
-dataset = dataset.shuffle(buffer_size=10000)
-dataset = dataset.batch(32)
-dataset = dataset.repeat()
-```
-
-### Using high-level APIs
-
-The @{tf.train.MonitoredTrainingSession} API simplifies many aspects of running
-TensorFlow in a distributed setting. `MonitoredTrainingSession` uses the
-@{tf.errors.OutOfRangeError} to signal that training has completed, so to use it
-with the `tf.data` API, we recommend using
-`Dataset.make_one_shot_iterator()`. For example:
-
-```python
-filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-dataset = tf.data.TFRecordDataset(filenames)
-dataset = dataset.map(...)
-dataset = dataset.shuffle(buffer_size=10000)
-dataset = dataset.batch(32)
-dataset = dataset.repeat(num_epochs)
-iterator = dataset.make_one_shot_iterator()
-
-next_example, next_label = iterator.get_next()
-loss = model_function(next_example, next_label)
-
-training_op = tf.train.AdagradOptimizer(...).minimize(loss)
-
-with tf.train.MonitoredTrainingSession(...) as sess:
-  while not sess.should_stop():
-    sess.run(training_op)
-```
-
-To use a `Dataset` in the `input_fn` of a @{tf.estimator.Estimator}, we also
-recommend using `Dataset.make_one_shot_iterator()`. For example:
-
-```python
-def dataset_input_fn():
-  filenames = ["/var/data/file1.tfrecord", "/var/data/file2.tfrecord"]
-  dataset = tf.data.TFRecordDataset(filenames)
-
-  # Use `tf.parse_single_example()` to extract data from a `tf.Example`
-  # protocol buffer, and perform any additional per-record preprocessing.
-  def parser(record):
-    keys_to_features = {
-        "image_data": tf.FixedLenFeature((), tf.string, default_value=""),
-        "date_time": tf.FixedLenFeature((), tf.int64, default_value=0),
-        "label": tf.FixedLenFeature((), tf.int64,
-                                    default_value=tf.zeros([], dtype=tf.int64)),
-    }
-    parsed = tf.parse_single_example(record, keys_to_features)
-
-    # Perform additional preprocessing on the parsed data.
-    image = tf.image.decode_jpeg(parsed["image_data"])
-    image = tf.reshape(image, [299, 299, 1])
-    label = tf.cast(parsed["label"], tf.int32)
-
-    return {"image_data": image, "date_time": parsed["date_time"]}, label
-
-  # Use `Dataset.map()` to build a pair of a feature dictionary and a label
-  # tensor for each example.
-  dataset = dataset.map(parser)
-  dataset = dataset.shuffle(buffer_size=10000)
-  dataset = dataset.batch(32)
-  dataset = dataset.repeat(num_epochs)
-  iterator = dataset.make_one_shot_iterator()
-
-  # `features` is a dictionary in which each value is a batch of values for
-  # that feature; `labels` is a batch of labels.
- features, labels = iterator.get_next() - return features, labels -``` diff --git a/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md b/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md deleted file mode 100644 index 345a31b985..0000000000 --- a/tensorflow/docs_src/programmers_guide/datasets_for_estimators.md +++ /dev/null @@ -1,387 +0,0 @@ -# Datasets for Estimators - -The @{tf.data} module contains a collection of classes that allows you to -easily load data, manipulate it, and pipe it into your model. This document -introduces the API by walking through two simple examples: - -* Reading in-memory data from numpy arrays. -* Reading lines from a csv file. - - - -## Basic input - -Taking slices from an array is the simplest way to get started with `tf.data`. - -The @{$premade_estimators$Premade Estimators} chapter describes -the following `train_input_fn`, from -[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py), -to pipe the data into the Estimator: - -``` python -def train_input_fn(features, labels, batch_size): - """An input function for training""" - # Convert the inputs to a Dataset. - dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) - - # Shuffle, repeat, and batch the examples. - dataset = dataset.shuffle(1000).repeat().batch(batch_size) - - # Return the dataset. - return dataset -``` - -Let's look at this more closely. - -### Arguments - -This function expects three arguments. Arguments expecting an "array" can -accept nearly anything that can be converted to an array with `numpy.array`. -One exception is -[`tuple`](https://docs.python.org/3/tutorial/datastructures.html#tuples-and-sequences) -which, as we will see, has special meaning for `Datasets`. - -* `features`: A `{'feature_name':array}` dictionary (or - [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)) - containing the raw input features. -* `labels` : An array containing the - [label](https://developers.google.com/machine-learning/glossary/#label) - for each example. -* `batch_size` : An integer indicating the desired batch size. - -In [`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py) -we retrieved the Iris data using the `iris_data.load_data()` function. -You can run it, and unpack the results as follows: - -``` python -import iris_data - -# Fetch the data -train, test = iris_data.load_data() -features, labels = train -``` - -Then we passed this data to the input function, with a line similar to this: - -``` python -batch_size=100 -iris_data.train_input_fn(features, labels, batch_size) -``` - -Let's walk through the `train_input_fn()`. - -### Slices - -The function starts by using the @{tf.data.Dataset.from_tensor_slices} function -to create a @{tf.data.Dataset} representing slices of the array. The array is -sliced across the first dimension. For example, an array containing the -@{$tutorials/layers$mnist training data} has a shape of `(60000, 28, 28)`. -Passing this to `from_tensor_slices` returns a `Dataset` object containing -60000 slices, each one a 28x28 image. 
-
-The code that returns this `Dataset` is as follows:
-
-``` python
-train, test = tf.keras.datasets.mnist.load_data()
-mnist_x, mnist_y = train
-
-mnist_ds = tf.data.Dataset.from_tensor_slices(mnist_x)
-print(mnist_ds)
-```
-
-This will print the following line, showing the
-@{$programmers_guide/tensors#shapes$shapes} and
-@{$programmers_guide/tensors#data_types$types} of the items in
-the dataset. Note that a `Dataset` does not know how many items it contains.
-
-``` None
-<TensorSliceDataset shapes: (28,28), types: tf.uint8>
-```
-
-The `Dataset` above represents a simple collection of arrays, but datasets are
-much more powerful than this. A `Dataset` can transparently handle any nested
-combination of dictionaries or tuples (or
-[`namedtuple`](https://docs.python.org/2/library/collections.html#collections.namedtuple)
-).
-
-For example, after converting the iris `features`
-to a standard python dictionary, you can then convert the dictionary of arrays
-to a `Dataset` of dictionaries as follows:
-
-``` python
-dataset = tf.data.Dataset.from_tensor_slices(dict(features))
-print(dataset)
-```
-``` None
-<TensorSliceDataset
-  shapes: {SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()},
-  types: {SepalLength: tf.float64, SepalWidth: tf.float64,
-          PetalLength: tf.float64, PetalWidth: tf.float64}>
-```
-
-Here we see that when a `Dataset` contains structured elements, the `shapes`
-and `types` of the `Dataset` take on the same structure. This dataset contains
-dictionaries of @{$programmers_guide/tensors#rank$scalars}, all of type
-`tf.float64`.
-
-The first line of the iris `train_input_fn` uses the same functionality, but
-adds another level of structure. It creates a dataset containing
-`(features_dict, label)` pairs.
-
-The following code shows that the label is a scalar with type `int64`:
-
-``` python
-# Convert the inputs to a Dataset.
-dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels))
-print(dataset)
-```
-```
-<TensorSliceDataset
-  shapes: ({SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()}, ()),
-  types: ({SepalLength: tf.float64, SepalWidth: tf.float64,
-           PetalLength: tf.float64, PetalWidth: tf.float64}, tf.int64)>
-```
-
-### Manipulation
-
-Currently the `Dataset` would iterate over the data once, in a fixed order, and
-only produce a single element at a time. It needs further processing before it
-can be used for training. Fortunately, the `tf.data.Dataset` class provides
-methods to better prepare the data for training. The next line of the input
-function takes advantage of several of these methods:
-
-``` python
-# Shuffle, repeat, and batch the examples.
-dataset = dataset.shuffle(1000).repeat().batch(batch_size)
-```
-
-The @{tf.data.Dataset.shuffle$`shuffle`} method uses a fixed-size buffer to
-shuffle the items as they pass through. In this case the `buffer_size` is
-greater than the number of examples in the `Dataset`, ensuring that the data is
-completely shuffled (the Iris data set contains only 150 examples).
-
-The @{tf.data.Dataset.repeat$`repeat`} method restarts the `Dataset` when
-it reaches the end. To limit the number of epochs, set the `count` argument.
-
-The @{tf.data.Dataset.batch$`batch`} method collects a number of examples and
-stacks them, to create batches. This adds a dimension to their shape. The new
-dimension is added as the first dimension. The following code uses
-the `batch` method on the MNIST `Dataset`, from earlier. This results in a
-`Dataset` containing 3D arrays representing stacks of `(28,28)` images:
-
-``` python
-print(mnist_ds.batch(100))
-```
-
-``` none
-<BatchDataset shapes: (?, 28, 28), types: tf.uint8>
-```
-Note that the dataset has an unknown batch size because the last batch will
-have fewer elements.
-
-In `train_input_fn`, after batching the `Dataset` contains 1D vectors of
-elements where each scalar was previously:
-
-```python
-print(dataset)
-```
-```
-<BatchDataset
-  shapes: ({SepalLength: (?,), SepalWidth: (?,), PetalLength: (?,), PetalWidth: (?,)}, (?,)),
-  types: ({SepalLength: tf.float64, SepalWidth: tf.float64,
-           PetalLength: tf.float64, PetalWidth: tf.float64}, tf.int64)>
-```
-
-
-### Return
-
-At this point the `Dataset` contains `(features_dict, labels)` pairs.
-This is the format expected by the `train` and `evaluate` methods, so the
-`input_fn` returns the dataset.
-
-The `labels` should be omitted when using the `predict` method.
-
-
-
-
-## Reading a CSV File
-
-The most common real-world use case for the `Dataset` class is to stream data
-from files on disk. The @{tf.data} module includes a variety of
-file readers. Let's see how parsing the Iris dataset from the csv file looks
-using a `Dataset`.
-
-The following call to the `iris_data.maybe_download` function downloads the
-data if necessary, and returns the pathnames of the resulting files:
-
-``` python
-import iris_data
-train_path, test_path = iris_data.maybe_download()
-```
-
-The [`iris_data.csv_input_fn`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
-function contains an alternative implementation that parses the csv files using
-a `Dataset`.
-
-Let's look at how to build an Estimator-compatible input function that reads
-from the local files.
-
-### Build the `Dataset`
-
-We start by building a @{tf.data.TextLineDataset$`TextLineDataset`} object to
-read the file one line at a time. Then, we call the
-@{tf.data.Dataset.skip$`skip`} method to skip over the first line of the file, which contains a header, not an example:
-
-``` python
-ds = tf.data.TextLineDataset(train_path).skip(1)
-```
-
-### Build a csv line parser
-
-We must parse each of the lines in the dataset in order to generate the
-necessary `(features, label)` pairs, so we will start by building a function
-to parse a single line. The following `_parse_line` function accomplishes this
-using @{tf.decode_csv} and some simple python code: it parses a single line
-into its features and the label. Since Estimators require that features be
-represented as a dictionary, we rely on Python's built-in `dict` and `zip`
-functions to build that dictionary. The feature names are the keys of that
-dictionary. We then call the dictionary's `pop` method to remove the label
-field from the features dictionary:
-
-``` python
-# Metadata describing the text columns
-COLUMNS = ['SepalLength', 'SepalWidth',
-           'PetalLength', 'PetalWidth',
-           'label']
-FIELD_DEFAULTS = [[0.0], [0.0], [0.0], [0.0], [0]]
-def _parse_line(line):
-    # Decode the line into its fields
-    fields = tf.decode_csv(line, FIELD_DEFAULTS)
-
-    # Pack the result into a dictionary
-    features = dict(zip(COLUMNS,fields))
-
-    # Separate the label from the features
-    label = features.pop('label')
-
-    return features, label
-```
-
-### Parse the lines
-
-Datasets have many methods for manipulating the data while it is being piped
-to a model. The most heavily-used method is @{tf.data.Dataset.map$`map`}, which
-applies a transformation to each element of the `Dataset`.
-
-The `map` method takes a `map_func` argument that describes how each item in the
-`Dataset` should be transformed.
-
-*(Figure: the @{tf.data.Dataset.map$`map`} method applies the `map_func` to
-transform each item in the `Dataset`.)*
-
-So to parse the lines as they are streamed out of the csv file, we pass our
-`_parse_line` function to the `map` method:
-
-``` python
-ds = ds.map(_parse_line)
-print(ds)
-```
-``` None
-<MapDataset
-  shapes: ({SepalLength: (), SepalWidth: (), PetalLength: (), PetalWidth: ()}, ()),
-  types: ({SepalLength: tf.float32, SepalWidth: tf.float32,
-           PetalLength: tf.float32, PetalWidth: tf.float32}, tf.int32)>
-```
-
-Now instead of simple scalar strings, the dataset contains `(features, label)`
-pairs.
-
-The remainder of the `iris_data.csv_input_fn` function is identical
-to `iris_data.train_input_fn`, which was covered in the
-[Basic input](#basic_input) section.
-
-### Try it out
-
-This function can be used as a replacement for
-`iris_data.train_input_fn`, feeding an estimator as follows:
-
-``` python
-train_path, test_path = iris_data.maybe_download()
-
-# All the inputs are numeric
-feature_columns = [
-    tf.feature_column.numeric_column(name)
-    for name in iris_data.CSV_COLUMN_NAMES[:-1]]
-
-# Build the estimator
-est = tf.estimator.LinearClassifier(feature_columns,
-                                    n_classes=3)
-# Train the estimator
-batch_size = 100
-est.train(
-    steps=1000,
-    input_fn=lambda : iris_data.csv_input_fn(train_path, batch_size))
-```
-
-Estimators expect an `input_fn` to take no arguments. To work around this
-restriction, we use `lambda` to capture the arguments and provide the expected
-interface.
-
-## Summary
-
-The `tf.data` module provides a collection of classes and functions for easily
-reading data from a variety of sources. Furthermore, `tf.data` has simple,
-powerful methods for applying a wide variety of standard and custom
-transformations.
-
-Now you have the basic idea of how to efficiently load data into an
-Estimator. Consider the following documents next:
-
-
-* @{$custom_estimators}, which demonstrates how to build your own
-  custom `Estimator` model.
-* The @{$low_level_intro#datasets$Low Level Introduction}, which demonstrates
-  how to experiment directly with `tf.data.Datasets` using TensorFlow's low
-  level APIs.
-* @{$programmers_guide/datasets}, which goes into great detail about additional
-  functionality of `Datasets`.
-
diff --git a/tensorflow/docs_src/programmers_guide/debugger.md b/tensorflow/docs_src/programmers_guide/debugger.md
deleted file mode 100644
index 6bd941886d..0000000000
--- a/tensorflow/docs_src/programmers_guide/debugger.md
+++ /dev/null
@@ -1,804 +0,0 @@
-# TensorFlow Debugger
-
-
-
-[TOC]
-
-`tfdbg` is a specialized debugger for TensorFlow. It lets you view the internal
-structure and states of running TensorFlow graphs during training and inference,
-which is difficult to debug with general-purpose debuggers such as Python's `pdb`
-due to TensorFlow's computation-graph paradigm.
-
-This guide focuses on the command-line interface (CLI) of `tfdbg`. For a guide on
-how to use the graphical user interface (GUI) of tfdbg, i.e., the
-**TensorBoard Debugger Plugin**, please visit
-[its README](https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md).
-
-Note: The TensorFlow debugger uses a
-[curses](https://en.wikipedia.org/wiki/Curses_\(programming_library\))-based text
-user interface. On Mac OS X, the `ncurses` library is required and can be
-installed with `brew install homebrew/dupes/ncurses`. On Windows, curses isn't as
-well supported, so a [readline](https://en.wikipedia.org/wiki/GNU_Readline)-based
-interface can be used with tfdbg by installing `pyreadline` with `pip`. If you
-use Anaconda3, you can install it with a command such as
-`"C:\Program Files\Anaconda3\Scripts\pip.exe" install pyreadline`. Unofficial
-Windows curses packages can be downloaded
-[here](https://www.lfd.uci.edu/~gohlke/pythonlibs/#curses), then subsequently
-installed using `pip install .whl`; however, curses on Windows may
-not work as reliably as curses on Linux or Mac.
-
-This tutorial demonstrates how to use the **tfdbg** CLI to debug the appearance
-of [`nan`s](https://en.wikipedia.org/wiki/NaN)
-and [`inf`s](https://en.wikipedia.org/wiki/Infinity), a frequently-encountered
-type of bug in TensorFlow model development.
-The following example is for users who use the low-level
-[`Session`](https://www.tensorflow.org/api_docs/python/tf/Session) API of
-TensorFlow. A later section of this document describes how to use **tfdbg**
-with a higher-level API, namely `Estimator`s.
-To *observe* such an issue, run the following command without the debugger (the
-source code can be found
-[here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py)):
-
-```none
-python -m tensorflow.python.debug.examples.debug_mnist
-```
-
-This code trains a simple neural network for MNIST digit image recognition.
-Notice that the accuracy increases slightly after the first training step, but
-then gets stuck at a low (near-chance) level:
-
-```none
-Accuracy at step 0: 0.1113
-Accuracy at step 1: 0.3183
-Accuracy at step 2: 0.098
-Accuracy at step 3: 0.098
-Accuracy at step 4: 0.098
-```
-
-Wondering what might have gone wrong, you suspect that certain nodes in the
-training graph generated bad numeric values such as `inf`s and `nan`s, because
-this is a common cause of this type of training failure.
-Let's use tfdbg to debug this issue and pinpoint the exact graph node where this
-numeric problem first surfaced.
-
-## Wrapping TensorFlow Sessions with tfdbg
-
-To add support for tfdbg in our example, all that is needed is to add the
-following lines of code and wrap the Session object with a debugger wrapper.
-This code is already added in
-[debug_mnist.py](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py),
-so you can activate the tfdbg CLI with the `--debug` flag at the command line.
-
-```python
-# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
-# (You don't need to worry about the BUILD dependency if you are using a pip
-# install of open-source TensorFlow.)
-from tensorflow.python import debug as tf_debug
-
-sess = tf_debug.LocalCLIDebugWrapperSession(sess)
-```
-
-This wrapper has the same interface as Session, so enabling debugging requires
-no other changes to the code. The wrapper provides additional features,
-including:
-
-* Bringing up a CLI before and after `Session.run()` calls, to let you
-control the execution and inspect the graph's internal state.
-* Allowing you to register special `filters` for tensor values, to facilitate
-the diagnosis of issues.
-
-In this example, we have already registered a tensor filter called
-@{tfdbg.has_inf_or_nan},
-which simply determines if there are any `nan` or `inf` values in any
-intermediate tensors (tensors that are neither inputs nor outputs of the
-`Session.run()` call, but are in the path leading from the inputs to the
-outputs). This filter for `nan`s and `inf`s is a common enough use case that
-we ship it with the
-@{$python/tfdbg#Classes_for_debug_dump_data_and_directories$`debug_data`}
-module.
-
-Note: You can also write your own custom filters. See
-the @{tfdbg.DebugDumpDir.find$API documentation}
-of `DebugDumpDir.find()` for additional information.
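-
-For reference, here is a minimal sketch of how such a filter is attached to a
-wrapped session (assuming `sess` and the import from the example above;
-`LocalCLIDebugWrapperSession` already registers `has_inf_or_nan` for you, so an
-explicit `add_tensor_filter()` call like this is only required for your own
-filters):
-
-```python
-from tensorflow.python import debug as tf_debug
-
-sess = tf_debug.LocalCLIDebugWrapperSession(sess)
-sess.add_tensor_filter("has_inf_or_nan", tf_debug.has_inf_or_nan)
-```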
-
-## Debugging Model Training with tfdbg
-
-
-Let's try training the model again, but with the `--debug` flag added this time:
-
-```none
-python -m tensorflow.python.debug.examples.debug_mnist --debug
-```
-
-The debug wrapper session will prompt you when it is about to execute the first
-`Session.run()` call, with information regarding the fetched tensor and feed
-dictionaries displayed on the screen.
-
-![tfdbg run-start UI](https://www.tensorflow.org/images/tfdbg_screenshot_run_start.png)
-
-This is what we refer to as the *run-start CLI*. It lists the feeds and fetches
-to the current `Session.run` call, before executing anything.
-
-If the screen size is too small to display the content of the message in its
-entirety, you can resize it.
-
-Use the **PageUp** / **PageDown** / **Home** / **End** keys to navigate the
-screen output. On most keyboards lacking those keys, **Fn + Up** /
-**Fn + Down** / **Fn + Right** / **Fn + Left** will work.
-
-Enter the `run` command (or just `r`) at the command prompt:
-
-```
-tfdbg> run
-```
-
-The `run` command causes tfdbg to execute until the end of the next
-`Session.run()` call, which calculates the model's accuracy using a test data
-set. tfdbg augments the runtime Graph to dump all intermediate tensors.
-After the run ends, tfdbg displays all the dumped tensor values in the
-*run-end CLI*. For example:
-
-![tfdbg run-end UI: accuracy](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_accuracy.png)
-
-This list of tensors can also be obtained by running the command `lt` after you
-have executed `run`.
-
-### tfdbg CLI Frequently-Used Commands
-
-Try the following commands at the `tfdbg>` prompt (referencing the code at
-`tensorflow/python/debug/examples/debug_mnist.py`):
-
-| Command            | Syntax or Option | Explanation  | Example                   |
-|:-------------------|:---------------- |:------------ |:------------------------- |
-| **`lt`** | | **List dumped tensors.** | `lt` |
-| | `-n <name_pattern>` | List dumped tensors with names matching given regular-expression pattern. | `lt -n Softmax.*` |
-| | `-t <op_pattern>` | List dumped tensors with op types matching given regular-expression pattern. | `lt -t MatMul` |
-| | `-f <filter_name>` | List only the tensors that pass a registered tensor filter. | `lt -f has_inf_or_nan` |
-| | `-f <filter_name> -fenn <regex>` | List only the tensors that pass a registered tensor filter, excluding nodes with names matching the regular expression. | `lt -f has_inf_or_nan` `-fenn .*Sqrt.*` |
-| | `-s <sort_key>` | Sort the output by given `sort_key`, whose possible values are `timestamp` (default), `dump_size`, `op_type` and `tensor_name`. | `lt -s dump_size` |
-| | `-r` | Sort in reverse order. | `lt -r -s dump_size` |
-| **`pt`** | | **Print value of a dumped tensor.** | |
-| | `pt <tensor>` | Print tensor value. | `pt hidden/Relu:0` |
-| | `pt <tensor>[slicing]` | Print a subarray of tensor, using [numpy](http://www.numpy.org/)-style array slicing. | `pt hidden/Relu:0[0:50,:]` |
-| | `-a` | Print the entirety of a large tensor, without using ellipses. (May take a long time for large tensors.) | `pt -a hidden/Relu:0[0:50,:]` |
-| | `-r <range>` | Highlight elements falling into specified numerical range. Multiple ranges can be used in conjunction. | `pt hidden/Relu:0 -a -r [[-inf,-1],[1,inf]]` |
-| | `-n <number>` | Print dump corresponding to specified 0-based dump number. Required for tensors with multiple dumps. | `pt -n 0 hidden/Relu:0` |
-| | `-s` | Include a summary of the numeric values of the tensor (applicable only to non-empty tensors with Boolean and numeric types such as `int*` and `float*`.) | `pt -s hidden/Relu:0[0:50,:]` |
-| | `-w` | Write the value of the tensor (possibly sliced) to a Numpy file using [`numpy.save()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.save.html) | `pt -s hidden/Relu:0 -w /tmp/relu.npy` |
-| **`@[coordinates]`** | | Navigate to specified element in `pt` output. | `@[10,0]` or `@10,0` |
-| **`/regex`** | | [less](https://linux.die.net/man/1/less)-style search for given regular expression. | `/inf` |
-| **`/`** | | Scroll to the next line with matches to the searched regex (if any). | `/` |
-| **`pf`** | | **Print a value in the feed_dict to `Session.run`.** | |
-| | `pf <feed_tensor_name>` | Print the value of the feed. Also note that the `pf` command has the `-a`, `-r` and `-s` flags (not listed below), which have the same syntax and semantics as the identically-named flags of `pt`. | `pf input_xs:0` |
-| **eval** | | **Evaluate arbitrary Python and numpy expression.** | |
-| | `eval <expression>` | Evaluate a Python / numpy expression, with numpy available as `np` and debug tensor names enclosed in backticks. | ``eval "np.matmul((`output/Identity:0` / `Softmax:0`).T, `Softmax:0`)"`` |
-| | `-a` | Print a large-sized evaluation result in its entirety, i.e., without using ellipses. | ``eval -a 'np.sum(`Softmax:0`, axis=1)'`` |
-| | `-w` | Write the result of the evaluation to a Numpy file using [`numpy.save()`](https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.save.html) | ``eval -a 'np.sum(`Softmax:0`, axis=1)' -w /tmp/softmax_sum.npy`` |
-| **`ni`** | | **Display node information.** | |
-| | `-a` | Include node attributes in the output. | `ni -a hidden/Relu` |
-| | `-d` | List the debug dumps available from the node. | `ni -d hidden/Relu` |
-| | `-t` | Display the Python stack trace of the node's creation. | `ni -t hidden/Relu` |
-| **`li`** | | **List inputs to node** | |
-| | `-r` | List the inputs to node, recursively (the input tree.) | `li -r hidden/Relu:0` |
-| | `-d <max_depth>` | Limit recursion depth under the `-r` mode. | `li -r -d 3 hidden/Relu:0` |
-| | `-c` | Include control inputs. | `li -c -r hidden/Relu:0` |
-| | `-t` | Show op types of input nodes. | `li -t -r hidden/Relu:0` |
-| **`lo`** | | **List output recipients of node** | |
-| | `-r` | List the output recipients of node, recursively (the output tree.) | `lo -r hidden/Relu:0` |
-| | `-d <max_depth>` | Limit recursion depth under the `-r` mode. | `lo -r -d 3 hidden/Relu:0` |
-| | `-c` | Include recipients via control edges. | `lo -c -r hidden/Relu:0` |
-| | `-t` | Show op types of recipient nodes. | `lo -t -r hidden/Relu:0` |
-| **`ls`** | | **List Python source files involved in node creation.** | |
-| | `-p <path_pattern>` | Limit output to source files matching given regular-expression path pattern. | `ls -p .*debug_mnist.*` |
-| | `-n` | Limit output to node names matching given regular-expression pattern. | `ls -n Softmax.*` |
-| **`ps`** | | **Print Python source file.** | |
-| | `ps <file_path>` | Print given Python source file source.py, with the lines annotated with the nodes created at each of them (if any). | `ps /path/to/source.py` |
-| | `-t` | Perform annotation with respect to Tensors, instead of the default, nodes. | `ps -t /path/to/source.py` |
-| | `-b <line_number>` | Annotate source.py beginning at given line. | `ps -b 30 /path/to/source.py` |
-| | `-m <max_elements>` | Limit the number of elements in the annotation for each line. | `ps -m 100 /path/to/source.py` |
-| **`run`** | | **Proceed to the next Session.run()** | `run` |
-| | `-n` | Execute through the next `Session.run` without debugging, and drop to CLI right before the run after that. | `run -n` |
-| | `-t <T>` | Execute `Session.run` `T - 1` times without debugging, followed by a run with debugging. Then drop to CLI right after the debugged run. | `run -t 10` |
-| | `-f <filter_name>` | Continue executing `Session.run` until any intermediate tensor triggers the specified Tensor filter (causes the filter to return `True`). | `run -f has_inf_or_nan` |
-| | `-f <filter_name> -fenn <regex>` | Continue executing `Session.run` until any intermediate tensor whose node name doesn't match the regular expression triggers the specified Tensor filter (causes the filter to return `True`). | `run -f has_inf_or_nan -fenn .*Sqrt.*` |
-| | `--node_name_filter <pattern>` | Execute the next `Session.run`, watching only nodes with names matching the given regular-expression pattern. | `run --node_name_filter Softmax.*` |
-| | `--op_type_filter <pattern>` | Execute the next `Session.run`, watching only nodes with op types matching the given regular-expression pattern. | `run --op_type_filter Variable.*` |
-| | `--tensor_dtype_filter <pattern>` | Execute the next `Session.run`, dumping only Tensors with data types (`dtype`s) matching the given regular-expression pattern. | `run --tensor_dtype_filter int.*` |
-| | `-p` | Execute the next `Session.run` call in profiling mode. | `run -p` |
-| **`ri`** | | **Display information about the current run, including fetches and feeds.** | `ri` |
-| **`config`** | | **Set or show persistent TFDBG UI configuration.** | |
-| | `set` | Set the value of a config item: {`graph_recursion_depth`, `mouse_mode`}. | `config set graph_recursion_depth 3` |
-| | `show` | Show current persistent UI configuration. | `config show` |
-| **`help`** | | **Print general help information** | `help` |
-| | `help <command>` | Print help for given command. | `help lt` |
-
-Note that each time you enter a command, a new screen output
-will appear. This is somewhat analogous to web pages in a browser. You can
-navigate between these screens by clicking the `<--` and
-`-->` text arrows near the top-left corner of the CLI.
-
-### Other Features of the tfdbg CLI
-
-In addition to the commands listed above, the tfdbg CLI provides the following
-features:
-
-* To navigate through previous tfdbg commands, type in a few characters
-  followed by the Up or Down arrow keys. tfdbg will show you the history of
-  commands that started with those characters.
-* To navigate through the history of screen outputs, do either of the
-  following:
-  * Use the `prev` and `next` commands.
-  * Click underlined `<--` and `-->` links near the top left corner of the
-    screen.
-* Tab completion of commands and some command arguments.
-* To redirect the screen output to a file instead of the screen, end the
-  command with bash-style redirection. For example, the following command
-  redirects the output of the `pt` command to the `/tmp/xent_value_slices.txt`
-  file:
-
-  ```none
-  tfdbg> pt cross_entropy/Log:0[:, 0:10] > /tmp/xent_value_slices.txt
-  ```
-
-### Finding `nan`s and `inf`s
-
-In this first `Session.run()` call, there happen to be no problematic numerical
-values. You can move on to the next run by using the command `run` or its
-shorthand `r`.
-
-> TIP: If you enter `run` or `r` repeatedly, you will be able to move through
-> the `Session.run()` calls in a sequential manner.
-> -> You can also use the `-t` flag to move ahead a number of `Session.run()` calls -> at a time, for example: -> -> ``` -> tfdbg> run -t 10 -> ``` - -Instead of entering `run` repeatedly and manually searching for `nan`s and -`inf`s in the run-end UI after every `Session.run()` call (for example, by using -the `pt` command shown in the table above) , you can use the following -command to let the debugger repeatedly execute `Session.run()` calls without -stopping at the run-start or run-end prompt, until the first `nan` or `inf` -value shows up in the graph. This is analogous to *conditional breakpoints* in -some procedural-language debuggers: - -```none -tfdbg> run -f has_inf_or_nan -``` - -> NOTE: The preceding command works properly because a tensor filter called -> `has_inf_or_nan` has been registered for you when the wrapped session is -> created. This filter detects `nan`s and `inf`s (as explained previously). -> If you have registered any other filters, you can -> use "run -f" to have tfdbg run until any tensor triggers that filter (cause -> the filter to return True). -> -> ``` python -> def my_filter_callable(datum, tensor): -> # A filter that detects zero-valued scalars. -> return len(tensor.shape) == 0 and tensor == 0.0 -> -> sess.add_tensor_filter('my_filter', my_filter_callable) -> ``` -> -> Then at the tfdbg run-start prompt run until your filter is triggered: -> -> ``` -> tfdbg> run -f my_filter -> ``` - -See [this API document](https://www.tensorflow.org/api_docs/python/tfdbg/DebugDumpDir#find) -for more information on the expected signature and return value of the predicate -`Callable` used with `add_tensor_filter()`. - -![tfdbg run-end UI: infs and nans](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_inf_nan.png) - -As the screen display indicates on the first line, the `has_inf_or_nan` filter is first triggered -during the fourth `Session.run()` call: an -[Adam optimizer](https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer) -forward-backward training pass on the graph. In this run, 36 (out of the total -95) intermediate tensors contain `nan` or `inf` values. These tensors are listed -in chronological order, with their timestamps displayed on the left. At the top -of the list, you can see the first tensor in which the bad numerical values -first surfaced: `cross_entropy/Log:0`. - -To view the value of the tensor, click the underlined tensor name -`cross_entropy/Log:0` or enter the equivalent command: - -```none -tfdbg> pt cross_entropy/Log:0 -``` - -Scroll down a little and you will notice some scattered `inf` values. If the -instances of `inf` and `nan` are difficult to spot by eye, you can use the -following command to perform a regex search and highlight the output: - -```none -tfdbg> /inf -``` - -Or, alternatively: - -```none -tfdbg> /(inf|nan) -``` - -You can also use the `-s` or `--numeric_summary` command to get a quick summary -of the types of numeric values in the tensor: - -``` none -tfdbg> pt -s cross_entropy/Log:0 -``` - -From the summary, you can see that several of the 1000 elements of the -`cross_entropy/Log:0` tensor are `-inf`s (negative infinities). - -Why did these infinities appear? 
To further debug, display more information
-about the node `cross_entropy/Log` by clicking the underlined `node_info` menu
-item on the top or entering the equivalent `node_info` (`ni`) command:
-
-```none
-tfdbg> ni cross_entropy/Log
-```
-
-![tfdbg run-end UI: infs and nans](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_node_info.png)
-
-You can see that this node has the op type `Log`
-and that its input is the node `Softmax`. Run the following command to
-take a closer look at the input tensor:
-
-```none
-tfdbg> pt Softmax:0
-```
-
-Examine the values in the input tensor, searching for zeros:
-
-```none
-tfdbg> /0\.000
-```
-
-Indeed, there are zeros. Now it is clear that the origin of the bad numerical
-values is the node `cross_entropy/Log` taking logs of zeros. To find out the
-culprit line in the Python source code, use the `-t` flag of the `ni` command
-to show the traceback of the node's construction:
-
-```none
-tfdbg> ni -t cross_entropy/Log
-```
-
-If you click "node_info" at the top of the screen, tfdbg automatically shows the
-traceback of the node's construction.
-
-From the traceback, you can see that the op is constructed at the following
-line of
-[`debug_mnist.py`](https://www.tensorflow.org/code/tensorflow/python/debug/examples/debug_mnist.py):
-
-```python
-diff = -(y_ * tf.log(y))
-```
-
-**tfdbg** has a feature that makes it easy to trace Tensors and ops back to
-lines in Python source files. It can annotate lines of a Python file with
-the ops or Tensors created by them. To use this feature,
-simply click the underlined line numbers in the stack trace output of the
-`ni -t <node_name>` command, or use the `ps` (or `print_source`) command, such as:
-`ps /path/to/source.py`. For example, the following screenshot shows the output
-of a `ps` command.
-
-![tfdbg run-end UI: annotated Python source file](https://www.tensorflow.org/images/tfdbg_screenshot_run_end_annotated_source.png)
-
-### Fixing the problem
-
-To fix the problem, edit `debug_mnist.py`, changing the original line:
-
-```python
-diff = -(y_ * tf.log(y))
-```
-
-to the built-in, numerically stable implementation of softmax cross-entropy:
-
-```python
-diff = tf.losses.softmax_cross_entropy(labels=y_, logits=logits)
-```
-
-Rerun with the `--debug` flag as follows:
-
-```none
-python -m tensorflow.python.debug.examples.debug_mnist --debug
-```
-
-At the `tfdbg>` prompt, enter the following command:
-
-```none
-run -f has_inf_or_nan
-```
-
-Confirm that no tensors are flagged as containing `nan` or `inf` values, and
-accuracy now continues to rise rather than getting stuck. Success!
-
-## Debugging TensorFlow Estimators
-
-This section explains how to debug TensorFlow programs that use the `Estimator`
-APIs. Part of the convenience provided by these APIs is that
-they manage `Session`s internally. This makes the `LocalCLIDebugWrapperSession`
-described in the preceding sections inapplicable. Fortunately, you can still
-debug them by using special `hook`s provided by `tfdbg`.
-
-`tfdbg` can debug the
-@{tf.estimator.Estimator.train$`train()`},
-@{tf.estimator.Estimator.evaluate$`evaluate()`} and
-@{tf.estimator.Estimator.predict$`predict()`}
-methods of tf-learn `Estimator`s. To debug `Estimator.train()`,
-create a `LocalCLIDebugHook` and supply it in the `hooks` argument. For example:
-
-```python
-# First, let your BUILD target depend on "//tensorflow/python/debug:debug_py"
-# (You don't need to worry about the BUILD dependency if you are using a pip
-# install of open-source TensorFlow.) 
-from tensorflow.python import debug as tf_debug - -# Create a LocalCLIDebugHook and use it as a monitor when calling fit(). -hooks = [tf_debug.LocalCLIDebugHook()] - -# To debug `train`: -classifier.train(input_fn, - steps=1000, - hooks=hooks) -``` - -Similarly, to debug `Estimator.evaluate()` and `Estimator.predict()`, assign -hooks to the `hooks` parameter, as in the following example: - -```python -# To debug `evaluate`: -accuracy_score = classifier.evaluate(eval_input_fn, - hooks=hooks)["accuracy"] - -# To debug `predict`: -predict_results = classifier.predict(predict_input_fn, hooks=hooks) -``` - -[debug_tflearn_iris.py](https://www.tensorflow.org/code/tensorflow/python/debug/examples/debug_tflearn_iris.py), -based on [tf-learn's iris tutorial](https://www.tensorflow.org/versions/r1.8/get_started/tflearn), -contains a full example of how to use the tfdbg with `Estimator`s. -To run this example, do: - -```none -python -m tensorflow.python.debug.examples.debug_tflearn_iris --debug -``` - -The `LocalCLIDebugHook` also allows you to configure a `watch_fn` that can be -used to flexibly specify what `Tensor`s to watch on different `Session.run()` -calls, as a function of the `fetches` and `feed_dict` and other states. See -@{tfdbg.DumpingDebugWrapperSession.__init__$this API doc} -for more details. - -## Debugging Keras Models with TFDBG - -To use TFDBG with [Keras](https://keras.io/), let the Keras backend use -a TFDBG-wrapped Session object. For example, to use the CLI wrapper: - -``` python -import tensorflow as tf -from keras import backend as keras_backend -from tensorflow.python import debug as tf_debug - -keras_backend.set_session(tf_debug.LocalCLIDebugWrapperSession(tf.Session())) - -# Define your keras model, called "model". -model.fit(...) # This will break into the TFDBG CLI. -``` - -## Debugging tf-slim with TFDBG - -TFDBG supports debugging of training and evaluation with -[tf-slim](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim). -As detailed below, training and evaluation require slightly different debugging -workflows. - -### Debugging training in tf-slim -To debug the training process, provide `LocalCLIDebugWrapperSession` to the -`session_wrapper` argument of `slim.learning.train()`. For example: - -``` python -import tensorflow as tf -from tensorflow.python import debug as tf_debug - -# ... Code that creates the graph and the train_op ... -tf.contrib.slim.learning.train( - train_op, - logdir, - number_of_steps=10, - session_wrapper=tf_debug.LocalCLIDebugWrapperSession) -``` - -### Debugging evaluation in tf-slim -To debug the evaluation process, provide `LocalCLIDebugHook` to the -`hooks` argument of `slim.evaluation.evaluate_once()`. For example: - -``` python -import tensorflow as tf -from tensorflow.python import debug as tf_debug - -# ... Code that creates the graph and the eval and final ops ... -tf.contrib.slim.evaluation.evaluate_once( - '', - checkpoint_path, - logdir, - eval_op=my_eval_op, - final_op=my_value_op, - hooks=[tf_debug.LocalCLIDebugHook()]) -``` - -## Offline Debugging of Remotely-Running Sessions - -Often, your model is running on a remote machine or a process that you don't -have terminal access to. To perform model debugging in such cases, you can use -the `offline_analyzer` binary of `tfdbg` (described below). It operates on -dumped data directories. This can be done to both the lower-level `Session` API -and the higher-level `Estimator` API. 
-
-### Debugging Remote tf.Sessions
-
-If you interact directly with the `tf.Session` API in Python, you can
-configure the `RunOptions` proto that you call your `Session.run()` method
-with, by using the method @{tfdbg.watch_graph}.
-This will cause the intermediate tensors and runtime graphs to be dumped to a
-shared storage location of your choice when the `Session.run()` call occurs
-(at the cost of slower performance). For example:
-
-```python
-from tensorflow.python import debug as tf_debug
-
-# ... Code where your session and graph are set up...
-
-run_options = tf.RunOptions()
-tf_debug.watch_graph(
-    run_options,
-    session.graph,
-    debug_urls=["file:///shared/storage/location/tfdbg_dumps_1"])
-# Be sure to specify different directories for different run() calls.
-
-session.run(fetches, feed_dict=feeds, options=run_options)
-```
-
-Later, in an environment that you have terminal access to (for example, a local
-computer that can access the shared storage location specified in the code
-above), you can load and inspect the data in the dump directory on the shared
-storage by using the `offline_analyzer` binary of `tfdbg`. For example:
-
-```none
-python -m tensorflow.python.debug.cli.offline_analyzer \
-    --dump_dir=/shared/storage/location/tfdbg_dumps_1
-```
-
-The `Session` wrapper `DumpingDebugWrapperSession` offers an easier and more
-flexible way to generate file-system dumps that can be analyzed offline.
-To use it, simply wrap your session in a `tf_debug.DumpingDebugWrapperSession`.
-For example:
-
-```python
-# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
-# (You don't need to worry about the BUILD dependency if you are using a pip
-# install of open-source TensorFlow.)
-from tensorflow.python import debug as tf_debug
-
-sess = tf_debug.DumpingDebugWrapperSession(
-    sess, "/shared/storage/location/tfdbg_dumps_1/", watch_fn=my_watch_fn)
-```
-
-The `watch_fn` argument accepts a `Callable` that allows you to configure what
-`Tensor`s to watch on different `Session.run()` calls, as a function of the
-`fetches` and `feed_dict` to the `run()` call and other states.
-
-### C++ and other languages
-
-If your model code is written in C++ or other languages, you can also
-modify the `debug_options` field of `RunOptions` to generate debug dumps that
-can be inspected offline. See
-[the proto definition](https://www.tensorflow.org/code/tensorflow/core/protobuf/debug.proto)
-for more details.
-
-### Debugging Remotely-Running Estimators
-
-If your remote TensorFlow server runs `Estimator`s,
-you can use the non-interactive `DumpingDebugHook`. For example:
-
-```python
-# Let your BUILD target depend on "//tensorflow/python/debug:debug_py"
-# (You don't need to worry about the BUILD dependency if you are using a pip
-# install of open-source TensorFlow.)
-from tensorflow.python import debug as tf_debug
-
-hooks = [tf_debug.DumpingDebugHook("/shared/storage/location/tfdbg_dumps_1")]
-```
-
-Then this `hook` can be used in the same way as the `LocalCLIDebugHook` examples
-described earlier in this document.
-As the training, evaluation or prediction happens with `Estimator`,
-tfdbg creates directories having the following name pattern:
-`/shared/storage/location/tfdbg_dumps_1/run_<epoch_timestamp_microsec>_<uuid>`.
-Each directory corresponds to a `Session.run()` call that underlies
-the `train()`, `evaluate()` or `predict()` call. You can load these directories and inspect
-them in a command-line interface in an offline manner using the
-`offline_analyzer` offered by tfdbg. 
For example:
-
-```bash
-python -m tensorflow.python.debug.cli.offline_analyzer \
-    --dump_dir="/shared/storage/location/tfdbg_dumps_1/run_<epoch_timestamp_microsec>_<uuid>"
-```
-
-## Frequently Asked Questions
-
-**Q**: _Do the timestamps on the left side of the `lt` output reflect actual
-performance in a non-debugging session?_
-
-**A**: No. The debugger inserts additional special-purpose debug nodes to the
-  graph to record the values of intermediate tensors. These nodes
-  slow down the graph execution. If you are interested in profiling your
-  model, check out
-
-  1. The profiling mode of tfdbg: `tfdbg> run -p`.
-  2. [tfprof](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/core/profiler)
-     and other profiling tools for TensorFlow.
-
-**Q**: _How do I link tfdbg against my `Session` in Bazel? Why do I see an
-error such as "ImportError: cannot import name debug"?_
-
-**A**: In your BUILD rule, declare dependencies:
-  `"//tensorflow:tensorflow_py"` and `"//tensorflow/python/debug:debug_py"`.
-  The first is the dependency that you include to use TensorFlow even
-  without debugger support; the second enables the debugger.
-  Then, in your Python file, add:
-
-```python
-from tensorflow.python import debug as tf_debug
-
-# Then wrap your TensorFlow Session with the local-CLI wrapper.
-sess = tf_debug.LocalCLIDebugWrapperSession(sess)
-```
-
-**Q**: _Does tfdbg help debug runtime errors such as shape mismatches?_
-
-**A**: Yes. tfdbg intercepts errors generated by ops during runtime and presents
-  the errors with some debug instructions to the user in the CLI.
-  See examples:
-
-```none
-# Debugging shape mismatch during matrix multiplication.
-python -m tensorflow.python.debug.examples.debug_errors \
-    --error shape_mismatch --debug
-
-# Debugging uninitialized variable.
-python -m tensorflow.python.debug.examples.debug_errors \
-    --error uninitialized_variable --debug
-```
-
-**Q**: _How can I let my tfdbg-wrapped Sessions or Hooks run the debug mode
-only from the main thread?_
-
-**A**:
-This is a common use case, in which the `Session` object is used from multiple
-threads concurrently. Typically, the child threads take care of background tasks
-such as running enqueue operations. Often, you want to debug only the main
-thread (or less frequently, only one of the child threads). You can use the
-`thread_name_filter` keyword argument of `LocalCLIDebugWrapperSession` to
-achieve this type of thread-selective debugging. For example, to debug from the
-main thread only, construct a wrapped `Session` as follows:
-
-```python
-sess = tf_debug.LocalCLIDebugWrapperSession(sess, thread_name_filter="MainThread$")
-```
-
-The above example relies on the fact that main threads in Python have the
-default name `MainThread`.
-
-**Q**: _The model I am debugging is very large. The data dumped by tfdbg
-fills up the free space of my disk. What can I do?_
-
-**A**:
-You might encounter this problem in any of the following situations:
-
-* models with many intermediate tensors
-* very large intermediate tensors
-* many @{tf.while_loop} iterations
-
-There are three possible workarounds or solutions:
-
-* The constructors of `LocalCLIDebugWrapperSession` and `LocalCLIDebugHook`
-  provide a keyword argument, `dump_root`, to specify the path
-  to which tfdbg dumps the debug data. You can use it to let tfdbg dump the
-  debug data on a disk with larger free space. 
For example:
-
-```python
-# For LocalCLIDebugWrapperSession
-sess = tf_debug.LocalCLIDebugWrapperSession(sess, dump_root="/with/lots/of/space")
-
-# For LocalCLIDebugHook
-hooks = [tf_debug.LocalCLIDebugHook(dump_root="/with/lots/of/space")]
-```
-  Make sure that the directory pointed to by `dump_root` is empty or nonexistent.
-  `tfdbg` cleans up the dump directories before exiting.
-
-* Reduce the batch size used during the runs.
-* Use the filtering options of tfdbg's `run` command to watch only specific
-  nodes in the graph. For example:
-
-  ```
-  tfdbg> run --node_name_filter .*hidden.*
-  tfdbg> run --op_type_filter Variable.*
-  tfdbg> run --tensor_dtype_filter int.*
-  ```
-
-  The first command above watches only nodes whose names match the
-  regular-expression pattern `.*hidden.*`. The second command watches only
-  nodes whose op types match the pattern `Variable.*`. The third one watches
-  only the tensors whose dtypes match the pattern `int.*` (e.g., `int32`).
-
-
-**Q**: _Why can't I select text in the tfdbg CLI?_
-
-**A**: This is because the tfdbg CLI enables mouse events in the terminal by
-  default. This [mouse-mask](https://linux.die.net/man/3/mousemask) mode
-  overrides default terminal interactions, including text selection. You
-  can re-enable text selection by using the command `mouse off` or
-  `m off`.
-
-**Q**: _Why does the tfdbg CLI show no dumped tensors when I debug code like the following?_
-
-``` python
-a = tf.ones([10], name="a")
-b = tf.add(a, a, name="b")
-sess = tf.Session()
-sess = tf_debug.LocalCLIDebugWrapperSession(sess)
-sess.run(b)
-```
-
-**A**: You see no data dumped because every node in the
-  executed TensorFlow graph is constant-folded by the TensorFlow runtime.
-  In this example, `a` is a constant tensor; therefore, the fetched
-  tensor `b` is effectively also a constant tensor. TensorFlow's graph
-  optimization folds the graph that contains `a` and `b` into a single
-  node to speed up future runs of the graph, which is why `tfdbg` does
-  not generate any intermediate tensor dumps. However, if `a` were a
-  @{tf.Variable}, as in the following example:
-
-``` python
-import numpy as np
-
-a = tf.Variable(np.ones(10), name="a")
-b = tf.add(a, a, name="b")
-sess = tf.Session()
-sess.run(tf.global_variables_initializer())
-sess = tf_debug.LocalCLIDebugWrapperSession(sess)
-sess.run(b)
-```
-
-the constant-folding would not occur and `tfdbg` should show the intermediate
-tensor dumps.
-
-
-**Q**: _I am debugging a model that generates unwanted infinities or NaNs. But
-there are some nodes in my model that are known to generate infinities
-or NaNs in their output tensors even under completely normal conditions.
-How can I skip those nodes during my `run -f has_inf_or_nan` actions?_
-
-**A**: Use the `--filter_exclude_node_names` (`-fenn` for short) flag. For
-  example, if you know you have a node with a name matching the regular
-  expression `.*Sqrt.*` that generates infinities or NaNs regardless
-  of whether the model is behaving correctly, you can exclude the nodes
-  from the infinity/NaN-finding runs with the command
-  `run -f has_inf_or_nan -fenn .*Sqrt.*`.
-
-
-**Q**: _Is there a GUI for tfdbg?_
-
-**A**: Yes, the **TensorBoard Debugger Plugin** is the GUI of tfdbg.
-  It offers features such as inspection of the computation graph,
-  real-time visualization of tensor values, continuation to tensor
-  and conditional breakpoints, and tying tensors to their
-  graph-construction source code, all in the browser environment. 
- To get started, please visit - [its README](https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md). diff --git a/tensorflow/docs_src/programmers_guide/eager.md b/tensorflow/docs_src/programmers_guide/eager.md deleted file mode 100644 index 00d02b4455..0000000000 --- a/tensorflow/docs_src/programmers_guide/eager.md +++ /dev/null @@ -1,849 +0,0 @@ -# Eager Execution - -TensorFlow's eager execution is an imperative programming environment that -evaluates operations immediately, without building graphs: operations return -concrete values instead of constructing a computational graph to run later. This -makes it easy to get started with TensorFlow and debug models, and it -reduces boilerplate as well. To follow along with this guide, run the code -samples below in an interactive `python` interpreter. - -Eager execution is a flexible machine learning platform for research and -experimentation, providing: - -* *An intuitive interface*—Structure your code naturally and use Python data - structures. Quickly iterate on small models and small data. -* *Easier debugging*—Call ops directly to inspect running models and test - changes. Use standard Python debugging tools for immediate error reporting. -* *Natural control flow*—Use Python control flow instead of graph control - flow, simplifying the specification of dynamic models. - -Eager execution supports most TensorFlow operations and GPU acceleration. For a -collection of examples running in eager execution, see: -[tensorflow/contrib/eager/python/examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples). - -Note: Some models may experience increased overhead with eager execution -enabled. Performance improvements are ongoing, but please -[file a bug](https://github.com/tensorflow/tensorflow/issues) if you find a -problem and share your benchmarks. - -## Setup and basic usage - -Upgrade to the latest version of TensorFlow: - -``` -$ pip install --upgrade tensorflow -``` - -To start eager execution, add `tf.enable_eager_execution()` to the beginning of -the program or console session. Do not add this operation to other modules that -the program calls. - -```py -from __future__ import absolute_import, division, print_function - -import tensorflow as tf - -tf.enable_eager_execution() -``` - -Now you can run TensorFlow operations and the results will return immediately: - -```py -tf.executing_eagerly() # => True - -x = [[2.]] -m = tf.matmul(x, x) -print("hello, {}".format(m)) # => "hello, [[4.]]" -``` - -Enabling eager execution changes how TensorFlow operations behave—now they -immediately evaluate and return their values to Python. `tf.Tensor` objects -reference concrete values instead of symbolic handles to nodes in a computational -graph. Since there isn't a computational graph to build and run later in a -session, it's easy to inspect results using `print()` or a debugger. Evaluating, -printing, and checking tensor values does not break the flow for computing -gradients. - -Eager execution works nicely with [NumPy](http://www.numpy.org/). NumPy -operations accept `tf.Tensor` arguments. TensorFlow -[math operations](https://www.tensorflow.org/api_guides/python/math_ops) convert -Python objects and NumPy arrays to `tf.Tensor` objects. The -`tf.Tensor.numpy` method returns the object's value as a NumPy `ndarray`. 
-
-```py
-a = tf.constant([[1, 2],
-                 [3, 4]])
-print(a)
-# => tf.Tensor([[1 2]
-#               [3 4]], shape=(2, 2), dtype=int32)
-
-# Broadcasting support
-b = tf.add(a, 1)
-print(b)
-# => tf.Tensor([[2 3]
-#               [4 5]], shape=(2, 2), dtype=int32)
-
-# Operator overloading is supported
-print(a * b)
-# => tf.Tensor([[ 2  6]
-#               [12 20]], shape=(2, 2), dtype=int32)
-
-# Use NumPy values
-import numpy as np
-
-c = np.multiply(a, b)
-print(c)
-# => [[ 2  6]
-#     [12 20]]
-
-# Obtain numpy value from a tensor:
-print(a.numpy())
-# => [[1 2]
-#     [3 4]]
-```
-
-The `tf.contrib.eager` module contains symbols available to both eager and graph execution
-environments and is useful for writing code to [work with graphs](#work_with_graphs):
-
-```py
-tfe = tf.contrib.eager
-```
-
-## Dynamic control flow
-
-A major benefit of eager execution is that all the functionality of the host
-language is available while your model is executing. So, for example,
-it is easy to write [fizzbuzz](https://en.wikipedia.org/wiki/Fizz_buzz):
-
-```py
-def fizzbuzz(max_num):
-  counter = tf.constant(0)
-  max_num = tf.convert_to_tensor(max_num)
-  for num in range(max_num.numpy()):
-    num = tf.constant(num)
-    if int(num % 3) == 0 and int(num % 5) == 0:
-      print('FizzBuzz')
-    elif int(num % 3) == 0:
-      print('Fizz')
-    elif int(num % 5) == 0:
-      print('Buzz')
-    else:
-      print(num)
-    counter += 1
-  return counter
-```
-
-This has conditionals that depend on tensor values and it prints these values
-at runtime.
-
-## Build a model
-
-Many machine learning models are represented by composing layers. When
-using TensorFlow with eager execution you can either write your own layers or
-use a layer provided in the `tf.keras.layers` package.
-
-While you can use any Python object to represent a layer,
-TensorFlow has `tf.keras.layers.Layer` as a convenient base class. Inherit from
-it to implement your own layer:
-
-```py
-class MySimpleLayer(tf.keras.layers.Layer):
-  def __init__(self, output_units):
-    # Initialize the parent Layer class.
-    super(MySimpleLayer, self).__init__()
-    self.output_units = output_units
-
-  def build(self, input):
-    # The build method gets called the first time your layer is used.
-    # Creating variables on build() allows you to make their shape depend
-    # on the input shape and hence remove the need for the user to specify
-    # full shapes. It is possible to create variables during __init__() if
-    # you already know their full shapes.
-    self.kernel = self.add_variable(
-        "kernel", [input.shape[-1], self.output_units])
-
-  def call(self, input):
-    # Override call() instead of __call__ so we can perform some bookkeeping.
-    return tf.matmul(input, self.kernel)
-```
-
-Use the `tf.keras.layers.Dense` layer instead of `MySimpleLayer` above, since it
-provides a superset of its functionality (it can also add a bias).
-
-When composing layers into models you can use `tf.keras.Sequential` to represent
-models which are a linear stack of layers. It is easy to use for basic models:
-
-```py
-model = tf.keras.Sequential([
-  tf.keras.layers.Dense(10, input_shape=(784,)),  # must declare input shape
-  tf.keras.layers.Dense(10)
-])
-```
-
-Alternatively, organize models in classes by inheriting from `tf.keras.Model`.
-This is a container for layers that is a layer itself, allowing `tf.keras.Model`
-objects to contain other `tf.keras.Model` objects. 
- -```py -class MNISTModel(tf.keras.Model): - def __init__(self): - super(MNISTModel, self).__init__() - self.dense1 = tf.keras.layers.Dense(units=10) - self.dense2 = tf.keras.layers.Dense(units=10) - - def call(self, input): - """Run the model.""" - result = self.dense1(input) - result = self.dense2(result) - result = self.dense2(result) # reuse variables from dense2 layer - return result - -model = MNISTModel() -``` - -It's not required to set an input shape for the `tf.keras.Model` class since -the parameters are set the first time input is passed to the layer. - -`tf.keras.layers` classes create and contain their own model variables that -are tied to the lifetime of their layer objects. To share layer variables, share -their objects. - - -## Eager training - -### Computing gradients - -[Automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation) -is useful for implementing machine learning algorithms such as -[backpropagation](https://en.wikipedia.org/wiki/Backpropagation) for training -neural networks. During eager execution, use `tf.GradientTape` to trace -operations for computing gradients later. - -`tf.GradientTape` is an opt-in feature to provide maximal performance when -not tracing. Since different operations can occur during each call, all -forward-pass operations get recorded to a "tape". To compute the gradient, play -the tape backwards and then discard. A particular `tf.GradientTape` can only -compute one gradient; subsequent calls throw a runtime error. - -```py -w = tfe.Variable([[1.0]]) -with tf.GradientTape() as tape: - loss = w * w - -grad = tape.gradient(loss, w) -print(grad) # => tf.Tensor([[ 2.]], shape=(1, 1), dtype=float32) -``` - -Here's an example of `tf.GradientTape` that records forward-pass operations -to train a simple model: - -```py -# A toy dataset of points around 3 * x + 2 -NUM_EXAMPLES = 1000 -training_inputs = tf.random_normal([NUM_EXAMPLES]) -noise = tf.random_normal([NUM_EXAMPLES]) -training_outputs = training_inputs * 3 + 2 + noise - -def prediction(input, weight, bias): - return input * weight + bias - -# A loss function using mean-squared error -def loss(weights, biases): - error = prediction(training_inputs, weights, biases) - training_outputs - return tf.reduce_mean(tf.square(error)) - -# Return the derivative of loss with respect to weight and bias -def grad(weights, biases): - with tf.GradientTape() as tape: - loss_value = loss(weights, biases) - return tape.gradient(loss_value, [weights, biases]) - -train_steps = 200 -learning_rate = 0.01 -# Start with arbitrary values for W and B on the same batch of data -W = tfe.Variable(5.) -B = tfe.Variable(10.) - -print("Initial loss: {:.3f}".format(loss(W, B))) - -for i in range(train_steps): - dW, dB = grad(W, B) - W.assign_sub(dW * learning_rate) - B.assign_sub(dB * learning_rate) - if i % 20 == 0: - print("Loss at step {:03d}: {:.3f}".format(i, loss(W, B))) - -print("Final loss: {:.3f}".format(loss(W, B))) -print("W = {}, B = {}".format(W.numpy(), B.numpy())) -``` - -Output (exact numbers may vary): - -``` -Initial loss: 71.204 -Loss at step 000: 68.333 -Loss at step 020: 30.222 -Loss at step 040: 13.691 -Loss at step 060: 6.508 -Loss at step 080: 3.382 -Loss at step 100: 2.018 -Loss at step 120: 1.422 -Loss at step 140: 1.161 -Loss at step 160: 1.046 -Loss at step 180: 0.996 -Final loss: 0.974 -W = 3.01582956314, B = 2.1191945076 -``` - -Replay the `tf.GradientTape` to compute the gradients and apply them in a -training loop. 
This is demonstrated in an excerpt from the -[mnist_eager.py](https://github.com/tensorflow/models/blob/master/official/mnist/mnist_eager.py) -example: - -```py -dataset = tf.data.Dataset.from_tensor_slices((data.train.images, - data.train.labels)) -... -for (batch, (images, labels)) in enumerate(dataset): - ... - with tf.GradientTape() as tape: - logits = model(images, training=True) - loss_value = loss(logits, labels) - ... - grads = tape.gradient(loss_value, model.variables) - optimizer.apply_gradients(zip(grads, model.variables), - global_step=tf.train.get_or_create_global_step()) -``` - - -The following example creates a multi-layer model that classifies the standard -[MNIST handwritten digits](https://www.tensorflow.org/tutorials/layers). It -demonstrates the optimizer and layer APIs to build trainable graphs in an eager -execution environment. - -### Train a model - -Even without training, call the model and inspect the output in eager execution: - -```py -# Create a tensor representing a blank image -batch = tf.zeros([1, 1, 784]) -print(batch.shape) # => (1, 1, 784) - -result = model(batch) -# => tf.Tensor([[[ 0. 0., ..., 0.]]], shape=(1, 1, 10), dtype=float32) -``` - -This example uses the -[dataset.py module](https://github.com/tensorflow/models/blob/master/official/mnist/dataset.py) -from the -[TensorFlow MNIST example](https://github.com/tensorflow/models/tree/master/official/mnist); -download this file to your local directory. Run the following to download the -MNIST data files to your working directory and prepare a `tf.data.Dataset` -for training: - -```py -import dataset # download dataset.py file -dataset_train = dataset.train('./datasets').shuffle(60000).repeat(4).batch(32) -``` - -To train a model, define a loss function to optimize and then calculate -gradients. Use an optimizer to update the variables: - -```py -def loss(model, x, y): - prediction = model(x) - return tf.losses.sparse_softmax_cross_entropy(labels=y, logits=prediction) - -def grad(model, inputs, targets): - with tf.GradientTape() as tape: - loss_value = loss(model, inputs, targets) - return tape.gradient(loss_value, model.variables) - -optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001) - -x, y = iter(dataset_train).next() -print("Initial loss: {:.3f}".format(loss(model, x, y))) - -# Training loop -for (i, (x, y)) in enumerate(dataset_train): - # Calculate derivatives of the input function with respect to its parameters. - grads = grad(model, x, y) - # Apply the gradient to the model - optimizer.apply_gradients(zip(grads, model.variables), - global_step=tf.train.get_or_create_global_step()) - if i % 200 == 0: - print("Loss at step {:04d}: {:.3f}".format(i, loss(model, x, y))) - -print("Final loss: {:.3f}".format(loss(model, x, y))) -``` - -Output (exact numbers may vary): - -``` -Initial loss: 2.674 -Loss at step 0000: 2.593 -Loss at step 0200: 2.143 -Loss at step 0400: 2.009 -Loss at step 0600: 2.103 -Loss at step 0800: 1.621 -Loss at step 1000: 1.695 -... -Loss at step 6600: 0.602 -Loss at step 6800: 0.557 -Loss at step 7000: 0.499 -Loss at step 7200: 0.744 -Loss at step 7400: 0.681 -Final loss: 0.670 -``` - -And for faster training, move the computation to a GPU: - -```py -with tf.device("/gpu:0"): - for (i, (x, y)) in enumerate(dataset_train): - # minimize() is equivalent to the grad() and apply_gradients() calls. 
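-    # Note: under eager execution, the loss is passed to minimize() as a
-    # zero-argument callable so that the optimizer can re-evaluate it and
-    # compute gradients on each step.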
-    optimizer.minimize(lambda: loss(model, x, y),
-                       global_step=tf.train.get_or_create_global_step())
-```
-
-### Variables and optimizers
-
-`tfe.Variable` objects store mutable `tf.Tensor` values accessed during
-training to make automatic differentiation easier. The parameters of a model can
-be encapsulated in classes as variables.
-
-Model parameters are better encapsulated by using `tfe.Variable` together with
-`tf.GradientTape`. For example, the automatic differentiation example above
-can be rewritten:
-
-```py
-class Model(tf.keras.Model):
-  def __init__(self):
-    super(Model, self).__init__()
-    self.W = tfe.Variable(5., name='weight')
-    self.B = tfe.Variable(10., name='bias')
-  def predict(self, inputs):
-    return inputs * self.W + self.B
-
-# A toy dataset of points around 3 * x + 2
-NUM_EXAMPLES = 2000
-training_inputs = tf.random_normal([NUM_EXAMPLES])
-noise = tf.random_normal([NUM_EXAMPLES])
-training_outputs = training_inputs * 3 + 2 + noise
-
-# The loss function to be optimized
-def loss(model, inputs, targets):
-  error = model.predict(inputs) - targets
-  return tf.reduce_mean(tf.square(error))
-
-def grad(model, inputs, targets):
-  with tf.GradientTape() as tape:
-    loss_value = loss(model, inputs, targets)
-  return tape.gradient(loss_value, [model.W, model.B])
-
-# Define:
-# 1. A model.
-# 2. Derivatives of a loss function with respect to model parameters.
-# 3. A strategy for updating the variables based on the derivatives.
-model = Model()
-optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01)
-
-print("Initial loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
-
-# Training loop
-for i in range(300):
-  grads = grad(model, training_inputs, training_outputs)
-  optimizer.apply_gradients(zip(grads, [model.W, model.B]),
-                            global_step=tf.train.get_or_create_global_step())
-  if i % 20 == 0:
-    print("Loss at step {:03d}: {:.3f}".format(i, loss(model, training_inputs, training_outputs)))
-
-print("Final loss: {:.3f}".format(loss(model, training_inputs, training_outputs)))
-print("W = {}, B = {}".format(model.W.numpy(), model.B.numpy()))
-```
-
-Output (exact numbers may vary):
-
-```
-Initial loss: 69.066
-Loss at step 000: 66.368
-Loss at step 020: 30.107
-Loss at step 040: 13.959
-Loss at step 060: 6.769
-Loss at step 080: 3.567
-Loss at step 100: 2.141
-Loss at step 120: 1.506
-Loss at step 140: 1.223
-Loss at step 160: 1.097
-Loss at step 180: 1.041
-Loss at step 200: 1.016
-Loss at step 220: 1.005
-Loss at step 240: 1.000
-Loss at step 260: 0.998
-Loss at step 280: 0.997
-Final loss: 0.996
-W = 2.99431324005, B = 2.02129220963
-```
-
-## Use objects for state during eager execution
-
-With graph execution, program state (such as the variables) is stored in global
-collections and their lifetime is managed by the `tf.Session` object. In
-contrast, during eager execution the lifetime of state objects is determined by
-the lifetime of their corresponding Python object.
-
-### Variables are objects
-
-During eager execution, variables persist until the last reference to the object
-is removed, and are then deleted.
-
-```py
-with tf.device("gpu:0"):
-  v = tfe.Variable(tf.random_normal([1000, 1000]))
-  v = None  # v no longer takes up GPU memory
-```
-
-### Object-based saving
-
-`tfe.Checkpoint` can save and restore `tfe.Variable`s to and from
-checkpoints:
-
-```py
-x = tfe.Variable(10.)
-
-checkpoint = tfe.Checkpoint(x=x)  # save as "x"
-
-x.assign(2.)  # Assign a new value to the variable and save.
-save_path = checkpoint.save('./ckpt/')
-
-x.assign(11.)  # Change the variable after saving.
-
-# Restore values from the checkpoint
-checkpoint.restore(save_path)
-
-print(x)  # => 2.0
-```
-
-To save and load models, `tfe.Checkpoint` stores the internal state of objects,
-without requiring hidden variables. To record the state of a `model`,
-an `optimizer`, and a global step, pass them to a `tfe.Checkpoint`:
-
-```py
-import os
-
-model = MyModel()
-optimizer = tf.train.AdamOptimizer(learning_rate=0.001)
-checkpoint_dir = '/path/to/model_dir'
-checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
-root = tfe.Checkpoint(optimizer=optimizer,
-                      model=model,
-                      optimizer_step=tf.train.get_or_create_global_step())
-
-root.save(file_prefix=checkpoint_prefix)
-# or
-root.restore(tf.train.latest_checkpoint(checkpoint_dir))
-```
-
-### Object-oriented metrics
-
-`tfe.metrics` are stored as objects. Update a metric by passing the new data to
-the callable, and retrieve the result using the `tfe.metrics.result` method,
-for example:
-
-```py
-m = tfe.metrics.Mean("loss")
-m(0)
-m(5)
-m.result()  # => 2.5
-m([8, 9])
-m.result()  # => 5.5
-```
-
-#### Summaries and TensorBoard
-
-@{$summaries_and_tensorboard$TensorBoard} is a visualization tool for
-understanding, debugging and optimizing the model training process. It uses
-summary events that are written while executing the program.
-
-`tf.contrib.summary` is compatible with both eager and graph execution
-environments. Summary operations, such as `tf.contrib.summary.scalar`, are
-inserted during model construction. For example, to record summaries once every
-100 global steps:
-
-```py
-writer = tf.contrib.summary.create_file_writer(logdir)
-global_step = tf.train.get_or_create_global_step()  # return global step var
-
-writer.set_as_default()
-
-for _ in range(iterations):
-  global_step.assign_add(1)
-  # Must include a record_summaries method
-  with tf.contrib.summary.record_summaries_every_n_global_steps(100):
-    # your model code goes here
-    tf.contrib.summary.scalar('loss', loss)
-    ...
-```
-
-## Advanced automatic differentiation topics
-
-### Dynamic models
-
-`tf.GradientTape` can also be used in dynamic models. This example for a
-[backtracking line search](https://wikipedia.org/wiki/Backtracking_line_search)
-algorithm looks like normal NumPy code, except that there are gradients and it
-is differentiable, despite the complex control flow:
-
-```py
-def line_search_step(fn, init_x, rate=1.0):
-  with tf.GradientTape() as tape:
-    # Variables are automatically recorded, but manually watch a tensor
-    tape.watch(init_x)
-    value = fn(init_x)
-  grad = tape.gradient(value, init_x)
-  grad_norm = tf.reduce_sum(grad * grad)
-  init_value = value
-  while value > init_value - rate * grad_norm:
-    x = init_x - rate * grad
-    value = fn(x)
-    rate /= 2.0
-  return x, value
-```
-
-### Additional functions to compute gradients
-
-`tf.GradientTape` is a powerful interface for computing gradients, but there
-is another [Autograd](https://github.com/HIPS/autograd)-style API available for
-automatic differentiation. These functions are useful if writing math code with
-only tensors and gradient functions, and without `tfe.Variables`:
-
-* `tfe.gradients_function` —Returns a function that computes the derivatives
-  of its input function parameter with respect to its arguments. The input
-  function parameter must return a scalar value. When the returned function is
-  invoked, it returns a list of `tf.Tensor` objects: one element for each
-  argument of the input function. 
Since anything of interest must be passed as a - function parameter, this becomes unwieldy if there's a dependency on many - trainable parameters. -* `tfe.value_and_gradients_function` —Similar to - `tfe.gradients_function`, but when the returned function is invoked, it - returns the value from the input function in addition to the list of - derivatives of the input function with respect to its arguments. - -In the following example, `tfe.gradients_function` takes the `square` -function as an argument and returns a function that computes the partial -derivatives of `square` with respect to its inputs. To calculate the derivative -of `square` at `3`, `grad(3.0)` returns `6`. - -```py -def square(x): - return tf.multiply(x, x) - -grad = tfe.gradients_function(square) - -square(3.) # => 9.0 -grad(3.) # => [6.0] - -# The second-order derivative of square: -gradgrad = tfe.gradients_function(lambda x: grad(x)[0]) -gradgrad(3.) # => [2.0] - -# The third-order derivative is None: -gradgradgrad = tfe.gradients_function(lambda x: gradgrad(x)[0]) -gradgradgrad(3.) # => [None] - - -# With flow control: -def abs(x): - return x if x > 0. else -x - -grad = tfe.gradients_function(abs) - -grad(3.) # => [1.0] -grad(-3.) # => [-1.0] -``` - -### Custom gradients - -Custom gradients are an easy way to override gradients in eager and graph -execution. Within the forward function, define the gradient with respect to the -inputs, outputs, or intermediate results. For example, here's an easy way to clip -the norm of the gradients in the backward pass: - -```py -@tf.custom_gradient -def clip_gradient_by_norm(x, norm): - y = tf.identity(x) - def grad_fn(dresult): - return [tf.clip_by_norm(dresult, norm), None] - return y, grad_fn -``` - -Custom gradients are commonly used to provide a numerically stable gradient for a -sequence of operations: - -```py -def log1pexp(x): - return tf.log(1 + tf.exp(x)) -grad_log1pexp = tfe.gradients_function(log1pexp) - -# The gradient computation works fine at x = 0. -grad_log1pexp(0.) # => [0.5] - -# However, x = 100 fails because of numerical instability. -grad_log1pexp(100.) # => [nan] -``` - -Here, the `log1pexp` function can be analytically simplified with a custom -gradient. The implementation below reuses the value for `tf.exp(x)` that is -computed during the forward pass—making it more efficient by eliminating -redundant calculations: - -```py -@tf.custom_gradient -def log1pexp(x): - e = tf.exp(x) - def grad(dy): - return dy * (1 - 1 / (1 + e)) - return tf.log(1 + e), grad - -grad_log1pexp = tfe.gradients_function(log1pexp) - -# As before, the gradient computation works fine at x = 0. -grad_log1pexp(0.) # => [0.5] - -# And the gradient computation also works at x = 100. -grad_log1pexp(100.) # => [1.0] -``` - -## Performance - -Computation is automatically offloaded to GPUs during eager execution. If you -want control over where a computation runs you can enclose it in a -`tf.device('/gpu:0')` block (or the CPU equivalent): - -```py -import time - -def measure(x, steps): - # TensorFlow initializes a GPU the first time it's used, exclude from timing. 
- tf.matmul(x, x) - start = time.time() - for i in range(steps): - x = tf.matmul(x, x) - _ = x.numpy() # Make sure to execute op and not just enqueue it - end = time.time() - return end - start - -shape = (1000, 1000) -steps = 200 -print("Time to multiply a {} matrix by itself {} times:".format(shape, steps)) - -# Run on CPU: -with tf.device("/cpu:0"): - print("CPU: {} secs".format(measure(tf.random_normal(shape), steps))) - -# Run on GPU, if available: -if tfe.num_gpus() > 0: - with tf.device("/gpu:0"): - print("GPU: {} secs".format(measure(tf.random_normal(shape), steps))) -else: - print("GPU: not found") -``` - -Output (exact numbers depend on hardware): - -``` -Time to multiply a (1000, 1000) matrix by itself 200 times: -CPU: 4.614904403686523 secs -GPU: 0.5581181049346924 secs -``` - -A `tf.Tensor` object can be copied to a different device to execute its -operations: - -```py -x = tf.random_normal([10, 10]) - -x_gpu0 = x.gpu() -x_cpu = x.cpu() - -_ = tf.matmul(x_cpu, x_cpu) # Runs on CPU -_ = tf.matmul(x_gpu0, x_gpu0) # Runs on GPU:0 - -if tfe.num_gpus() > 1: - x_gpu1 = x.gpu(1) - _ = tf.matmul(x_gpu1, x_gpu1) # Runs on GPU:1 -``` - -### Benchmarks - -For compute-heavy models, such as -[ResNet50](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples/resnet50) -training on a GPU, eager execution performance is comparable to graph execution. -But this gap grows larger for models with less computation and there is work to -be done for optimizing hot code paths for models with lots of small operations. - - -## Work with graphs - -While eager execution makes development and debugging more interactive, -TensorFlow graph execution has advantages for distributed training, performance -optimizations, and production deployment. However, writing graph code can feel -different than writing regular Python code and more difficult to debug. - -For building and training graph-constructed models, the Python program first -builds a graph representing the computation, then invokes `Session.run` to send -the graph for execution on the C++-based runtime. This provides: - -* Automatic differentiation using static autodiff. -* Simple deployment to a platform independent server. -* Graph-based optimizations (common subexpression elimination, constant-folding, etc.). -* Compilation and kernel fusion. -* Automatic distribution and replication (placing nodes on the distributed system). - -Deploying code written for eager execution is more difficult: either generate a -graph from the model, or run the Python runtime and code directly on the server. - -### Write compatible code - -The same code written for eager execution will also build a graph during graph -execution. Do this by simply running the same code in a new Python session where -eager execution is not enabled. - -Most TensorFlow operations work during eager execution, but there are some things -to keep in mind: - -* Use `tf.data` for input processing instead of queues. It's faster and easier. -* Use object-oriented layer APIs—like `tf.keras.layers` and - `tf.keras.Model`—since they have explicit storage for variables. -* Most model code works the same during eager and graph execution, but there are - exceptions. (For example, dynamic models using Python control flow to change the - computation based on inputs.) -* Once eager execution is enabled with `tf.enable_eager_execution`, it - cannot be turned off. Start a new Python session to return to graph execution. 
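-
-As a minimal sketch of this point (an illustration; the name `linear` and the
-shapes here are hypothetical, not from the TensorFlow examples), a function
-built only from graph-compatible ops runs in both modes:
-
-```py
-def linear(x, w, b):
-  # Plain TensorFlow ops; nothing here depends on eager or graph mode.
-  return tf.matmul(x, w) + b
-
-out = linear(tf.ones([2, 3]), tf.ones([3, 1]), tf.zeros([1]))
-# With tf.enable_eager_execution() called at startup, `out` is a concrete
-# value that can be printed immediately. Without it, the same lines only add
-# nodes to the default graph, and the value is produced later by
-# Session.run(out).
-```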
-
-It's best to write code for both eager execution *and* graph execution. This
-gives you eager's interactive experimentation and debuggability with the
-distributed performance benefits of graph execution.
-
-Write, debug, and iterate in eager execution, then import the model graph for
-production deployment. Use `tfe.Checkpoint` to save and restore model
-variables; this allows movement between eager and graph execution environments.
-See the examples in:
-[tensorflow/contrib/eager/python/examples](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/eager/python/examples).
-
-### Use eager execution in a graph environment
-
-Selectively enable eager execution in a TensorFlow graph environment using
-`tfe.py_func`. This is used when `tf.enable_eager_execution()` has *not*
-been called.
-
-```py
-def my_py_func(x):
-  x = tf.matmul(x, x)  # You can use tf ops
-  print(x)  # but it's eager!
-  return x
-
-with tf.Session() as sess:
-  x = tf.placeholder(dtype=tf.float32)
-  # Call eager function in graph!
-  pf = tfe.py_func(my_py_func, [x], tf.float32)
-  sess.run(pf, feed_dict={x: [[2.0]]})  # [[4.0]]
-```
diff --git a/tensorflow/docs_src/programmers_guide/embedding.md b/tensorflow/docs_src/programmers_guide/embedding.md
deleted file mode 100644
index 8a98367dfb..0000000000
--- a/tensorflow/docs_src/programmers_guide/embedding.md
+++ /dev/null
@@ -1,262 +0,0 @@
-# Embeddings
-
-This document introduces the concept of embeddings, gives a simple example of
-how to train an embedding in TensorFlow, and explains how to view embeddings
-with the TensorBoard Embedding Projector
-([live example](http://projector.tensorflow.org)). The first two parts target
-newcomers to machine learning or TensorFlow, and the Embedding Projector how-to
-is for users at all levels.
-
-An alternative tutorial on these concepts is available in the
-[Embeddings section of Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/embeddings/video-lecture).
-
-[TOC]
-
-An **embedding** is a mapping from discrete objects, such as words, to vectors
-of real numbers. For example, a 300-dimensional embedding for English words
-could include:
-
-```
-blue: (0.01359, 0.00075997, 0.24608, ..., -0.2524, 1.0048, 0.06259)
-blues: (0.01396, 0.11887, -0.48963, ..., 0.033483, -0.10007, 0.1158)
-orange: (-0.24776, -0.12359, 0.20986, ..., 0.079717, 0.23865, -0.014213)
-oranges: (-0.35609, 0.21854, 0.080944, ..., -0.35413, 0.38511, -0.070976)
-```
-
-The individual dimensions in these vectors typically have no inherent meaning.
-Instead, it's the overall patterns of location and distance between vectors
-that machine learning takes advantage of.
-
-Embeddings are important for input to machine learning. Classifiers, and neural
-networks more generally, work on vectors of real numbers. They train best on
-dense vectors, where all values contribute to define an object. However, many
-important inputs to machine learning, such as words of text, do not have a
-natural vector representation. Embedding functions are the standard and
-effective way to transform such discrete input objects into useful
-continuous vectors.
-
-Embeddings are also valuable as outputs of machine learning. Because embeddings
-map objects to vectors, applications can use similarity in vector space (for
-instance, Euclidean distance or the angle between vectors) as a robust and
-flexible measure of object similarity. One common use is to find nearest
-neighbors. 
Using the same word embeddings as above, for instance, here are the
-three nearest neighbors for each word and the corresponding angles:
-
-```
-blue: (red, 47.6°), (yellow, 51.9°), (purple, 52.4°)
-blues: (jazz, 53.3°), (folk, 59.1°), (bluegrass, 60.6°)
-orange: (yellow, 53.5°), (colored, 58.0°), (bright, 59.9°)
-oranges: (apples, 45.3°), (lemons, 48.3°), (mangoes, 50.4°)
-```
-
-This would tell an application that apples and oranges are in some way more
-similar (45.3° apart) than lemons and oranges (48.3° apart).
-
-## Embeddings in TensorFlow
-
-To create word embeddings in TensorFlow, we first split the text into words
-and then assign an integer to every word in the vocabulary. Let us assume that
-this has already been done, and that `word_ids` is a vector of these integers.
-For example, the sentence “I have a cat.” could be split into
-`["I", "have", "a", "cat", "."]` and then the corresponding `word_ids` tensor
-would have shape `[5]` and consist of 5 integers. To map these word ids
-to vectors, we need to create the embedding variable and use the
-`tf.nn.embedding_lookup` function as follows:
-
-```
-word_embeddings = tf.get_variable("word_embeddings",
-                                  [vocabulary_size, embedding_size])
-embedded_word_ids = tf.nn.embedding_lookup(word_embeddings, word_ids)
-```
-
-After this, the tensor `embedded_word_ids` will have shape `[5, embedding_size]`
-in our example and contain the embeddings (dense vectors) for each of the 5
-words. At the end of training, `word_embeddings` will contain the embeddings
-for all words in the vocabulary.
-
-Embeddings can be trained in many network types, and with various loss
-functions and data sets. For example, one could use a recurrent neural network
-to predict the next word from the previous one given a large corpus of
-sentences, or one could train two networks to do multi-lingual translation.
-These methods are described in the @{$word2vec$Vector Representations of Words}
-tutorial.
-
-## Visualizing Embeddings
-
-TensorBoard includes the **Embedding Projector**, a tool that lets you
-interactively visualize embeddings. This tool can read embeddings from your
-model and render them in two or three dimensions.
-
-The Embedding Projector has three panels:
-
-- *Data panel* on the top left, where you can choose the run, the embedding
-  variable and data columns to color and label points by.
-- *Projections panel* on the bottom left, where you can choose the type of
-  projection.
-- *Inspector panel* on the right side, where you can search for particular
-  points and see a list of nearest neighbors.
-
-### Projections
-
-The Embedding Projector provides three ways to reduce the dimensionality of a
-data set.
-
-- *[t-SNE](https://en.wikipedia.org/wiki/T-distributed_stochastic_neighbor_embedding)*:
-  a nonlinear nondeterministic algorithm (T-distributed stochastic neighbor
-  embedding) that tries to preserve local neighborhoods in the data, often at
-  the expense of distorting global structure. You can choose whether to compute
-  two- or three-dimensional projections.
-
-- *[PCA](https://en.wikipedia.org/wiki/Principal_component_analysis)*:
-  a linear deterministic algorithm (principal component analysis) that tries to
-  capture as much of the data variability in as few dimensions as possible. PCA
-  tends to highlight large-scale structure in the data, but can distort local
-  neighborhoods. The Embedding Projector computes the top 10 principal
-  components, from which you can choose two or three to view. 
-
-- *Custom*: a linear projection onto horizontal and vertical axes that you
-  specify using labels in the data. You define the horizontal axis, for
-  instance, by giving text patterns for "Left" and "Right". The Embedding
-  Projector finds all points whose label matches the "Left" pattern and
-  computes the centroid of that set; similarly for "Right". The line passing
-  through these two centroids defines the horizontal axis. The vertical axis is
-  likewise computed from the centroids for points matching the "Up" and "Down"
-  text patterns.
-
-Further useful articles are
-[How to Use t-SNE Effectively](https://distill.pub/2016/misread-tsne/) and
-[Principal Component Analysis Explained Visually](http://setosa.io/ev/principal-component-analysis/).
-
-### Exploration
-
-You can explore visually by zooming, rotating, and panning using natural
-click-and-drag gestures. Hovering your mouse over a point will show any
-[metadata](#metadata) for that point. You can also inspect nearest-neighbor
-subsets. Clicking on a point causes the right pane to list the nearest
-neighbors, along with distances to the current point. The nearest-neighbor
-points are also highlighted in the projection.
-
-It is sometimes useful to restrict the view to a subset of points and perform
-projections only on those points. To do so, you can select points in multiple
-ways:
-
-- After clicking on a point, its nearest neighbors are also selected.
-- After a search, the points matching the query are selected.
-- Enabling selection, clicking on a point and dragging defines a selection
-  sphere.
-
-Then click the "Isolate *nnn* points" button at the top of the Inspector pane
-on the right hand side. The following image shows 101 points selected and ready
-for the user to click "Isolate 101 points":
-
-![Selection of nearest neighbors](https://www.tensorflow.org/images/embedding-nearest-points.png "Selection of nearest neighbors")
-
-*Selection of the nearest neighbors of “important” in a word embedding dataset.*
-
-Advanced tip: filtering with custom projection can be powerful. Below, we
-filtered the 100 nearest neighbors of “politics” and projected them onto the
-“worst” - “best” vector as an x axis. The y axis is random. As a result, one
-finds on the right side “ideas”, “science”, “perspective”, “journalism” but on
-the left “crisis”, “violence” and “conflict”.
-
-*[Figure: the custom projection controls (left) and the resulting custom
-projection of the neighbors of “politics” onto the “best” - “worst” vector
-(right).]*
-
-To share your findings, you can use the bookmark panel in the bottom right
-corner and save the current state (including computed coordinates of any
-projection) as a small file. The Projector can then be pointed to a set of one
-or more of these files, producing the panel below. Other users can then walk
-through a sequence of bookmarks.
-
-*[Image: the bookmark panel.]*
-
-### Metadata
-
-If you are working with an embedding, you'll probably want to attach
-labels/images to the data points. You can do this by generating a metadata file
-containing the labels for each point and clicking "Load data" in the data panel
-of the Embedding Projector.
-
-The metadata can be either labels or images, which are
-stored in a separate file. For labels, the format should
-be a [TSV file](https://en.wikipedia.org/wiki/Tab-separated_values)
-(tab characters shown below as `\t`) whose first line contains column headers
-and subsequent lines contain the metadata values. For example:
-
-```
-Word\tFrequency
-Airplane\t345
-Car\t241
-...
-```
-
-The order of lines in the metadata file is assumed to match the order of
-vectors in the embedding variable, except for the header. Consequently, the
-(i+1)-th line in the metadata file corresponds to the i-th row of the embedding
-variable. If the TSV metadata file has only a single column, then we don't
-expect a header row, and assume each row is the label of the embedding. We
-include this exception because it matches the commonly-used "vocab file"
-format.
-
-To use images as metadata, you must produce a single
-[sprite image](https://www.google.com/webhp#q=what+is+a+sprite+image),
-consisting of small thumbnails, one for each vector in the embedding. The
-sprite should store thumbnails in row-first order: the first data point placed
-in the top left and the last data point in the bottom right, though the last
-row doesn't have to be filled, as shown below.
-
-```
-0 1 2
-3 4 5
-6 7
-```
- -Follow [this link](https://www.tensorflow.org/images/embedding-mnist.mp4) -to see a fun example of thumbnail images in the Embedding Projector. - - -## Mini-FAQ - -**Is "embedding" an action or a thing?** -Both. People talk about embedding words in a vector space (action) and about -producing word embeddings (things). Common to both is the notion of embedding -as a mapping from discrete objects to vectors. Creating or applying that -mapping is an action, but the mapping itself is a thing. - -**Are embeddings high-dimensional or low-dimensional?** -It depends. A 300-dimensional vector space of words and phrases, for instance, -is often called low-dimensional (and dense) when compared to the millions of -words and phrases it can contain. But mathematically it is high-dimensional, -displaying many properties that are dramatically different from what our human -intuition has learned about 2- and 3-dimensional spaces. - -**Is an embedding the same as an embedding layer?** -No. An *embedding layer* is a part of neural network, but an *embedding* is a more -general concept. diff --git a/tensorflow/docs_src/programmers_guide/estimators.md b/tensorflow/docs_src/programmers_guide/estimators.md deleted file mode 100644 index b13b47184d..0000000000 --- a/tensorflow/docs_src/programmers_guide/estimators.md +++ /dev/null @@ -1,193 +0,0 @@ -# Estimators - -This document introduces @{tf.estimator$**Estimators**}--a high-level TensorFlow -API that greatly simplifies machine learning programming. Estimators encapsulate -the following actions: - -* training -* evaluation -* prediction -* export for serving - -You may either use the pre-made Estimators we provide or write your -own custom Estimators. All Estimators--whether pre-made or custom--are -classes based on the @{tf.estimator.Estimator} class. - -Note: TensorFlow also includes a deprecated `Estimator` class at -@{tf.contrib.learn.Estimator}, which you should not use. - - -## Advantages of Estimators - -Estimators provide the following benefits: - -* You can run Estimator-based models on a local host or on a - distributed multi-server environment without changing your model. - Furthermore, you can run Estimator-based models on CPUs, GPUs, - or TPUs without recoding your model. -* Estimators simplify sharing implementations between model developers. -* You can develop a state of the art model with high-level intuitive code. - In short, it is generally much easier to create models with Estimators - than with the low-level TensorFlow APIs. -* Estimators are themselves built on @{tf.layers}, which - simplifies customization. -* Estimators build the graph for you. -* Estimators provide a safe distributed training loop that controls how and - when to: - * build the graph - * initialize variables - * start queues - * handle exceptions - * create checkpoint files and recover from failures - * save summaries for TensorBoard - -When writing an application with Estimators, you must separate the data input -pipeline from the model. This separation simplifies experiments with -different data sets. - - -## Pre-made Estimators - -Pre-made Estimators enable you to work at a much higher conceptual level -than the base TensorFlow APIs. You no longer have to worry about creating -the computational graph or sessions since Estimators handle all -the "plumbing" for you. That is, pre-made Estimators create and manage -@{tf.Graph$`Graph`} and @{tf.Session$`Session`} objects for you. 
Furthermore, -pre-made Estimators let you experiment with different model architectures by -making only minimal code changes. @{tf.estimator.DNNClassifier$`DNNClassifier`}, -for example, is a pre-made Estimator class that trains classification models -based on dense, feed-forward neural networks. - - -### Structure of a pre-made Estimators program - -A TensorFlow program relying on a pre-made Estimator typically consists -of the following four steps: - -1. **Write one or more dataset importing functions.** For example, you might - create one function to import the training set and another function to - import the test set. Each dataset importing function must return two - objects: - - * a dictionary in which the keys are feature names and the - values are Tensors (or SparseTensors) containing the corresponding - feature data - * a Tensor containing one or more labels - - For example, the following code illustrates the basic skeleton for - an input function: - - def input_fn(dataset): - ... # manipulate dataset, extracting the feature dict and the label - return feature_dict, label - - (See @{$programmers_guide/datasets} for full details.) - -2. **Define the feature columns.** Each @{tf.feature_column} - identifies a feature name, its type, and any input pre-processing. - For example, the following snippet creates three feature - columns that hold integer or floating-point data. The first two - feature columns simply identify the feature's name and type. The - third feature column also specifies a lambda the program will invoke - to scale the raw data: - - # Define three numeric feature columns. - population = tf.feature_column.numeric_column('population') - crime_rate = tf.feature_column.numeric_column('crime_rate') - median_education = tf.feature_column.numeric_column('median_education', - normalizer_fn=lambda x: x - global_education_mean) - -3. **Instantiate the relevant pre-made Estimator.** For example, here's - a sample instantiation of a pre-made Estimator named `LinearClassifier`: - - # Instantiate an estimator, passing the feature columns. - estimator = tf.estimator.LinearClassifier( - feature_columns=[population, crime_rate, median_education], - ) - -4. **Call a training, evaluation, or inference method.** - For example, all Estimators provide a `train` method, which trains a model. - - # my_training_set is the function created in Step 1 - estimator.train(input_fn=my_training_set, steps=2000) - - -### Benefits of pre-made Estimators - -Pre-made Estimators encode best practices, providing the following benefits: - -* Best practices for determining where different parts of the computational - graph should run, implementing strategies on a single machine or on a - cluster. -* Best practices for event (summary) writing and universally useful - summaries. - -If you don't use pre-made Estimators, you must implement the preceding -features yourself. - - -## Custom Estimators - -The heart of every Estimator--whether pre-made or custom--is its -**model function**, which is a method that builds graphs for training, -evaluation, and prediction. When you are using a pre-made Estimator, -someone else has already implemented the model function. When relying -on a custom Estimator, you must write the model function yourself. A -@{$custom_estimators$companion document} -explains how to write the model function. - - -## Recommended workflow - -We recommend the following workflow: - -1. 
Assuming a suitable pre-made Estimator exists, use it to build your
   first model and use its results to establish a baseline.
2. Use this pre-made Estimator to build and test your overall pipeline,
   including the integrity and reliability of your data.
3. If suitable alternative pre-made Estimators are available, run
   experiments to determine which pre-made Estimator produces the
   best results.
4. Possibly, further improve your model by building your own custom Estimator.


## Creating Estimators from Keras models

You can convert existing Keras models to Estimators. Doing so enables your Keras
model to access Estimator's strengths, such as distributed training. Call
@{tf.keras.estimator.model_to_estimator} as in the
following sample:

```python
# Instantiate a Keras inception v3 model.
keras_inception_v3 = tf.keras.applications.inception_v3.InceptionV3(weights=None)
# Compile model with the optimizer, loss, and metrics you'd like to train with.
keras_inception_v3.compile(optimizer=tf.keras.optimizers.SGD(lr=0.0001, momentum=0.9),
                           loss='categorical_crossentropy',
                           metrics=['accuracy'])
# Create an Estimator from the compiled Keras model. Note the initial model
# state of the Keras model is preserved in the created Estimator.
est_inception_v3 = tf.keras.estimator.model_to_estimator(keras_model=keras_inception_v3)

# Treat the derived Estimator as you would any other Estimator.
# First, recover the input name(s) of the Keras model, so we can use them as the
# feature column name(s) of the Estimator input function:
keras_inception_v3.input_names  # prints: ['input_1']
# Once we have the input name(s), we can create the input function, for example,
# for inputs in the form of numpy ndarrays:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"input_1": train_data},
    y=train_labels,
    num_epochs=1,
    shuffle=False)
# To train, we call the Estimator's train function:
est_inception_v3.train(input_fn=train_input_fn, steps=2000)
```

Note that the names of the feature columns and labels of a Keras estimator come from
the corresponding compiled Keras model. For example, the input key names for
`train_input_fn` above can be obtained from `keras_inception_v3.input_names`,
and similarly, the predicted output names can be obtained from
`keras_inception_v3.output_names`.

For more details, please refer to the documentation for
@{tf.keras.estimator.model_to_estimator}.

diff --git a/tensorflow/docs_src/programmers_guide/faq.md b/tensorflow/docs_src/programmers_guide/faq.md
deleted file mode 100644
index b6291a9ffa..0000000000
--- a/tensorflow/docs_src/programmers_guide/faq.md
+++ /dev/null
@@ -1,297 +0,0 @@
# Frequently Asked Questions

This document provides answers to some of the frequently asked questions about
TensorFlow. If you have a question that is not covered here, you might find an
answer on one of the TensorFlow @{$about$community resources}.

[TOC]

## Features and Compatibility

#### Can I run distributed training on multiple computers?

Yes! TensorFlow gained
@{$distributed$support for distributed computation} in
version 0.8. TensorFlow now supports multiple devices (CPUs and GPUs) in one or
more computers.

#### Does TensorFlow work with Python 3?

As of the 0.6.0 release timeframe (early December 2015), we do support Python
3.3+.

## Building a TensorFlow graph

See also the
@{$python/framework$API documentation on building graphs}.
- -#### Why does `c = tf.matmul(a, b)` not execute the matrix multiplication immediately? - -In the TensorFlow Python API, `a`, `b`, and `c` are -@{tf.Tensor} objects. A `Tensor` object is -a symbolic handle to the result of an operation, but does not actually hold the -values of the operation's output. Instead, TensorFlow encourages users to build -up complicated expressions (such as entire neural networks and its gradients) as -a dataflow graph. You then offload the computation of the entire dataflow graph -(or a subgraph of it) to a TensorFlow -@{tf.Session}, which is able to execute the -whole computation much more efficiently than executing the operations -one-by-one. - -#### How are devices named? - -The supported device names are `"/device:CPU:0"` (or `"/cpu:0"`) for the CPU -device, and `"/device:GPU:i"` (or `"/gpu:i"`) for the *i*th GPU device. - -#### How do I place operations on a particular device? - -To place a group of operations on a device, create them within a -@{tf.device$`with tf.device(name):`} context. See -the how-to documentation on -@{$using_gpu$using GPUs with TensorFlow} for details of how -TensorFlow assigns operations to devices, and the -@{$deep_cnn$CIFAR-10 tutorial} for an example model that -uses multiple GPUs. - - -## Running a TensorFlow computation - -See also the -@{$python/client$API documentation on running graphs}. - -#### What's the deal with feeding and placeholders? - -Feeding is a mechanism in the TensorFlow Session API that allows you to -substitute different values for one or more tensors at run time. The `feed_dict` -argument to @{tf.Session.run} is a -dictionary that maps @{tf.Tensor} objects to -numpy arrays (and some other types), which will be used as the values of those -tensors in the execution of a step. - -#### What is the difference between `Session.run()` and `Tensor.eval()`? - -If `t` is a @{tf.Tensor} object, -@{tf.Tensor.eval} is shorthand for -@{tf.Session.run}, where `sess` is the -current @{tf.get_default_session}. The -two following snippets of code are equivalent: - -```python -# Using `Session.run()`. -sess = tf.Session() -c = tf.constant(5.0) -print(sess.run(c)) - -# Using `Tensor.eval()`. -c = tf.constant(5.0) -with tf.Session(): - print(c.eval()) -``` - -In the second example, the session acts as a -[context manager](https://docs.python.org/2.7/reference/compound_stmts.html#with), -which has the effect of installing it as the default session for the lifetime of -the `with` block. The context manager approach can lead to more concise code for -simple use cases (like unit tests); if your code deals with multiple graphs and -sessions, it may be more straightforward to make explicit calls to -`Session.run()`. - -#### Do Sessions have a lifetime? What about intermediate tensors? - -Sessions can own resources, such as -@{tf.Variable}, -@{tf.QueueBase}, and -@{tf.ReaderBase}. These resources can sometimes use -a significant amount of memory, and can be released when the session is closed by calling -@{tf.Session.close}. - -The intermediate tensors that are created as part of a call to -@{$python/client$`Session.run()`} will be freed at or before the -end of the call. - -#### Does the runtime parallelize parts of graph execution? - -The TensorFlow runtime parallelizes graph execution across many different -dimensions: - -* The individual ops have parallel implementations, using multiple cores in a - CPU, or multiple threads in a GPU. 
* Independent nodes in a TensorFlow graph can run in parallel on multiple
  devices, which makes it possible to speed up
  @{$deep_cnn$CIFAR-10 training using multiple GPUs}.
* The Session API allows multiple concurrent steps (i.e. calls to
  @{tf.Session.run} in parallel). This
  enables the runtime to get higher throughput, if a single step does not use
  all of the resources in your computer.

#### Which client languages are supported in TensorFlow?

TensorFlow is designed to support multiple client languages.
Currently, the best-supported client language is [Python](../api_docs/python/index.md). Experimental interfaces for
executing and constructing graphs are also available for
[C++](../api_docs/cc/index.md), [Java](../api_docs/java/reference/org/tensorflow/package-summary.html), and [Go](https://godoc.org/github.com/tensorflow/tensorflow/tensorflow/go).

TensorFlow also has a
[C-based client API](https://www.tensorflow.org/code/tensorflow/c/c_api.h)
to help build support for more client languages. We invite contributions of new
language bindings.

Bindings for various other languages (such as [C#](https://github.com/migueldeicaza/TensorFlowSharp), [Julia](https://github.com/malmaud/TensorFlow.jl), [Ruby](https://github.com/somaticio/tensorflow.rb), and [Scala](https://github.com/eaplatanios/tensorflow_scala)), created and supported by the open source community, build on top of the C API supported by the TensorFlow maintainers.

#### Does TensorFlow make use of all the devices (GPUs and CPUs) available on my machine?

TensorFlow supports multiple GPUs and CPUs. See the how-to documentation on
@{$using_gpu$using GPUs with TensorFlow} for details of how
TensorFlow assigns operations to devices, and the
@{$deep_cnn$CIFAR-10 tutorial} for an example model that
uses multiple GPUs.

Note that TensorFlow only uses GPU devices with a compute capability greater
than 3.5.

#### Why does `Session.run()` hang when using a reader or a queue?

The @{tf.ReaderBase} and
@{tf.QueueBase} classes provide special operations that
can *block* until input (or free space in a bounded queue) becomes
available. These operations allow you to build sophisticated
@{$reading_data$input pipelines}, at the cost of making the
TensorFlow computation somewhat more complicated. See the how-to documentation
for
@{$reading_data#creating_threads_to_prefetch_using_queuerunner_objects$using `QueueRunner` objects to drive queues and readers}
for more information on how to use them.

## Variables

See also the how-to documentation on @{$variables$variables} and
@{$python/state_ops$the API documentation for variables}.

#### What is the lifetime of a variable?

A variable is created when you first run the
@{tf.Variable.initializer}
operation for that variable in a session. It is destroyed when that
@{tf.Session.close$session is closed}.

#### How do variables behave when they are concurrently accessed?

Variables allow concurrent read and write operations. The value read from a
variable may change if it is concurrently updated. By default, concurrent
assignment operations to a variable are allowed to run with no mutual exclusion.
To acquire a lock when assigning to a variable, pass `use_locking=True` to
@{tf.Variable.assign}.

## Tensor shapes

See also the
@{tf.TensorShape} API documentation.

#### How can I determine the shape of a tensor in Python?

In TensorFlow, a tensor has both a static (inferred) shape and a dynamic (true)
shape.
The static shape can be read using the -@{tf.Tensor.get_shape} -method: this shape is inferred from the operations that were used to create the -tensor, and may be -@{tf.TensorShape$partially complete}. If the static -shape is not fully defined, the dynamic shape of a `Tensor` `t` can be -determined by evaluating @{tf.shape$`tf.shape(t)`}. - -#### What is the difference between `x.set_shape()` and `x = tf.reshape(x)`? - -The @{tf.Tensor.set_shape} method updates -the static shape of a `Tensor` object, and it is typically used to provide -additional shape information when this cannot be inferred directly. It does not -change the dynamic shape of the tensor. - -The @{tf.reshape} operation creates -a new tensor with a different dynamic shape. - -#### How do I build a graph that works with variable batch sizes? - -It is often useful to build a graph that works with variable batch sizes -so that the same code can be used for (mini-)batch training, and -single-instance inference. The resulting graph can be -@{tf.Graph.as_graph_def$saved as a protocol buffer} -and -@{tf.import_graph_def$imported into another program}. - -When building a variable-size graph, the most important thing to remember is not -to encode the batch size as a Python constant, but instead to use a symbolic -`Tensor` to represent it. The following tips may be useful: - -* Use [`batch_size = tf.shape(input)[0]`](../api_docs/python/array_ops.md#shape) - to extract the batch dimension from a `Tensor` called `input`, and store it in - a `Tensor` called `batch_size`. - -* Use @{tf.reduce_mean} instead - of `tf.reduce_sum(...) / batch_size`. - - -## TensorBoard - -#### How can I visualize a TensorFlow graph? - -See the @{$graph_viz$graph visualization tutorial}. - -#### What is the simplest way to send data to TensorBoard? - -Add summary ops to your TensorFlow graph, and write -these summaries to a log directory. Then, start TensorBoard using - - python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory - -For more details, see the -@{$summaries_and_tensorboard$Summaries and TensorBoard tutorial}. - -#### Every time I launch TensorBoard, I get a network security popup! - -You can change TensorBoard to serve on localhost rather than '0.0.0.0' by -the flag --host=localhost. This should quiet any security warnings. - -## Extending TensorFlow - -See the how-to documentation for -@{$adding_an_op$adding a new operation to TensorFlow}. - -#### My data is in a custom format. How do I read it using TensorFlow? - -There are three main options for dealing with data in a custom format. - -The easiest option is to write parsing code in Python that transforms the data -into a numpy array. Then, use @{tf.data.Dataset.from_tensor_slices} to -create an input pipeline from the in-memory data. - -If your data doesn't fit in memory, try doing the parsing in the Dataset -pipeline. Start with an appropriate file reader, like -@{tf.data.TextLineDataset}. Then convert the dataset by mapping -@{tf.data.Dataset.map$mapping} appropriate operations over it. -Prefer predefined TensorFlow operations such as @{tf.decode_raw}, -@{tf.decode_csv}, @{tf.parse_example}, or @{tf.image.decode_png}. - -If your data is not easily parsable with the built-in TensorFlow operations, -consider converting it, offline, to a format that is easily parsable, such -as @{tf.python_io.TFRecordWriter$`TFRecord`} format. - -The most efficient method to customize the parsing behavior is to -@{$adding_an_op$add a new op written in C++} that parses your -data format. 
The @{$new_data_formats$guide to handling new data formats} has -more information about the steps for doing this. - - -## Miscellaneous - -#### What is TensorFlow's coding style convention? - -The TensorFlow Python API adheres to the -[PEP8](https://www.python.org/dev/peps/pep-0008/) conventions.* In -particular, we use `CamelCase` names for classes, and `snake_case` names for -functions, methods, and properties. We also adhere to the -[Google Python style guide](https://google.github.io/styleguide/pyguide.html). - -The TensorFlow C++ code base adheres to the -[Google C++ style guide](https://google.github.io/styleguide/cppguide.html). - -(* With one exception: we use 2-space indentation instead of 4-space -indentation.) - diff --git a/tensorflow/docs_src/programmers_guide/feature_columns.md b/tensorflow/docs_src/programmers_guide/feature_columns.md deleted file mode 100644 index 90f5c53a17..0000000000 --- a/tensorflow/docs_src/programmers_guide/feature_columns.md +++ /dev/null @@ -1,572 +0,0 @@ -# Feature Columns - -This document details feature columns. Think of **feature columns** as the -intermediaries between raw data and Estimators. Feature columns are very rich, -enabling you to transform a diverse range of raw data into formats that -Estimators can use, allowing easy experimentation. - -In @{$premade_estimators$Premade Estimators}, we used the premade -Estimator, @{tf.estimator.DNNClassifier$`DNNClassifier`} to train a model to -predict different types of Iris flowers from four input features. That example -created only numerical feature columns (of type -@{tf.feature_column.numeric_column}). Although numerical feature columns model -the lengths of petals and sepals effectively, real world data sets contain all -kinds of features, many of which are non-numerical. - -
*Figure: Some real-world features (such as longitude) are numerical, but many are not.*
- -## Input to a Deep Neural Network - -What kind of data can a deep neural network operate on? The answer -is, of course, numbers (for example, `tf.float32`). After all, every neuron in -a neural network performs multiplication and addition operations on weights and -input data. Real-life input data, however, often contains non-numerical -(categorical) data. For example, consider a `product_class` feature that can -contain the following three non-numerical values: - -* `kitchenware` -* `electronics` -* `sports` - -ML models generally represent categorical values as simple vectors in which a -1 represents the presence of a value and a 0 represents the absence of a value. -For example, when `product_class` is set to `sports`, an ML model would usually -represent `product_class` as `[0, 0, 1]`, meaning: - -* `0`: `kitchenware` is absent -* `0`: `electronics` is absent -* `1`: `sports` is present - -So, although raw data can be numerical or categorical, an ML model represents -all features as numbers. - -## Feature Columns - -As the following figure suggests, you specify the input to a model through the -`feature_columns` argument of an Estimator (`DNNClassifier` for Iris). -Feature Columns bridge input data (as returned by `input_fn`) with your model. - -
*Figure: Feature columns bridge raw data with the data your model needs.*
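As a minimal sketch of that bridge (reusing the Iris feature names from the example above; the hyperparameters here are illustrative), each column's key matches a key in the features dictionary returned by `input_fn`, and the resulting list is passed to the Estimator's constructor:

```python
# Each column's key must match a key in the dict returned by input_fn.
feature_columns = [
    tf.feature_column.numeric_column(key)
    for key in ['SepalLength', 'SepalWidth', 'PetalLength', 'PetalWidth']
]

# The feature columns are handed to the Estimator at construction time.
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 10],
    n_classes=3)
```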
- -To create feature columns, call functions from the -@{tf.feature_column} module. This document explains nine of the functions in -that module. As the following figure shows, all nine functions return either a -Categorical-Column or a Dense-Column object, except `bucketized_column`, which -inherits from both classes: - -
*Figure: Feature column methods fall into two main categories and one hybrid category.*
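To make the distinction concrete, here is a small sketch (the `price` and `product_class` feature names are hypothetical): a numeric column is dense and can feed a deep model directly, while a categorical column must first be wrapped, for example in the indicator column described later in this document:

```python
# Dense column: usable by a deep model as-is.
price = tf.feature_column.numeric_column('price')

# Categorical column: must be wrapped before a deep model can consume it.
product_class = tf.feature_column.categorical_column_with_vocabulary_list(
    'product_class', ['kitchenware', 'electronics', 'sports'])
product_class_dense = tf.feature_column.indicator_column(product_class)
```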
- -Let's look at these functions in more detail. - -### Numeric column - -The Iris classifier calls the @{tf.feature_column.numeric_column} function for -all input features: - - * `SepalLength` - * `SepalWidth` - * `PetalLength` - * `PetalWidth` - -Although `tf.numeric_column` provides optional arguments, calling -`tf.numeric_column` without any arguments, as follows, is a fine way to specify -a numerical value with the default data type (`tf.float32`) as input to your -model: - -```python -# Defaults to a tf.float32 scalar. -numeric_feature_column = tf.feature_column.numeric_column(key="SepalLength") -``` - -To specify a non-default numerical data type, use the `dtype` argument. For -example: - -``` python -# Represent a tf.float64 scalar. -numeric_feature_column = tf.feature_column.numeric_column(key="SepalLength", - dtype=tf.float64) -``` - -By default, a numeric column creates a single value (scalar). Use the shape -argument to specify another shape. For example: - - -```python -# Represent a 10-element vector in which each cell contains a tf.float32. -vector_feature_column = tf.feature_column.numeric_column(key="Bowling", - shape=10) - -# Represent a 10x5 matrix in which each cell contains a tf.float32. -matrix_feature_column = tf.feature_column.numeric_column(key="MyMatrix", - shape=[10,5]) -``` -### Bucketized column - -Often, you don't want to feed a number directly into the model, but instead -split its value into different categories based on numerical ranges. To do so, -create a @{tf.feature_column.bucketized_column$bucketized column}. For -example, consider raw data that represents the year a house was built. Instead -of representing that year as a scalar numeric column, we could split the year -into the following four buckets: - -
*Figure: Dividing year data into four buckets.*
- -The model will represent the buckets as follows: - -|Date Range |Represented as... | -|:----------|:-----------------| -|< 1960 | [1, 0, 0, 0] | -|>= 1960 but < 1980 | [0, 1, 0, 0] | -|>= 1980 but < 2000 | [0, 0, 1, 0] | -|>= 2000 | [0, 0, 0, 1] | - -Why would you want to split a number—a perfectly valid input to your -model—into a categorical value? Well, notice that the categorization splits a -single input number into a four-element vector. Therefore, the model now can -learn _four individual weights_ rather than just one; four weights creates a -richer model than one weight. More importantly, bucketizing enables the model -to clearly distinguish between different year categories since only one of the -elements is set (1) and the other three elements are cleared (0). For example, -when we just use a single number (a year) as input, a linear model can only -learn a linear relationship. So, bucketing provides the model with additional -flexibility that the model can use to learn. - -The following code demonstrates how to create a bucketized feature: - - -```python -# First, convert the raw input to a numeric column. -numeric_feature_column = tf.feature_column.numeric_column("Year") - -# Then, bucketize the numeric column on the years 1960, 1980, and 2000. -bucketized_feature_column = tf.feature_column.bucketized_column( - source_column = numeric_feature_column, - boundaries = [1960, 1980, 2000]) -``` -Note that specifying a _three_-element boundaries vector creates a -_four_-element bucketized vector. - - -### Categorical identity column - -**Categorical identity columns** can be seen as a special case of bucketized -columns. In traditional bucketized columns, each bucket represents a range of -values (for example, from 1960 to 1979). In a categorical identity column, each -bucket represents a single, unique integer. For example, let's say you want to -represent the integer range `[0, 4)`. That is, you want to represent the -integers 0, 1, 2, or 3. In this case, the categorical identity mapping looks -like this: - -
*Figure: A categorical identity column mapping. Note that this is a one-hot encoding, not a binary numerical encoding.*
- -As with bucketized columns, a model can learn a separate weight for each class -in a categorical identity column. For example, instead of using a string to -represent the `product_class`, let's represent each class with a unique integer -value. That is: - -* `0="kitchenware"` -* `1="electronics"` -* `2="sport"` - -Call @{tf.feature_column.categorical_column_with_identity} to implement a -categorical identity column. For example: - -``` python -# Create categorical output for an integer feature named "my_feature_b", -# The values of my_feature_b must be >= 0 and < num_buckets -identity_feature_column = tf.feature_column.categorical_column_with_identity( - key='my_feature_b', - num_buckets=4) # Values [0, 4) - -# In order for the preceding call to work, the input_fn() must return -# a dictionary containing 'my_feature_b' as a key. Furthermore, the values -# assigned to 'my_feature_b' must belong to the set [0, 4). -def input_fn(): - ... - return ({ 'my_feature_a':[7, 9, 5, 2], 'my_feature_b':[3, 1, 2, 2] }, - [Label_values]) -``` - -### Categorical vocabulary column - -We cannot input strings directly to a model. Instead, we must first map strings -to numeric or categorical values. Categorical vocabulary columns provide a good -way to represent strings as a one-hot vector. For example: - -
*Figure: Mapping string values to vocabulary columns.*
As you can see, categorical vocabulary columns are kind of an enum version of
categorical identity columns. TensorFlow provides two different functions to
create categorical vocabulary columns:

* @{tf.feature_column.categorical_column_with_vocabulary_list}
* @{tf.feature_column.categorical_column_with_vocabulary_file}

`categorical_column_with_vocabulary_list` maps each string to an integer based
on an explicit vocabulary list. For example:

```python
# Given input "feature_name_from_input_fn" which is a string,
# create a categorical feature by mapping the input to one of
# the elements in the vocabulary list.
vocabulary_feature_column = \
    tf.feature_column.categorical_column_with_vocabulary_list(
        key=feature_name_from_input_fn,
        vocabulary_list=["kitchenware", "electronics", "sports"])
```

The preceding function is pretty straightforward, but it has a significant
drawback. Namely, there's way too much typing when the vocabulary list is long.
For these cases, call
`tf.feature_column.categorical_column_with_vocabulary_file` instead, which lets
you place the vocabulary words in a separate file. For example:

```python
# Given input "feature_name_from_input_fn" which is a string,
# create a categorical feature for our model by mapping the input to one of
# the elements in the vocabulary file.
vocabulary_feature_column = \
    tf.feature_column.categorical_column_with_vocabulary_file(
        key=feature_name_from_input_fn,
        vocabulary_file="product_class.txt",
        vocabulary_size=3)
```

`product_class.txt` should contain one line for each vocabulary element. In our
case:

```None
kitchenware
electronics
sports
```

### Hashed Column

So far, we've worked with a naively small number of categories. For example,
our product_class example has only 3 categories. Often though, the number of
categories can be so big that it's not possible to have individual categories
for each vocabulary word or integer, because that would consume too much memory.
For these cases, we can instead turn the question around and ask, "How many
categories am I willing to have for my input?" In fact, the
@{tf.feature_column.categorical_column_with_hash_bucket} function enables you
to specify the number of categories. For this type of feature column the model
calculates a hash value of the input, then puts it into one of
the `hash_bucket_size` categories using the modulo operator, as in the following
pseudocode:

```python
# pseudocode
feature_id = hash(raw_feature) % hash_bucket_size
```

The code to create the `feature_column` might look something like this:

```python
hashed_feature_column = \
    tf.feature_column.categorical_column_with_hash_bucket(
        key="some_feature",
        hash_bucket_size=100)  # The number of categories
```

At this point, you might rightfully think: "This is crazy!" After all, we are
forcing the different input values to a smaller set of categories. This means
that two probably unrelated inputs will be mapped to the same
category, and consequently mean the same thing to the neural network. The
following figure illustrates this dilemma, showing that kitchenware and sports
both get assigned to category (hash bucket) 12:
*Figure: Representing data with hash buckets.*
- -As with many counterintuitive phenomena in machine learning, it turns out that -hashing often works well in practice. That's because hash categories provide -the model with some separation. The model can use additional features to further -separate kitchenware from sports. - -### Crossed column - -Combining features into a single feature, better known as -[feature crosses](https://developers.google.com/machine-learning/glossary/#feature_cross), -enables the model to learn separate weights for each combination of -features. - -More concretely, suppose we want our model to calculate real estate prices in -Atlanta, GA. Real-estate prices within this city vary greatly depending on -location. Representing latitude and longitude as separate features isn't very -useful in identifying real-estate location dependencies; however, crossing -latitude and longitude into a single feature can pinpoint locations. Suppose we -represent Atlanta as a grid of 100x100 rectangular sections, identifying each -of the 10,000 sections by a feature cross of latitude and longitude. This -feature cross enables the model to train on pricing conditions related to each -individual section, which is a much stronger signal than latitude and longitude -alone. - -The following figure shows our plan, with the latitude & longitude values for -the corners of the city in red text: - -
*Figure: Map of Atlanta. Imagine this map divided into 10,000 sections of equal size.*
For the solution, we used a combination of the `bucketized_column` we looked at
earlier, with the @{tf.feature_column.crossed_column} function.

```python
def make_dataset(latitude, longitude, labels):
    assert latitude.shape == longitude.shape == labels.shape

    features = {'latitude': latitude.flatten(),
                'longitude': longitude.flatten()}
    labels = labels.flatten()

    return tf.data.Dataset.from_tensor_slices((features, labels))


# Bucketize the latitude and longitude using the `edges`.
latitude_bucket_fc = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('latitude'),
    list(atlanta.latitude.edges))

longitude_bucket_fc = tf.feature_column.bucketized_column(
    tf.feature_column.numeric_column('longitude'),
    list(atlanta.longitude.edges))

# Cross the bucketized columns, using 5000 hash bins.
crossed_lat_lon_fc = tf.feature_column.crossed_column(
    [latitude_bucket_fc, longitude_bucket_fc], 5000)

fc = [
    latitude_bucket_fc,
    longitude_bucket_fc,
    crossed_lat_lon_fc]

# Build and train the Estimator.
est = tf.estimator.LinearRegressor(fc, ...)
```

You may create a feature cross from either of the following:

* Feature names; that is, names from the `dict` returned from `input_fn`.
* Any categorical column, except `categorical_column_with_hash_bucket`
  (since `crossed_column` hashes the input).

When the feature columns `latitude_bucket_fc` and `longitude_bucket_fc` are
crossed, TensorFlow will create `(latitude_fc, longitude_fc)` pairs for each
example. This would produce a full grid of possibilities as follows:

```None
 (0,0),  (0,1)...  (0,99)
 (1,0),  (1,1)...  (1,99)
   ...     ...       ...
(99,0), (99,1)... (99,99)
```

Except that a full grid would only be tractable for inputs with limited
vocabularies. Instead of building this potentially huge table of inputs,
the `crossed_column` only builds the number requested by the `hash_bucket_size`
argument. The feature column assigns an example to an index by running a hash
function on the tuple of inputs, followed by a modulo operation with
`hash_bucket_size`.

As discussed earlier, performing the
hash and modulo function limits the number of categories, but can cause category
collisions; that is, multiple (latitude, longitude) feature crosses will end
up in the same hash bucket. In practice though, performing feature crosses
still adds significant value to the learning capability of your models.

Somewhat counterintuitively, when creating feature crosses, you typically still
should include the original (uncrossed) features in your model (as in the
preceding code snippet). The independent latitude and longitude features help the
model distinguish between examples where a hash collision has occurred in the
crossed feature.

## Indicator and embedding columns

Indicator columns and embedding columns never work on features directly, but
instead take categorical columns as input.

When using an indicator column, we're telling TensorFlow to do exactly what
we've seen in our categorical product_class example. That is, an
**indicator column** treats each category as an element in a one-hot vector,
where the matching category has value 1 and the rest have 0s:
*Figure: Representing data in indicator columns.*
- -Here's how you create an indicator column by calling -@{tf.feature_column.indicator_column}: - -``` python -categorical_column = ... # Create any type of categorical column. - -# Represent the categorical column as an indicator column. -indicator_column = tf.feature_column.indicator_column(categorical_column) -``` - -Now, suppose instead of having just three possible classes, we have a million. -Or maybe a billion. For a number of reasons, as the number of categories grow -large, it becomes infeasible to train a neural network using indicator columns. - -We can use an embedding column to overcome this limitation. Instead of -representing the data as a one-hot vector of many dimensions, an -**embedding column** represents that data as a lower-dimensional, ordinary -vector in which each cell can contain any number, not just 0 or 1. By -permitting a richer palette of numbers for every cell, an embedding column -contains far fewer cells than an indicator column. - -Let's look at an example comparing indicator and embedding columns. Suppose our -input examples consist of different words from a limited palette of only 81 -words. Further suppose that the data set provides the following input -words in 4 separate examples: - -* `"dog"` -* `"spoon"` -* `"scissors"` -* `"guitar"` - -In that case, the following figure illustrates the processing path for -embedding columns or indicator columns. - -
*Figure: An embedding column stores categorical data in a lower-dimensional vector than an indicator column. (We just placed random numbers into the embedding vectors; training determines the actual numbers.)*
- -When an example is processed, one of the `categorical_column_with...` functions -maps the example string to a numerical categorical value. For example, a -function maps "spoon" to `[32]`. (The 32 comes from our imagination—the actual -values depend on the mapping function.) You may then represent these numerical -categorical values in either of the following two ways: - -* As an indicator column. A function converts each numeric categorical value - into an 81-element vector (because our palette consists of 81 words), placing - a 1 in the index of the categorical value (0, 32, 79, 80) and a 0 in all the - other positions. - -* As an embedding column. A function uses the numerical categorical values - `(0, 32, 79, 80)` as indices to a lookup table. Each slot in that lookup table - contains a 3-element vector. - -How do the values in the embeddings vectors magically get assigned? Actually, -the assignments happen during training. That is, the model learns the best way -to map your input numeric categorical values to the embeddings vector value in -order to solve your problem. Embedding columns increase your model's -capabilities, since an embeddings vector learns new relationships between -categories from the training data. - -Why is the embedding vector size 3 in our example? Well, the following "formula" -provides a general rule of thumb about the number of embedding dimensions: - -```python -embedding_dimensions = number_of_categories**0.25 -``` - -That is, the embedding vector dimension should be the 4th root of the number of -categories. Since our vocabulary size in this example is 81, the recommended -number of dimensions is 3: - -``` python -3 = 81**0.25 -``` -Note that this is just a general guideline; you can set the number of embedding -dimensions as you please. - -Call @{tf.feature_column.embedding_column} to create an `embedding_column` as -suggested by the following snippet: - -``` python -categorical_column = ... # Create any categorical column - -# Represent the categorical column as an embedding column. -# This means creating an embedding vector lookup table with one element for each category. -embedding_column = tf.feature_column.embedding_column( - categorical_column=categorical_column, - dimension=embedding_dimensions) -``` - -@{$programmers_guide/embedding$Embeddings} is a significant topic within machine -learning. This information was just to get you started using them as feature -columns. - -## Passing feature columns to Estimators - -As the following list indicates, not all Estimators permit all types of -`feature_columns` argument(s): - -* @{tf.estimator.LinearClassifier$`LinearClassifier`} and - @{tf.estimator.LinearRegressor$`LinearRegressor`}: Accept all types of - feature column. -* @{tf.estimator.DNNClassifier$`DNNClassifier`} and - @{tf.estimator.DNNRegressor$`DNNRegressor`}: Only accept dense columns. Other - column types must be wrapped in either an `indicator_column` or - `embedding_column`. -* @{tf.estimator.DNNLinearCombinedClassifier$`DNNLinearCombinedClassifier`} and - @{tf.estimator.DNNLinearCombinedRegressor$`DNNLinearCombinedRegressor`}: - * The `linear_feature_columns` argument accepts any feature column type. - * The `dnn_feature_columns` argument only accepts dense columns. - -## Other Sources - -For more examples on feature columns, view the following: - -* The @{$low_level_intro#feature_columns$Low Level Introduction} demonstrates how - experiment directly with `feature_columns` using TensorFlow's low level APIs. 
-* The @{$wide$wide} and @{$wide_and_deep$Wide & Deep} Tutorials solve a - binary classification problem using `feature_columns` on a variety of input - data types. - -To learn more about embeddings, see the following: - -* [Deep Learning, NLP, and representations](http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/) - (Chris Olah's blog) -* The TensorFlow [Embedding Projector](http://projector.tensorflow.org) diff --git a/tensorflow/docs_src/programmers_guide/graph_viz.md b/tensorflow/docs_src/programmers_guide/graph_viz.md deleted file mode 100644 index f581ae56da..0000000000 --- a/tensorflow/docs_src/programmers_guide/graph_viz.md +++ /dev/null @@ -1,316 +0,0 @@ -# TensorBoard: Graph Visualization - -TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here's an example of the visualization at work. - -![Visualization of a TensorFlow graph](https://www.tensorflow.org/images/graph_vis_animation.gif "Visualization of a TensorFlow graph") -*Visualization of a TensorFlow graph.* - -To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab on the top pane and select the appropriate run using the menu at the upper left corner. For in depth information on how to run TensorBoard and make sure you are logging all the necessary information, see @{$summaries_and_tensorboard$TensorBoard: Visualizing Learning}. - -## Name scoping and nodes - -Typical TensorFlow graphs can have many thousands of nodes--far too many to see -easily all at once, or even to lay out using standard graph tools. To simplify, -variable names can be scoped and the visualization uses this information to -define a hierarchy on the nodes in the graph. By default, only the top of this -hierarchy is shown. Here is an example that defines three operations under the -`hidden` name scope using -@{tf.name_scope}: - -```python -import tensorflow as tf - -with tf.name_scope('hidden') as scope: - a = tf.constant(5, name='alpha') - W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights') - b = tf.Variable(tf.zeros([1]), name='biases') -``` - -This results in the following three op names: - -* `hidden/alpha` -* `hidden/weights` -* `hidden/biases` - -By default, the visualization will collapse all three into a node labeled `hidden`. -The extra detail isn't lost. You can double-click, or click -on the orange `+` sign in the top right to expand the node, and then you'll see -three subnodes for `alpha`, `weights` and `biases`. - -Here's a real-life example of a more complicated node in its initial and -expanded states. - - - - - - - - - - -
*Figure (left): Initial view of top-level name scope `pool_1`. Clicking on the orange `+` button on the top right or double-clicking on the node itself will expand it.*
*Figure (right): Expanded view of the `pool_1` name scope. Clicking on the orange `-` button on the top right or double-clicking on the node itself will collapse the name scope.*
- -Grouping nodes by name scopes is critical to making a legible graph. If you're -building a model, name scopes give you control over the resulting visualization. -**The better your name scopes, the better your visualization.** - -The figure above illustrates a second aspect of the visualization. TensorFlow -graphs have two kinds of connections: data dependencies and control -dependencies. Data dependencies show the flow of tensors between two ops and -are shown as solid arrows, while control dependencies use dotted lines. In the -expanded view (right side of the figure above) all the connections are data -dependencies with the exception of the dotted line connecting `CheckNumerics` -and `control_dependency`. - -There's a second trick to simplifying the layout. Most TensorFlow graphs have a -few nodes with many connections to other nodes. For example, many nodes might -have a control dependency on an initialization step. Drawing all edges between -the `init` node and its dependencies would create a very cluttered view. - -To reduce clutter, the visualization separates out all high-degree nodes to an -*auxiliary* area on the right and doesn't draw lines to represent their edges. -Instead of lines, we draw small *node icons* to indicate the connections. -Separating out the auxiliary nodes typically doesn't remove critical -information since these nodes are usually related to bookkeeping functions. -See [Interaction](#interaction) for how to move nodes between the main graph -and the auxiliary area. - - - - - - - - - - -
*Figure (left): Node `conv_1` is connected to `save`. Note the little `save` node icon on its right.*
*Figure (right): `save` has a high degree, and will appear as an auxiliary node. The connection with `conv_1` is shown as a node icon on its left. To further reduce clutter, since `save` has a lot of connections, we show the first 5 and abbreviate the others as `... 12 more`.*
- -One last structural simplification is *series collapsing*. Sequential -motifs--that is, nodes whose names differ by a number at the end and have -isomorphic structures--are collapsed into a single *stack* of nodes, as shown -below. For networks with long sequences, this greatly simplifies the view. As -with hierarchical nodes, double-clicking expands the series. See -[Interaction](#interaction) for how to disable/enable series collapsing for a -specific set of nodes. - - - - - - - - - - -
*Figure (left): A collapsed view of a node sequence.*
*Figure (right): A small piece of the expanded view, after double-click.*
- -Finally, as one last aid to legibility, the visualization uses special icons -for constants and summary nodes. To summarize, here's a table of node symbols: - -Symbol | Meaning ---- | --- -![Name scope](https://www.tensorflow.org/images/namespace_node.png "Name scope") | *High-level* node representing a name scope. Double-click to expand a high-level node. -![Sequence of unconnected nodes](https://www.tensorflow.org/images/horizontal_stack.png "Sequence of unconnected nodes") | Sequence of numbered nodes that are not connected to each other. -![Sequence of connected nodes](https://www.tensorflow.org/images/vertical_stack.png "Sequence of connected nodes") | Sequence of numbered nodes that are connected to each other. -![Operation node](https://www.tensorflow.org/images/op_node.png "Operation node") | An individual operation node. -![Constant node](https://www.tensorflow.org/images/constant.png "Constant node") | A constant. -![Summary node](https://www.tensorflow.org/images/summary.png "Summary node") | A summary node. -![Data flow edge](https://www.tensorflow.org/images/dataflow_edge.png "Data flow edge") | Edge showing the data flow between operations. -![Control dependency edge](https://www.tensorflow.org/images/control_edge.png "Control dependency edge") | Edge showing the control dependency between operations. -![Reference edge](https://www.tensorflow.org/images/reference_edge.png "Reference edge") | A reference edge showing that the outgoing operation node can mutate the incoming tensor. - -## Interaction {#interaction} - -Navigate the graph by panning and zooming. Click and drag to pan, and use a -scroll gesture to zoom. Double-click on a node, or click on its `+` button, to -expand a name scope that represents a group of operations. To easily keep -track of the current viewpoint when zooming and panning, there is a minimap in -the bottom right corner. - -To close an open node, double-click it again or click its `-` button. You can -also click once to select a node. It will turn a darker color, and details -about it and the nodes it connects to will appear in the info card at upper -right corner of the visualization. - - - - - - - - - - -
*Figure (left): Info card showing detailed information for the `conv2` name scope. The inputs and outputs are combined from the inputs and outputs of the operation nodes inside the name scope. For name scopes no attributes are shown.*
*Figure (right): Info card showing detailed information for the `DecodeRaw` operation node. In addition to inputs and outputs, the card shows the device and the attributes associated with the current operation.*
- -TensorBoard provides several ways to change the visual layout of the graph. This -doesn't change the graph's computational semantics, but it can bring some -clarity to the network's structure. By right clicking on a node or pressing -buttons on the bottom of that node's info card, you can make the following -changes to its layout: - -* Nodes can be moved between the main graph and the auxiliary area. -* A series of nodes can be ungrouped so that the nodes in the series do not -appear grouped together. Ungrouped series can likewise be regrouped. - -Selection can also be helpful in understanding high-degree nodes. Select any -high-degree node, and the corresponding node icons for its other connections -will be selected as well. This makes it easy, for example, to see which nodes -are being saved--and which aren't. - -Clicking on a node name in the info card will select it. If necessary, the -viewpoint will automatically pan so that the node is visible. - -Finally, you can choose two color schemes for your graph, using the color menu -above the legend. The default *Structure View* shows structure: when two -high-level nodes have the same structure, they appear in the same color of the -rainbow. Uniquely structured nodes are gray. There's a second view, which shows -what device the different operations run on. Name scopes are colored -proportionally to the fraction of devices for the operations inside them. - -The images below give an illustration for a piece of a real-life graph. - - - - - - - - - - -
*Figure (left): Structure view: the gray nodes have unique structure. The orange `conv1` and `conv2` nodes have the same structure, and analogously for nodes with other colors.*
*Figure (right): Device view: name scopes are colored proportionally to the fraction of devices of the operation nodes inside them. Here, purple means GPU and green means CPU.*
- -## Tensor shape information - -When the serialized `GraphDef` includes tensor shapes, the graph visualizer -labels edges with tensor dimensions, and edge thickness reflects total tensor -size. To include tensor shapes in the `GraphDef` pass the actual graph object -(as in `sess.graph`) to the `FileWriter` when serializing the graph. -The images below show the CIFAR-10 model with tensor shape information: - - - - - - - -
*Figure: CIFAR-10 model with tensor shape information.*
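For example, a minimal sketch (assuming an existing `sess`; the log directory is illustrative):

```python
# Serialize the graph with shape information by passing the graph object
# itself (sess.graph), rather than a serialized GraphDef, to the FileWriter.
writer = tf.summary.FileWriter('/tmp/shape_logdir', sess.graph)
writer.close()
```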
## Runtime statistics

Often it is useful to collect runtime metadata for a run, such as total memory
usage, total compute time, and tensor shapes for nodes. The code example below
is a snippet from the train and test section of a modification of the
@{$layers$simple MNIST tutorial}, in which we have recorded summaries and
runtime statistics. See the
@{$summaries_and_tensorboard#serializing-the-data$Summaries Tutorial}
for details on how to record summaries.
Full source is [here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).

```python
  # Train the model, and also write summaries.
  # Every 10th step, measure test-set accuracy, and write test summaries.
  # All other steps, run train_step on training data, & add training summaries.

  def feed_dict(train):
    """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
    if train or FLAGS.fake_data:
      xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
      k = FLAGS.dropout
    else:
      xs, ys = mnist.test.images, mnist.test.labels
      k = 1.0
    return {x: xs, y_: ys, keep_prob: k}

  for i in range(FLAGS.max_steps):
    if i % 10 == 0:  # Record summaries and test-set accuracy
      summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
      test_writer.add_summary(summary, i)
      print('Accuracy at step %s: %s' % (i, acc))
    else:  # Record train set summaries, and train
      if i % 100 == 99:  # Record execution stats
        run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
        run_metadata = tf.RunMetadata()
        summary, _ = sess.run([merged, train_step],
                              feed_dict=feed_dict(True),
                              options=run_options,
                              run_metadata=run_metadata)
        train_writer.add_run_metadata(run_metadata, 'step%d' % i)
        train_writer.add_summary(summary, i)
        print('Adding run metadata for', i)
      else:  # Record a summary
        summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
        train_writer.add_summary(summary, i)
```

This code will emit runtime statistics for every 100th step, starting at step 99.

When you launch TensorBoard and go to the Graph tab, you will now see options
under "Session runs" which correspond to the steps where run metadata was added.
Selecting one of these runs will show you the snapshot of the network at that
step, fading out unused nodes. In the controls on the left-hand side, you will
be able to color the nodes by total memory or total compute time. Additionally,
clicking on a node will display the exact total memory, compute time, and
tensor output sizes.
*Figure: Color by compute time; the run metadata graph; the run metadata info card.*
diff --git a/tensorflow/docs_src/programmers_guide/graphs.md b/tensorflow/docs_src/programmers_guide/graphs.md deleted file mode 100644 index f0dd8def17..0000000000 --- a/tensorflow/docs_src/programmers_guide/graphs.md +++ /dev/null @@ -1,558 +0,0 @@ -# Graphs and Sessions - -TensorFlow uses a **dataflow graph** to represent your computation in terms of -the dependencies between individual operations. This leads to a low-level -programming model in which you first define the dataflow graph, then create a -TensorFlow **session** to run parts of the graph across a set of local and -remote devices. - -This guide will be most useful if you intend to use the low-level programming -model directly. Higher-level APIs such as @{tf.estimator.Estimator} and Keras -hide the details of graphs and sessions from the end user, but this guide may -also be useful if you want to understand how these APIs are implemented. - -## Why dataflow graphs? - -![](../images/tensors_flowing.gif) - -[Dataflow](https://en.wikipedia.org/wiki/Dataflow_programming) is a common -programming model for parallel computing. In a dataflow graph, the nodes -represent units of computation, and the edges represent the data consumed or -produced by a computation. For example, in a TensorFlow graph, the @{tf.matmul} -operation would correspond to a single node with two incoming edges (the -matrices to be multiplied) and one outgoing edge (the result of the -multiplication). - - - -Dataflow has several advantages that TensorFlow leverages when executing your -programs: - -* **Parallelism.** By using explicit edges to represent dependencies between - operations, it is easy for the system to identify operations that can execute - in parallel. - -* **Distributed execution.** By using explicit edges to represent the values - that flow between operations, it is possible for TensorFlow to partition your - program across multiple devices (CPUs, GPUs, and TPUs) attached to different - machines. TensorFlow inserts the necessary communication and coordination - between devices. - -* **Compilation.** TensorFlow's @{$performance/xla$XLA compiler} can - use the information in your dataflow graph to generate faster code, for - example, by fusing together adjacent operations. - -* **Portability.** The dataflow graph is a language-independent representation - of the code in your model. You can build a dataflow graph in Python, store it - in a @{$saved_model$SavedModel}, and restore it in a C++ program for - low-latency inference. - - -## What is a @{tf.Graph}? - -A @{tf.Graph} contains two relevant kinds of information: - -* **Graph structure.** The nodes and edges of the graph, indicating how - individual operations are composed together, but not prescribing how they - should be used. The graph structure is like assembly code: inspecting it can - convey some useful information, but it does not contain all of the useful - context that source code conveys. - -* **Graph collections.** TensorFlow provides a general mechanism for storing - collections of metadata in a @{tf.Graph}. The @{tf.add_to_collection} function - enables you to associate a list of objects with a key (where @{tf.GraphKeys} - defines some of the standard keys), and @{tf.get_collection} enables you to - look up all objects associated with a key. Many parts of the TensorFlow - library use this facility: for example, when you create a @{tf.Variable}, it - is added by default to collections representing "global variables" and - "trainable variables". 
When you later come to create a @{tf.train.Saver} or - @{tf.train.Optimizer}, the variables in these collections are used as the - default arguments. - - -## Building a @{tf.Graph} - -Most TensorFlow programs start with a dataflow graph construction phase. In this -phase, you invoke TensorFlow API functions that construct new @{tf.Operation} -(node) and @{tf.Tensor} (edge) objects and add them to a @{tf.Graph} -instance. TensorFlow provides a **default graph** that is an implicit argument -to all API functions in the same context. For example: - -* Calling `tf.constant(42.0)` creates a single @{tf.Operation} that produces the - value `42.0`, adds it to the default graph, and returns a @{tf.Tensor} that - represents the value of the constant. - -* Calling `tf.matmul(x, y)` creates a single @{tf.Operation} that multiplies - the values of @{tf.Tensor} objects `x` and `y`, adds it to the default graph, - and returns a @{tf.Tensor} that represents the result of the multiplication. - -* Executing `v = tf.Variable(0)` adds to the graph a @{tf.Operation} that will - store a writeable tensor value that persists between @{tf.Session.run} calls. - The @{tf.Variable} object wraps this operation, and can be used [like a - tensor](#tensor-like_objects), which will read the current value of the - stored value. The @{tf.Variable} object also has methods such as - @{tf.Variable.assign$`assign`} and @{tf.Variable.assign_add$`assign_add`} that - create @{tf.Operation} objects that, when executed, update the stored value. - (See @{$programmers_guide/variables} for more information about variables.) - -* Calling @{tf.train.Optimizer.minimize} will add operations and tensors to the - default graph that calculates gradients, and return a @{tf.Operation} that, - when run, will apply those gradients to a set of variables. - -Most programs rely solely on the default graph. However, -see [Dealing with multiple graphs](#programming_with_multiple_graphs) for more -advanced use cases. High-level APIs such as the @{tf.estimator.Estimator} API -manage the default graph on your behalf, and--for example--may create different -graphs for training and evaluation. - -Note: Calling most functions in the TensorFlow API merely adds operations -and tensors to the default graph, but **does not** perform the actual -computation. Instead, you compose these functions until you have a @{tf.Tensor} -or @{tf.Operation} that represents the overall computation--such as performing -one step of gradient descent--and then pass that object to a @{tf.Session} to -perform the computation. See the section "Executing a graph in a @{tf.Session}" -for more details. - -## Naming operations - -A @{tf.Graph} object defines a **namespace** for the @{tf.Operation} objects it -contains. TensorFlow automatically chooses a unique name for each operation in -your graph, but giving operations descriptive names can make your program easier -to read and debug. The TensorFlow API provides two ways to override the name of -an operation: - -* Each API function that creates a new @{tf.Operation} or returns a new - @{tf.Tensor} accepts an optional `name` argument. For example, - `tf.constant(42.0, name="answer")` creates a new @{tf.Operation} named - `"answer"` and returns a @{tf.Tensor} named `"answer:0"`. If the default graph - already contains an operation named `"answer"`, then TensorFlow would append - `"_1"`, `"_2"`, and so on to the name, in order to make it unique. 
-
-* The @{tf.name_scope} function makes it possible to add a **name scope** prefix
-  to all operations created in a particular context. The current name scope
-  prefix is a `"/"`-delimited list of the names of all active @{tf.name_scope}
-  context managers. If a name scope has already been used in the current
-  context, TensorFlow appends `"_1"`, `"_2"`, and so on. For example:
-
-  ```python
-  c_0 = tf.constant(0, name="c")  # => operation named "c"
-
-  # Already-used names will be "uniquified".
-  c_1 = tf.constant(2, name="c")  # => operation named "c_1"
-
-  # Name scopes add a prefix to all operations created in the same context.
-  with tf.name_scope("outer"):
-    c_2 = tf.constant(2, name="c")  # => operation named "outer/c"
-
-    # Name scopes nest like paths in a hierarchical file system.
-    with tf.name_scope("inner"):
-      c_3 = tf.constant(3, name="c")  # => operation named "outer/inner/c"
-
-    # Exiting a name scope context will return to the previous prefix.
-    c_4 = tf.constant(4, name="c")  # => operation named "outer/c_1"
-
-    # Already-used name scopes will be "uniquified".
-    with tf.name_scope("inner"):
-      c_5 = tf.constant(5, name="c")  # => operation named "outer/inner_1/c"
-  ```
-
-The graph visualizer uses name scopes to group operations and reduce the visual
-complexity of a graph. See [Visualizing your graph](#visualizing-your-graph) for
-more information.
-
-Note that @{tf.Tensor} objects are implicitly named after the @{tf.Operation}
-that produces the tensor as output. A tensor name has the form `"<OP_NAME>:<i>"`
-where:
-
-* `"<OP_NAME>"` is the name of the operation that produces it.
-* `"<i>"` is an integer representing the index of that tensor among the
-  operation's outputs.
-
-## Placing operations on different devices
-
-If you want your TensorFlow program to use multiple different devices, the
-@{tf.device} function provides a convenient way to request that all operations
-created in a particular context are placed on the same device (or type of
-device).
-
-A **device specification** has the following form:
-
-```
-/job:<JOB_NAME>/task:<TASK_INDEX>/device:<DEVICE_TYPE>:<DEVICE_INDEX>
-```
-
-where:
-
-* `<JOB_NAME>` is an alpha-numeric string that does not start with a number.
-* `<DEVICE_TYPE>` is a registered device type (such as `GPU` or `CPU`).
-* `<TASK_INDEX>` is a non-negative integer representing the index of the task
-  in the job named `<JOB_NAME>`. See @{tf.train.ClusterSpec} for an explanation
-  of jobs and tasks.
-* `<DEVICE_INDEX>` is a non-negative integer representing the index of the
-  device, for example, to distinguish between different GPU devices used in the
-  same process.
-
-You do not need to specify every part of a device specification. For example,
-if you are running in a single-machine configuration with a single GPU, you
-might use @{tf.device} to pin some operations to the CPU and GPU:
-
-```python
-# Operations created outside either context will run on the "best possible"
-# device. For example, if you have a GPU and a CPU available, and the operation
-# has a GPU implementation, TensorFlow will choose the GPU.
-weights = tf.random_normal(...)
-
-with tf.device("/device:CPU:0"):
-  # Operations created in this context will be pinned to the CPU.
-  img = tf.decode_jpeg(tf.read_file("img.jpg"))
-
-with tf.device("/device:GPU:0"):
-  # Operations created in this context will be pinned to the GPU.
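-  # (TensorFlow inserts the necessary CPU-to-GPU copy of `img` for the
-  # matmul on the next line automatically.)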
-  result = tf.matmul(weights, img)
-```
-
-If you are deploying TensorFlow in a @{$distributed$typical distributed configuration},
-you might specify the job name and task ID to place variables on
-a task in the parameter server job (`"/job:ps"`), and the other operations on
-a task in the worker job (`"/job:worker"`):
-
-```python
-with tf.device("/job:ps/task:0"):
-  weights_1 = tf.Variable(tf.truncated_normal([784, 100]))
-  biases_1 = tf.Variable(tf.zeros([100]))
-
-with tf.device("/job:ps/task:1"):
-  weights_2 = tf.Variable(tf.truncated_normal([100, 10]))
-  biases_2 = tf.Variable(tf.zeros([10]))
-
-with tf.device("/job:worker"):
-  layer_1 = tf.matmul(train_batch, weights_1) + biases_1
-  layer_2 = tf.matmul(layer_1, weights_2) + biases_2
-```
-
-@{tf.device} gives you a lot of flexibility to choose placements for individual
-operations or broad regions of a TensorFlow graph. In many cases, there are
-simple heuristics that work well. For example, the
-@{tf.train.replica_device_setter} API can be used with @{tf.device} to place
-operations for **data-parallel distributed training**. The
-following code fragment shows how @{tf.train.replica_device_setter} applies
-different placement policies to @{tf.Variable} objects and other operations:
-
-```python
-with tf.device(tf.train.replica_device_setter(ps_tasks=3)):
-  # tf.Variable objects are, by default, placed on tasks in "/job:ps" in a
-  # round-robin fashion.
-  w_0 = tf.Variable(...)  # placed on "/job:ps/task:0"
-  b_0 = tf.Variable(...)  # placed on "/job:ps/task:1"
-  w_1 = tf.Variable(...)  # placed on "/job:ps/task:2"
-  b_1 = tf.Variable(...)  # placed on "/job:ps/task:0"
-
-  input_data = tf.placeholder(tf.float32)     # placed on "/job:worker"
-  layer_0 = tf.matmul(input_data, w_0) + b_0  # placed on "/job:worker"
-  layer_1 = tf.matmul(layer_0, w_1) + b_1     # placed on "/job:worker"
-```
-
-## Tensor-like objects
-
-Many TensorFlow operations take one or more @{tf.Tensor} objects as arguments.
-For example, @{tf.matmul} takes two @{tf.Tensor} objects, and @{tf.add_n} takes
-a list of `n` @{tf.Tensor} objects. For convenience, these functions will accept
-a **tensor-like object** in place of a @{tf.Tensor}, and implicitly convert it
-to a @{tf.Tensor} using the @{tf.convert_to_tensor} method. Tensor-like objects
-include elements of the following types:
-
-* @{tf.Tensor}
-* @{tf.Variable}
-* [`numpy.ndarray`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.html)
-* `list` (and lists of tensor-like objects)
-* Scalar Python types: `bool`, `float`, `int`, `str`
-
-You can register additional tensor-like types using
-@{tf.register_tensor_conversion_function}.
-
-Note: By default, TensorFlow will create a new @{tf.Tensor} each time you use
-the same tensor-like object. If the tensor-like object is large (e.g. a
-`numpy.ndarray` containing a set of training examples) and you use it multiple
-times, you may run out of memory. To avoid this, manually call
-@{tf.convert_to_tensor} on the tensor-like object once and use the returned
-@{tf.Tensor} instead.
-
-## Executing a graph in a @{tf.Session}
-
-TensorFlow uses the @{tf.Session} class to represent a connection between the
-client program---typically a Python program, although a similar interface is
-available in other languages---and the C++ runtime. A @{tf.Session} object
-provides access to devices in the local machine, and remote devices using the
-distributed TensorFlow runtime.
It also caches information about your -@{tf.Graph} so that you can efficiently run the same computation multiple times. - -### Creating a @{tf.Session} - -If you are using the low-level TensorFlow API, you can create a @{tf.Session} -for the current default graph as follows: - -```python -# Create a default in-process session. -with tf.Session() as sess: - # ... - -# Create a remote session. -with tf.Session("grpc://example.org:2222"): - # ... -``` - -Since a @{tf.Session} owns physical resources (such as GPUs and -network connections), it is typically used as a context manager (in a `with` -block) that automatically closes the session when you exit the block. It is -also possible to create a session without using a `with` block, but you should -explicitly call @{tf.Session.close} when you are finished with it to free the -resources. - -Note: Higher-level APIs such as @{tf.train.MonitoredTrainingSession} or -@{tf.estimator.Estimator} will create and manage a @{tf.Session} for you. These -APIs accept optional `target` and `config` arguments (either directly, or as -part of a @{tf.estimator.RunConfig} object), with the same meaning as -described below. - -@{tf.Session.__init__} accepts three optional arguments: - -* **`target`.** If this argument is left empty (the default), the session will - only use devices in the local machine. However, you may also specify a - `grpc://` URL to specify the address of a TensorFlow server, which gives the - session access to all devices on machines that this server controls. See - @{tf.train.Server} for details of how to create a TensorFlow - server. For example, in the common **between-graph replication** - configuration, the @{tf.Session} connects to a @{tf.train.Server} in the same - process as the client. The [distributed TensorFlow](../deploy/distributed.md) - deployment guide describes other common scenarios. - -* **`graph`.** By default, a new @{tf.Session} will be bound to---and only able - to run operations in---the current default graph. If you are using multiple - graphs in your program (see [Programming with multiple - graphs](#programming_with_multiple_graphs) for more details), you can specify - an explicit @{tf.Graph} when you construct the session. - -* **`config`.** This argument allows you to specify a @{tf.ConfigProto} that - controls the behavior of the session. For example, some of the configuration - options include: - - * `allow_soft_placement`. Set this to `True` to enable a "soft" device - placement algorithm, which ignores @{tf.device} annotations that attempt - to place CPU-only operations on a GPU device, and places them on the CPU - instead. - - * `cluster_def`. When using distributed TensorFlow, this option allows you - to specify what machines to use in the computation, and provide a mapping - between job names, task indices, and network addresses. See - @{tf.train.ClusterSpec.as_cluster_def} for details. - - * `graph_options.optimizer_options`. Provides control over the optimizations - that TensorFlow performs on your graph before executing it. - - * `gpu_options.allow_growth`. Set this to `True` to change the GPU memory - allocator so that it gradually increases the amount of memory allocated, - rather than allocating most of the memory at startup. - - -### Using @{tf.Session.run} to execute operations - -The @{tf.Session.run} method is the main mechanism for running a @{tf.Operation} -or evaluating a @{tf.Tensor}. 
You can pass one or more @{tf.Operation} or -@{tf.Tensor} objects to @{tf.Session.run}, and TensorFlow will execute the -operations that are needed to compute the result. - -@{tf.Session.run} requires you to specify a list of **fetches**, which determine -the return values, and may be a @{tf.Operation}, a @{tf.Tensor}, or -a [tensor-like type](#tensor-like_objects) such as @{tf.Variable}. These fetches -determine what **subgraph** of the overall @{tf.Graph} must be executed to -produce the result: this is the subgraph that contains all operations named in -the fetch list, plus all operations whose outputs are used to compute the value -of the fetches. For example, the following code fragment shows how different -arguments to @{tf.Session.run} cause different subgraphs to be executed: - -```python -x = tf.constant([[37.0, -23.0], [1.0, 4.0]]) -w = tf.Variable(tf.random_uniform([2, 2])) -y = tf.matmul(x, w) -output = tf.nn.softmax(y) -init_op = w.initializer - -with tf.Session() as sess: - # Run the initializer on `w`. - sess.run(init_op) - - # Evaluate `output`. `sess.run(output)` will return a NumPy array containing - # the result of the computation. - print(sess.run(output)) - - # Evaluate `y` and `output`. Note that `y` will only be computed once, and its - # result used both to return `y_val` and as an input to the `tf.nn.softmax()` - # op. Both `y_val` and `output_val` will be NumPy arrays. - y_val, output_val = sess.run([y, output]) -``` - -@{tf.Session.run} also optionally takes a dictionary of **feeds**, which is a -mapping from @{tf.Tensor} objects (typically @{tf.placeholder} tensors) to -values (typically Python scalars, lists, or NumPy arrays) that will be -substituted for those tensors in the execution. For example: - -```python -# Define a placeholder that expects a vector of three floating-point values, -# and a computation that depends on it. -x = tf.placeholder(tf.float32, shape=[3]) -y = tf.square(x) - -with tf.Session() as sess: - # Feeding a value changes the result that is returned when you evaluate `y`. - print(sess.run(y, {x: [1.0, 2.0, 3.0]})) # => "[1.0, 4.0, 9.0]" - print(sess.run(y, {x: [0.0, 0.0, 5.0]})) # => "[0.0, 0.0, 25.0]" - - # Raises `tf.errors.InvalidArgumentError`, because you must feed a value for - # a `tf.placeholder()` when evaluating a tensor that depends on it. - sess.run(y) - - # Raises `ValueError`, because the shape of `37.0` does not match the shape - # of placeholder `x`. - sess.run(y, {x: 37.0}) -``` - -@{tf.Session.run} also accepts an optional `options` argument that enables you -to specify options about the call, and an optional `run_metadata` argument that -enables you to collect metadata about the execution. For example, you can use -these options together to collect tracing information about the execution: - -``` -y = tf.matmul([[37.0, -23.0], [1.0, 4.0]], tf.random_uniform([2, 2])) - -with tf.Session() as sess: - # Define options for the `sess.run()` call. - options = tf.RunOptions() - options.output_partition_graphs = True - options.trace_level = tf.RunOptions.FULL_TRACE - - # Define a container for the returned metadata. - metadata = tf.RunMetadata() - - sess.run(y, options=options, run_metadata=metadata) - - # Print the subgraphs that executed on each device. - print(metadata.partition_graphs) - - # Print the timings of each operation that executed. - print(metadata.step_stats) -``` - - -## Visualizing your graph - -TensorFlow includes tools that can help you to understand the code in a graph. 
-The **graph visualizer** is a component of TensorBoard that renders the -structure of your graph visually in a browser. The easiest way to create a -visualization is to pass a @{tf.Graph} when creating the -@{tf.summary.FileWriter}: - -```python -# Build your graph. -x = tf.constant([[37.0, -23.0], [1.0, 4.0]]) -w = tf.Variable(tf.random_uniform([2, 2])) -y = tf.matmul(x, w) -# ... -loss = ... -train_op = tf.train.AdagradOptimizer(0.01).minimize(loss) - -with tf.Session() as sess: - # `sess.graph` provides access to the graph used in a `tf.Session`. - writer = tf.summary.FileWriter("/tmp/log/...", sess.graph) - - # Perform your computation... - for i in range(1000): - sess.run(train_op) - # ... - - writer.close() -``` - -Note: If you are using a @{tf.estimator.Estimator}, the graph (and any -summaries) will be logged automatically to the `model_dir` that you specified -when creating the estimator. - -You can then open the log in `tensorboard`, navigate to the "Graph" tab, and -see a high-level visualization of your graph's structure. Note that a typical -TensorFlow graph---especially training graphs with automatically computed -gradients---has too many nodes to visualize at once. The graph visualizer makes -use of name scopes to group related operations into "super" nodes. You can -click on the orange "+" button on any of these super nodes to expand the -subgraph inside. - -![](../images/mnist_deep.png) - -For more information about visualizing your TensorFlow application with -TensorBoard, see the [TensorBoard tutorial](../get_started/summaries_and_tensorboard.md). - -## Programming with multiple graphs - -Note: When training a model, a common way of organizing your code is to use one -graph for training your model, and a separate graph for evaluating or performing -inference with a trained model. In many cases, the inference graph will be -different from the training graph: for example, techniques like dropout and -batch normalization use different operations in each case. Furthermore, by -default utilities like @{tf.train.Saver} use the names of @{tf.Variable} objects -(which have names based on an underlying @{tf.Operation}) to identify each -variable in a saved checkpoint. When programming this way, you can either use -completely separate Python processes to build and execute the graphs, or you can -use multiple graphs in the same process. This section describes how to use -multiple graphs in the same process. - -As noted above, TensorFlow provides a "default graph" that is implicitly passed -to all API functions in the same context. For many applications, a single graph -is sufficient. However, TensorFlow also provides methods for manipulating -the default graph, which can be useful in more advanced use cases. For example: - -* A @{tf.Graph} defines the namespace for @{tf.Operation} objects: each - operation in a single graph must have a unique name. TensorFlow will - "uniquify" the names of operations by appending `"_1"`, `"_2"`, and so on to - their names if the requested name is already taken. Using multiple explicitly - created graphs gives you more control over what name is given to each - operation. - -* The default graph stores information about every @{tf.Operation} and - @{tf.Tensor} that was ever added to it. If your program creates a large number - of unconnected subgraphs, it may be more efficient to use a different - @{tf.Graph} to build each subgraph, so that unrelated state can be garbage - collected. 
-
-You can install a different @{tf.Graph} as the default graph, using the
-@{tf.Graph.as_default} context manager:
-
-```python
-g_1 = tf.Graph()
-with g_1.as_default():
-  # Operations created in this scope will be added to `g_1`.
-  c = tf.constant("Node in g_1")
-
-  # Sessions created in this scope will run operations from `g_1`.
-  sess_1 = tf.Session()
-
-g_2 = tf.Graph()
-with g_2.as_default():
-  # Operations created in this scope will be added to `g_2`.
-  d = tf.constant("Node in g_2")
-
-# Alternatively, you can pass a graph when constructing a `tf.Session`:
-# `sess_2` will run operations from `g_2`.
-sess_2 = tf.Session(graph=g_2)
-
-assert c.graph is g_1
-assert sess_1.graph is g_1
-
-assert d.graph is g_2
-assert sess_2.graph is g_2
-```
-
-To inspect the current default graph, call @{tf.get_default_graph}, which
-returns a @{tf.Graph} object:
-
-```python
-# Print all of the operations in the default graph.
-g = tf.get_default_graph()
-print(g.get_operations())
-```
diff --git a/tensorflow/docs_src/programmers_guide/index.md b/tensorflow/docs_src/programmers_guide/index.md
deleted file mode 100644
index 9c58a3b45e..0000000000
--- a/tensorflow/docs_src/programmers_guide/index.md
+++ /dev/null
@@ -1,86 +0,0 @@
-# Programmer's Guide
-
-The documents in this unit dive into the details of how TensorFlow
-works. The units are as follows:
-
-## High Level APIs
-
-  * @{$programmers_guide/keras}, TensorFlow's high-level API for building and
-    training deep learning models.
-  * @{$programmers_guide/eager}, an API for writing TensorFlow code
-    imperatively, like you would use NumPy.
-  * @{$programmers_guide/estimators}, a high-level API that provides
-    fully-packaged models ready for large-scale training and production.
-  * @{$programmers_guide/datasets}, easy input pipelines to bring your data into
-    your TensorFlow program.
-
-## Estimators
-
-* @{$estimators} provides an introduction.
-* @{$premade_estimators}, which introduces Estimators for machine learning.
-* @{$custom_estimators}, which demonstrates how to build and train models you
-  design yourself.
-* @{$feature_columns}, which shows how an Estimator can handle a variety of input
-  data types without changes to the model.
-* @{$datasets_for_estimators} describes using tf.data with estimators.
-* @{$checkpoints}, which explains how to save training progress and resume where
-  you left off.
-
-## Accelerators
-
-  * @{$using_gpu} explains how TensorFlow assigns operations to
-    devices and how you can change the arrangement manually.
-  * @{$using_tpu} explains how to modify `Estimator` programs to run on a TPU.
-
-## Low Level APIs
-
-  * @{$programmers_guide/low_level_intro}, which introduces the
-    basics of how you can use TensorFlow outside of the high-level APIs.
-  * @{$programmers_guide/tensors}, which explains how to create,
-    manipulate, and access Tensors--the fundamental object in TensorFlow.
-  * @{$programmers_guide/variables}, which details how
-    to represent shared, persistent state in your program.
-  * @{$programmers_guide/graphs}, which explains:
-      * dataflow graphs, which are TensorFlow's representation of computations
-        as dependencies between operations.
-      * sessions, which are TensorFlow's mechanism for running dataflow graphs
-        across one or more local or remote devices.
-    If you are programming with the low-level TensorFlow API, this unit
-    is essential.
-    If you are programming with a high-level TensorFlow API
-    such as Estimators or Keras, the high-level API creates and manages
-    graphs and sessions for you, but understanding graphs and sessions
-    can still be helpful.
-  * @{$programmers_guide/saved_model}, which
-    explains how to save and restore variables and models.
-
-## ML Concepts
-
-  * @{$programmers_guide/embedding}, which introduces the concept
-    of embeddings, provides a simple example of training an embedding in
-    TensorFlow, and explains how to view embeddings with the TensorBoard
-    Embedding Projector.
-
-## Debugging
-
-  * @{$programmers_guide/debugger}, which
-    explains how to use the TensorFlow debugger (tfdbg).
-
-## TensorBoard
-
-TensorBoard is a utility to visualize different aspects of machine learning.
-The following guides explain how to use TensorBoard:
-
-  * @{$programmers_guide/summaries_and_tensorboard},
    which introduces TensorBoard.
-  * @{$programmers_guide/graph_viz}, which
-    explains how to visualize the computational graph.
-  * @{$programmers_guide/tensorboard_histograms}, which demonstrates how to
-    use TensorBoard's histogram dashboard.
-
-
-## Misc
-
-  * @{$programmers_guide/version_compat},
-    which explains backward compatibility guarantees and non-guarantees.
-  * @{$programmers_guide/faq}, which contains frequently asked
-    questions about TensorFlow.
diff --git a/tensorflow/docs_src/programmers_guide/keras.md b/tensorflow/docs_src/programmers_guide/keras.md
deleted file mode 100644
index c6aca7ebf4..0000000000
--- a/tensorflow/docs_src/programmers_guide/keras.md
+++ /dev/null
@@ -1,623 +0,0 @@
-# Keras
-
-Keras is a high-level API to build and train deep learning models. It's used for
-fast prototyping, advanced research, and production, with three key advantages:
-
-- *User friendly*
- Keras has a simple, consistent interface optimized for common use cases. It - provides clear and actionable feedback for user errors. -- *Modular and composable*
- Keras models are made by connecting configurable building blocks together, - with few restrictions. -- *Easy to extend*
-  Write custom building blocks to express new ideas for
-  research. Create new layers and loss functions, and develop
-  state-of-the-art models.
-
-## Import tf.keras
-
-`tf.keras` is TensorFlow's implementation of the
-[Keras API specification](https://keras.io){:.external}. This is a high-level
-API to build and train models that includes first-class support for
-TensorFlow-specific functionality, such as [eager execution](#eager_execution),
-`tf.data` pipelines, and [Estimators](/programmers_guide/estimators).
-`tf.keras` makes TensorFlow easier to use without sacrificing flexibility and
-performance.
-
-To get started, import `tf.keras` as part of your TensorFlow program setup:
-
-```python
-import tensorflow as tf
-from tensorflow import keras
-```
-
-`tf.keras` can run any Keras-compatible code, but keep in mind:
-
-* The `tf.keras` version in the latest TensorFlow release might not be the same
-  as the latest `keras` version from PyPI. Check `tf.keras.__version__`.
-* When [saving a model's weights](#weights_only), `tf.keras` defaults to the
-  [checkpoint format](/get_started/checkpoints). Pass `save_format='h5'` to use
-  HDF5.
-
-## Build a simple model
-
-### Sequential model
-
-In Keras, you assemble *layers* to build *models*. A model is (usually) a graph
-of layers. The most common type of model is a stack of layers: the
-`tf.keras.Sequential` model.
-
-To build a simple, fully-connected network (i.e. multi-layer perceptron):
-
-```python
-model = keras.Sequential()
-# Adds a densely-connected layer with 64 units to the model:
-model.add(keras.layers.Dense(64, activation='relu'))
-# Add another:
-model.add(keras.layers.Dense(64, activation='relu'))
-# Add a softmax layer with 10 output units:
-model.add(keras.layers.Dense(10, activation='softmax'))
-```
-
-### Configure the layers
-
-There are many `tf.keras.layers` available with some common constructor
-parameters:
-
-* `activation`: Set the activation function for the layer. This parameter is
-  specified by the name of a built-in function or as a callable object. By
-  default, no activation is applied.
-* `kernel_initializer` and `bias_initializer`: The initialization schemes
-  that create the layer's weights (kernel and bias). This parameter is a name or
-  a callable object. This defaults to the `"Glorot uniform"` initializer.
-* `kernel_regularizer` and `bias_regularizer`: The regularization schemes
-  that apply penalties to the layer's weights (kernel and bias), such as L1 or
-  L2 regularization. By default, no regularization is applied.
-
-The following instantiates `tf.keras.layers.Dense` layers using constructor
-arguments:
-
-```python
-# Create a sigmoid layer:
-keras.layers.Dense(64, activation='sigmoid')
-# Or:
-keras.layers.Dense(64, activation=tf.sigmoid)
-
-# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
-keras.layers.Dense(64, kernel_regularizer=keras.regularizers.l1(0.01))
-# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
-keras.layers.Dense(64, bias_regularizer=keras.regularizers.l2(0.01))
-
-# A linear layer with a kernel initialized to a random orthogonal matrix:
-keras.layers.Dense(64, kernel_initializer='orthogonal')
-# A linear layer with a bias vector initialized to 2.0s:
-keras.layers.Dense(64, bias_initializer=keras.initializers.constant(2.0))
-```
-
-## Train and evaluate
-
-### Set up training
-
-After the model is constructed, configure its learning process by calling the
-`compile` method:
-
-```python
-model.compile(optimizer=tf.train.AdamOptimizer(0.001),
-              loss='categorical_crossentropy',
-              metrics=['accuracy'])
-```
-
-`tf.keras.Model.compile` takes three important arguments:
-
-* `optimizer`: This object specifies the training procedure. Pass it optimizer
-  instances from the `tf.train` module, such as
-  [`AdamOptimizer`](/api_docs/python/tf/train/AdamOptimizer),
-  [`RMSPropOptimizer`](/api_docs/python/tf/train/RMSPropOptimizer), or
-  [`GradientDescentOptimizer`](/api_docs/python/tf/train/GradientDescentOptimizer).
-* `loss`: The function to minimize during optimization. Common choices include
-  mean square error (`mse`), `categorical_crossentropy`, and
-  `binary_crossentropy`. Loss functions are specified by name or by
-  passing a callable object from the `tf.keras.losses` module.
-* `metrics`: Used to monitor training. These are string names or callables from
-  the `tf.keras.metrics` module.
-
-The following shows a few examples of configuring a model for training:
-
-```python
-# Configure a model for mean-squared error regression.
-model.compile(optimizer=tf.train.AdamOptimizer(0.01),
-              loss='mse',       # mean squared error
-              metrics=['mae'])  # mean absolute error
-
-# Configure a model for categorical classification.
-model.compile(optimizer=tf.train.RMSPropOptimizer(0.01),
-              loss=keras.losses.categorical_crossentropy,
-              metrics=[keras.metrics.categorical_accuracy])
-```
-
-### Input NumPy data
-
-For small datasets, use in-memory [NumPy](https://www.numpy.org/){:.external}
-arrays to train and evaluate a model. The model is "fit" to the training data
-using the `fit` method:
-
-```python
-import numpy as np
-
-data = np.random.random((1000, 32))
-labels = np.random.random((1000, 10))
-
-model.fit(data, labels, epochs=10, batch_size=32)
-```
-
-`tf.keras.Model.fit` takes three important arguments:
-
-* `epochs`: Training is structured into *epochs*. An epoch is one iteration over
-  the entire input data (this is done in smaller batches).
-* `batch_size`: When passed NumPy data, the model slices the data into smaller
-  batches and iterates over these batches during training. This integer
-  specifies the size of each batch. Be aware that the last batch may be smaller
-  if the total number of samples is not divisible by the batch size.
-* `validation_data`: When prototyping a model, you want to easily monitor its
-  performance on some validation data. Passing this argument—a tuple of inputs
-  and labels—allows the model to display the loss and metrics in inference mode
-  for the passed data, at the end of each epoch.
-
-Here's an example using `validation_data`:
-
-```python
-import numpy as np
-
-data = np.random.random((1000, 32))
-labels = np.random.random((1000, 10))
-
-val_data = np.random.random((100, 32))
-val_labels = np.random.random((100, 10))
-
-model.fit(data, labels, epochs=10, batch_size=32,
-          validation_data=(val_data, val_labels))
-```
-
-### Input tf.data datasets
-
-Use the [Datasets API](/programmers_guide/datasets) to scale to large datasets
-or multi-device training. Pass a `tf.data.Dataset` instance to the `fit`
-method:
-
-```python
-# Instantiates a toy dataset instance:
-dataset = tf.data.Dataset.from_tensor_slices((data, labels))
-dataset = dataset.batch(32)
-dataset = dataset.repeat()
-
-# Don't forget to specify `steps_per_epoch` when calling `fit` on a dataset.
-model.fit(dataset, epochs=10, steps_per_epoch=30)
-```
-
-Here, the `fit` method uses the `steps_per_epoch` argument—this is the number of
-training steps the model runs before it moves to the next epoch. Since the
-`Dataset` yields batches of data, this snippet does not require a `batch_size`.
-
-Datasets can also be used for validation:
-
-```python
-dataset = tf.data.Dataset.from_tensor_slices((data, labels))
-dataset = dataset.batch(32).repeat()
-
-val_dataset = tf.data.Dataset.from_tensor_slices((val_data, val_labels))
-val_dataset = val_dataset.batch(32).repeat()
-
-model.fit(dataset, epochs=10, steps_per_epoch=30,
-          validation_data=val_dataset,
-          validation_steps=3)
-```
-
-### Evaluate and predict
-
-The `tf.keras.Model.evaluate` and `tf.keras.Model.predict` methods can use NumPy
-data and a `tf.data.Dataset`.
-
-To *evaluate* the inference-mode loss and metrics for the data provided:
-
-```python
-model.evaluate(x, y, batch_size=32)
-
-model.evaluate(dataset, steps=30)
-```
-
-And to *predict* the output of the last layer in inference for the data provided,
-as a NumPy array:
-
-```python
-model.predict(x, batch_size=32)
-
-model.predict(dataset, steps=30)
-```
-
-
-## Build advanced models
-
-### Functional API
-
-The `tf.keras.Sequential` model is a simple stack of layers that cannot
-represent arbitrary models. Use the
-[Keras functional API](https://keras.io/getting-started/functional-api-guide/){:.external}
-to build complex model topologies such as:
-
-* Multi-input models,
-* Multi-output models,
-* Models with shared layers (the same layer called several times),
-* Models with non-sequential data flows (e.g. residual connections).
-
-Building a model with the functional API works like this:
-
-1. A layer instance is callable and returns a tensor.
-2. Input tensors and output tensors are used to define a `tf.keras.Model`
-   instance.
-3. This model is trained just like the `Sequential` model.
-
-The following example uses the functional API to build a simple, fully-connected
-network:
-
-```python
-inputs = keras.Input(shape=(32,))  # Returns a placeholder tensor
-
-# A layer instance is callable on a tensor, and returns a tensor.
-x = keras.layers.Dense(64, activation='relu')(inputs)
-x = keras.layers.Dense(64, activation='relu')(x)
-predictions = keras.layers.Dense(10, activation='softmax')(x)
-
-# Instantiate the model given inputs and outputs.
-model = keras.Model(inputs=inputs, outputs=predictions)
-
-# The compile step specifies the training configuration.
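-# (Losses and metrics can be given by name, as here, or as callables from
-# the `tf.keras.losses` and `tf.keras.metrics` modules.)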
-model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), - loss='categorical_crossentropy', - metrics=['accuracy']) - -# Trains for 5 epochs -model.fit(data, labels, batch_size=32, epochs=5) -``` - -### Model subclassing - -Build a fully-customizable model by subclassing `tf.keras.Model` and defining -your own forward pass. Create layers in the `__init__` method and set them as -attributes of the class instance. Define the forward pass in the `call` method. - -Model subclassing is particularly useful when -[eager execution](/programmers_guide/eager) is enabled since the forward pass -can be written imperatively. - -Key Point: Use the right API for the job. While model subclassing offers -flexibility, it comes at a cost of greater complexity and more opportunities for -user errors. If possible, prefer the functional API. - -The following example shows a subclassed `tf.keras.Model` using a custom forward -pass: - -```python -class MyModel(keras.Model): - - def __init__(self, num_classes=10): - super(MyModel, self).__init__(name='my_model') - self.num_classes = num_classes - # Define your layers here. - self.dense_1 = keras.layers.Dense(32, activation='relu') - self.dense_2 = keras.layers.Dense(num_classes, activation='sigmoid') - - def call(self, inputs): - # Define your forward pass here, - # using layers you previously defined (in `__init__`). - x = self.dense_1(inputs) - return self.dense_2(x) - - def compute_output_shape(self, input_shape): - # You need to override this function if you want to use the subclassed model - # as part of a functional-style model. - # Otherwise, this method is optional. - shape = tf.TensorShape(input_shape).as_list() - shape[-1] = self.num_classes - return tf.TensorShape(shape) - - -# Instantiates the subclassed model. -model = MyModel(num_classes=10) - -# The compile step specifies the training configuration. -model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), - loss='categorical_crossentropy', - metrics=['accuracy']) - -# Trains for 5 epochs. -model.fit(data, labels, batch_size=32, epochs=5) -``` - - -### Custom layers - -Create a custom layer by subclassing `tf.keras.layers.Layer` and implementing -the following methods: - -* `build`: Create the weights of the layer. Add weights with the `add_weight` - method. -* `call`: Define the forward pass. -* `compute_output_shape`: Specify how to compute the output shape of the layer - given the input shape. -* Optionally, a layer can be serialized by implementing the `get_config` method - and the `from_config` class method. - -Here's an example of a custom layer that implements a `matmul` of an input with -a kernel matrix: - -```python -class MyLayer(keras.layers.Layer): - - def __init__(self, output_dim, **kwargs): - self.output_dim = output_dim - super(MyLayer, self).__init__(**kwargs) - - def build(self, input_shape): - shape = tf.TensorShape((input_shape[1], self.output_dim)) - # Create a trainable weight variable for this layer. 
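-    # (`add_weight` also registers the variable with the layer, so it is
-    # tracked in the layer's `trainable_weights`.)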
-    self.kernel = self.add_weight(name='kernel',
-                                  shape=shape,
-                                  initializer='uniform',
-                                  trainable=True)
-    # Be sure to call this at the end
-    super(MyLayer, self).build(input_shape)
-
-  def call(self, inputs):
-    return tf.matmul(inputs, self.kernel)
-
-  def compute_output_shape(self, input_shape):
-    shape = tf.TensorShape(input_shape).as_list()
-    shape[-1] = self.output_dim
-    return tf.TensorShape(shape)
-
-  def get_config(self):
-    base_config = super(MyLayer, self).get_config()
-    base_config['output_dim'] = self.output_dim
-    return base_config
-
-  @classmethod
-  def from_config(cls, config):
-    return cls(**config)
-
-
-# Create a model using the custom layer
-model = keras.Sequential([MyLayer(10),
-                          keras.layers.Activation('softmax')])
-
-# The compile step specifies the training configuration
-model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
-              loss='categorical_crossentropy',
-              metrics=['accuracy'])
-
-# Trains for 5 epochs.
-model.fit(data, labels, batch_size=32, epochs=5)
-```
-
-
-## Callbacks
-
-A callback is an object passed to a model to customize and extend its behavior
-during training. You can write your own custom callback, or use the built-in
-`tf.keras.callbacks` that include:
-
-* `tf.keras.callbacks.ModelCheckpoint`: Save checkpoints of your model at
-  regular intervals.
-* `tf.keras.callbacks.LearningRateScheduler`: Dynamically change the learning
-  rate.
-* `tf.keras.callbacks.EarlyStopping`: Interrupt training when validation
-  performance has stopped improving.
-* `tf.keras.callbacks.TensorBoard`: Monitor the model's behavior using
-  [TensorBoard](/programmers_guide/summaries_and_tensorboard).
-
-To use a `tf.keras.callbacks.Callback`, pass it to the model's `fit` method:
-
-```python
-callbacks = [
-  # Interrupt training if `val_loss` stops improving for over 2 epochs
-  keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
-  # Write TensorBoard logs to `./logs` directory
-  keras.callbacks.TensorBoard(log_dir='./logs')
-]
-model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
-          validation_data=(val_data, val_labels))
-```
-
-
-## Save and restore
-
-### Weights only
-
-Save and load the weights of a model using `tf.keras.Model.save_weights`:
-
-```python
-# Save weights to a TensorFlow Checkpoint file
-model.save_weights('./my_model')
-
-# Restore the model's state,
-# this requires a model with the same architecture.
-model.load_weights('my_model')
-```
-
-By default, this saves the model's weights in the
-[TensorFlow checkpoint](/get_started/checkpoints) file format. Weights can also
-be saved to the Keras HDF5 format (the default for the multi-backend
-implementation of Keras):
-
-```python
-# Save weights to a HDF5 file
-model.save_weights('my_model.h5', save_format='h5')
-
-# Restore the model's state
-model.load_weights('my_model.h5')
-```
-
-
-### Configuration only
-
-A model's configuration can be saved—this serializes the model architecture
-without any weights. A saved configuration can recreate and initialize the same
-model, even without the code that defined the original model.
-Keras supports JSON and YAML serialization formats:
-
-```python
-# Serialize a model to JSON format
-json_string = model.to_json()
-
-# Recreate the model (freshly initialized)
-fresh_model = keras.models.model_from_json(json_string)
-
-# Serializes a model to YAML format
-yaml_string = model.to_yaml()
-
-# Recreate the model
-fresh_model = keras.models.model_from_yaml(yaml_string)
-```
-
-Caution: Subclassed models are not serializable because their architecture is
-defined by the Python code in the body of the `call` method.
-
-
-### Entire model
-
-The entire model can be saved to a file that contains the weight values, the
-model's configuration, and even the optimizer's configuration. This allows you
-to checkpoint a model and resume training later—from the exact same
-state—without access to the original code.
-
-```python
-# Create a trivial model
-model = keras.Sequential([
-  keras.layers.Dense(10, activation='softmax', input_shape=(32,)),
-  keras.layers.Dense(10, activation='softmax')
-])
-model.compile(optimizer='rmsprop',
-              loss='categorical_crossentropy',
-              metrics=['accuracy'])
-model.fit(data, labels, batch_size=32, epochs=5)
-
-
-# Save entire model to a HDF5 file
-model.save('my_model.h5')
-
-# Recreate the exact same model, including weights and optimizer.
-model = keras.models.load_model('my_model.h5')
-```
-
-
-## Eager execution
-
-[Eager execution](/programmers_guide/eager) is an imperative programming
-environment that evaluates operations immediately. This is not required for
-Keras, but is supported by `tf.keras` and useful for inspecting your program and
-debugging.
-
-All of the `tf.keras` model-building APIs are compatible with eager execution.
-And while the `Sequential` and functional APIs can be used, eager execution
-especially benefits *model subclassing* and building *custom layers*—the APIs
-that require you to write the forward pass as code (instead of the APIs that
-create models by assembling existing layers).
-
-See the [eager execution guide](/programmers_guide/eager#build_a_model) for
-examples of using Keras models with custom training loops and `tf.GradientTape`.
-
-
-## Distribution
-
-### Estimators
-
-The [Estimators](/programmers_guide/estimators) API is used for training models
-for distributed environments. This targets industry use cases such as
-distributed training on large datasets that can export a model for production.
-
-A `tf.keras.Model` can be trained with the `tf.estimator` API by converting the
-model to an `tf.estimator.Estimator` object with
-`tf.keras.estimator.model_to_estimator`. See
-[Creating Estimators from Keras models](/programmers_guide/estimators#creating_estimators_from_keras_models).
-
-```python
-model = keras.Sequential([keras.layers.Dense(10, activation='softmax'),
-                          keras.layers.Dense(10, activation='softmax')])
-
-model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
-              loss='categorical_crossentropy',
-              metrics=['accuracy'])
-
-estimator = keras.estimator.model_to_estimator(model)
-```
-
-Note: Enable [eager execution](/programmers_guide/eager) for debugging
-[Estimator input functions](/programmers_guide/premade_estimators#create_input_functions)
-and inspecting data.
-
-### Multiple GPUs
-
-`tf.keras` models can run on multiple GPUs using
-`tf.contrib.distribute.DistributionStrategy`. This API provides distributed
-training on multiple GPUs with almost no changes to existing code.
-
-Currently, `tf.contrib.distribute.MirroredStrategy` is the only supported
-distribution strategy.
-`MirroredStrategy` does in-graph replication with
-synchronous training using all-reduce on a single machine. To use
-`DistributionStrategy` with Keras, convert the `tf.keras.Model` to a
-`tf.estimator.Estimator` with `tf.keras.estimator.model_to_estimator`, then
-train the estimator.
-
-The following example distributes a `tf.keras.Model` across multiple GPUs on a
-single machine.
-
-First, define a simple model:
-
-```python
-model = keras.Sequential()
-model.add(keras.layers.Dense(16, activation='relu', input_shape=(10,)))
-model.add(keras.layers.Dense(1, activation='sigmoid'))
-
-optimizer = tf.train.GradientDescentOptimizer(0.2)
-
-model.compile(loss='binary_crossentropy', optimizer=optimizer)
-model.summary()
-```
-
-Convert the Keras model to a `tf.estimator.Estimator` instance:
-
-```python
-keras_estimator = keras.estimator.model_to_estimator(
-  keras_model=model,
-  config=config,
-  model_dir='/tmp/model_dir')
-```
-
-Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object
-used to distribute the data across multiple devices—with each device processing
-a slice of the input batch.
-
-```python
-def input_fn():
-  x = np.random.random((1024, 10))
-  y = np.random.randint(2, size=(1024, 1))
-  x = tf.cast(x, tf.float32)
-  dataset = tf.data.Dataset.from_tensor_slices((x, y))
-  dataset = dataset.repeat(10)
-  dataset = dataset.batch(32)
-  return dataset
-```
-
-Next, create a `tf.estimator.RunConfig` and set the `train_distribute` argument
-to the `tf.contrib.distribute.MirroredStrategy` instance. When creating
-`MirroredStrategy`, you can specify a list of devices or set the `num_gpus`
-argument. The default uses all available GPUs, like the following:
-
-```python
-strategy = tf.contrib.distribute.MirroredStrategy()
-config = tf.estimator.RunConfig(train_distribute=strategy)
-```
-
-Finally, train the `Estimator` instance by providing the `input_fn` and `steps`
-arguments:
-
-```python
-keras_estimator.train(input_fn=input_fn, steps=10)
-```
diff --git a/tensorflow/docs_src/programmers_guide/leftnav_files b/tensorflow/docs_src/programmers_guide/leftnav_files
deleted file mode 100644
index 357a2a1cb9..0000000000
--- a/tensorflow/docs_src/programmers_guide/leftnav_files
+++ /dev/null
@@ -1,40 +0,0 @@
-index.md
-
-### High Level APIs
-keras.md
-eager.md
-datasets.md
-
-### Estimators
-estimators.md: Introduction to Estimators
-premade_estimators.md
-custom_estimators.md
-feature_columns.md
-datasets_for_estimators.md
-checkpoints.md
-
-### Accelerators
-using_gpu.md
-using_tpu.md
-
-### Low Level APIs
-low_level_intro.md
-tensors.md
-variables.md
-graphs.md
-saved_model.md
-
-### ML Concepts
-embedding.md
-
-### Debugging
-debugger.md
-
-### TensorBoard
-summaries_and_tensorboard.md: Visualizing Learning
-graph_viz.md: Graphs
-tensorboard_histograms.md: Histograms
-
-### Misc
-version_compat.md
-faq.md
diff --git a/tensorflow/docs_src/programmers_guide/low_level_intro.md b/tensorflow/docs_src/programmers_guide/low_level_intro.md
deleted file mode 100644
index 478e2bb70b..0000000000
--- a/tensorflow/docs_src/programmers_guide/low_level_intro.md
+++ /dev/null
@@ -1,604 +0,0 @@
-# Introduction
-
-This guide gets you started programming in the low-level TensorFlow APIs
-(TensorFlow Core), showing you how to:
-
-  * Manage your own TensorFlow program (a `tf.Graph`) and TensorFlow
-    runtime (a `tf.Session`), instead of relying on Estimators to manage them.
-  * Run TensorFlow operations, using a `tf.Session`.
-  * Use high-level components ([datasets](#datasets), [layers](#layers), and
-    [feature_columns](#feature_columns)) in this low-level environment.
-  * Build your own training loop, instead of using the one
-    @{$premade_estimators$provided by Estimators}.
-
-We recommend using the higher-level APIs to build models when possible.
-Knowing TensorFlow Core is valuable for the following reasons:
-
-  * Experimentation and debugging are both more straightforward
-    when you can use low-level TensorFlow operations directly.
-  * It gives you a mental model of how things work internally when
-    using the higher-level APIs.
-
-## Setup
-
-Before using this guide, @{$install$install TensorFlow}.
-
-To get the most out of this guide, you should know the following:
-
-* How to program in Python.
-* At least a little bit about arrays.
-* Ideally, something about machine learning.
-
-Feel free to launch `python` and follow along with this walkthrough.
-Run the following lines to set up your Python environment:
-
-```python
-from __future__ import absolute_import
-from __future__ import division
-from __future__ import print_function
-
-import numpy as np
-import tensorflow as tf
-```
-
-## Tensor Values
-
-The central unit of data in TensorFlow is the **tensor**. A tensor consists of a
-set of primitive values shaped into an array of any number of dimensions. A
-tensor's **rank** is its number of dimensions, while its **shape** is a tuple
-of integers specifying the array's length along each dimension. Here are some
-examples of tensor values:
-
-```python
-3.  # a rank 0 tensor; a scalar with shape [],
-[1., 2., 3.]  # a rank 1 tensor; a vector with shape [3]
-[[1., 2., 3.], [4., 5., 6.]]  # a rank 2 tensor; a matrix with shape [2, 3]
-[[[1., 2., 3.]], [[7., 8., 9.]]]  # a rank 3 tensor with shape [2, 1, 3]
-```
-
-TensorFlow uses NumPy arrays to represent tensor **values**.
-
-## TensorFlow Core Walkthrough
-
-You might think of TensorFlow Core programs as consisting of two discrete
-sections:
-
-1. Building the computational graph (a @{tf.Graph}).
-2. Running the computational graph (using a @{tf.Session}).
-
-### Graph
-
-A **computational graph** is a series of TensorFlow operations arranged into a
-graph. The graph is composed of two types of objects.
-
-  * @{tf.Operation$Operations} (or "ops"): The nodes of the graph.
-    Operations describe calculations that consume and produce tensors.
-  * @{tf.Tensor$Tensors}: The edges in the graph. These represent the values
-    that will flow through the graph. Most TensorFlow functions return
-    `tf.Tensors`.
-
-Important: `tf.Tensors` do not have values, they are just handles to elements
-in the computation graph.
-
-Let's build a simple computational graph. The most basic operation is a
-constant. The Python function that builds the operation takes a tensor value as
-input. The resulting operation takes no inputs. When run, it outputs the
-value that was passed to the constructor. We can create two floating point
-constants `a` and `b` as follows:
-
-```python
-a = tf.constant(3.0, dtype=tf.float32)
-b = tf.constant(4.0)  # also tf.float32 implicitly
-total = a + b
-print(a)
-print(b)
-print(total)
-```
-
-The print statements produce:
-
-```
-Tensor("Const:0", shape=(), dtype=float32)
-Tensor("Const_1:0", shape=(), dtype=float32)
-Tensor("add:0", shape=(), dtype=float32)
-```
-
-Notice that printing the tensors does not output the values `3.0`, `4.0`, and
-`7.0` as you might expect. The above statements only build the computation
-graph.
-These `tf.Tensor` objects just represent the results of the operations
-that will be run.
-
-Each operation in a graph is given a unique name. This name is independent of
-the names the objects are assigned to in Python. Tensors are named after the
-operation that produces them followed by an output index, as in
-`"add:0"` above.
-
-### TensorBoard
-
-TensorFlow provides a utility called TensorBoard. One of TensorBoard's many
-capabilities is visualizing a computation graph. You can easily do this with
-a few simple commands.
-
-First you save the computation graph to a TensorBoard summary file as
-follows:
-
-```
-writer = tf.summary.FileWriter('.')
-writer.add_graph(tf.get_default_graph())
-```
-
-This will produce an `event` file in the current directory with a name in the
-following format:
-
-```
-events.out.tfevents.{timestamp}.{hostname}
-```
-
-Now, in a new terminal, launch TensorBoard with the following shell command:
-
-```bash
-tensorboard --logdir .
-```
-
-Then open TensorBoard's [graphs page](http://localhost:6006/#graphs) in your
-browser, and you should see a graph similar to the following:
-
-![TensorBoard screenshot](https://www.tensorflow.org/images/getting_started_add.png)
-
-For more about TensorBoard's graph visualization tools see @{$graph_viz}.
-
-### Session
-
-To evaluate tensors, instantiate a @{tf.Session} object, informally known as a
-**session**. A session encapsulates the state of the TensorFlow runtime, and
-runs TensorFlow operations. If a `tf.Graph` is like a `.py` file, a `tf.Session`
-is like the `python` executable.
-
-The following code creates a `tf.Session` object and then invokes its `run`
-method to evaluate the `total` tensor we created above:
-
-```python
-sess = tf.Session()
-print(sess.run(total))
-```
-
-When you request the output of a node with `Session.run` TensorFlow backtracks
-through the graph and runs all the nodes that provide input to the requested
-output node. So this prints the expected value of 7.0:
-
-```
-7.0
-```
-
-You can pass multiple tensors to `tf.Session.run`. The `run` method
-transparently handles any combination of tuples or dictionaries, as in the
-following example:
-
-```python
-print(sess.run({'ab':(a, b), 'total':total}))
-```
-
-which returns the results in a structure of the same layout:
-
-``` None
-{'total': 7.0, 'ab': (3.0, 4.0)}
-```
-
-During a call to `tf.Session.run` any `tf.Tensor` only has a single value.
-For example, the following code calls `tf.random_uniform` to produce a
-`tf.Tensor` that generates a random 3-element vector (with values in `[0,1)`):
-
-```python
-vec = tf.random_uniform(shape=(3,))
-out1 = vec + 1
-out2 = vec + 2
-print(sess.run(vec))
-print(sess.run(vec))
-print(sess.run((out1, out2)))
-```
-
-The result shows a different random value on each call to `run`, but
-a consistent value during a single `run` (`out1` and `out2` receive the same
-random input):
-
-```
-[ 0.52917576  0.64076328  0.68353939]
-[ 0.66192627  0.89126778  0.06254101]
-(
-  array([ 1.88408756,  1.87149239,  1.84057522], dtype=float32),
-  array([ 2.88408756,  2.87149239,  2.84057522], dtype=float32)
-)
-```
-
-Some TensorFlow functions return `tf.Operations` instead of `tf.Tensors`.
-The result of calling `run` on an Operation is `None`. You run an operation
-to cause a side-effect, not to retrieve a value. Examples of this include the
-[initialization](#initializing-layers), and [training](#training) ops
-demonstrated later.
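-
-For example, `tf.global_variables_initializer` returns a `tf.Operation`; a
-minimal sketch of running an op purely for its side effect (the variable and
-its value here are illustrative):
-
-```python
-v = tf.Variable(42.0)
-init = tf.global_variables_initializer()  # a tf.Operation, not a tf.Tensor
-
-with tf.Session() as sess:
-  print(sess.run(init))  # prints "None": run for the side effect
-  print(sess.run(v))     # prints "42.0": `v` was initialized above
-```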
- -### Feeding - -As it stands, this graph is not especially interesting because it always -produces a constant result. A graph can be parameterized to accept external -inputs, known as **placeholders**. A **placeholder** is a promise to provide a -value later, like a function argument. - -```python -x = tf.placeholder(tf.float32) -y = tf.placeholder(tf.float32) -z = x + y -``` - -The preceding three lines are a bit like a function in which we -define two input parameters (`x` and `y`) and then an operation on them. We can -evaluate this graph with multiple inputs by using the `feed_dict` argument of -the @{tf.Session.run$run method} to feed concrete values to the placeholders: - -```python -print(sess.run(z, feed_dict={x: 3, y: 4.5})) -print(sess.run(z, feed_dict={x: [1, 3], y: [2, 4]})) -``` -This results in the following output: - -``` -7.5 -[ 3. 7.] -``` - -Also note that the `feed_dict` argument can be used to overwrite any tensor in -the graph. The only difference between placeholders and other `tf.Tensors` is -that placeholders throw an error if no value is fed to them. - -## Datasets - -Placeholders work for simple experiments, but @{tf.data$Datasets} are the -preferred method of streaming data into a model. - -To get a runnable `tf.Tensor` from a Dataset you must first convert it to a -@{tf.data.Iterator}, and then call the Iterator's -@{tf.data.Iterator.get_next$`get_next`} method. - -The simplest way to create an Iterator is with the -@{tf.data.Dataset.make_one_shot_iterator$`make_one_shot_iterator`} method. -For example, in the following code the `next_item` tensor will return a row from -the `my_data` array on each `run` call: - -``` python -my_data = [ - [0, 1,], - [2, 3,], - [4, 5,], - [6, 7,], -] -slices = tf.data.Dataset.from_tensor_slices(my_data) -next_item = slices.make_one_shot_iterator().get_next() -``` - -Reaching the end of the data stream causes `Dataset` to throw an -@{tf.errors.OutOfRangeError$`OutOfRangeError`}. For example, the following code -reads the `next_item` until there is no more data to read: - -``` python -while True: - try: - print(sess.run(next_item)) - except tf.errors.OutOfRangeError: - break -``` - -If the `Dataset` depends on stateful operations you may need to -initialize the iterator before using it, as shown below: - -``` python -r = tf.random_normal([10,3]) -dataset = tf.data.Dataset.from_tensor_slices(r) -iterator = dataset.make_initializable_iterator() -next_row = iterator.get_next() - -sess.run(iterator.initializer) -while True: - try: - print(sess.run(next_row)) - except tf.errors.OutOfRangeError: - break -``` - -For more details on Datasets and Iterators see: @{$programmers_guide/datasets}. - -## Layers - -A trainable model must modify the values in the graph to get new outputs with -the same input. @{tf.layers$Layers} are the preferred way to add trainable -parameters to a graph. - -Layers package together both the variables and the operations that act -on them. For example a -[densely-connected layer](https://developers.google.com/machine-learning/glossary/#fully_connected_layer) -performs a weighted sum across all inputs -for each output and applies an optional -[activation function](https://developers.google.com/machine-learning/glossary/#activation_function). -The connection weights and biases are managed by the layer object. - -### Creating Layers - -The following code creates a @{tf.layers.Dense$`Dense`} layer that takes a -batch of input vectors, and produces a single output value for each. 
To apply a -layer to an input, call the layer as if it were a function. For example: - -```python -x = tf.placeholder(tf.float32, shape=[None, 3]) -linear_model = tf.layers.Dense(units=1) -y = linear_model(x) -``` - -The layer inspects its input to determine sizes for its internal variables. So -here we must set the shape of the `x` placeholder so that the layer can -build a weight matrix of the correct size. - -Now that we have defined the calculation of the output, `y`, there is one more -detail we need to take care of before we run the calculation. - -### Initializing Layers - -The layer contains variables that must be **initialized** before they can be -used. While it is possible to initialize variables individually, you can easily -initialize all the variables in a TensorFlow graph as follows: - -```python -init = tf.global_variables_initializer() -sess.run(init) -``` - -Important: Calling `tf.global_variables_initializer` only -creates and returns a handle to a TensorFlow operation. That op -will initialize all the global variables when we run it with `tf.Session.run`. - -Also note that this `global_variables_initializer` only initializes variables -that existed in the graph when the initializer was created. So the initializer -should be one of the last things added during graph construction. - -### Executing Layers - -Now that the layer is initialized, we can evaluate the `linear_model`'s output -tensor as we would any other tensor. For example, the following code: - -```python -print(sess.run(y, {x: [[1, 2, 3],[4, 5, 6]]})) -``` - -will generate a two-element output vector such as the following: - -``` -[[-3.41378999] - [-9.14999008]] -``` - -### Layer Function shortcuts - -For each layer class (like @{tf.layers.Dense}) TensorFlow also supplies a -shortcut function (like @{tf.layers.dense}). The only difference is that the -shortcut function versions create and run the layer in a single call. For -example, the following code is equivalent to the earlier version: - -```python -x = tf.placeholder(tf.float32, shape=[None, 3]) -y = tf.layers.dense(x, units=1) - -init = tf.global_variables_initializer() -sess.run(init) - -print(sess.run(y, {x: [[1, 2, 3], [4, 5, 6]]})) -``` - -While convenient, this approach allows no access to the @{tf.layers.Layer} -object. This makes introspection and debugging more difficult, -and layer reuse impossible. - -## Feature columns - -The easiest way to experiment with feature columns is using the -@{tf.feature_column.input_layer} function. This function only accepts -@{$feature_columns$dense columns} as inputs, so to view the result -of a categorical column you must wrap it in an -@{tf.feature_column.indicator_column}. For example: - -``` python -features = { - 'sales' : [[5], [10], [8], [9]], - 'department': ['sports', 'sports', 'gardening', 'gardening']} - -department_column = tf.feature_column.categorical_column_with_vocabulary_list( - 'department', ['sports', 'gardening']) -department_column = tf.feature_column.indicator_column(department_column) - -columns = [ - tf.feature_column.numeric_column('sales'), - department_column -] - -inputs = tf.feature_column.input_layer(features, columns) -``` - -Running the `inputs` tensor will parse the `features` into a batch of vectors. - -Feature columns can have internal state, like layers, so they often need to be -initialized. Categorical columns use @{tf.contrib.lookup$lookup tables} -internally and these require a separate initialization op, -@{tf.tables_initializer}. 
- -``` python -var_init = tf.global_variables_initializer() -table_init = tf.tables_initializer() -sess = tf.Session() -sess.run((var_init, table_init)) -``` - -Once the internal state has been initialized you can run `inputs` like any -other `tf.Tensor`: - -```python -print(sess.run(inputs)) -``` - -This shows how the feature columns have packed the input vectors, with the -one-hot "department" as the first two indices and "sales" as the third. - -```None -[[ 1. 0. 5.] - [ 1. 0. 10.] - [ 0. 1. 8.] - [ 0. 1. 9.]] -``` - -## Training - -Now that you're familiar with the basics of core TensorFlow, let's train a -small regression model manually. - -### Define the data - -First let's define some inputs, `x`, and the expected output for each input, -`y_true`: - -```python -x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32) -y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32) -``` - -### Define the model - -Next, build a simple linear model, with 1 output: - -``` python -linear_model = tf.layers.Dense(units=1) - -y_pred = linear_model(x) -``` - -You can evaluate the predictions as follows: - -``` python -sess = tf.Session() -init = tf.global_variables_initializer() -sess.run(init) - -print(sess.run(y_pred)) -``` - -The model hasn't yet been trained, so the four "predicted" values aren't very -good. Here's what we got; your own output will almost certainly differ: - -``` None -[[ 0.02631879] - [ 0.05263758] - [ 0.07895637] - [ 0.10527515]] -``` - -### Loss - -To optimize a model, you first need to define the loss. We'll use the mean -square error, a standard loss for regression problems. - -While you could do this manually with lower level math operations, -the @{tf.losses} module provides a set of common loss functions. You can use it -to calculate the mean square error as follows: - -``` python -loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred) - -print(sess.run(loss)) -``` -This will produce a loss value, something like: - -``` None -2.23962 -``` - -### Training - -TensorFlow provides -[**optimizers**](https://developers.google.com/machine-learning/glossary/#optimizer) -implementing standard optimization algorithms. These are implemented as -sub-classes of @{tf.train.Optimizer}. They incrementally change each -variable in order to minimize the loss. The simplest optimization algorithm is -[**gradient descent**](https://developers.google.com/machine-learning/glossary/#gradient_descent), -implemented by @{tf.train.GradientDescentOptimizer}. It modifies each -variable according to the magnitude of the derivative of loss with respect to -that variable. For example: - -```python -optimizer = tf.train.GradientDescentOptimizer(0.01) -train = optimizer.minimize(loss) -``` - -This code builds all the graph components necessary for the optimization, and -returns a training operation. When run, the training op will update variables -in the graph. You might run it as follows: - -```python -for i in range(100): - _, loss_value = sess.run((train, loss)) - print(loss_value) -``` - -Since `train` is an op, not a tensor, it doesn't return a value when run. -To see the progression of the loss during training, we run the loss tensor at -the same time, producing output like the following: - -``` None -1.35659 -1.00412 -0.759167 -0.588829 -0.470264 -0.387626 -0.329918 -0.289511 -0.261112 -0.241046 -... 
-``` - -### Complete program - -```python -x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32) -y_true = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32) - -linear_model = tf.layers.Dense(units=1) - -y_pred = linear_model(x) -loss = tf.losses.mean_squared_error(labels=y_true, predictions=y_pred) - -optimizer = tf.train.GradientDescentOptimizer(0.01) -train = optimizer.minimize(loss) - -init = tf.global_variables_initializer() - -sess = tf.Session() -sess.run(init) -for i in range(100): - _, loss_value = sess.run((train, loss)) - print(loss_value) - -print(sess.run(y_pred)) -``` - -## Next steps - -To learn more about building models with TensorFlow consider the following: - -* @{$custom_estimators$Custom Estimators}, to learn how to build - customized models with TensorFlow. Your knowledge of TensorFlow Core will - help you understand and debug your own models. - -If you want to learn more about the inner workings of TensorFlow consider the -following documents, which go into more depth on many of the topics discussed -here: - -* @{$graphs} -* @{$tensors} -* @{$variables} - - diff --git a/tensorflow/docs_src/programmers_guide/premade_estimators.md b/tensorflow/docs_src/programmers_guide/premade_estimators.md deleted file mode 100644 index 02e2caf64b..0000000000 --- a/tensorflow/docs_src/programmers_guide/premade_estimators.md +++ /dev/null @@ -1,430 +0,0 @@ -# Premade Estimators - -This document introduces the TensorFlow programming environment and shows you -how to solve the Iris classification problem in TensorFlow. - -## Prerequisites - -Prior to using the sample code in this document, you'll need to do the -following: - -* @{$install$Install TensorFlow}. -* If you installed TensorFlow with virtualenv or Anaconda, activate your - TensorFlow environment. -* Install or upgrade pandas by issuing the following command: - - pip install pandas - -## Getting the sample code - -Take the following steps to get the sample code we'll be going through: - -1. Clone the TensorFlow Models repository from GitHub by entering the following - command: - - git clone https://github.com/tensorflow/models - -1. Change directory within that branch to the location containing the examples - used in this document: - - cd models/samples/core/get_started/ - -The program described in this document is -[`premade_estimator.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/premade_estimator.py). -This program uses -[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py) -to fetch its training data. - -### Running the program - -You run TensorFlow programs as you would run any Python program. For example: - -``` bsh -python premade_estimator.py -``` - -The program should output training logs followed by some predictions against -the test set. For example, the first line in the following output shows that -the model thinks there is a 99.6% chance that the first example in the test -set is a Setosa. Since the test set expected Setosa, this appears to be -a good prediction. - -``` None -... -Prediction is "Setosa" (99.6%), expected "Setosa" - -Prediction is "Versicolor" (99.8%), expected "Versicolor" - -Prediction is "Virginica" (97.9%), expected "Virginica" -``` - -If the program generates errors instead of answers, ask yourself the following -questions: - -* Did you install TensorFlow properly? -* Are you using the correct version of TensorFlow? -* Did you activate the environment you installed TensorFlow in? 
(This is - only relevant in certain installation mechanisms.) - -## The programming stack - -Before getting into the details of the program itself, let's investigate the -programming environment. As the following illustration shows, TensorFlow -provides a programming stack consisting of multiple API layers: - -
-*(Figure: the TensorFlow programming stack.)*
- -We strongly recommend writing TensorFlow programs with the following APIs: - -* @{$programmers_guide/estimators$Estimators}, which represent a complete model. - The Estimator API provides methods to train the model, to judge the model's - accuracy, and to generate predictions. -* @{$programmers_guide/datasets_for_estimators}, which build a data input - pipeline. The Dataset API has methods to load and manipulate data, and feed - it into your model. The Dataset API meshes well with the Estimators API. - -## Classifying irises: an overview - -The sample program in this document builds and tests a model that -classifies Iris flowers into three different species based on the size of their -[sepals](https://en.wikipedia.org/wiki/Sepal) and -[petals](https://en.wikipedia.org/wiki/Petal). - -
-*(Figure: petal geometry compared for three iris species: Iris setosa, Iris virginica, and Iris versicolor.)*
- -**From left to right, -[*Iris setosa*](https://commons.wikimedia.org/w/index.php?curid=170298) (by -[Radomil](https://commons.wikimedia.org/wiki/User:Radomil), CC BY-SA 3.0), -[*Iris versicolor*](https://commons.wikimedia.org/w/index.php?curid=248095) (by -[Dlanglois](https://commons.wikimedia.org/wiki/User:Dlanglois), CC BY-SA 3.0), -and [*Iris virginica*](https://www.flickr.com/photos/33397993@N05/3352169862) -(by [Frank Mayfield](https://www.flickr.com/photos/33397993@N05), CC BY-SA -2.0).** - -### The data set - -The Iris data set contains four features and one -[label](https://developers.google.com/machine-learning/glossary/#label). -The four features identify the following botanical characteristics of -individual Iris flowers: - -* sepal length -* sepal width -* petal length -* petal width - -Our model will represent these features as `float32` numerical data. - -The label identifies the Iris species, which must be one of the following: - -* Iris setosa (0) -* Iris versicolor (1) -* Iris virginica (2) - -Our model will represent the label as `int32` categorical data. - -The following table shows three examples in the data set: - -|sepal length | sepal width | petal length | petal width| species (label) | -|------------:|------------:|-------------:|-----------:|:---------------:| -| 5.1 | 3.3 | 1.7 | 0.5 | 0 (Setosa) | -| 5.0 | 2.3 | 3.3 | 1.0 | 1 (versicolor)| -| 6.4 | 2.8 | 5.6 | 2.2 | 2 (virginica) | - -### The algorithm - -The program trains a Deep Neural Network classifier model having the following -topology: - -* 2 hidden layers. -* Each hidden layer contains 10 nodes. - -The following figure illustrates the features, hidden layers, and predictions -(not all of the nodes in the hidden layers are shown): - -
-*(Figure: a diagram of the network architecture: inputs, 2 hidden layers, and outputs.)*
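-
-For intuition, this topology can be sketched directly with the low-level
-`tf.layers` API. This is only an illustrative sketch of ours, not the code the
-program uses; the pre-made Estimator shown later builds equivalent layers for
-you from `hidden_units=[10, 10]`:
-
-```python
-import tensorflow as tf
-
-# Four numeric Iris features per example.
-features = tf.placeholder(tf.float32, shape=[None, 4])
-
-# Two hidden layers of 10 nodes each, then a 3-class output layer.
-hidden_1 = tf.layers.dense(features, units=10, activation=tf.nn.relu)
-hidden_2 = tf.layers.dense(hidden_1, units=10, activation=tf.nn.relu)
-logits = tf.layers.dense(hidden_2, units=3)
-
-# Per-class probabilities; these are the three predictions described below.
-probabilities = tf.nn.softmax(logits)
-```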
- -### Inference - -Running the trained model on an unlabeled example yields three predictions, -namely, the likelihood that this flower is the given Iris species. The sum of -those output predictions will be 1.0. For example, the prediction on an -unlabeled example might be something like the following: - -* 0.03 for Iris Setosa -* 0.95 for Iris Versicolor -* 0.02 for Iris Virginica - -The preceding prediction indicates a 95% probability that the given unlabeled -example is an Iris Versicolor. - -## Overview of programming with Estimators - -An Estimator is TensorFlow's high-level representation of a complete model. It -handles the details of initialization, logging, saving and restoring, and many -other features so you can concentrate on your model. For more details see -@{$programmers_guide/estimators}. - -An Estimator is any class derived from @{tf.estimator.Estimator}. TensorFlow -provides a collection of -@{tf.estimator$pre-made Estimators} -(for example, `LinearRegressor`) to implement common ML algorithms. Beyond -those, you may write your own -@{$custom_estimators$custom Estimators}. -We recommend using pre-made Estimators when just getting started. - -To write a TensorFlow program based on pre-made Estimators, you must perform the -following tasks: - -* Create one or more input functions. -* Define the model's feature columns. -* Instantiate an Estimator, specifying the feature columns and various - hyperparameters. -* Call one or more methods on the Estimator object, passing the appropriate - input function as the source of the data. - -Let's see how those tasks are implemented for Iris classification. - -## Create input functions - -You must create input functions to supply data for training, -evaluating, and prediction. - -An **input function** is a function that returns a @{tf.data.Dataset} object -which outputs the following two-element tuple: - -* [`features`](https://developers.google.com/machine-learning/glossary/#feature) - A Python dictionary in which: - * Each key is the name of a feature. - * Each value is an array containing all of that feature's values. -* `label` - An array containing the values of the - [label](https://developers.google.com/machine-learning/glossary/#label) for - every example. - -Just to demonstrate the format of the input function, here's a simple -implementation: - -```python -def input_evaluation_set(): - features = {'SepalLength': np.array([6.4, 5.0]), - 'SepalWidth': np.array([2.8, 2.3]), - 'PetalLength': np.array([5.6, 3.3]), - 'PetalWidth': np.array([2.2, 1.0])} - labels = np.array([2, 1]) - return features, labels -``` - -Your input function may generate the `features` dictionary and `label` list any -way you like. However, we recommend using TensorFlow's Dataset API, which can -parse all sorts of data. At a high level, the Dataset API consists of the -following classes: - -
-*(Figure: a diagram showing subclasses of the `Dataset` class.)*
- -Where the individual members are: - -* `Dataset` - Base class containing methods to create and transform - datasets. Also allows you to initialize a dataset from data in memory, or from - a Python generator. -* `TextLineDataset` - Reads lines from text files. -* `TFRecordDataset` - Reads records from TFRecord files. -* `FixedLengthRecordDataset` - Reads fixed size records from binary files. -* `Iterator` - Provides a way to access one data set element at a time. - -The Dataset API can handle a lot of common cases for you. For example, -using the Dataset API, you can easily read in records from a large collection -of files in parallel and join them into a single stream. - -To keep things simple in this example we are going to load the data with -[pandas](https://pandas.pydata.org/), and build our input pipeline from this -in-memory data. - -Here is the input function used for training in this program, which is available -in [`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py): - -``` python -def train_input_fn(features, labels, batch_size): - """An input function for training""" - # Convert the inputs to a Dataset. - dataset = tf.data.Dataset.from_tensor_slices((dict(features), labels)) - - # Shuffle, repeat, and batch the examples. - return dataset.shuffle(1000).repeat().batch(batch_size) -``` - -## Define the feature columns - -A [**feature column**](https://developers.google.com/machine-learning/glossary/#feature_columns) -is an object describing how the model should use raw input data from the -features dictionary. When you build an Estimator model, you pass it a list of -feature columns that describes each of the features you want the model to use. -The @{tf.feature_column} module provides many options for representing data -to the model. - -For Iris, the 4 raw features are numeric values, so we'll build a list of -feature columns to tell the Estimator model to represent each of the four -features as 32-bit floating-point values. Therefore, the code to create the -feature column is: - -```python -# Feature columns describe how to use the input. -my_feature_columns = [] -for key in train_x.keys(): - my_feature_columns.append(tf.feature_column.numeric_column(key=key)) -``` - -Feature columns can be far more sophisticated than those we're showing here. We -detail feature columns @{$feature_columns$later on} in our Getting -Started guide. - -Now that we have the description of how we want the model to represent the raw -features, we can build the estimator. - - -## Instantiate an estimator - -The Iris problem is a classic classification problem. Fortunately, TensorFlow -provides several pre-made classifier Estimators, including: - -* @{tf.estimator.DNNClassifier} for deep models that perform multi-class - classification. -* @{tf.estimator.DNNLinearCombinedClassifier} for wide & deep models. -* @{tf.estimator.LinearClassifier} for classifiers based on linear models. - -For the Iris problem, `tf.estimator.DNNClassifier` seems like the best choice. -Here's how we instantiated this Estimator: - -```python -# Build a DNN with 2 hidden layers and 10 nodes in each hidden layer. -classifier = tf.estimator.DNNClassifier( - feature_columns=my_feature_columns, - # Two hidden layers of 10 nodes each. - hidden_units=[10, 10], - # The model must choose between 3 classes. - n_classes=3) -``` - -## Train, Evaluate, and Predict - -Now that we have an Estimator object, we can call methods to do the following: - -* Train the model. 
-* Evaluate the trained model. -* Use the trained model to make predictions. - -### Train the model - -Train the model by calling the Estimator's `train` method as follows: - -```python -# Train the Model. -classifier.train( - input_fn=lambda:iris_data.train_input_fn(train_x, train_y, args.batch_size), - steps=args.train_steps) -``` - -Here we wrap up our `input_fn` call in a -[`lambda`](https://docs.python.org/3/tutorial/controlflow.html) -to capture the arguments while providing an input function that takes no -arguments, as expected by the Estimator. The `steps` argument tells the method -to stop training after a number of training steps. - -### Evaluate the trained model - -Now that the model has been trained, we can get some statistics on its -performance. The following code block evaluates the accuracy of the trained -model on the test data: - -```python -# Evaluate the model. -eval_result = classifier.evaluate( - input_fn=lambda:iris_data.eval_input_fn(test_x, test_y, args.batch_size)) - -print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result)) -``` - -Unlike our call to the `train` method, we did not pass the `steps` -argument to evaluate. Our `eval_input_fn` only yields a single -[epoch](https://developers.google.com/machine-learning/glossary/#epoch) of data. - -Running this code yields the following output (or something similar): - -```none -Test set accuracy: 0.967 -``` - -### Making predictions (inferring) from the trained model - -We now have a trained model that produces good evaluation results. -We can now use the trained model to predict the species of an Iris flower -based on some unlabeled measurements. As with training and evaluation, we make -predictions using a single function call: - -```python -# Generate predictions from the model -expected = ['Setosa', 'Versicolor', 'Virginica'] -predict_x = { - 'SepalLength': [5.1, 5.9, 6.9], - 'SepalWidth': [3.3, 3.0, 3.1], - 'PetalLength': [1.7, 4.2, 5.4], - 'PetalWidth': [0.5, 1.5, 2.1], -} - -predictions = classifier.predict( - input_fn=lambda:iris_data.eval_input_fn(predict_x, - batch_size=args.batch_size)) -``` - -The `predict` method returns a Python iterable, yielding a dictionary of -prediction results for each example. The following code prints a few -predictions and their probabilities: - - -``` python -template = ('\nPrediction is "{}" ({:.1f}%), expected "{}"') - -for pred_dict, expec in zip(predictions, expected): - class_id = pred_dict['class_ids'][0] - probability = pred_dict['probabilities'][class_id] - - print(template.format(iris_data.SPECIES[class_id], - 100 * probability, expec)) -``` - -Running the preceding code yields the following output: - -``` None -... -Prediction is "Setosa" (99.6%), expected "Setosa" - -Prediction is "Versicolor" (99.8%), expected "Versicolor" - -Prediction is "Virginica" (97.9%), expected "Virginica" -``` - - -## Summary - -Pre-made Estimators are an effective way to quickly create standard models. - -Now that you've gotten started writing TensorFlow programs, consider the -following material: - -* @{$checkpoints$Checkpoints} to learn how to save and restore models. -* @{$programmers_guide/datasets_for_estimators} to learn more about importing - data into your model. -* @{$custom_estimators$Creating Custom Estimators} to learn how to - write your own Estimator, customized for a particular problem. 
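-
-As a reference, the `eval_input_fn` used in the calls above lives in
-[`iris_data.py`](https://github.com/tensorflow/models/blob/master/samples/core/get_started/iris_data.py)
-and is not reproduced in this document. A sketch consistent with how it is
-called here (a single epoch, no shuffling, and labels omitted for prediction)
-might look like the following; the details are illustrative:
-
-```python
-def eval_input_fn(features, labels=None, batch_size=None):
-    """An input function for evaluation or prediction."""
-    features = dict(features)
-    if labels is None:
-        # No labels: the caller is requesting predictions.
-        inputs = features
-    else:
-        inputs = (features, labels)
-
-    # Convert the inputs to a Dataset and batch the examples. Unlike
-    # train_input_fn, there is no shuffle() or repeat(), so the dataset
-    # yields a single epoch.
-    dataset = tf.data.Dataset.from_tensor_slices(inputs)
-    assert batch_size is not None, "batch_size must not be None"
-    return dataset.batch(batch_size)
-```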
diff --git a/tensorflow/docs_src/programmers_guide/saved_model.md b/tensorflow/docs_src/programmers_guide/saved_model.md deleted file mode 100644 index c6ef87c54a..0000000000 --- a/tensorflow/docs_src/programmers_guide/saved_model.md +++ /dev/null @@ -1,999 +0,0 @@ -# Save and Restore - -The @{tf.train.Saver} class provides methods to save and restore models. The -@{tf.saved_model.simple_save} function is an easy way to build a -@{tf.saved_model$saved model} suitable for serving. -[Estimators](@{$programmers_guide/estimators}) automatically save and restore -variables in the `model_dir`. - -## Save and restore variables - -TensorFlow @{$variables} are the best way to represent shared, persistent state -manipulated by your program. The `tf.train.Saver` constructor adds `save` and -`restore` ops to the graph for all, or a specified list, of the variables in the -graph. The `Saver` object provides methods to run these ops, specifying paths -for the checkpoint files to write to or read from. - -`Saver` restores all variables already defined in your model. If you're -loading a model without knowing how to build its graph (for example, if you're -writing a generic program to load models), then read the -[Overview of saving and restoring models](#models) section -later in this document. - -TensorFlow saves variables in binary *checkpoint files* that map variable -names to tensor values. - -Caution: TensorFlow model files are code. Be careful with untrusted code. -See [Using TensorFlow Securely](https://github.com/tensorflow/tensorflow/blob/master/SECURITY.md) -for details. - -### Save variables - -Create a `Saver` with `tf.train.Saver()` to manage all variables in the -model. For example, the following snippet demonstrates how to call the -`tf.train.Saver.save` method to save variables to checkpoint files: - -```python -# Create some variables. -v1 = tf.get_variable("v1", shape=[3], initializer = tf.zeros_initializer) -v2 = tf.get_variable("v2", shape=[5], initializer = tf.zeros_initializer) - -inc_v1 = v1.assign(v1+1) -dec_v2 = v2.assign(v2-1) - -# Add an op to initialize the variables. -init_op = tf.global_variables_initializer() - -# Add ops to save and restore all the variables. -saver = tf.train.Saver() - -# Later, launch the model, initialize the variables, do some work, and save the -# variables to disk. -with tf.Session() as sess: - sess.run(init_op) - # Do some work with the model. - inc_v1.op.run() - dec_v2.op.run() - # Save the variables to disk. - save_path = saver.save(sess, "/tmp/model.ckpt") - print("Model saved in path: %s" % save_path) -``` - -### Restore variables - -The `tf.train.Saver` object not only saves variables to checkpoint files, it -also restores variables. Note that when you restore variables you do not have -to initialize them beforehand. For example, the following snippet demonstrates -how to call the `tf.train.Saver.restore` method to restore variables from the -checkpoint files: - -```python -tf.reset_default_graph() - -# Create some variables. -v1 = tf.get_variable("v1", shape=[3]) -v2 = tf.get_variable("v2", shape=[5]) - -# Add ops to save and restore all the variables. -saver = tf.train.Saver() - -# Later, launch the model, use the saver to restore variables from disk, and -# do some work with the model. -with tf.Session() as sess: - # Restore variables from disk. 
- saver.restore(sess, "/tmp/model.ckpt") - print("Model restored.") - # Check the values of the variables - print("v1 : %s" % v1.eval()) - print("v2 : %s" % v2.eval()) -``` - -Note: There is not a physical file called `/tmp/model.ckpt`. It is the *prefix* of -filenames created for the checkpoint. Users only interact with the prefix -instead of physical checkpoint files. - -### Choose variables to save and restore - -If you do not pass any arguments to `tf.train.Saver()`, the saver handles all -variables in the graph. Each variable is saved under the name that was passed -when the variable was created. - -It is sometimes useful to explicitly specify names for variables in the -checkpoint files. For example, you may have trained a model with a variable -named `"weights"` whose value you want to restore into a variable named -`"params"`. - -It is also sometimes useful to only save or restore a subset of the variables -used by a model. For example, you may have trained a neural net with five -layers, and you now want to train a new model with six layers that reuses the -existing weights of the five trained layers. You can use the saver to restore -the weights of just the first five layers. - -You can easily specify the names and variables to save or load by passing to the -`tf.train.Saver()` constructor either of the following: - -* A list of variables (which will be stored under their own names). -* A Python dictionary in which keys are the names to use and the values are the -variables to manage. - -Continuing from the save/restore examples shown earlier: - -```python -tf.reset_default_graph() -# Create some variables. -v1 = tf.get_variable("v1", [3], initializer = tf.zeros_initializer) -v2 = tf.get_variable("v2", [5], initializer = tf.zeros_initializer) - -# Add ops to save and restore only `v2` using the name "v2" -saver = tf.train.Saver({"v2": v2}) - -# Use the saver object normally after that. -with tf.Session() as sess: - # Initialize v1 since the saver will not. - v1.initializer.run() - saver.restore(sess, "/tmp/model.ckpt") - - print("v1 : %s" % v1.eval()) - print("v2 : %s" % v2.eval()) -``` - -Notes: - -* You can create as many `Saver` objects as you want if you need to save and - restore different subsets of the model variables. The same variable can be - listed in multiple saver objects; its value is only changed when the - `Saver.restore()` method is run. - -* If you only restore a subset of the model variables at the start of a - session, you have to run an initialize op for the other variables. See - @{tf.variables_initializer} for more information. - -* To inspect the variables in a checkpoint, you can use the - [`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py) - library, particularly the `print_tensors_in_checkpoint_file` function. - -* By default, `Saver` uses the value of the @{tf.Variable.name} property - for each variable. However, when you create a `Saver` object, you may - optionally choose names for the variables in the checkpoint files. - - -### Inspect variables in a checkpoint - -We can quickly inspect variables in a checkpoint with the -[`inspect_checkpoint`](https://www.tensorflow.org/code/tensorflow/python/tools/inspect_checkpoint.py) library. 
- -Continuing from the save/restore examples shown earlier: - -```python -# import the inspect_checkpoint library -from tensorflow.python.tools import inspect_checkpoint as chkp - -# print all tensors in checkpoint file -chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='', all_tensors=True) - -# tensor_name: v1 -# [ 1. 1. 1.] -# tensor_name: v2 -# [-1. -1. -1. -1. -1.] - -# print only tensor v1 in checkpoint file -chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='v1', all_tensors=False) - -# tensor_name: v1 -# [ 1. 1. 1.] - -# print only tensor v2 in checkpoint file -chkp.print_tensors_in_checkpoint_file("/tmp/model.ckpt", tensor_name='v2', all_tensors=False) - -# tensor_name: v2 -# [-1. -1. -1. -1. -1.] -``` - - - -## Save and restore models - -Use `SavedModel` to save and load your model—variables, the graph, and the -graph's metadata. This is a language-neutral, recoverable, hermetic -serialization format that enables higher-level systems and tools to produce, -consume, and transform TensorFlow models. TensorFlow provides several ways to -interact with `SavedModel`, including the @{tf.saved_model} APIs, -@{tf.estimator.Estimator}, and a command-line interface. - - -## Build and load a SavedModel - -### Simple save - -The easiest way to create a `SavedModel` is to use the @{tf.saved_model.simple_save} -function: - -```python -simple_save(session, - export_dir, - inputs={"x": x, "y": y}, - outputs={"z": z}) -``` - -This configures the `SavedModel` so it can be loaded by -[TensorFlow serving](/serving/serving_basic) and supports the -[Predict API](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/predict.proto). -To access the classify, regress, or multi-inference APIs, use the manual -`SavedModel` builder APIs or an @{tf.estimator.Estimator}. - -### Manually build a SavedModel - -If your use case isn't covered by @{tf.saved_model.simple_save}, use the manual -@{tf.saved_model.builder$builder APIs} to create a `SavedModel`. - -The @{tf.saved_model.builder.SavedModelBuilder} class provides functionality to -save multiple `MetaGraphDef`s. A **MetaGraph** is a dataflow graph, plus -its associated variables, assets, and signatures. A **`MetaGraphDef`** -is the protocol buffer representation of a MetaGraph. A **signature** is -the set of inputs to and outputs from a graph. - -If assets need to be saved and written or copied to disk, they can be provided -when the first `MetaGraphDef` is added. If multiple `MetaGraphDef`s are -associated with an asset of the same name, only the first version is retained. - -Each `MetaGraphDef` added to the SavedModel must be annotated with -user-specified tags. The tags provide a means to identify the specific -`MetaGraphDef` to load and restore, along with the shared set of variables -and assets. These tags -typically annotate a `MetaGraphDef` with its functionality (for example, -serving or training), and optionally with hardware-specific aspects (for -example, GPU). - -For example, the following code suggests a typical way to use -`SavedModelBuilder` to build a SavedModel: - -```python -export_dir = ... -... -builder = tf.saved_model.builder.SavedModelBuilder(export_dir) -with tf.Session(graph=tf.Graph()) as sess: - ... - builder.add_meta_graph_and_variables(sess, - [tag_constants.TRAINING], - signature_def_map=foo_signatures, - assets_collection=foo_assets, - strip_default_attrs=True) -... -# Add a second MetaGraphDef for inference. -with tf.Session(graph=tf.Graph()) as sess: - ... 
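-  # (Typically, build or import the inference graph here. The second
-  # MetaGraphDef shares the variables already saved above, so only the
-  # graph and its tags are added below.)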
- builder.add_meta_graph([tag_constants.SERVING], strip_default_attrs=True) -... -builder.save() -``` - - -#### Forward compatibility via `strip_default_attrs=True` - -Following the guidance below gives you forward compatibility only if the set of -Ops has not changed. - -The @{tf.saved_model.builder.SavedModelBuilder$`SavedModelBuilder`} class allows -users to control whether default-valued attributes must be stripped from the -@{$extend/tool_developers#nodes$`NodeDefs`} -while adding a meta graph to the SavedModel bundle. Both -@{tf.saved_model.builder.SavedModelBuilder.add_meta_graph_and_variables$`SavedModelBuilder.add_meta_graph_and_variables`} -and @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph$`SavedModelBuilder.add_meta_graph`} -methods accept a Boolean flag `strip_default_attrs` that controls this behavior. - -If `strip_default_attrs` is `False`, the exported @{tf.MetaGraphDef} will have -the default valued attributes in all its @{tf.NodeDef} instances. -This can break forward compatibility with a sequence of events such as the -following: - -* An existing Op (`Foo`) is updated to include a new attribute (`T`) with a - default (`bool`) at version 101. -* A model producer such as a "trainer binary" picks up this change (version 101) - to the `OpDef` and re-exports an existing model that uses Op `Foo`. -* A model consumer (such as [Tensorflow Serving](/serving)) running an older - binary (version 100) doesn't have attribute `T` for Op `Foo`, but tries to - import this model. The model consumer doesn't recognize attribute `T` in a - `NodeDef` that uses Op `Foo` and therefore fails to load the model. -* By setting `strip_default_attrs` to True, the model producers can strip away - any default valued attributes in the `NodeDefs`. This helps ensure that newly - added attributes with defaults don't cause older model consumers to fail - loading models regenerated with newer training binaries. - -See [compatibility guidance](https://www.tensorflow.org/programmers_guide/version_compat) -for more information. - -### Loading a SavedModel in Python - -The Python version of the SavedModel -@{tf.saved_model.loader$loader} -provides load and restore capability for a SavedModel. The `load` operation -requires the following information: - -* The session in which to restore the graph definition and variables. -* The tags used to identify the MetaGraphDef to load. -* The location (directory) of the SavedModel. - -Upon a load, the subset of variables, assets, and signatures supplied as part of -the specific MetaGraphDef will be restored into the supplied session. - - -```python -export_dir = ... -... -with tf.Session(graph=tf.Graph()) as sess: - tf.saved_model.loader.load(sess, [tag_constants.TRAINING], export_dir) - ... -``` - - -### Load a SavedModel in C++ - -The C++ version of the SavedModel -[loader](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/loader.h) -provides an API to load a SavedModel from a path, while allowing -`SessionOptions` and `RunOptions`. -You have to specify the tags associated with the graph to be loaded. -The loaded version of SavedModel is referred to as `SavedModelBundle` -and contains the MetaGraphDef and the session within which it is loaded. - -```c++ -const string export_dir = ... -SavedModelBundle bundle; -... 
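-// session_options and run_options configure the session used for loading;
-// default-constructed SessionOptions and RunOptions suffice for a basic load.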
-LoadSavedModel(session_options, run_options, export_dir, {kSavedModelTagTrain}, - &bundle); -``` - -### Load and serve a SavedModel in TensorFlow serving - -You can easily load and serve a SavedModel with the TensorFlow Serving Model -Server binary. See [instructions](https://www.tensorflow.org/serving/setup#installing_using_apt-get) -on how to install the server, or build it if you wish. - -Once you have the Model Server, run it with: -``` -tensorflow_model_server --port=port-numbers --model_name=your-model-name --model_base_path=your_model_base_path -``` -Set the port and model_name flags to values of your choosing. The -model_base_path flag expects to be to a base directory, with each version of -your model residing in a numerically named subdirectory. If you only have a -single version of your model, simply place it in a subdirectory like so: -* Place the model in /tmp/model/0001 -* Set model_base_path to /tmp/model - -Store different versions of your model in numerically named subdirectories of a -common base directory. For example, suppose the base directory is `/tmp/model`. -If you have only one version of your model, store it in `/tmp/model/0001`. If -you have two versions of your model, store the second version in -`/tmp/model/0002`, and so on. Set the `--model-base_path` flag to the base -directory (`/tmp/model`, in this example). TensorFlow Model Server will serve -the model in the highest numbered subdirectory of that base directory. - -### Standard constants - -SavedModel offers the flexibility to build and load TensorFlow graphs for a -variety of use-cases. For the most common use-cases, SavedModel's APIs -provide a set of constants in Python and C++ that are easy to -reuse and share across tools consistently. - -#### Standard MetaGraphDef tags - -You may use sets of tags to uniquely identify a `MetaGraphDef` saved in a -SavedModel. A subset of commonly used tags is specified in: - -* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/tag_constants.py) -* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/tag_constants.h) - - -#### Standard SignatureDef constants - -A [**SignatureDef**](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/meta_graph.proto) -is a protocol buffer that defines the signature of a computation -supported by a graph. -Commonly used input keys, output keys, and method names are -defined in: - -* [Python](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/saved_model/signature_constants.py) -* [C++](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/cc/saved_model/signature_constants.h) - -## Using SavedModel with Estimators - -After training an `Estimator` model, you may want to create a service -from that model that takes requests and returns a result. You can run such a -service locally on your machine or deploy it in the cloud. - -To prepare a trained Estimator for serving, you must export it in the standard -SavedModel format. This section explains how to: - -* Specify the output nodes and the corresponding - [APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto) - that can be served (Classify, Regress, or Predict). -* Export your model to the SavedModel format. -* Serve the model from a local server and request predictions. 
- - -### Prepare serving inputs - -During training, an @{$premade_estimators#input_fn$`input_fn()`} ingests data -and prepares it for use by the model. At serving time, similarly, a -`serving_input_receiver_fn()` accepts inference requests and prepares them for -the model. This function has the following purposes: - -* To add placeholders to the graph that the serving system will feed - with inference requests. -* To add any additional ops needed to convert data from the input format - into the feature `Tensor`s expected by the model. - -The function returns a @{tf.estimator.export.ServingInputReceiver} object, -which packages the placeholders and the resulting feature `Tensor`s together. - -A typical pattern is that inference requests arrive in the form of serialized -`tf.Example`s, so the `serving_input_receiver_fn()` creates a single string -placeholder to receive them. The `serving_input_receiver_fn()` is then also -responsible for parsing the `tf.Example`s by adding a @{tf.parse_example} op to -the graph. - -When writing such a `serving_input_receiver_fn()`, you must pass a parsing -specification to @{tf.parse_example} to tell the parser what feature names to -expect and how to map them to `Tensor`s. A parsing specification takes the -form of a dict from feature names to @{tf.FixedLenFeature}, @{tf.VarLenFeature}, -and @{tf.SparseFeature}. Note this parsing specification should not include -any label or weight columns, since those will not be available at serving -time—in contrast to a parsing specification used in the `input_fn()` at -training time. - -In combination, then: - -```py -feature_spec = {'foo': tf.FixedLenFeature(...), - 'bar': tf.VarLenFeature(...)} - -def serving_input_receiver_fn(): - """An input receiver that expects a serialized tf.Example.""" - serialized_tf_example = tf.placeholder(dtype=tf.string, - shape=[default_batch_size], - name='input_example_tensor') - receiver_tensors = {'examples': serialized_tf_example} - features = tf.parse_example(serialized_tf_example, feature_spec) - return tf.estimator.export.ServingInputReceiver(features, receiver_tensors) -``` - -The @{tf.estimator.export.build_parsing_serving_input_receiver_fn} utility -function provides that input receiver for the common case. - -> Note: when training a model to be served using the Predict API with a local -> server, the parsing step is not needed because the model will receive raw -> feature data. - -Even if you require no parsing or other input processing—that is, if the -serving system will feed feature `Tensor`s directly—you must still provide -a `serving_input_receiver_fn()` that creates placeholders for the feature -`Tensor`s and passes them through. The -@{tf.estimator.export.build_raw_serving_input_receiver_fn} utility provides for -this. - -If these utilities do not meet your needs, you are free to write your own -`serving_input_receiver_fn()`. One case where this may be needed is if your -training `input_fn()` incorporates some preprocessing logic that must be -recapitulated at serving time. To reduce the risk of training-serving skew, we -recommend encapsulating such processing in a function which is then called -from both `input_fn()` and `serving_input_receiver_fn()`. - -Note that the `serving_input_receiver_fn()` also determines the *input* -portion of the signature. That is, when writing a -`serving_input_receiver_fn()`, you must tell the parser what signatures -to expect and how to map them to your model's expected inputs. 
-By contrast, the *output* portion of the signature is determined by the model. - - -### Specify the outputs of a custom model - -When writing a custom `model_fn`, you must populate the `export_outputs` element -of the @{tf.estimator.EstimatorSpec} return value. This is a dict of -`{name: output}` describing the output signatures to be exported and used during -serving. - -In the usual case of making a single prediction, this dict contains -one element, and the `name` is immaterial. In a multi-headed model, each head -is represented by an entry in this dict. In this case the `name` is a string -of your choice that can be used to request a specific head at serving time. - -Each `output` value must be an `ExportOutput` object such as -@{tf.estimator.export.ClassificationOutput}, -@{tf.estimator.export.RegressionOutput}, or -@{tf.estimator.export.PredictOutput}. - -These output types map straightforwardly to the -[TensorFlow Serving APIs](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto), -and so determine which request types will be honored. - -Note: In the multi-headed case, a `SignatureDef` will be generated for each -element of the `export_outputs` dict returned from the model_fn, named using -the same keys. These `SignatureDef`s differ only in their outputs, as -provided by the corresponding `ExportOutput` entry. The inputs are always -those provided by the `serving_input_receiver_fn`. -An inference request may specify the head by name. One head must be named -using [`signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY`](https://www.tensorflow.org/code/tensorflow/python/saved_model/signature_constants.py) -indicating which `SignatureDef` will be served when an inference request -does not specify one. - - -### Perform the export - -To export your trained Estimator, call -@{tf.estimator.Estimator.export_savedmodel} with the export base path and -the `serving_input_receiver_fn`. - -```py -estimator.export_savedmodel(export_dir_base, serving_input_receiver_fn, - strip_default_attrs=True) -``` - -This method builds a new graph by first calling the -`serving_input_receiver_fn()` to obtain feature `Tensor`s, and then calling -this `Estimator`'s `model_fn()` to generate the model graph based on those -features. It starts a fresh `Session`, and, by default, restores the most recent -checkpoint into it. (A different checkpoint may be passed, if needed.) -Finally it creates a time-stamped export directory below the given -`export_dir_base` (i.e., `export_dir_base/`), and writes a -SavedModel into it containing a single `MetaGraphDef` saved from this -Session. - -> Note: It is your responsibility to garbage-collect old exports. -> Otherwise, successive exports will accumulate under `export_dir_base`. - -### Serve the exported model locally - -For local deployment, you can serve your model using -[TensorFlow Serving](https://github.com/tensorflow/serving), an open-source project that loads a -SavedModel and exposes it as a [gRPC](https://www.grpc.io/) service. - -First, [install TensorFlow Serving](https://github.com/tensorflow/serving). - -Then build and run the local model server, substituting `$export_dir_base` with -the path to the SavedModel you exported above: - -```sh -bazel build //tensorflow_serving/model_servers:tensorflow_model_server -bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server --port=9000 --model_base_path=$export_dir_base -``` - -Now you have a server listening for inference requests via gRPC on port 9000! 
- - -### Request predictions from a local server - -The server responds to gRPC requests according to the -[PredictionService](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis/prediction_service.proto#L15) -gRPC API service definition. (The nested protocol buffers are defined in -various [neighboring files](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/apis)). - -From the API service definition, the gRPC framework generates client libraries -in various languages providing remote access to the API. In a project using the -Bazel build tool, these libraries are built automatically and provided via -dependencies like these (using Python for example): - -```build - deps = [ - "//tensorflow_serving/apis:classification_proto_py_pb2", - "//tensorflow_serving/apis:regression_proto_py_pb2", - "//tensorflow_serving/apis:predict_proto_py_pb2", - "//tensorflow_serving/apis:prediction_service_proto_py_pb2" - ] -``` - -Python client code can then import the libraries thus: - -```py -from tensorflow_serving.apis import classification_pb2 -from tensorflow_serving.apis import regression_pb2 -from tensorflow_serving.apis import predict_pb2 -from tensorflow_serving.apis import prediction_service_pb2 -``` - -> Note: `prediction_service_pb2` defines the service as a whole and so -> is always required. However a typical client will need only one of -> `classification_pb2`, `regression_pb2`, and `predict_pb2`, depending on the -> type of requests being made. - -Sending a gRPC request is then accomplished by assembling a protocol buffer -containing the request data and passing it to the service stub. Note how the -request protocol buffer is created empty and then populated via the -[generated protocol buffer API](https://developers.google.com/protocol-buffers/docs/reference/python-generated). - -```py -from grpc.beta import implementations - -channel = implementations.insecure_channel(host, int(port)) -stub = prediction_service_pb2.beta_create_PredictionService_stub(channel) - -request = classification_pb2.ClassificationRequest() -example = request.input.example_list.examples.add() -example.features.feature['x'].float_list.value.extend(image[0].astype(float)) - -result = stub.Classify(request, 10.0) # 10 secs timeout -``` - -The returned result in this example is a `ClassificationResponse` protocol -buffer. - -This is a skeletal example; please see the @{$deploy$Tensorflow Serving} -documentation and [examples](https://github.com/tensorflow/serving/tree/master/tensorflow_serving/example) -for more details. - -> Note: `ClassificationRequest` and `RegressionRequest` contain a -> `tensorflow.serving.Input` protocol buffer, which in turn contains a list of -> `tensorflow.Example` protocol buffers. `PredictRequest`, by contrast, -> contains a mapping from feature names to values encoded via `TensorProto`. -> Correspondingly: When using the `Classify` and `Regress` APIs, TensorFlow -> Serving feeds serialized `tf.Example`s to the graph, so your -> `serving_input_receiver_fn()` should include a `tf.parse_example()` Op. -> When using the generic `Predict` API, however, TensorFlow Serving feeds raw -> feature data to the graph, so a pass through `serving_input_receiver_fn()` -> should be used. - - - - - - - - - -## CLI to inspect and execute SavedModel - -You can use the SavedModel Command Line Interface (CLI) to inspect and -execute a SavedModel. -For example, you can use the CLI to inspect the model's `SignatureDef`s. 
-The CLI enables you to quickly confirm that the input -@{$tensors$Tensor dtype and shape} match the model. Moreover, if you -want to test your model, you can use the CLI to do a sanity check by -passing in sample inputs in various formats (for example, Python -expressions) and then fetching the output. - - -### Install the SavedModel CLI - -Broadly speaking, you can install TensorFlow in either of the following -two ways: - -* By installing a pre-built TensorFlow binary. -* By building TensorFlow from source code. - -If you installed TensorFlow through a pre-built TensorFlow binary, -then the SavedModel CLI is already installed on your system -at pathname `bin\saved_model_cli`. - -If you built TensorFlow from source code, you must run the following -additional command to build `saved_model_cli`: - -``` -$ bazel build tensorflow/python/tools:saved_model_cli -``` - -### Overview of commands - -The SavedModel CLI supports the following two commands on a -`MetaGraphDef` in a SavedModel: - -* `show`, which shows a computation on a `MetaGraphDef` in a SavedModel. -* `run`, which runs a computation on a `MetaGraphDef`. - - -### `show` command - -A SavedModel contains one or more `MetaGraphDef`s, identified by their tag-sets. -To serve a model, you -might wonder what kind of `SignatureDef`s are in each model, and what are their -inputs and outputs. The `show` command let you examine the contents of the -SavedModel in hierarchical order. Here's the syntax: - -``` -usage: saved_model_cli show [-h] --dir DIR [--all] -[--tag_set TAG_SET] [--signature_def SIGNATURE_DEF_KEY] -``` - -For example, the following command shows all available -MetaGraphDef tag-sets in the SavedModel: - -``` -$ saved_model_cli show --dir /tmp/saved_model_dir -The given SavedModel contains the following tag-sets: -serve -serve, gpu -``` - -The following command shows all available `SignatureDef` keys in -a `MetaGraphDef`: - -``` -$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve -The given SavedModel `MetaGraphDef` contains `SignatureDefs` with the -following keys: -SignatureDef key: "classify_x2_to_y3" -SignatureDef key: "classify_x_to_y" -SignatureDef key: "regress_x2_to_y3" -SignatureDef key: "regress_x_to_y" -SignatureDef key: "regress_x_to_y2" -SignatureDef key: "serving_default" -``` - -If a `MetaGraphDef` has *multiple* tags in the tag-set, you must specify -all tags, each tag separated by a comma. For example: - -```none -$ saved_model_cli show --dir /tmp/saved_model_dir --tag_set serve,gpu -``` - -To show all inputs and outputs TensorInfo for a specific `SignatureDef`, pass in -the `SignatureDef` key to `signature_def` option. This is very useful when you -want to know the tensor key value, dtype and shape of the input tensors for -executing the computation graph later. For example: - -``` -$ saved_model_cli show --dir \ -/tmp/saved_model_dir --tag_set serve --signature_def serving_default -The given SavedModel SignatureDef contains the following input(s): - inputs['x'] tensor_info: - dtype: DT_FLOAT - shape: (-1, 1) - name: x:0 -The given SavedModel SignatureDef contains the following output(s): - outputs['y'] tensor_info: - dtype: DT_FLOAT - shape: (-1, 1) - name: y:0 -Method name is: tensorflow/serving/predict -``` - -To show all available information in the SavedModel, use the `--all` option. 
For example:
-
-```none
-$ saved_model_cli show --dir /tmp/saved_model_dir --all
-MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:
-
-signature_def['classify_x2_to_y3']:
-  The given SavedModel SignatureDef contains the following input(s):
-    inputs['inputs'] tensor_info:
-        dtype: DT_FLOAT
-        shape: (-1, 1)
-        name: x2:0
-  The given SavedModel SignatureDef contains the following output(s):
-    outputs['scores'] tensor_info:
-        dtype: DT_FLOAT
-        shape: (-1, 1)
-        name: y3:0
-  Method name is: tensorflow/serving/classify
-
-...
-
-signature_def['serving_default']:
-  The given SavedModel SignatureDef contains the following input(s):
-    inputs['x'] tensor_info:
-        dtype: DT_FLOAT
-        shape: (-1, 1)
-        name: x:0
-  The given SavedModel SignatureDef contains the following output(s):
-    outputs['y'] tensor_info:
-        dtype: DT_FLOAT
-        shape: (-1, 1)
-        name: y:0
-  Method name is: tensorflow/serving/predict
-```
-
-
-### `run` command
-
-Invoke the `run` command to run a graph computation, passing
-inputs and then displaying (and optionally saving) the outputs.
-Here's the syntax:
-
-```
-usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def
-                           SIGNATURE_DEF_KEY [--inputs INPUTS]
-                           [--input_exprs INPUT_EXPRS] [--outdir OUTDIR]
-                           [--overwrite] [--tf_debug]
-```
-
-The `run` command provides the following three ways to pass inputs to the model:
-
-* The `--inputs` option enables you to pass a numpy ndarray stored in files.
-* The `--input_exprs` option enables you to pass Python expressions.
-* The `--input_examples` option enables you to pass `tf.train.Example` protocol buffers.
-
-
-#### `--inputs`
-
-To pass input data in files, specify the `--inputs` option, which takes the
-following general format:
-
-```bsh
---inputs <INPUTS>
-```
-
-where *INPUTS* is either of the following formats:
-
-* `<input_key>=<filename>`
-* `<input_key>=<filename>[<variable_name>]`
-
-You may pass multiple *INPUTS*. If you do pass multiple inputs, use a semicolon
-to separate each of the *INPUTS*.
-
-`saved_model_cli` uses `numpy.load` to load the *filename*.
-The *filename* may be in any of the following formats:
-
-* `.npy`
-* `.npz`
-* pickle format
-
-A `.npy` file always contains a numpy ndarray. Therefore, when loading from
-a `.npy` file, the content will be directly assigned to the specified input
-tensor. If you specify a *variable_name* with that `.npy` file, the
-*variable_name* will be ignored and a warning will be issued.
-
-When loading from a `.npz` (zip) file, you may optionally specify a
-*variable_name* to identify the variable within the zip file to load for
-the input tensor key. If you don't specify a *variable_name*, the SavedModel
-CLI will check that only one file is included in the zip file and load it
-for the specified input tensor key.
-
-When loading from a pickle file, if no *variable_name* is specified in the
-square brackets, whatever is inside the pickle file will be passed to the
-specified input tensor key. Otherwise, the SavedModel CLI will assume a
-dictionary is stored in the pickle file and the value corresponding to
-the *variable_name* will be used.
-
-
-#### `--input_exprs`
-
-To pass inputs through Python expressions, specify the `--input_exprs` option.
-This can be useful when you don't have data
-files on hand, but still want to sanity check the model with some simple
-inputs that match the dtype and shape of the model's `SignatureDef`s.
-For example:
-
-```bsh
-`<input_key>=[[1],[2],[3]]`
-```
-
-In addition to Python expressions, you may also pass numpy functions.
-For example:
-
-```bsh
-`<input_key>=np.ones((32,32,3))`
-```
-
-(Note that the `numpy` module is already available to you as `np`.)
-
-
-#### `--input_examples`
-
-To pass `tf.train.Example` as inputs, specify the `--input_examples` option.
-For each input key, it takes a list of dictionaries, where each dictionary is an
-instance of `tf.train.Example`. The dictionary keys are the features and the
-values are the value lists for each feature.
-For example:
-
-```bsh
-`<input_key>=[{"age":[22,24],"education":["BS","MS"]}]`
-```
-
-#### Save output
-
-By default, the SavedModel CLI writes output to stdout. If a directory is
-passed to the `--outdir` option, the outputs will be saved as npy files named
-after output tensor keys under the given directory.
-
-Use `--overwrite` to overwrite existing output files.
-
-
-#### TensorFlow debugger (tfdbg) integration
-
-If the `--tf_debug` option is set, the SavedModel CLI will use the
-TensorFlow Debugger (tfdbg) to watch the intermediate Tensors and runtime
-graphs or subgraphs while running the SavedModel.
-
-
-#### Full examples of `run`
-
-Given:
-
-* Your model simply adds `x1` and `x2` to get output `y`.
-* All tensors in the model have shape `(-1, 1)`.
-* You have two `npy` files:
-  * `/tmp/my_data1.npy`, which contains a numpy ndarray `[[1], [2], [3]]`.
-  * `/tmp/my_data2.npy`, which contains another numpy
-    ndarray `[[0.5], [0.5], [0.5]]`.
-
-To run these two `npy` files through the model to get output `y`, issue
-the following command:
-
-```
-$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
---signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npy;x2=/tmp/my_data2.npy \
---outdir /tmp/out
-Result for output key y:
-[[ 1.5]
- [ 2.5]
- [ 3.5]]
-```
-
-Let's change the preceding example slightly. This time, instead of two
-`.npy` files, you now have an `.npz` file and a pickle file. Furthermore,
-you want to overwrite any existing output file. Here's the command:
-
-```
-$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
---signature_def x1_x2_to_y \
---inputs x1=/tmp/my_data1.npz[x];x2=/tmp/my_data2.pkl --outdir /tmp/out \
---overwrite
-Result for output key y:
-[[ 1.5]
- [ 2.5]
- [ 3.5]]
-```
-
-You may specify a Python expression instead of an input file. For example,
-the following command replaces input `x2` with a Python expression:
-
-```
-$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
---signature_def x1_x2_to_y --inputs x1=/tmp/my_data1.npz[x] \
---input_exprs 'x2=np.ones((3,1))'
-Result for output key y:
-[[ 2]
- [ 3]
- [ 4]]
-```
-
-To run the model with the TensorFlow Debugger on, issue the
-following command:
-
-```
-$ saved_model_cli run --dir /tmp/saved_model_dir --tag_set serve \
---signature_def serving_default --inputs x=/tmp/data.npz[x] --tf_debug
-```
-
-
-
-## Structure of a SavedModel directory
-
-When you save a model in SavedModel format, TensorFlow creates
-a SavedModel directory consisting of the following subdirectories
-and files:
-
-```bsh
-assets/
-assets.extra/
-variables/
-    variables.data-?????-of-?????
-    variables.index
-saved_model.pb|saved_model.pbtxt
-```
-
-where:
-
-* `assets` is a subfolder containing auxiliary (external) files,
-  such as vocabularies. Assets are copied to the SavedModel location
-  and can be read when loading a specific `MetaGraphDef`.
-* `assets.extra` is a subfolder where higher-level libraries and users can
-  add their own assets that co-exist with the model, but are not loaded by
-  the graph. This subfolder is not managed by the SavedModel libraries.
-* `variables` is a subfolder that includes output from - `tf.train.Saver`. -* `saved_model.pb` or `saved_model.pbtxt` is the SavedModel protocol buffer. - It includes the graph definitions as `MetaGraphDef` protocol buffers. - -A single SavedModel can represent multiple graphs. In this case, all the -graphs in the SavedModel share a *single* set of checkpoints (variables) -and assets. For example, the following diagram shows one SavedModel -containing three `MetaGraphDef`s, all three of which share the same set -of checkpoints and assets: - -![SavedModel represents checkpoints, assets, and one or more MetaGraphDefs](../images/SavedModel.svg) - -Each graph is associated with a specific set of tags, which enables -identification during a load or restore operation. diff --git a/tensorflow/docs_src/programmers_guide/summaries_and_tensorboard.md b/tensorflow/docs_src/programmers_guide/summaries_and_tensorboard.md deleted file mode 100644 index fadfa03e78..0000000000 --- a/tensorflow/docs_src/programmers_guide/summaries_and_tensorboard.md +++ /dev/null @@ -1,225 +0,0 @@ -# TensorBoard: Visualizing Learning - -The computations you'll use TensorFlow for - like training a massive -deep neural network - can be complex and confusing. To make it easier to -understand, debug, and optimize TensorFlow programs, we've included a suite of -visualization tools called TensorBoard. You can use TensorBoard to visualize -your TensorFlow graph, plot quantitative metrics about the execution of your -graph, and show additional data like images that pass through it. When -TensorBoard is fully configured, it looks like this: - -![MNIST TensorBoard](https://www.tensorflow.org/images/mnist_tensorboard.png "MNIST TensorBoard") - -
- -
- -This 30-minute tutorial is intended to get you started with simple TensorBoard -usage. It assumes a basic understanding of TensorFlow. - -There are other resources available as well! The [TensorBoard GitHub](https://github.com/tensorflow/tensorboard) -has a lot more information on using individual dashboards within TensorBoard -including tips & tricks and debugging information. - -## Setup - -[Install TensorFlow](https://www.tensorflow.org/install/). Installing TensorFlow -via pip should also automatically install TensorBoard. - -## Serializing the data - -TensorBoard operates by reading TensorFlow events files, which contain summary -data that you can generate when running TensorFlow. Here's the general -lifecycle for summary data within TensorBoard. - -First, create the TensorFlow graph that you'd like to collect summary -data from, and decide which nodes you would like to annotate with -@{$python/summary$summary operations}. - -For example, suppose you are training a convolutional neural network for -recognizing MNIST digits. You'd like to record how the learning rate -varies over time, and how the objective function is changing. Collect these by -attaching @{tf.summary.scalar} ops -to the nodes that output the learning rate and loss respectively. Then, give -each `scalar_summary` a meaningful `tag`, like `'learning rate'` or `'loss -function'`. - -Perhaps you'd also like to visualize the distributions of activations coming -off a particular layer, or the distribution of gradients or weights. Collect -this data by attaching -@{tf.summary.histogram} ops to -the gradient outputs and to the variable that holds your weights, respectively. - -For details on all of the summary operations available, check out the docs on -@{$python/summary$summary operations}. - -Operations in TensorFlow don't do anything until you run them, or an op that -depends on their output. And the summary nodes that we've just created are -peripheral to your graph: none of the ops you are currently running depend on -them. So, to generate summaries, we need to run all of these summary nodes. -Managing them by hand would be tedious, so use -@{tf.summary.merge_all} -to combine them into a single op that generates all the summary data. - -Then, you can just run the merged summary op, which will generate a serialized -`Summary` protobuf object with all of your summary data at a given step. -Finally, to write this summary data to disk, pass the summary protobuf to a -@{tf.summary.FileWriter}. - -The `FileWriter` takes a logdir in its constructor - this logdir is quite -important, it's the directory where all of the events will be written out. -Also, the `FileWriter` can optionally take a `Graph` in its constructor. -If it receives a `Graph` object, then TensorBoard will visualize your graph -along with tensor shape information. This will give you a much better sense of -what flows through the graph: see -@{$graph_viz#tensor-shape-information$Tensor shape information}. - -Now that you've modified your graph and have a `FileWriter`, you're ready to -start running your network! If you want, you could run the merged summary op -every single step, and record a ton of training data. That's likely to be more -data than you need, though. Instead, consider running the merged summary op -every `n` steps. 
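-
-To make the lifecycle above concrete before diving into the full MNIST example,
-here is a minimal, self-contained sketch; the log directory
-`/tmp/summary_example` and the fabricated `loss` placeholder are illustrative
-assumptions, not part of the tutorial code:
-
-```python
-import tensorflow as tf
-
-# Annotate a value with a scalar summary, merge all summaries into a single
-# op, and write the serialized `Summary` protobufs to disk with a FileWriter.
-loss = tf.placeholder(tf.float32, name='loss')
-tf.summary.scalar('loss', loss)
-
-merged = tf.summary.merge_all()
-with tf.Session() as sess:
-  writer = tf.summary.FileWriter('/tmp/summary_example', sess.graph)
-  for step in range(100):
-    if step % 10 == 0:  # run the merged summary op every n steps
-      summ = sess.run(merged, feed_dict={loss: 1.0 / (step + 1)})
-      writer.add_summary(summ, global_step=step)
-  writer.close()
-```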
-
-The code example below is a modification of the
-[simple MNIST tutorial](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/mnist/mnist.py),
-in which we have added some summary ops, and run them every ten steps. If you
-run this and then launch `tensorboard --logdir=/tmp/tensorflow/mnist`, you'll be able
-to visualize statistics, such as how the weights or accuracy varied during
-training. The code below is an excerpt; full source is
-[here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py).
-
-```python
-def variable_summaries(var):
-  """Attach a lot of summaries to a Tensor (for TensorBoard visualization)."""
-  with tf.name_scope('summaries'):
-    mean = tf.reduce_mean(var)
-    tf.summary.scalar('mean', mean)
-    with tf.name_scope('stddev'):
-      stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
-    tf.summary.scalar('stddev', stddev)
-    tf.summary.scalar('max', tf.reduce_max(var))
-    tf.summary.scalar('min', tf.reduce_min(var))
-    tf.summary.histogram('histogram', var)
-
-def nn_layer(input_tensor, input_dim, output_dim, layer_name, act=tf.nn.relu):
-  """Reusable code for making a simple neural net layer.
-
-  It does a matrix multiply, bias add, and then uses relu to nonlinearize.
-  It also sets up name scoping so that the resultant graph is easy to read,
-  and adds a number of summary ops.
-  """
-  # Adding a name scope ensures logical grouping of the layers in the graph.
-  with tf.name_scope(layer_name):
-    # This Variable will hold the state of the weights for the layer
-    with tf.name_scope('weights'):
-      weights = weight_variable([input_dim, output_dim])
-      variable_summaries(weights)
-    with tf.name_scope('biases'):
-      biases = bias_variable([output_dim])
-      variable_summaries(biases)
-    with tf.name_scope('Wx_plus_b'):
-      preactivate = tf.matmul(input_tensor, weights) + biases
-      tf.summary.histogram('pre_activations', preactivate)
-    activations = act(preactivate, name='activation')
-    tf.summary.histogram('activations', activations)
-    return activations
-
-hidden1 = nn_layer(x, 784, 500, 'layer1')
-
-with tf.name_scope('dropout'):
-  keep_prob = tf.placeholder(tf.float32)
-  tf.summary.scalar('dropout_keep_probability', keep_prob)
-  dropped = tf.nn.dropout(hidden1, keep_prob)
-
-# Do not apply softmax activation yet, see below.
-y = nn_layer(dropped, 500, 10, 'layer2', act=tf.identity)
-
-with tf.name_scope('cross_entropy'):
-  # The raw formulation of cross-entropy,
-  #
-  # tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.nn.softmax(y)),
-  #                               reduction_indices=[1]))
-  #
-  # can be numerically unstable.
-  #
-  # So here we use tf.losses.sparse_softmax_cross_entropy on the
-  # raw logit outputs of the nn_layer above.
-  with tf.name_scope('total'):
-    cross_entropy = tf.losses.sparse_softmax_cross_entropy(labels=y_, logits=y)
-tf.summary.scalar('cross_entropy', cross_entropy)
-
-with tf.name_scope('train'):
-  train_step = tf.train.AdamOptimizer(FLAGS.learning_rate).minimize(
-      cross_entropy)
-
-with tf.name_scope('accuracy'):
-  with tf.name_scope('correct_prediction'):
-    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
-  with tf.name_scope('accuracy'):
-    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
-tf.summary.scalar('accuracy', accuracy)
-
-# Merge all the summaries and write them out to /tmp/mnist_logs (by default)
-merged = tf.summary.merge_all()
-train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
-                                     sess.graph)
-test_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/test')
-tf.global_variables_initializer().run()
-```
-
-After we've initialized the `FileWriter`s, we have to add summaries to them
-as we train and test the model.
-
-```python
-# Train the model, and also write summaries.
-# Every 10th step, measure test-set accuracy, and write test summaries
-# All other steps, run train_step on training data, & add training summaries
-
-def feed_dict(train):
-  """Make a TensorFlow feed_dict: maps data onto Tensor placeholders."""
-  if train or FLAGS.fake_data:
-    xs, ys = mnist.train.next_batch(100, fake_data=FLAGS.fake_data)
-    k = FLAGS.dropout
-  else:
-    xs, ys = mnist.test.images, mnist.test.labels
-    k = 1.0
-  return {x: xs, y_: ys, keep_prob: k}
-
-for i in range(FLAGS.max_steps):
-  if i % 10 == 0:  # Record summaries and test-set accuracy
-    summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
-    test_writer.add_summary(summary, i)
-    print('Accuracy at step %s: %s' % (i, acc))
-  else:  # Record train set summaries, and train
-    summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
-    train_writer.add_summary(summary, i)
-```
-
-You're now all set to visualize this data using TensorBoard.
-
-
-## Launching TensorBoard
-
-To run TensorBoard, use the following command (alternatively `python -m
-tensorboard.main`):
-
-```bash
-tensorboard --logdir=path/to/log-directory
-```
-
-where `logdir` points to the directory where the `FileWriter` serialized its
-data. If this `logdir` directory contains subdirectories which contain
-serialized data from separate runs, then TensorBoard will visualize the data
-from all of those runs. Once TensorBoard is running, navigate your web browser
-to `localhost:6006` to view TensorBoard.
-
-When looking at TensorBoard, you will see the navigation tabs in the top right
-corner. Each tab represents a set of serialized data that can be visualized.
-
-For in-depth information on how to use the *graph* tab to visualize your graph,
-see @{$graph_viz$TensorBoard: Graph Visualization}.
-
-For more usage information on TensorBoard in general, see the
-[TensorBoard GitHub](https://github.com/tensorflow/tensorboard).
diff --git a/tensorflow/docs_src/programmers_guide/tensorboard_histograms.md b/tensorflow/docs_src/programmers_guide/tensorboard_histograms.md
deleted file mode 100644
index 918deda190..0000000000
--- a/tensorflow/docs_src/programmers_guide/tensorboard_histograms.md
+++ /dev/null
@@ -1,245 +0,0 @@
-# TensorBoard Histogram Dashboard
-
-The TensorBoard Histogram Dashboard displays how the distribution of some
-`Tensor` in your TensorFlow graph has changed over time. It does this by showing
-many histogram visualizations of your tensor at different points in time. 
- -## A Basic Example - -Let's start with a simple case: a normally-distributed variable, where the mean -shifts over time. -TensorFlow has an op -[`tf.random_normal`](https://www.tensorflow.org/api_docs/python/tf/random_normal) -which is perfect for this purpose. As is usually the case with TensorBoard, we -will ingest data using a summary op; in this case, -['tf.summary.histogram'](https://www.tensorflow.org/api_docs/python/tf/summary/histogram). -For a primer on how summaries work, please see the general -[TensorBoard tutorial](https://www.tensorflow.org/get_started/summaries_and_tensorboard). - -Here is a code snippet that will generate some histogram summaries containing -normally distributed data, where the mean of the distribution increases over -time. - -```python -import tensorflow as tf - -k = tf.placeholder(tf.float32) - -# Make a normal distribution, with a shifting mean -mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1) -# Record that distribution into a histogram summary -tf.summary.histogram("normal/moving_mean", mean_moving_normal) - -# Setup a session and summary writer -sess = tf.Session() -writer = tf.summary.FileWriter("/tmp/histogram_example") - -summaries = tf.summary.merge_all() - -# Setup a loop and write the summaries to disk -N = 400 -for step in range(N): - k_val = step/float(N) - summ = sess.run(summaries, feed_dict={k: k_val}) - writer.add_summary(summ, global_step=step) -``` - -Once that code runs, we can load the data into TensorBoard via the command line: - - -```sh -tensorboard --logdir=/tmp/histogram_example -``` - -Once TensorBoard is running, load it in Chrome or Firefox and navigate to the -Histogram Dashboard. Then we can see a histogram visualization for our normally -distributed data. - -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/1_moving_mean.png) - -`tf.summary.histogram` takes an arbitrarily sized and shaped Tensor, and -compresses it into a histogram data structure consisting of many bins with -widths and counts. For example, let's say we want to organize the numbers -`[0.5, 1.1, 1.3, 2.2, 2.9, 2.99]` into bins. We could make three bins: -* a bin -containing everything from 0 to 1 (it would contain one element, 0.5), -* a bin -containing everything from 1-2 (it would contain two elements, 1.1 and 1.3), -* a bin containing everything from 2-3 (it would contain three elements: 2.2, -2.9 and 2.99). - -TensorFlow uses a similar approach to create bins, but unlike in our example, it -doesn't create integer bins. For large, sparse datasets, that might result in -many thousands of bins. -Instead, [the bins are exponentially distributed, with many bins close to 0 and -comparatively few bins for very large numbers.](https://github.com/tensorflow/tensorflow/blob/c8b59c046895fa5b6d79f73e0b5817330fcfbfc1/tensorflow/core/lib/histogram/histogram.cc#L28) -However, visualizing exponentially-distributed bins is tricky; if height is used -to encode count, then wider bins take more space, even if they have the same -number of elements. Conversely, encoding count in the area makes height -comparisons impossible. Instead, the histograms [resample the data](https://github.com/tensorflow/tensorflow/blob/17c47804b86e340203d451125a721310033710f1/tensorflow/tensorboard/components/tf_backend/backend.ts#L400) -into uniform bins. This can lead to unfortunate artifacts in some cases. - -Each slice in the histogram visualizer displays a single histogram. -The slices are organized by step; -older slices (e.g. 
step 0) are further "back" and darker, while newer slices
-(e.g. step 400) are close to the foreground, and lighter in color.
-The y-axis on the right shows the step number.
-
-You can mouse over the histogram to see tooltips with some more detailed
-information. For example, in the following image we can see that the histogram
-at timestep 176 has a bin centered at 2.25 with 177 elements in that bin.
-
-![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/2_moving_mean_tooltip.png)
-
-Also, you may note that the histogram slices are not always evenly spaced in
-step count or time. This is because TensorBoard uses
-[reservoir sampling](https://en.wikipedia.org/wiki/Reservoir_sampling) to keep a
-subset of all the histograms, to save on memory. Reservoir sampling guarantees
-that every sample has an equal likelihood of being included, but because it is
-a randomized algorithm, the samples chosen don't occur at even steps.
-
-## Overlay Mode
-
-There is a control on the left of the dashboard that allows you to toggle the
-histogram mode from "offset" to "overlay":
-
-![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/3_overlay_offset.png)
-
-In "overlay" mode, the visualization rotates 45 degrees, so that the individual
-histogram slices are no longer spread out in time, but instead are all plotted
-on the same y-axis.
-
-![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/4_overlay.png)
-
-Now, each slice is a separate line on the chart, and the y-axis shows the item
-count within each bucket. Darker lines are older, earlier steps, and lighter
-lines are more recent, later steps. Once again, you can mouse over the chart to
-see some additional information.
-
-![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/5_overlay_tooltips.png)
-
-In general, the overlay visualization is useful if you want to directly compare
-the counts of different histograms.
-
-## Multimodal Distributions
-
-The Histogram Dashboard is great for visualizing multimodal
-distributions. Let's construct a simple bimodal distribution by concatenating
-the outputs from two different normal distributions. The code will look like
-this:
-
-```python
-import tensorflow as tf
-
-k = tf.placeholder(tf.float32)
-
-# Make a normal distribution, with a shifting mean
-mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1)
-# Record that distribution into a histogram summary
-tf.summary.histogram("normal/moving_mean", mean_moving_normal)
-
-# Make a normal distribution with shrinking variance
-variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k))
-# Record that distribution too
-tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal)
-
-# Let's combine both of those distributions into one dataset
-normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0)
-# We add another histogram summary to record the combined distribution
-tf.summary.histogram("normal/bimodal", normal_combined)
-
-summaries = tf.summary.merge_all()
-
-# Setup a session and summary writer
-sess = tf.Session()
-writer = tf.summary.FileWriter("/tmp/histogram_example")
-
-# Setup a loop and write the summaries to disk
-N = 400
-for step in range(N):
-  k_val = step/float(N)
-  summ = sess.run(summaries, feed_dict={k: k_val})
-  writer.add_summary(summ, global_step=step)
-```
-
-You'll recognize our "moving mean" normal distribution from the example
-above. Now we also have a "shrinking variance" distribution. 
Side-by-side, they -look like this: -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/6_two_distributions.png) - -When we concatenate them, we get a chart that clearly reveals the divergent, -bimodal structure: -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/7_bimodal.png) - -## Some more distributions - -Just for fun, let's generate and visualize a few more distributions, and then -combine them all into one chart. Here's the code we'll use: - -```python -import tensorflow as tf - -k = tf.placeholder(tf.float32) - -# Make a normal distribution, with a shifting mean -mean_moving_normal = tf.random_normal(shape=[1000], mean=(5*k), stddev=1) -# Record that distribution into a histogram summary -tf.summary.histogram("normal/moving_mean", mean_moving_normal) - -# Make a normal distribution with shrinking variance -variance_shrinking_normal = tf.random_normal(shape=[1000], mean=0, stddev=1-(k)) -# Record that distribution too -tf.summary.histogram("normal/shrinking_variance", variance_shrinking_normal) - -# Let's combine both of those distributions into one dataset -normal_combined = tf.concat([mean_moving_normal, variance_shrinking_normal], 0) -# We add another histogram summary to record the combined distribution -tf.summary.histogram("normal/bimodal", normal_combined) - -# Add a gamma distribution -gamma = tf.random_gamma(shape=[1000], alpha=k) -tf.summary.histogram("gamma", gamma) - -# And a poisson distribution -poisson = tf.random_poisson(shape=[1000], lam=k) -tf.summary.histogram("poisson", poisson) - -# And a uniform distribution -uniform = tf.random_uniform(shape=[1000], maxval=k*10) -tf.summary.histogram("uniform", uniform) - -# Finally, combine everything together! -all_distributions = [mean_moving_normal, variance_shrinking_normal, - gamma, poisson, uniform] -all_combined = tf.concat(all_distributions, 0) -tf.summary.histogram("all_combined", all_combined) - -summaries = tf.summary.merge_all() - -# Setup a session and summary writer -sess = tf.Session() -writer = tf.summary.FileWriter("/tmp/histogram_example") - -# Setup a loop and write the summaries to disk -N = 400 -for step in range(N): - k_val = step/float(N) - summ = sess.run(summaries, feed_dict={k: k_val}) - writer.add_summary(summ, global_step=step) -``` -### Gamma Distribution -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/8_gamma.png) - -### Uniform Distribution -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/9_uniform.png) - -### Poisson Distribution -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/10_poisson.png) -The poisson distribution is defined over the integers. So, all of the values -being generated are perfect integers. The histogram compression moves the data -into floating-point bins, causing the visualization to show little -bumps over the integer values rather than perfect spikes. - -### All Together Now -Finally, we can concatenate all of the data into one funny-looking curve. -![](https://www.tensorflow.org/images/tensorboard/histogram_dashboard/11_all_combined.png) - diff --git a/tensorflow/docs_src/programmers_guide/tensors.md b/tensorflow/docs_src/programmers_guide/tensors.md deleted file mode 100644 index 1248c3cabe..0000000000 --- a/tensorflow/docs_src/programmers_guide/tensors.md +++ /dev/null @@ -1,330 +0,0 @@ -# Tensors - -TensorFlow, as the name indicates, is a framework to define and run computations -involving tensors. 
A **tensor** is a generalization of vectors and matrices to -potentially higher dimensions. Internally, TensorFlow represents tensors as -n-dimensional arrays of base datatypes. - -When writing a TensorFlow program, the main object you manipulate and pass -around is the `tf.Tensor`. A `tf.Tensor` object represents a partially defined -computation that will eventually produce a value. TensorFlow programs work by -first building a graph of `tf.Tensor` objects, detailing how each tensor is -computed based on the other available tensors and then by running parts of this -graph to achieve the desired results. - -A `tf.Tensor` has the following properties: - - * a data type (`float32`, `int32`, or `string`, for example) - * a shape - - -Each element in the Tensor has the same data type, and the data type is always -known. The shape (that is, the number of dimensions it has and the size of each -dimension) might be only partially known. Most operations produce tensors of -fully-known shapes if the shapes of their inputs are also fully known, but in -some cases it's only possible to find the shape of a tensor at graph execution -time. - -Some types of tensors are special, and these will be covered in other -units of the Programmer's guide. The main ones are: - - * `tf.Variable` - * `tf.constant` - * `tf.placeholder` - * `tf.SparseTensor` - -With the exception of `tf.Variable`, the value of a tensor is immutable, which -means that in the context of a single execution tensors only have a single -value. However, evaluating the same tensor twice can return different values; -for example that tensor can be the result of reading data from disk, or -generating a random number. - -## Rank - -The **rank** of a `tf.Tensor` object is its number of dimensions. Synonyms for -rank include **order** or **degree** or **n-dimension**. -Note that rank in TensorFlow is not the same as matrix rank in mathematics. -As the following table shows, each rank in TensorFlow corresponds to a -different mathematical entity: - -Rank | Math entity ---- | --- -0 | Scalar (magnitude only) -1 | Vector (magnitude and direction) -2 | Matrix (table of numbers) -3 | 3-Tensor (cube of numbers) -n | n-Tensor (you get the idea) - - -### Rank 0 - -The following snippet demonstrates creating a few rank 0 variables: - -```python -mammal = tf.Variable("Elephant", tf.string) -ignition = tf.Variable(451, tf.int16) -floating = tf.Variable(3.14159265359, tf.float64) -its_complicated = tf.Variable(12.3 - 4.85j, tf.complex64) -``` - -Note: A string is treated as a single item in TensorFlow, not as a sequence of -characters. It is possible to have scalar strings, vectors of strings, etc. - -### Rank 1 - -To create a rank 1 `tf.Tensor` object, you can pass a list of items as the -initial value. 
For example: - -```python -mystr = tf.Variable(["Hello"], tf.string) -cool_numbers = tf.Variable([3.14159, 2.71828], tf.float32) -first_primes = tf.Variable([2, 3, 5, 7, 11], tf.int32) -its_very_complicated = tf.Variable([12.3 - 4.85j, 7.5 - 6.23j], tf.complex64) -``` - - -### Higher ranks - -A rank 2 `tf.Tensor` object consists of at least one row and at least -one column: - -```python -mymat = tf.Variable([[7],[11]], tf.int16) -myxor = tf.Variable([[False, True],[True, False]], tf.bool) -linear_squares = tf.Variable([[4], [9], [16], [25]], tf.int32) -squarish_squares = tf.Variable([ [4, 9], [16, 25] ], tf.int32) -rank_of_squares = tf.rank(squarish_squares) -mymatC = tf.Variable([[7],[11]], tf.int32) -``` - -Higher-rank Tensors, similarly, consist of an n-dimensional array. For example, -during image processing, many tensors of rank 4 are used, with dimensions -corresponding to example-in-batch, image width, image height, and color channel. - -``` python -my_image = tf.zeros([10, 299, 299, 3]) # batch x height x width x color -``` - -### Getting a `tf.Tensor` object's rank - -To determine the rank of a `tf.Tensor` object, call the `tf.rank` method. -For example, the following method programmatically determines the rank -of the `tf.Tensor` defined in the previous section: - -```python -r = tf.rank(my_image) -# After the graph runs, r will hold the value 4. -``` - -### Referring to `tf.Tensor` slices - -Since a `tf.Tensor` is an n-dimensional array of cells, to access a single cell -in a `tf.Tensor` you need to specify n indices. - -For a rank 0 tensor (a scalar), no indices are necessary, since it is already a -single number. - -For a rank 1 tensor (a vector), passing a single index allows you to access a -number: - -```python -my_scalar = my_vector[2] -``` - -Note that the index passed inside the `[]` can itself be a scalar `tf.Tensor`, if -you want to dynamically choose an element from the vector. - -For tensors of rank 2 or higher, the situation is more interesting. For a -`tf.Tensor` of rank 2, passing two numbers returns a scalar, as expected: - - -```python -my_scalar = my_matrix[1, 2] -``` - - -Passing a single number, however, returns a subvector of a matrix, as follows: - - -```python -my_row_vector = my_matrix[2] -my_column_vector = my_matrix[:, 3] -``` - -The `:` notation is python slicing syntax for "leave this dimension alone". This -is useful in higher-rank Tensors, as it allows you to access its subvectors, -submatrices, and even other subtensors. - - -## Shape - -The **shape** of a tensor is the number of elements in each dimension. -TensorFlow automatically infers shapes during graph construction. These inferred -shapes might have known or unknown rank. If the rank is known, the sizes of each -dimension might be known or unknown. - -The TensorFlow documentation uses three notational conventions to describe -tensor dimensionality: rank, shape, and dimension number. The following table -shows how these relate to one another: - -Rank | Shape | Dimension number | Example ---- | --- | --- | --- -0 | [] | 0-D | A 0-D tensor. A scalar. -1 | [D0] | 1-D | A 1-D tensor with shape [5]. -2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4]. -3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3]. -n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... Dn-1]. - -Shapes can be represented via Python lists / tuples of ints, or with the -@{tf.TensorShape}. - -### Getting a `tf.Tensor` object's shape - -There are two ways of accessing the shape of a `tf.Tensor`. 
While building the -graph, it is often useful to ask what is already known about a tensor's -shape. This can be done by reading the `shape` property of a `tf.Tensor` object. -This method returns a `TensorShape` object, which is a convenient way of -representing partially-specified shapes (since, when building the graph, not all -shapes will be fully known). - -It is also possible to get a `tf.Tensor` that will represent the fully-defined -shape of another `tf.Tensor` at runtime. This is done by calling the `tf.shape` -operation. This way, you can build a graph that manipulates the shapes of -tensors by building other tensors that depend on the dynamic shape of the input -`tf.Tensor`. - -For example, here is how to make a vector of zeros with the same size as the -number of columns in a given matrix: - -``` python -zeros = tf.zeros(my_matrix.shape[1]) -``` - -### Changing the shape of a `tf.Tensor` - -The **number of elements** of a tensor is the product of the sizes of all its -shapes. The number of elements of a scalar is always `1`. Since there are often -many different shapes that have the same number of elements, it's often -convenient to be able to change the shape of a `tf.Tensor`, keeping its elements -fixed. This can be done with `tf.reshape`. - -The following examples demonstrate how to reshape tensors: - -```python -rank_three_tensor = tf.ones([3, 4, 5]) -matrix = tf.reshape(rank_three_tensor, [6, 10]) # Reshape existing content into - # a 6x10 matrix -matrixB = tf.reshape(matrix, [3, -1]) # Reshape existing content into a 3x20 - # matrix. -1 tells reshape to calculate - # the size of this dimension. -matrixAlt = tf.reshape(matrixB, [4, 3, -1]) # Reshape existing content into a - #4x3x5 tensor - -# Note that the number of elements of the reshaped Tensors has to match the -# original number of elements. Therefore, the following example generates an -# error because no possible value for the last dimension will match the number -# of elements. -yet_another = tf.reshape(matrixAlt, [13, 2, -1]) # ERROR! -``` - -## Data types - -In addition to dimensionality, Tensors have a data type. Refer to the -`tf.DataType` page in the programmer's guide for a full list of the data types. - -It is not possible to have a `tf.Tensor` with more than one data type. It is -possible, however, to serialize arbitrary data structures as `string`s and store -those in `tf.Tensor`s. - -It is possible to cast `tf.Tensor`s from one datatype to another using -`tf.cast`: - -``` python -# Cast a constant integer tensor into floating point. -float_tensor = tf.cast(tf.constant([1, 2, 3]), dtype=tf.float32) -``` - -To inspect a `tf.Tensor`'s data type use the `Tensor.dtype` property. - -When creating a `tf.Tensor` from a python object you may optionally specify the -datatype. If you don't, TensorFlow chooses a datatype that can represent your -data. TensorFlow converts Python integers to `tf.int32` and python floating -point numbers to `tf.float32`. Otherwise TensorFlow uses the same rules numpy -uses when converting to arrays. - -## Evaluating Tensors - -Once the computation graph has been built, you can run the computation that -produces a particular `tf.Tensor` and fetch the value assigned to it. This is -often useful for debugging as well as being required for much of TensorFlow to -work. - -The simplest way to evaluate a Tensor is using the `Tensor.eval` method. 
For
-example:
-
-```python
-constant = tf.constant([1, 2, 3])
-tensor = constant * constant
-print(tensor.eval())
-```
-
-The `eval` method only works when a default `tf.Session` is active (see
-Graphs and Sessions for more information).
-
-`Tensor.eval` returns a numpy array with the same contents as the tensor.
-
-Sometimes it is not possible to evaluate a `tf.Tensor` without a context because
-its value might depend on dynamic information that is not available. For
-example, tensors that depend on `placeholder`s can't be evaluated without
-providing a value for the `placeholder`.
-
-``` python
-p = tf.placeholder(tf.float32)
-t = p + 1.0
-t.eval()  # This will fail, since the placeholder did not get a value.
-t.eval(feed_dict={p:2.0})  # This will succeed because we're feeding a value
-                           # to the placeholder.
-```
-
-Note that it is possible to feed any `tf.Tensor`, not just placeholders.
-
-Other model constructs might make evaluating a `tf.Tensor`
-complicated. TensorFlow can't directly evaluate `tf.Tensor`s defined inside
-functions or inside control flow constructs. If a `tf.Tensor` depends on a value
-from a queue, evaluating the `tf.Tensor` will only work once something has been
-enqueued; otherwise, evaluating it will hang. When working with queues, remember
-to call `tf.train.start_queue_runners` before evaluating any `tf.Tensor`s.
-
-## Printing Tensors
-
-For debugging purposes you might want to print the value of a `tf.Tensor`. While
-@{$debugger$tfdbg} provides advanced debugging support, TensorFlow also has an
-operation to directly print the value of a `tf.Tensor`.
-
-Note that you rarely want to use the following pattern when printing a
-`tf.Tensor`:
-
-``` python
-t = <<some tensorflow operation>>
-print(t)  # This will print the symbolic tensor when the graph is being built.
-          # This tensor does not have a value in this context.
-```
-
-This code prints the `tf.Tensor` object (which represents deferred computation)
-and not its value. Instead, TensorFlow provides the `tf.Print` operation, which
-returns its first tensor argument unchanged while printing the set of
-`tf.Tensor`s it is passed as the second argument.
-
-To use `tf.Print` correctly, you must use its return value, as in the example
-below:
-
-``` python
-t = <<some tensorflow operation>>
-tf.Print(t, [t])  # This does nothing
-t = tf.Print(t, [t])  # Here we are using the value returned by tf.Print
-result = t + 1  # Now when result is evaluated the value of `t` will be printed.
-```
-
-When you evaluate `result` you will evaluate everything `result` depends
-upon. Since `result` depends upon `t`, and evaluating `t` has the side effect of
-printing its input (the old value of `t`), `t` gets printed.
-
diff --git a/tensorflow/docs_src/programmers_guide/using_gpu.md b/tensorflow/docs_src/programmers_guide/using_gpu.md
deleted file mode 100644
index c429ca4750..0000000000
--- a/tensorflow/docs_src/programmers_guide/using_gpu.md
+++ /dev/null
@@ -1,215 +0,0 @@
-# Using GPUs
-
-## Supported devices
-
-On a typical system, there are multiple computing devices. In TensorFlow, the
-supported device types are `CPU` and `GPU`. They are represented as `strings`.
-For example:
-
-* `"/cpu:0"`: The CPU of your machine.
-* `"/device:GPU:0"`: The GPU of your machine, if you have one.
-* `"/device:GPU:1"`: The second GPU of your machine, etc.
-
-If a TensorFlow operation has both CPU and GPU implementations, the GPU devices
-will be given priority when the operation is assigned to a device. For example,
-`matmul` has both CPU and GPU kernels. 
On a system with devices `cpu:0` and -`gpu:0`, `gpu:0` will be selected to run `matmul`. - -## Logging Device placement - -To find out which devices your operations and tensors are assigned to, create -the session with `log_device_placement` configuration option set to `True`. - -```python -# Creates a graph. -a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') -b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') -c = tf.matmul(a, b) -# Creates a session with log_device_placement set to True. -sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) -# Runs the op. -print(sess.run(c)) -``` - -You should see the following output: - -``` -Device mapping: -/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus -id: 0000:05:00.0 -b: /job:localhost/replica:0/task:0/device:GPU:0 -a: /job:localhost/replica:0/task:0/device:GPU:0 -MatMul: /job:localhost/replica:0/task:0/device:GPU:0 -[[ 22. 28.] - [ 49. 64.]] - -``` - -## Manual device placement - -If you would like a particular operation to run on a device of your choice -instead of what's automatically selected for you, you can use `with tf.device` -to create a device context such that all the operations within that context will -have the same device assignment. - -```python -# Creates a graph. -with tf.device('/cpu:0'): - a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') - b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') -c = tf.matmul(a, b) -# Creates a session with log_device_placement set to True. -sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) -# Runs the op. -print(sess.run(c)) -``` - -You will see that now `a` and `b` are assigned to `cpu:0`. Since a device was -not explicitly specified for the `MatMul` operation, the TensorFlow runtime will -choose one based on the operation and available devices (`gpu:0` in this -example) and automatically copy tensors between devices if required. - -``` -Device mapping: -/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K40c, pci bus -id: 0000:05:00.0 -b: /job:localhost/replica:0/task:0/cpu:0 -a: /job:localhost/replica:0/task:0/cpu:0 -MatMul: /job:localhost/replica:0/task:0/device:GPU:0 -[[ 22. 28.] - [ 49. 64.]] -``` - -## Allowing GPU memory growth - -By default, TensorFlow maps nearly all of the GPU memory of all GPUs (subject to -[`CUDA_VISIBLE_DEVICES`](https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#env-vars)) -visible to the process. This is done to more efficiently use the relatively -precious GPU memory resources on the devices by reducing [memory -fragmentation](https://en.wikipedia.org/wiki/Fragmentation_\(computing\)). - -In some cases it is desirable for the process to only allocate a subset of the -available memory, or to only grow the memory usage as is needed by the process. -TensorFlow provides two Config options on the Session to control this. - -The first is the `allow_growth` option, which attempts to allocate only as much -GPU memory based on runtime allocations: it starts out allocating very little -memory, and as Sessions get run and more GPU memory is needed, we extend the GPU -memory region needed by the TensorFlow process. Note that we do not release -memory, since that can lead to even worse memory fragmentation. To turn this -option on, set the option in the ConfigProto by: - -```python -config = tf.ConfigProto() -config.gpu_options.allow_growth = True -session = tf.Session(config=config, ...) 
-``` - -The second method is the `per_process_gpu_memory_fraction` option, which -determines the fraction of the overall amount of memory that each visible GPU -should be allocated. For example, you can tell TensorFlow to only allocate 40% -of the total memory of each GPU by: - -```python -config = tf.ConfigProto() -config.gpu_options.per_process_gpu_memory_fraction = 0.4 -session = tf.Session(config=config, ...) -``` - -This is useful if you want to truly bound the amount of GPU memory available to -the TensorFlow process. - -## Using a single GPU on a multi-GPU system - -If you have more than one GPU in your system, the GPU with the lowest ID will be -selected by default. If you would like to run on a different GPU, you will need -to specify the preference explicitly: - -```python -# Creates a graph. -with tf.device('/device:GPU:2'): - a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') - b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') - c = tf.matmul(a, b) -# Creates a session with log_device_placement set to True. -sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) -# Runs the op. -print(sess.run(c)) -``` - -If the device you have specified does not exist, you will get -`InvalidArgumentError`: - -``` -InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b': -Could not satisfy explicit device specification '/device:GPU:2' - [[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/device:GPU:2"]()]] -``` - -If you would like TensorFlow to automatically choose an existing and supported -device to run the operations in case the specified one doesn't exist, you can -set `allow_soft_placement` to `True` in the configuration option when creating -the session. - -```python -# Creates a graph. -with tf.device('/device:GPU:2'): - a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a') - b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b') - c = tf.matmul(a, b) -# Creates a session with allow_soft_placement and log_device_placement set -# to True. -sess = tf.Session(config=tf.ConfigProto( - allow_soft_placement=True, log_device_placement=True)) -# Runs the op. -print(sess.run(c)) -``` - -## Using multiple GPUs - -If you would like to run TensorFlow on multiple GPUs, you can construct your -model in a multi-tower fashion where each tower is assigned to a different GPU. -For example: - -``` python -# Creates a graph. -c = [] -for d in ['/device:GPU:2', '/device:GPU:3']: - with tf.device(d): - a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3]) - b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2]) - c.append(tf.matmul(a, b)) -with tf.device('/cpu:0'): - sum = tf.add_n(c) -# Creates a session with log_device_placement set to True. -sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) -# Runs the op. -print(sess.run(sum)) -``` - -You will see the following output. 
- -``` -Device mapping: -/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: Tesla K20m, pci bus -id: 0000:02:00.0 -/job:localhost/replica:0/task:0/device:GPU:1 -> device: 1, name: Tesla K20m, pci bus -id: 0000:03:00.0 -/job:localhost/replica:0/task:0/device:GPU:2 -> device: 2, name: Tesla K20m, pci bus -id: 0000:83:00.0 -/job:localhost/replica:0/task:0/device:GPU:3 -> device: 3, name: Tesla K20m, pci bus -id: 0000:84:00.0 -Const_3: /job:localhost/replica:0/task:0/device:GPU:3 -Const_2: /job:localhost/replica:0/task:0/device:GPU:3 -MatMul_1: /job:localhost/replica:0/task:0/device:GPU:3 -Const_1: /job:localhost/replica:0/task:0/device:GPU:2 -Const: /job:localhost/replica:0/task:0/device:GPU:2 -MatMul: /job:localhost/replica:0/task:0/device:GPU:2 -AddN: /job:localhost/replica:0/task:0/cpu:0 -[[ 44. 56.] - [ 98. 128.]] -``` - -The @{$deep_cnn$cifar10 tutorial} is a good example -demonstrating how to do training with multiple GPUs. diff --git a/tensorflow/docs_src/programmers_guide/using_tpu.md b/tensorflow/docs_src/programmers_guide/using_tpu.md deleted file mode 100644 index 44aabf0557..0000000000 --- a/tensorflow/docs_src/programmers_guide/using_tpu.md +++ /dev/null @@ -1,395 +0,0 @@ -# Using TPUs - -This document walks through the principal TensorFlow APIs necessary to make -effective use of a [Cloud TPU](https://cloud.google.com/tpu/), and highlights -the differences between regular TensorFlow usage, and usage on a TPU. - -This doc is aimed at users who: - -* Are familiar with TensorFlow's `Estimator` and `Dataset` APIs -* Have maybe [tried out a Cloud TPU](https://cloud.google.com/tpu/docs/quickstart) - using an existing model. -* Have, perhaps, skimmed the code of an example TPU model - [[1]](https://github.com/tensorflow/models/blob/master/official/mnist/mnist_tpu.py) - [[2]](https://github.com/tensorflow/tpu/tree/master/models). -* Are interested in porting an existing `Estimator` model to - run on Cloud TPUs - -## TPUEstimator - -@{tf.estimator.Estimator$Estimators} are TensorFlow's model-level abstraction. -Standard `Estimators` can drive models on CPU and GPUs. You must use -@{tf.contrib.tpu.TPUEstimator} to drive a model on TPUs. - -Refer to TensorFlow's Getting Started section for an introduction to the basics -of using a @{$premade_estimators$pre-made `Estimator`}, and -@{$custom_estimators$custom `Estimator`s}. - -The `TPUEstimator` class differs somewhat from the `Estimator` class. - -The simplest way to maintain a model that can be run both on CPU/GPU or on a -Cloud TPU is to define the model's inference phase (from inputs to predictions) -outside of the `model_fn`. Then maintain separate implementations of the -`Estimator` setup and `model_fn`, both wrapping this inference step. For an -example of this pattern compare the `mnist.py` and `mnist_tpu.py` implementation in -[tensorflow/models](https://github.com/tensorflow/models/tree/master/official/mnist). - -### Running a `TPUEstimator` locally - -To create a standard `Estimator` you call the constructor, and pass it a -`model_fn`, for example: - -``` -my_estimator = tf.estimator.Estimator( - model_fn=my_model_fn) -``` - -The changes required to use a @{tf.contrib.tpu.TPUEstimator} on your local -machine are relatively minor. The constructor requires two additional arguments. 
-You should set the `use_tpu` argument to `False`, and pass a
-@{tf.contrib.tpu.RunConfig} as the `config` argument, as shown below:
-
-``` python
-my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
-    model_fn=my_model_fn,
-    config=tf.contrib.tpu.RunConfig(),
-    use_tpu=False)
-```
-
-Just this simple change will allow you to run a `TPUEstimator` locally.
-The majority of example TPU models can be run in this local mode,
-by setting the command line flags as follows:
-
-
-```
-$> python mnist_tpu.py --use_tpu=false --master=''
-```
-
-Note: This `use_tpu=False` argument is useful for trying out the `TPUEstimator`
-API. It is not meant to be a complete TPU compatibility test. Successfully
-running a model locally in a `TPUEstimator` does not guarantee that it will
-work on a TPU.
-
-
-### Building a `tpu.RunConfig`
-
-While the default `RunConfig` is sufficient for local training, these settings
-cannot be ignored in real usage.
-
-A more typical setup for a `RunConfig`, one that can be switched to use a Cloud
-TPU, might be as follows:
-
-``` python
-import tempfile
-import subprocess
-
-class FLAGS(object):
-  use_tpu=False
-  tpu_name=None
-  # Use a local temporary path for the `model_dir`
-  model_dir = tempfile.mkdtemp()
-  # Number of training steps to run on the Cloud TPU before returning control.
-  iterations = 50
-  # A single Cloud TPU has 8 shards.
-  num_shards = 8
-
-if FLAGS.use_tpu:
-  my_project_name = subprocess.check_output([
-      'gcloud','config','get-value','project'])
-  my_zone = subprocess.check_output([
-      'gcloud','config','get-value','compute/zone'])
-  cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(
-      tpu_names=[FLAGS.tpu_name],
-      zone=my_zone,
-      project=my_project_name)
-  master = cluster_resolver.get_master()
-else:
-  master = ''
-
-my_tpu_run_config = tf.contrib.tpu.RunConfig(
-    master=master,
-    evaluation_master=master,
-    model_dir=FLAGS.model_dir,
-    session_config=tf.ConfigProto(
-        allow_soft_placement=True, log_device_placement=True),
-    tpu_config=tf.contrib.tpu.TPUConfig(FLAGS.iterations,
-                                        FLAGS.num_shards),
-)
-```
-
-Then you must pass the @{tf.contrib.tpu.RunConfig} to the constructor:
-
-``` python
-my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
-    model_fn=my_model_fn,
-    config=my_tpu_run_config,
-    use_tpu=FLAGS.use_tpu)
-```
-
-Typically the `FLAGS` would be set by command line arguments. To switch from
-training locally to training on a Cloud TPU you would need to:
-
-* Set `FLAGS.use_tpu` to `True`
-* Set `FLAGS.tpu_name` so the `tf.contrib.cluster_resolver.TPUClusterResolver` can find it
-* Set `FLAGS.model_dir` to a Google Cloud Storage bucket URL (`gs://`).
-
-
-## Optimizer
-
-When training on a Cloud TPU you **must** wrap the optimizer in a
-@{tf.contrib.tpu.CrossShardOptimizer}, which uses an `allreduce` to aggregate
-gradients and broadcast the result to each shard (each TPU core).
-
-The `CrossShardOptimizer` is not compatible with local training. 
So, to have
-the same code run both locally and on a Cloud TPU, add lines like the following:
-
-``` python
-optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
-if FLAGS.use_tpu:
-  optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)
-```
-
-If you prefer to avoid a global `FLAGS` variable in your model code, one
-approach is to set the optimizer as one of the `Estimator`'s params,
-as follows:
-
-``` python
-my_tpu_estimator = tf.contrib.tpu.TPUEstimator(
-    model_fn=my_model_fn,
-    config=my_tpu_run_config,
-    use_tpu=FLAGS.use_tpu,
-    params={'optimizer':optimizer})
-```
-
-## Model Function
-
-This section details the changes you must make to the model function
-(`model_fn()`) to make it `TPUEstimator` compatible.
-
-### Static shapes
-
-During regular usage TensorFlow attempts to determine the shapes of each
-`tf.Tensor` during graph construction. During execution any unknown shape
-dimensions are determined dynamically,
-see @{$programmers_guide/tensors#shape$Tensor Shapes} for more details.
-
-To run on Cloud TPUs TensorFlow models are compiled using @{$xla$XLA}.
-XLA uses a similar system for determining shapes at compile time. XLA requires
-that all tensor dimensions be statically defined at compile time. All shapes
-must evaluate to a constant, and must not depend on external data or stateful
-operations such as variables or a random number generator.
-
-
-### Summaries
-
-Remove any use of `tf.summary` from your model.
-
-@{$summaries_and_tensorboard$TensorBoard summaries} are a great way to see
-inside your model. A minimal set of basic summaries is automatically recorded
-by the `TPUEstimator`, to `event` files in the `model_dir`. Custom summaries,
-however, are currently unsupported when training on a Cloud TPU. So while the
-`TPUEstimator` will still run locally with summaries, it will fail if used on a
-TPU.
-
-### Metrics
-
-Build your evaluation metrics dictionary in a stand-alone `metric_fn`.
-
-
-
-Evaluation metrics are an essential part of training a model. These are fully
-supported on Cloud TPUs, but with a slightly different syntax.
-
-A standard @{tf.metrics} op returns two tensors. The first returns the running
-average of the metric value, while the second updates the running average and
-returns the value for this batch:
-
-```
-running_average, current_batch = tf.metrics.accuracy(labels, predictions)
-```
-
-In a standard `Estimator` you create a dictionary of these pairs, and return it
-as part of the `EstimatorSpec`.
-
-```python
-my_metrics = {'accuracy': tf.metrics.accuracy(labels, predictions)}
-
-return tf.estimator.EstimatorSpec(
-  ...
-  eval_metric_ops=my_metrics
-)
-```
-
-In a `TPUEstimator` you instead pass a function (which returns a metrics
-dictionary) and a list of argument tensors, as shown below:
-
-```python
-def my_metric_fn(labels, predictions):
-  return {'accuracy': tf.metrics.accuracy(labels, predictions)}
-
-return tf.contrib.tpu.TPUEstimatorSpec(
-  ...
-  eval_metrics=(my_metric_fn, [labels, predictions])
-)
-```
-
-### Use `TPUEstimatorSpec`
-
-`TPUEstimatorSpec` does not support hooks, and requires function wrappers for
-some fields.
-
-An `Estimator`'s `model_fn` must return an `EstimatorSpec`. An `EstimatorSpec`
-is a simple structure of named fields containing all the `tf.Tensors` of the
-model that the `Estimator` may need to interact with.
-
-`TPUEstimators` use a @{tf.contrib.tpu.TPUEstimatorSpec}. 
There are a few
-differences between it and a standard @{tf.estimator.EstimatorSpec}:
-
-
-* The `eval_metric_ops` must be wrapped into a `metrics_fn`; this field is
-  renamed `eval_metrics` ([see above](#metrics)).
-* The @{tf.train.SessionRunHook$hooks} are unsupported, so these fields are
-  omitted.
-* The @{tf.train.Scaffold$`scaffold`}, if used, must also be wrapped in a
-  function. This field is renamed to `scaffold_fn`.
-
-`Scaffold` and `Hooks` are for advanced usage, and can typically be omitted.
-
-## Input functions
-
-Input functions work mainly unchanged as they run on the host computer, not the
-Cloud TPU itself. This section explains the two necessary adjustments.
-
-### Params argument
-
-
-
-The `input_fn` for a standard `Estimator` _can_ include a
-`params` argument; the `input_fn` for a `TPUEstimator` *must* include a
-`params` argument. This is necessary to allow the estimator to set the batch
-size for each replica of the input stream. So the minimum signature for an
-`input_fn` for a `TPUEstimator` is:
-
-```
-def my_input_fn(params):
-  pass
-```
-
-where `params['batch_size']` will contain the batch size.
-
-### Static shapes and batch size
-
-The input pipeline generated by your `input_fn` is run on CPU. So it is mostly
-free from the strict static shape requirements imposed by the XLA/TPU
-environment. The one requirement is that the batches of data fed from your
-input pipeline to the TPU have a static shape, as determined by the standard
-TensorFlow shape inference algorithm. Intermediate tensors are free to have
-dynamic shapes. If shape inference has failed but the shape is known, it is
-possible to impose the correct shape using the `set_shape()` method.
-
-In the example below, shape inference fails, but the correct shape is imposed
-using `set_shape`:
-
-```
->>> x = tf.zeros(tf.constant([1,2,3])+1)
->>> x.shape
-TensorShape([Dimension(None), Dimension(None), Dimension(None)])
-
->>> x.set_shape([2,3,4])
-```
-
-In many cases the batch size is the only unknown dimension.
-
-A typical input pipeline, using `tf.data`, will usually produce batches of a
-fixed size. The last batch of a finite `Dataset`, however, is typically
-smaller, containing just the remaining elements. Since a `Dataset` does not
-know its own length or finiteness, the standard @{tf.data.Dataset.batch$`batch`}
-method cannot determine on its own whether all batches will have a fixed size:
-
-```
->>> params = {'batch_size':32}
->>> ds = tf.data.Dataset.from_tensors([0, 1, 2])
->>> ds = ds.repeat().batch(params['batch_size'])
->>> ds
-<BatchDataset shapes: (?, 3), types: tf.int32>
-```
-
-The most straightforward fix is to
-@{tf.data.Dataset.apply$apply} @{tf.contrib.data.batch_and_drop_remainder}
-as follows:
-
-```
->>> params = {'batch_size':32}
->>> ds = tf.data.Dataset.from_tensors([0, 1, 2])
->>> ds = ds.repeat().apply(
-...    tf.contrib.data.batch_and_drop_remainder(params['batch_size']))
->>> ds
-<_RestructuredDataset shapes: (32, 3), types: tf.int32>
-```
-
-The one downside to this approach is that, as the name implies, this batching
-method throws out any fractional batch at the end of the dataset. This is fine
-for an infinitely repeating dataset being used for training, but could be a
-problem if you want to train for an exact number of epochs.
-
-To run an exact single epoch of _evaluation_, you can work around this by
-manually padding the length of the batches, and setting the padding entries to
-have zero weight when creating your `tf.metrics`.
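-
-As a rough sketch of that padding workaround (the feature shape, the
-`pad_count` arithmetic, and the weight handling below are illustrative
-assumptions, not an official recipe):
-
-```python
-import tensorflow as tf
-
-batch_size = 8
-num_examples = 30  # assumed known ahead of time for the evaluation set
-features = tf.data.Dataset.from_tensor_slices(tf.zeros([num_examples, 3]))
-
-# Pair every real example with weight 1.0, then append zero-weight padding
-# examples until the dataset length is a multiple of the batch size.
-pad_count = -num_examples % batch_size
-real = features.map(lambda x: (x, tf.constant(1.0)))
-padding = tf.data.Dataset.from_tensors(
-    (tf.zeros([3]), tf.constant(0.0))).repeat(pad_count)
-padded = real.concatenate(padding).batch(batch_size)
-
-# In the metric_fn, the weights then zero out the padded entries, e.g.:
-#   tf.metrics.accuracy(labels, predictions, weights=weights)
-```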
-
-## Datasets
-
-Efficient use of the `tf.data.Dataset` API is critical when using a Cloud
-TPU, as it is impossible to make full use of a Cloud TPU unless you can feed
-it data quickly enough. See @{$datasets_performance} for details on dataset
-performance.
-
-For all but the simplest experimentation (using
-@{tf.data.Dataset.from_tensor_slices} or other in-graph data) you will need to
-store all data files read by the `TPUEstimator`'s `Dataset` in Google Cloud
-Storage Buckets.
-
-
-
-For most use cases, we recommend converting your data into `TFRecord`
-format and using a @{tf.data.TFRecordDataset} to read it. This, however, is not
-a hard requirement and you can use other dataset readers
-(`FixedLengthRecordDataset` or `TextLineDataset`) if you prefer.
-
-Small datasets can be loaded entirely into memory using
-@{tf.data.Dataset.cache}.
-
-Regardless of the data format used, it is strongly recommended that you
-@{$performance_guide#use_large_files$use large files}, on the order of
-100MB. This is especially important in this networked setting as the overhead
-of opening a file is significantly higher.
-
-It is also important, regardless of the type of reader used, to enable buffering
-using the `buffer_size` argument to the constructor. This argument is specified
-in bytes. A minimum of a few MB (`buffer_size=8*1024*1024`) is recommended so
-that data is available when needed.
-
-The TPU-demos repo includes
-[a script](https://github.com/tensorflow/tpu/blob/master/tools/datasets/imagenet_to_gcs.py)
-for downloading the ImageNet dataset and converting it to an appropriate format.
-This, together with the ImageNet
-[models](https://github.com/tensorflow/tpu/tree/master/models)
-included in the repo, demonstrates all of these best practices.
-
-
-## What Next
-
-For details on how to actually set up and run a Cloud TPU see:
-
- * [Google Cloud TPU Documentation](https://cloud.google.com/tpu/docs/)
-
-This document is by no means exhaustive. The best source of more detail on how
-to make a Cloud TPU-compatible model is the set of example models published in:
-
- * The [TPU Demos Repository.](https://github.com/tensorflow/tpu)
-
-For more information about tuning TensorFlow code for performance see:
-
- * The @{$performance$Performance Section.}
-
diff --git a/tensorflow/docs_src/programmers_guide/variables.md b/tensorflow/docs_src/programmers_guide/variables.md
deleted file mode 100644
index cd8c4b5b9a..0000000000
--- a/tensorflow/docs_src/programmers_guide/variables.md
+++ /dev/null
@@ -1,319 +0,0 @@
-# Variables
-
-A TensorFlow **variable** is the best way to represent shared, persistent state
-manipulated by your program.
-
-Variables are manipulated via the `tf.Variable` class. A `tf.Variable`
-represents a tensor whose value can be changed by running ops on it. Unlike
-`tf.Tensor` objects, a `tf.Variable` exists outside the context of a single
-`session.run` call.
-
-Internally, a `tf.Variable` stores a persistent tensor. Specific ops allow you
-to read and modify the values of this tensor. These modifications are visible
-across multiple `tf.Session`s, so multiple workers can see the same values for a
-`tf.Variable`.
-
-## Creating a Variable
-
-The best way to create a variable is to call the `tf.get_variable`
-function. This function requires you to specify the Variable's name. This name
-will be used by other replicas to access the same variable, as well as to name
-this variable's value when checkpointing and exporting models. 
`tf.get_variable`
-also allows you to reuse a previously created variable of the same name, making it
-easy to define models which reuse layers.
-
-To create a variable with `tf.get_variable`, simply provide the name and shape:
-
-``` python
-my_variable = tf.get_variable("my_variable", [1, 2, 3])
-```
-
-This creates a variable named "my_variable" which is a three-dimensional tensor
-with shape `[1, 2, 3]`. This variable will, by default, have the `dtype`
-`tf.float32` and its initial value will be randomized via
-`tf.glorot_uniform_initializer`.
-
-You may optionally specify the `dtype` and initializer to `tf.get_variable`. For
-example:
-
-``` python
-my_int_variable = tf.get_variable("my_int_variable", [1, 2, 3], dtype=tf.int32,
-                                  initializer=tf.zeros_initializer)
-```
-
-TensorFlow provides many convenient initializers. Alternatively, you may
-initialize a `tf.Variable` to have the value of a `tf.Tensor`. For example:
-
-``` python
-other_variable = tf.get_variable("other_variable", dtype=tf.int32,
-                                 initializer=tf.constant([23, 42]))
-```
-
-Note that when the initializer is a `tf.Tensor` you should not specify the
-variable's shape, as the shape of the initializer tensor will be used.
-
-
-
-### Variable collections
-
-Because disconnected parts of a TensorFlow program might want to create
-variables, it is sometimes useful to have a single way to access all of
-them. For this reason TensorFlow provides **collections**, which are named lists
-of tensors or other objects, such as `tf.Variable` instances.
-
-By default every `tf.Variable` gets placed in the following two collections:
-
- * `tf.GraphKeys.GLOBAL_VARIABLES` --- variables that can be shared across
-   multiple devices,
- * `tf.GraphKeys.TRAINABLE_VARIABLES` --- variables for which TensorFlow will
-   calculate gradients.
-
-If you don't want a variable to be trainable, add it to the
-`tf.GraphKeys.LOCAL_VARIABLES` collection instead. For example, the following
-snippet demonstrates how to add a variable named `my_local` to this collection:
-
-``` python
-my_local = tf.get_variable("my_local", shape=(),
-                           collections=[tf.GraphKeys.LOCAL_VARIABLES])
-```
-
-Alternatively, you can specify `trainable=False` as an argument to
-`tf.get_variable`:
-
-``` python
-my_non_trainable = tf.get_variable("my_non_trainable",
-                                   shape=(),
-                                   trainable=False)
-```
-
-
-You can also use your own collections. Any string is a valid collection name,
-and there is no need to explicitly create a collection. To add a variable (or
-any other object) to a collection after creating the variable, call
-`tf.add_to_collection`. For example, the following code adds an existing
-variable named `my_local` to a collection named `my_collection_name`:
-
-``` python
-tf.add_to_collection("my_collection_name", my_local)
-```
-
-And to retrieve a list of all the variables (or other objects) you've placed in
-a collection you can use:
-
-``` python
-tf.get_collection("my_collection_name")
-```
-
-### Device placement
-
-Just like any other TensorFlow operation, you can place variables on particular
-devices. For example, the following snippet creates a variable named `v` and
-places it on the second GPU device:
-
-``` python
-with tf.device("/device:GPU:1"):
-  v = tf.get_variable("v", [1])
-```
-
-It is particularly important for variables to be on the correct device in
-distributed settings.
Accidentally putting variables on workers instead of
-parameter servers, for example, can severely slow down training or, in the worst
-case, let each worker blithely forge ahead with its own independent copy of each
-variable. For this reason we provide @{tf.train.replica_device_setter}, which
-can automatically place variables in parameter servers. For example:
-
-``` python
-cluster_spec = {
-    "ps": ["ps0:2222", "ps1:2222"],
-    "worker": ["worker0:2222", "worker1:2222", "worker2:2222"]}
-with tf.device(tf.train.replica_device_setter(cluster=cluster_spec)):
-  v = tf.get_variable("v", shape=[20, 20])  # this variable is placed
-                                            # in the parameter server
-                                            # by the replica_device_setter
-```
-
-## Initializing variables
-
-Before you can use a variable, it must be initialized. If you are programming in
-the low-level TensorFlow API (that is, you are explicitly creating your own
-graphs and sessions), you must explicitly initialize the variables. Most
-high-level frameworks such as `tf.contrib.slim`, `tf.estimator.Estimator` and
-`Keras` automatically initialize variables for you before training a model.
-
-Explicit initialization is otherwise useful because it allows you to avoid
-rerunning potentially expensive initializers when reloading a model from a
-checkpoint, and because it allows determinism when randomly-initialized
-variables are shared in a distributed setting.
-
-To initialize all global variables in one go, before training starts, call
-`tf.global_variables_initializer()`. This function returns a single operation
-responsible for initializing all variables in the
-`tf.GraphKeys.GLOBAL_VARIABLES` collection. Running this operation initializes
-all variables. For example:
-
-``` python
-session.run(tf.global_variables_initializer())
-# Now all variables are initialized.
-```
-
-If you do need to initialize variables yourself, you can run the variable's
-initializer operation. For example:
-
-``` python
-session.run(my_variable.initializer)
-```
-
-
-You can also ask which variables have still not been initialized. For example,
-the following code prints the names of all variables which have not yet been
-initialized:
-
-``` python
-print(session.run(tf.report_uninitialized_variables()))
-```
-
-
-Note that by default `tf.global_variables_initializer` does not specify the
-order in which variables are initialized. Therefore, if the initial value of a
-variable depends on another variable's value, it's likely that you'll get an
-error. Any time you use the value of a variable in a context in which not all
-variables are initialized (say, if you use a variable's value while initializing
-another variable), it is best to use `variable.initialized_value()` instead of
-`variable`:
-
-``` python
-v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
-w = tf.get_variable("w", initializer=v.initialized_value() + 1)
-```
-
-## Using variables
-
-To use the value of a `tf.Variable` in a TensorFlow graph, simply treat it like
-a normal `tf.Tensor`:
-
-``` python
-v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer())
-w = v + 1  # w is a tf.Tensor which is computed based on the value of v.
-           # Any time a variable is used in an expression it gets automatically
-           # converted to a tf.Tensor representing its value.
-```
-
-To assign a value to a variable, use the methods `assign`, `assign_add`, and
-friends in the `tf.Variable` class.
For example, here is how you can call these -methods: - -``` python -v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) -assignment = v.assign_add(1) -tf.global_variables_initializer().run() -sess.run(assignment) # or assignment.op.run(), or assignment.eval() -``` - -Most TensorFlow optimizers have specialized ops that efficiently update the -values of variables according to some gradient descent-like algorithm. See -@{tf.train.Optimizer} for an explanation of how to use optimizers. - -Because variables are mutable it's sometimes useful to know what version of a -variable's value is being used at any point in time. To force a re-read of the -value of a variable after something has happened, you can use -`tf.Variable.read_value`. For example: - -``` python -v = tf.get_variable("v", shape=(), initializer=tf.zeros_initializer()) -assignment = v.assign_add(1) -with tf.control_dependencies([assignment]): - w = v.read_value() # w is guaranteed to reflect v's value after the - # assign_add operation. -``` - - -## Sharing variables - -TensorFlow supports two ways of sharing variables: - - * Explicitly passing `tf.Variable` objects around. - * Implicitly wrapping `tf.Variable` objects within `tf.variable_scope` objects. - -While code which explicitly passes variables around is very clear, it is -sometimes convenient to write TensorFlow functions that implicitly use -variables in their implementations. Most of the functional layers from -`tf.layers` use this approach, as well as all `tf.metrics`, and a few other -library utilities. - -Variable scopes allow you to control variable reuse when calling functions which -implicitly create and use variables. They also allow you to name your variables -in a hierarchical and understandable way. - -For example, let's say we write a function to create a convolutional / relu -layer: - -```python -def conv_relu(input, kernel_shape, bias_shape): - # Create variable named "weights". - weights = tf.get_variable("weights", kernel_shape, - initializer=tf.random_normal_initializer()) - # Create variable named "biases". - biases = tf.get_variable("biases", bias_shape, - initializer=tf.constant_initializer(0.0)) - conv = tf.nn.conv2d(input, weights, - strides=[1, 1, 1, 1], padding='SAME') - return tf.nn.relu(conv + biases) -``` - -This function uses short names `weights` and `biases`, which is good for -clarity. In a real model, however, we want many such convolutional layers, and -calling this function repeatedly would not work: - -``` python -input1 = tf.random_normal([1,10,10,32]) -input2 = tf.random_normal([1,20,20,32]) -x = conv_relu(input1, kernel_shape=[5, 5, 32, 32], bias_shape=[32]) -x = conv_relu(x, kernel_shape=[5, 5, 32, 32], bias_shape = [32]) # This fails. -``` - -Since the desired behavior is unclear (create new variables or reuse the -existing ones?) TensorFlow will fail. Calling `conv_relu` in different scopes, -however, clarifies that we want to create new variables: - -```python -def my_image_filter(input_images): - with tf.variable_scope("conv1"): - # Variables created here will be named "conv1/weights", "conv1/biases". - relu1 = conv_relu(input_images, [5, 5, 32, 32], [32]) - with tf.variable_scope("conv2"): - # Variables created here will be named "conv2/weights", "conv2/biases". - return conv_relu(relu1, [5, 5, 32, 32], [32]) -``` - -If you do want the variables to be shared, you have two options. 
First, you can -create a scope with the same name using `reuse=True`: - -``` python -with tf.variable_scope("model"): - output1 = my_image_filter(input1) -with tf.variable_scope("model", reuse=True): - output2 = my_image_filter(input2) - -``` - -You can also call `scope.reuse_variables()` to trigger a reuse: - -``` python -with tf.variable_scope("model") as scope: - output1 = my_image_filter(input1) - scope.reuse_variables() - output2 = my_image_filter(input2) - -``` - -Since depending on exact string names of scopes can feel dangerous, it's also -possible to initialize a variable scope based on another one: - -``` python -with tf.variable_scope("model") as scope: - output1 = my_image_filter(input1) -with tf.variable_scope(scope, reuse=True): - output2 = my_image_filter(input2) - -``` - diff --git a/tensorflow/docs_src/programmers_guide/version_compat.md b/tensorflow/docs_src/programmers_guide/version_compat.md deleted file mode 100644 index 72e427c5f8..0000000000 --- a/tensorflow/docs_src/programmers_guide/version_compat.md +++ /dev/null @@ -1,319 +0,0 @@ -# TensorFlow Version Compatibility - -This document is for users who need backwards compatibility across different -versions of TensorFlow (either for code or data), and for developers who want -to modify TensorFlow while preserving compatibility. - -## Semantic Versioning 2.0 - -TensorFlow follows Semantic Versioning 2.0 ([semver](http://semver.org)) for its -public API. Each release version of TensorFlow has the form `MAJOR.MINOR.PATCH`. -For example, TensorFlow version 1.2.3 has `MAJOR` version 1, `MINOR` version 2, -and `PATCH` version 3. Changes to each number have the following meaning: - -* **MAJOR**: Potentially backwards incompatible changes. Code and data that - worked with a previous major release will not necessarily work with the new - release. However, in some cases existing TensorFlow graphs and checkpoints - may be migratable to the newer release; see - [Compatibility of graphs and checkpoints](#compatibility_of_graphs_and_checkpoints) - for details on data compatibility. - -* **MINOR**: Backwards compatible features, speed improvements, etc. Code and - data that worked with a previous minor release *and* which depends only on the - public API will continue to work unchanged. For details on what is and is - not the public API, see [What is covered](#what_is_covered). - -* **PATCH**: Backwards compatible bug fixes. - -For example, release 1.0.0 introduced backwards *incompatible* changes from -release 0.12.1. However, release 1.1.1 was backwards *compatible* with release -1.0.0. - -## What is covered - -Only the public APIs of TensorFlow are backwards compatible across minor and -patch versions. The public APIs consist of - -* All the documented [Python](../api_docs/python) functions and classes in the - `tensorflow` module and its submodules, except for - * functions and classes in `tf.contrib` - * functions and classes whose names start with `_` (as these are private) - Note that the code in the `examples/` and `tools/` directories is not - reachable through the `tensorflow` Python module and is thus not covered by - the compatibility guarantee. - - If a symbol is available through the `tensorflow` Python module or its - submodules, but is not documented, then it is **not** considered part of the - public API. - -* The [C API](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/c/c_api.h). 
-
-* The following protocol buffer files:
-  * [`attr_value`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/attr_value.proto)
-  * [`config`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto)
-  * [`event`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/util/event.proto)
-  * [`graph`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/graph.proto)
-  * [`op_def`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/op_def.proto)
-  * [`reader_base`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/reader_base.proto)
-  * [`summary`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/summary.proto)
-  * [`tensor`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor.proto)
-  * [`tensor_shape`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/tensor_shape.proto)
-  * [`types`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/types.proto)
-
-
-## What is *not* covered
-
-Some API functions are explicitly marked as "experimental" and can change in
-backward incompatible ways between minor releases. These include:
-
-* **Experimental APIs**: The @{tf.contrib} module and its submodules in Python
-  and any functions in the C API or fields in protocol buffers that are
-  explicitly commented as being experimental. In particular, any field in a
-  protocol buffer which is called "experimental" and all its fields and
-  submessages can change at any time.
-
-* **Other languages**: TensorFlow APIs in languages other than Python and C,
-  such as:
-
-  - @{$cc/guide$C++} (exposed through header files in
-    [`tensorflow/cc`](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/cc))
-  - [Java](../api_docs/java/reference/org/tensorflow/package-summary)
-  - [Go](https://godoc.org/github.com/tensorflow/tensorflow/tensorflow/go)
-
-* **Details of composite ops:** Many public functions in Python expand to
-  several primitive ops in the graph, and these details will be part of any
-  graphs saved to disk as `GraphDef`s. These details may change for
-  minor releases. In particular, regression tests that check for exact
-  matching between graphs are likely to break across minor releases, even
-  though the behavior of the graph should be unchanged and existing
-  checkpoints will still work.
-
-* **Floating point numerical details:** The specific floating point values
-  computed by ops may change at any time. Users should rely only on
-  approximate accuracy and numerical stability, not on the specific bits
-  computed. Changes to numerical formulas in minor and patch releases should
-  result in comparable or improved accuracy, with the caveat that in machine
-  learning improved accuracy of specific formulas may result in decreased
-  accuracy for the overall system.
-
-* **Random numbers:** The specific random numbers computed by the
-  @{$python/constant_op#Random_Tensors$random ops} may change at any time.
-  Users should rely only on approximately correct distributions and
-  statistical strength, not the specific bits computed. However, we will make
-  changes to random bits rarely (or perhaps never) for patch releases. We
-  will, of course, document all such changes.
-
-* **Version skew in distributed TensorFlow:** Running two different versions
-  of TensorFlow in a single cluster is unsupported.
There are no guarantees - about backwards compatibility of the wire protocol. - -* **Bugs:** We reserve the right to make backwards incompatible behavior - (though not API) changes if the current implementation is clearly broken, - that is, if it contradicts the documentation or if a well-known and - well-defined intended behavior is not properly implemented due to a bug. - For example, if an optimizer claims to implement a well-known optimization - algorithm but does not match that algorithm due to a bug, then we will fix - the optimizer. Our fix may break code relying on the wrong behavior for - convergence. We will note such changes in the release notes. - -* **Error messages:** We reserve the right to change the text of error - messages. In addition, the type of an error may change unless the type is - specified in the documentation. For example, a function documented to - raise an `InvalidArgument` exception will continue to - raise `InvalidArgument`, but the human-readable message contents can change. - -## Compatibility of graphs and checkpoints - -You'll sometimes need to preserve graphs and checkpoints. -Graphs describe the data flow of ops to be run during training and -inference, and checkpoints contain the saved tensor values of variables in a -graph. - -Many TensorFlow users save graphs and trained models to disk for -later evaluation or additional training, but end up running their saved graphs -or models on a later release. In compliance with semver, any graph or checkpoint -written out with one version of TensorFlow can be loaded and evaluated with a -later version of TensorFlow with the same major release. However, we will -endeavor to preserve backwards compatibility even across major releases when -possible, so that the serialized files are usable over long periods of time. - - -Graphs are serialized via the `GraphDef` protocol buffer. To facilitate (rare) -backwards incompatible changes to graphs, each `GraphDef` has a version number -separate from the TensorFlow version. For example, `GraphDef` version 17 -deprecated the `inv` op in favor of `reciprocal`. The semantics are: - -* Each version of TensorFlow supports an interval of `GraphDef` versions. This - interval will be constant across patch releases, and will only grow across - minor releases. Dropping support for a `GraphDef` version will only occur - for a major release of TensorFlow. - -* Newly created graphs are assigned the latest `GraphDef` version number. - -* If a given version of TensorFlow supports the `GraphDef` version of a graph, - it will load and evaluate with the same behavior as the TensorFlow version - used to generate it (except for floating point numerical details and random - numbers), regardless of the major version of TensorFlow. In particular, all - checkpoint files will be compatible. - -* If the `GraphDef` *upper* bound is increased to X in a (minor) release, there - will be at least six months before the *lower* bound is increased to X. For - example (we're using hypothetical version numbers here): - * TensorFlow 1.2 might support `GraphDef` versions 4 to 7. - * TensorFlow 1.3 could add `GraphDef` version 8 and support versions 4 to 8. - * At least six months later, TensorFlow 2.0.0 could drop support for - versions 4 to 7, leaving version 8 only. - -Finally, when support for a `GraphDef` version is dropped, we will attempt to -provide tools for automatically converting graphs to a newer supported -`GraphDef` version. 
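-
-As a concrete illustration, this version metadata travels with the serialized
-graph itself and can be inspected directly. The following is a minimal sketch;
-the `graph.pb` path is hypothetical, and the `min_consumer` and `bad_consumers`
-fields are explained in the next section:
-
-``` python
-import tensorflow as tf
-
-with tf.gfile.GFile("/tmp/my_model/graph.pb", "rb") as f:  # hypothetical path
-  graph_def = tf.GraphDef()
-  graph_def.ParseFromString(f.read())
-
-# The VersionDef recorded when this graph was produced.
-print(graph_def.versions.producer)       # GraphDef version of the producer
-print(graph_def.versions.min_consumer)   # oldest consumer allowed to load it
-print(graph_def.versions.bad_consumers)  # consumer versions explicitly banned
-```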
- -## Graph and checkpoint compatibility when extending TensorFlow - -This section is relevant only when making incompatible changes to the `GraphDef` -format, such as when adding ops, removing ops, or changing the functionality -of existing ops. The previous section should suffice for most users. - -### Backward and partial forward compatibility - -Our versioning scheme has three requirements: - -* **Backward compatibility** to support loading graphs and checkpoints - created with older versions of TensorFlow. -* **Forward compatibility** to support scenarios where the producer of a - graph or checkpoint is upgraded to a newer version of TensorFlow before - the consumer. -* Enable evolving TensorFlow in incompatible ways. For example, removing ops, - adding attributes, and removing attributes. - -Note that while the `GraphDef` version mechanism is separate from the TensorFlow -version, backwards incompatible changes to the `GraphDef` format are still -restricted by Semantic Versioning. This means functionality can only be removed -or changed between `MAJOR` versions of TensorFlow (such as `1.7` to `2.0`). -Additionally, forward compatibility is enforced within Patch releases (`1.x.1` -to `1.x.2` for example). - -To achieve backward and forward compatibility and to know when to enforce changes -in formats, graphs and checkpoints have metadata that describes when they -were produced. The sections below detail the TensorFlow implementation and -guidelines for evolving `GraphDef` versions. - -### Independent data version schemes - -There are different data versions for graphs and checkpoints. The two data -formats evolve at different rates from each other and also at different rates -from TensorFlow. Both versioning systems are defined in -[`core/public/version.h`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/public/version.h). -Whenever a new version is added, a note is added to the header detailing what -changed and the date. - -### Data, producers, and consumers - -We distinguish between the following kinds of data version information: -* **producers**: binaries that produce data. Producers have a version - (`producer`) and a minimum consumer version that they are compatible with - (`min_consumer`). -* **consumers**: binaries that consume data. Consumers have a version - (`consumer`) and a minimum producer version that they are compatible with - (`min_producer`). - -Each piece of versioned data has a [`VersionDef -versions`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/versions.proto) -field which records the `producer` that made the data, the `min_consumer` -that it is compatible with, and a list of `bad_consumers` versions that are -disallowed. - -By default, when a producer makes some data, the data inherits the producer's -`producer` and `min_consumer` versions. `bad_consumers` can be set if specific -consumer versions are known to contain bugs and must be avoided. 
A consumer can -accept a piece of data if the following are all true: - -* `consumer` >= data's `min_consumer` -* data's `producer` >= consumer's `min_producer` -* `consumer` not in data's `bad_consumers` - -Since both producers and consumers come from the same TensorFlow code base, -[`core/public/version.h`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/public/version.h) -contains a main data version which is treated as either `producer` or -`consumer` depending on context and both `min_consumer` and `min_producer` -(needed by producers and consumers, respectively). Specifically, - -* For `GraphDef` versions, we have `TF_GRAPH_DEF_VERSION`, - `TF_GRAPH_DEF_VERSION_MIN_CONSUMER`, and - `TF_GRAPH_DEF_VERSION_MIN_PRODUCER`. -* For checkpoint versions, we have `TF_CHECKPOINT_VERSION`, - `TF_CHECKPOINT_VERSION_MIN_CONSUMER`, and - `TF_CHECKPOINT_VERSION_MIN_PRODUCER`. - -### Add a new attribute with default to an existing op - -Following the guidance below gives you forward compatibility only if the set of -ops has not changed: - -1. If forward compatibility is desired, set `strip_default_attrs` to `True` - while exporting the model using either the - @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph_and_variables$`add_meta_graph_and_variables`} - and @{tf.saved_model.builder.SavedModelBuilder.add_meta_graph$`add_meta_graph`} - methods of the `SavedModelBuilder` class, or - @{tf.estimator.Estimator.export_savedmodel$`Estimator.export_savedmodel`} -2. This strips off the default valued attributes at the time of - producing/exporting the models. This makes sure that the exported - @{tf.MetaGraphDef} does not contain the new op-attribute when the default - value is used. -3. Having this control could allow out-of-date consumers (for example, serving - binaries that lag behind training binaries) to continue loading the models - and prevent interruptions in model serving. - -### Evolving GraphDef versions - -This section explains how to use this versioning mechanism to make different -types of changes to the `GraphDef` format. - -#### Add an op - -Add the new op to both consumers and producers at the same time, and do not -change any `GraphDef` versions. This type of change is automatically -backward compatible, and does not impact forward compatibility plan since -existing producer scripts will not suddenly use the new functionality. - -#### Add an op and switch existing Python wrappers to use it - -1. Implement new consumer functionality and increment the `GraphDef` version. -2. If it is possible to make the wrappers use the new functionality only in - cases that did not work before, the wrappers can be updated now. -3. Change Python wrappers to use the new functionality. Do not increment - `min_consumer`, since models that do not use this op should not break. - -#### Remove or restrict an op's functionality - -1. Fix all producer scripts (not TensorFlow itself) to not use the banned op or - functionality. -2. Increment the `GraphDef` version and implement new consumer functionality - that bans the removed op or functionality for GraphDefs at the new version - and above. If possible, make TensorFlow stop producing `GraphDefs` with the - banned functionality. To do so, add the - [`REGISTER_OP(...).Deprecated(deprecated_at_version, - message)`](https://github.com/tensorflow/tensorflow/blob/b289bc7a50fc0254970c60aaeba01c33de61a728/tensorflow/core/ops/array_ops.cc#L1009). -3. Wait for a major release for backward compatibility purposes. -4. 
Increase `min_producer` to the GraphDef version from (2) and remove the - functionality entirely. - -#### Change an op's functionality - -1. Add a new similar op named `SomethingV2` or similar and go through the - process of adding it and switching existing Python wrappers to use it, which - may take three weeks if forward compatibility is desired. -2. Remove the old op (Can only take place with a major version change due to - backward compatibility). -3. Increase `min_consumer` to rule out consumers with the old op, add back the - old op as an alias for `SomethingV2`, and go through the process to switch - existing Python wrappers to use it. -4. Go through the process to remove `SomethingV2`. - -#### Ban a single unsafe consumer version - -1. Bump the `GraphDef` version and add the bad version to `bad_consumers` for - all new GraphDefs. If possible, add to `bad_consumers` only for GraphDefs - which contain a certain op or similar. -2. If existing consumers have the bad version, push them out as soon as - possible. diff --git a/tensorflow/docs_src/tutorials/deep_cnn.md b/tensorflow/docs_src/tutorials/deep_cnn.md index 6a4c9a9b07..44a32d9d1d 100644 --- a/tensorflow/docs_src/tutorials/deep_cnn.md +++ b/tensorflow/docs_src/tutorials/deep_cnn.md @@ -268,7 +268,7 @@ in `cifar10_input.py`. `cifar10_train.py` periodically @{tf.train.Saver$saves} all model parameters in -@{$programmers_guide/saved_model$checkpoint files} +@{$guide/saved_model$checkpoint files} but it does *not* evaluate the model. The checkpoint file will be used by `cifar10_eval.py` to measure the predictive performance (see [Evaluating a Model](#evaluating-a-model) below). diff --git a/tensorflow/docs_src/tutorials/layers.md b/tensorflow/docs_src/tutorials/layers.md index 0f17899dae..212e337637 100644 --- a/tensorflow/docs_src/tutorials/layers.md +++ b/tensorflow/docs_src/tutorials/layers.md @@ -627,7 +627,7 @@ operation earlier when we generated the probabilities in `cnn_model_fn`. > argument, TensorFlow will assign a default name. A couple easy ways to > discover the names applied to operations are to visualize your graph on > @{$graph_viz$TensorBoard}) or to enable the -> @{$programmers_guide/debugger$TensorFlow Debugger (tfdbg)}. +> @{$guide/debugger$TensorFlow Debugger (tfdbg)}. Next, we create the `LoggingTensorHook`, passing `tensors_to_log` to the `tensors` argument. We set `every_n_iter=50`, which specifies that probabilities diff --git a/tensorflow/examples/how_tos/reading_data/fully_connected_reader.py b/tensorflow/examples/how_tos/reading_data/fully_connected_reader.py index 307eede5c0..7402247448 100644 --- a/tensorflow/examples/how_tos/reading_data/fully_connected_reader.py +++ b/tensorflow/examples/how_tos/reading_data/fully_connected_reader.py @@ -17,7 +17,7 @@ This version is like fully_connected_feed.py but uses data converted to a TFRecords file containing tf.train.Example protocol buffers. See: -https://www.tensorflow.org/programmers_guide/reading_data#reading_from_files +https://www.tensorflow.org/guide/reading_data#reading_from_files for context. YOU MUST run convert_to_records before running this (but you only need to diff --git a/tensorflow/java/README.md b/tensorflow/java/README.md index 2f1ce253b2..c7382ff231 100644 --- a/tensorflow/java/README.md +++ b/tensorflow/java/README.md @@ -1,7 +1,7 @@ # TensorFlow for Java > *WARNING*: The TensorFlow Java API is not currently covered by the TensorFlow -> [API stability guarantees](https://www.tensorflow.org/programmers_guide/version_semantics). 
+> [API stability guarantees](https://www.tensorflow.org/guide/version_semantics). > > For using TensorFlow on Android refer instead to > [contrib/android](https://www.tensorflow.org/code/tensorflow/contrib/android), @@ -23,8 +23,7 @@ native libraries will need to be built from source. 2. Setup the environment to build TensorFlow from source code ([Linux](https://www.tensorflow.org/install/install_sources#PrepareLinux) - or [Mac OS - X](https://www.tensorflow.org/install/install_sources#PrepareMac)). + or [macOS](https://www.tensorflow.org/install/install_sources#PrepareMac)). If you'd like to skip reading those details and do not care about GPU support, try the following: diff --git a/tensorflow/java/src/main/java/org/tensorflow/package-info.java b/tensorflow/java/src/main/java/org/tensorflow/package-info.java index 521c5c610c..f353ee3145 100644 --- a/tensorflow/java/src/main/java/org/tensorflow/package-info.java +++ b/tensorflow/java/src/main/java/org/tensorflow/package-info.java @@ -17,7 +17,7 @@ limitations under the License. * Defines classes to build, save, load and execute TensorFlow models. * *
<p><b>WARNING</b>: The API is currently experimental and is not covered by TensorFlow <a
- * href="https://www.tensorflow.org/programmers_guide/version_semantics">API stability
+ * href="https://www.tensorflow.org/guide/version_semantics">API stability
 * guarantees</a>. See <a href="README.md">README.md</a> for installation
 * instructions.
diff --git a/tensorflow/python/data/__init__.py b/tensorflow/python/data/__init__.py
index 7efe0948e7..3b9bf2469e 100644
--- a/tensorflow/python/data/__init__.py
+++ b/tensorflow/python/data/__init__.py
@@ -14,7 +14,7 @@
 # ==============================================================================
 """`tf.data.Dataset` API for input pipelines.
 
-See the @{$datasets$Importing Data} Programmer's Guide for an overview.
+See @{$guide/datasets$Importing Data} for an overview.
 """
 
 from __future__ import absolute_import
diff --git a/tensorflow/python/data/ops/dataset_ops.py b/tensorflow/python/data/ops/dataset_ops.py
index 6f9b12b123..0e020d86d0 100644
--- a/tensorflow/python/data/ops/dataset_ops.py
+++ b/tensorflow/python/data/ops/dataset_ops.py
@@ -212,6 +212,13 @@ class Dataset(object):
   def from_tensors(tensors):
     """Creates a `Dataset` with a single element, comprising the given tensors.
 
+    Note that if `tensors` contains a NumPy array, and eager execution is not
+    enabled, the values will be embedded in the graph as one or more
+    @{tf.constant} operations. For large datasets (> 1 GB), this can waste
+    memory and run into byte limits of graph serialization. If tensors contains
+    one or more large NumPy arrays, consider the alternative described in
+    @{$guide/datasets#consuming_numpy_arrays$this guide}.
+
     Args:
       tensors: A nested structure of tensors.
 
@@ -224,6 +231,13 @@ class Dataset(object):
   def from_tensor_slices(tensors):
     """Creates a `Dataset` whose elements are slices of the given tensors.
 
+    Note that if `tensors` contains a NumPy array, and eager execution is not
+    enabled, the values will be embedded in the graph as one or more
+    @{tf.constant} operations. For large datasets (> 1 GB), this can waste
+    memory and run into byte limits of graph serialization. If tensors contains
+    one or more large NumPy arrays, consider the alternative described in
+    @{$guide/datasets#consuming_numpy_arrays$this guide}.
+
     Args:
       tensors: A nested structure of tensors, each having the same size in the
        0th dimension.
diff --git a/tensorflow/python/debug/BUILD b/tensorflow/python/debug/BUILD
index 09062abd74..2d261f9be7 100644
--- a/tensorflow/python/debug/BUILD
+++ b/tensorflow/python/debug/BUILD
@@ -5,7 +5,7 @@
 #
 # ":debug_py": Public Python methods and classes of tfdbg.
 #   For API documentation, see https://www.tensorflow.org/api_docs/python/tfdbg
-#   For a user interface walkthrough, see https://www.tensorflow.org/programmers_guide/debugger
+#   For a user interface walkthrough, see https://www.tensorflow.org/guide/debugger
 # ":grpc_debug_server": Server interface for grpc:// debug URLs.
 
 package(
diff --git a/tensorflow/python/debug/README.md b/tensorflow/python/debug/README.md
index 269bbb19bd..9c16af4d79 100644
--- a/tensorflow/python/debug/README.md
+++ b/tensorflow/python/debug/README.md
@@ -28,7 +28,7 @@ models:
 * Easy access through session wrappers
 * Easy integration with common high-level APIs, such as
-  [TensorFlow Estimators](https://www.tensorflow.org/programmers_guide/estimators) and
+  [TensorFlow Estimators](https://www.tensorflow.org/guide/estimators) and
   [Keras](https://keras.io/)
 * Inspection of runtime tensor values and node connections
 * Conditional breaking after runs that generate tensors satisfying given
@@ -43,7 +43,7 @@ models:
 
 ## How to use TFDBG?
-* For a walkthrough of TFDBG command-line interface, see https://www.tensorflow.org/programmers_guide/debugger. +* For a walkthrough of TFDBG command-line interface, see https://www.tensorflow.org/guide/debugger. * For information on the web GUI of TFDBG (TensorBoard Debugger Plugin), see [this README](https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/debugger/README.md). * For programmatic use of the API of TFDBG, see https://www.tensorflow.org/api_docs/python/tfdbg. diff --git a/tensorflow/python/debug/examples/README.md b/tensorflow/python/debug/examples/README.md index cb4d484092..3b431e04dc 100644 --- a/tensorflow/python/debug/examples/README.md +++ b/tensorflow/python/debug/examples/README.md @@ -3,7 +3,7 @@ Hi, there! The documentation of **TensorFlow Debugger (tfdbg)** has moved. See the source version at -[this new location](../../../docs_src/programmers_guide/debugger.md). +[this new location](../../../docs_src/guide/debugger.md). See the public website version at -[https://www.tensorflow.org/programmers_guide/debugger](https://www.tensorflow.org/programmers_guide/debugger). +[https://www.tensorflow.org/guide/debugger](https://www.tensorflow.org/guide/debugger). diff --git a/tensorflow/python/estimator/keras.py b/tensorflow/python/estimator/keras.py index 2f439f765e..6856b8b5a9 100644 --- a/tensorflow/python/estimator/keras.py +++ b/tensorflow/python/estimator/keras.py @@ -455,7 +455,7 @@ def model_to_estimator(keras_model=None, """Constructs an `Estimator` instance from given keras model. For usage example, please see - @{$programmers_guide/estimators$creating_estimators_from_keras_models}. + @{$guide/estimators$creating_estimators_from_keras_models}. Args: keras_model: A compiled Keras model object. This argument is mutually diff --git a/tensorflow/python/ops/script_ops.py b/tensorflow/python/ops/script_ops.py index 16c73213d5..f8df9b2c78 100644 --- a/tensorflow/python/ops/script_ops.py +++ b/tensorflow/python/ops/script_ops.py @@ -267,7 +267,7 @@ def eager_py_func(func, inp, Tout, name=None): or print statements as desired, and wrap those functions in `tf.contrib.eager.py_func`. - For more information on eager execution, see @{$programmers_guide/eager}. + For more information on eager execution, see @{$guide/eager}. `tf.contrib.eager.py_func` is similar in spirit to @{tf.py_func}, but unlike the latter, the former lets you use TensorFlow operations in the wrapped diff --git a/tensorflow/python/tools/saved_model_cli.py b/tensorflow/python/tools/saved_model_cli.py index 5b9d25d449..38fed5335e 100644 --- a/tensorflow/python/tools/saved_model_cli.py +++ b/tensorflow/python/tools/saved_model_cli.py @@ -15,7 +15,7 @@ """Command-line interface to inspect and execute a graph in a SavedModel. 
For detailed usages and examples, please refer to:
-https://www.tensorflow.org/programmers_guide/saved_model_cli
+https://www.tensorflow.org/guide/saved_model_cli
 """
 
@@ -720,7 +720,7 @@ def create_parser():
       '\'input4_key=[{"id":[26],"weights":[0.5, 0.5]}]\' \\\n'
       '  --outdir=/out\n\n'
      'For more information about input file format, please see:\n'
-      'https://www.tensorflow.org/programmers_guide/saved_model_cli\n')
+      'https://www.tensorflow.org/guide/saved_model_cli\n')
   parser_run = subparsers.add_parser(
       'run', description=run_msg, formatter_class=argparse.RawTextHelpFormatter)
   parser_run.add_argument(
diff --git a/third_party/examples/eager/spinn/README.md b/third_party/examples/eager/spinn/README.md
index fbb1fde837..e2fd8009a0 100644
--- a/third_party/examples/eager/spinn/README.md
+++ b/third_party/examples/eager/spinn/README.md
@@ -22,7 +22,7 @@ Other eager execution examples can be found under [tensorflow/contrib/eager/pyth
   - [`data.py`](../../../../tensorflow/contrib/eager/python/examples/spinn/data.py): Pipeline for loading and preprocessing the
     [SNLI](https://nlp.stanford.edu/projects/snli/) data and
     [GloVe](https://nlp.stanford.edu/projects/glove/) word embedding, written
-    using the [`tf.data`](https://www.tensorflow.org/programmers_guide/datasets)
+    using the [`tf.data`](https://www.tensorflow.org/guide/datasets)
    API.
  - [`spinn.py`](./spinn.py): Model definition and training routines. This
    example illustrates how one might perform the following actions with
-- 
cgit v1.2.3


From ab60fbc1fcfc600b800ad12c9f76cfccc4fb7087 Mon Sep 17 00:00:00 2001
From: Pete Warden
Date: Tue, 26 Jun 2018 09:32:12 -0700
Subject: Fix for RPi OpenBLAS compile issues, by pinning to known good version

---
 tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh | 8 ++++++++
 tensorflow/tools/ci_build/pi/build_raspberry_pi.sh                | 4 ++++
 2 files changed, 12 insertions(+)

diff --git a/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh b/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh
index 9d8e3df3b5..4afb2f1534 100755
--- a/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh
+++ b/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh
@@ -27,3 +27,11 @@ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add -
 apt-get update
 rm -rf /usr/local/bin/bazel
 apt-get install -y bazel python3 python3-numpy python3-dev python3-pip
+
+# We're using Ubuntu 14.04 as our base image because that's needed by the Pi
+# cross-compilation chain, but that doesn't have built-in Python 3.5 support, so
+# install from a separate repository.
+apt-get install -y software-properties-common
+add-apt-repository ppa:fkrull/deadsnakes
+apt-get update
+apt-get install -y python3.5 python3.5-dev
diff --git a/tensorflow/tools/ci_build/pi/build_raspberry_pi.sh b/tensorflow/tools/ci_build/pi/build_raspberry_pi.sh
index 4d1a30601e..5eff3e415d 100755
--- a/tensorflow/tools/ci_build/pi/build_raspberry_pi.sh
+++ b/tensorflow/tools/ci_build/pi/build_raspberry_pi.sh
@@ -65,6 +65,10 @@ OPENBLAS_SRC_PATH=/tmp/openblas_src/
 sudo rm -rf ${OPENBLAS_SRC_PATH}
 git clone https://github.com/xianyi/OpenBLAS ${OPENBLAS_SRC_PATH}
 cd ${OPENBLAS_SRC_PATH}
+# The commit after this introduced Fortran compile issues. In theory they should
+# be solvable using NOFORTRAN=1 on the make command, but my initial tries didn't
+# work, so pinning to the last known good version.
+git checkout 5a6a2bed9aff0ba8a18651d5514d029c8cae336a # If this path is changed, you'll also need to update # cxx_builtin_include_directory in third_party/toolchains/cpus/arm/CROSSTOOL.tpl OPENBLAS_INSTALL_PATH=/tmp/openblas_install/ -- cgit v1.2.3 From fe374d31f38ba7fa84284b58d28c55dc0087f2b3 Mon Sep 17 00:00:00 2001 From: Pete Warden Date: Tue, 26 Jun 2018 09:36:22 -0700 Subject: Removed Python 3.5 updates for RPi --- tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh | 8 -------- 1 file changed, 8 deletions(-) diff --git a/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh b/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh index 4afb2f1534..9d8e3df3b5 100755 --- a/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh +++ b/tensorflow/tools/ci_build/install/install_pi_python3_toolchain.sh @@ -27,11 +27,3 @@ curl https://bazel.build/bazel-release.pub.gpg | sudo apt-key add - apt-get update rm -rf /usr/local/bin/bazel apt-get install -y bazel python3 python3-numpy python3-dev python3-pip - -# We're using Ubuntu 14.04 as our base image because that's needed by the Pi -# cross-compilation chain, but that doesn't have built-in Python 3.5 support, so -# install from a separate repository. -apt-get install -y software-properties-common -add-apt-repository ppa:fkrull/deadsnakes -apt-get update -apt-get install -y python3.5 python3.5-dev -- cgit v1.2.3 From 8025ac34099ed1b38c3cf0c0f84244496b42fedb Mon Sep 17 00:00:00 2001 From: Michael Case Date: Tue, 26 Jun 2018 13:05:25 -0700 Subject: Moving StatusOr from XLA to stream_executor. PiperOrigin-RevId: 202179928 --- tensorflow/compiler/xla/BUILD | 17 +- tensorflow/compiler/xla/service/gpu/BUILD | 1 + .../xla/service/gpu/stream_executor_util.h | 1 + tensorflow/compiler/xla/statusor.cc | 38 -- tensorflow/compiler/xla/statusor.h | 286 +-------- tensorflow/compiler/xla/statusor_internals.h | 245 -------- tensorflow/compiler/xla/statusor_test.cc | 675 -------------------- tensorflow/stream_executor/BUILD | 2 - tensorflow/stream_executor/lib/statusor.cc | 40 ++ tensorflow/stream_executor/lib/statusor.h | 290 ++++++++- .../stream_executor/lib/statusor_internals.h | 248 ++++++++ tensorflow/stream_executor/lib/statusor_test.cc | 676 +++++++++++++++++++++ 12 files changed, 1254 insertions(+), 1265 deletions(-) delete mode 100644 tensorflow/compiler/xla/statusor.cc delete mode 100644 tensorflow/compiler/xla/statusor_internals.h delete mode 100644 tensorflow/compiler/xla/statusor_test.cc create mode 100644 tensorflow/stream_executor/lib/statusor.cc create mode 100644 tensorflow/stream_executor/lib/statusor_internals.h create mode 100644 tensorflow/stream_executor/lib/statusor_test.cc diff --git a/tensorflow/compiler/xla/BUILD b/tensorflow/compiler/xla/BUILD index c6deb959a5..afa8ce730b 100644 --- a/tensorflow/compiler/xla/BUILD +++ b/tensorflow/compiler/xla/BUILD @@ -143,30 +143,15 @@ cc_library( cc_library( name = "statusor", - srcs = ["statusor.cc"], hdrs = [ "statusor.h", - "statusor_internals.h", ], visibility = ["//visibility:public"], deps = [ ":status", "//tensorflow/core:lib", "//tensorflow/core:lib_internal", - ], -) - -tf_cc_test( - name = "statusor_test", - size = "small", - srcs = ["statusor_test.cc"], - deps = [ - ":statusor", - ":test", - ":types", - "//tensorflow/core:lib", - "//tensorflow/core:test", - "//tensorflow/core:test_main", + "//tensorflow/stream_executor", ], ) diff --git a/tensorflow/compiler/xla/service/gpu/BUILD b/tensorflow/compiler/xla/service/gpu/BUILD index 
68297ad4ae..fe597bfb45 100644
--- a/tensorflow/compiler/xla/service/gpu/BUILD
+++ b/tensorflow/compiler/xla/service/gpu/BUILD
@@ -727,6 +727,7 @@ cc_library(
     hdrs = ["stream_executor_util.h"],
     deps = [
         "//tensorflow/compiler/xla:shape_util",
+        "//tensorflow/compiler/xla:statusor",
         "//tensorflow/compiler/xla:xla_data_proto",
         "//tensorflow/core:stream_executor_no_cuda",
     ],
diff --git a/tensorflow/compiler/xla/service/gpu/stream_executor_util.h b/tensorflow/compiler/xla/service/gpu/stream_executor_util.h
index 8218f4fd11..39a6a38d00 100644
--- a/tensorflow/compiler/xla/service/gpu/stream_executor_util.h
+++ b/tensorflow/compiler/xla/service/gpu/stream_executor_util.h
@@ -16,6 +16,7 @@ limitations under the License.
 #ifndef TENSORFLOW_COMPILER_XLA_SERVICE_GPU_STREAM_EXECUTOR_UTIL_H_
 #define TENSORFLOW_COMPILER_XLA_SERVICE_GPU_STREAM_EXECUTOR_UTIL_H_
 
+#include "tensorflow/compiler/xla/statusor.h"
 #include "tensorflow/compiler/xla/xla_data.pb.h"
 #include "tensorflow/core/platform/stream_executor_no_cuda.h"
 
diff --git a/tensorflow/compiler/xla/statusor.cc b/tensorflow/compiler/xla/statusor.cc
deleted file mode 100644
index 72ab67ff81..0000000000
--- a/tensorflow/compiler/xla/statusor.cc
+++ /dev/null
@@ -1,38 +0,0 @@
-/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-==============================================================================*/
-
-#include "tensorflow/compiler/xla/statusor.h"
-
-#include "tensorflow/core/lib/core/errors.h"
-#include "tensorflow/core/platform/logging.h"
-
-namespace xla {
-namespace internal_statusor {
-
-void Helper::HandleInvalidStatusCtorArg(Status* status) {
-  const char* kMessage =
-      "An OK status is not a valid constructor argument to StatusOr<T>";
-  LOG(ERROR) << kMessage;
-  // Fall back to tensorflow::error::INTERNAL.
-  *status = ::tensorflow::errors::Internal(kMessage);
-}
-
-void Helper::Crash(const Status& status) {
-  LOG(FATAL) << "Attempting to fetch value instead of handling error "
-             << status;
-}
-
-}  // namespace internal_statusor
-}  // namespace xla
diff --git a/tensorflow/compiler/xla/statusor.h b/tensorflow/compiler/xla/statusor.h
index 0e1387c939..a32e2ad985 100644
--- a/tensorflow/compiler/xla/statusor.h
+++ b/tensorflow/compiler/xla/statusor.h
@@ -12,297 +12,17 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
-
-// StatusOr<T> is the union of a Status object and a T object. StatusOr models
-// the concept of an object that is either a value, or an error Status
-// explaining why such a value is not present. To this end, StatusOr<T> does not
-// allow its Status value to be Status::OK.
-//
-// The primary use-case for StatusOr<T> is as the return value of a
-// function which may fail.
-//
-// Example client usage for a StatusOr<T>, where T is not a pointer:
-//
-//  StatusOr<float> result = DoBigCalculationThatCouldFail();
-//  if (result.ok()) {
-//    float answer = result.ValueOrDie();
-//    printf("Big calculation yielded: %f", answer);
-//  } else {
-//    LOG(ERROR) << result.status();
-//  }
-//
-// Example client usage for a StatusOr<T*>:
-//
-//  StatusOr<Foo*> result = FooFactory::MakeNewFoo(arg);
-//  if (result.ok()) {
-//    std::unique_ptr<Foo> foo(result.ValueOrDie());
-//    foo->DoSomethingCool();
-//  } else {
-//    LOG(ERROR) << result.status();
-//  }
-//
-// Example client usage for a StatusOr<std::unique_ptr<T>>:
-//
-//  StatusOr<std::unique_ptr<Foo>> result = FooFactory::MakeNewFoo(arg);
-//  if (result.ok()) {
-//    std::unique_ptr<Foo> foo = std::move(result.ValueOrDie());
-//    foo->DoSomethingCool();
-//  } else {
-//    LOG(ERROR) << result.status();
-//  }
-//
-// Example factory implementation returning StatusOr<T*>:
-//
-//  StatusOr<Foo*> FooFactory::MakeNewFoo(int arg) {
-//    if (arg <= 0) {
-//      return tensorflow::InvalidArgument("Arg must be positive");
-//    } else {
-//      return new Foo(arg);
-//    }
-//  }
-//
-// Note that the assignment operators require that destroying the currently
-// stored value cannot invalidate the argument; in other words, the argument
-// cannot be an alias for the current value, or anything owned by the current
-// value.
 #ifndef TENSORFLOW_COMPILER_XLA_STATUSOR_H_
 #define TENSORFLOW_COMPILER_XLA_STATUSOR_H_
 
 #include "tensorflow/compiler/xla/status.h"
-#include "tensorflow/compiler/xla/statusor_internals.h"
-#include "tensorflow/core/platform/macros.h"
+#include "tensorflow/stream_executor/lib/statusor.h"
 
 namespace xla {
 
-#if defined(__clang__)
-// Only clang supports warn_unused_result as a type annotation.
-template <typename T>
-class TF_MUST_USE_RESULT StatusOr;
-#endif
-
-template <typename T>
-class StatusOr : private internal_statusor::StatusOrData<T>,
-                 private internal_statusor::TraitsBase<
-                     std::is_copy_constructible<T>::value,
-                     std::is_move_constructible<T>::value> {
-  template <typename U>
-  friend class StatusOr;
-
-  typedef internal_statusor::StatusOrData<T> Base;
-
- public:
-  typedef T element_type;
-
-  // Constructs a new StatusOr with Status::UNKNOWN status. This is marked
-  // 'explicit' to try to catch cases like 'return {};', where people think
-  // StatusOr<std::vector<int>> will be initialized with an empty vector,
-  // instead of a Status::UNKNOWN status.
-  explicit StatusOr();
-
-  // StatusOr<T> will be copy constructible/assignable if T is copy
-  // constructible.
-  StatusOr(const StatusOr&) = default;
-  StatusOr& operator=(const StatusOr&) = default;
-
-  // StatusOr<T> will be move constructible/assignable if T is move
-  // constructible.
-  StatusOr(StatusOr&&) = default;
-  StatusOr& operator=(StatusOr&&) = default;
-
-  // Conversion copy/move constructor, T must be convertible from U.
-  template <typename U, typename std::enable_if<
-                            std::is_convertible<U, T>::value>::type* = nullptr>
-  StatusOr(const StatusOr<U>& other);
-  template <typename U, typename std::enable_if<
-                            std::is_convertible<U, T>::value>::type* = nullptr>
-  StatusOr(StatusOr<U>&& other);
-
-  // Conversion copy/move assignment operator, T must be convertible from U.
-  template <typename U, typename std::enable_if<
-                            std::is_convertible<U, T>::value>::type* = nullptr>
-  StatusOr& operator=(const StatusOr<U>& other);
-  template <typename U, typename std::enable_if<
-                            std::is_convertible<U, T>::value>::type* = nullptr>
-  StatusOr& operator=(StatusOr<U>&& other);
-
-  // Constructs a new StatusOr with the given value. After calling this
-  // constructor, calls to ValueOrDie() will succeed, and calls to status() will
-  // return OK.
-  //
-  // NOTE: Not explicit - we want to use StatusOr<T> as a return type
-  // so it is convenient and sensible to be able to do 'return T()'
-  // when the return type is StatusOr<T>.
-  //
-  // REQUIRES: T is copy constructible.
-  StatusOr(const T& value);
-
-  // Constructs a new StatusOr with the given non-ok status. After calling
-  // this constructor, calls to ValueOrDie() will CHECK-fail.
-  //
-  // NOTE: Not explicit - we want to use StatusOr<T> as a return
-  // value, so it is convenient and sensible to be able to do 'return
-  // Status()' when the return type is StatusOr<T>.
-  //
-  // REQUIRES: !status.ok(). This requirement is DCHECKed.
-  // In optimized builds, passing Status::OK() here will have the effect
-  // of passing tensorflow::error::INTERNAL as a fallback.
-  StatusOr(const Status& status);
-  StatusOr& operator=(const Status& status);
-
-  // TODO(b/62186997): Add operator=(T) overloads.
-
-  // Similar to the `const T&` overload.
-  //
-  // REQUIRES: T is move constructible.
-  StatusOr(T&& value);
-
-  // RValue versions of the operations declared above.
-  StatusOr(Status&& status);
-  StatusOr& operator=(Status&& status);
-
-  // Returns this->status().ok()
-  bool ok() const { return this->status_.ok(); }
-
-  // Returns a reference to our status. If this contains a T, then
-  // returns Status::OK().
-  const Status& status() const &;
-  Status status() &&;
-
-  // Returns a reference to our current value, or CHECK-fails if !this->ok().
-  //
-  // Note: for value types that are cheap to copy, prefer simple code:
-  //
-  //   T value = statusor.ValueOrDie();
-  //
-  // Otherwise, if the value type is expensive to copy, but can be left
-  // in the StatusOr, simply assign to a reference:
-  //
-  //   T& value = statusor.ValueOrDie();  // or `const T&`
-  //
-  // Otherwise, if the value type supports an efficient move, it can be
-  // used as follows:
-  //
-  //   T value = std::move(statusor).ValueOrDie();
-  //
-  // The std::move on statusor instead of on the whole expression enables
-  // warnings about possible uses of the statusor object after the move.
-  // C++ style guide waiver for ref-qualified overloads granted in cl/143176389
-  // See go/ref-qualifiers for more details on such overloads.
-  const T& ValueOrDie() const &;
-  T& ValueOrDie() &;
-  const T&& ValueOrDie() const &&;
-  T&& ValueOrDie() &&;
-
-  T ConsumeValueOrDie() { return std::move(ValueOrDie()); }
-
-  // Ignores any errors. This method does nothing except potentially suppress
-  // complaints from any tools that are checking that errors are not dropped on
-  // the floor.
-  void IgnoreError() const;
-};
-
-////////////////////////////////////////////////////////////////////////////////
-// Implementation details for StatusOr<T>
-
-template <typename T>
-StatusOr<T>::StatusOr() : Base(Status(tensorflow::error::UNKNOWN, "")) {}
-
-template <typename T>
-StatusOr<T>::StatusOr(const T& value) : Base(value) {}
-
-template <typename T>
-StatusOr<T>::StatusOr(const Status& status) : Base(status) {}
-
-template <typename T>
-StatusOr<T>& StatusOr<T>::operator=(const Status& status) {
-  this->Assign(status);
-  return *this;
-}
-
-template <typename T>
-StatusOr<T>::StatusOr(T&& value) : Base(std::move(value)) {}
-
-template <typename T>
-StatusOr<T>::StatusOr(Status&& status) : Base(std::move(status)) {}
-
-template <typename T>
-StatusOr<T>& StatusOr<T>::operator=(Status&& status) {
-  this->Assign(std::move(status));
-  return *this;
-}
-
-template <typename T>
-template <typename U,
-          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
-inline StatusOr<T>::StatusOr(const StatusOr<U>& other)
-    : Base(static_cast<const typename StatusOr<U>::Base&>(other)) {}
-
-template <typename T>
-template <typename U,
-          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
-inline StatusOr<T>& StatusOr<T>::operator=(const StatusOr<U>& other) {
-  if (other.ok())
-    this->Assign(other.ValueOrDie());
-  else
-    this->Assign(other.status());
-  return *this;
-}
-
-template <typename T>
-template <typename U,
-          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
-inline StatusOr<T>::StatusOr(StatusOr<U>&& other)
-    : Base(static_cast<typename StatusOr<U>::Base&&>(other)) {}
-
-template <typename T>
-template <typename U,
-          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
-inline StatusOr<T>& StatusOr<T>::operator=(StatusOr<U>&& other) {
-  if (other.ok()) {
-    this->Assign(std::move(other).ValueOrDie());
-  } else {
-    this->Assign(std::move(other).status());
-  }
-  return *this;
-}
-
-template <typename T>
-const Status& StatusOr<T>::status() const & {
-  return this->status_;
-}
-template <typename T>
-Status StatusOr<T>::status() && {
-  return ok() ? Status::OK() : std::move(this->status_);
-}
-
-template <typename T>
-const T& StatusOr<T>::ValueOrDie() const & {
-  this->EnsureOk();
-  return this->data_;
-}
-
-template <typename T>
-T& StatusOr<T>::ValueOrDie() & {
-  this->EnsureOk();
-  return this->data_;
-}
-
-template <typename T>
-const T&& StatusOr<T>::ValueOrDie() const && {
-  this->EnsureOk();
-  return std::move(this->data_);
-}
-
-template <typename T>
-T&& StatusOr<T>::ValueOrDie() && {
-  this->EnsureOk();
-  return std::move(this->data_);
-}
-
-template <typename T>
-void StatusOr<T>::IgnoreError() const {
-  // no-op
-}
+// Use stream_executor's StatusOr so we don't duplicate code.
+template <typename T>
+using StatusOr = ::stream_executor::port::StatusOr<T>;
 
 }  // namespace xla
diff --git a/tensorflow/compiler/xla/statusor_internals.h b/tensorflow/compiler/xla/statusor_internals.h
deleted file mode 100644
index 14636bd144..0000000000
--- a/tensorflow/compiler/xla/statusor_internals.h
+++ /dev/null
@@ -1,245 +0,0 @@
-/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
-
-Licensed under the Apache License, Version 2.0 (the "License");
-you may not use this file except in compliance with the License.
-You may obtain a copy of the License at
-
-    http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-==============================================================================*/
-
-#ifndef TENSORFLOW_COMPILER_XLA_STATUSOR_INTERNALS_H_
-#define TENSORFLOW_COMPILER_XLA_STATUSOR_INTERNALS_H_
-
-#include "tensorflow/compiler/xla/status.h"
-#include "tensorflow/core/platform/macros.h"
-
-namespace xla {
-namespace internal_statusor {
-
-class Helper {
- public:
-  // Move type-agnostic error handling to the .cc.
- static void HandleInvalidStatusCtorArg(Status*); - TF_ATTRIBUTE_NORETURN static void Crash(const Status& status); -}; - -// Construct an instance of T in `p` through placement new, passing Args... to -// the constructor. -// This abstraction is here mostly for the gcc performance fix. -template -void PlacementNew(void* p, Args&&... args) { -#if defined(__GNUC__) && !defined(__clang__) - // Teach gcc that 'p' cannot be null, fixing code size issues. - if (p == nullptr) __builtin_unreachable(); -#endif - new (p) T(std::forward(args)...); -} - -// Helper base class to hold the data and all operations. -// We move all this to a base class to allow mixing with the appropriate -// TraitsBase specialization. -template -class StatusOrData { - template - friend class StatusOrData; - - public: - StatusOrData() = delete; - - StatusOrData(const StatusOrData& other) { - if (other.ok()) { - MakeValue(other.data_); - MakeStatus(); - } else { - MakeStatus(other.status_); - } - } - - StatusOrData(StatusOrData&& other) noexcept { - if (other.ok()) { - MakeValue(std::move(other.data_)); - MakeStatus(); - } else { - MakeStatus(std::move(other.status_)); - } - } - - template - StatusOrData(const StatusOrData& other) { - if (other.ok()) { - MakeValue(other.data_); - MakeStatus(); - } else { - MakeStatus(other.status_); - } - } - - template - StatusOrData(StatusOrData&& other) { - if (other.ok()) { - MakeValue(std::move(other.data_)); - MakeStatus(); - } else { - MakeStatus(std::move(other.status_)); - } - } - - explicit StatusOrData(const T& value) : data_(value) { MakeStatus(); } - explicit StatusOrData(T&& value) : data_(std::move(value)) { MakeStatus(); } - - explicit StatusOrData(const Status& status) : status_(status) { - EnsureNotOk(); - } - explicit StatusOrData(Status&& status) : status_(std::move(status)) { - EnsureNotOk(); - } - - StatusOrData& operator=(const StatusOrData& other) { - if (this == &other) return *this; - if (other.ok()) - Assign(other.data_); - else - Assign(other.status_); - return *this; - } - - StatusOrData& operator=(StatusOrData&& other) { - if (this == &other) return *this; - if (other.ok()) - Assign(std::move(other.data_)); - else - Assign(std::move(other.status_)); - return *this; - } - - ~StatusOrData() { - if (ok()) { - status_.~Status(); - data_.~T(); - } else { - status_.~Status(); - } - } - - void Assign(const T& value) { - if (ok()) { - data_.~T(); - MakeValue(value); - } else { - MakeValue(value); - status_ = Status::OK(); - } - } - - void Assign(T&& value) { - if (ok()) { - data_.~T(); - MakeValue(std::move(value)); - } else { - MakeValue(std::move(value)); - status_ = Status::OK(); - } - } - - void Assign(const Status& status) { - Clear(); - status_ = status; - EnsureNotOk(); - } - - void Assign(Status&& status) { - Clear(); - status_ = std::move(status); - EnsureNotOk(); - } - - bool ok() const { return status_.ok(); } - - protected: - // status_ will always be active after the constructor. - // We make it a union to be able to initialize exactly how we need without - // waste. - // Eg. in the copy constructor we use the default constructor of Status in - // the ok() path to avoid an extra Ref call. - union { - Status status_; - }; - - // data_ is active iff status_.ok()==true - struct Dummy {}; - union { - // When T is const, we need some non-const object we can cast to void* for - // the placement new. dummy_ is that object. 
- Dummy dummy_; - T data_; - }; - - void Clear() { - if (ok()) data_.~T(); - } - - void EnsureOk() const { - if (!ok()) Helper::Crash(status_); - } - - void EnsureNotOk() { - if (ok()) Helper::HandleInvalidStatusCtorArg(&status_); - } - - // Construct the value (ie. data_) through placement new with the passed - // argument. - template - void MakeValue(Arg&& arg) { - internal_statusor::PlacementNew(&dummy_, std::forward(arg)); - } - - // Construct the status (ie. status_) through placement new with the passed - // argument. - template - void MakeStatus(Args&&... args) { - internal_statusor::PlacementNew(&status_, - std::forward(args)...); - } -}; - -// Helper base class to allow implicitly deleted constructors and assignment -// operations in StatusOr. -// TraitsBase will explicitly delete what it can't support and StatusOr will -// inherit that behavior implicitly. -template -struct TraitsBase { - TraitsBase() = default; - TraitsBase(const TraitsBase&) = default; - TraitsBase(TraitsBase&&) = default; - TraitsBase& operator=(const TraitsBase&) = default; - TraitsBase& operator=(TraitsBase&&) = default; -}; - -template <> -struct TraitsBase { - TraitsBase() = default; - TraitsBase(const TraitsBase&) = delete; - TraitsBase(TraitsBase&&) = default; - TraitsBase& operator=(const TraitsBase&) = delete; - TraitsBase& operator=(TraitsBase&&) = default; -}; - -template <> -struct TraitsBase { - TraitsBase() = default; - TraitsBase(const TraitsBase&) = delete; - TraitsBase(TraitsBase&&) = delete; - TraitsBase& operator=(const TraitsBase&) = delete; - TraitsBase& operator=(TraitsBase&&) = delete; -}; - -} // namespace internal_statusor -} // namespace xla - -#endif // TENSORFLOW_COMPILER_XLA_STATUSOR_INTERNALS_H_ diff --git a/tensorflow/compiler/xla/statusor_test.cc b/tensorflow/compiler/xla/statusor_test.cc deleted file mode 100644 index 377a618ffb..0000000000 --- a/tensorflow/compiler/xla/statusor_test.cc +++ /dev/null @@ -1,675 +0,0 @@ -/* Copyright 2017 The TensorFlow Authors. All Rights Reserved. - -Licensed under the Apache License, Version 2.0 (the "License"); -you may not use this file except in compliance with the License. -You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - -Unless required by applicable law or agreed to in writing, software -distributed under the License is distributed on an "AS IS" BASIS, -WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -See the License for the specific language governing permissions and -limitations under the License. 
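The TraitsBase specializations that close the internals header above are the entire copy/move-control mechanism: deleting a special member in a base class implicitly deletes the corresponding defaulted member of StatusOr. A standalone sketch of the observable effect for a move-only payload (uses only this header and the standard library):

    #include <memory>
    #include <type_traits>

    #include "tensorflow/compiler/xla/statusor.h"

    // TraitsBase<false, true> is selected for T = std::unique_ptr<int>, so the
    // copy operations are deleted and only moves remain:
    static_assert(
        !std::is_copy_constructible<xla::StatusOr<std::unique_ptr<int>>>::value,
        "copy should be deleted for a move-only T");
    static_assert(
        std::is_move_constructible<xla::StatusOr<std::unique_ptr<int>>>::value,
        "move should remain available");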
-==============================================================================*/ - -// Unit tests for StatusOr - -#include "tensorflow/compiler/xla/statusor.h" - -#include -#include - -#include "tensorflow/compiler/xla/test.h" -#include "tensorflow/compiler/xla/types.h" -#include "tensorflow/core/lib/core/errors.h" -#include "tensorflow/core/platform/macros.h" -#include "tensorflow/core/platform/test_benchmark.h" - -namespace xla { -namespace { - -class Base1 { - public: - virtual ~Base1() {} - int pad_; -}; - -class Base2 { - public: - virtual ~Base2() {} - int yetotherpad_; -}; - -class Derived : public Base1, public Base2 { - public: - ~Derived() override {} - int evenmorepad_; -}; - -class CopyNoAssign { - public: - explicit CopyNoAssign(int value) : foo_(value) {} - CopyNoAssign(const CopyNoAssign& other) : foo_(other.foo_) {} - int foo_; - - private: - const CopyNoAssign& operator=(const CopyNoAssign&); -}; - -class NoDefaultConstructor { - public: - explicit NoDefaultConstructor(int foo); -}; - -static_assert(!std::is_default_constructible(), - "Should not be default-constructible."); - -StatusOr> ReturnUniquePtr() { - // Uses implicit constructor from T&& - return std::unique_ptr(new int(0)); -} - -TEST(StatusOr, ElementType) { - static_assert(std::is_same::element_type, int>(), ""); - static_assert(std::is_same::element_type, char>(), ""); -} - -TEST(StatusOr, NullPointerStatusOr) { - // As a very special case, null-plain-pointer StatusOr used to be an - // error. Test that it no longer is. - StatusOr null_status(nullptr); - EXPECT_TRUE(null_status.ok()); - EXPECT_EQ(null_status.ValueOrDie(), nullptr); -} - -TEST(StatusOr, TestNoDefaultConstructorInitialization) { - // Explicitly initialize it with an error code. - StatusOr statusor(tensorflow::errors::Cancelled("")); - EXPECT_FALSE(statusor.ok()); - EXPECT_EQ(statusor.status().code(), tensorflow::error::CANCELLED); - - // Default construction of StatusOr initializes it with an UNKNOWN error code. - StatusOr statusor2; - EXPECT_FALSE(statusor2.ok()); - EXPECT_EQ(statusor2.status().code(), tensorflow::error::UNKNOWN); -} - -TEST(StatusOr, TestMoveOnlyInitialization) { - StatusOr> thing(ReturnUniquePtr()); - ASSERT_TRUE(thing.ok()); - EXPECT_EQ(0, *thing.ValueOrDie()); - int* previous = thing.ValueOrDie().get(); - - thing = ReturnUniquePtr(); - EXPECT_TRUE(thing.ok()); - EXPECT_EQ(0, *thing.ValueOrDie()); - EXPECT_NE(previous, thing.ValueOrDie().get()); -} - -TEST(StatusOr, TestMoveOnlyStatusCtr) { - StatusOr> thing(tensorflow::errors::Cancelled("")); - ASSERT_FALSE(thing.ok()); -} - -TEST(StatusOr, TestMoveOnlyValueExtraction) { - StatusOr> thing(ReturnUniquePtr()); - ASSERT_TRUE(thing.ok()); - std::unique_ptr ptr = thing.ConsumeValueOrDie(); - EXPECT_EQ(0, *ptr); - - thing = std::move(ptr); - ptr = std::move(thing.ValueOrDie()); - EXPECT_EQ(0, *ptr); -} - -TEST(StatusOr, TestMoveOnlyConversion) { - StatusOr> const_thing(ReturnUniquePtr()); - EXPECT_TRUE(const_thing.ok()); - EXPECT_EQ(0, *const_thing.ValueOrDie()); - - // Test rvalue converting assignment - const int* const_previous = const_thing.ValueOrDie().get(); - const_thing = ReturnUniquePtr(); - EXPECT_TRUE(const_thing.ok()); - EXPECT_EQ(0, *const_thing.ValueOrDie()); - EXPECT_NE(const_previous, const_thing.ValueOrDie().get()); -} - -TEST(StatusOr, TestMoveOnlyVector) { - // Sanity check that StatusOr works in vector. 
- std::vector>> vec; - vec.push_back(ReturnUniquePtr()); - vec.resize(2); - auto another_vec = std::move(vec); - EXPECT_EQ(0, *another_vec[0].ValueOrDie()); - EXPECT_EQ(tensorflow::error::UNKNOWN, another_vec[1].status().code()); -} - -TEST(StatusOr, TestMoveWithValuesAndErrors) { - StatusOr status_or(string(1000, '0')); - StatusOr value1(string(1000, '1')); - StatusOr value2(string(1000, '2')); - StatusOr error1(Status(tensorflow::error::UNKNOWN, "error1")); - StatusOr error2(Status(tensorflow::error::UNKNOWN, "error2")); - - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '0'), status_or.ValueOrDie()); - - // Overwrite the value in status_or with another value. - status_or = std::move(value1); - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '1'), status_or.ValueOrDie()); - - // Overwrite the value in status_or with an error. - status_or = std::move(error1); - ASSERT_FALSE(status_or.ok()); - EXPECT_EQ("error1", status_or.status().error_message()); - - // Overwrite the error in status_or with another error. - status_or = std::move(error2); - ASSERT_FALSE(status_or.ok()); - EXPECT_EQ("error2", status_or.status().error_message()); - - // Overwrite the error with a value. - status_or = std::move(value2); - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '2'), status_or.ValueOrDie()); -} - -TEST(StatusOr, TestCopyWithValuesAndErrors) { - StatusOr status_or(string(1000, '0')); - StatusOr value1(string(1000, '1')); - StatusOr value2(string(1000, '2')); - StatusOr error1(Status(tensorflow::error::UNKNOWN, "error1")); - StatusOr error2(Status(tensorflow::error::UNKNOWN, "error2")); - - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '0'), status_or.ValueOrDie()); - - // Overwrite the value in status_or with another value. - status_or = value1; - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '1'), status_or.ValueOrDie()); - - // Overwrite the value in status_or with an error. - status_or = error1; - ASSERT_FALSE(status_or.ok()); - EXPECT_EQ("error1", status_or.status().error_message()); - - // Overwrite the error in status_or with another error. - status_or = error2; - ASSERT_FALSE(status_or.ok()); - EXPECT_EQ("error2", status_or.status().error_message()); - - // Overwrite the error with a value. - status_or = value2; - ASSERT_TRUE(status_or.ok()); - EXPECT_EQ(string(1000, '2'), status_or.ValueOrDie()); - - // Verify original values unchanged. 
- EXPECT_EQ(string(1000, '1'), value1.ValueOrDie()); - EXPECT_EQ("error1", error1.status().error_message()); - EXPECT_EQ("error2", error2.status().error_message()); - EXPECT_EQ(string(1000, '2'), value2.ValueOrDie()); -} - -TEST(StatusOr, TestDefaultCtor) { - StatusOr thing; - EXPECT_FALSE(thing.ok()); - EXPECT_EQ(thing.status().code(), tensorflow::error::UNKNOWN); -} - -TEST(StatusOrDeathTest, TestDefaultCtorValue) { - StatusOr thing; - EXPECT_DEATH(thing.ValueOrDie(), ""); - - const StatusOr thing2; - EXPECT_DEATH(thing.ValueOrDie(), ""); -} - -TEST(StatusOr, TestStatusCtor) { - StatusOr thing(Status(tensorflow::error::CANCELLED, "")); - EXPECT_FALSE(thing.ok()); - EXPECT_EQ(thing.status().code(), tensorflow::error::CANCELLED); -} - -TEST(StatusOr, TestValueCtor) { - const int kI = 4; - const StatusOr thing(kI); - EXPECT_TRUE(thing.ok()); - EXPECT_EQ(kI, thing.ValueOrDie()); -} - -TEST(StatusOr, TestCopyCtorStatusOk) { - const int kI = 4; - const StatusOr original(kI); - const StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); - EXPECT_EQ(original.ValueOrDie(), copy.ValueOrDie()); -} - -TEST(StatusOr, TestCopyCtorStatusNotOk) { - StatusOr original(Status(tensorflow::error::CANCELLED, "")); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); -} - -TEST(StatusOr, TestCopyCtorNonAssignable) { - const int kI = 4; - CopyNoAssign value(kI); - StatusOr original(value); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); - EXPECT_EQ(original.ValueOrDie().foo_, copy.ValueOrDie().foo_); -} - -TEST(StatusOr, TestCopyCtorStatusOKConverting) { - const int kI = 4; - StatusOr original(kI); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); - EXPECT_DOUBLE_EQ(original.ValueOrDie(), copy.ValueOrDie()); -} - -TEST(StatusOr, TestCopyCtorStatusNotOkConverting) { - StatusOr original(Status(tensorflow::error::CANCELLED, "")); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); -} - -TEST(StatusOr, TestAssignmentStatusOk) { - const int kI = 4; - StatusOr source(kI); - StatusOr target; - target = source; - EXPECT_EQ(target.status(), source.status()); - EXPECT_EQ(source.ValueOrDie(), target.ValueOrDie()); -} - -TEST(StatusOr, TestAssignmentStatusNotOk) { - StatusOr source(Status(tensorflow::error::CANCELLED, "")); - StatusOr target; - target = source; - EXPECT_EQ(target.status(), source.status()); -} - -TEST(StatusOr, TestStatus) { - StatusOr good(4); - EXPECT_TRUE(good.ok()); - StatusOr bad(Status(tensorflow::error::CANCELLED, "")); - EXPECT_FALSE(bad.ok()); - EXPECT_EQ(bad.status(), Status(tensorflow::error::CANCELLED, "")); -} - -TEST(StatusOr, TestValue) { - const int kI = 4; - StatusOr thing(kI); - EXPECT_EQ(kI, thing.ValueOrDie()); -} - -TEST(StatusOr, TestValueConst) { - const int kI = 4; - const StatusOr thing(kI); - EXPECT_EQ(kI, thing.ValueOrDie()); -} - -TEST(StatusOrDeathTest, TestValueNotOk) { - StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); - EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); -} - -TEST(StatusOrDeathTest, TestValueNotOkConst) { - const StatusOr thing(Status(tensorflow::error::UNKNOWN, "")); - EXPECT_DEATH(thing.ValueOrDie(), ""); -} - -TEST(StatusOr, TestPointerDefaultCtor) { - StatusOr thing; - EXPECT_FALSE(thing.ok()); - EXPECT_EQ(thing.status().code(), tensorflow::error::UNKNOWN); -} - -TEST(StatusOrDeathTest, TestPointerDefaultCtorValue) { - StatusOr thing; - EXPECT_DEATH(thing.ValueOrDie(), ""); -} - -TEST(StatusOr, TestPointerStatusCtor) { - StatusOr 
thing(Status(tensorflow::error::CANCELLED, "")); - EXPECT_FALSE(thing.ok()); - EXPECT_EQ(thing.status(), Status(tensorflow::error::CANCELLED, "")); -} - -TEST(StatusOr, TestPointerValueCtor) { - const int kI = 4; - StatusOr thing(&kI); - EXPECT_TRUE(thing.ok()); - EXPECT_EQ(&kI, thing.ValueOrDie()); -} - -TEST(StatusOr, TestPointerCopyCtorStatusOk) { - const int kI = 0; - StatusOr original(&kI); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); - EXPECT_EQ(original.ValueOrDie(), copy.ValueOrDie()); -} - -TEST(StatusOr, TestPointerCopyCtorStatusNotOk) { - StatusOr original(Status(tensorflow::error::CANCELLED, "")); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); -} - -TEST(StatusOr, TestPointerCopyCtorStatusOKConverting) { - Derived derived; - StatusOr original(&derived); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); - EXPECT_EQ(static_cast(original.ValueOrDie()), - copy.ValueOrDie()); -} - -TEST(StatusOr, TestPointerCopyCtorStatusNotOkConverting) { - StatusOr original(Status(tensorflow::error::CANCELLED, "")); - StatusOr copy(original); - EXPECT_EQ(copy.status(), original.status()); -} - -TEST(StatusOr, TestPointerAssignmentStatusOk) { - const int kI = 0; - StatusOr source(&kI); - StatusOr target; - target = source; - EXPECT_EQ(target.status(), source.status()); - EXPECT_EQ(source.ValueOrDie(), target.ValueOrDie()); -} - -TEST(StatusOr, TestPointerAssignmentStatusNotOk) { - StatusOr source(Status(tensorflow::error::CANCELLED, "")); - StatusOr target; - target = source; - EXPECT_EQ(target.status(), source.status()); -} - -TEST(StatusOr, TestPointerStatus) { - const int kI = 0; - StatusOr good(&kI); - EXPECT_TRUE(good.ok()); - StatusOr bad(Status(tensorflow::error::CANCELLED, "")); - EXPECT_EQ(bad.status(), Status(tensorflow::error::CANCELLED, "")); -} - -TEST(StatusOr, TestPointerValue) { - const int kI = 0; - StatusOr thing(&kI); - EXPECT_EQ(&kI, thing.ValueOrDie()); -} - -TEST(StatusOr, TestPointerValueConst) { - const int kI = 0; - const StatusOr thing(&kI); - EXPECT_EQ(&kI, thing.ValueOrDie()); -} - -// NOTE(tucker): StatusOr does not support this kind -// of resize op. -// TEST(StatusOr, StatusOrVectorOfUniquePointerCanResize) { -// using EvilType = std::vector>; -// static_assert(std::is_copy_constructible::value, ""); -// std::vector> v(5); -// v.reserve(v.capacity() + 10); -// } - -TEST(StatusOrDeathTest, TestPointerValueNotOk) { - StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); - EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); -} - -TEST(StatusOrDeathTest, TestPointerValueNotOkConst) { - const StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); - EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); -} - -static StatusOr MakeStatus() { return 100; } -// A factory to help us benchmark the various factory styles. All of -// the factory methods are marked as non-inlineable so as to more -// accurately simulate calling a factory for which you do not have -// visibility of implementation. Similarly, the value_ variable is -// marked volatile to prevent the compiler from getting too clever -// about detecting that the same value is used in all loop iterations. -template -class BenchmarkFactory { - public: - // Construct a new factory. Allocate an object which will always - // be the result of the factory methods. - BenchmarkFactory() : value_(new T) {} - - // Destroy this factory, including the result value. 
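The commented-out resize test above (NOTE(tucker)) documents a real trap rather than a missing feature: std::vector<std::unique_ptr<int>> reports itself as copy constructible because the type trait only checks that vector's copy constructor is declared, not that it would compile. TraitsBase therefore leaves StatusOr's copy operations enabled for that payload, and the ill-formed copy is only instantiated when resize()/reserve() actually needs it. A standalone sketch of the trait being over-optimistic (standard library only):

    #include <memory>
    #include <type_traits>
    #include <vector>

    using Evil = std::vector<std::unique_ptr<int>>;
    // True, even though actually copying an Evil object will not compile:
    static_assert(std::is_copy_constructible<Evil>::value,
                  "the trait only sees a declared copy constructor");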
- ~BenchmarkFactory() { delete value_; } - - // A trivial factory that just returns the value. There is no status - // object that could be returned to encapsulate an error - T* TrivialFactory() TF_ATTRIBUTE_NOINLINE { return value_; } - - // A more sophisticated factory, which returns a status to indicate - // the result of the operation. The factory result is populated into - // the user provided pointer result. - Status ArgumentFactory(T** result) TF_ATTRIBUTE_NOINLINE { - *result = value_; - return Status::OK(); - } - - Status ArgumentFactoryFail(T** result) TF_ATTRIBUTE_NOINLINE { - *result = nullptr; - return Status(tensorflow::error::CANCELLED, ""); - } - - Status ArgumentFactoryFailShortMsg(T** result) TF_ATTRIBUTE_NOINLINE { - *result = nullptr; - return Status(::tensorflow::error::INTERNAL, ""); - } - - Status ArgumentFactoryFailLongMsg(T** result) TF_ATTRIBUTE_NOINLINE { - *result = nullptr; - return Status(::tensorflow::error::INTERNAL, - "a big string of message junk that will never be read"); - } - - // A factory that returns a StatusOr. If the factory operation - // is OK, then the StatusOr will hold a T*. Otherwise, it will - // hold a status explaining the error. - StatusOr StatusOrFactory() TF_ATTRIBUTE_NOINLINE { - return static_cast(value_); - } - - StatusOr StatusOrFactoryFail() TF_ATTRIBUTE_NOINLINE { - return Status(tensorflow::error::CANCELLED, ""); - } - - StatusOr StatusOrFactoryFailShortMsg() TF_ATTRIBUTE_NOINLINE { - return Status(::tensorflow::error::INTERNAL, ""); - } - - StatusOr StatusOrFactoryFailLongMsg() TF_ATTRIBUTE_NOINLINE { - return Status(::tensorflow::error::INTERNAL, - "a big string of message junk that will never be read"); - } - - private: - T* volatile value_; - TF_DISALLOW_COPY_AND_ASSIGN(BenchmarkFactory); -}; - -// A simple type we use with the factory. -class BenchmarkType { - public: - BenchmarkType() {} - virtual ~BenchmarkType() {} - virtual void DoWork() TF_ATTRIBUTE_NOINLINE {} - - private: - TF_DISALLOW_COPY_AND_ASSIGN(BenchmarkType); -}; - -// Calibrate the amount of time spent just calling DoWork, since each of our -// tests will do this, we can subtract this out of benchmark results. -void BM_CalibrateWorkLoop(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - BenchmarkType* result = factory.TrivialFactory(); - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - if (result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_CalibrateWorkLoop); - -// Measure the time taken to call into the factory, return the value, -// determine that it is OK, and invoke a trivial function. -void BM_TrivialFactory(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - BenchmarkType* result = factory.TrivialFactory(); - if (result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_TrivialFactory); - -// Measure the time taken to call into the factory, providing an -// out-param for the result, evaluating the status result and the -// result pointer, and invoking the trivial function. 
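BM_CalibrateWorkLoop above measures only the DoWork() call that every other benchmark loop body also performs, so the cost attributable to a factory style is roughly its measured time minus the calibration time. Interpreting hypothetical readings (illustrative numbers, not measurements):

    // ns per iteration, hypothetical:
    //   BM_CalibrateWorkLoop : 2
    //   BM_ArgumentFactory   : 6   ->  out-param overhead ~ 6 - 2 = 4 ns
    //   BM_StatusOrFactory   : 7   ->  StatusOr overhead  ~ 7 - 2 = 5 ns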
-void BM_ArgumentFactory(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - BenchmarkType* result = nullptr; - Status status = factory.ArgumentFactory(&result); - if (status.ok() && result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_ArgumentFactory); - -// Measure the time to use the StatusOr factory, evaluate the result, -// and invoke the trivial function. -void BM_StatusOrFactory(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - StatusOr result = factory.StatusOrFactory(); - if (result.ok()) { - result.ValueOrDie()->DoWork(); - } - } -} -BENCHMARK(BM_StatusOrFactory); - -// Measure the time taken to call into the factory, providing an -// out-param for the result, evaluating the status result and the -// result pointer, and invoking the trivial function. -void BM_ArgumentFactoryFail(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - BenchmarkType* result = nullptr; - Status status = factory.ArgumentFactoryFail(&result); - if (status.ok() && result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_ArgumentFactoryFail); - -// Measure the time to use the StatusOr factory, evaluate the result, -// and invoke the trivial function. -void BM_StatusOrFactoryFail(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - StatusOr result = factory.StatusOrFactoryFail(); - if (result.ok()) { - result.ValueOrDie()->DoWork(); - } - } -} -BENCHMARK(BM_StatusOrFactoryFail); - -// Measure the time taken to call into the factory, providing an -// out-param for the result, evaluating the status result and the -// result pointer, and invoking the trivial function. -void BM_ArgumentFactoryFailShortMsg(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - BenchmarkType* result = nullptr; - Status status = factory.ArgumentFactoryFailShortMsg(&result); - if (status.ok() && result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_ArgumentFactoryFailShortMsg); - -// Measure the time to use the StatusOr factory, evaluate the result, -// and invoke the trivial function. -void BM_StatusOrFactoryFailShortMsg(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - StatusOr result = factory.StatusOrFactoryFailShortMsg(); - if (result.ok()) { - result.ValueOrDie()->DoWork(); - } - } -} -BENCHMARK(BM_StatusOrFactoryFailShortMsg); - -// Measure the time taken to call into the factory, providing an -// out-param for the result, evaluating the status result and the -// result pointer, and invoking the trivial function. 
-void BM_ArgumentFactoryFailLongMsg(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - BenchmarkType* result = nullptr; - Status status = factory.ArgumentFactoryFailLongMsg(&result); - if (status.ok() && result != nullptr) { - result->DoWork(); - } - } -} -BENCHMARK(BM_ArgumentFactoryFailLongMsg); - -// Measure the time to use the StatusOr factory, evaluate the result, -// and invoke the trivial function. -void BM_StatusOrFactoryFailLongMsg(int iters) { - tensorflow::testing::StopTiming(); - BenchmarkFactory factory; - tensorflow::testing::StartTiming(); - for (int i = 0; i != iters; ++i) { - StatusOr result = factory.StatusOrFactoryFailLongMsg(); - if (result.ok()) { - result.ValueOrDie()->DoWork(); - } - } -} -BENCHMARK(BM_StatusOrFactoryFailLongMsg); - -} // namespace -} // namespace xla diff --git a/tensorflow/stream_executor/BUILD b/tensorflow/stream_executor/BUILD index c68cda0100..21295abed1 100644 --- a/tensorflow/stream_executor/BUILD +++ b/tensorflow/stream_executor/BUILD @@ -33,7 +33,6 @@ cc_library( }), visibility = ["//visibility:public"], deps = [ - "//tensorflow/compiler/xla:statusor", "//tensorflow/core:lib", "//tensorflow/core:ptr_util", "@local_config_cuda//cuda:cuda_headers", @@ -48,7 +47,6 @@ cc_library( deps = [ "//tensorflow/core:lib", "//tensorflow/core:ptr_util", - "//tensorflow/compiler/xla:statusor", "@local_config_cuda//cuda:cuda_headers", ] + if_static([":stream_executor_impl"]), ) diff --git a/tensorflow/stream_executor/lib/statusor.cc b/tensorflow/stream_executor/lib/statusor.cc new file mode 100644 index 0000000000..e0e851f96e --- /dev/null +++ b/tensorflow/stream_executor/lib/statusor.cc @@ -0,0 +1,40 @@ +/* Copyright 2017 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +==============================================================================*/ + +#include "tensorflow/stream_executor/lib/statusor.h" + +#include "tensorflow/core/lib/core/errors.h" +#include "tensorflow/core/platform/logging.h" + +namespace stream_executor { +namespace port { +namespace internal_statusor { + +void Helper::HandleInvalidStatusCtorArg(Status* status) { + const char* kMessage = + "An OK status is not a valid constructor argument to StatusOr"; + LOG(ERROR) << kMessage; + // Fall back to tensorflow::error::INTERNAL. + *status = ::tensorflow::errors::Internal(kMessage); +} + +void Helper::Crash(const Status& status) { + LOG(FATAL) << "Attempting to fetch value instead of handling error " + << status; +} + +} // namespace internal_statusor +} // namespace port +} // namespace stream_executor diff --git a/tensorflow/stream_executor/lib/statusor.h b/tensorflow/stream_executor/lib/statusor.h index dab5909674..3c716acb46 100644 --- a/tensorflow/stream_executor/lib/statusor.h +++ b/tensorflow/stream_executor/lib/statusor.h @@ -1,4 +1,4 @@ -/* Copyright 2015 The TensorFlow Authors. All Rights Reserved. +/* Copyright 2017 The TensorFlow Authors. 
All Rights Reserved.
 
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -13,19 +13,297 @@ See the License for the specific language governing permissions and
 limitations under the License.
 ==============================================================================*/
 
-// IWYU pragma: private, include "third_party/tensorflow/stream_executor/stream_executor.h"
-
+// StatusOr<T> is the union of a Status object and a T object. StatusOr models
+// the concept of an object that is either a value, or an error Status
+// explaining why such a value is not present. To this end, StatusOr<T> does not
+// allow its Status value to be Status::OK.
+//
+// The primary use-case for StatusOr<T> is as the return value of a
+// function which may fail.
+//
+// Example client usage for a StatusOr<T>, where T is not a pointer:
+//
+//  StatusOr<float> result = DoBigCalculationThatCouldFail();
+//  if (result.ok()) {
+//    float answer = result.ValueOrDie();
+//    printf("Big calculation yielded: %f", answer);
+//  } else {
+//    LOG(ERROR) << result.status();
+//  }
+//
+// Example client usage for a StatusOr<T*>:
+//
+//  StatusOr<Foo*> result = FooFactory::MakeNewFoo(arg);
+//  if (result.ok()) {
+//    std::unique_ptr<Foo> foo(result.ValueOrDie());
+//    foo->DoSomethingCool();
+//  } else {
+//    LOG(ERROR) << result.status();
+//  }
+//
+// Example client usage for a StatusOr<std::unique_ptr<T>>:
+//
+//  StatusOr<std::unique_ptr<Foo>> result = FooFactory::MakeNewFoo(arg);
+//  if (result.ok()) {
+//    std::unique_ptr<Foo> foo = std::move(result.ValueOrDie());
+//    foo->DoSomethingCool();
+//  } else {
+//    LOG(ERROR) << result.status();
+//  }
+//
+// Example factory implementation returning StatusOr<T*>:
+//
+//  StatusOr<Foo*> FooFactory::MakeNewFoo(int arg) {
+//    if (arg <= 0) {
+//      return tensorflow::InvalidArgument("Arg must be positive");
+//    } else {
+//      return new Foo(arg);
+//    }
+//  }
+//
+// Note that the assignment operators require that destroying the currently
+// stored value cannot invalidate the argument; in other words, the argument
+// cannot be an alias for the current value, or anything owned by the current
+// value.
 #ifndef TENSORFLOW_STREAM_EXECUTOR_LIB_STATUSOR_H_
 #define TENSORFLOW_STREAM_EXECUTOR_LIB_STATUSOR_H_
 
-#include "tensorflow/compiler/xla/statusor.h"
+#include "tensorflow/core/platform/macros.h"
+#include "tensorflow/stream_executor/lib/status.h"
+#include "tensorflow/stream_executor/lib/statusor_internals.h"
 
 namespace stream_executor {
 namespace port {
 
-// Use XLA's StatusOr so we don't duplicate code.
+#if defined(__clang__)
+// Only clang supports warn_unused_result as a type annotation.
+template <typename T>
+class TF_MUST_USE_RESULT StatusOr;
+#endif
+
+template <typename T>
+class StatusOr : private internal_statusor::StatusOrData<T>,
+                 private internal_statusor::TraitsBase<
+                     std::is_copy_constructible<T>::value,
+                     std::is_move_constructible<T>::value> {
+  template <typename U>
+  friend class StatusOr;
+
+  typedef internal_statusor::StatusOrData<T> Base;
+
+ public:
+  typedef T element_type;
+
+  // Constructs a new StatusOr with Status::UNKNOWN status. This is marked
+  // 'explicit' to try to catch cases like 'return {};', where people think
+  // StatusOr<std::vector<int>> will be initialized with an empty vector,
+  // instead of a Status::UNKNOWN status.
+  explicit StatusOr();
+
+  // StatusOr<T> will be copy constructible/assignable if T is copy
+  // constructible.
+  StatusOr(const StatusOr&) = default;
+  StatusOr& operator=(const StatusOr&) = default;
+
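The explicit default constructor above turns a subtle bug into a compile error. A sketch of the case it guards against (MakeVec() is a hypothetical function):

    #include <vector>

    stream_executor::port::StatusOr<std::vector<int>> MakeVec() {
      // return {};               // would not compile: copy-list-initialization
      //                          // cannot use the explicit default constructor,
      //                          // which would have meant Status::UNKNOWN, not {}
      return std::vector<int>{};  // OK: uses the implicit value constructor
    }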
+  // StatusOr<T> will be move constructible/assignable if T is move
+  // constructible.
+  StatusOr(StatusOr&&) = default;
+  StatusOr& operator=(StatusOr&&) = default;
+
+  // Conversion copy/move constructor, T must be convertible from U.
+  template <typename U, typename std::enable_if<
+                            std::is_convertible<U, T>::value>::type* = nullptr>
+  StatusOr(const StatusOr<U>& other);
+  template <typename U, typename std::enable_if<
+                            std::is_convertible<U, T>::value>::type* = nullptr>
+  StatusOr(StatusOr<U>&& other);
+
+  // Conversion copy/move assignment operator, T must be convertible from U.
+  template <typename U, typename std::enable_if<
+                            std::is_convertible<U, T>::value>::type* = nullptr>
+  StatusOr& operator=(const StatusOr<U>& other);
+  template <typename U, typename std::enable_if<
+                            std::is_convertible<U, T>::value>::type* = nullptr>
+  StatusOr& operator=(StatusOr<U>&& other);
+
+  // Constructs a new StatusOr with the given value. After calling this
+  // constructor, calls to ValueOrDie() will succeed, and calls to status() will
+  // return OK.
+  //
+  // NOTE: Not explicit - we want to use StatusOr<T> as a return type
+  // so it is convenient and sensible to be able to do 'return T()'
+  // when the return type is StatusOr<T>.
+  //
+  // REQUIRES: T is copy constructible.
+  StatusOr(const T& value);
+
+  // Constructs a new StatusOr with the given non-ok status. After calling
+  // this constructor, calls to ValueOrDie() will CHECK-fail.
+  //
+  // NOTE: Not explicit - we want to use StatusOr<T> as a return
+  // value, so it is convenient and sensible to be able to do 'return
+  // Status()' when the return type is StatusOr<T>.
+  //
+  // REQUIRES: !status.ok(). This requirement is DCHECKed.
+  // In optimized builds, passing Status::OK() here will have the effect
+  // of passing tensorflow::error::INTERNAL as a fallback.
+  StatusOr(const Status& status);
+  StatusOr& operator=(const Status& status);
+
+  // TODO(b/62186997): Add operator=(T) overloads.
+
+  // Similar to the `const T&` overload.
+  //
+  // REQUIRES: T is move constructible.
+  StatusOr(T&& value);
+
+  // RValue versions of the operations declared above.
+  StatusOr(Status&& status);
+  StatusOr& operator=(Status&& status);
+
+  // Returns this->status().ok()
+  bool ok() const { return this->status_.ok(); }
+
+  // Returns a reference to our status. If this contains a T, then
+  // returns Status::OK().
+  const Status& status() const &;
+  Status status() &&;
+
+  // Returns a reference to our current value, or CHECK-fails if !this->ok().
+  //
+  // Note: for value types that are cheap to copy, prefer simple code:
+  //
+  //   T value = statusor.ValueOrDie();
+  //
+  // Otherwise, if the value type is expensive to copy, but can be left
+  // in the StatusOr, simply assign to a reference:
+  //
+  //   T& value = statusor.ValueOrDie();  // or `const T&`
+  //
+  // Otherwise, if the value type supports an efficient move, it can be
+  // used as follows:
+  //
+  //   T value = std::move(statusor).ValueOrDie();
+  //
+  // The std::move on statusor instead of on the whole expression enables
+  // warnings about possible uses of the statusor object after the move.
+  // C++ style guide waiver for ref-qualified overloads granted in cl/143176389
+  // See go/ref-qualifiers for more details on such overloads.
+  const T& ValueOrDie() const &;
+  T& ValueOrDie() &;
+  const T&& ValueOrDie() const &&;
+  T&& ValueOrDie() &&;
+
+  T ConsumeValueOrDie() { return std::move(ValueOrDie()); }
+
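The REQUIRES clause on the Status constructors above is backed by Helper::HandleInvalidStatusCtorArg() in the new statusor.cc: an OK status is never stored. A short sketch of the failure mode, with the behavior as documented in this patch:

    using stream_executor::port::Status;
    using stream_executor::port::StatusOr;

    StatusOr<int> bad(Status::OK());  // violates REQUIRES: !status.ok()
    // Per the comments above: DCHECK-fails in debug builds; in optimized
    // builds it LOG(ERROR)s and bad.status() becomes a
    // tensorflow::error::INTERNAL error instead of OK.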
+  // Ignores any errors. This method does nothing except potentially suppress
+  // complaints from any tools that are checking that errors are not dropped on
+  // the floor.
+  void IgnoreError() const;
+};
+
+////////////////////////////////////////////////////////////////////////////////
+// Implementation details for StatusOr<T>
+
+template <typename T>
+StatusOr<T>::StatusOr() : Base(Status(tensorflow::error::UNKNOWN, "")) {}
+
+template <typename T>
+StatusOr<T>::StatusOr(const T& value) : Base(value) {}
+
+template <typename T>
+StatusOr<T>::StatusOr(const Status& status) : Base(status) {}
+
+template <typename T>
+StatusOr<T>& StatusOr<T>::operator=(const Status& status) {
+  this->Assign(status);
+  return *this;
+}
+
+template <typename T>
+StatusOr<T>::StatusOr(T&& value) : Base(std::move(value)) {}
+
+template <typename T>
+StatusOr<T>::StatusOr(Status&& status) : Base(std::move(status)) {}
+
+template <typename T>
+StatusOr<T>& StatusOr<T>::operator=(Status&& status) {
+  this->Assign(std::move(status));
+  return *this;
+}
+
+template <typename T>
+template <typename U,
+          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
+inline StatusOr<T>::StatusOr(const StatusOr<U>& other)
+    : Base(static_cast<const typename StatusOr<U>::Base&>(other)) {}
+
+template <typename T>
+template <typename U,
+          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
+inline StatusOr<T>& StatusOr<T>::operator=(const StatusOr<U>& other) {
+  if (other.ok())
+    this->Assign(other.ValueOrDie());
+  else
+    this->Assign(other.status());
+  return *this;
+}
+
+template <typename T>
+template <typename U,
+          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
+inline StatusOr<T>::StatusOr(StatusOr<U>&& other)
+    : Base(static_cast<typename StatusOr<U>::Base&&>(other)) {}
+
+template <typename T>
+template <typename U,
+          typename std::enable_if<std::is_convertible<U, T>::value>::type*>
+inline StatusOr<T>& StatusOr<T>::operator=(StatusOr<U>&& other) {
+  if (other.ok()) {
+    this->Assign(std::move(other).ValueOrDie());
+  } else {
+    this->Assign(std::move(other).status());
+  }
+  return *this;
+}
+
+template <typename T>
+const Status& StatusOr<T>::status() const & {
+  return this->status_;
+}
+template <typename T>
+Status StatusOr<T>::status() && {
+  return ok() ? Status::OK() : std::move(this->status_);
+}
+
+template <typename T>
+const T& StatusOr<T>::ValueOrDie() const & {
+  this->EnsureOk();
+  return this->data_;
+}
+
+template <typename T>
+T& StatusOr<T>::ValueOrDie() & {
+  this->EnsureOk();
+  return this->data_;
+}
+
+template <typename T>
+const T&& StatusOr<T>::ValueOrDie() const && {
+  this->EnsureOk();
+  return std::move(this->data_);
+}
+
+template <typename T>
+T&& StatusOr<T>::ValueOrDie() && {
+  this->EnsureOk();
+  return std::move(this->data_);
+}
+
 template <typename T>
-using StatusOr = ::xla::StatusOr<T>;
+void StatusOr<T>::IgnoreError() const {
+  // no-op
+}
 
 }  // namespace port
 }  // namespace stream_executor
diff --git a/tensorflow/stream_executor/lib/statusor_internals.h b/tensorflow/stream_executor/lib/statusor_internals.h
new file mode 100644
index 0000000000..09f88f5825
--- /dev/null
+++ b/tensorflow/stream_executor/lib/statusor_internals.h
@@ -0,0 +1,248 @@
+/* Copyright 2017 The TensorFlow Authors. All Rights Reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+==============================================================================*/
+
+#ifndef TENSORFLOW_STREAM_EXECUTOR_LIB_STATUSOR_INTERNALS_H_
+#define TENSORFLOW_STREAM_EXECUTOR_LIB_STATUSOR_INTERNALS_H_
+
+
+#include "tensorflow/core/platform/macros.h"
+#include "tensorflow/stream_executor/lib/status.h"
+
+namespace stream_executor {
+namespace port {
+namespace internal_statusor {
+
+class Helper {
+ public:
+  // Move type-agnostic error handling to the .cc.
+ static void HandleInvalidStatusCtorArg(Status*); + TF_ATTRIBUTE_NORETURN static void Crash(const Status& status); +}; + +// Construct an instance of T in `p` through placement new, passing Args... to +// the constructor. +// This abstraction is here mostly for the gcc performance fix. +template +void PlacementNew(void* p, Args&&... args) { +#if defined(__GNUC__) && !defined(__clang__) + // Teach gcc that 'p' cannot be null, fixing code size issues. + if (p == nullptr) __builtin_unreachable(); +#endif + new (p) T(std::forward(args)...); +} + +// Helper base class to hold the data and all operations. +// We move all this to a base class to allow mixing with the appropriate +// TraitsBase specialization. +template +class StatusOrData { + template + friend class StatusOrData; + + public: + StatusOrData() = delete; + + StatusOrData(const StatusOrData& other) { + if (other.ok()) { + MakeValue(other.data_); + MakeStatus(); + } else { + MakeStatus(other.status_); + } + } + + StatusOrData(StatusOrData&& other) noexcept { + if (other.ok()) { + MakeValue(std::move(other.data_)); + MakeStatus(); + } else { + MakeStatus(std::move(other.status_)); + } + } + + template + StatusOrData(const StatusOrData& other) { + if (other.ok()) { + MakeValue(other.data_); + MakeStatus(); + } else { + MakeStatus(other.status_); + } + } + + template + StatusOrData(StatusOrData&& other) { + if (other.ok()) { + MakeValue(std::move(other.data_)); + MakeStatus(); + } else { + MakeStatus(std::move(other.status_)); + } + } + + explicit StatusOrData(const T& value) : data_(value) { MakeStatus(); } + explicit StatusOrData(T&& value) : data_(std::move(value)) { MakeStatus(); } + + explicit StatusOrData(const Status& status) : status_(status) { + EnsureNotOk(); + } + explicit StatusOrData(Status&& status) : status_(std::move(status)) { + EnsureNotOk(); + } + + StatusOrData& operator=(const StatusOrData& other) { + if (this == &other) return *this; + if (other.ok()) + Assign(other.data_); + else + Assign(other.status_); + return *this; + } + + StatusOrData& operator=(StatusOrData&& other) { + if (this == &other) return *this; + if (other.ok()) + Assign(std::move(other.data_)); + else + Assign(std::move(other.status_)); + return *this; + } + + ~StatusOrData() { + if (ok()) { + status_.~Status(); + data_.~T(); + } else { + status_.~Status(); + } + } + + void Assign(const T& value) { + if (ok()) { + data_.~T(); + MakeValue(value); + } else { + MakeValue(value); + status_ = Status::OK(); + } + } + + void Assign(T&& value) { + if (ok()) { + data_.~T(); + MakeValue(std::move(value)); + } else { + MakeValue(std::move(value)); + status_ = Status::OK(); + } + } + + void Assign(const Status& status) { + Clear(); + status_ = status; + EnsureNotOk(); + } + + void Assign(Status&& status) { + Clear(); + status_ = std::move(status); + EnsureNotOk(); + } + + bool ok() const { return status_.ok(); } + + protected: + // status_ will always be active after the constructor. + // We make it a union to be able to initialize exactly how we need without + // waste. + // Eg. in the copy constructor we use the default constructor of Status in + // the ok() path to avoid an extra Ref call. + union { + Status status_; + }; + + // data_ is active iff status_.ok()==true + struct Dummy {}; + union { + // When T is const, we need some non-const object we can cast to void* for + // the placement new. dummy_ is that object. 
+ Dummy dummy_; + T data_; + }; + + void Clear() { + if (ok()) data_.~T(); + } + + void EnsureOk() const { + if (!ok()) Helper::Crash(status_); + } + + void EnsureNotOk() { + if (ok()) Helper::HandleInvalidStatusCtorArg(&status_); + } + + // Construct the value (ie. data_) through placement new with the passed + // argument. + template + void MakeValue(Arg&& arg) { + internal_statusor::PlacementNew(&dummy_, std::forward(arg)); + } + + // Construct the status (ie. status_) through placement new with the passed + // argument. + template + void MakeStatus(Args&&... args) { + internal_statusor::PlacementNew(&status_, + std::forward(args)...); + } +}; + +// Helper base class to allow implicitly deleted constructors and assignment +// operations in StatusOr. +// TraitsBase will explicitly delete what it can't support and StatusOr will +// inherit that behavior implicitly. +template +struct TraitsBase { + TraitsBase() = default; + TraitsBase(const TraitsBase&) = default; + TraitsBase(TraitsBase&&) = default; + TraitsBase& operator=(const TraitsBase&) = default; + TraitsBase& operator=(TraitsBase&&) = default; +}; + +template <> +struct TraitsBase { + TraitsBase() = default; + TraitsBase(const TraitsBase&) = delete; + TraitsBase(TraitsBase&&) = default; + TraitsBase& operator=(const TraitsBase&) = delete; + TraitsBase& operator=(TraitsBase&&) = default; +}; + +template <> +struct TraitsBase { + TraitsBase() = default; + TraitsBase(const TraitsBase&) = delete; + TraitsBase(TraitsBase&&) = delete; + TraitsBase& operator=(const TraitsBase&) = delete; + TraitsBase& operator=(TraitsBase&&) = delete; +}; + +} // namespace internal_statusor +} // namespace port +} // namespace stream_executor + +#endif // TENSORFLOW_STREAM_EXECUTOR_LIB_STATUSOR_INTERNALS_H_ diff --git a/tensorflow/stream_executor/lib/statusor_test.cc b/tensorflow/stream_executor/lib/statusor_test.cc new file mode 100644 index 0000000000..56584e1892 --- /dev/null +++ b/tensorflow/stream_executor/lib/statusor_test.cc @@ -0,0 +1,676 @@ +/* Copyright 2017 The TensorFlow Authors. All Rights Reserved. + +Licensed under the Apache License, Version 2.0 (the "License"); +you may not use this file except in compliance with the License. +You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
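PlacementNew<T>() in the internals just added is a thin wrapper around placement new (plus a gcc-only code-size hint); StatusOrData uses it to construct either member of its unions in place. A sketch of the underlying pattern in isolation, standard library only:

    #include <new>
    #include <string>

    void PlacementNewSketch() {
      alignas(std::string) unsigned char buf[sizeof(std::string)];
      using String = std::string;
      String* s = new (buf) String("hello");  // construct into raw storage, as MakeValue() does
      s->~String();                           // destroy manually, as ~StatusOrData() does
    }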
+==============================================================================*/ + +// Unit tests for StatusOr + +#include "tensorflow/stream_executor/lib/statusor.h" + +#include +#include + +#include "tensorflow/core/platform/test.h" +#include "tensorflow/core/lib/core/errors.h" +#include "tensorflow/core/platform/macros.h" +#include "tensorflow/core/platform/test_benchmark.h" + +namespace stream_executor { +namespace port { +namespace { + +class Base1 { + public: + virtual ~Base1() {} + int pad_; +}; + +class Base2 { + public: + virtual ~Base2() {} + int yetotherpad_; +}; + +class Derived : public Base1, public Base2 { + public: + ~Derived() override {} + int evenmorepad_; +}; + +class CopyNoAssign { + public: + explicit CopyNoAssign(int value) : foo_(value) {} + CopyNoAssign(const CopyNoAssign& other) : foo_(other.foo_) {} + int foo_; + + private: + const CopyNoAssign& operator=(const CopyNoAssign&); +}; + +class NoDefaultConstructor { + public: + explicit NoDefaultConstructor(int foo); +}; + +static_assert(!std::is_default_constructible(), + "Should not be default-constructible."); + +StatusOr> ReturnUniquePtr() { + // Uses implicit constructor from T&& + return std::unique_ptr(new int(0)); +} + +TEST(StatusOr, ElementType) { + static_assert(std::is_same::element_type, int>(), ""); + static_assert(std::is_same::element_type, char>(), ""); +} + +TEST(StatusOr, NullPointerStatusOr) { + // As a very special case, null-plain-pointer StatusOr used to be an + // error. Test that it no longer is. + StatusOr null_status(nullptr); + EXPECT_TRUE(null_status.ok()); + EXPECT_EQ(null_status.ValueOrDie(), nullptr); +} + +TEST(StatusOr, TestNoDefaultConstructorInitialization) { + // Explicitly initialize it with an error code. + StatusOr statusor(tensorflow::errors::Cancelled("")); + EXPECT_FALSE(statusor.ok()); + EXPECT_EQ(statusor.status().code(), tensorflow::error::CANCELLED); + + // Default construction of StatusOr initializes it with an UNKNOWN error code. + StatusOr statusor2; + EXPECT_FALSE(statusor2.ok()); + EXPECT_EQ(statusor2.status().code(), tensorflow::error::UNKNOWN); +} + +TEST(StatusOr, TestMoveOnlyInitialization) { + StatusOr> thing(ReturnUniquePtr()); + ASSERT_TRUE(thing.ok()); + EXPECT_EQ(0, *thing.ValueOrDie()); + int* previous = thing.ValueOrDie().get(); + + thing = ReturnUniquePtr(); + EXPECT_TRUE(thing.ok()); + EXPECT_EQ(0, *thing.ValueOrDie()); + EXPECT_NE(previous, thing.ValueOrDie().get()); +} + +TEST(StatusOr, TestMoveOnlyStatusCtr) { + StatusOr> thing(tensorflow::errors::Cancelled("")); + ASSERT_FALSE(thing.ok()); +} + +TEST(StatusOr, TestMoveOnlyValueExtraction) { + StatusOr> thing(ReturnUniquePtr()); + ASSERT_TRUE(thing.ok()); + std::unique_ptr ptr = thing.ConsumeValueOrDie(); + EXPECT_EQ(0, *ptr); + + thing = std::move(ptr); + ptr = std::move(thing.ValueOrDie()); + EXPECT_EQ(0, *ptr); +} + +TEST(StatusOr, TestMoveOnlyConversion) { + StatusOr> const_thing(ReturnUniquePtr()); + EXPECT_TRUE(const_thing.ok()); + EXPECT_EQ(0, *const_thing.ValueOrDie()); + + // Test rvalue converting assignment + const int* const_previous = const_thing.ValueOrDie().get(); + const_thing = ReturnUniquePtr(); + EXPECT_TRUE(const_thing.ok()); + EXPECT_EQ(0, *const_thing.ValueOrDie()); + EXPECT_NE(const_previous, const_thing.ValueOrDie().get()); +} + +TEST(StatusOr, TestMoveOnlyVector) { + // Sanity check that StatusOr works in vector. 
+ std::vector>> vec; + vec.push_back(ReturnUniquePtr()); + vec.resize(2); + auto another_vec = std::move(vec); + EXPECT_EQ(0, *another_vec[0].ValueOrDie()); + EXPECT_EQ(tensorflow::error::UNKNOWN, another_vec[1].status().code()); +} + +TEST(StatusOr, TestMoveWithValuesAndErrors) { + StatusOr status_or(string(1000, '0')); + StatusOr value1(string(1000, '1')); + StatusOr value2(string(1000, '2')); + StatusOr error1(Status(tensorflow::error::UNKNOWN, "error1")); + StatusOr error2(Status(tensorflow::error::UNKNOWN, "error2")); + + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '0'), status_or.ValueOrDie()); + + // Overwrite the value in status_or with another value. + status_or = std::move(value1); + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '1'), status_or.ValueOrDie()); + + // Overwrite the value in status_or with an error. + status_or = std::move(error1); + ASSERT_FALSE(status_or.ok()); + EXPECT_EQ("error1", status_or.status().error_message()); + + // Overwrite the error in status_or with another error. + status_or = std::move(error2); + ASSERT_FALSE(status_or.ok()); + EXPECT_EQ("error2", status_or.status().error_message()); + + // Overwrite the error with a value. + status_or = std::move(value2); + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '2'), status_or.ValueOrDie()); +} + +TEST(StatusOr, TestCopyWithValuesAndErrors) { + StatusOr status_or(string(1000, '0')); + StatusOr value1(string(1000, '1')); + StatusOr value2(string(1000, '2')); + StatusOr error1(Status(tensorflow::error::UNKNOWN, "error1")); + StatusOr error2(Status(tensorflow::error::UNKNOWN, "error2")); + + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '0'), status_or.ValueOrDie()); + + // Overwrite the value in status_or with another value. + status_or = value1; + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '1'), status_or.ValueOrDie()); + + // Overwrite the value in status_or with an error. + status_or = error1; + ASSERT_FALSE(status_or.ok()); + EXPECT_EQ("error1", status_or.status().error_message()); + + // Overwrite the error in status_or with another error. + status_or = error2; + ASSERT_FALSE(status_or.ok()); + EXPECT_EQ("error2", status_or.status().error_message()); + + // Overwrite the error with a value. + status_or = value2; + ASSERT_TRUE(status_or.ok()); + EXPECT_EQ(string(1000, '2'), status_or.ValueOrDie()); + + // Verify original values unchanged. 
+ EXPECT_EQ(string(1000, '1'), value1.ValueOrDie()); + EXPECT_EQ("error1", error1.status().error_message()); + EXPECT_EQ("error2", error2.status().error_message()); + EXPECT_EQ(string(1000, '2'), value2.ValueOrDie()); +} + +TEST(StatusOr, TestDefaultCtor) { + StatusOr thing; + EXPECT_FALSE(thing.ok()); + EXPECT_EQ(thing.status().code(), tensorflow::error::UNKNOWN); +} + +TEST(StatusOrDeathTest, TestDefaultCtorValue) { + StatusOr thing; + EXPECT_DEATH(thing.ValueOrDie(), ""); + + const StatusOr thing2; + EXPECT_DEATH(thing.ValueOrDie(), ""); +} + +TEST(StatusOr, TestStatusCtor) { + StatusOr thing(Status(tensorflow::error::CANCELLED, "")); + EXPECT_FALSE(thing.ok()); + EXPECT_EQ(thing.status().code(), tensorflow::error::CANCELLED); +} + +TEST(StatusOr, TestValueCtor) { + const int kI = 4; + const StatusOr thing(kI); + EXPECT_TRUE(thing.ok()); + EXPECT_EQ(kI, thing.ValueOrDie()); +} + +TEST(StatusOr, TestCopyCtorStatusOk) { + const int kI = 4; + const StatusOr original(kI); + const StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); + EXPECT_EQ(original.ValueOrDie(), copy.ValueOrDie()); +} + +TEST(StatusOr, TestCopyCtorStatusNotOk) { + StatusOr original(Status(tensorflow::error::CANCELLED, "")); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); +} + +TEST(StatusOr, TestCopyCtorNonAssignable) { + const int kI = 4; + CopyNoAssign value(kI); + StatusOr original(value); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); + EXPECT_EQ(original.ValueOrDie().foo_, copy.ValueOrDie().foo_); +} + +TEST(StatusOr, TestCopyCtorStatusOKConverting) { + const int kI = 4; + StatusOr original(kI); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); + EXPECT_DOUBLE_EQ(original.ValueOrDie(), copy.ValueOrDie()); +} + +TEST(StatusOr, TestCopyCtorStatusNotOkConverting) { + StatusOr original(Status(tensorflow::error::CANCELLED, "")); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); +} + +TEST(StatusOr, TestAssignmentStatusOk) { + const int kI = 4; + StatusOr source(kI); + StatusOr target; + target = source; + EXPECT_EQ(target.status(), source.status()); + EXPECT_EQ(source.ValueOrDie(), target.ValueOrDie()); +} + +TEST(StatusOr, TestAssignmentStatusNotOk) { + StatusOr source(Status(tensorflow::error::CANCELLED, "")); + StatusOr target; + target = source; + EXPECT_EQ(target.status(), source.status()); +} + +TEST(StatusOr, TestStatus) { + StatusOr good(4); + EXPECT_TRUE(good.ok()); + StatusOr bad(Status(tensorflow::error::CANCELLED, "")); + EXPECT_FALSE(bad.ok()); + EXPECT_EQ(bad.status(), Status(tensorflow::error::CANCELLED, "")); +} + +TEST(StatusOr, TestValue) { + const int kI = 4; + StatusOr thing(kI); + EXPECT_EQ(kI, thing.ValueOrDie()); +} + +TEST(StatusOr, TestValueConst) { + const int kI = 4; + const StatusOr thing(kI); + EXPECT_EQ(kI, thing.ValueOrDie()); +} + +TEST(StatusOrDeathTest, TestValueNotOk) { + StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); + EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); +} + +TEST(StatusOrDeathTest, TestValueNotOkConst) { + const StatusOr thing(Status(tensorflow::error::UNKNOWN, "")); + EXPECT_DEATH(thing.ValueOrDie(), ""); +} + +TEST(StatusOr, TestPointerDefaultCtor) { + StatusOr thing; + EXPECT_FALSE(thing.ok()); + EXPECT_EQ(thing.status().code(), tensorflow::error::UNKNOWN); +} + +TEST(StatusOrDeathTest, TestPointerDefaultCtorValue) { + StatusOr thing; + EXPECT_DEATH(thing.ValueOrDie(), ""); +} + +TEST(StatusOr, TestPointerStatusCtor) { + StatusOr 
thing(Status(tensorflow::error::CANCELLED, "")); + EXPECT_FALSE(thing.ok()); + EXPECT_EQ(thing.status(), Status(tensorflow::error::CANCELLED, "")); +} + +TEST(StatusOr, TestPointerValueCtor) { + const int kI = 4; + StatusOr thing(&kI); + EXPECT_TRUE(thing.ok()); + EXPECT_EQ(&kI, thing.ValueOrDie()); +} + +TEST(StatusOr, TestPointerCopyCtorStatusOk) { + const int kI = 0; + StatusOr original(&kI); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); + EXPECT_EQ(original.ValueOrDie(), copy.ValueOrDie()); +} + +TEST(StatusOr, TestPointerCopyCtorStatusNotOk) { + StatusOr original(Status(tensorflow::error::CANCELLED, "")); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); +} + +TEST(StatusOr, TestPointerCopyCtorStatusOKConverting) { + Derived derived; + StatusOr original(&derived); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); + EXPECT_EQ(static_cast(original.ValueOrDie()), + copy.ValueOrDie()); +} + +TEST(StatusOr, TestPointerCopyCtorStatusNotOkConverting) { + StatusOr original(Status(tensorflow::error::CANCELLED, "")); + StatusOr copy(original); + EXPECT_EQ(copy.status(), original.status()); +} + +TEST(StatusOr, TestPointerAssignmentStatusOk) { + const int kI = 0; + StatusOr source(&kI); + StatusOr target; + target = source; + EXPECT_EQ(target.status(), source.status()); + EXPECT_EQ(source.ValueOrDie(), target.ValueOrDie()); +} + +TEST(StatusOr, TestPointerAssignmentStatusNotOk) { + StatusOr source(Status(tensorflow::error::CANCELLED, "")); + StatusOr target; + target = source; + EXPECT_EQ(target.status(), source.status()); +} + +TEST(StatusOr, TestPointerStatus) { + const int kI = 0; + StatusOr good(&kI); + EXPECT_TRUE(good.ok()); + StatusOr bad(Status(tensorflow::error::CANCELLED, "")); + EXPECT_EQ(bad.status(), Status(tensorflow::error::CANCELLED, "")); +} + +TEST(StatusOr, TestPointerValue) { + const int kI = 0; + StatusOr thing(&kI); + EXPECT_EQ(&kI, thing.ValueOrDie()); +} + +TEST(StatusOr, TestPointerValueConst) { + const int kI = 0; + const StatusOr thing(&kI); + EXPECT_EQ(&kI, thing.ValueOrDie()); +} + +// NOTE(tucker): StatusOr does not support this kind +// of resize op. +// TEST(StatusOr, StatusOrVectorOfUniquePointerCanResize) { +// using EvilType = std::vector>; +// static_assert(std::is_copy_constructible::value, ""); +// std::vector> v(5); +// v.reserve(v.capacity() + 10); +// } + +TEST(StatusOrDeathTest, TestPointerValueNotOk) { + StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); + EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); +} + +TEST(StatusOrDeathTest, TestPointerValueNotOkConst) { + const StatusOr thing(Status(tensorflow::error::CANCELLED, "cancelled")); + EXPECT_DEATH(thing.ValueOrDie(), "cancelled"); +} + +static StatusOr MakeStatus() { return 100; } +// A factory to help us benchmark the various factory styles. All of +// the factory methods are marked as non-inlineable so as to more +// accurately simulate calling a factory for which you do not have +// visibility of implementation. Similarly, the value_ variable is +// marked volatile to prevent the compiler from getting too clever +// about detecting that the same value is used in all loop iterations. +template +class BenchmarkFactory { + public: + // Construct a new factory. Allocate an object which will always + // be the result of the factory methods. + BenchmarkFactory() : value_(new T) {} + + // Destroy this factory, including the result value. 
+  ~BenchmarkFactory() { delete value_; }
+
+  // A trivial factory that just returns the value. There is no status
+  // object that could be returned to encapsulate an error
+  T* TrivialFactory() TF_ATTRIBUTE_NOINLINE { return value_; }
+
+  // A more sophisticated factory, which returns a status to indicate
+  // the result of the operation. The factory result is populated into
+  // the user provided pointer result.
+  Status ArgumentFactory(T** result) TF_ATTRIBUTE_NOINLINE {
+    *result = value_;
+    return Status::OK();
+  }
+
+  Status ArgumentFactoryFail(T** result) TF_ATTRIBUTE_NOINLINE {
+    *result = nullptr;
+    return Status(tensorflow::error::CANCELLED, "");
+  }
+
+  Status ArgumentFactoryFailShortMsg(T** result) TF_ATTRIBUTE_NOINLINE {
+    *result = nullptr;
+    return Status(::tensorflow::error::INTERNAL, "");
+  }
+
+  Status ArgumentFactoryFailLongMsg(T** result) TF_ATTRIBUTE_NOINLINE {
+    *result = nullptr;
+    return Status(::tensorflow::error::INTERNAL,
+                  "a big string of message junk that will never be read");
+  }
+
+  // A factory that returns a StatusOr<T*>. If the factory operation
+  // is OK, then the StatusOr<T*> will hold a T*. Otherwise, it will
+  // hold a status explaining the error.
+  StatusOr<T*> StatusOrFactory() TF_ATTRIBUTE_NOINLINE {
+    return static_cast<T*>(value_);
+  }
+
+  StatusOr<T*> StatusOrFactoryFail() TF_ATTRIBUTE_NOINLINE {
+    return Status(tensorflow::error::CANCELLED, "");
+  }
+
+  StatusOr<T*> StatusOrFactoryFailShortMsg() TF_ATTRIBUTE_NOINLINE {
+    return Status(::tensorflow::error::INTERNAL, "");
+  }
+
+  StatusOr<T*> StatusOrFactoryFailLongMsg() TF_ATTRIBUTE_NOINLINE {
+    return Status(::tensorflow::error::INTERNAL,
+                  "a big string of message junk that will never be read");
+  }
+
+ private:
+  T* volatile value_;
+  TF_DISALLOW_COPY_AND_ASSIGN(BenchmarkFactory);
+};
+
+// A simple type we use with the factory.
+class BenchmarkType {
+ public:
+  BenchmarkType() {}
+  virtual ~BenchmarkType() {}
+  virtual void DoWork() TF_ATTRIBUTE_NOINLINE {}
+
+ private:
+  TF_DISALLOW_COPY_AND_ASSIGN(BenchmarkType);
+};
+
+// Calibrate the amount of time spent just calling DoWork, since each of our
+// tests will do this, we can subtract this out of benchmark results.
+void BM_CalibrateWorkLoop(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  BenchmarkType* result = factory.TrivialFactory();
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    if (result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_CalibrateWorkLoop);
+
+// Measure the time taken to call into the factory, return the value,
+// determine that it is OK, and invoke a trivial function.
+void BM_TrivialFactory(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    BenchmarkType* result = factory.TrivialFactory();
+    if (result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_TrivialFactory);
+
+// Measure the time taken to call into the factory, providing an
+// out-param for the result, evaluating the status result and the
+// result pointer, and invoking the trivial function.
+void BM_ArgumentFactory(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    BenchmarkType* result = nullptr;
+    Status status = factory.ArgumentFactory(&result);
+    if (status.ok() && result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_ArgumentFactory);
+
+// Measure the time to use the StatusOr factory, evaluate the result,
+// and invoke the trivial function.
+void BM_StatusOrFactory(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    StatusOr<BenchmarkType*> result = factory.StatusOrFactory();
+    if (result.ok()) {
+      result.ValueOrDie()->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_StatusOrFactory);
+
+// Measure the time taken to call into the factory, providing an
+// out-param for the result, evaluating the status result and the
+// result pointer, and invoking the trivial function.
+void BM_ArgumentFactoryFail(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    BenchmarkType* result = nullptr;
+    Status status = factory.ArgumentFactoryFail(&result);
+    if (status.ok() && result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_ArgumentFactoryFail);
+
+// Measure the time to use the StatusOr factory, evaluate the result,
+// and invoke the trivial function.
+void BM_StatusOrFactoryFail(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    StatusOr<BenchmarkType*> result = factory.StatusOrFactoryFail();
+    if (result.ok()) {
+      result.ValueOrDie()->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_StatusOrFactoryFail);
+
+// Measure the time taken to call into the factory, providing an
+// out-param for the result, evaluating the status result and the
+// result pointer, and invoking the trivial function.
+void BM_ArgumentFactoryFailShortMsg(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    BenchmarkType* result = nullptr;
+    Status status = factory.ArgumentFactoryFailShortMsg(&result);
+    if (status.ok() && result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_ArgumentFactoryFailShortMsg);
+
+// Measure the time to use the StatusOr factory, evaluate the result,
+// and invoke the trivial function.
+void BM_StatusOrFactoryFailShortMsg(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    StatusOr<BenchmarkType*> result = factory.StatusOrFactoryFailShortMsg();
+    if (result.ok()) {
+      result.ValueOrDie()->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_StatusOrFactoryFailShortMsg);
+
+// Measure the time taken to call into the factory, providing an
+// out-param for the result, evaluating the status result and the
+// result pointer, and invoking the trivial function.
+void BM_ArgumentFactoryFailLongMsg(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    BenchmarkType* result = nullptr;
+    Status status = factory.ArgumentFactoryFailLongMsg(&result);
+    if (status.ok() && result != nullptr) {
+      result->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_ArgumentFactoryFailLongMsg);
+
+// Measure the time to use the StatusOr factory, evaluate the result,
+// and invoke the trivial function.
+void BM_StatusOrFactoryFailLongMsg(int iters) {
+  tensorflow::testing::StopTiming();
+  BenchmarkFactory<BenchmarkType> factory;
+  tensorflow::testing::StartTiming();
+  for (int i = 0; i != iters; ++i) {
+    StatusOr<BenchmarkType*> result = factory.StatusOrFactoryFailLongMsg();
+    if (result.ok()) {
+      result.ValueOrDie()->DoWork();
+    }
+  }
+}
+BENCHMARK(BM_StatusOrFactoryFailLongMsg);
+
+}  // namespace
+}  // namespace port
+}  // namespace stream_executor
-- 
cgit v1.2.3


From b8c1732664f41d5af2587e2f093880a3a7d83f43 Mon Sep 17 00:00:00 2001
From: Michael Case
Date: Tue, 26 Jun 2018 15:07:23 -0700
Subject: Fix small typo in RELEASE.md

---
 RELEASE.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/RELEASE.md b/RELEASE.md
index 879b995a5a..52cd9ef72b 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -21,7 +21,7 @@
 * The
   [distributions.Bijector](https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/distributions/bijectors/Bijector)
   API supports broadcasting for Bijectors with new API changes.

-## Breaking Chances
+## Breaking Changes

 * If you're opening empty variable scopes; replace `variable_scope('', ...)` by
   `variable_scope(tf.get_variable_scope(), ...)`.
-- 
cgit v1.2.3


From 74ca837950536aaef358abf3e05b31b4d62248f7 Mon Sep 17 00:00:00 2001
From: Gunhan Gulsoy
Date: Tue, 26 Jun 2018 15:29:48 -0700
Subject: Update eigen version to a fixed version for ppc64.

---
 tensorflow/workspace.bzl | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tensorflow/workspace.bzl b/tensorflow/workspace.bzl
index 3c657c4a5b..79274d66ad 100644
--- a/tensorflow/workspace.bzl
+++ b/tensorflow/workspace.bzl
@@ -107,11 +107,11 @@ def tf_workspace(path_prefix="", tf_repo_name=""):
   tf_http_archive(
       name = "eigen_archive",
       urls = [
-          "https://mirror.bazel.build/bitbucket.org/eigen/eigen/get/e5e305a158a0.tar.gz",
-          "https://bitbucket.org/eigen/eigen/get/e5e305a158a0.tar.gz",
+          "https://mirror.bazel.build/bitbucket.org/eigen/eigen/get/fd6845384b86.tar.gz",
+          "https://bitbucket.org/eigen/eigen/get/fd6845384b86.tar.gz",
       ],
-      sha256 = "8bbe676d69e7f59070c83a949454b8b6344034e0ebbf686b337528e5dc04c7de",
-      strip_prefix = "eigen-eigen-e5e305a158a0",
+      sha256 = "d956415d784fa4e42b6a2a45c32556d6aec9d0a3d8ef48baee2522ab762556a9",
+      strip_prefix = "eigen-eigen-fd6845384b86",
       build_file = clean_dep("//third_party:eigen.BUILD"),
   )
-- 
cgit v1.2.3


From 388a267b1191adf2df4006bf205a19b8a24813db Mon Sep 17 00:00:00 2001
From: Billy Lamberta
Date: Tue, 26 Jun 2018 11:24:37 -0700
Subject: Remove section links that don't go anywhere.

---
 tensorflow/docs_src/get_started/_index.yaml | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/tensorflow/docs_src/get_started/_index.yaml b/tensorflow/docs_src/get_started/_index.yaml
index 277fc852fb..4060804892 100644
--- a/tensorflow/docs_src/get_started/_index.yaml
+++ b/tensorflow/docs_src/get_started/_index.yaml
@@ -66,9 +66,7 @@ landing_page:
          }

- -

Learn and use ML

-
+

Learn and use ML

The high-level Keras API provides building blocks to create and @@ -117,9 +115,7 @@ landing_page: - items: - custom_html: >

- -

Research and experimentation

-
+

Research and experimentation

Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with @@ -170,9 +166,7 @@ landing_page:

- custom_html: >
- -

ML at production scale

-
+

ML at production scale

Estimators can train large models on multiple machines in a -- cgit v1.2.3 From 9f4fbdb05e35b512dd4a3da5ae80558021a291e5 Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 26 Jun 2018 14:56:57 -0700 Subject: Fix leftnav for get_started --- tensorflow/docs_src/get_started/leftnav_files | 6 +++--- tensorflow/docs_src/get_started/next_steps.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/tensorflow/docs_src/get_started/leftnav_files b/tensorflow/docs_src/get_started/leftnav_files index 5c400a67f0..99d2b2c3e1 100644 --- a/tensorflow/docs_src/get_started/leftnav_files +++ b/tensorflow/docs_src/get_started/leftnav_files @@ -1,7 +1,7 @@ ### Learn and use ML -basic_classification.md -basic_text_classification.md -basic_regression.md +basic_classification.md: Basic classification +basic_text_classification.md: Text classification +basic_regression.md: Regression overfit_and_underfit.md save_and_restore_models.md next_steps.md diff --git a/tensorflow/docs_src/get_started/next_steps.md b/tensorflow/docs_src/get_started/next_steps.md index 6318a39c6c..01c9f7204a 100644 --- a/tensorflow/docs_src/get_started/next_steps.md +++ b/tensorflow/docs_src/get_started/next_steps.md @@ -1,4 +1,4 @@ -# Next Steps +# Next steps ## Learn more about TensorFlow -- cgit v1.2.3 From 9809978ea09845d5429925b64c20cda461c20a66 Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 26 Jun 2018 21:34:07 -0700 Subject: Fix checkpoints link in keras guide --- tensorflow/docs_src/guide/keras.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tensorflow/docs_src/guide/keras.md b/tensorflow/docs_src/guide/keras.md index 83172dab7f..c799e9b12c 100644 --- a/tensorflow/docs_src/guide/keras.md +++ b/tensorflow/docs_src/guide/keras.md @@ -35,7 +35,7 @@ from tensorflow import keras * The `tf.keras` version in the latest TensorFlow release might not be the same as the latest `keras` version from PyPI. Check `tf.keras.__version__`. * When [saving a model's weights](#weights_only), `tf.keras` defaults to the - [checkpoint format](../get_started/checkpoints.md). Pass `save_format='h5'` to + [checkpoint format](./checkpoints.md). Pass `save_format='h5'` to use HDF5. ## Build a simple model @@ -442,7 +442,7 @@ model.load_weights('my_model') ``` By default, this saves the model's weights in the -[TensorFlow checkpoint](../get_started/checkpoints.md) file format. Weights can +[TensorFlow checkpoint](./checkpoints.md) file format. Weights can also be saved to the Keras HDF5 format (the default for the multi-backend implementation of Keras): -- cgit v1.2.3 From 32d4e6fd74fbeb91c8b2fd06c5ab0d4247d1784d Mon Sep 17 00:00:00 2001 From: Michael Case Date: Wed, 27 Jun 2018 09:46:45 -0700 Subject: Update version strings for TF 1.9.0-rc2. --- tensorflow/core/public/version.h | 2 +- tensorflow/docs_src/install/install_c.md | 2 +- tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 22 +++++++++++----------- tensorflow/docs_src/install/install_linux.md | 18 +++++++++--------- tensorflow/docs_src/install/install_mac.md | 10 +++++----- tensorflow/docs_src/install/install_sources.md | 4 ++-- tensorflow/tools/pip_package/setup.py | 2 +- 8 files changed, 31 insertions(+), 31 deletions(-) diff --git a/tensorflow/core/public/version.h b/tensorflow/core/public/version.h index 9e5e747557..0e4a61ac1f 100644 --- a/tensorflow/core/public/version.h +++ b/tensorflow/core/public/version.h @@ -24,7 +24,7 @@ limitations under the License. 
// TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1", // "-beta", "-rc", "-rc.1") -#define TF_VERSION_SUFFIX "-rc1" +#define TF_VERSION_SUFFIX "-rc2" #define TF_STR_HELPER(x) #x #define TF_STR(x) TF_STR_HELPER(x) diff --git a/tensorflow/docs_src/install/install_c.md b/tensorflow/docs_src/install/install_c.md index 2f81ae0c40..9aebf2bfa4 100644 --- a/tensorflow/docs_src/install/install_c.md +++ b/tensorflow/docs_src/install/install_c.md @@ -38,7 +38,7 @@ enable TensorFlow for C: OS="linux" # Change to "darwin" for macOS TARGET_DIRECTORY="/usr/local" curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc1.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 5451e1b319..1907355341 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -38,7 +38,7 @@ steps to install this library and enable TensorFlow for Go: TF_TYPE="cpu" # Change to "gpu" for GPU support TARGET_DIRECTORY='/usr/local' curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc1.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc2.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index ad3544b595..b9c9912816 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -36,7 +36,7 @@ following to the project's `pom.xml` to use the TensorFlow Java APIs: org.tensorflow tensorflow - 1.9.0-rc1 + 1.9.0-rc2 ``` @@ -65,7 +65,7 @@ As an example, these steps will create a Maven project that uses TensorFlow: org.tensorflow tensorflow - 1.9.0-rc1 + 1.9.0-rc2 @@ -124,12 +124,12 @@ instead: org.tensorflow libtensorflow - 1.9.0-rc1 + 1.9.0-rc2 org.tensorflow libtensorflow_jni_gpu - 1.9.0-rc1 + 1.9.0-rc2 ``` @@ -148,7 +148,7 @@ refer to the simpler instructions above instead. Take the following steps to install TensorFlow for Java on Linux or macOS: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc1.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), which is the TensorFlow Java Archive (JAR). 2. Decide whether you will run TensorFlow for Java on CPU(s) only or with @@ -167,7 +167,7 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: OS=$(uname -s | tr '[:upper:]' '[:lower:]') mkdir -p ./jni curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc1.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | tar -xz -C ./jni ### Install on Windows @@ -175,10 +175,10 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: Take the following steps to install TensorFlow for Java on Windows: 1. 
Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc1.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), which is the TensorFlow Java Archive (JAR). 2. Download the following Java Native Interface (JNI) file appropriate for - [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc1.zip). + [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc2.zip). 3. Extract this .zip file. @@ -227,7 +227,7 @@ must be part of your `classpath`. For example, you can include the downloaded `.jar` in your `classpath` by using the `-cp` compilation flag as follows: -

javac -cp libtensorflow-1.9.0-rc1.jar HelloTF.java
+
javac -cp libtensorflow-1.9.0-rc2.jar HelloTF.java
### Running @@ -241,11 +241,11 @@ two files are available to the JVM: For example, the following command line executes the `HelloTF` program on Linux and macOS X: -
java -cp libtensorflow-1.9.0-rc1.jar:. -Djava.library.path=./jni HelloTF
+
java -cp libtensorflow-1.9.0-rc2.jar:. -Djava.library.path=./jni HelloTF
And the following command line executes the `HelloTF` program on Windows: -
java -cp libtensorflow-1.9.0-rc1.jar;. -Djava.library.path=jni HelloTF
+
java -cp libtensorflow-1.9.0-rc2.jar;. -Djava.library.path=jni HelloTF
If the program prints Hello from version, you've successfully installed TensorFlow for Java and are ready to use the API. If the program diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index 41619ca230..ae3d50ff39 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -438,7 +438,7 @@ Take the following steps to install TensorFlow in an Anaconda environment:
      (tensorflow)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp34-cp34m-linux_x86_64.whl
+ https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl ## Validate your installation @@ -678,14 +678,14 @@ This section documents the relevant values for Linux installations. CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp27-none-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp27-none-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -697,14 +697,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -716,14 +716,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
 
@@ -735,14 +735,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc1-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc1-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
 
diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index eeca389617..3de6da1342 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -119,7 +119,7 @@ Take the following steps to install TensorFlow with Virtualenv: TensorFlow in the active Virtualenv is as follows:
 $ pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl If you encounter installation problems, see [Common Installation Problems](#common-installation-problems). @@ -242,7 +242,7 @@ take the following steps: issue the following command:
 $ sudo pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl 
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl If the preceding command fails, see [installation problems](#common-installation-problems). @@ -350,7 +350,7 @@ Take the following steps to install TensorFlow in an Anaconda environment: TensorFlow for Python 2.7:
 (targetDirectory)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py2-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl @@ -518,7 +518,7 @@ The value you specify depends on your Python version.
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py2-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
 
@@ -526,5 +526,5 @@ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py2-none-a
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc1-py3-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
 
diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index 7afcd340aa..3520f97c9a 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -328,10 +328,10 @@ Invoke `pip install` to install that pip package. The filename of the `.whl` file depends on your platform. For example, the following command will install the pip package -for TensorFlow 1.9.0rc1 on Linux: +for TensorFlow 1.9.0rc2 on Linux:
-$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc1-py2-none-any.whl
+$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc2-py2-none-any.whl
 
## Validate your installation diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py index eb2e359ee5..ed7ce01b6b 100644 --- a/tensorflow/tools/pip_package/setup.py +++ b/tensorflow/tools/pip_package/setup.py @@ -45,7 +45,7 @@ DOCLINES = __doc__.split('\n') # This version string is semver compatible, but incompatible with pip. # For pip, we will remove all '-' characters from this string, and use the # result for pip. -_VERSION = '1.9.0-rc1' +_VERSION = '1.9.0-rc2' REQUIRED_PACKAGES = [ 'absl-py >= 0.1.6', -- cgit v1.2.3 From f93e1b07282216d77e9d7d704f6722a893e9ef73 Mon Sep 17 00:00:00 2001 From: Michael Case Date: Thu, 28 Jun 2018 11:24:37 -0700 Subject: Potential fix for how pip installs headers used for custom ops. These headers were recently moved from site-packages/external into site-packages/tensorflow/include/external. Need to update setup.py to reflect that. --- RELEASE.md | 1 + tensorflow/tools/pip_package/setup.py | 10 +++++----- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/RELEASE.md b/RELEASE.md index 52cd9ef72b..21207a7efa 100644 --- a/RELEASE.md +++ b/RELEASE.md @@ -24,6 +24,7 @@ ## Breaking Changes * If you're opening empty variable scopes; replace `variable_scope('', ...)` by `variable_scope(tf.get_variable_scope(), ...)`. + * Headers used for building custom ops have been moved from site-packages/external into site-packages/tensorflow/include/external. ## Bug Fixes and Other Changes diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py index ed7ce01b6b..8c077580aa 100644 --- a/tensorflow/tools/pip_package/setup.py +++ b/tensorflow/tools/pip_package/setup.py @@ -170,8 +170,9 @@ class InstallHeaders(Command): # symlink within the directory hierarchy. # NOTE(keveman): Figure out how to customize bdist_wheel package so # we can do the symlink. - if 'external/eigen_archive/' in install_dir: - extra_dir = install_dir.replace('external/eigen_archive', '') + if 'tensorflow/include/external/eigen_archive/' in install_dir: + extra_dir = install_dir.replace( + 'tensorflow/include/external/eigen_archive', '') if not os.path.exists(extra_dir): self.mkpath(extra_dir) self.copy_file(header, extra_dir) @@ -204,13 +205,12 @@ def find_files(pattern, root): yield os.path.join(dirpath, filename) -matches = ['../' + x for x in find_files('*', 'external') if '.py' not in x] - so_lib_paths = [ i for i in os.listdir('.') if os.path.isdir(i) and fnmatch.fnmatch(i, '_solib_*') ] +matches = [] for path in so_lib_paths: matches.extend( ['../' + x for x in find_files('*', path) if '.py' not in x] @@ -225,7 +225,7 @@ headers = (list(find_files('*.h', 'tensorflow/core')) + list(find_files('*.h', 'tensorflow/stream_executor')) + list(find_files('*.h', 'google/protobuf_archive/src')) + list(find_files('*', 'third_party/eigen3')) + - list(find_files('*', 'external/eigen_archive'))) + list(find_files('*', 'tensorflow/include/external/eigen_archive'))) setup( name=project_name, -- cgit v1.2.3 From f09aaf0dd33869253020b095d7c44840d1b430fe Mon Sep 17 00:00:00 2001 From: Michael Case Date: Fri, 29 Jun 2018 10:19:06 -0700 Subject: Exclude test sources from stream executor builds. 
(#20423) PiperOrigin-RevId: 202423156 --- tensorflow/contrib/cmake/tf_stream_executor.cmake | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/tensorflow/contrib/cmake/tf_stream_executor.cmake b/tensorflow/contrib/cmake/tf_stream_executor.cmake index 9a37b68119..2f70e59d54 100644 --- a/tensorflow/contrib/cmake/tf_stream_executor.cmake +++ b/tensorflow/contrib/cmake/tf_stream_executor.cmake @@ -76,11 +76,11 @@ if (tensorflow_ENABLE_GPU) list(APPEND tf_stream_executor_srcs ${tf_stream_executor_gpu_srcs}) endif() -#file(GLOB_RECURSE tf_stream_executor_test_srcs -# "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.cc" -# "${tensorflow_source_dir}/tensorflow/stream_executor/*_test.h" -#) -#list(REMOVE_ITEM tf_stream_executor_srcs ${tf_stream_executor_test_srcs}) +file(GLOB_RECURSE tf_stream_executor_test_srcs + "${tensorflow_source_dir}/tensorflow/stream_executor/*test.cc" + "${tensorflow_source_dir}/tensorflow/stream_executor/lib/*test.h" +) +list(REMOVE_ITEM tf_stream_executor_srcs ${tf_stream_executor_test_srcs}) if (NOT WIN32) set (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -lgomp") -- cgit v1.2.3 From 648ef712f2c4fc996551373765aff30a0e48bc4c Mon Sep 17 00:00:00 2001 From: bhack Date: Sat, 30 Jun 2018 14:54:43 +0200 Subject: Advise batch_normalization with model_to_estimator --- tensorflow/docs_src/guide/keras.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tensorflow/docs_src/guide/keras.md b/tensorflow/docs_src/guide/keras.md index c799e9b12c..d584ebe945 100644 --- a/tensorflow/docs_src/guide/keras.md +++ b/tensorflow/docs_src/guide/keras.md @@ -548,9 +548,11 @@ model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), estimator = keras.estimator.model_to_estimator(model) ``` -Note: Enable [eager execution](./eager.md) for debugging +Note: +* Enable [eager execution](./eager.md) for debugging [Estimator input functions](./premade_estimators.md#create_input_functions) and inspecting data. +* Don't use batch normalization or try to finetune batch normalization models with estimators created from `tf.keras.estimator.model_to_estimator`. More details at [#17950](https://github.com/tensorflow/tensorflow/issues/17950) ### Multiple GPUs -- cgit v1.2.3 From 1d7fcde539fcff854e261c375c8ec2fbff258c34 Mon Sep 17 00:00:00 2001 From: 张天启 Date: Sun, 1 Jul 2018 22:05:51 +0800 Subject: fix bug in maxout function The line "shape[axis] = -1" will make the shape wrong when dealing with batches with arbitrary sizes. 
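
For illustration, the shape arithmetic around the changed line reduces to the
following sketch (plain Python; the sizes are hypothetical, not taken from the
patch). With an arbitrary batch size the leading dimension is already unknown,
so the old assignment left a second unresolved dimension in the reshape
target, while the fix pins the maxout axis to num_units:

    # Minimal sketch of the maxout shape computation (hypothetical sizes).
    num_units = 4
    num_channels = 12
    shape = [None, num_channels]          # unknown batch size, 12 channels
    axis = -1

    buggy = list(shape)
    buggy[axis] = -1                      # old line: defer axis to inference
    buggy += [num_channels // num_units]  # -> [None, -1, 3], two unknowns

    fixed = list(shape)
    fixed[axis] = num_units               # new line: pin the axis size
    fixed += [num_channels // num_units]  # -> [None, 4, 3], batch only

    print(buggy)  # [None, -1, 3]
    print(fixed)  # [None, 4, 3]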
--- tensorflow/contrib/layers/python/layers/layers.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tensorflow/contrib/layers/python/layers/layers.py b/tensorflow/contrib/layers/python/layers/layers.py index b7194ae333..a55d42c151 100644 --- a/tensorflow/contrib/layers/python/layers/layers.py +++ b/tensorflow/contrib/layers/python/layers/layers.py @@ -3117,7 +3117,7 @@ def maxout(inputs, num_units, axis=-1, scope=None): raise ValueError('number of features({}) is not ' 'a multiple of num_units({})'.format( num_channels, num_units)) - shape[axis] = -1 + shape[axis] = num_units shape += [num_channels // num_units] # Dealing with batches with arbitrary sizes -- cgit v1.2.3 From 4664191b73959387c190f969d7f1fe3480a585f4 Mon Sep 17 00:00:00 2001 From: Yifei Feng <1192265+yifeif@users.noreply.github.com> Date: Mon, 2 Jul 2018 17:10:20 -0700 Subject: Match for path instead of name --- tensorflow/tools/pip_package/build_pip_package.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tensorflow/tools/pip_package/build_pip_package.sh b/tensorflow/tools/pip_package/build_pip_package.sh index 9e41514cfa..b0089d3360 100755 --- a/tensorflow/tools/pip_package/build_pip_package.sh +++ b/tensorflow/tools/pip_package/build_pip_package.sh @@ -27,7 +27,7 @@ function cp_external() { pushd . cd "$src_dir" - for f in `find . ! -type d ! -name '*.py' ! -name '*local_config_cuda*' ! -name '*local_config_tensorrt*' ! -name '*org_tensorflow*'`; do + for f in `find . ! -type d ! -name '*.py' ! -path '*local_config_cuda*' ! -path '*local_config_tensorrt*' ! -path '*org_tensorflow*'`; do mkdir -p "${dest_dir}/$(dirname ${f})" cp "${f}" "${dest_dir}/$(dirname ${f})/" done -- cgit v1.2.3 From 486b96a51d6b0b394edf77d182f7283a8ec03e0d Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 3 Jul 2018 16:17:58 -0700 Subject: Update eager notebooks in 1.9 to match master --- .../nmt_with_attention/nmt_with_attention.ipynb | 909 +++++++++++++++++++++ .../eager/python/examples/notebooks/1_basics.ipynb | 429 ---------- .../python/examples/notebooks/2_gradients.ipynb | 323 -------- .../python/examples/notebooks/3_datasets.ipynb | 209 ----- .../examples/notebooks/3_training_models.ipynb | 485 ----------- .../python/examples/notebooks/4_high_level.ipynb | 551 ------------- .../eager/python/examples/notebooks/README.md | 11 + .../notebooks/automatic_differentiation.ipynb | 364 +++++++++ .../python/examples/notebooks/custom_layers.ipynb | 399 +++++++++ .../examples/notebooks/custom_training.ipynb | 478 +++++++++++ .../python/examples/notebooks/eager_basics.ipynb | 491 +++++++++++ 11 files changed, 2652 insertions(+), 1997 deletions(-) create mode 100644 tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb delete mode 100644 tensorflow/contrib/eager/python/examples/notebooks/1_basics.ipynb delete mode 100644 tensorflow/contrib/eager/python/examples/notebooks/2_gradients.ipynb delete mode 100644 tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb delete mode 100644 tensorflow/contrib/eager/python/examples/notebooks/3_training_models.ipynb delete mode 100644 tensorflow/contrib/eager/python/examples/notebooks/4_high_level.ipynb create mode 100644 tensorflow/contrib/eager/python/examples/notebooks/README.md create mode 100644 tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb create mode 100644 tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb create mode 100644 
tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb create mode 100644 tensorflow/contrib/eager/python/examples/notebooks/eager_basics.ipynb diff --git a/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb b/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb new file mode 100644 index 0000000000..34ce5e0cc3 --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb @@ -0,0 +1,909 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "nmt_with_attention.ipynb", + "version": "0.3.2", + "views": {}, + "default_view": {}, + "provenance": [ + { + "file_id": "1C4fpM7_7IL8ZzF7Gc5abywqQjeQNS2-U", + "timestamp": 1527858391290 + }, + { + "file_id": "1pExo6aUuw0S6MISFWoinfJv0Ftm9V4qv", + "timestamp": 1527776041613 + } + ], + "private_outputs": true, + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "metadata": { + "id": "AOpGoE2T-YXS", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "##### Copyright 2018 The TensorFlow Authors.\n", + "\n", + "Licensed under the Apache License, Version 2.0 (the \"License\").\n", + "\n", + "# Neural Machine Translation with Attention\n", + "\n", + "
\n", + "\n", + " Run in Google Colab \n", + "\n", + "View source on GitHub
" + ] + }, + { + "metadata": { + "id": "CiwtNgENbx2g", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "This notebook trains a sequence to sequence (seq2seq) model for Spanish to English translation using [tf.keras](https://www.tensorflow.org/programmers_guide/keras) and [eager execution](https://www.tensorflow.org/programmers_guide/eager). This is an advanced example that assumes some knowledge of sequence to sequence models.\n", + "\n", + "After training the model in this notebook, you will be able to input a Spanish sentence, such as *\"¿todavia estan en casa?\"*, and return the English translation: *\"are you still at home?\"*\n", + "\n", + "The translation quality is reasonable for a toy example, but the generated attention plot is perhaps more interesting. This shows which parts of the input sentence has the model's attention while translating:\n", + "\n", + "\"spanish-english\n", + "\n", + "Note: This example takes approximately 10 mintues to run on a single P100 GPU." + ] + }, + { + "metadata": { + "id": "tnxXKDjq3jEL", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "from __future__ import absolute_import, division, print_function\n", + "\n", + "# Import TensorFlow >= 1.9 and enable eager execution\n", + "import tensorflow as tf\n", + "\n", + "tf.enable_eager_execution()\n", + "\n", + "import matplotlib.pyplot as plt\n", + "from sklearn.model_selection import train_test_split\n", + "\n", + "import unicodedata\n", + "import re\n", + "import numpy as np\n", + "import os\n", + "import time\n", + "\n", + "print(tf.__version__)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "wfodePkj3jEa", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Download and prepare the dataset\n", + "\n", + "We'll use a language dataset provided by http://www.manythings.org/anki/. This dataset contains language translation pairs in the format:\n", + "\n", + "```\n", + "May I borrow this book?\t¿Puedo tomar prestado este libro?\n", + "```\n", + "\n", + "There are a variety of languages available, but we'll use the English-Spanish dataset. For convenience, we've hosted a copy of this dataset on Google Cloud, but you can also download your own copy. After downloading the dataset, here are the steps we'll take to prepare the data:\n", + "\n", + "1. Add a *start* and *end* token to each sentence.\n", + "2. Clean the sentences by removing special characters.\n", + "3. Create a word index and reverse word index (dictionaries mapping from word → id and id → word).\n", + "4. Pad each sentence to a maximum length." 
+ ] + }, + { + "metadata": { + "id": "kRVATYOgJs1b", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Download the file\n", + "path_to_zip = tf.keras.utils.get_file(\n", + " 'spa-eng.zip', origin='http://download.tensorflow.org/data/spa-eng.zip', \n", + " extract=True)\n", + "\n", + "path_to_file = os.path.dirname(path_to_zip)+\"/spa-eng/spa.txt\"" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "rd0jw-eC3jEh", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Converts the unicode file to ascii\n", + "def unicode_to_ascii(s):\n", + " return ''.join(c for c in unicodedata.normalize('NFD', s)\n", + " if unicodedata.category(c) != 'Mn')\n", + "\n", + "\n", + "def preprocess_sentence(w):\n", + " w = unicode_to_ascii(w.lower().strip())\n", + " \n", + " # creating a space between a word and the punctuation following it\n", + " # eg: \"he is a boy.\" => \"he is a boy .\" \n", + " # Reference:- https://stackoverflow.com/questions/3645931/python-padding-punctuation-with-white-spaces-keeping-punctuation\n", + " w = re.sub(r\"([?.!,¿])\", r\" \\1 \", w)\n", + " w = re.sub(r'[\" \"]+', \" \", w)\n", + " \n", + " # replacing everything with space except (a-z, A-Z, \".\", \"?\", \"!\", \",\")\n", + " w = re.sub(r\"[^a-zA-Z?.!,¿]+\", \" \", w)\n", + " \n", + " w = w.rstrip().strip()\n", + " \n", + " # adding a start and an end token to the sentence\n", + " # so that the model know when to start and stop predicting.\n", + " w = ' ' + w + ' '\n", + " return w" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "OHn4Dct23jEm", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# 1. Remove the accents\n", + "# 2. Clean the sentences\n", + "# 3. Return word pairs in the format: [ENGLISH, SPANISH]\n", + "def create_dataset(path, num_examples):\n", + " lines = open(path, encoding='UTF-8').read().strip().split('\\n')\n", + " \n", + " word_pairs = [[preprocess_sentence(w) for w in l.split('\\t')] for l in lines[:num_examples]]\n", + " \n", + " return word_pairs" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "9xbqO7Iie9bb", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# This class creates a word -> index mapping (e.g,. 
\"dad\" -> 5) and vice-versa \n", + "# (e.g., 5 -> \"dad\") for each language,\n", + "class LanguageIndex():\n", + " def __init__(self, lang):\n", + " self.lang = lang\n", + " self.word2idx = {}\n", + " self.idx2word = {}\n", + " self.vocab = set()\n", + " \n", + " self.create_index()\n", + " \n", + " def create_index(self):\n", + " for phrase in self.lang:\n", + " self.vocab.update(phrase.split(' '))\n", + " \n", + " self.vocab = sorted(self.vocab)\n", + " \n", + " self.word2idx[''] = 0\n", + " for index, word in enumerate(self.vocab):\n", + " self.word2idx[word] = index + 1\n", + " \n", + " for word, index in self.word2idx.items():\n", + " self.idx2word[index] = word" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "eAY9k49G3jE_", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def max_length(tensor):\n", + " return max(len(t) for t in tensor)\n", + "\n", + "\n", + "def load_dataset(path, num_examples):\n", + " # creating cleaned input, output pairs\n", + " pairs = create_dataset(path, num_examples)\n", + "\n", + " # index language using the class defined above \n", + " inp_lang = LanguageIndex(sp for en, sp in pairs)\n", + " targ_lang = LanguageIndex(en for en, sp in pairs)\n", + " \n", + " # Vectorize the input and target languages\n", + " \n", + " # Spanish sentences\n", + " input_tensor = [[inp_lang.word2idx[s] for s in sp.split(' ')] for en, sp in pairs]\n", + " \n", + " # English sentences\n", + " target_tensor = [[targ_lang.word2idx[s] for s in en.split(' ')] for en, sp in pairs]\n", + " \n", + " # Calculate max_length of input and output tensor\n", + " # Here, we'll set those to the longest sentence in the dataset\n", + " max_length_inp, max_length_tar = max_length(input_tensor), max_length(target_tensor)\n", + " \n", + " # Padding the input and output tensor to the maximum length\n", + " input_tensor = tf.keras.preprocessing.sequence.pad_sequences(input_tensor, \n", + " maxlen=max_length_inp,\n", + " padding='post')\n", + " \n", + " target_tensor = tf.keras.preprocessing.sequence.pad_sequences(target_tensor, \n", + " maxlen=max_length_tar, \n", + " padding='post')\n", + " \n", + " return input_tensor, target_tensor, inp_lang, targ_lang, max_length_inp, max_length_tar" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "GOi42V79Ydlr", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Limit the size of the dataset to experiment faster (optional)\n", + "\n", + "Training on the complete dataset of >100,000 sentences will take a long time. 
To train faster, we can limit the size of the dataset to 30,000 sentences (of course, translation quality degrades with less data):" + ] + }, + { + "metadata": { + "id": "cnxC7q-j3jFD", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Try experimenting with the size of that dataset\n", + "num_examples = 30000\n", + "input_tensor, target_tensor, inp_lang, targ_lang, max_length_inp, max_length_targ = load_dataset(path_to_file, num_examples)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "4QILQkOs3jFG", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Creating training and validation sets using an 80-20 split\n", + "input_tensor_train, input_tensor_val, target_tensor_train, target_tensor_val = train_test_split(input_tensor, target_tensor, test_size=0.2)\n", + "\n", + "# Show length\n", + "len(input_tensor_train), len(target_tensor_train), len(input_tensor_val), len(target_tensor_val)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "rgCLkfv5uO3d", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Create a tf.data dataset" + ] + }, + { + "metadata": { + "id": "TqHsArVZ3jFS", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "BUFFER_SIZE = len(input_tensor_train)\n", + "BATCH_SIZE = 64\n", + "embedding_dim = 256\n", + "units = 1024\n", + "vocab_inp_size = len(inp_lang.word2idx)\n", + "vocab_tar_size = len(targ_lang.word2idx)\n", + "\n", + "dataset = tf.data.Dataset.from_tensor_slices((input_tensor_train, target_tensor_train)).shuffle(BUFFER_SIZE)\n", + "dataset = dataset.apply(tf.contrib.data.batch_and_drop_remainder(BATCH_SIZE))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "TNfHIF71ulLu", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Write the encoder and decoder model\n", + "\n", + "Here, we'll implement an encoder-decoder model with attention which you can read about in the TensorFlow [Neural Machine Translation (seq2seq) tutorial](https://www.tensorflow.org/tutorials/seq2seq). This example uses a more recent set of APIs. This notebook implements the [attention equations](https://www.tensorflow.org/tutorials/seq2seq#background_on_the_attention_mechanism) from the seq2seq tutorial. The following diagram shows that each input words is assigned a weight by the attention mechanism which is then used by the decoder to predict the next word in the sentence.\n", + "\n", + "\"attention\n", + "\n", + "The input is put through an encoder model which gives us the encoder output of shape *(batch_size, max_length, hidden_size)* and the encoder hidden state of shape *(batch_size, hidden_size)*. \n", + "\n", + "Here are the equations that are implemented:\n", + "\n", + "\"attention\n", + "\"attention\n", + "\n", + "We're using *Bahdanau attention*. Lets decide on notation before writing the simplified form:\n", + "\n", + "* FC = Fully connected (dense) layer\n", + "* EO = Encoder output\n", + "* H = hidden state\n", + "* X = input to the decoder\n", + "\n", + "And the pseudo-code:\n", + "\n", + "* `score = FC(tanh(FC(EO) + FC(H)))`\n", + "* `attention weights = softmax(score, axis = 1)`. 
Softmax by default is applied on the last axis but here we want to apply it on the *1st axis*, since the shape of score is *(batch_size, max_length, hidden_size)*. `Max_length` is the length of our input. Since we are trying to assign a weight to each input, softmax should be applied on that axis.\n", + "* `context vector = sum(attention weights * EO, axis = 1)`. Same reason as above for choosing axis as 1.\n", + "* `embedding output` = The input to the decoder X is passed through an embedding layer.\n", + "* `merged vector = concat(embedding output, context vector)`\n", + "* This merged vector is then given to the GRU\n", + " \n", + "The shapes of all the vectors at each step have been specified in the comments in the code:" + ] + }, + { + "metadata": { + "id": "avyJ_4VIUoHb", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def gru(units):\n", + " # If you have a GPU, we recommend using CuDNNGRU(provides a 3x speedup than GRU)\n", + " # the code automatically does that.\n", + " if tf.test.is_gpu_available():\n", + " return tf.keras.layers.CuDNNGRU(units, \n", + " return_sequences=True, \n", + " return_state=True, \n", + " recurrent_initializer='glorot_uniform')\n", + " else:\n", + " return tf.keras.layers.GRU(units, \n", + " return_sequences=True, \n", + " return_state=True, \n", + " recurrent_activation='sigmoid', \n", + " recurrent_initializer='glorot_uniform')" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "nZ2rI24i3jFg", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "class Encoder(tf.keras.Model):\n", + " def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):\n", + " super(Encoder, self).__init__()\n", + " self.batch_sz = batch_sz\n", + " self.enc_units = enc_units\n", + " self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)\n", + " self.gru = gru(self.enc_units)\n", + " \n", + " def call(self, x, hidden):\n", + " x = self.embedding(x)\n", + " output, state = self.gru(x, initial_state = hidden) \n", + " return output, state\n", + " \n", + " def initialize_hidden_state(self):\n", + " return tf.zeros((self.batch_sz, self.enc_units))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "yJ_B3mhW3jFk", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "class Decoder(tf.keras.Model):\n", + " def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):\n", + " super(Decoder, self).__init__()\n", + " self.batch_sz = batch_sz\n", + " self.dec_units = dec_units\n", + " self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)\n", + " self.gru = gru(self.dec_units)\n", + " self.fc = tf.keras.layers.Dense(vocab_size)\n", + " \n", + " # used for attention\n", + " self.W1 = tf.keras.layers.Dense(self.dec_units)\n", + " self.W2 = tf.keras.layers.Dense(self.dec_units)\n", + " self.V = tf.keras.layers.Dense(1)\n", + " \n", + " def call(self, x, hidden, enc_output):\n", + " # enc_output shape == (batch_size, max_length, hidden_size)\n", + " \n", + " # hidden shape == (batch_size, hidden size)\n", + " # hidden_with_time_axis shape == (batch_size, 1, hidden size)\n", + " # we are doing this to perform addition to calculate the score\n", + " hidden_with_time_axis = tf.expand_dims(hidden, 1)\n", + " \n", + " # 
score shape == (batch_size, max_length, hidden_size)\n", + " score = tf.nn.tanh(self.W1(enc_output) + self.W2(hidden_with_time_axis))\n", + " \n", + " # attention_weights shape == (batch_size, max_length, 1)\n", + " # we get 1 at the last axis because we are applying score to self.V\n", + " attention_weights = tf.nn.softmax(self.V(score), axis=1)\n", + " \n", + " # context_vector shape after sum == (batch_size, hidden_size)\n", + " context_vector = attention_weights * enc_output\n", + " context_vector = tf.reduce_sum(context_vector, axis=1)\n", + " \n", + " # x shape after passing through embedding == (batch_size, 1, embedding_dim)\n", + " x = self.embedding(x)\n", + " \n", + " # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)\n", + " x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)\n", + " \n", + " # passing the concatenated vector to the GRU\n", + " output, state = self.gru(x)\n", + " \n", + " # output shape == (batch_size * max_length, hidden_size)\n", + " output = tf.reshape(output, (-1, output.shape[2]))\n", + " \n", + " # output shape == (batch_size * max_length, vocab)\n", + " x = self.fc(output)\n", + " \n", + " return x, state, attention_weights\n", + " \n", + " def initialize_hidden_state(self):\n", + " return tf.zeros((self.batch_sz, self.dec_units))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "P5UY8wko3jFp", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "encoder = Encoder(vocab_inp_size, embedding_dim, units, BATCH_SIZE)\n", + "decoder = Decoder(vocab_tar_size, embedding_dim, units, BATCH_SIZE)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "_ch_71VbIRfK", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Define the optimizer and the loss function" + ] + }, + { + "metadata": { + "id": "WmTHr5iV3jFr", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "optimizer = tf.train.AdamOptimizer()\n", + "\n", + "\n", + "def loss_function(real, pred):\n", + " mask = 1 - np.equal(real, 0)\n", + " loss_ = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=real, logits=pred) * mask\n", + " return tf.reduce_mean(loss_)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "hpObfY22IddU", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Training\n", + "\n", + "1. Pass the *input* through the *encoder* which return *encoder output* and the *encoder hidden state*.\n", + "2. The encoder output, encoder hidden state and the decoder input (which is the *start token*) is passed to the decoder.\n", + "3. The decoder returns the *predictions* and the *decoder hidden state*.\n", + "4. The decoder hidden state is then passed back into the model and the predictions are used to calculate the loss.\n", + "5. Use *teacher forcing* to decide the next input to the decoder.\n", + "6. *Teacher forcing* is the technique where the *target word* is passed as the *next input* to the decoder.\n", + "7. The final step is to calculate the gradients and apply it to the optimizer and backpropagate." 
+ ] + }, + { + "metadata": { + "id": "ddefjBMa3jF0", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "EPOCHS = 10\n", + "\n", + "for epoch in range(EPOCHS):\n", + " start = time.time()\n", + " \n", + " hidden = encoder.initialize_hidden_state()\n", + " total_loss = 0\n", + " \n", + " for (batch, (inp, targ)) in enumerate(dataset):\n", + " loss = 0\n", + " \n", + " with tf.GradientTape() as tape:\n", + " enc_output, enc_hidden = encoder(inp, hidden)\n", + " \n", + " dec_hidden = enc_hidden\n", + " \n", + " dec_input = tf.expand_dims([targ_lang.word2idx['']] * BATCH_SIZE, 1) \n", + " \n", + " # Teacher forcing - feeding the target as the next input\n", + " for t in range(1, targ.shape[1]):\n", + " # passing enc_output to the decoder\n", + " predictions, dec_hidden, _ = decoder(dec_input, dec_hidden, enc_output)\n", + " \n", + " loss += loss_function(targ[:, t], predictions)\n", + " \n", + " # using teacher forcing\n", + " dec_input = tf.expand_dims(targ[:, t], 1)\n", + " \n", + " total_loss += (loss / int(targ.shape[1]))\n", + " \n", + " variables = encoder.variables + decoder.variables\n", + " \n", + " gradients = tape.gradient(loss, variables)\n", + " \n", + " optimizer.apply_gradients(zip(gradients, variables), tf.train.get_or_create_global_step())\n", + "\n", + " if batch % 100 == 0:\n", + " print('Epoch {} Batch {} Loss {:.4f}'.format(epoch + 1,\n", + " batch,\n", + " loss.numpy() / int(targ.shape[1])))\n", + " \n", + " print('Epoch {} Loss {:.4f}'.format(epoch + 1,\n", + " total_loss/len(input_tensor)))\n", + " print('Time taken for 1 epoch {} sec\\n'.format(time.time() - start))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "mU3Ce8M6I3rz", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Translate\n", + "\n", + "* The evaluate function is similar to the training loop, except we don't use *teacher forcing* here. The input to the decoder at each time step is its previous predictions along with the hidden state and the encoder output.\n", + "* Stop predicting when the model predicts the *end token*.\n", + "* And store the *attention weights for every time step*.\n", + "\n", + "Note: The encoder output is calculated only once for one input." 
+ ] + }, + { + "metadata": { + "id": "EbQpyYs13jF_", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def evaluate(sentence, encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ):\n", + " attention_plot = np.zeros((max_length_targ, max_length_inp))\n", + " \n", + " sentence = preprocess_sentence(sentence)\n", + "\n", + " inputs = [inp_lang.word2idx[i] for i in sentence.split(' ')]\n", + " inputs = tf.keras.preprocessing.sequence.pad_sequences([inputs], maxlen=max_length_inp, padding='post')\n", + " inputs = tf.convert_to_tensor(inputs)\n", + " \n", + " result = ''\n", + "\n", + " hidden = [tf.zeros((1, units))]\n", + " enc_out, enc_hidden = encoder(inputs, hidden)\n", + "\n", + " dec_hidden = enc_hidden\n", + " dec_input = tf.expand_dims([targ_lang.word2idx['']], 0)\n", + "\n", + " for t in range(max_length_targ):\n", + " predictions, dec_hidden, attention_weights = decoder(dec_input, dec_hidden, enc_out)\n", + " \n", + " # storing the attention weigths to plot later on\n", + " attention_weights = tf.reshape(attention_weights, (-1, ))\n", + " attention_plot[t] = attention_weights.numpy()\n", + "\n", + " predicted_id = tf.multinomial(tf.exp(predictions), num_samples=1)[0][0].numpy()\n", + "\n", + " result += targ_lang.idx2word[predicted_id] + ' '\n", + "\n", + " if targ_lang.idx2word[predicted_id] == '':\n", + " return result, sentence, attention_plot\n", + " \n", + " # the predicted ID is fed back into the model\n", + " dec_input = tf.expand_dims([predicted_id], 0)\n", + "\n", + " return result, sentence, attention_plot" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "s5hQWlbN3jGF", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# function for plotting the attention weights\n", + "def plot_attention(attention, sentence, predicted_sentence):\n", + " fig = plt.figure(figsize=(10,10))\n", + " ax = fig.add_subplot(1, 1, 1)\n", + " ax.matshow(attention, cmap='viridis')\n", + " \n", + " fontdict = {'fontsize': 14}\n", + " \n", + " ax.set_xticklabels([''] + sentence, fontdict=fontdict, rotation=90)\n", + " ax.set_yticklabels([''] + predicted_sentence, fontdict=fontdict)\n", + "\n", + " plt.show()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "sl9zUHzg3jGI", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def translate(sentence, encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ):\n", + " result, sentence, attention_plot = evaluate(sentence, encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ)\n", + " \n", + " print('Input: {}'.format(sentence))\n", + " print('Predicted translation: {}'.format(result))\n", + " \n", + " attention_plot = attention_plot[:len(result.split(' ')), :len(sentence.split(' '))]\n", + " plot_attention(attention_plot, sentence.split(' '), result.split(' '))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "WrAM0FDomq3E", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "translate('hace mucho frio aqui.', encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ)" + ], + "execution_count": 0, + "outputs": [] + }, + { + 
"metadata": { + "id": "zSx2iM36EZQZ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "translate('esta es mi vida.', encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "A3LLCx3ZE0Ls", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "translate('¿todavia estan en casa?', encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "DUQVLVqUE1YW", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# wrong translation\n", + "translate('trata de averiguarlo.', encoder, decoder, inp_lang, targ_lang, max_length_inp, max_length_targ)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "RTe5P5ioMJwN", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Next steps\n", + "\n", + "* [Download a different dataset](http://www.manythings.org/anki/) to experiment with translations, for example, English to German, or English to French.\n", + "* Experiment with training on a larger dataset, or using more epochs\n" + ] + } + ] +} \ No newline at end of file diff --git a/tensorflow/contrib/eager/python/examples/notebooks/1_basics.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/1_basics.ipynb deleted file mode 100644 index 51d10a7784..0000000000 --- a/tensorflow/contrib/eager/python/examples/notebooks/1_basics.ipynb +++ /dev/null @@ -1,429 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "U9i2Dsh-ziXr" - }, - "source": [ - "# An introduction to TensorFlow\n", - "\n", - "This is an introductory tutorial for using TensorFlow. It will cover:\n", - "\n", - "* Importing required packages\n", - "* Creating and using Tensors\n", - "* Using GPU acceleration\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "z1JcS5iBXMRO" - }, - "source": [ - "## Import TensorFlow\n", - "\n", - "To get started, import the `tensorflow` module and enable eager execution.\n", - "Eager execution enables a more interactive frontend to TensorFlow, the details of which we will discuss much later." - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "cellView": "code", - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - } - }, - "colab_type": "code", - "id": "RlIWhyeLoYnG" - }, - "outputs": [], - "source": [ - "import tensorflow as tf\n", - "\n", - "tf.enable_eager_execution()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "H9UySOPLXdaw" - }, - "source": [ - "## Tensors\n", - "\n", - "A Tensor is a multi-dimensional array. Similar to NumPy `ndarray` objects, `Tensor` objects have a data type and a shape. Additionally, Tensors can reside in accelerator (like GPU) memory. TensorFlow offers a rich library of operations ([tf.add](https://www.tensorflow.org/api_docs/python/tf/add), [tf.matmul](https://www.tensorflow.org/api_docs/python/tf/matmul), [tf.linalg.inv](https://www.tensorflow.org/api_docs/python/tf/linalg/inv) etc.) that consume and produce Tensors. These operations automatically convert native Python types. 
For example:\n" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "cellView": "code", - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 125 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 320, - "status": "ok", - "timestamp": 1526420535530, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "ngUe237Wt48W", - "outputId": "b1a1cd60-4eb3-443d-cd6b-68406390784e" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tf.Tensor(3, shape=(), dtype=int32)\n", - "tf.Tensor([4 6], shape=(2,), dtype=int32)\n", - "tf.Tensor(25, shape=(), dtype=int32)\n", - "tf.Tensor(6, shape=(), dtype=int32)\n", - "tf.Tensor(aGVsbG8gd29ybGQ, shape=(), dtype=string)\n", - "tf.Tensor(13, shape=(), dtype=int32)\n" - ] - } - ], - "source": [ - "print(tf.add(1, 2))\n", - "print(tf.add([1, 2], [3, 4]))\n", - "print(tf.square(5))\n", - "print(tf.reduce_sum([1, 2, 3]))\n", - "print(tf.encode_base64(\"hello world\"))\n", - "\n", - "# Operator overloading is also supported\n", - "print(tf.square(2) + tf.square(3))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "IDY4WsYRhP81" - }, - "source": [ - "Each Tensor has a shape and a datatype" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 53 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 215, - "status": "ok", - "timestamp": 1526420538162, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "srYWH1MdJNG7", - "outputId": "5e4ac41c-5115-4e50-eba0-42e249c16561" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "(1, 2)\n", - "\u003cdtype: 'int32'\u003e\n" - ] - } - ], - "source": [ - "x = tf.matmul([[1]], [[2, 3]])\n", - "print(x.shape)\n", - "print(x.dtype)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "eBPw8e8vrsom" - }, - "source": [ - "The most obvious differences between NumPy arrays and TensorFlow Tensors are:\n", - "\n", - "1. Tensors can be backed by accelerator memory (like GPU, TPU).\n", - "2. Tensors are immutable." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Dwi1tdW3JBw6" - }, - "source": [ - "### NumPy Compatibility\n", - "\n", - "Conversion between TensorFlow Tensors and NumPy ndarrays is quite simple as:\n", - "* TensorFlow operations automatically convert NumPy ndarrays to Tensors.\n", - "* NumPy operations automatically convert Tensors to NumPy ndarrays.\n", - "\n", - "Tensors can be explicitly converted to NumPy ndarrays by invoking the `.numpy()` method on them.\n", - "These conversions are typically cheap as the array and Tensor share the underlying memory representation if possible. However, sharing the underlying representation isn't always possible since the Tensor may be hosted in GPU memory while NumPy arrays are always backed by host memory, and the conversion will thus involve a copy from GPU to host memory." 
- ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 251 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 238, - "status": "ok", - "timestamp": 1526420540562, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "lCUWzso6mbqR", - "outputId": "fd0a22bc-8249-49dd-fcbd-63161cc47e46" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "TensorFlow operations convert numpy arrays to Tensors automatically\n", - "tf.Tensor(\n", - "[[ 42. 42. 42.]\n", - " [ 42. 42. 42.]\n", - " [ 42. 42. 42.]], shape=(3, 3), dtype=float64)\n", - "And NumPy operations convert Tensors to numpy arrays automatically\n", - "[[ 43. 43. 43.]\n", - " [ 43. 43. 43.]\n", - " [ 43. 43. 43.]]\n", - "The .numpy() method explicitly converts a Tensor to a numpy array\n", - "[[ 42. 42. 42.]\n", - " [ 42. 42. 42.]\n", - " [ 42. 42. 42.]]\n" - ] - } - ], - "source": [ - "import numpy as np\n", - "\n", - "ndarray = np.ones([3, 3])\n", - "\n", - "print(\"TensorFlow operations convert numpy arrays to Tensors automatically\")\n", - "tensor = tf.multiply(ndarray, 42)\n", - "print(tensor)\n", - "\n", - "\n", - "print(\"And NumPy operations convert Tensors to numpy arrays automatically\")\n", - "print(np.add(tensor, 1))\n", - "\n", - "print(\"The .numpy() method explicitly converts a Tensor to a numpy array\")\n", - "print(tensor.numpy())" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "PBNP8yTRfu_X" - }, - "source": [ - "## GPU acceleration\n", - "\n", - "Many TensorFlow operations can be accelerated by using the GPU for computation. Without any annotations, TensorFlow automatically decides whether to use the GPU or CPU for an operation (and copies the tensor between CPU and GPU memory if necessary). Tensors produced by an operation are typically backed by the memory of the device on which the operation executed. For example:" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "cellView": "code", - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 53 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 340, - "status": "ok", - "timestamp": 1526420543562, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "3Twf_Rw-gQFM", - "outputId": "2239ae2b-adf3-4895-b1f3-464cf5361d1b" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Is there a GPU available: False\n", - "Is the Tensor on GPU #0: False\n" - ] - } - ], - "source": [ - "x = tf.random_uniform([3, 3])\n", - "\n", - "print(\"Is there a GPU available: \"),\n", - "print(tf.test.is_gpu_available())\n", - "\n", - "print(\"Is the Tensor on GPU #0: \"),\n", - "print(x.device.endswith('GPU:0'))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "vpgYzgVXW2Ud" - }, - "source": [ - "### Device Names\n", - "\n", - "The `Tensor.device` property provides a fully qualified string name of the device hosting the contents of the Tensor. This name encodes a bunch of details, such as an identifier of the network address of the host on which this program is executing and the device within that host. This is required for distributed execution of TensorFlow programs, but we'll skip that for now. 
The string will end with `GPU:\u003cN\u003e` if the tensor is placed on the `N`-th GPU on the host."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "ZWZQCimzuqyP"
- },
- "source": [
- "\n",
- "\n",
- "### Explicit Device Placement\n",
- "\n",
- "The term \"placement\" in TensorFlow refers to how individual operations are assigned (placed on) a device for execution. As mentioned above, when there is no explicit guidance provided, TensorFlow automatically decides which device to run an operation on, and copies Tensors to that device if needed. However, TensorFlow operations can be explicitly placed on specific devices using the `tf.device` context manager. For example:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "height": 53
- },
- "colab_type": "code",
- "executionInfo": {
- "elapsed": 1762,
- "status": "ok",
- "timestamp": 1526420547562,
- "user": {
- "displayName": "",
- "photoUrl": "",
- "userId": ""
- },
- "user_tz": 420
- },
- "id": "RjkNZTuauy-Q",
- "outputId": "2e613293-ccac-4db2-b793-8ceb5b5adcfd"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "On CPU:\n",
- "10 loops, best of 3: 35.8 ms per loop\n"
- ]
- }
- ],
- "source": [
- "def time_matmul(x):\n",
- "  %timeit tf.matmul(x, x)\n",
- "\n",
- "# Force execution on CPU\n",
- "print(\"On CPU:\")\n",
- "with tf.device(\"CPU:0\"):\n",
- "  x = tf.random_uniform([1000, 1000])\n",
- "  assert x.device.endswith(\"CPU:0\")\n",
- "  time_matmul(x)\n",
- "\n",
- "# Force execution on GPU #0 if available\n",
- "if tf.test.is_gpu_available():\n",
- "  with tf.device(\"GPU:0\"): # Or GPU:1 for the 2nd GPU, GPU:2 for the 3rd etc.\n",
- "    x = tf.random_uniform([1000, 1000])\n",
- "    assert x.device.endswith(\"GPU:0\")\n",
- "    time_matmul(x)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "YEOJTNiOvnpQ"
- },
- "source": [
- "## Next Steps\n",
- "\n",
- "In this tutorial we covered the most fundamental concepts in TensorFlow - `Tensor`s, operations, and devices.\n",
- "In [the next tutorial](https://github.com/tensorflow/models/tree/master/official/contrib/eager/python/examples/notebooks/2_gradients.ipynb) we will cover automatic differentiation - a building block required for training many machine learning models like neural networks."
- ]
- }
- ],
- "metadata": {
- "colab": {
- "collapsed_sections": [],
- "default_view": {},
- "name": "TensorFlow: An introduction",
- "provenance": [],
- "version": "0.3.2",
- "views": {}
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
diff --git a/tensorflow/contrib/eager/python/examples/notebooks/2_gradients.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/2_gradients.ipynb
deleted file mode 100644
index 9c1af9c208..0000000000
--- a/tensorflow/contrib/eager/python/examples/notebooks/2_gradients.ipynb
+++ /dev/null
@@ -1,323 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "vDJ4XzMqodTy"
- },
- "source": [
- "# Automatic Differentiation\n",
- "\n",
- "In the previous tutorial we introduced `Tensor`s and operations on them. In this tutorial we will cover [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation), a key technique for optimizing machine learning models."
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "GQJysDM__Qb0" - }, - "source": [ - "## Setup\n" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - } - }, - "colab_type": "code", - "id": "OiMPZStlibBv" - }, - "outputs": [], - "source": [ - "import tensorflow as tf\n", - "tf.enable_eager_execution()\n", - "\n", - "tfe = tf.contrib.eager # Shorthand for some symbols" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "1CLWJl0QliB0" - }, - "source": [ - "## Derivatives of a function\n", - "\n", - "TensorFlow provides APIs for automatic differentiation - computing the derivative of a function. The way that more closely mimics the math is to encapsulate the computation in a Python function, say `f`, and use `tfe.gradients_function` to create a function that computes the derivatives of `f` with respect to its arguments. If you're familiar with [autograd](https://github.com/HIPS/autograd) for differentiating numpy functions, this will be familiar. For example: " - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - } - }, - "colab_type": "code", - "id": "9FViq92UX7P8" - }, - "outputs": [], - "source": [ - "from math import pi\n", - "\n", - "def f(x):\n", - " return tf.square(tf.sin(x))\n", - "\n", - "assert f(pi/2).numpy() == 1.0\n", - "\n", - "\n", - "# grad_f will return a list of derivatives of f\n", - "# with respect to its arguments. Since f() has a single argument,\n", - "# grad_f will return a list with a single element.\n", - "grad_f = tfe.gradients_function(f)\n", - "assert tf.abs(grad_f(pi/2)[0]).numpy() \u003c 1e-7" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "v9fPs8RyopCf" - }, - "source": [ - "### Higher-order gradients\n", - "\n", - "The same API can be used to differentiate as many times as you like:\n" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 276 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 730, - "status": "ok", - "timestamp": 1527005655565, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "3D0ZvnGYo0rW", - "outputId": "e23f8cc6-6813-4944-f20f-825b8a03c2ff" - }, - "outputs": [ - { - "data": { - "image/png": 
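The claim in the cell above, that the same API can differentiate as many times as you like, can be checked numerically; the plot output that followed here is elided below. A minimal sketch, assuming the notebook's TF 1.x eager setup: for f(x) = sin^2(x), the identities f'(x) = sin(2x) and f''(x) = 2*cos(2x) give known values to assert against.

```python
import tensorflow as tf
tf.enable_eager_execution()
tfe = tf.contrib.eager

from math import pi

def f(x):
    return tf.square(tf.sin(x))

grad = tfe.gradients_function(f)                      # f'
grad2 = tfe.gradients_function(lambda x: grad(x)[0])  # f'' by nesting

assert abs(grad(pi / 4)[0].numpy() - 1.0) < 1e-5  # f'(pi/4) = sin(pi/2) = 1
assert abs(grad2(0.0)[0].numpy() - 2.0) < 1e-5    # f''(0) = 2 * cos(0) = 2
```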
"iVBORw0KGgoAAAANSUhEUgAAAXYAAAEDCAYAAAAhsS8XAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzsnXd0HNX5sJ/ZXrTq3ZLV3IvcDdgGGwOm2WCbHhJa6C2B\nUBISQioBfoQPkjhACA4QCIQSDITQbGMbsHHvVbZ6s7q0vc18f4xmJVltJa0q+5zDOXhn9s7dqzvv\nfe/briBJkkSYMGHChBkxqAa7A2HChAkTJrSEBXuYMGHCjDDCgj1MmDBhRhhhwR4mTJgwI4ywYA8T\nJkyYEUZYsIcJEybMCCNkgl0URVasWMHtt98eqibDhAkTJkwvCJlgf+2118jJyQlVc2HChAkTppeE\nRLBXVlayceNGrrjiilA0FyZMmDBh+kBIBPvjjz/OQw89hCAIoWguTJgwYcL0gT4L9g0bNhAfH8/E\niRMJVycIEyZMmMFH6GutmGeeeYYPP/wQtVqN2+3Gbrdz3nnn8dRTT3X6HUmSwtp9CKittvH8UxsQ\nxZY/4aXXTGfa7PRB7NXAU1dj5y9PrIfmYUgeFcnya2aQmBI5uB0bYE5WNPHS/9uE6JcHYukVucw8\nPWOQezXw7NhcyCfvH0Bqfi+uumkO4ycnD3KvBpY+C/bWbNu2jdWrV/PCCy90e291tTVUj+03EhIs\nQ7qfWzfls2tzMTNPH01UrJEv/3eU5LRIVnx/5mB3rUP6azw3fnaMQ7vLOX1RNrVVNvIOVZGeFcPS\nq6YNmT6GmlP7KYoi/3ltF9WVNhacO4btXxfi9fi5+Mpc0jJjhkw/+5t9O0r5Zu1xDEYtpy/KZuOn\nR4mOM3HlTbNRqTo3UAynv3swhOPYhymSJJF3sAqtTs35l05mQm4K6VkxVJY2UVdtH+zuDRgOu4ej\n+yqIjDYwbW4a514yiYTkCMqKGnC7vIPdvQFjz9YSqittjJuSxNTZaVywcgoAX3xwCL9PHOTeDRyH\ndpej0ai47PqZTJyWwoTcFOprHBzdf3KwuzaghFSwz507NyhtPUzfOVnehLXRRdbYeLQ6DQATp6UC\ncGhv+WB2bUA5sLMMv19i2pz0gEaWNS4BUZQoOlE3yL0bGDxuHzu+LsRk1jH/nDEApI6OZtL0VFxO\nLyfLmwa5hwNDU4OT+loHozJiiIw2AjB7QSYajYrtXxXg9foHuYcDR1hjH6bkHawCYOzkxMBnmWPj\nMJq1HDtwEt93YBJ7PT4O7CrDYNQwPrfFhpo1Lh6AgmPVg9W1AaWyrBG/X2JCbjIGozbweXqWbIIp\nLawfrK4NKMX58kI+Oic28FmERc/UOWnYbR7yDn53tPawYB+GiKLI8SNVGEzaNvZTtVrFhKkpuF0+\n8o+OfKGWd7gKt8vHlJmj0GrVgc9j4kxExRopzq/7Tixw5cWNAKSkR7f5PHV0NIIApUXfEcHevEMb\nnR3b5vPxU5IAqChpHPA+DRZhwT4MKS2sx+XwMmZCYjuH0MRpKQAc3lsxGF0bUJQXNWdiYpvPBUEg\ne1w8Pq9IyXdAW60oaUAQ5Gig1uj0GhJTIqkqb8Lj9g1S7wYGn89PWXE90XGmgBlGITrWhN6gobIs\nLNjDDGE6MsMoRMUYSUiOoLKsacQ7zaoqrGh1amLiTO2uZY1LAKDgWM1Ad2tA8Xr9VFVYSUi2oNNr\n2l1Py4xBkqC8pGEQejdwVJQ04vOKZJyirYO80CeNiqSpwYXD5h6E3g08YcE+zJAkiZKCOswWHUmp\nHcdpJyRbEEWJupqRGx3jdvloqHWQmGLpMCciMcWCOUJH0fEaRHHkLnBV5U2IokRKelSH10dlyOaZ\nkW5nD5hhcuI6vJ48Sh6fyrLvhiM5LNiHGQ67B6fDS2JyZKdJXgnJcqxrdeXQj8vtLcpv6ywJSRAE\nMsfF43L6RvTLXF4sa+Kn2tcVkkdFodGoKCvqu8b+zjtv8f3vX8Fvf/ton9sKNUX5tWi0KlLSOl7g\nFDPVSJ4LrWm/dwszpKk5aQMgLimi03u+C4JdCeFLTOk8YSMlLYqDu8qpOWkjtRPBN9wpb/YzpHai\nsas1KlLSoygpqMdhc2OK0Pf6WWvWvMsf//hnkpNTet1Gf9BY76Sxzknm2DjUmo51VXlnBye/I3b2\nsMY+zKitkgV7fGLngj02wYxKLVBdaRuobg04VRWyYO/MHAUQlyCPkTJmIw2/X+RkeRNxCWb0Bm2n\n943KaA577IPW/vTTf6C8vIyHH76ft99+s9ft9AeKUzQto/MMW61OQ1xiBFWV1hHve4Kwxj7sUDT2\n+C40drVaRVyCmdpqG36/iFo9stZvSZKoKrditugwWzrXQKNijahUwojLxH17/XF25VXj8fhx+Hzo\nGh1s/+vmTu8X/SJ2RA59egTDxhMd3jNnQiJXLh7TaRsPPPAztm79lj//+UUiI4dWDZ76Zl9SXBfK\nDshmqZqTNqpPWgM295HKyHrjvwPUVNnQGzRERHa9pU5ItiD6pREn1ADsVjcOu6fbIl9qtYqYeBN1\nNfYRWXlU0Ty7W7hVzddFf181VYlApbUhhDLHYxPMXd6XnCbPl5PfATt7WGMfRng9PhrrnM2JJ11X\nx5Tt7BVUn7QGbO4jhZPliuO0+98VlxBBbZWdpgYnUTHtwyKHI1cuHsNdV83g1b9upuhELdf/cG63\ntvN/v7ydpgYnN99xxoirrFpX48Bk1rXJuu2IlsiYRqYxsiughjX2YURts2bSlRlGocWBOvLsy4p9\nPZiyvLGJshZXWzXydi71tXYMJm1QDtHYeBM+r4itaWTFcXs9PqyNLmLiu1+0IyL1mCN0VJY2jcgd\nXGvCgn0YEbCvd2NLBIiNN6NSCdSMwMiYqoqeaeww8hyoPq9fFmixwe1CouPkBa6+ti8L3NDT9Otq\nHED3ZhiQQ2ATUyNx2D3YbZ7+7tqgEhbsw4hgHKcKao2K2AQztVWyA3WkIIoS1ZVWYuJNHWZankpc\n8wtfO8J8DbLfAKI7yLrtiNhmjba+WRD2hnfe+YDIyKHldAzY1+O7F+xAIEu5sa734zAcCAv2YURt\nlQ2VWgj6ZU5ItuD3S4GogZFAU4MTr8dPQlJwfgNThA6DUTPinMi11fIi31E5hY5Q5kx97cgSaMrc\nDkZjB7luDEBDnbPf+jQUCAv2YYIoitRW24mNNwcdvjgS7eyN9fILGR1r7OZOGUEQiE2IoLFeXhBG\nCjXNpqXoYE0xMSYEoa+mmKGHUjYjJi44wR7VPG/CGnuYIUFDnRO/TwzKDKOg3DuS7MuNzZpWVJAC\nDVrMMSOpdk5AsAepsas1KiJjjNTXOEaU47Cuxo7ZokdvCC7AL6yxhxlS9MRxqqBotU0NI2cSN9bL\nmlZUTHAaO7Qkrijmi5GAYpazRBmC/k5snBm3y4fTMTKODHS7vNitnqDNMAAGoxaDUUNDWGMPMxRQ\nbMTdZde1Rm/QojdoaGxw9Ve3BhzFFNMTwa68+HUjJORR
kiRqquxExciZtcESHXCgjoxxCETEBBHq\n2JqoWBNNDc4RFVRwKn0W7B6PhyuuuILly5ezbNky/vKXv4SiX2FOQdG6eyLQlPubGpyI4sjYfjfW\nOzGatEFFxCgoERMjJTLGYfPgcfuCdpwqxI4wB2rAcRpkRIxCdIwRSQJr48hReE6lz4Jdp9Px2muv\nsWbNGtasWcOmTZvYt29fKPoWphVNDU7UGhWmCF2PvhcZbUT0S9itwz8xxe8XsTa6Ag6wYNHq1ETF\nGKkbIaYYRTAHa19XiGkWgL0NeWxdtvebb77ijTdeDfq7lZUVfPHFp0Hd+/jjv2bjxvXd3te6lMCa\nNe/x2Wf/C6r9qICdXR6HTz75L9XVLUdJPvnk7ykqKgyqraFKSEoKGI3yi+bxePD5RvYRXINFU4OL\nyChDj9PBFQ2/sd7ZI3vsUMTa6EKS6FVpgMgYIyX5TjxuX4+0/aGIIpCCTU5SUByHvY2MObVs7/z5\nZ7a7x+/3o1ar231eXl7GF198xnnnXdCrZ3eE4gyPjDawfPllQX9PGQfFEf+//33EzJlTSUrKAODh\nh38esj4OFiGZ4aIosnLlSoqLi7n22mvJzc0NRbNhmnG7vLhdvnZnWgZDZLQszGVTTudlTYcDgYiY\nHpqjACKjlHFw9SiyaCjS0EuNXatTY4nU98oU07ps78UXX4LFYuHIkUPcd99DPP74r7FYIsnLO8r4\n8ROZP/9MnnvuaQRBQKvV8OyzL/Dii6soKirkppuu5YILlnLllde0af+ZZ55k9+6dpKSktonaOXr0\nCH/+8zO4XC6ioqL5+c8fIzY2jnvuuQ3RFUt1XSGxH5Zht9sxmUycccYCfve7x3jpJXk3UVlZwcMP\n38+rr77JK6/8nW+++QqHw4lWSmTS9HvYsGEdR44c5sEHH0Sj0fL886t54IF7ufvu+zh8+ADl5eXc\neee9gKzZHz16hB//+AE+//wT3nnnLfx+H5MmTeEnP/npkKrBExLBrlKpWLNmDTabjTvvvJPjx48z\nZkznJUDD9IymZufnqYf0BoMiBEdCZExDLyJiFJQFztroHPaC/Vv315ROK+RP+ZsRCnomTJxjPfh8\nInnfbGgjiGYkTmXlmKWdfu/Usr2ffPLfNt8vLS3mT396AYCHH76Pn/zkp0yZkktEhIamJg+33343\nb731Ok8++f/atb1x45eUlpbwz3++TU1NDd///hUsXXopPp+PZ599iieeeIaoqGjWrfuCF19cxc9+\n9kskScLpsHPdlT9l6VXTWL36bwBkZGTi9/uoqCgnJSWVdes+55xzzgPgssuu4oYbbsbn9XPj9+9k\n566t3P/Idbz33tv88pe/ICGhbWGwRYvO5fbbbwwI9nXrPuf6639IUVEh69Z9zgsvrEatVvPHPz7J\n559/wvnnX9Sjv0V/EtI9aUREBHPnzuWrr77qVrAnJAyPioNDoZ/VzdUMU9OiO+1PZ58b9HLFO5fD\nNyR+S1/64HHKCUaZ2fE9bidttLxb8fukbr87FMapK9xuH6oIAU0v6uyrNSp8PhEkUKtbBLPJqOv2\nd6tUEBdnJjragsViwNj8HYNBy8KFSwPfP/30uTz//HMsW7aMJUuWkJSURHS0CZ1O0+Ezjh07wIoV\nl5KQYCEhwcK8eWcQGWnEZquhoCCfBx+8F0mSEEWRxMREEhIsCAhkpE4nMTmShAQLZrMes9lAQoKF\npUsvZuvWTdxyyy1s2rSeZ599loQEC7t2bebll1/G6XRSVV9FWXkaCQkWtFo1ktQyL7RaNTExJsaO\nTSczM4OKigJGjx5NeXkpixcv4I033uD48WPccceNSJKE2+0mLS15SM2bPgv2uro6tFotFosFl8vF\nli1buPXWW7v9XnX10C9OlZBgGRL9LC2WDyJWaYQO+9NVPyVJQqNVUV1pHfTf0tfxPFkhn5QjIva4\nHalZhlWUNnb53aHyN+8Mr9dPbN5YZo45gwvPn9Lj7x/aU87GT49x9sUTmDA1uc217n63KErU1trw\netVYrS6cTg/V1VZcLi8+X8vcXLHiGqZNm8uWLV9z5ZVX8swzq2hocODx+Dp8htPpwWZzB6653V6a\nmpzU1dnIysrm+edXt+uny+VFY9Gh0amorrZit7uRJDXV1VZOO+0sHn30p8yaNQ+/X8JojKGsrJZf\n/erXrF79OvHxCTx8/2+wNjgpL6vH6/W3+f1er5/6egfV1VYWLDibd99dQ0ZGJvPnL6S62orV6mTJ\nkou47ba7ejR+oSDYxaPPUTHV1dVcd911XHrppVxxxRUsWLCAhQsX9rXZMK1QzCi9McUIgkBktJHG\nBuewzzhsqHNiMut65fxsbYoZziip8PGJPQvxU1Ac6LZ+DPUrKyslOzuHa6+9nilTplBcXIjJZMZu\n79hpO23aTNau/RxRFKmpqWHXrp0AjB6dSX19AwcO7AfA5/NRUJAPtBwy0tE7MWpUGmq1ilde+TuL\nF8tmGI/HgyBAZGQUDoeD44W7geY5ZTJhs3UcMbVw4WK++mpDG5POrFlz2bBhHfX1ssLV1NREZWVl\nr8aqv+izxj5+/Hjef//9UPQlTCcoSTmW6N5FtcihfnacDi8mc8/CJYcKfr+IrcnV6yPN9AY59r1p\nmMcuK6nwSjninqIIdmtTb8YhOHv+O++8ya5dO1Cr1YwfP47TT58PgFqt4cYbv8eFFy5r4zxduPBs\ndu3azvXXX016egYzZswCQKPR8LvfPcmzz/4fNpsNUfRz5ZXXkJWVjd8vtfk9p7J48RKef/5P3HLL\nnYBsJl62bAXXXXcVKSmpZGeNw14vv1sXXbSMxx57DK1Wx/PPr27jO7BYLGRmZlNcXMiECZMAyMzM\n4pZb7uT+++9CFCW0Wi333/8QycnJHfZlMBCkQVLjhvJ2V2GobMtff/5b/H6R6++e1+H17vq5ef0J\n9m4rYcX3Z5CcNnhlV/synvW1dt56aTsTpiZz9sUTetXGO//YQUOtg5t/cmanEQxD5W/eGbu/Lebb\nDflcddOcwCEiPcHn8/PS018xKiOaS66Z3g89bEt/jeen/zlAwbEarr9nXq+UleL8Wj5+ez9zFmQy\ne0HmkP+7KwyYKSZM/6JoqpG91NahVSz7MI6MCZQS6GFyUmssUQZ8PhGnffgesqBkS0b38pg/jUaN\nyawb9lmX1kYXGo0Ko6nr4/A6I1AMrH5kZOGeSliwD3FsTW4kCSKjei/QomLkRUERjsORvsSwKyj2\n2OFsjrE1m1D6Mg4RUfrmeTV8fS7WRheWXiTsKUREGlCpBJrqh+9c6IqwYB/iBBynoRBoI0Fj78OB\n1C3JWsP3ZbY2udHp1d0e3NwVkVEGRFEatsfDuV0+3C5fnzKpVSoBc4QOm3X4zoWuCAv2IU5LclLv\nJ7GinQxrjT0g2Hs/DgHH4TBd4CRJwtroIiKyb6UhlO/3Z2RMf6LsWvpaIiMi0oDd6hmRVR7Dgn2I\n05dQRwWVSsA
SbRjW205rkwuDSYtW1/tAroDGPkwFmsftw+vxY4nU96kdRSAO13FQ+t3bKDGFiCh5\nHEdCgbxTCQv2IU6LYO/bJI6KNuJyyjVnhhuSJGFvchNhCZFAG6amGGujLIAi+qipBmLZexXyOPhY\nlV1sCDR2kP1YI42wYB/iNDXI3v++xp8PZzu72+XD5xOJ6KOmqtGoMUcM34gQJfbc0kdTjPL94TYO\nu3fv5KGH7gv0uzNTzD333MbRo0e6bU/Z+diaXPzpT39i587tverX22+/idvdsjg89NCPsdsHt0R0\nWLAPYSRJoqnBiSW6995/BWXbaRuG207lRY6w9L3ssCXagK3JNSztqoqG3dcFztI8F4abYAcQBLoV\n7MGiaOyNDU7uvfdeZs2a06t23nnnTdzulrF86qlnMZsHt9Dc8C5MPcJxu3x43H5S0ntvX1dQzBjD\n0Z6oLEZ9FWggh41WljZht7r75LcYDBRTTF8FmlanwWDU9Eiwu1wufvnLn1JdXYUoilx//c0sXnxu\np2V1y8pK+b//exybrQlJEvjtb58gNXUUq1Y9x9atmxEEFddddxPnnHMeu3fvZPXqvxEVFU1BwQkm\nTJjIo4/+FoBvv93Mn//8DNHRMYwdOx6ApkYnGq0qEBnkdrt5/PFfU1RUSEZGBh5PS7TP9u3f8vLL\nf8Pr9TJqVBqPPPIYBoOBK664hLMXXcAXm7/Er1vGV9veZNas09HrDfzvfx/xm9/8AZB3Cf/+9xs8\n8cQzPP30Exw9egi3282iRedw00238u67b1FTU80999xOdHQ0zz33PFdccQkvv/xP3njjNZKTU1ix\n4nIAVq/+G2azmauuupZ//euffPnlF3i9Ps46axE33dR9fa2eEBbsQxjlxeurLRFaBPtwtCfam0In\n2C2tQh6Hm2BXNHb/hv+yY9WePu065tg8iH6J/IffBsAyew4JV1zd6f1bt24mPj6Bp556FgCHw95l\nWd1f//oXXHfdjaxYsZTy8jpEUWTjxvWcOJHHa6/9m/r6Om6++TpmzJgJQF7eMV5//R3i4uK4444f\nsn//XsaPn8hTT/2eP//5RUaNSuOXv/wZ0D6Gfc2adzEajbzyyr84ceI4N910LQCNjQ28+upqnnvu\nr+j1Bt5441Xeeut1brjhZvk3R5pZMu8uRqfHcrSkVB6XOafx9NN/wO12odcbWLfuCxYvXgLAbbfd\nhcViQRRFfvSjO8jPP87ll1/Nv//9ZqCcsYzcr3PPXcJzz/0xINjXr1/LM8/8me3bv6W0tJiXXnoN\nSZJ4+OH72bt3D9OmhS4TOCzYhzC2EAo087DW2BUTRN8XuMCBG43D7+ARa5MLlUpAq1XTVxe4oBKQ\n/CKSJJs3uiM7ewyrVj3HCy/8hTPOWMC0adPJzz9Bfv4J7rvvruayuhLx8Qk4HA5qaqpZsEAuBqjV\nypr1vn17OPfc8wGIiYllxoxZHD58CJPJxKRJk4mPjwdgzJhxVFRUYDAYSU0dxahRaQAsWXIhH3zw\nH3kXm9ayKO/Zs5srmhelnJwxjBkzDoCDBw9QWJjPHXf8EEmS8Pl8TJkyLfC9JUvO57//ymuj7KjV\nak477Qy+/vorFi1azJYtX3PXXT8CYN26z/jwwzX4/X7q6mopKCggO3sMIDX/pyD//9ix42loaKC2\ntob6+noiIyNJTEzinXfeYvv2bdx007VyXXmni9LS4rBg/66gCGFzH6NBWrcxHCMhrMoCF4JxaHEi\nD79xsDW6MVv0JF55NQl33dKn2ibfrDvOvu2lrLxuJkmp3Z/MlZ4+mpdffp0tW77hxRf/wty5p3PW\nWYvIzs5pV1bX4ei4iuOpma6t/60IfwC1WoXf3/HS5fPKu5RTzVGtfVBKu5IkMWfO6Tz22O86bMto\nNBIRaWj3TixefB7/+c/bREZamDhxMkajkYqKct566w1efvmfmM0RPP74r/F4uleSzj77HL78ci21\ntbWcc86SQL9+8IMbuOSSFd1+v7eEnadDmIBtOQQCTa2WD8Iejs5TW5MbQQCzpe+VKZXdj32YmaT8\nPhGH3ROyc2t7GvJYU1ODXq9nyZILuOaa73Ps2NFOy+qaTGYSE5P46qsNAHi9XtxuF9OmzWTdui8Q\nRZH6+nr27dvDpEmTO31mRkYmlZUVlJeXAbB27Wf4mmuntx6H6dNn8PnnnwCQn3+cEyfyAJg8eSr7\n9++lrEw2s7jdLkpKits8IyJSj8ftlw8faWbGjFkcO3aUDz9cEyjVa7fbMRqNmExm6upq+fbbzYH7\nuypJvHjxeaxb9zkbN67n7LPPAeC0007n448/xOl0No9tdaAEcKgIa+xDmFBq7CAvEDVVNiRJGlLn\nM3aHvcmFKUKHStV3PSSwcxlmC5xijuprcpKCEvIYbJJSfv5xVq16DpVKQKPR8sADP+uyrO4vfvFr\n/u//HueVV15CENT89rdPsHDh2Rw8uI8bbrgGQVBx5533EhMTS2FhQZtnKXNTp9Px4IOP8OCDPyI6\nOobc3OmcrKiT+99KsC9ffjmPP/5rbrjhe4wdO45Jk+QDSKKjo3nkkcf41a8ewePxIggCt9xyB+np\no1Hs4Ip5T1kwQD7qc968BXzyycf84he/BmDMmLGMHTueH/zgKlJTR5Gb22LSueSS5TzwwL3Exyfw\n3HPP07q8cVZWNg6Hg4SEJGJj4wCYM+d0iooKuf32GwEwmUw8+uhviYkJnWkwXLa3Cwa7lOcH/9pD\neXEDtz54FuoujkELtp99LXXaV3oznqIo8dLTm0hIsbDyBzND0o9X/vQNOr2G7912Wkj6OBCUFtbz\n0Vt7mTUvg7lnZfW5nzUnrbzzj51MmZnKmUvGhbCnbQn1eH79RR77d5Zx+Q2zSEju+1F0u7YUsXVj\nAVf/cC4xCb2vQzRQhMv2jgDsVjdGs7ZLod4ThmPIo8PuQRSlkJijFMwWPXbr8KpuGKr6KAqBujnD\nLJY9lKGvcjtKlNTwS9zrirBgH6JIkoTN2vc0+tZERA6/kMdQJeW0JsKix+cTh1V5hUCSVojGQT5R\nSh1wTA8X7FY3KrXQp+qWrVHGczifVdARYcE+RHG7fPh9Ysjs69CqNsYwKlVqDziQQ6OpwvAM/VQW\n41Bp7CDb2a2NrmG1c7Fb3Zgj9CHzEQV8DcO48mlH9FmwV1ZWct1113HRRRexbNkyXnvttVD06zuP\nLYQhfgrDWaCFUmMfjg5UpU5MSOdDpB6vx4/X4+/+5iGAKMqRQaGIjlIwRegQhJGnsfc5KkatVvOz\nn/2MiRMnYrfbWblyJfPnzycnJycU/fvOEuqIGBie2afWfjDFBBY42zAah0YXRpMWjVYdsjbNES0L\nvU4/9APkHHYvktTS71AghwHrh/VZBR3RZ409ISGBiRMnAmA2m8nJyaGqqqrPHfuuE8oYdgVThKzp\nDCfB3qKxh84EEXAiD5NxCPhbQjgGMPwWuP5QdkAOIW1qdCGKw8ck1R0h
tbGXlpZy5MgRcnNzQ9ls\nv2I/sA9nfv5gd6Md/TGJ1WpV83Fg7V9kT2UFjsOHQvasUKE4y3p7aHFHKFv51uMgSRKi14vo9SL5\nhpZT1enwIvqlkO5aAMzNC73d2lI0S3S7se3Zjd/WtuyszWbj/fffDfxbKaHbEU8++XuKigq7fX5X\nbbRGKcMbeCeC0NhffvnFoMvwRkQakEQJR/MC9/bbb+JyOrHu2IansmJIlOHtKSHbf9ntdu69914e\neeQRzGZzt/cHG4/Z3xT+8xU8dfWkLL2IjB9ci1rfdtIMVj+V1OnRmXHExoduPKNiTVSWNRIfFwGS\nSPlHH1P15QYchUUApF9zFaOvvrL3HQ9RPxUcNg9R0UYSE7tPew+WSItcVsDr8ZOQYEH0ejn8uz/Q\nsGcvxwFUKjK+/z3SLuu/lO+eUOlpBCA+IaLN+PV1bqamRcv/I8lt+Z1ODj3zJE2HDiOo1URNyyV1\n6UXEzJqJ293IRx/9h1tvlZNqoqNN6PWaDvvw9NNPtPm3co8oim2SzLpqozVarZqYGBP2etlhmjIq\nqsvviKLIT3/6QPcD0ExisoXjh6vQqNUkJFh49+03mJF/HOn4CZIvWMI//vFy0G0NFUIi2H0+H/fe\ney+XXnoGQVm+AAAgAElEQVQp5557blDfGSpJIEm33U3l6r9R8dHH1Gzbwagf/wRdQiIwuMkqNVXy\nc90eb7d96Ek/DUYNol+iuKgW19frqHn3bVCrMU+bjqesjJI3/43D4SFu2aV9/g196SfIafQ2q5vU\n9KiQ/x10ejX1tQ6qq61UvfUGDXv2ohuVhikhDmt+AUX/fANfXDLmyVNC+tzeUFoip5urNEJgHEIx\nN33N1SGrKps4WVpD2XPP4Dx2FOOEiYgOBw27dtOwZy8Zj/2Gx//2V4qLi1m27BJmzz6NM86YT0ND\nE7fddme7Urv33HMbd999H+PHT2DJkrO46qpr2bbtW+6++8fY7fY2ZXg9Hl+733FqGV673Ul9vYP6\nCh8V1cf42aOrEVRSuzK8F198Cdu3b2XlyivZunUz8+efGVQZ3sYGG3GWCZQUTeTdF/8fVSdP8ugX\nnxEdHcOq85exaNHZg16GVyHYxTwkgv2RRx5hzJgxXH/99aFobkAxZmeT8cvfUPOfd2hY+wU1775N\n6h13D3a3sFvdGIyhdZZBS9hgQ1k19o8+QB1hIePXv0MTFYW3tpbS/3uC2g/eR9DpiD3/wpA+u6co\ntt9Q25ahJUnJumsnDWu/QJeSyuhHHiUpLZ6SbXspfuL3VP79RTIe+y2a6OiQP78nKONgajZBbF5/\ngsK8GsQ+HhaimJSP7KvEsXsHWXlHiZg9h5RbbkdQq7Ht3kX5qj9R9cY/uf32uykszGf16jcAWUB2\nVGp36tRpbZ7hdDrJyRnDD394Gx6Ph6uvXtGuDO+pdFaGt7qqhgN5a3nxpb+RkBTdrgyvTqdn1aqX\nALnMMARXhvfEkZM89PC9HDt8mNOLi3lPq+WPj/6G1IVnN4dVDn4Z3p7SZxv7zp07+eijj/j2229Z\nvnw5K1asYNOmTaHo24Ch0ulIuOp76DOzsO3cgfuUQkEDTX8kJykoNvvyz9Yjud3EX34lmqgoALRx\ncaQ9+DDqqGhqP3i/nZ11oOmPUEeFCIset8tH+auvIOh0pNx+F6pmM5whK5uEK67Cb7VS8dILSOLg\nnrak2MAVm3ioUELBRb+Ir64Oc+40Um6+DUEtKxMRM2ZinjET57Gj2Hbvavd9pdSuIAiBUrunotFo\nWLhwMQBFRYXtyvB2xJ49uwPXWpfhPX7iCI22kzz007u48cbv8emnH3Py5MnA95SCXa1pXYbX7/ez\nZcvXnHmmXE543brPuOmm7/PL395Do/Ukx3ftQLTZEIwmLDNntYqVb1+G9/jxvEAZ3m3btgbK8N50\n07UUFxdRWjq4MqTPGvusWbM4fPhwKPoyqAiCQPzyFZQ9+wy1H35A6l33DFpfPG4fPm9ok5MUApl2\nReUk5owhct78Nte1cfHELDmfmnf+TeNXm4i98KKQ9yFY+iPrVEFxwDk9kPW9a9GPGtXmevQ55+E4\nchj7nt3Y9+4hYkZo6tT0BsWpp8yHeYtzuPSq6SExT73+/Ld4GxsZW7uDhB8/jqBpKxISr7qGwoMH\nqPv4w3YLXDCldnU6Xa+SiToqw+tyekhLnsA//vFih98xGjs+OKW7MrySX8Odt/4Ya1UtKrMZVSft\nwOCV4e0p4czTVpgmT8WQnYNt905cxUWD1g8lWsPcLwJN1vpcmggSr/0BQgcVE6POPAtBr6dh/dpB\njRCx2xRNNfTjYDLJWqkvbhSR889sd10QBOIvXQlA49eDuwPtL40dwKgRcUtajJNz0aWktruujU8g\n9qKlaB0ObLW1PW6/dVZrR2V4O6KzMrwW4yiq6gq6LMPbEd2V4XW6rZRXH8EraIg9/0LM5oghV4a3\np4QFeysEQSDuUnnVrf1wzaD1w94PMewK2qZqAPyJ6RhGZ3R4j9pkJmr+Anz1dR1uwQeK/opbBlDX\nyMJFGJ/b4eIGoE9PR5+ZhX3/PnwNDSHvQ7DYbW40GlW/JBFprDVIggrDWZ0HPcScfwGRlkhydDqu\nu+5q/vrXP7W7p7WG3dn/63Q6Hnro5zz44I+4665bSOlgIQG5DK/D4eCGG77Hm2++zqRJU/B6fKgF\nI8vOv5lf/eoRrr/+Gm677SaKAwpY57sCpQzv1q1bmDdPXsRbl+F96onfkBSdgU+tJ3rxOYEyvD/6\n0R3t2u6sDO95553P7bffyPXXX82jjz6M0+notD8DQbhs7ylIkkTJE7/HdeI4s/72PFbVwJ+LeWhv\nORs/OcbZF09gwtTkbu/vSYTEybfe5D8FSSTGaLns9vaaqoLnZCWFP/8phpwxjP7ZL4Lue6j6CfDZ\n+wfJP1rNdXefEXKtffsfVrFDmMycuUnMXjyx0z42bFhP1euvEb/ycmIvWhrSPgTLK3/+Bp2ubZnh\nkETFNNTz6ZNvURI1kcuun0liSuchpZWvrKbp602kPfAwpgkTO73vVEIVWVZfY+etv29n4rQUFl04\nvs/ttWl7/Vo+3tSI0xTLzQ8uGtJnFYTL9vYSQRCInLcAgLqt2walD/Z+qBMDIIki9p3b0IsunGLX\n2p8uKRlz7jRcJ47jzD8R0n4Ei8Mun5xkNIXWBOEuL0dVIv8mp6/rqCPL3NMRtFoav/lqUIpl+f0i\nTrs3kDUcShq+XI/eKzvIFbNXZ0SedjoA1m1bQ96PYLDb+m/3Ztu1E53fgU8Uhk3dnO4IC/YOiJg+\nAwSB2i3fDsrzbf1kgnDmHcNXX49Rr8Jh93QrqKIXy9tza6tjwAYSu9WDyaxDpQqtBtX09Sb0Pnvg\nGV2hNpmImDUb78mTOPOOhbQfweC094+fQZIkmrZuwaCSfSjdFYYzjp+AOioK687tg+J3USKkQlkA\nDMBvteI8dpSICNkRPFzKK3R
HWLB3gCYqCuOYsTQdPoKvqWnAn99iYw/tJLZukxeqiPhI/H4Jj7vr\nF9Q0YSIqgwH7/n0Drq1KkoTD7gm5pir5fDRt+Qa9SYtaLQRV4TFqwVkANH61MaR9CYYWB3Jox8FT\nUY6vpoao0SmAnOHbFYJKhWX2XES7HfuhgyHtSzD0lyPdtncPiCIxaXJSYncL/XAhLNg7IWLGTJAk\n7Ht2D/izbVY3Or0arS50zjLJ58O6cwfqqCgik2SnT3eTWNBoME2egre6Gm9l+xjl/sTjluvRm0L8\nIjvzjuG3Womae7qcpBSEhmYcPwFNfDz23bsGXFvtLweyfd9eAGInjmnznK6wzJVt/IqCMJD0V0CB\nbfdOABInZAItoaXDnbBg74SIGbOAlj/8QOKweUKumdgPHUS02bDMnoup+eVw2LufxObmTEJbsyAY\nKPorxM9+8IDcbm4uZoseh82Dv5sMTkEQME/JRXS5cBUMbME4RZMO9c7Fvm8vCALxM+SSCcEscIbs\nHLTxCdh270Z0D6wA7I8FTnS5cBw8gG5UGjHpSfJzutm5DBfCgr0TtAkJmLMycRw+hN85cLWa/c1H\ntoX6Rbbtkhcoy9zTWqr6BTGJzVOnyvfu3xfS/nSHsuiEWmN3HDyAoNFgHDs+ICQUO3ZXmCdPBhhw\nM0TAaRjCcfA77DiP52HIysIQG41OrwlqLgiCgGXuaUhu14BXArXb3KjVAnpD6Hax9gP7kXw+ImbM\nDJykFLaxfweIPf00JJ8P+/6B01Zb6oKEVrA7jxxGZTJhyMoOtN2dXRVAExWNPjNLNmEM4ALXHxq7\nr6kJd0kxxrHjUOn1LQePBGGGMI6fCCoVjmaNf6DoD03VcfAgiGJgN2a26II+VcvUXBTNcWSABbvV\ng9kSuiPxoGU3HjFzVuDIwWDeieFAWLB3QdzpcwGwD2CSjqNZezSZQ/cie2uq8dZUYxw3HkGlajk5\nJ0jtxDw1F/x+HIcGTqgFxiGEgt1xWNa2TZNk4WQyB7/AqU0mDNk5uAry8XeSldgf2PvBFKPY1825\nzYI9Qq6b4/N2H+pnyM5B0GpxHDkSsv50h9/ffCReCHctkt+Pfd9eNHFx6NNHN5+jGjbFfCcwZWSg\njo7GcfTIgEWFOPohCkJ5CZXEkp4INICIZgFg3zdw5pieHKoQLIq2bWo2q/Rk5wLIJXwlaUC1VbtN\nPrZOG6Iqn5IoYj+wD3VUNPrmzOOWk5S6HweVVotxzFg8pSX4rAMTMRYI+QzhrsVdXITodGKeMhVB\nEFCpBExmXdh5+l1AEARM48bjb2rC26qKXH/SH84yx1G5SJsi2I1mbY+0E31GJmpLJPb9ewes0mGo\nw/wkScJ+8CBqiwV9Wnpz280CLQgnMoBpkrwgOAbQzu6whfbwZldhAX6rFfPU3IBZo+UkpeDGwdg8\nj5xHj4asX13RktcRwnfimNx347gJgc9MEXrstu7zO4YDYcHeDcaxcvqy89jATGJFyChadV+RJAnn\nkcOoLRZ0qXIFQ5VKhdEUvHYiqFSYp0zF39SEp6wsJP3qDiXr1BCirFNPeRn+xgZMkyYHasP0VGM3\nZGahMhqxHzwwIC+/z+vH7fKFdtfSvCgpTnHo+dmnioLgODIwVV0d/RDDrrzPxrHjAp+ZI3T4fWK3\n+R3DgbBg7wbjOFmwO/IGRrAHJnGItp3eqpNytun4CW2KXZkidEFlnyooL4DzeF5I+tUdoc46DZhh\nJrWciNRTk5SgVmOaOAlfTQ3eATiwvT+Sk5S/n6KwyO03C/Ygk3MMGZkIegPOgRLsIfa3SKKIM+8Y\n2oQEtLGxgc+VMOCRkKQUFuzdoEtJQRURMWAae8AUEyKNXdGqTi3cZI7Q4fOKeNzB1cYwjh0LDIxg\n74+sUyVMUQlbBNDpNWi0qh5FQgSiQgbAkRyIkArRIi+JIq4Tx9EmJaGJbCn4pZg4gtXY5XDRcXgq\nK/A19H952lC/E56yMkSHo83iBq1MUiPAzh4W7N0gqFQYx47DV1uLt7am35/nsHnQaENXotXZiWBX\n4sODSVIC0CY3L3An+l+whzrrVBJFXMfz0CYno4mOaXPNZNYFbWMHMI1vti/n9f84hNqR7ikvQ3Q6\nMeaMbfO5orH3xHFomiDbph1H+z865tSjAfuKsvtWduMKph7kdwx1woI9CEwBO3v/F4Gy290hsyVK\nkoTjyBHU0dFok9qW/+2xGUIQMOaMwVdT0+9aWqhj2D3lZYguF8bsMe2umSL0uBxeRDE4k5Q2KUle\n4PKPh6RvXRHqyKCAGWZMW8FuNMsFsHq0c5kwSf7OAJye5rSHVmMP2NfHnaqx93yBG6qEBXsQKBPA\n2c92dlFsLtEaqi1nRTl+axOm8RPbJXa0bL+Df5kVgdDf5phQZ50qZYcNOe0FuzlChySB09GDBS47\nR17gGvv38I1Ql6pV/m6GUwS77EzXYg8iA1dBP3o0KpMJ59H+F+x2m6f5oJG+h3xKkoTz2FFZ2UlI\naHOtJToorLED8MgjjzBv3jyWLVsWiuaGHPr0dFQGQyBEqr9w2r1A6JxErhOyVqnYx1ujJED1RDsZ\nKMEeao3ddUIW7MacnHbXerpzATlJB8DVz3XqQ+08dR0/jspsRpfc/vAWU4QuqNIKCoJKhTFnDN7q\n6n6vgKr4W0KRdeo9eRJ/UxOmcePbtWfqYeLeUCYkgn3lypW8/PLLoWhqSCKo1RjGjMVbWYmvsbHf\nnhNq779SsEoRRK3paagfgD4zE0GjwXm8f80QIR+H/BOoDIZAuGdrejMOxmbN33mifwW70idjCHZw\nvoYGOfs4Z0yHRwGazDo8bj/eILJPFQxZ2QD9WhhNFCWcdk/ozTCnOE4BjCYtKpUwIsoKhESwz549\nm8jIzo/VGgmYAuaY/rOzh7rgkzM/H0GnQz8qrd21nhQCU1BpdegzMuWsvX6s7hdK27LfbsdTUY4h\nK7tjgdbDJCUAQ1YWCEJgR9RfOOweDEYtanXfX9PO7OsKyjj0RGs3ZCuCvf8WOJfTiySFbpFX3l/j\nuHHtrgmCIIcBjwCNPfSn445QAtvvgnwss+cAYPPa2V65G5WgIjMyndSIFLSq4IZUkiRKqmwcKqzH\n6/Oj16pxVcs1SEKhnYhuN56yUoxjxiKo29smjQETRM8msXHMGFwnjuMqyA9E2tS7GjhQewS7186M\nxFySTAndtNKCy+OjqNJKQYUVm9NLXJSB6pPyGZmhMEEoQqejXUvrZ/RES1MZjOhSR+EqKkTy+RA0\nGlw+F2W2SmpddVg9NqbGTySxB+NQ3eCk+KSVmkYXjXYPybEmbFY3lsj+ta8rKHPObvMQGR3cOb+G\nTEWwFwQ+a/JYOVhzBLVKjU6tY5pxLALB/4ayGjvHShpweXx4vCKG5jyLUNVOchacQGU0ouvkIG1T\nhI6aShuSJA3ps0+7Y9AEe7CHsg42Sj995qmUCgL+smI0ESJrDn/G+vxvcPtbBIJereOH
s65mUdYZ\nnbbncvt4d30ea7cXU9voanMtFRiFig0HK7GkRzNtbPCC4dTxbDxYDJJEzKTxnY61KUKH2+Xr0d9C\nNWsa9Z99iqqiGPuUFJ7f/k8K6ksC1z/K/4zxcdlcPuVipiVP6rSftY1O3vz8KGu3FeM/JSJlAgIR\nCHyxr4JLzxpDQkzvDxR3VpYCkDRzKrEd/E7RKz9b8kuBvgUzHo1TJnLys1JM9joKLV6e3foyTW5b\n4PoHJ/7HOTkLuHzyxUQbOt7NSpLEoYI63t9wnG2HKmmdKyYAs1FRWu9k8+EqLpqXiVbTdoHuyd+t\nvCgfQaMhbfZU1Pr2QjIxSW5Lq1YF326ChbKUZNyFBcTGGvmycAtv7H0fu7elCqjmoIZrc5dz4biz\nUQkd7zx8fpEvthbxxbZi8kraOqQjgfGo2Ha8moRJSSyYntprgeuz2zlWWUlU7lQSk6La/5wECzGx\nZqrKrUSY9CEvGT2QDJpgD8XJ5f3NqSes65JTsB4/zs8+e4I6dwMx+miWZi3BrDVT2FTCjpO7+eu2\n1yisKueirPPaTEBJktiTV8O/1h6jtsmN2aDh9MlJ5GbHYTHpcHv9HPy2GFu5lX2FdWx9YTNn5qZw\n9TljMXYT097RSfB1u+WEHCk5vdOxNpq0NDW4evS38CXIduqSndt5Ub0Zl8/FxNhxTImbiElrZGvF\nTo7WHucPm1Zx69TrmBrfItwTEixUVDby4TcFfLatBK9PJCnWxPQxcWSlRBJl1lHX5GbfF3l4PH4+\n2JTPf78uYMVZ2Vxw2mhUvXiha/fLBbs8cakd/k63V3ZY19bYqa62djiWHZI6GoAvv3iP12MLUQkq\nFqbNJ9mUiFqlYm3RRj4/vomvCrfx4xm3k2ZpqyHaXV5Wf3yY3XlybkRWSiRzJiQSH2Ug0qyjsKSB\nE5sK8UgSf//gAO9/mcfV54xj1viEwFgG+3cT3W5s+QUYMjKpa/IA7XcnIvKqUlHeSHxK8AuGdnQW\nrq1b+MM7v2efUIlBrWdZ9gVEaE04vE6+LPuKV/e8y9aivVw/+WoidW3bLqux8/f/HqKo0oogQG5O\nHLPGJWAx6dBpVRzeW0HV4WqqrW6een0H//06hmvPG0dKnDnoPiooNeRVqe3fCWU8NVp58SkuriMu\nIaLHz+hvgl10QybYR0LhnO7QjE7HU1GOVFXDBdPO56LMc1GrZC3qtJRZLEybx1/3ruZ/hWupdzdy\n7YTLEQQBUZR4Y+0xvtxVhlolcPEZGSw9IxO9rq0GdnJfJTbgnqum8eaXJ/hqXwWHCuu5fflkclLb\naxhdoURsKHbQjjBF6KmtsuP1+II+hk9jiUSKjcZZkI97ZgLXTbqKuckzA9fnJs8krz6fv+59mb/v\n/ye35t7A5DjZP9Fk9/D/3t7L4aJ6Yix6li/IYt7UZNStbN+SJLH/02MkJ0bww9mjeG/jCd7dcIJj\nJQ3cvHQSEUZt0GMgiSKu/BNok5JQR3T8khqMvXOYGZtNO1VH9hC5KI2bp/6A7KjMwPXTk2ezsWwz\n7+V9xPP7/sFDs+8hSi9r7gUVTTy/5gA1jS7GpUez8qxsxqZFtVEEIlUCJyhk/oxR5Khh3c4yVr2/\nn0vmZ3LJgqwe9dVdUgx+f9dzQTHN9cDGDqDPysK6dQuewgJy58zmqvHLida3zNWLpy7iua//wcHa\nI/xt32vcN/P2wDuzbmcp/15/HJ9fZP6UZFYuzCHmlNBOZ7mVqsPVfO+C8aw7Ws3+/FoeW72dW5dN\nYvaExB711VUom4wMWZ2PnzIOTrsHgt8wDzlC4jz9yU9+wtVXX01BQQGLFi3ivffeC0WzQwqv38s2\nXSUAC8VMlmYtCUxQhWRzIg/MvovRllFsqdjOloodeLx+Vr2/ny93lZGWEMGvb5rLZQtz2gl1kF8q\ntVpgfGYsj14/m6XzMqizunj6zT0cKqzrUX9dBfmoLZFoYuM6vcds7rkDtdZZzwmLG6Nb5Oa0S9oI\ndYWxMdncnnsjgiDw0v5XKWgspqzGzk+e28jhonpmjI3ndzefxpnTUtsIdWjJOjVb9MyfmsKvbpzL\n5KxY9p2o5TevbKemIfjDPjyVFXKmZQeJSQqCIGDsRbnWfK0Nl05gVK3Iw3N+3EaoA6hVahann8ml\n2RfS4G7kxX2v4vF72Hm0isf/uZPaRheXzM/koWtmMC49up15QVlooqMMXLV4LI/dMJuEaAMfflPI\nX98/gKsHhapcRYUAGDK6EGi98DUA7NLLO46JNjO3TP1BG6EOEG2I5I7cG5mdNJ2CpiI+yP8ESZJ4\nb+MJ3vjiGEa9mrtXTuWHSye1E+rQ4sxNSbLw4ytyuXP5FNRqgefXHGDtjpJ293dFQLBndqXs9G4c\nhhohEex//OMf+frrrzlw4AAbNmzgsssuC0WzQ4qPC77ggFGO1811xXRq54vUWbhl6nUY1HrezfuQ\nJ975ht15NUzMiOGn184kNb7zLaTdJod1CYKARq1i5Vk53L1iKn5R5Nl39rEnL7iSBr6GBnx1dRiy\ns7u0R5osPZvEkiTx5tH3qIyRp022tXPn5vjYMdw85Qd4RR+vHnybJ/+1g8paB8vmZXLXyqmdmpdO\nTaOPNOu478ppLJ2XSU2ji6fe3E1NY3DCPbBr6SB+vTXmCB32HhREs3nsvHbk31TGa7FYvZjdnX/v\nvIxFnJ48myJrCX/Z9i9e+OAgGo2K+66axvIzszstcnZqyOeohAgevX4OE0ZHs+tYNb//xza8vuBC\nE92FhYBcfrkzeqOx76s+yIeu3fhVkNOk79SGLggC14xfSaIpnnXFm1i1di0fbykiMcbIo9fPZua4\nzlXj1rH8giAwe0IiP/3eTCLNOv61No/3NgYfkeMqKGhWdmI7vUcJKuhJstZQJJx5GgQn7VWsL/kK\nX1IcqFS4iwq6vD/WEMOKMctw+92UGzczd1Ii9105DVMX5zVKUnO87ikOmxnjEvjRFdNQqWDV+/vZ\nd6K22/4G4tezOtdMAMzmniVkbKvcxeG6YwGNx11U1OX9U+InMit+FtWuKlxRedxxWS4rzsru0lau\nCJbWsdsqQWDlWdmsODNLFu7/Ck64K9Ea3Y2DyaxD9Eu4Xd1rwZIk8caRd2n0WIkeI0cFKZpgRwiC\nwDUTVhKvTeaE8xCaqHruv3IaU7I630lBx4WvIoxa7r9qOtPHxLMnr5oXPjiIr5uDuEHW2AW9ocPE\nJIWeFkRz+py8ceRdVFodmrQ0vKWliN7Ov2vQGPjh5O+jktQckjaQkqziZ9fOJD6qa8d4R+WbM5It\n/PwHs0iKMfLxliI+3VrcbX99TU346moxZGV1qewoCoUzrLGPbCRJ4p28D/FLflZOvBR9Wjru4mIk\nX+dCQJQkDuw04a9PQB1Vx4QZTWi6iUV2OeV6JR3F607OjOX+K6ejUgk8/8EBik927TQLVrD3ZNvZ\n5LHybt6H6NU6zjvjGvk5XQg0gEa7h6Nbk5G
8OvTp+czO7d7x4+iiLsiy+Vksbxbu/+/tvTi6EcTu\n4iJQqzuM429NT8Zhb81B9tUcZGx0NhNzF7Y8pwsKym1U7pPNIElTCsgZ1X3OR2dJWhq1ijuWT2ba\n2Hh259Xwj/8d7nKnIbrdchz/6NEdxvG3xmTWBa2xf160AZvXzgWZ5xA1Zjz4/biLuxawhw77cBWN\nQ9B4mTCnhqggok4cNg9GU/vyzfHRRh64egbRETre/vI4Ww5UdtmOMle72rVA730NQ42wYO+GvTUH\nOVx3jImx45iWMAVDZhaSz4e7vPMDJ9798gTbDlUxyn0GBrWeTwq/aBMW2RHdnZw0Lj2aW5ZOwuPx\n8+w7e6lrcnV4H7QW7F072QICLYhJvOb4/3D4nFyacxEJcaloE5PkOO5OhIrXJ7Lq/f1U1/qZrJ+P\niI+Xd/272+d0V6L1kvlZLJmTTkWtg+c/OIC/kxOdJJ8Pd0kx+lFpCJquHcPBVroUJZGP8j9DQDYt\nKDZrxYbdETUNTv7yn/34bdGMi5hMtfsk31bs6PI50PU4aDVqfn7jaeSkRrLl4En+u7nz57uL5bBX\nfWb3DldThB6n3dNtQbQ6Vz1flnxFtD6KxekLMGQpOR6dL/Q7j1bx7/XHMTtyiNPHsa1qBycd1V0+\np7vyzXFRBu6/ajomvYbV/zvMwS78UO4gHKcARlNYsI94vH4v7+V9hFpQc8XYSxAEAUPzC9LZJN5y\nsJJPtxWTEmfivhWncXb6mdi8djaVbu7yWQFbYhfJSbMnJHLl4jE02Dw89+4+3B2kf0uShKuwAG1S\nMmpT1yFhwdZJqXLUsK1yF6nmZM4cdToAhsxMRLsdX017u78kSbzxxVGOlzYyd2Iid5x1PuNixrC7\n4gDHG7rW8oMpJ3Dl2WOYlhPHwYI63lrbcfanp6ICyefDkJnZ5fOgbXJOV+w4uYdK+0lOS5lFkjkR\nTXQ06sjITk1STreP597bh9Xh5drzxnL9tOXoVFo+PPEpTl/nCzO0ONI7K99s1Gu457Jc4iL1vP9V\nAbuPdSwkXUWKwzCzy+eBPA6SJO8eu+Kj/M/wij4uyb4AnVrXUlqgsOPSAkWVVv720SF0WjX3XT6D\n5WMvDCySXeH1+PF5xS7nQlpCBPdenosgwAtrDlDdiXM9GMcpgFqjQm/QhJ2nI5mvirZT56rnrLQz\nSN51YmQAACAASURBVDLLoVXKit/RJC6qtPLqJ0cw6tXcc1kuEUYti9PPxKgxsLZ4Iy5f5xqhI8ia\n00vmpLNoeiolVTZe+7T9IdvemmpEpxNDN1tOCH7b+VnReiQkLsg8J+AgU7a0rg78Det3lbFpbwUZ\nSRZuvGgiKpWKZdnny20Vru/yWcEcqqBSCdx6yWRGJZhZt6uUTXvL292jaNHKgc1dEczOxS/6+Tj/\nc9SCmosyzwVk+7l+dCa+ulr81rbmMUmS+Pt/D1FWbeecWWmcPTONaH0USzIWY/XaWF+8qcs+OZr9\nLV3ZgyPNOu65LBedVsXf/nuI0mpbu3sCAi2I+dCShdv5PC2xlrG9cjdpEanMSZ4BgDYxEUFv6NAU\nY3V4WPX+frw+kdsumUxGsoUZCVPJsKSzu2ofRU2dR7Z0ZZZrzbj0aK49bxx2l49V/9nfTuGRJAlX\nQQGa2Lg2B4x0hnK62HAmLNg7QZREPjwiv8jnjl4Y+FyXOgpBpwts7RRsTi+r3t+Pxydy89JJJMea\nADBpjUFp7cEWvhIEgWvOHUd28zb8y91tTUKK9qgfPbrb36jRqtHp1V1O4hpnHdsqd5FkSmRGYss5\nmYqgcDVHXCjklTbw5to8Ik1a7rlsKnqtHNaZHZXB5MRxHKo7SnFTaafPC/ZlNuo1/OiyXMwGDa9/\nfoyiyraC1V0s90s/OrPLdiC4sgJbKrZT46pjfuppxBlboioMGfLC4TrFzv7p1uJANNTV57SEWy4e\nfSZmjYmNZZvxdGKeCzjSgygtMTrJwg8vnoTb42fVf/bjPCUM0l1UhMpgQJuY1G1bxiAW+k8L5UV+\n+ZiLAou8oFJhGD0aT0V5mxpCoiTx9Bs7qWkO7Zw+Nl6+XxC4NOdCAD488Wmnz+rJwe4Lp49i4fRU\niqtsvHqKwuOrq8NvberWDKNgMssZ2X7fwBzc3h+EBXsn7Ks5RLn1JHOTZ7aJzRXUavTpo3GXlSF6\n5IknShIvfXQoMIFnnFIKYHH6AowaY7PW3vEWvCfHf2k1Ku5cPoUIo5Y31+ZxpJVtUXHkBaOhKc/r\n6kX+vOhLREnkgszFbcLZFE3Y3cq+3OTw8MIHB5GQuGP5FGIjDW3aWjHxAgA+K/qy0+c57B50ejUa\nbfe1t+Ojjdy8dBI+v8hf1+zH4WoxIbiKikClQp/eteMUujdJ+UU/nxauR6vSckHm4jbXlJ1L63E4\nWlzPuxtPEB2h47ZLJreJ1derdZyZdgZ2r6NTW3tXjvSOmDMhkQtPG83Jeif/+KRFqIkuJ57KCvSj\nM7p1nEL3C1yNs5a91QcYbRnFhJi2NWf0ozNAknCXtSzaH35dwK4jVUzJjm2XVDU+dgzjonM4Up9H\nqbX9jgtaFcULsk7M984dR05qJN8ePMmGVgpPixkmSMHeA9/TUCUs2DtAkiQ+L/oSAaGNtq5gyMgA\nUcRdKk/iT74tYn9+LZOz2k9gAKPGyDnpZ2L3Odhcsb3DZ/a0VG1spIHbL52MKEk8+c8d2Jrtoorm\nqE/vXmMHWai5HF78HYTN1bsa+LZiB4nGeGYlTmtzTW0yoU1KDjhQlcWt3upm5VnZjB8d0669qUkT\nyLCks7f6AJX2kx32x9HDEq3TxsSzdF4G1Q0u/v5fOUJEEkXcJcXoUkeh0nbfVncF0fbWHKTe3cAZ\nKbMD2aMKp2rsjTY3z39wEAGB2y+dQmQHv2Vh2jw0Kg3rSr5ClNqPe7C7ltasaM5e3XGkivW7ypr7\nJDtOgxVoxm58DV+WfI2ExOL0s9qZiJQdorJjPFhQx0ffFJIYY+TWZZM7DHFdPPpMud3Srzt8Xkeh\nr12h1ai4Q1F41uUFdnGKshOMWQ5GRmRMWLB3QF7DCYqaSpgzahrJ5vZpywFttaSIYyUNvL+pgBiL\nnluWTeo0RvvMUWegUWnYVLq5y5fZaAo+ZX5SZizLF2RR0+Dk7/89hF8UcRcVoYmL6zSF/lSUhcTl\naO8w21S2Bb/k57yMRe2ybEHeFYgOB97qaj7eXMjBgjpyc+K48PSOXyBBEDg/82wkJD4v2tDuut8v\n4nL0/ASp5QuymZgRw57jNXy2rQRPZQWSxxP0rkWtVmHo4gShDSXfALAwbX67a5rYOFQREbiLChFF\niRc/PEiT3cPli3IYlx7dYXuROgunJc9s1oAPtrvem8ObNWoVt186BYtJy1vr8sgvbwoqMak1gRju\nDsbB4ZWVkmh9FDMTc9tdNzSbvNwlRdRb3fzto4OoVAIPXzen0zIQk+MmkGiMZ0flbqye9v6B3pz5\nGh
tpaN7FSc27OF/LLjZowa4cQhMW7COKdc2OrUsnLunwuiLYrScKeOED+bT62y6ZTKSp8wkYoTMz\nO3E61c5aDte1P4HIYfc0F/rv2Z/k4jMymT4ugX0nalm34SB+a1PQmgl0blf1ij42l2/DrDExO2lG\nh99VIi3yt+9nzdcFxEbquXlp54sbwNT4SSQa49lZtReb197mmtOhnCDVs6p6ijM1yqzjvY0nKN4j\nH9emzwh+HMzmjk8QKrGWcaKxgImx4zpc5AVBwDA6A+//Z++9oyS560PfT3WOk3ty3JyjNiqsJAQS\nCiRjHgbDRRhjHDg8Xb/jc1+wr6/TxX6PCxiuMRgso4vBZIQQKGu1knalzTnvTs6xezqHqvdHdfX0\nzHRPV3XXzG6P+nMO54jpqq7f/vpX39/3942jo/zqlYtc7pli++oaHtzdsuDz3tVyDwICL/W8Ns8B\nnm+jkUq3lc8+thFRlPjGL87jV8JeVUTEwMLRQW8OHCWaiHJv850ZN3lLQ4Ncvri7i2/98gLTwRgf\nuX8VazKc3BQMgoF7W+4iLiV4vf/IvM+12NjT2bKymkf2yae4J399iXBvD6bKKoxudQW0SqaYZch4\naIIL41foKGtldXXmI6y1sQmMRgbOX2HKH+VDB1Zk1c7SOdCyH4BDfW/O+yzfLjEGg8Cffmwn5S4L\nxw+eBtRrJpDdvnxq5Cz+WIC9jXdgMWbWuJQN5PQbZzAIAn/4/k05i3QZBAN3Ne0lLsbn2ZgLaVpc\n7rTw2ffJpqmzb+QxD65kB6HobOfjweRvdW8GbV1BmYdTh85QU27j04/M7zE7lzpnLZtq1tPl66Fr\nTmRIPqYYhY0dVTx2ZzvjvjCjF6/JjlOPumJZNocFQZgv0BJigoN9b2I1WrizcU/GewWTCUtzC6He\nPq71TLBzjYcHdub2b+yp34ndZONQ/xFi4uy5L2QePnB3B2tbKrh0sYfE1JSqYAIFRw7TXDFQEuxz\nODxwFAmJu5Lx2pkQTCZC5R5c02NsX1HFQ3vULZpWdzMdZW1cGL/CaHCmNEAsliAaSeTdJabCbeVz\n79tIXVj+zkRt5iYCmZhJzpn9Mh/qO4KAwN2N2WvLm5plrbQiMMZv37eKlU3qKlDubbgDs8HE6/1v\nzTJL5auhKaxvq+T9d3VQ4RtBQsDctLDWnI5ycvGnNTKejvo5PnyaWnsNG6rnt1JTiNfKpYwbouP8\n4Qc24bSpM6cdaJI3+jcH3p7190Ln4X13drCp2YUjMEmgok6V4xRkJcHumF8Q7ezYRaYiXvY27MJh\nzl4CIFBei0FMsMYa5vGH16mqm24zWdnfuJvpqJ+Tw2dmfabFkT4Xo8HAH7x/Ix2CbGcPVOSOClIo\n2diXGQkxwZuDR7Gb7OyY4yxM5/zNca7HnJilBJ/YVampTviB5v1ISBzqnwl9DGl0EmVibWslO8rk\nF/IHF0JZMzLnkmkR90730+nrZn31GjyO7DVNfnFsiCmTi6b4FA/snN9PNBtOs4OdtdsYC41zZWIm\nwUirsywTj+xppSE2yZiljGeOZY62yIQjJdhnopYODxwlLsY50Hxn1gJXsbjIDy7K9+yqiNHRoL5F\n5NqqVVTbqjgxfJpQfCaxphBNFWQB/cntZRiQuBiyc6l7UvW9mWK4lY3nrizaOsDIZJBDI7IA/vB6\nKw6VmxvIG5yAwBsDb836e9BfWK/TCpeVRzrkMT3fk8AXVCeoS6aYZcaZsQtMR/3srd+Z1fwwPBHk\nn5++wIhdFniG4eylBTKxvXYzZRa3XNI3IduU83GWZaJ8eoSIxcGZ4Rg/Oaiu6l0mU8yhPtneqWiU\nmXjr4hDPvd2D112DNRpE1Nip/u5m+USUblstVKABJMZGMSViTLk8PHO4S3VFzJR9OdlvVZREDg8c\nxWIws6dhfmlihe+/dJXzkxA3WamYHtE0VoNg4M7G3UTFGMeGTqX+rkcTa9PYIAAjtiq59rvKcscO\np4V4TCSajIcfD01weeIaK8rbaHRlLiIWiSX4p5+fp9con9hck5kjnrJRba9iXdVqbnq7GUxGSyUS\nIuFQrOAuRmU+OSP3pljGN35+XlXRNKvNJNfoLwn25cGb/UnNpCmzZhIMx/jqT84SjMTZcfc2gJyF\nj+ZiMpjY23AHoXiI06Pn5O/N01mWTsLvJz4+TvmqFdRVO3n+aC+vn82tsc7VTkLxMMeHT1Ftq8xq\nfugemubffn0Zm8XI6js2AvMTdHLR5m6hxd3E2bGLTIbldmip7NsCBFqkV/491u/ZjNlk4F9+dYHB\n8UCOu2bmwZ8U7NcmbzIWnmB77Rbspszmh4On+3nt9ACtdW6cHe3ERoY1N/ne27ALg2DgjYG3U05U\nPZpYK+ty54Ht+EMxvp4hIzMTc9fD4cFjSEhZbeuiJPHtZy7SM+Jn3a4NcvXTXm3vBMD+xt3y8waO\nAoX5W9KJ9HZjcDhZtbGdK71TfO+FqznLM880tS4J9qJnJDjG5clrrKrooN453x4nihL//MsLDE0E\neXB3C3fcK0eKaBVoAPsa5GbYR5LOQz00VeVlcrS384UPyxmZTz13Jecx3GY3z3KYnRw5Q1SMsa9h\nd0bzw4QvzNd/dpZoXOSzj22kZu2qWc9XiyAI3N20FwmJI8nYfj02OGUc9RtW86n3riMUSfA/fniG\nqRyOsJQpxidfd3hQFjCKwJnL2RtjfO/5q7jsZrm+fFurnKDTp635Q7nVzZaajfT7B1NO1KA/e+Er\ntUR65cqW++/blsrI/PavLuYs8JV+gkuICY4MHMNusmUMcQT46cEbnLg6yrrWCn7noY1Y6hsI9/Qg\nqTQFKmyp2YDL7OTtoRPExLgu74QYDhEbGcHa2spnHt1Ia62LQ2cGePlE9sxnBSVxr1g7w5UEexJF\nuNzVON9pKkoS//aby5y/OcHmFdX89r2rMNrtmGvr5BK+Gn/8WkcNqyo6uDp5nbHQuC6mmHBaEkZ9\nlYM/+ZCc/v8/f3aOgbHsGqviMFM0pCMDxxEQ2Nuwc961/lCM//GjM4z7IvzWgRVsW12TSoTKVbo2\nEztrt2IxmHlr8ASiJBIMROXa2xra380lPUFr38Z6Pnh3B+O+MF/58Zl56fbppNvYg7Egp0fPU+fw\nsHJOZySQW9v90y/OYzQKfOHDW/BU2LG2JHMbNJ7gYMZ2/cbAW8RjCaKReEFrQUomz1kbGzGYzXz8\n3WtY21LBiSujPPX8lQXXa7rGfmH8Mt6oj11127EY54/ntdP9/ObtHuqqHPzRBzdjMhqwtrUhRcLE\nRrSZY0wGE3sadhKIBTk7ekGnTb5PTtBqacVqkes3lTkt/ODlaxy9tPD4lBr9UQ2dqm4nSoId2Z76\n9uAJ7CYbWz2bZn0mSRLfe+Eqb5wbpL3ezR+8b2OqNrS1pQUxGCA+kbv5xVz2N8ia4JHB4/os4jkZ\np2tbK/nUe9cRjMT5hx+con8B4a5oJ0OBYTp93ayrWk2lbXb
4Zjga58s/OsPAWID37Grh4WQSkqmq\nCoPTSaRXm6YKcvOFHbVbGQ9PcH3qplx72zm/9rYWIr09mKpmErQe3d/OPVsb6Rn28z9/fo5INLM5\nIt0Uc3T4FHExzr6GXfMiO/pH/Xz1x2eIxUU+976NqUggm5J5mYcZQnaiVnJy5CyTPjlRpxDBHh0a\nQopGU2vBZDTw+d/aQmudrLH+7FDmKozpzw0GoryZNItkMsO8fnaAp567gtNm4n//7S2pMFdbARuc\n8k4cHjiqi8Ye7p1dN6m63MYXPrwFq9nIvzxzMWtFTCj+FnklwQ5cmriKN+pjZ922WU5TUZT4wUvX\nOHiqn5Zal1z7Oa0LUioDNQ9tdXvtZmxGK28NHk/VAS/UFCPHLM/UqblzcwMff/cafIEo//D9k/SN\nzM/uA7C7LMSiCd7slU1DiqlIwReM8qUfnqZz0Medm+r5yP2rUgJPEASsLa3ERoZJhNT3I1XY23AH\nMLPBFTIHce8UCa93VsyyIAh84sE1bFtVw8WuSf6/H55KlV9Ix2I1YTAK+H0RDg8cxSAY2DPn1HKj\n38sX//0kvmCM333PWrantXSzNDSC0ZiXYDcIBvbU7ySaiHKm/zKgjzkqPVHNYTPxnz+yLdV16Iev\nXEPMoLkr8z/pnebixBVa3U00u2eHzx483c+Tv76Mw2bi//joduoqHanPlLkP5zEP9c5aVpZ3cHny\nGmNTXnk8eig7afPQ0VDGEx/Zislo4J9+cT6rcz1XeYXbHV0E+6FDh3jooYd48MEH+da3vqXHVy4p\niq17X1LIAATCMf76X9/mpRN9NNY4+dOPbpuXfKMkwITz0E4sRgs767YxFfEy4Z1esPZ2LhKRCNHB\nQawt87vkvGtnM598cC3TwRh///2TnL0xfyErNeBP9VzAaXKwxbMx9dnAWIC/+e5xbvT72Luxjk89\nvG5eeKcyD1GN9mWAVRUdeOzVnB68KNfeLmhzk58/t06O0WDgjz64ib0b67jR7+Pv//0k497ZxdgE\nQcDhtOD1Buj3D7K5ej1llplMxbM3xvh//+MUoUiC33tkPfdtnx3eKZhMWBubiPT1IiXU9SJNZ09y\n7V3ol7OSC5qHLPWCypwW/vSj22iodvD80V65xO2cE4wiSPvGhxElkb1pm3xCFPnF6zd56rkruB1m\n/uxjO2irn53NaU3mNuSzwQHsa5Sf1z0qO/4Lm4ceBLMZS33DrL+vbq7gCx/egtEg8LWfnuVXh7vm\n+R6cRR7yWLBgF0WRv/7rv+Y73/kOv/rVr3j22We5cUN9g9lbTSAW5NzoBeqddbS55UV5Y8DLX/3b\nMY5fGmZjRxX/5eM7MpYLSBU+ykNjB9ifXMTT06FUE+t8CPb0yl1yWjIn5Ny7vYnfe2Q9kViCr/z4\nLN9/8eqsRsj25CKOhBLsqt+O2WAiIYocPNXP3/6vmbKrv//ohlmVChUUAZKPI1kQBFlrj8jfq4um\nmqEAmslo4DOPbuCBO5rpHwvwF//6NgdP9c/SWh1Oi6yhSTMCJhiO893nLvOVH59FkuBPPrSZOzc3\nzPt+5blSLEZ0WJt9GaDGXsWaipWMTckRQos1DzXldv6vT+xkXWsFp66N8bf/6wRXe6dSnyuCdGzK\ni0kwckfdtuT/D/H3/36KX77ZRXWZjT/72A5aaufXIzK6XJiqqvMW7Ns9m7EYLYwm5yHfkE8pHic6\n0I+lqRnBOD/BaV1bJX/2sR1UuK387NBN/vt3j+JNc7Ar85CpzEQxkJ+KmMbZs2dpa2ujqUnWYB55\n5BFefvllVuboDH+7cGz4FHEpwd76ndwY8PGrw12phtEfeWAN79nRlNXmayqvwFhenpd9GeSQv3pH\nHVLEgLUy/58ikOzmtFBFxzs3N9BS6+Kbv7zASyf6OHVtjPt2NHHXlobUIjbFrGyr3s6JKyM8/UYn\nfaMBrBYjv//oBvZtyt4I2VqAfRnktPJXzsjOaz00VVuWeTAIAr/zrtU0e1z88JXrPPX8FY5cGOL+\nHc1sXVWN3WkGUaDcUEGdqY3nj/bw/NEepvxRmj1OHn94/YIJSNbWVjgsR6RYG9Vn/yrsbbiD586f\nBPKfB0mSiPT2YK7xYHQ4Ml7jtMlNsb//4lUOnh7gi/9+kl3rannPrhbaG9wYTQKJMGz2bGRiUuSn\np65w5PwQkViC3etr+eSDaxdMQLK2thI4fYq4dwo86uqzKNhMVnZ4tjB8TijIkR4dHJA7aC1QVmJF\nYxn/9VO7+Oenz/PW+SFOXh7hwLYmHtzdoqo2/e1MwYJ9eHiYhoYZDaauro5z584V+rVLxuHXbuK2\n1fLTX0SIhk4AsKa5nA/cvYK772hldHThxtHWllaC58+R8PtVV1RUEASBXVU76ZQgYgzm/W8I3OyS\nx5KjNkprnZu/+NQufn7oJgdP9/OTgzf4+aGbNDsEagFLqIIvfvs6kgQCcNeWBj50zwoqciSJWOrl\nAlD5OMwAKm0VtFrlscfN+dfnCPf2YLDbMdXUZL1GEATu2drI5hXVfO+FK5y6Nsa1Pi9mk4GV9ghu\nrIhDTfyXf5ZzGkxGgQ/e3cF797blbEg+43PpgT3ZSzFkY1vtZl6NyzZ2m4Yqn+nEp6ZITE9jX71m\nwetMRgOffGgd+zc38IOXrnHs8gjHLo9gtRhZb4hgilk5fzzB4SHZgVpdZuV337OG/Zvqc54srS2y\nYI/09sIq9WUdFPY27OTZ2GWwJPJ2pCvm0VzlqxXz1MkbE/zwxSu8eLyXF4/3Umkzsgq41jfIPopD\nSU2nYMGeb5ynR+NOvliU9zVisVThrK6mbUMZ79ndxuZVM4Ih1zgDa1cRPH8O2/QYFR2Zj+gLsT+4\nk05OMMl43nMyeLMTwWikactaDJbcmt7nP7qDx9+/mVeO9/DayT68kWvgr0OYqmJ9exWbV9Zw59ZG\nOhrV1X4BGGxvI9DVTXWFDYM5u1DK9m9cX76Wq/gYZgCPJ3Ps+EIkwmGuDg9TtnEDtbW50/o9Hjd/\n9bkauod8vHlmgDfODBCMD+CmkcRoLVtX13Dn1ib2b26gXGX2Y9yxnj5AGh7I+7f0mGqJAn7HOOs8\nC6+nTM+Y6L4KQNW61arG4PG42bOliWMXhzhxeYSzN4eJ+idwBMqxhl3csb6Sh/a2cceGeowqhaxh\n01omngHT+FDWcS5Edc0WXox3EbIFcFeYsZltuW+aw/SY/Oy6LesoU/H8h+vKeffuVl4+1suxi8Pc\nnOwkOi4QIXbbyCotFCzY6+vrGRiYyXAcHh6mtjZ3NblcmvBS4XY5KBMcfOKTM45TZWwejzvnOMUa\n+eUbOXeZWEO75uf7RuQ42QlxnDOd17KmbWdDEkUC3d2Y6xsY90YA9RrvvnW17F3r4YsHj8BwHfva\nV/LgYzPt77T8RoaGJqTrNxg4dzWrlrTQfNojZYCPs5PnGRrOXP99IUI3roMkYahv1DRuh1Hg3Tua\n2L3RzZd+fhTGG/
n9d+1gzUY5SS0aijIaUn8cN9d4mL5xk5ERX14+E1vcSVgI8XLnm7Q5s5/Ass3l\n+DlZ449X1WmahxV1Lvl/66Z58RdhhEAl//UTu1MmoYnxzBFVmYiVy9FCE5ev0Yz2dz0WjSMkjMRM\nYV64eDjl79DC1JVrIAiEnFVEVDzf43EzNRlk56pqdq6q5t8vXeTwwDH+eNunbxtZBeo3yYKdp5s3\nb6anp4f+/n6i0SjPPvss73rXuwr92iXD6ZKTc/I9eaQch3nalxUbXswS4a2hzK3SFiI2MoIYDmsq\nS5pOr7+fgbhc7yYRzj/LrpAIIYBIQN7gvMIklyauar9/AYehGo4NnyJqliNlCnGYWVtaSUxPk/BO\n5b44A2JIQLLEOTt2nmBMe/io1m5BczkyeCxlDss3httUXYPBbi/4nYibI6mINS2k/Ax1dRhs2rX9\nSCLKyZEzVNrKWVe1OvcNtyEFC3aj0cif//mf8+lPf5pHH32URx55pGgcpyB73RMFZJjJHdqteduX\nlZfHZIWjQydJiNpC5RSBls1hmIu3Bk8gGuIYjIU5itK7SuVD+sv81tAJzfcXItglSeLI4HEkc2zW\nWPIhFcedx3qQJEmO5XdZiIlxToyc1vwdkd4ejC43psrsDS6yMRme4vLENdxuuTZOvvOQym0YHiYR\nztzjdyGUd6LM7eCGt5ORYPZEokzEx8cQQ6G834nTI+cIJyLsadiZtarn7Y4uo77nnnt4/vnneeGF\nF/jsZz+rx1cuGWo61C+EYDBgbW6RO7THtH+H8vKsbmhjOurn4sQVTfcXItBiYpzjQ6dwW1w4XbaC\nsuysTc0gCPlvcIEoJrOB2rIazo1eIBDT5kyO9PSA0Sg3QdFIl6+HocAwq+rlzamgeSigxEIkHEcU\nJWoqKhAQNGuriWSbQmtLa15moLcGTyAhsaJWbpBR8AYnSQS7ta8H5bntHvm31DoPhZ7elAYwe+vv\nyHHl7Utxbkc6okdYk7W1FUSRaL/6+t8KyrH/jhbZtq2kcatFrfc/E+fGLhKIB9lVvx2nq7CiRwab\nDXNdHZFe7bVzYKb29r6GO4hLCY4Pq9dWpUSCSF8v1sYmBJN2t5FSUXBfm1yeVw+NPZ/QT+W55W4H\nG6rX0u3rZcA/pPr+mYxT7WtBlETeGjyGxWBmbf0KoHCTFID/Zqfme5WNdVVdG3aTnbcHj2s6yabe\niTzMUWOhCa5O3ZAT5xboRXC7844X7Hp0S0lpaXmYIZTnrqxrodXdxIXxy0xFvKrvj/T2YPXUaA61\nhJkyxXc27sbutCBJEM6Qbq8WW2sbYihEbEzb0VkUJULBKA6XlV11OzAIBt5MK2Obi+jQEFIslteL\nHI6HOT5yhipbJRtqV2O1mQpaC6bKKowud14ae3oxOKWsw9z2gQtRiGC/PtWZKlNcWe6aNZ58UHwu\ngc4uzfcq819WZmdX3Ta80WlNJ9lCNPa3FW29QbvD9naiJNiz9PzUQiGOQ7n9lwmTycj+xt2pgmRq\niHu9JLxTODsy92ZdiLHQOJcnr7GyvJ16Z50uRY9mzBDa5iEciiFJ8m9RbnWzuWYD/f5BeqZzl1eV\nn9clP19D82qFkyNniSai7G24A4NgwOW2FiTYBUHA2tpKbHSURDB3Hfh00ovBba5Zj9Ps4O2hE6q1\n1ZlSAtrnQaluuq9hly6nWKV2TiAfjT2tAJgSEXNk4Jjq+yM9PRjLyzGVqw/XBbmD2uHBY1iNw/az\nyAAAIABJREFUFrZ7Nue+4TamJNh1qAlhaWzKu8FAeu3tO+q2YTaYOTxwdFYv0GwoJwRnR7vm5x5O\nvihK5T5dTi55OlDnNthQxvRG/1tZ70lH2VBteQi0wwPHEBBSdYJcZTbCwRgJFZ12sjErUUkDMxq7\nFZPBxO76HfhjAc6MXVB1f7inB8FiwVKvLWQ2FA9xauQcHns1qyo6sCeTowra4JK1c4Ld3Zpr56QL\n9hZXE02uBs6NX8IXzR12mPD7iU+M56WtX5y4wlTEy676HdhMhXVuutWUBLsOGrvBYsFS30Ckt1dT\ng4FU+6/kGOReq1sYC09wbTJ7aVWFcHdSsK9coWm8CTHBW4PHsJvsbE82ULiVGvvcssXrq1ZTZavk\n+PBpQvHcURWRnm4QhKy1crIxmFamuMomR5G43PILHQ7mb5KaqSFU2DwovQFe7zuS9R4FMRYjOjiA\ntblZdfNqhbcGTxATY+xv2I0gCBiNBmx287ym1lqxtrUhRqNEhwY13Rf0y450s8WIIAjsb1B/kk1F\nieVhlns9qUjcnaEnQ7FREuw6VXGztrbKDQZG1duXQ0nh4XDOZGoq2qrSwWchlKO3a4U2wX5+/BLe\n6DS767enyhTrobGbysowVlRoPrnMbTSS3gv0+PCphW6diVmu1R6zrPgY0rskKYK9kHlImeY0nlzm\ntoOrd9aypmIlV6duMBRYuLBYdKAfEgnNZhhJkni9/wgmwTgrEShTU2utKPMQ6dY+D+lF8XbXb8ds\nMPN6/5GcJ9l87eujgXEujl+ho6x1XpniYuQdL9hNJiMWa2EOM8jPgTpjgpg59q0ob6PeUcupkXN4\nIwsfPSM93Rhdbiw12rz3mRooOJNp84U2FrC1thGfnCQ+rb65daZGI/uUXqD9CztR42NjiMFgqtGF\nWsLxMEcGj1NucbOlZkPq704dBLu5tg7BastbY7enbfR3N8s1Z17PYZbK13F6ZfI6w8FRttduxW2Z\nccA7nBaikQRxFX1Ss2FtawcgnPSBqCEVy59WBM1hdrC7fjvj4UkujF9e8P5wlpLFuXj55htyb9em\n4tfWoSTYAXRpXGtTFnFXl+p7UppqmkATBIEDzXeSkBK80Z/9CJ4IBOSY5bY2TTHLI8HRlGbS5Jqp\nRTLjMCvw+J2HGSJTa8ByaxmbazbQ5x9I9QLNhCI0tEbEvDV0gnAizN1N+zAZZkIkXW7brDHlg2Aw\nYG1J5jZE1X9PwB9JOdIVttZspMzi5u2hE0QS2b8rX8fpoeQaO9A8u2iZLj6X5hbZ96RBY1cc6XPL\n9R5ovhOAg71vLnh/pLtbDr1VUdZEISEmePnmYewmOzuz9HYtNkqCHXkRh0M6Ocw0LOJsLfH2NOzE\nbrJzqP8IsURmW2+mLjlqeLVX1kzua7l71t9TDrMCN7h8en9mm4d7mmRh82rv61nvjeQRsyxKIq/1\nvYlJMHLXHA3NVVa4xg7JVnnJ3qNqCQWiqYQ5BaNBjpYKxcOcWCC2P9zTI/sZmptVP28yPMXZ0Qu0\nuBppL5ut4ephojRYrdibGjU1t86k7AA0uRpYVSF3VxoKjGS8VwyHiQ4NYm1t0+RnOD16Dm/Yx976\nnRl7uxYjJcGOPkX1jQ4H5to6wt1dquOvszWxthot3NW4B38skDVRJ9zdBYBNQ4ifPxbgyOBxqmyV\nbJvT29VoNGBzmAno4GsArSappAliTqnatZWraHY1cnLkLGOhiYz3ztRGUX/
0vjRxjZHgGDvrts0y\nP0CaYC/UcagxQkh2pMczNpa4q3EPAgIH+97MuLYkUSTS24uloUFVdU+FN/rfQkLinub98059ejWa\ncK1ckWxunVkYz2WhXqeK1n4oy0k20tsjN5xJnp7VIEkSL/a8hoDAPc3aSy3frpQEO/o5UG1tbXJz\n67HMfRTnEsiiqQIcaN6PQTDwat8bGV/mGYHWrnp8b/S/TUyMcV/znRmrJzqdloJfZHONB4PDkYrY\nUUMwEMXuNGOYo2UJgsC7Wu9BQuKVLFp7uKcHU2UVJnfuUr0KB/veAODepKBIJ2WKKXiD09YPN7TA\nWqi0VbCzbiv9/kHOj1+a93lsZBgpEtZkhgnHw7ze/xYOkz3VJSkdvRpNOJOOfbV29oUE+9aajZRb\nynh78HjGaKl8lJ0rk9fpne5nT/N2ah2e3DcUCSXBjj72REhzFiUXWC5CSU3VmaHed6Wtgu2ezfT7\nB7k2Nb/VYKS7G4PdPqt59ULExDiv9b2JzWhjX2PmeucOl+wwixXgMBMEAVtbO7HhIRJBdfVeFmpi\nvbN2K5XWCo4MHMUfm53wIzevntKkrQ/4h7g4foUV5e20ls03WzidFgRBB5NUY5Pc3FqlSWohgQbw\nnrb7AHi+65V5G324S04CsmlIVDvUf4RAPMj9LXdnND/oEQYMssYO6k2UC82D0WDkQPN+wolIRlv7\njGBvVz2+F7pfBeD969+j+p5ioCTY0U+w2zQK9kAggsEgYLVlrm9yX8tdAPxmzssshsNEh4dkW6JK\nx+nx4dP4otNy+QBT5rBAvY7fyganRluNRePEogkcWZpZGA1G7mu5i6gY4/W+2ZEh+djXn+18AYD3\ntN2b8XPBIMz0Pi0AwWTC2tSsurl1LsHe5Gpgc80GOn098zZ6xWFva1Mn2COJKC/3HMJmtKXMG3PR\n6xSrJM+p3uCy2NgVDjTvx2ly8HLvoXlljSPd3QhWG+Y6dQla3b5erkxeZ23lKlZW5Vfm+HalJNjR\nJzkH0h2oXaquV7JOswnnjvI2NlSt5erkdS5PXEv9PdIrN69Wm4QRS8T4deeLmAQj97ZkfpFhZh4K\nFWqK5hjuzJ1OHgwosfzZbcPKZnSw7w3CaUfwlIamUmPv8fVxevQ87WWtbKpen/U6R4EF0RSsrW1y\nc+vB3MXhsvlb0nmw7X4Anu96ddbfw12dsuNU5Ty80f8W/liA+1ruxGG2Z7xGL43d5HTKvqcedb6n\nXPNgM9l4oPUAoXiIV5MmNQAxEiE6OICttVW14/TF7oPAzGloOVES7OinsRudTswejyoHaqZ43Uy8\nb+V7AXj6xq9TyRlhjbVRXus/zER4kgPNd6YyLDOhxNMXHPrZnhTs3SoE+5xyAhm/z2Tj/pa78ccC\nPN89I9RmTBDqErSeufk8AI+teHDBk47DaSURF4lG8jdJyePqmDXOhcgWGZROR3kraytXcXnyGlfH\n5MxkKZEg0tONpbEJgzV3Gnw0EePFnoNYjZZ5kVHpWG0mDEZBl2bO1tY2xECA+MR4zmuV9ZDJiaxw\nT/N+XGYnr/a+ntLatTpOu329nB49T6u7ibWVq1TdU0yUBDv6aewgmyHEQID4+MIO1Eg4jpiQcgr2\nFncjd9Rto9c/wMmRs/K93eodp/5YgOe6XsZhsvNQ+/0LXjtz/C4sIsRUVS1XOFQR069GoAE80HqA\nSmsFr/S+zlhoAkmSCHfexFhRgakid1OJ61OdXJy4wpqKlTm74ug1D6kNrjN3eYhcphiFhzveDcC/\nnvwhoiQSHRpEikZTz8rFq72vMx31c6D5TpxmR9brBEE2Sekh2BVnphqHeiAQxeYwY1ygcbjNZE1q\n7WFe6T2U/O6u5LPacz5DlER+dPVpJCQ+uOqRvGrX3+6UBDtgs5sRhMJty6Dezp7LlpjOYysexCgY\neebm88TFOOGebtXFnp7rfJlQPMx729+FY4EXGfQ7uQiCgLW9ndjYKAn/wr0y1ZggACxGCx9Y9TBx\nMc7Prz9LfHKShNerSlsXJZFfXP81AI+tfDDn9XqZIaxNzQgmkzqTlMr1sKqig931O7g52cOhviMz\np5b29pzPGA6O8uuul3CbXTzQeiDn9Ypg18MkBRBRc3LxR3HmWAsga+1ui4sXe15jKDCcMn+q0djf\nGjxOl6+HnbVbWbMMtXUoCXZgRjsp1LYMaY7DHNrJjKaa+/hcY6/m7qa9jIXGefbys0T7+7C1tee0\nJfb7BznUf4QaWxV3N+/P+Rw9Ty6KoMm5wanUVEGOkFlR3s7p0XN0npdjme0qBPvzXa/S6etmR+0W\nVpS357xeL1+DYDJhbWsn0tebMwM1FIgiCLKSkYsPrXoUp8XBMzefw3dD7g9rzeE4FSWRf7/0E+Ji\nnI+s/cCC2rqCw2VBTEhEwvm1jVRIKTs5NrhYNJF0pOdeC1ajhY+u+SBxMc5TF39EuKsLwWrNqewE\nY0GevvEbLEYLH1z1iOp/Q7FREuxJHAU2tVZIFYDKqbHnti2n89iKB6m113DhzKuy4zRH4a9ALMi3\nzn6XhJTgw2veh9mQu7OQXho7gK09Gb+cQ0tTa4oBeQP+8OrHEBA4d+ol+Tk5BHunt5tfd71IhbWc\nj679kJqhF9wuMR1be4ecgZqjMJrib1FjFnBbXHx8ywcJJyIMXz0jtwTMUdnyzYG3ueHtZKtnk+pa\n44rSESgwWcvocmGuqyfcdXPBDFTF9KVG2QHYVruZXXU76J/sITI4ILcEXEDZkSSJH1/7Jf5YgIfb\nH6DSVqHtH1JEFCTYn3vuOR599FHWr1/PhQvqakbfrjicFuJxkVi0MIeZ0eXCVFOT04G6UHJSJmwm\nG7+36XdpnJDHF2/OrpmIksi/XfgBY+EJHmq7n81pRa4WwmwxYjIb9NXYcwl2laYYhbayFt634iEq\nRmQTj9CcvRJfOB7m3y78AEmS+E8bPqpKS4UZwVKojR3SI4Sy29klSSLojy7oMJzL/Sv2s8rVimPE\nR6imbMGWgDemuvj59Wexm2z8b2s+oNqmrOcGZ1+xEjEUWrCEb0CDeVLhI2veR7vfgiBJRBqqFrz2\nmZvPc3ToJK3uplQo8XKlIMG+Zs0avv71r7NrV3G3kQL9Mu1A1lZFv3/BEr4zyUnqF3Gzu5GdIbmS\n43+E3s7YvT0hJvjZtV9xceIKG6rX8sgK9YkXejrMTBWVGMsrcjpQlSbWFqv6XqUPtNxDw6TERJmR\n73U9k7HD0Hhokq+c+iZj4Qne3XYvaypXqv5+XU8uyRPFQoI9GkkQj4ua1oJBMPCJqvswiXDdHeLZ\nzhczXnd54hpfP/0vxMQ4v7vutym3qs/Q1cskBaROmOGb2edB2UDU2NgVHGYHDxnWAfBi4irnxi5m\nvO5g75s83/0KHns1f7T192YVfluOFCTYV6xYQXt7e8Hmi9sBPe3L9lWyQyZ841rWawIabMsKkiTh\nGJwk6rJxnXH++7Gv8ubA28QSMS
- "... [remaining base64 PNG data omitted: matplotlib figure plotting f(x) = sin²(x) and its first, second, and third derivatives over (-2π, 2π)] ...",
- "text/plain": [
- "\u003cmatplotlib.figure.Figure at 0x7f385e198650\u003e"
- ]
- },
- "metadata": {
- "tags": []
- },
- "output_type": "display_data"
- }
- ],
- "source": [
- "def f(x):\n",
- " return tf.square(tf.sin(x))\n",
- "\n",
- "def grad(f):\n",
- " return lambda x: tfe.gradients_function(f)(x)[0]\n",
- "\n",
- "x = tf.lin_space(-2*pi, 2*pi, 100) # 100 points between -2π and +2π\n",
- "\n",
- "import matplotlib.pyplot as plt\n",
- "\n",
- "plt.plot(x, f(x), label=\"f\")\n",
- "plt.plot(x, grad(f)(x), label=\"first derivative\")\n",
- "plt.plot(x, grad(grad(f))(x), label=\"second derivative\")\n",
- "plt.plot(x, grad(grad(grad(f)))(x), label=\"third derivative\")\n",
- "plt.legend()\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "-39gouo7mtgu"
- },
- "source": [
- "## Gradient tapes\n",
- "\n",
- "Every differentiable TensorFlow operation has an associated gradient function. For example, the gradient function of `tf.square(x)` would be a function that returns `2.0 * x`. To compute the gradient of a user-defined function (like `f(x)` in the example above), TensorFlow first \"records\" all the operations applied to compute the output of the function. We call this record a \"tape\". It then uses that tape and the gradient functions associated with each primitive operation to compute the gradients of the user-defined function using [reverse mode differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation).\n",
- "\n",
- "Since operations are recorded as they are executed, Python control flow (using `if`s and `while`s, for example) is naturally handled:\n",
- "\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "MH0UfjympWf7"
- },
- "outputs": [],
- "source": [
- "def f(x, y):\n",
- " output = 1\n",
- " for i in range(y):\n",
- " output = tf.multiply(output, x)\n",
- " return output\n",
- "\n",
- "def g(x, y):\n",
- " # Return the gradient of `f` with respect to its first parameter\n",
- " return tfe.gradients_function(f)(x, y)[0]\n",
- "\n",
- "assert f(3.0, 2).numpy() == 9.0 # f(x, 2) is essentially x * x\n",
- "assert g(3.0, 2).numpy() == 6.0 # And its gradient will be 2 * x\n",
- "assert f(4.0, 3).numpy() == 64.0 # f(x, 3) is essentially x * x * x\n",
- "assert g(4.0, 3).numpy() == 48.0 # And its gradient will be 3 * x * x"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "aNmR5-jhpX2t"
- },
- "source": [
- "At times it may be inconvenient to encapsulate the computation of interest in a function, for example if you want the gradient of the output with respect to intermediate values computed inside the function. In such cases, the slightly more verbose but explicit [tf.GradientTape](https://www.tensorflow.org/api_docs/python/tf/GradientTape) context is useful. All computation inside the context of a `tf.GradientTape` is \"recorded\".\n",
- "\n",
- "For example:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "bAFeIE8EuVIq"
- },
- "outputs": [],
- "source": [
- "x = tf.ones((2, 2))\n",
- "\n",
- "# TODO(b/78880779): Remove the 'persistent=True' argument and use\n",
- "# a single t.gradient() call when the bug is resolved.\n",
- "with tf.GradientTape(persistent=True) as t:\n",
- " # TODO(ashankar): Explain with \"watch\" argument better?\n",
- " t.watch(x)\n",
- " y = tf.reduce_sum(x)\n",
- " z = tf.multiply(y, y)\n",
- "\n",
- "# Use the same tape to compute the derivative of z with respect to the\n",
- "# intermediate value y.\n",
- "dz_dy = t.gradient(z, y)\n",
- "assert dz_dy.numpy() == 8.0\n",
- "\n",
- "# Derivative of z with respect to the original input tensor x\n",
- "dz_dx = t.gradient(z, x)\n",
- "for i in [0, 1]:\n",
- " for j in [0, 1]:\n",
- " assert dz_dx[i][j].numpy() == 8.0"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "DK05KXrAAld3"
- },
- "source": [
- "### Higher-order gradients\n",
- "\n",
- "Operations inside the `GradientTape` context manager are recorded for automatic differentiation. If gradients are computed in that context, then the gradient computation is recorded as well. As a result, the exact same API works for higher-order gradients. For example:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "cPQgthZ7ugRJ"
- },
- "outputs": [],
- "source": [
- "# TODO(ashankar): Should we use the persistent tape here instead? Follow up on Tom and Alex's discussion\n",
- "\n",
- "x = tf.constant(1.0) # Convert the Python 1.0 to a Tensor object\n",
- "\n",
- "with tf.GradientTape() as t:\n",
- " with tf.GradientTape() as t2:\n",
- " t2.watch(x)\n",
- " y = x * x * x\n",
- " # Compute the gradient inside the 't' context manager\n",
- " # which means the gradient computation is differentiable as well.\n",
- " dy_dx = t2.gradient(y, x)\n",
- "d2y_dx2 = t.gradient(dy_dx, x)\n",
- "\n",
- "assert dy_dx.numpy() == 3.0\n",
- "assert d2y_dx2.numpy() == 6.0"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "4U1KKzUpNl58"
- },
- "source": [
- "## Next Steps\n",
- "\n",
- "In this tutorial we covered gradient computation in TensorFlow. With that we have enough of the primitives required to build and train neural networks, which we will cover in the [next tutorial](https://github.com/tensorflow/models/tree/master/official/contrib/eager/python/examples/notebooks/3_neural_networks.ipynb)."
- ]
- }
- ],
- "metadata": {
- "colab": {
- "collapsed_sections": [],
- "default_view": {},
- "name": "Automatic Differentiation",
- "provenance": [],
- "version": "0.3.2",
- "views": {}
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
diff --git a/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb
deleted file mode 100644
index d268cbcd91..0000000000
--- a/tensorflow/contrib/eager/python/examples/notebooks/3_datasets.ipynb
+++ /dev/null
@@ -1,209 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "U9i2Dsh-ziXr"
- },
- "source": [
- "# Eager Execution Tutorial: Importing Data\n",
- "\n",
- "This notebook demonstrates the use of the [`tf.data.Dataset` API](https://www.tensorflow.org/guide/datasets) to build pipelines to feed data to your program. It covers:\n",
- "\n",
- "* Creating a `Dataset`.\n",
- "* Iteration over a `Dataset` with eager execution enabled.\n",
- "\n",
- "We recommend using the `Dataset`s API for building performant, complex input pipelines from simple, re-usable pieces that will feed your model's training or evaluation loops.\n",
- "\n",
- "If you're familiar with TensorFlow graphs, the API for constructing the `Dataset` object remains exactly the same when eager execution is enabled, but the process of iterating over elements of the dataset is slightly simpler.\n",
- "You can use Python iteration over the `tf.data.Dataset` object and do not need to explicitly create a `tf.data.Iterator` object.\n",
- "As a result, the discussion on iterators in the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets) is not relevant when eager execution is enabled."
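To make that contrast concrete: under graphs, consuming a dataset goes through an explicit iterator and a session, while under eager execution a plain Python loop drives the pipeline directly. A minimal sketch, assuming the same TF 1.x eager APIs this notebook uses (the graph-mode lines are shown as comments, since eager and graph iteration cannot be mixed in one program):

    import tensorflow as tf

    tf.enable_eager_execution()

    ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])

    # Eager execution: Python iteration works directly on the Dataset.
    for x in ds:
        print(x)  # tf.Tensor(1, shape=(), dtype=int32), ...

    # Graph mode would instead require roughly:
    #   it = ds.make_one_shot_iterator()
    #   next_element = it.get_next()
    #   with tf.Session() as sess:
    #       print(sess.run(next_element))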
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "z1JcS5iBXMRO"
- },
- "source": [
- "# Setup: Enable eager execution\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "RlIWhyeLoYnG"
- },
- "outputs": [],
- "source": [
- "# Import TensorFlow.\n",
- "import tensorflow as tf\n",
- "\n",
- "# Enable eager execution\n",
- "tf.enable_eager_execution()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "H9UySOPLXdaw"
- },
- "source": [
- "# Step 1: Create a source `Dataset`\n",
- "\n",
- "Create a _source_ dataset using one of the factory functions like [`Dataset.from_tensors`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensors) and [`Dataset.from_tensor_slices`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensor_slices), or using objects that read from files, like [`TextLineDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset) or [`TFRecordDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). See the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets#reading_input_data) for more information."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "WPTUfGq6kJ5w"
- },
- "outputs": [],
- "source": [
- "ds_tensors = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])\n",
- "\n",
- "# Create a CSV file\n",
- "import tempfile\n",
- "_, filename = tempfile.mkstemp()\n",
- "with open(filename, 'w') as f:\n",
- " f.write(\"\"\"Line 1\n",
- "Line 2\n",
- "Line 3\n",
- " \"\"\")\n",
- "ds_file = tf.data.TextLineDataset(filename)\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "twBfWd5xyu_d"
- },
- "source": [
- "# Step 2: Apply transformations\n",
- "\n",
- "Use transformation functions like [`map`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map), [`batch`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#batch), and [`shuffle`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#shuffle) to apply transformations to the records of the dataset. See the [API documentation for `tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) for details."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "cellView": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "ngUe237Wt48W"
- },
- "outputs": [],
- "source": [
- "ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)\n",
- "ds_file = ds_file.batch(2)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "IDY4WsYRhP81"
- },
- "source": [
- "# Step 3: Iterate\n",
- "\n",
- "When eager execution is enabled, `Dataset` objects support iteration.\n",
- "If you're familiar with the use of `Dataset`s in TensorFlow graphs, note that there is no need for `Dataset.make_one_shot_iterator()` or `get_next()` calls."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "base_uri": "https://localhost:8080/",
- "height": 153
- },
- "colab_type": "code",
- "executionInfo": {
- "elapsed": 388,
- "status": "ok",
- "timestamp": 1525154629129,
- "user": {
- "displayName": "",
- "photoUrl": "",
- "userId": ""
- },
- "user_tz": 420
- },
- "id": "lCUWzso6mbqR",
- "outputId": "8e4b0298-d27d-4ac7-e26a-ef94af0594ec"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Elements of ds_tensors:\n",
- "tf.Tensor([1 9], shape=(2,), dtype=int32)\n",
- "tf.Tensor([16 25], shape=(2,), dtype=int32)\n",
- "tf.Tensor([ 4 36], shape=(2,), dtype=int32)\n",
- "\n",
- "Elements in ds_file:\n",
- "tf.Tensor(['Line 1' 'Line 2'], shape=(2,), dtype=string)\n",
- "tf.Tensor(['Line 3' ' '], shape=(2,), dtype=string)\n"
- ]
- }
- ],
- "source": [
- "print('Elements of ds_tensors:')\n",
- "for x in ds_tensors:\n",
- " print(x)\n",
- "\n",
- "print('\\nElements in ds_file:')\n",
- "for x in ds_file:\n",
- " print(x)"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "collapsed_sections": [],
- "default_view": {},
- "name": "Eager Execution Tutorial: Importing Data",
- "provenance": [],
- "version": "0.3.2",
- "views": {}
- }
- },
- "nbformat": 4,
- "nbformat_minor": 0
-}
diff --git a/tensorflow/contrib/eager/python/examples/notebooks/3_training_models.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/3_training_models.ipynb
deleted file mode 100644
index 84f1d031d4..0000000000
--- a/tensorflow/contrib/eager/python/examples/notebooks/3_training_models.ipynb
+++ /dev/null
@@ -1,485 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "k2o3TTG4TFpt"
- },
- "source": [
- "# Training Models\n",
- "\n",
- "In the previous tutorial we covered the TensorFlow APIs for automatic differentiation, a basic building block for machine learning.\n",
- "In this tutorial we will use the TensorFlow primitives introduced in the prior tutorials to do some simple machine learning.\n",
- "\n",
- "TensorFlow also includes a higher-level neural networks API (`tf.keras`) which provides useful abstractions to reduce boilerplate. We strongly recommend these higher-level APIs for people working with neural networks. However, in this short tutorial we cover neural network training from first principles to establish a strong foundation."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "3LXMVuV0VhDr"
- },
- "source": [
- "## Setup"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "PJ64L90aVir3"
- },
- "outputs": [],
- "source": [
- "import tensorflow as tf\n",
- "tf.enable_eager_execution()\n",
- "tfe = tf.contrib.eager # Shorthand for some symbols"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "eMAWbDJFVmMk"
- },
- "source": [
- "## Variables\n",
- "\n",
- "Tensors in TensorFlow are immutable, stateless objects. Machine learning models, however, need changing state: as your model trains, the same code to compute predictions should behave differently over time (hopefully with a lower loss!). To represent this state, which needs to change over the course of your computation, you can choose to rely on the fact that Python is a stateful programming language:\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "VkJwtLS_Jbn8"
- },
- "outputs": [],
- "source": [
- "# Using python state\n",
- "x = tf.zeros([10, 10])\n",
- "x += 2 # This is equivalent to x = x + 2, which does not mutate the original\n",
- " # value of x\n",
- "print(x)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "wfneTXy7JcUz"
- },
- "source": [
- "TensorFlow, however, has stateful operations built in, and these are often more pleasant to use than low-level Python representations of your state. To represent weights in a model, for example, it's often convenient and efficient to use TensorFlow variables.\n",
- "\n",
- "A Variable is an object which stores a value and, when used in a TensorFlow computation, will implicitly read from this stored value. There are operations (`tf.assign_sub`, `tf.scatter_update`, etc.) which manipulate the value stored in a TensorFlow variable."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "itxmrMil6DQi"
- },
- "outputs": [],
- "source": [
- "v = tfe.Variable(1.0)\n",
- "assert v.numpy() == 1.0\n",
- "\n",
- "# Re-assign the value\n",
- "v.assign(3.0)\n",
- "assert v.numpy() == 3.0\n",
- "\n",
- "# Use `v` in a TensorFlow operation like tf.square() and reassign\n",
- "v.assign(tf.square(v))\n",
- "assert v.numpy() == 9.0"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "-paSaeq1JzwC"
- },
- "source": [
- "Computations using Variables are automatically traced when computing gradients. For Variables representing embeddings, TensorFlow will do sparse updates by default, which are more computation- and memory-efficient.\n",
- "\n",
- "Using Variables is also a way to quickly let a reader of your code know that this piece of state is mutable."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "BMiFcDzE7Qu3"
- },
- "source": [
- "## Example: Fitting a linear model\n",
- "\n",
- "Let's now use the concepts we have so far (`Tensor`, `GradientTape`, `Variable`) to build and train a simple model. This typically involves a few steps:\n",
- "\n",
- "1. Define the model.\n",
- "2. Define a loss function.\n",
- "3. Obtain training data.\n",
- "4. Run through the training data and use an \"optimizer\" to adjust the variables to fit the data.\n",
- "\n",
- "In this tutorial, we'll walk through a trivial example of a simple linear model: `f(x) = x * W + b`, which has two variables, `W` and `b`. Furthermore, we'll synthesize data such that a well-trained model would have `W = 3.0` and `b = 2.0`."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "gFzH64Jn9PIm"
- },
- "source": [
- "### Define the model\n",
- "\n",
- "Let's define a simple class to encapsulate the variables and the computation."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "_WRu7Pze7wk8"
- },
- "outputs": [],
- "source": [
- "class Model(object):\n",
- " def __init__(self):\n",
- " # Initialize the variables to (5.0, 0.0)\n",
- " # In practice, these should be initialized to random values.\n",
- " self.W = tfe.Variable(5.0)\n",
- " self.b = tfe.Variable(0.0)\n",
- "\n",
- " def __call__(self, x):\n",
- " return self.W * x + self.b\n",
- "\n",
- "model = Model()\n",
- "\n",
- "assert model(3.0).numpy() == 15.0"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "xa6j_yXa-j79"
- },
- "source": [
- "### Define a loss function\n",
- "\n",
- "A loss function measures how well the output of a model for a given input matches the desired output. Let's use the standard L2 loss."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "Y0ysUFGY924U"
- },
- "outputs": [],
- "source": [
- "def loss(predicted_y, desired_y):\n",
- " return tf.reduce_mean(tf.square(predicted_y - desired_y))"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "qutT_fkl_CBc"
- },
- "source": [
- "### Obtain training data\n",
- "\n",
- "Let's synthesize the training data with some noise."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "gxPTb-kt_N5m"
- },
- "outputs": [],
- "source": [
- "TRUE_W = 3.0\n",
- "TRUE_b = 2.0\n",
- "NUM_EXAMPLES = 1000\n",
- "\n",
- "inputs = tf.random_normal(shape=[NUM_EXAMPLES])\n",
- "noise = tf.random_normal(shape=[NUM_EXAMPLES])\n",
- "outputs = inputs * TRUE_W + TRUE_b + noise"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "-50nq-wPBsAW"
- },
- "source": [
- "Before we train the model, let's visualize where the model stands right now. We'll plot the model's predictions in red and the training data in blue."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "height": 293
- },
- "colab_type": "code",
- "executionInfo": {
- "elapsed": 1210,
- "status": "ok",
- "timestamp": 1527005898290,
- "user": {
- "displayName": "",
- "photoUrl": "",
- "userId": ""
- },
- "user_tz": 420
- },
- "id": "_eb83LtrB4nt",
- "outputId": "3873f508-72fb-41e7-a7f5-3f513deefe38"
- },
- "outputs": [
- {
- "data": {
- "image/png": "... [base64 PNG data omitted: scatter plot of the training data (blue) and the untrained model's predictions (red)] ...",
- "text/plain": [
- "\u003cmatplotlib.figure.Figure at 0x7f5be3c99f50\u003e"
- ]
- },
- "metadata": {
- "tags": []
- },
- "output_type": "display_data"
- },
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "Current loss: 9.48636\n"
- ]
- }
- ],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "\n",
- "plt.scatter(inputs, outputs, c='b')\n",
- "plt.scatter(inputs, model(inputs), c='r')\n",
- "plt.show()\n",
- "\n",
- "print('Current loss: '),\n",
- "print(loss(model(inputs), outputs).numpy())"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "sSDP-yeq_4jE"
- },
- "source": [
- "### Define a training loop\n",
- "\n",
- "We now have our network and our training data. Let's train it, i.e., use the training data to update the model's variables (`W` and `b`) so that the loss goes down using [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent). There are many variants of the gradient descent scheme that are captured in `tf.train.Optimizer` implementations. We'd highly recommend using those implementations, but in the spirit of building from first principles, in this particular example we will implement the basic math ourselves."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 0,
- "metadata": {
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "colab_type": "code",
- "id": "MBIACgdnA55X"
- },
- "outputs": [],
- "source": [
- "def train(model, inputs, outputs, learning_rate):\n",
- " with tf.GradientTape() as t:\n",
- " current_loss = loss(model(inputs), outputs)\n",
- " dW, db = t.gradient(current_loss, [model.W, model.b])\n",
- " model.W.assign_sub(learning_rate * dW)\n",
- " model.b.assign_sub(learning_rate * db)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "colab_type": "text",
- "id": "RwWPaJryD2aN"
- },
- "source": [
- "Finally, let's repeatedly run through the training data and see how `W` and `b` evolve."
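As the markdown above notes, the `tf.train.Optimizer` implementations are preferable in practice to hand-written updates. For reference, a minimal sketch of the same training step written against one of them, reusing the `model` and `loss` defined earlier in this notebook (the `train_step` name is illustrative only):

    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

    def train_step(model, inputs, outputs):
      with tf.GradientTape() as t:
        current_loss = loss(model(inputs), outputs)
      grads = t.gradient(current_loss, [model.W, model.b])
      # apply_gradients performs the same `W -= learning_rate * dW` update
      # that train() above does with assign_sub.
      optimizer.apply_gradients(zip(grads, [model.W, model.b]))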
- ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 446 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 569, - "status": "ok", - "timestamp": 1527005915434, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "XdfkR223D9dW", - "outputId": "c43591ae-d5ac-4f2b-a8e7-bfce607e0919" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Epoch 0: W=5.00 b=0.00, loss=9.48636\n", - "Epoch 1: W=4.58 b=0.42, loss=6.28101\n", - "Epoch 2: W=4.24 b=0.76, loss=4.29357\n", - "Epoch 3: W=3.98 b=1.02, loss=3.06128\n", - "Epoch 4: W=3.78 b=1.23, loss=2.29721\n", - "Epoch 5: W=3.61 b=1.39, loss=1.82345\n", - "Epoch 6: W=3.49 b=1.52, loss=1.52970\n", - "Epoch 7: W=3.38 b=1.62, loss=1.34756\n", - "Epoch 8: W=3.30 b=1.70, loss=1.23463\n", - "Epoch 9: W=3.24 b=1.76, loss=1.16460\n" - ] - }, - { - "data": { - "image/png": "iVBORw0KGgoAAAANSUhEUgAAAW0AAAEDCAYAAAD+/1UIAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl4VOXdPvD7zJZ9XwmELQkQIAELsiTsi6xiEBGXAiIW\nbV8WBY2K0tLa4lbsr283qxURtIoioAi8SpFNg6whi0FJKAoJBgLZt5k5c87vj5OZLIRkgEnOGXJ/\nritXJsmZyT0sN1+enPOMIMuyDCIicgs6tQMQEZHzWNpERG6EpU1E5EZY2kREboSlTUTkRljaRERu\nxODMQePGjYOvry90Oh0MBgM2b97c1rmIiKgZTpW2IAjYuHEjAgIC2joPERG1wKnlEVmWIUlSW2ch\nIqJWCM5cETl+/HgEBARAEATMmTMH9957b3tkIyKiJpxaHvnggw8QFhaG4uJiLFiwAD179sTgwYPb\nOhsRETXh1PJIWFgYACA4OBgTJ05EVlZWi8fL3t6AIADdugFvvglYrTeflIiIWl8eqampgSRJ8PHx\nQXV1NR5++GEsXrwYI0aMuPadCgtRvfoFeL2zDkJtLWxdu6NqRSrMs+8DDE4N9y4XFuaHoqIKVb73\ntTCTc7SYCdBmLmZyjlYzOaPVSfvy5ct44IEHkJKSgjlz5mDcuHEtFzYAREai6oWXUHwkA9WPPApd\n4QX4L/sVgpMGwWPTvwFRdCocERE15tQPIm9Ew3/FdBcK4P3ntfB89x0IVivEmFhUP/kMzCmzAL2+\nLb79VbT6LysztU6LmQBt5mIm52g1kzPa5YpIKaozKl9+DcWHT6Jm7gLof/wB/r98BEGjh8Fj28cA\nTyckInJKu17GLnWJRuXaP6P40AnUPDgP+jN58F+0AEFjhsO0fRvLm4ioFarsPSJ1647KP/0VxWnH\nUTvnAehPf4+AhfMQNG4ETDs/A/hiOkREzVJ1wyipR09U/OV1lHx9FLX3zIH+uxwEPPQAAieMgunz\nXSxvIqImNLHLny0mDhV/fxMlB4+g9u57YMjORMDcOQicNAamPV+wvImI6miitO1scb1Q8fo6lOz/\nBrUzZsJ4Mh0B99+DwKnjYdy7h+VNRNftL395DR999IHj4+XLl2DVqlWOj//61/+HDz/8txrRboim\nStvO1iceFf96B8V702CeNgPG48cQOGcmAu+cBOOBfSxvInJa//6JyM7OAKBsfldWVorc3FzH17Oz\nM5GQMECteNdNk6VtZ+vXH+Vvv4uSPQdhnjwVxiPfIPCeGQhImQpj2ldqxyMiN5CQMBBZWZkAgLNn\nz6Bnzxj4+PigsrISVqsVP/74A+Liequc0nnqXFN+ncSEASjf8AEMJ0/A+9UX4bH7c5hSpsIycjSq\nnloJcdhwtSMSkRN8Vj8Pj+3bXPqY5jtTULX699f8emhoKPR6Ay5duoisrEz075+I6uoyZGdnwsfH\nBzExsTCotL3GjdD0pN2UOPBnKH/vI5Ts2gPL2PEwHdyPoBmTEDD7LhiOHlY7HhFpVGJiIrKyMpCd\nrZT2gAEDkJWVgaws91oaAdxk0m5KHHQ7yjZtheHIYfi8sgam/Xth2r8X5vETUZ26EuJtg9SOSETN\nqFr9+xan4rbSr18isrIy8d//KssjHh4y/vnPf8HX1wfTpt3V7nluhltN2k2JQ4aibPMnKP1kFyzJ\nI+GxZzeCJo2F/8/vhSHzpNrxiEgjEhIGIC3tIPz9/SEIAgICAlBZWYHs7Cz075+gdrzr4talbWcd\nnoyyrTtQuuUzWIYlweOL/0PQhFHwn3c/9HU/gCCijismJhbl5WXo3z+x0ef8/Pzg7+9er33bLrv8\ntStZhvHAPvi8/AcYjx0BAJin3wWPXz+Hom69lRdn0Ait7jTGTM7RYi5mco5WMznjlpi0GxEEWEeP\nRemO3Sj9YAusPxsEj88+AYYMQeCEUfBc/xaEinK1UxIR3ZBbr7TtBAHWcRNQuutLlG7aCsycCUNO\nNvxSn0BIQm/4Ll8CQ/pxXqhDRG7l1i1tO0GAdex4YMsWFJ88hapnV0EKCYHXu+8gaNJYBI4fCc+3\n/wWhvEztpERErbr1S7sBKSIS1U88heKjmSj9YAvM02bAcOpb+D29HCGJveH7xGIYThzj9E1EmtWh\nSttBp4N13ASUv/2uMn2v/DWk0DB4vbcBQZPHIWjcCHiue5PTNxFpTscs7QakiEhUP/4kio9koHTT\nVpinzYD++1Pwe2aFMn0//j8wHD/K6ZuINKHDl7aDTgfr2PHK9J2eg8rnfgMpNBxe/96IoCnjETQ2\nmdM3kZsqLPwJ8+bNUTuGS7C0myFFRKJm2QoUHzmpTN/T74L+9HfK9J3QC77LfgXDsSOcvonciKCh\nazRuBku7Jfbpe91GXEk/hcrnV0MKj4DX++8iaOoEZfp+6w0IZaVqJyWiVoiiiD/8YTXmz78fy5Yt\ng9lsVjvSDbn1roi8BpddASVJMB7YB6+N62Ha9RkEUYT
[several kilobytes of base64-encoded PNG data elided: plot of the W and b estimates over ten training epochs converging toward the true values shown as dashed lines]
13lbIuLNTBx0fG8uVmPPaYha8EQ0S3JLcsbasVeP99I/70JxMKCnTw\n9paxZIkZv/qVFSEhXAIholuXW5W2KAIffWTA2rUeOHdOB09PGY89ZsGSJRa+yAARdQhuUdo2G7Bl\niwF//KMHzp7VwWSS8cgjFixbZuF+IETUoWi6tCUJ+PRTA1591YTcXD2MRhkPPWTB449beOoeEXVI\nmixtSQJ27lTK+tQpPfR6GT//uVLWXbuyrImo49JUacsy8MUXerz8sgeys/XQ6WTMmWPF8uVm9OjB\nsiYi0kRpyzKwd69S1unpegiCjLvvtuLJJ82IjWVZExHZqVrasgwcPKiU9dGjykYgM2ZY8eSTFvTp\nI6kZjYhIk1Qr7UOH9HjpJRMOHVIiTJlixVNPWdC/P8uaiOha2r20jx7V4aWXPHDwoPKtJ04UkZpq\nxoABLGsiotY49UJaBw4cwOTJkzFp0iS88cYbN/SNTpzQ4b77vDBtmg8OHjRgzBgRu3ZV4b33aljY\nREROanXSliQJL7zwAtavX4/w8HDcc889GD9+PGJiYpz6BllZOrzyigc+/1z5ViNGiEhNtWDYMNvN\nJSci6oBaLe3MzEx069YNnTt3BgBMmzYNe/bsabW0c3J0ePVVE3bsMAIAhg4V8fTTFowYwbImIrpR\nrZb2xYsX0alTJ8fHERERyMrKavE+990HfPihN2RZwKBBNjz9tBmjR9sgCDcfmIioI2u1tGX5+s+T\n3rQJGDBAwtNPmzF+PMuaiMhVWi3tyMhIXLhwwfHxxYsXER4e3uJ9lJ7XA/C+yXiuFRbmp3aEqzCT\nc7SYCdBmLmZyjhYzOaPVs0cSEhJw7tw5FBQUwGKxYMeOHRg/fnx7ZCMioiZanbT1ej1WrVqFhx9+\nGLIs45577nH6zBEiInItQb6RRWsiIlKFUxfXEBGRNrC0iYjcCEubiMiNuHTDqAMHDmDNmjWQZRmz\nZs3CokWLXPnwN2TlypXYt28fQkJCsH37drXjAAAKCwuRmpqKy5cvQ6/XY/bs2Zg3b56qmSwWCx58\n8EFYrVbYbDZMmjQJixcvVjWTnSRJmDVrFiIiIvD666+rHQfjxo2Dr68vdDodDAYDNm/erHYkVFRU\n4LnnnkNubi50Oh3WrFmDAQMGqJrp7NmzeOKJJyAIAmRZxvnz57Fs2TLV/6yvX78emzdvhiAI6NWr\nF1588UWYTCZVM73zzjuOP0et9oHsIjabTZ4wYYKcn58vWywWecaMGXJeXp6rHv6GHT16VM7JyZGn\nT5+udhSHS5cuyTk5ObIsy3JlZaV8xx13aOLXqrq6WpZlWRZFUZ49e7ackZGhciLF22+/La9YsUJ+\n9NFH1Y4iy7Isjxs3Ti4tLVU7RiNPP/20vHnzZlmWZdlqtcoVFRUqJ2rMZrPJycnJ8oULF1TNUVhY\nKI8bN042m82yLMvysmXL5K1bt6qa6fTp0/L06dNls9ksi6IoP/TQQ/KPP/54zeNdtjzScI8So9Ho\n2KNEbYMHD4a/v7/aMRoJCwtDfHw8AMDHxwcxMTG4dOmSyqkALy8vAMrULYqiymkUhYWF2L9/P2bP\nnq12FAdZliFJ2tmZsrKyEseOHcOsWbMAAAaDAb6+viqnaiwtLQ1du3ZttCWGWiRJQk1NDURRRG1t\nbasXC7a1M2fOYODAgTCZTNDr9bj99tuxe/fuax7vstJubo8SLRSR1uXn5+O7775DYmKi2lEgSRJS\nUlKQnJyM5ORkTWRas2YNUlNTIWhoLwRBELBw4ULMmjULH374odpxkJ+fj6CgIDz77LOYOXMmVq1a\nhdraWrVjNbJz505MmzZN7RiIiIjAggULMGbMGIwaNQp+fn5ISkpSNVNcXByOHj2KsrIy1NTU4MCB\nA/jpp5+uebzLSlvm6d7XraqqCkuXLsXKlSvh4+OjdhzodDps27YNBw4cQEZGBvLy8lTNs2/fPoSG\nhiI+Pl5Tf74++OADbNmyBW+++Sbee+89HDt2TNU8oigiJycHDzzwALZu3QpPT88b3ve+LVitVnz5\n5ZeYMmWK2lFQXl6OPXv2YO/evTh48CCqq6tV/1lXTEwMfvGLX2DBggVYtGgR+vTpA4Ph2j9udFlp\n38geJR2ZKIpYunQp7rrrLkyYMEHtOI34+vpiyJAhOHjwoKo5Tpw4gS+//BLjx4/HihUrcPjwYaSm\npqqaCVCWtwAgODgYEydObHXXy7YWGRmJyMhIJCQkAAAmTZqEnJwcVTM1dODAAfTr1w/BwcFqR0Fa\nWhqio6MRGBgIvV6PiRMnIj09Xe1YmDVrFrZs2YKNGzciICAA3bp1u+axLittLe9RoqUpzW7lypWI\njY3F/Pnz1Y4CACguLkZFRQUAoLa2FocOHULPnj1VzbR8+XLs27cPe/bswWuvvYahQ4filVdeUTVT\nTU0NqqqqAADV1dX46quvEBcXp2qm0NBQdOrUCWfPngUAfPPNN5raamLHjh2YPn262jEAAFFRUcjI\nyIDZbIYsy5r5tSouLgYAXLhwAbt3727x18tlp/xpdY8S+4RWWlqKMWPGYMmSJY4f2Kjl+PHj2L59\nO3r16oWUlBQIgoAnnngCo0aNUi1TUVERnnnmGUiSBEmSMHXqVIwePVq1PFp1+fJlLF68GIIgwGaz\n4c4778SIESPUjoXnn38eTz75JERRRHR0NF588UW1IwFQBoC0tDT87ne/UzsKACAxMRGTJk1CSkoK\nDAYD+vbti3vvvVftWFiyZAnKyspgMBjwm9/8Bn5+196BkHuPEBG5EV4RSUTkRljaRERuhKVNRORG\nWNpERG6EpU1E5EZY2kREboSlTUTkRljaRERu5P8D+7Wym3BFpegAAAAASUVORK5CYII=\n", - "text/plain": [ - "\u003cmatplotlib.figure.Figure at 0x7f5be4b8ec50\u003e" - ] - }, - "metadata": { - "tags": [] - }, - "output_type": "display_data" - } - ], - "source": [ - "model = Model()\n", - "\n", - "# Collect the history of W-values and b-values to plot later\n", - "Ws, bs = [], []\n", - "epochs = range(10)\n", - "for epoch in epochs:\n", - " Ws.append(model.W.numpy())\n", - " bs.append(model.b.numpy())\n", - " current_loss = loss(model(inputs), outputs)\n", - "\n", - " train(model, inputs, outputs, learning_rate=0.1)\n", - " print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %\n", - " (epoch, Ws[-1], 
bs[-1], current_loss))\n", - "\n", - "# Let's plot it all\n", - "plt.plot(epochs, Ws, 'r',\n", - " epochs, bs, 'b')\n", - "plt.plot([TRUE_W] * len(epochs), 'r--',\n", - " [TRUE_b] * len(epochs), 'b--')\n", - "plt.legend(['W', 'b', 'true W', 'true_b'])\n", - "plt.show()\n", - " " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "vPnIVuaSJwWz" - }, - "source": [ - "## Next Steps\n", - "\n", - "In this tutorial we covered `Variable`s and built and trained a simple linear model using the TensorFlow primitives discussed so far.\n", - "\n", - "In theory, this is pretty much all you need to use TensorFlow for your machine learning research.\n", - "In practice, particularly for neural networks, the higher level APIs like `tf.keras` will be much more convenient since it provides higher level building blocks (called \"layers\"), utilities to save and restore state, a suite of loss functions, a suite of optimization strategies etc. \n", - "\n", - "The [next tutorial](TODO) will cover these higher level APIs." - ] - } - ], - "metadata": { - "colab": { - "collapsed_sections": [], - "default_view": {}, - "name": "Training Models", - "provenance": [], - "version": "0.3.2", - "views": {} - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/tensorflow/contrib/eager/python/examples/notebooks/4_high_level.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/4_high_level.ipynb deleted file mode 100644 index 4fe3a0e3f3..0000000000 --- a/tensorflow/contrib/eager/python/examples/notebooks/4_high_level.ipynb +++ /dev/null @@ -1,551 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - } - }, - "colab_type": "code", - "id": "pwX7Fii1rwsJ" - }, - "outputs": [], - "source": [ - "import tensorflow as tf\n", - "tf.enable_eager_execution()\n", - "tfe = tf.contrib.eager\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "UEu3q4jmpKVT" - }, - "source": [ - "# High level API\n", - "\n", - "We recommend using `tf.keras` as a high-level API for building neural networks. That said, most TensorFlow APIs are usable with eager execution.\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "zSFfVVjkrrsI" - }, - "source": [ - "## Layers: common sets of useful operations\n", - "\n", - "Most of the time when writing code for machine learning models you want to operate at a higher level of abstraction than individual operations and manipulation of individual variables.\n", - "\n", - "Many machine learning models are expressible as the composition and stacking of relatively simple layers, and TensorFlow provides both a set of many common layers as a well as easy ways for you to write your own application-specific layers either from scratch or as the composition of existing layers.\n", - "\n", - "TensorFlow includes the full [Keras](https://keras.io) API in the tf.keras package, and the Keras layers are very useful when building your own models.\n" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - } - }, - "colab_type": "code", - "id": "8PyXlPl-4TzQ" - }, - "outputs": [], - "source": [ - "# In the tf.keras.layers package, layers are objects. To construct a layer,\n", - "# simply construct the object. 
Most layers take as a first argument the number\n", - "# of output dimensions / channels.\n", - "layer = tf.keras.layers.Dense(100)\n", - "# The number of input dimensionss is often unnecessary, as it can be inferred\n", - "# the first time the layer is used, but it can be provided if you want to \n", - "# specify it manually, which is useful in some complex models.\n", - "layer = tf.keras.layers.Dense(10, input_shape=(None, 5))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Fn69xxPO5Psr" - }, - "source": [ - "The full list of pre-existing layers can be seen in [the documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers). It includes Dense (a fully-connected layer),\n", - "Conv2D, LSTM, BatchNormalization, Dropout, and many others." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 204 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 244, - "status": "ok", - "timestamp": 1527783641557, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "E3XKNknP5Mhb", - "outputId": "c5d52434-d980-4488-efa7-5660819d0207" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "\u003ctf.Tensor: id=30, shape=(10, 10), dtype=float32, numpy=\n", - "array([[ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],\n", - " [ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]], dtype=float32)\u003e" - ] - }, - "execution_count": 3, - "metadata": { - "tags": [] - }, - "output_type": "execute_result" - } - ], - "source": [ - "# To use a layer, simply call it.\n", - "layer(tf.zeros([10, 5]))" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 221 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 320, - "status": "ok", - "timestamp": 1527783642457, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "Wt_Nsv-L5t2s", - "outputId": "f0d96dce-0128-4080-bfe2-0ee6fbc0ad90" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "[\u003ctf.Variable 'dense_1/kernel:0' shape=(5, 10) dtype=float32, numpy=\n", - " array([[ 0.43788117, -0.62099844, -0.30525017, -0.59352523, 0.1783089 ,\n", - " 0.47078604, -0.23620895, -0.30482283, 0.01366901, -0.1288507 ],\n", - " [ 0.18407935, -0.56550485, 0.54180616, -0.42254075, 0.3702994 ,\n", - " 0.36705834, -0.29678228, 0.36660975, 0.36717761, 0.46269661],\n", - " [ 0.1709305 , -0.11529458, 0.32710236, 0.46300393, -0.62802851,\n", - " 0.51641601, 0.39624029, 0.26918125, -0.25196898, 0.21353298],\n", - " [ 0.35752094, 0.44161648, 0.61500639, -0.12653333, 0.41629118,\n", - " 0.36193585, 0.066082 , -0.59253877, 0.47318751, 0.17115968],\n", - " [-0.22554061, -0.17727301, 0.5525015 , 0.3678053 , -0.00454676,\n", - " 0.24066836, -0.53640735, 0.13792562, -0.10727292, 0.59708995]], dtype=float32)\u003e,\n", - " \u003ctf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([ 0., 0., 0., 0., 0., 0., 0., 0., 
0., 0.], dtype=float32)\u003e]" - ] - }, - "execution_count": 4, - "metadata": { - "tags": [] - }, - "output_type": "execute_result" - } - ], - "source": [ - "# Layers have many useful methods. For example, you can inspect all variables\n", - "# in a layer by calling layer.variables. In this case a fully-connected layer\n", - "# will have variables for weights and biases.\n", - "layer.variables" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 221 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 226, - "status": "ok", - "timestamp": 1527783643252, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "6ilvKjz8_4MQ", - "outputId": "f647fced-c2d7-41a3-c237-242036784665" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "(\u003ctf.Variable 'dense_1/kernel:0' shape=(5, 10) dtype=float32, numpy=\n", - " array([[ 0.43788117, -0.62099844, -0.30525017, -0.59352523, 0.1783089 ,\n", - " 0.47078604, -0.23620895, -0.30482283, 0.01366901, -0.1288507 ],\n", - " [ 0.18407935, -0.56550485, 0.54180616, -0.42254075, 0.3702994 ,\n", - " 0.36705834, -0.29678228, 0.36660975, 0.36717761, 0.46269661],\n", - " [ 0.1709305 , -0.11529458, 0.32710236, 0.46300393, -0.62802851,\n", - " 0.51641601, 0.39624029, 0.26918125, -0.25196898, 0.21353298],\n", - " [ 0.35752094, 0.44161648, 0.61500639, -0.12653333, 0.41629118,\n", - " 0.36193585, 0.066082 , -0.59253877, 0.47318751, 0.17115968],\n", - " [-0.22554061, -0.17727301, 0.5525015 , 0.3678053 , -0.00454676,\n", - " 0.24066836, -0.53640735, 0.13792562, -0.10727292, 0.59708995]], dtype=float32)\u003e,\n", - " \u003ctf.Variable 'dense_1/bias:0' shape=(10,) dtype=float32, numpy=array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], dtype=float32)\u003e)" - ] - }, - "execution_count": 5, - "metadata": { - "tags": [] - }, - "output_type": "execute_result" - } - ], - "source": [ - "# The variables are also accessible through nice accessors\n", - "layer.kernel, layer.bias" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "O0kDbE54-5VS" - }, - "source": [ - "## Implementing custom layers\n", - "The best way to implement your own layer is extending the tf.keras.Layer class and implementing:\n", - " * `__init__` , where you can do all input-independent initialization\n", - " * `build`, where you know the shapes of the input tensors and can do the rest of the initialization\n", - " * `call`, where you do the forward computation\n", - "\n", - "Note that you don't have to wait until `build` is called to create your variables, you can also create them in `__init__`. However, the advantage of creating them in `build` is that it enables late variable creation based on the shape of the inputs the layer will operate on. On the other hand, creating variables in `__init__` would mean that shapes requires to create the variables will need to be explicitly specified." 
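To make that trade-off concrete, here is a hypothetical `__init__`-based variant (the `EagerDense` name and `input_dim` argument are illustrative, not from the notebook); because the kernel is created before any input is seen, the input size must be supplied explicitly:

```python
class EagerDense(tf.keras.layers.Layer):
  """Hypothetical variant that creates its variable in __init__."""

  def __init__(self, input_dim, num_outputs):
    super(EagerDense, self).__init__()
    # The kernel is created immediately, so input_dim must be given
    # up front instead of being inferred later in build().
    self.kernel = self.add_variable("kernel",
                                    shape=[input_dim, num_outputs])

  def call(self, input):
    return tf.matmul(input, self.kernel)

layer = EagerDense(5, 10)  # input size is fixed at construction time
print(layer(tf.zeros([10, 5])))
```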
- ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 391 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 251, - "status": "ok", - "timestamp": 1527783661512, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "5Byl3n1k5kIy", - "outputId": "6e7f9285-649a-4132-82ce-73ea92f15862" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tf.Tensor(\n", - "[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]\n", - " [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]], shape=(10, 10), dtype=float32)\n", - "[\u003ctf.Variable 'my_dense_layer_1/kernel:0' shape=(5, 10) dtype=float32, numpy=\n", - "array([[-0.4011991 , 0.22458655, -0.33237562, -0.25117266, 0.33528614,\n", - " -0.01392961, 0.58580834, -0.16346583, 0.28465688, -0.47191954],\n", - " [-0.52922136, 0.22416979, -0.58209574, -0.60914612, 0.05226624,\n", - " -0.18325993, 0.5591442 , -0.24718609, 0.37148207, 0.40475875],\n", - " [ 0.16912812, -0.47618777, -0.38989353, 0.30105609, -0.08085585,\n", - " 0.44758242, 0.545829 , 0.51421839, 0.11063248, 0.20159996],\n", - " [ 0.34073615, -0.59835428, 0.06498981, -0.44489855, -0.34302285,\n", - " 0.20969599, 0.35527444, -0.03173476, -0.22227573, 0.09303057],\n", - " [ 0.41764337, -0.06435019, -0.52509922, -0.39957345, 0.56811184,\n", - " 0.23481232, -0.61666459, 0.31144124, -0.11532354, -0.42421889]], dtype=float32)\u003e]\n" - ] - } - ], - "source": [ - "class MyDenseLayer(tf.keras.layers.Layer):\n", - " def __init__(self, num_outputs):\n", - " super(MyDenseLayer, self).__init__()\n", - " self.num_outputs = num_outputs\n", - " \n", - " def build(self, input_shape):\n", - " self.kernel = self.add_variable(\"kernel\", \n", - " shape=[input_shape[-1].value, \n", - " self.num_outputs])\n", - " \n", - " def call(self, input):\n", - " return tf.matmul(input, self.kernel)\n", - " \n", - "layer = MyDenseLayer(10)\n", - "print(layer(tf.zeros([10, 5])))\n", - "print(layer.variables)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "tk8E2vY0-z4Z" - }, - "source": [ - "Note that you don't have to wait until `build` is called to create your variables, you can also create them in `__init__`.\n", - "\n", - "Overall code is easier to read and maintain if it uses standard layers whenever possible, as other readers will be familiar with the behavior of standard layers. If you want to use a layer which is not present in tf.keras.layers or tf.contrib.layers, consider filing a [github issue](http://github.com/tensorflow/tensorflow/issues/new) or, even better, sending us a pull request!" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "Qhg4KlbKrs3G" - }, - "source": [ - "## Models: composing layers\n", - "\n", - "Many interesting layer-like things in machine learning models are implemented by composing existing layers. For example, each residual block in a resnet is a composition of convolutions, batch normalizations, and a shortcut.\n", - "\n", - "The main class used when creating a layer-like thing which contains other layers is tf.keras.Model. 
Implementing one is done by inheriting from tf.keras.Model." - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "height": 190 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 420, - "status": "ok", - "timestamp": 1527783698512, - "user": { - "displayName": "", - "photoUrl": "", - "userId": "" - }, - "user_tz": 420 - }, - "id": "N30DTXiRASlb", - "outputId": "a8b23a8e-5cf9-4bbf-f93b-6c763d74e2b3" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "tf.Tensor(\n", - "[[[[ 0. 0. 0.]\n", - " [ 0. 0. 0.]\n", - " [ 0. 0. 0.]]\n", - "\n", - " [[ 0. 0. 0.]\n", - " [ 0. 0. 0.]\n", - " [ 0. 0. 0.]]]], shape=(1, 2, 3, 3), dtype=float32)\n", - "['resnet_identity_block_1/conv2d_3/kernel:0', 'resnet_identity_block_1/conv2d_3/bias:0', 'resnet_identity_block_1/batch_normalization_3/gamma:0', 'resnet_identity_block_1/batch_normalization_3/beta:0', 'resnet_identity_block_1/conv2d_4/kernel:0', 'resnet_identity_block_1/conv2d_4/bias:0', 'resnet_identity_block_1/batch_normalization_4/gamma:0', 'resnet_identity_block_1/batch_normalization_4/beta:0', 'resnet_identity_block_1/conv2d_5/kernel:0', 'resnet_identity_block_1/conv2d_5/bias:0', 'resnet_identity_block_1/batch_normalization_5/gamma:0', 'resnet_identity_block_1/batch_normalization_5/beta:0', 'resnet_identity_block_1/batch_normalization_3/moving_mean:0', 'resnet_identity_block_1/batch_normalization_3/moving_variance:0', 'resnet_identity_block_1/batch_normalization_4/moving_mean:0', 'resnet_identity_block_1/batch_normalization_4/moving_variance:0', 'resnet_identity_block_1/batch_normalization_5/moving_mean:0', 'resnet_identity_block_1/batch_normalization_5/moving_variance:0']\n" - ] - } - ], - "source": [ - "class ResnetIdentityBlock(tf.keras.Model):\n", - " def __init__(self, kernel_size, filters):\n", - " super(ResnetIdentityBlock, self).__init__(name='')\n", - " filters1, filters2, filters3 = filters\n", - "\n", - " self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))\n", - " self.bn2a = tf.keras.layers.BatchNormalization()\n", - "\n", - " self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')\n", - " self.bn2b = tf.keras.layers.BatchNormalization()\n", - "\n", - " self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))\n", - " self.bn2c = tf.keras.layers.BatchNormalization()\n", - "\n", - " def call(self, input_tensor, training=False):\n", - " x = self.conv2a(input_tensor)\n", - " x = self.bn2a(x, training=training)\n", - " x = tf.nn.relu(x)\n", - "\n", - " x = self.conv2b(x)\n", - " x = self.bn2b(x, training=training)\n", - " x = tf.nn.relu(x)\n", - "\n", - " x = self.conv2c(x)\n", - " x = self.bn2c(x, training=training)\n", - "\n", - " x += input_tensor\n", - " return tf.nn.relu(x)\n", - "\n", - " \n", - "block = ResnetIdentityBlock(1, [1, 2, 3])\n", - "print(block(tf.zeros([1, 2, 3, 3])))\n", - "print([x.name for x in block.variables])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "wYfucVw65PMj" - }, - "source": [ - "Much of the time, however, models which compose many layers simply call one layer after the other. 
This can be done in very little code using tf.keras.Sequential" - ] - }, - { - "cell_type": "code", - "execution_count": 0, - "metadata": { - "colab": { - "autoexec": { - "startup": false, - "wait_interval": 0 - }, - "base_uri": "https://localhost:8080/", - "height": 153 - }, - "colab_type": "code", - "executionInfo": { - "elapsed": 361, - "status": "ok", - "timestamp": 1526674830777, - "user": { - "displayName": "Alexandre Passos", - "photoUrl": "//lh4.googleusercontent.com/-kmTTWXEgAPw/AAAAAAAAAAI/AAAAAAAAAC0/q_DoOzKGwds/s50-c-k-no/photo.jpg", - "userId": "108023195365833072773" - }, - "user_tz": 420 - }, - "id": "L9frk7Ur4uvJ", - "outputId": "882e9076-b6d9-4380-bb1e-7c6b57d54c39" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "\u003ctf.Tensor: id=1423, shape=(1, 2, 3, 3), dtype=float32, numpy=\n", - "array([[[[0., 0., 0.],\n", - " [0., 0., 0.],\n", - " [0., 0., 0.]],\n", - "\n", - " [[0., 0., 0.],\n", - " [0., 0., 0.],\n", - " [0., 0., 0.]]]], dtype=float32)\u003e" - ] - }, - "execution_count": 26, - "metadata": { - "tags": [] - }, - "output_type": "execute_result" - } - ], - "source": [ - " my_seq = tf.keras.Sequential([tf.keras.layers.Conv2D(1, (1, 1)),\n", - " tf.keras.layers.BatchNormalization(),\n", - " tf.keras.layers.Conv2D(2, 1, \n", - " padding='same'),\n", - " tf.keras.layers.BatchNormalization(),\n", - " tf.keras.layers.Conv2D(3, (1, 1)),\n", - " tf.keras.layers.BatchNormalization()])\n", - "my_seq(tf.zeros([1, 2, 3, 3]))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "colab_type": "text", - "id": "c5YwYcnuK-wc" - }, - "source": [ - "# Next steps\n", - "\n", - "Now you can go back to the previous notebook and adapt the linear regression example to use layers and models to be better structured." - ] - } - ], - "metadata": { - "colab": { - "collapsed_sections": [], - "default_view": {}, - "name": "4 - High level API - TensorFlow Eager.ipynb", - "provenance": [], - "version": "0.3.2", - "views": {} - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" - } - }, - "nbformat": 4, - "nbformat_minor": 0 -} diff --git a/tensorflow/contrib/eager/python/examples/notebooks/README.md b/tensorflow/contrib/eager/python/examples/notebooks/README.md new file mode 100644 index 0000000000..0d5ed84894 --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/notebooks/README.md @@ -0,0 +1,11 @@ +## Research and experimentation + +Eager execution provides an imperative, define-by-run interface for advanced +operations. Write custom layers, forward passes, and training loops with auto +differentiation. Start with these notebooks, then read the +[eager execution guide](https://www.tensorflow.org/guide/eager). + +1. [Eager execution basics](./eager_basics.ipynb) +2. [Automatic differentiation and gradient tapes](./automatic_differentiation.ipynb) +3. [Custom training: basics](./custom_training.ipynb) +4. 
[Custom layers](./custom_layers.ipynb) diff --git a/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb new file mode 100644 index 0000000000..a18882fafa --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb @@ -0,0 +1,364 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "automatic_differentiation.ipynb", + "version": "0.3.2", + "views": {}, + "default_view": {}, + "provenance": [], + "private_outputs": true, + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "metadata": { + "id": "t09eeeR5prIJ", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "##### Copyright 2018 The TensorFlow Authors." + ] + }, + { + "metadata": { + "id": "GCCk8_dHpuNf", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "form" + }, + "cell_type": "code", + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "xh8WkEwWpnm7", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "# Automatic differentiation and gradient tape" + ] + }, + { + "metadata": { + "id": "idv0bPeCp325", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + "View source on GitHub
" + ] + }, + { + "metadata": { + "id": "vDJ4XzMqodTy", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "In the previous tutorial we introduced `Tensor`s and operations on them. In this tutorial we will cover [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation), a key technique for optimizing machine learning models." + ] + }, + { + "metadata": { + "id": "GQJysDM__Qb0", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Setup\n" + ] + }, + { + "metadata": { + "id": "OiMPZStlibBv", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "import tensorflow as tf\n", + "tf.enable_eager_execution()\n", + "\n", + "tfe = tf.contrib.eager # Shorthand for some symbols" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "1CLWJl0QliB0", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Derivatives of a function\n", + "\n", + "TensorFlow provides APIs for automatic differentiation - computing the derivative of a function. The way that more closely mimics the math is to encapsulate the computation in a Python function, say `f`, and use `tfe.gradients_function` to create a function that computes the derivatives of `f` with respect to its arguments. If you're familiar with [autograd](https://github.com/HIPS/autograd) for differentiating numpy functions, this will be familiar. For example: " + ] + }, + { + "metadata": { + "id": "9FViq92UX7P8", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "from math import pi\n", + "\n", + "def f(x):\n", + " return tf.square(tf.sin(x))\n", + "\n", + "assert f(pi/2).numpy() == 1.0\n", + "\n", + "\n", + "# grad_f will return a list of derivatives of f\n", + "# with respect to its arguments. Since f() has a single argument,\n", + "# grad_f will return a list with a single element.\n", + "grad_f = tfe.gradients_function(f)\n", + "assert tf.abs(grad_f(pi/2)[0]).numpy() < 1e-7" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "v9fPs8RyopCf", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Higher-order gradients\n", + "\n", + "The same API can be used to differentiate as many times as you like:\n" + ] + }, + { + "metadata": { + "id": "3D0ZvnGYo0rW", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def f(x):\n", + " return tf.square(tf.sin(x))\n", + "\n", + "def grad(f):\n", + " return lambda x: tfe.gradients_function(f)(x)[0]\n", + "\n", + "x = tf.lin_space(-2*pi, 2*pi, 100) # 100 points between -2π and +2π\n", + "\n", + "import matplotlib.pyplot as plt\n", + "\n", + "plt.plot(x, f(x), label=\"f\")\n", + "plt.plot(x, grad(f)(x), label=\"first derivative\")\n", + "plt.plot(x, grad(grad(f))(x), label=\"second derivative\")\n", + "plt.plot(x, grad(grad(grad(f)))(x), label=\"third derivative\")\n", + "plt.legend()\n", + "plt.show()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "-39gouo7mtgu", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Gradient tapes\n", + "\n", + "Every differentiable TensorFlow operation has an associated gradient function. For example, the gradient function of `tf.square(x)` would be a function that returns `2.0 * x`. 
To compute the gradient of a user-defined function (like `f(x)` in the example above), TensorFlow first \"records\" all the operations applied to compute the output of the function. We call this record a \"tape\". It then uses that tape and the gradients functions associated with each primitive operation to compute the gradients of the user-defined function using [reverse mode differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation).\n", + "\n", + "Since operations are recorded as they are executed, Python control flow (using `if`s and `while`s for example) is naturally handled:\n", + "\n" + ] + }, + { + "metadata": { + "id": "MH0UfjympWf7", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def f(x, y):\n", + " output = 1\n", + " for i in range(y):\n", + " output = tf.multiply(output, x)\n", + " return output\n", + "\n", + "def g(x, y):\n", + " # Return the gradient of `f` with respect to it's first parameter\n", + " return tfe.gradients_function(f)(x, y)[0]\n", + "\n", + "assert f(3.0, 2).numpy() == 9.0 # f(x, 2) is essentially x * x\n", + "assert g(3.0, 2).numpy() == 6.0 # And its gradient will be 2 * x\n", + "assert f(4.0, 3).numpy() == 64.0 # f(x, 3) is essentially x * x * x\n", + "assert g(4.0, 3).numpy() == 48.0 # And its gradient will be 3 * x * x" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "aNmR5-jhpX2t", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "At times it may be inconvenient to encapsulate computation of interest into a function. For example, if you want the gradient of the output with respect to intermediate values computed in the function. In such cases, the slightly more verbose but explicit [tf.GradientTape](https://www.tensorflow.org/api_docs/python/tf/GradientTape) context is useful. All computation inside the context of a `tf.GradientTape` is \"recorded\".\n", + "\n", + "For example:" + ] + }, + { + "metadata": { + "id": "bAFeIE8EuVIq", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "x = tf.ones((2, 2))\n", + " \n", + "# TODO(b/78880779): Remove the 'persistent=True' argument and use\n", + "# a single t.gradient() call when the bug is resolved.\n", + "with tf.GradientTape(persistent=True) as t:\n", + " # TODO(ashankar): Explain with \"watch\" argument better?\n", + " t.watch(x)\n", + " y = tf.reduce_sum(x)\n", + " z = tf.multiply(y, y)\n", + "\n", + "# Use the same tape to compute the derivative of z with respect to the\n", + "# intermediate value y.\n", + "dz_dy = t.gradient(z, y)\n", + "assert dz_dy.numpy() == 8.0\n", + "\n", + "# Derivative of z with respect to the original input tensor x\n", + "dz_dx = t.gradient(z, x)\n", + "for i in [0, 1]:\n", + " for j in [0, 1]:\n", + " assert dz_dx[i][j].numpy() == 8.0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "DK05KXrAAld3", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Higher-order gradients\n", + "\n", + "Operations inside of the `GradientTape` context manager are recorded for automatic differentiation. If gradients are computed in that context, then the gradient computation is recorded as well. As a result, the exact same API works for higher-order gradients as well. 
For example:" + ] + }, + { + "metadata": { + "id": "cPQgthZ7ugRJ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# TODO(ashankar): Should we use the persistent tape here instead? Follow up on Tom and Alex's discussion\n", + "\n", + "x = tf.constant(1.0) # Convert the Python 1.0 to a Tensor object\n", + "\n", + "with tf.GradientTape() as t:\n", + " with tf.GradientTape() as t2:\n", + " t2.watch(x)\n", + " y = x * x * x\n", + " # Compute the gradient inside the 't' context manager\n", + " # which means the gradient computation is differentiable as well.\n", + " dy_dx = t2.gradient(y, x)\n", + "d2y_dx2 = t.gradient(dy_dx, x)\n", + "\n", + "assert dy_dx.numpy() == 3.0\n", + "assert d2y_dx2.numpy() == 6.0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "4U1KKzUpNl58", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Next Steps\n", + "\n", + "In this tutorial we covered gradient computation in TensorFlow. With that we have enough of the primitives required to build an train neural networks, which we will cover in the [next tutorial](https://github.com/tensorflow/models/tree/master/official/contrib/eager/python/examples/notebooks/3_neural_networks.ipynb)." + ] + } + ] +} \ No newline at end of file diff --git a/tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb new file mode 100644 index 0000000000..54fbf2a7e1 --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb @@ -0,0 +1,399 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "custom_layers.ipynb", + "version": "0.3.2", + "views": {}, + "default_view": {}, + "provenance": [], + "private_outputs": true, + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "cells": [ + { + "metadata": { + "id": "tDnwEv8FtJm7", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "##### Copyright 2018 The TensorFlow Authors." + ] + }, + { + "metadata": { + "id": "JlknJBWQtKkI", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "form" + }, + "cell_type": "code", + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "60RdWsg1tETW", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "# Custom layers" + ] + }, + { + "metadata": { + "id": "BcJg7Enms86w", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + "View source on GitHub
" + ] + }, + { + "metadata": { + "id": "UEu3q4jmpKVT", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "We recommend using `tf.keras` as a high-level API for building neural networks. That said, most TensorFlow APIs are usable with eager execution.\n" + ] + }, + { + "metadata": { + "id": "pwX7Fii1rwsJ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "import tensorflow as tf\n", + "tfe = tf.contrib.eager\n", + "\n", + "tf.enable_eager_execution()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "zSFfVVjkrrsI", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Layers: common sets of useful operations\n", + "\n", + "Most of the time when writing code for machine learning models you want to operate at a higher level of abstraction than individual operations and manipulation of individual variables.\n", + "\n", + "Many machine learning models are expressible as the composition and stacking of relatively simple layers, and TensorFlow provides both a set of many common layers as a well as easy ways for you to write your own application-specific layers either from scratch or as the composition of existing layers.\n", + "\n", + "TensorFlow includes the full [Keras](https://keras.io) API in the tf.keras package, and the Keras layers are very useful when building your own models.\n" + ] + }, + { + "metadata": { + "id": "8PyXlPl-4TzQ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# In the tf.keras.layers package, layers are objects. To construct a layer,\n", + "# simply construct the object. Most layers take as a first argument the number\n", + "# of output dimensions / channels.\n", + "layer = tf.keras.layers.Dense(100)\n", + "# The number of input dimensions is often unnecessary, as it can be inferred\n", + "# the first time the layer is used, but it can be provided if you want to \n", + "# specify it manually, which is useful in some complex models.\n", + "layer = tf.keras.layers.Dense(10, input_shape=(None, 5))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "Fn69xxPO5Psr", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "The full list of pre-existing layers can be seen in [the documentation](https://www.tensorflow.org/api_docs/python/tf/keras/layers). It includes Dense (a fully-connected layer),\n", + "Conv2D, LSTM, BatchNormalization, Dropout, and many others." + ] + }, + { + "metadata": { + "id": "E3XKNknP5Mhb", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# To use a layer, simply call it.\n", + "layer(tf.zeros([10, 5]))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "Wt_Nsv-L5t2s", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Layers have many useful methods. For example, you can inspect all variables\n", + "# in a layer by calling layer.variables. 
In this case a fully-connected layer\n", + "# will have variables for weights and biases.\n", + "layer.variables" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "6ilvKjz8_4MQ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# The variables are also accessible through nice accessors\n", + "layer.kernel, layer.bias" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "O0kDbE54-5VS", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Implementing custom layers\n", + "The best way to implement your own layer is extending the tf.keras.Layer class and implementing:\n", + " * `__init__` , where you can do all input-independent initialization\n", + " * `build`, where you know the shapes of the input tensors and can do the rest of the initialization\n", + " * `call`, where you do the forward computation\n", + "\n", + "Note that you don't have to wait until `build` is called to create your variables, you can also create them in `__init__`. However, the advantage of creating them in `build` is that it enables late variable creation based on the shape of the inputs the layer will operate on. On the other hand, creating variables in `__init__` would mean that shapes required to create the variables will need to be explicitly specified." + ] + }, + { + "metadata": { + "id": "5Byl3n1k5kIy", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "class MyDenseLayer(tf.keras.layers.Layer):\n", + " def __init__(self, num_outputs):\n", + " super(MyDenseLayer, self).__init__()\n", + " self.num_outputs = num_outputs\n", + " \n", + " def build(self, input_shape):\n", + " self.kernel = self.add_variable(\"kernel\", \n", + " shape=[input_shape[-1].value, \n", + " self.num_outputs])\n", + " \n", + " def call(self, input):\n", + " return tf.matmul(input, self.kernel)\n", + " \n", + "layer = MyDenseLayer(10)\n", + "print(layer(tf.zeros([10, 5])))\n", + "print(layer.variables)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "tk8E2vY0-z4Z", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Note that you don't have to wait until `build` is called to create your variables, you can also create them in `__init__`.\n", + "\n", + "Overall code is easier to read and maintain if it uses standard layers whenever possible, as other readers will be familiar with the behavior of standard layers. If you want to use a layer which is not present in tf.keras.layers or tf.contrib.layers, consider filing a [github issue](http://github.com/tensorflow/tensorflow/issues/new) or, even better, sending us a pull request!" + ] + }, + { + "metadata": { + "id": "Qhg4KlbKrs3G", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Models: composing layers\n", + "\n", + "Many interesting layer-like things in machine learning models are implemented by composing existing layers. For example, each residual block in a resnet is a composition of convolutions, batch normalizations, and a shortcut.\n", + "\n", + "The main class used when creating a layer-like thing which contains other layers is tf.keras.Model. Implementing one is done by inheriting from tf.keras.Model." 
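As a warm-up for the residual block below, here is a minimal sketch of the subclassing pattern (the `TwoLayerNet` model is hypothetical, for illustration only):

```python
class TwoLayerNet(tf.keras.Model):
  """Minimal sketch: a Model that simply chains two Dense layers."""

  def __init__(self):
    super(TwoLayerNet, self).__init__(name='')
    self.dense1 = tf.keras.layers.Dense(10, activation=tf.nn.relu)
    self.dense2 = tf.keras.layers.Dense(5)

  def call(self, inputs):
    # Layers assigned as attributes are tracked by the Model, so the
    # variables of both Dense layers appear in self.variables.
    return self.dense2(self.dense1(inputs))

net = TwoLayerNet()
print(net(tf.zeros([3, 8])).shape)  # => (3, 5)
```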
+ ] + }, + { + "metadata": { + "id": "N30DTXiRASlb", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "class ResnetIdentityBlock(tf.keras.Model):\n", + " def __init__(self, kernel_size, filters):\n", + " super(ResnetIdentityBlock, self).__init__(name='')\n", + " filters1, filters2, filters3 = filters\n", + "\n", + " self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))\n", + " self.bn2a = tf.keras.layers.BatchNormalization()\n", + "\n", + " self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')\n", + " self.bn2b = tf.keras.layers.BatchNormalization()\n", + "\n", + " self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))\n", + " self.bn2c = tf.keras.layers.BatchNormalization()\n", + "\n", + " def call(self, input_tensor, training=False):\n", + " x = self.conv2a(input_tensor)\n", + " x = self.bn2a(x, training=training)\n", + " x = tf.nn.relu(x)\n", + "\n", + " x = self.conv2b(x)\n", + " x = self.bn2b(x, training=training)\n", + " x = tf.nn.relu(x)\n", + "\n", + " x = self.conv2c(x)\n", + " x = self.bn2c(x, training=training)\n", + "\n", + " x += input_tensor\n", + " return tf.nn.relu(x)\n", + "\n", + " \n", + "block = ResnetIdentityBlock(1, [1, 2, 3])\n", + "print(block(tf.zeros([1, 2, 3, 3])))\n", + "print([x.name for x in block.variables])" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "wYfucVw65PMj", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Much of the time, however, models which compose many layers simply call one layer after the other. This can be done in very little code using tf.keras.Sequential" + ] + }, + { + "metadata": { + "id": "L9frk7Ur4uvJ", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + " my_seq = tf.keras.Sequential([tf.keras.layers.Conv2D(1, (1, 1)),\n", + " tf.keras.layers.BatchNormalization(),\n", + " tf.keras.layers.Conv2D(2, 1, \n", + " padding='same'),\n", + " tf.keras.layers.BatchNormalization(),\n", + " tf.keras.layers.Conv2D(3, (1, 1)),\n", + " tf.keras.layers.BatchNormalization()])\n", + "my_seq(tf.zeros([1, 2, 3, 3]))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "c5YwYcnuK-wc", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "# Next steps\n", + "\n", + "Now you can go back to the previous notebook and adapt the linear regression example to use layers and models to be better structured." + ] + } + ] +} \ No newline at end of file diff --git a/tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb new file mode 100644 index 0000000000..0a781d2153 --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb @@ -0,0 +1,478 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "Custom training: basics", + "version": "0.3.2", + "views": {}, + "default_view": {}, + "provenance": [], + "private_outputs": true, + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "metadata": { + "id": "5rmpybwysXGV", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "##### Copyright 2018 The TensorFlow Authors." 
+ ] + }, + { + "metadata": { + "id": "m8y3rGtQsYP2", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "form" + }, + "cell_type": "code", + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "hrXv0rU9sIma", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "# Custom training: basics" + ] + }, + { + "metadata": { + "id": "7S0BwJ_8sLu7", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + "View source on GitHub
" + ] + }, + { + "metadata": { + "id": "k2o3TTG4TFpt", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "In the previous tutorial we covered the TensorFlow APIs for automatic differentiation, a basic building block for machine learning.\n", + "In this tutorial we will use the TensorFlow primitives introduced in the prior tutorials to do some simple machine learning.\n", + "\n", + "TensorFlow also includes a higher-level neural networks API (`tf.keras`) which provides useful abstractions to reduce boilerplate. We strongly recommend those higher level APIs for people working with neural networks. However, in this short tutorial we cover neural network training from first principles to establish a strong foundation." + ] + }, + { + "metadata": { + "id": "3LXMVuV0VhDr", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Setup" + ] + }, + { + "metadata": { + "id": "PJ64L90aVir3", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "import tensorflow as tf\n", + "tfe = tf.contrib.eager # Shorthand for some symbols\n", + "\n", + "tf.enable_eager_execution()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "eMAWbDJFVmMk", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Variables\n", + "\n", + "Tensors in TensorFlow are immutable stateless objects. Machine learning models, however, need to have changing state: as your model trains, the same code to compute predictions should behave differently over time (hopefully with a lower loss!). To represent this state which needs to change over the course of your computation, you can choose to rely on the fact that Python is a stateful programming language:\n" + ] + }, + { + "metadata": { + "id": "VkJwtLS_Jbn8", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "# Using python state\n", + "x = tf.zeros([10, 10])\n", + "x += 2 # This is equivalent to x = x + 2, which does not mutate the original\n", + " # value of x\n", + "print(x)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "wfneTXy7JcUz", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "TensorFlow, however, has stateful operations built in, and these are often more pleasant to use than low-level Python representations of your state. To represent weights in a model, for example, it's often convenient and efficient to use TensorFlow variables.\n", + "\n", + "A Variable is an object which stores a value and, when used in a TensorFlow computation, will implicitly read from this stored value. There are operations (`tf.assign_sub`, `tf.scatter_update`, etc) which manipulate the value stored in a TensorFlow variable." 
+ ] + }, + { + "metadata": { + "id": "itxmrMil6DQi", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "v = tfe.Variable(1.0)\n", + "assert v.numpy() == 1.0\n", + "\n", + "# Re-assign the value\n", + "v.assign(3.0)\n", + "assert v.numpy() == 3.0\n", + "\n", + "# Use `v` in a TensorFlow operation like tf.square() and reassign\n", + "v.assign(tf.square(v))\n", + "assert v.numpy() == 9.0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "-paSaeq1JzwC", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Computations using Variables are automatically traced when computing gradients. For Variables representing embeddings, TensorFlow will do sparse updates by default, which are more computation- and memory-efficient.\n", + "\n", + "Using Variables is also a way to quickly let a reader of your code know that this piece of state is mutable." + ] + }, + { + "metadata": { + "id": "BMiFcDzE7Qu3", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Example: Fitting a linear model\n", + "\n", + "Let's now use the concepts we have so far (`Tensor`, `GradientTape`, `Variable`) to build and train a simple model. This typically involves a few steps:\n", + "\n", + "1. Define the model.\n", + "2. Define a loss function.\n", + "3. Obtain training data.\n", + "4. Run through the training data and use an \"optimizer\" to adjust the variables to fit the data.\n", + "\n", + "In this tutorial, we'll walk through a trivial example of a simple linear model: `f(x) = x * W + b`, which has two variables: `W` and `b`. Furthermore, we'll synthesize data such that a well-trained model would have `W = 3.0` and `b = 2.0`." + ] + }, + { + "metadata": { + "id": "gFzH64Jn9PIm", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Define the model\n", + "\n", + "Let's define a simple class to encapsulate the variables and the computation." + ] + }, + { + "metadata": { + "id": "_WRu7Pze7wk8", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "class Model(object):\n", + " def __init__(self):\n", + " # Initialize the variables to (5.0, 0.0)\n", + " # In practice, these should be initialized to random values.\n", + " self.W = tfe.Variable(5.0)\n", + " self.b = tfe.Variable(0.0)\n", + " \n", + " def __call__(self, x):\n", + " return self.W * x + self.b\n", + " \n", + "model = Model()\n", + "\n", + "assert model(3.0).numpy() == 15.0" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "xa6j_yXa-j79", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Define a loss function\n", + "\n", + "A loss function measures how well the output of a model for a given input matches the desired output. Let's use the standard L2 loss." + ] + }, + { + "metadata": { + "id": "Y0ysUFGY924U", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def loss(predicted_y, desired_y):\n", + " return tf.reduce_mean(tf.square(predicted_y - desired_y))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "qutT_fkl_CBc", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Obtain training data\n", + "\n", + "Let's synthesize the training data with some noise."
+ ] + }, + { + "metadata": { + "id": "gxPTb-kt_N5m", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "TRUE_W = 3.0\n", + "TRUE_b = 2.0\n", + "NUM_EXAMPLES = 1000\n", + "\n", + "inputs = tf.random_normal(shape=[NUM_EXAMPLES])\n", + "noise = tf.random_normal(shape=[NUM_EXAMPLES])\n", + "outputs = inputs * TRUE_W + TRUE_b + noise" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "-50nq-wPBsAW", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Before we train the model let's visualize where the model stands right now. We'll plot the model's predictions in red and the training data in blue." + ] + }, + { + "metadata": { + "id": "_eb83LtrB4nt", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "import matplotlib.pyplot as plt\n", + "\n", + "plt.scatter(inputs, outputs, c='b')\n", + "plt.scatter(inputs, model(inputs), c='r')\n", + "plt.show()\n", + "\n", + "print('Current loss: '),\n", + "print(loss(model(inputs), outputs).numpy())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "sSDP-yeq_4jE", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Define a training loop\n", + "\n", + "We now have our network and our training data. Let's train it, i.e., use the training data to update the model's variables (`W` and `b`) so that the loss goes down using [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent). There are many variants of the gradient descent scheme that are captured in `tf.train.Optimizer` implementations. We'd highly recommend using those implementations, but in the spirit of building from first principles, in this particular example we will implement the basic math ourselves." + ] + }, + { + "metadata": { + "id": "MBIACgdnA55X", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def train(model, inputs, outputs, learning_rate):\n", + " with tf.GradientTape() as t:\n", + " current_loss = loss(model(inputs), outputs)\n", + " dW, db = t.gradient(current_loss, [model.W, model.b])\n", + " model.W.assign_sub(learning_rate * dW)\n", + " model.b.assign_sub(learning_rate * db)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "RwWPaJryD2aN", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Finally, let's repeatedly run through the training data and see how `W` and `b` evolve." 
+ ] + }, + { + "metadata": { + "id": "XdfkR223D9dW", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "model = Model()\n", + "\n", + "# Collect the history of W-values and b-values to plot later\n", + "Ws, bs = [], []\n", + "epochs = range(10)\n", + "for epoch in epochs:\n", + " Ws.append(model.W.numpy())\n", + " bs.append(model.b.numpy())\n", + " current_loss = loss(model(inputs), outputs)\n", + "\n", + " train(model, inputs, outputs, learning_rate=0.1)\n", + " print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %\n", + " (epoch, Ws[-1], bs[-1], current_loss))\n", + "\n", + "# Let's plot it all\n", + "plt.plot(epochs, Ws, 'r',\n", + " epochs, bs, 'b')\n", + "plt.plot([TRUE_W] * len(epochs), 'r--',\n", + " [TRUE_b] * len(epochs), 'b--')\n", + "plt.legend(['W', 'b', 'true W', 'true b'])\n", + "plt.show()\n", + " " + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "vPnIVuaSJwWz", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Next Steps\n", + "\n", + "In this tutorial we covered `Variable`s and built and trained a simple linear model using the TensorFlow primitives discussed so far.\n", + "\n", + "In theory, this is pretty much all you need to use TensorFlow for your machine learning research.\n", + "In practice, particularly for neural networks, the higher level APIs like `tf.keras` will be much more convenient, since they provide higher level building blocks (called \"layers\"), utilities to save and restore state, a suite of loss functions, a suite of optimization strategies, etc.\n", + "\n", + "The [next tutorial](TODO) will cover these higher level APIs." + ] + } + ] +} \ No newline at end of file diff --git a/tensorflow/contrib/eager/python/examples/notebooks/eager_basics.ipynb b/tensorflow/contrib/eager/python/examples/notebooks/eager_basics.ipynb new file mode 100644 index 0000000000..b37a18c9a6 --- /dev/null +++ b/tensorflow/contrib/eager/python/examples/notebooks/eager_basics.ipynb @@ -0,0 +1,491 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "name": "eager_basics.ipynb", + "version": "0.3.2", + "views": {}, + "default_view": {}, + "provenance": [], + "private_outputs": true, + "collapsed_sections": [], + "toc_visible": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + } + }, + "cells": [ + { + "metadata": { + "id": "iPpI7RaYoZuE", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "##### Copyright 2018 The TensorFlow Authors." + ] + }, + { + "metadata": { + "id": "hro2InpHobKk", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "form" + }, + "cell_type": "code", + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License."
+ ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "U9i2Dsh-ziXr", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "# Eager execution basics" + ] + }, + { + "metadata": { + "id": "Hndw-YcxoOJK", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "
\n", + "\n", + " Run in Google Colab\n", + "\n", + "View source on GitHub
" + ] + }, + { + "metadata": { + "id": "6sILUVbHoSgH", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "This is an introductory tutorial for using TensorFlow. It will cover:\n", + "\n", + "* Importing required packages\n", + "* Creating and using Tensors\n", + "* Using GPU acceleration\n", + "* Datasets" + ] + }, + { + "metadata": { + "id": "z1JcS5iBXMRO", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Import TensorFlow\n", + "\n", + "To get started, import the `tensorflow` module and enable eager execution.\n", + "Eager execution enables a more interactive frontend to TensorFlow, the details of which we will discuss much later." + ] + }, + { + "metadata": { + "id": "RlIWhyeLoYnG", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "code" + }, + "cell_type": "code", + "source": [ + "import tensorflow as tf\n", + "\n", + "tf.enable_eager_execution()" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "H9UySOPLXdaw", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Tensors\n", + "\n", + "A Tensor is a multi-dimensional array. Similar to NumPy `ndarray` objects, `Tensor` objects have a data type and a shape. Additionally, Tensors can reside in accelerator (like GPU) memory. TensorFlow offers a rich library of operations ([tf.add](https://www.tensorflow.org/api_docs/python/tf/add), [tf.matmul](https://www.tensorflow.org/api_docs/python/tf/matmul), [tf.linalg.inv](https://www.tensorflow.org/api_docs/python/tf/linalg/inv) etc.) that consume and produce Tensors. These operations automatically convert native Python types. For example:\n" + ] + }, + { + "metadata": { + "id": "ngUe237Wt48W", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "code" + }, + "cell_type": "code", + "source": [ + "print(tf.add(1, 2))\n", + "print(tf.add([1, 2], [3, 4]))\n", + "print(tf.square(5))\n", + "print(tf.reduce_sum([1, 2, 3]))\n", + "print(tf.encode_base64(\"hello world\"))\n", + "\n", + "# Operator overloading is also supported\n", + "print(tf.square(2) + tf.square(3))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "IDY4WsYRhP81", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "Each Tensor has a shape and a datatype" + ] + }, + { + "metadata": { + "id": "srYWH1MdJNG7", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "x = tf.matmul([[1]], [[2, 3]])\n", + "print(x.shape)\n", + "print(x.dtype)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "eBPw8e8vrsom", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "The most obvious differences between NumPy arrays and TensorFlow Tensors are:\n", + "\n", + "1. Tensors can be backed by accelerator memory (like GPU, TPU).\n", + "2. Tensors are immutable." 
+ ] + }, + { + "metadata": { + "id": "Dwi1tdW3JBw6", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### NumPy Compatibility\n", + "\n", + "Conversion between TensorFlow Tensors and NumPy ndarrays is quite simple:\n", + "* TensorFlow operations automatically convert NumPy ndarrays to Tensors.\n", + "* NumPy operations automatically convert Tensors to NumPy ndarrays.\n", + "\n", + "Tensors can be explicitly converted to NumPy ndarrays by invoking the `.numpy()` method on them.\n", + "These conversions are typically cheap as the array and Tensor share the underlying memory representation if possible. However, sharing the underlying representation isn't always possible since the Tensor may be hosted in GPU memory while NumPy arrays are always backed by host memory, and the conversion will thus involve a copy from GPU to host memory." + ] + }, + { + "metadata": { + "id": "lCUWzso6mbqR", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "import numpy as np\n", + "\n", + "ndarray = np.ones([3, 3])\n", + "\n", + "print(\"TensorFlow operations convert numpy arrays to Tensors automatically\")\n", + "tensor = tf.multiply(ndarray, 42)\n", + "print(tensor)\n", + "\n", + "\n", + "print(\"And NumPy operations convert Tensors to numpy arrays automatically\")\n", + "print(np.add(tensor, 1))\n", + "\n", + "print(\"The .numpy() method explicitly converts a Tensor to a numpy array\")\n", + "print(tensor.numpy())" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "PBNP8yTRfu_X", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## GPU acceleration\n", + "\n", + "Many TensorFlow operations can be accelerated by using the GPU for computation. Without any annotations, TensorFlow automatically decides whether to use the GPU or CPU for an operation (and copies the tensor between CPU and GPU memory if necessary). Tensors produced by an operation are typically backed by the memory of the device on which the operation executed. For example:" + ] + }, + { + "metadata": { + "id": "3Twf_Rw-gQFM", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + }, + "cellView": "code" + }, + "cell_type": "code", + "source": [ + "x = tf.random_uniform([3, 3])\n", + "\n", + "print(\"Is there a GPU available: \"),\n", + "print(tf.test.is_gpu_available())\n", + "\n", + "print(\"Is the Tensor on GPU #0: \"),\n", + "print(x.device.endswith('GPU:0'))" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "vpgYzgVXW2Ud", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Device Names\n", + "\n", + "The `Tensor.device` property provides a fully qualified string name of the device hosting the contents of the Tensor. This name encodes many details, such as an identifier of the network address of the host on which this program is executing and the device within that host. This is required for distributed execution of TensorFlow programs, but we'll skip that for now. The string will end with `GPU:<N>` if the tensor is placed on the `N`-th GPU on the host." + ] + }, + { + "metadata": { + "id": "ZWZQCimzuqyP", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "\n", + "\n", + "### Explicit Device Placement\n", + "\n", + "The term \"placement\" in TensorFlow refers to how individual operations are assigned to (placed on) a device for execution. 
As mentioned above, when there is no explicit guidance provided, TensorFlow automatically decides on which device to execute an operation, and copies Tensors to that device if needed. However, TensorFlow operations can be explicitly placed on specific devices using the `tf.device` context manager. For example:" + ] + }, + { + "metadata": { + "id": "RjkNZTuauy-Q", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "def time_matmul(x):\n", + " %timeit tf.matmul(x, x)\n", + "\n", + "# Force execution on CPU\n", + "print(\"On CPU:\")\n", + "with tf.device(\"CPU:0\"):\n", + " x = tf.random_uniform([1000, 1000])\n", + " assert x.device.endswith(\"CPU:0\")\n", + " time_matmul(x)\n", + "\n", + "# Force execution on GPU #0 if available\n", + "if tf.test.is_gpu_available():\n", + " with tf.device(\"GPU:0\"): # Or GPU:1 for the 2nd GPU, GPU:2 for the 3rd etc.\n", + " x = tf.random_uniform([1000, 1000])\n", + " assert x.device.endswith(\"GPU:0\")\n", + " time_matmul(x)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "o1K4dlhhHtQj", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "## Datasets\n", + "\n", + "This section demonstrates the use of the [`tf.data.Dataset` API](https://www.tensorflow.org/guide/datasets) to build pipelines to feed data to your model. It covers:\n", + "\n", + "* Creating a `Dataset`.\n", + "* Iteration over a `Dataset` with eager execution enabled.\n", + "\n", + "We recommend using the `Dataset`s API for building performant, complex input pipelines from simple, re-usable pieces that will feed your model's training or evaluation loops.\n", + "\n", + "If you're familiar with TensorFlow graphs, the API for constructing the `Dataset` object remains exactly the same when eager execution is enabled, but the process of iterating over elements of the dataset is slightly simpler.\n", + "You can use Python iteration over the `tf.data.Dataset` object and do not need to explicitly create a `tf.data.Iterator` object.\n", + "As a result, the discussion on iterators in the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets) is not relevant when eager execution is enabled." + ] + }, + { + "metadata": { + "id": "zI0fmOynH-Ne", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Create a source `Dataset`\n", + "\n", + "Create a _source_ dataset using one of the factory functions like [`Dataset.from_tensors`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensors) or [`Dataset.from_tensor_slices`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_tensor_slices), or using objects that read from files like [`TextLineDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TextLineDataset) or [`TFRecordDataset`](https://www.tensorflow.org/api_docs/python/tf/data/TFRecordDataset). See the [TensorFlow Guide](https://www.tensorflow.org/guide/datasets#reading_input_data) for more information."
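(Supplementary sketch, not part of the patch: `Dataset.from_tensors` is named above but not used in the creation cell that follows, so this contrasts it with `Dataset.from_tensor_slices`; assumes TensorFlow 1.x with eager execution enabled.)

```python
import tensorflow as tf

tf.enable_eager_execution()

t = tf.constant([[1, 2], [3, 4]])

# from_tensors wraps the whole tensor as a single dataset element...
ds_whole = tf.data.Dataset.from_tensors(t)
# ...while from_tensor_slices slices it along the first axis.
ds_slices = tf.data.Dataset.from_tensor_slices(t)

for elem in ds_whole:
  print(elem.shape)   # (2, 2): one element
for elem in ds_slices:
  print(elem.shape)   # (2,): two elements, one per row
```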
+ ] + }, + { + "metadata": { + "id": "F04fVOHQIBiG", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "ds_tensors = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])\n", + "\n", + "# Create a text file to read back with TextLineDataset\n", + "import tempfile\n", + "_, filename = tempfile.mkstemp()\n", + "\n", + "with open(filename, 'w') as f:\n", + " f.write(\"\"\"Line 1\n", + "Line 2\n", + "Line 3\n", + " \"\"\")\n", + "\n", + "ds_file = tf.data.TextLineDataset(filename)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "vbxIhC-5IPdf", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Apply transformations\n", + "\n", + "Use transformation functions like [`map`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#map), [`batch`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#batch), and [`shuffle`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#shuffle) to apply transformations to the records of the dataset. See the [API documentation for `tf.data.Dataset`](https://www.tensorflow.org/api_docs/python/tf/data/Dataset) for details." + ] + }, + { + "metadata": { + "id": "uXSDZWE-ISsd", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "ds_tensors = ds_tensors.map(tf.square).shuffle(2).batch(2)\n", + "\n", + "ds_file = ds_file.batch(2)" + ], + "execution_count": 0, + "outputs": [] + }, + { + "metadata": { + "id": "A8X1GNfoIZKJ", + "colab_type": "text" + }, + "cell_type": "markdown", + "source": [ + "### Iterate\n", + "\n", + "When eager execution is enabled, `Dataset` objects support iteration.\n", + "If you're familiar with the use of `Dataset`s in TensorFlow graphs, note that there is no need for `Dataset.make_one_shot_iterator()` or `get_next()` calls."
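(Supplementary sketch, not part of the patch: for contrast with the eager-mode loop in the next cell, this is the graph-mode pattern that, per the text above, is no longer needed; assumes TensorFlow 1.x with eager execution NOT enabled.)

```python
import tensorflow as tf  # Note: eager execution must NOT be enabled here.

ds = tf.data.Dataset.from_tensor_slices([1, 2, 3])
iterator = ds.make_one_shot_iterator()  # Explicit iterator object.
next_element = iterator.get_next()      # A symbolic tensor, not a value.

with tf.Session() as sess:
  for _ in range(3):
    print(sess.run(next_element))       # Prints 1, 2, 3.
```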
+ ] + }, + { + "metadata": { + "id": "ws-WKRk5Ic6-", + "colab_type": "code", + "colab": { + "autoexec": { + "startup": false, + "wait_interval": 0 + } + } + }, + "cell_type": "code", + "source": [ + "print('Elements of ds_tensors:')\n", + "for x in ds_tensors:\n", + " print(x)\n", + "\n", + "print('\\nElements in ds_file:')\n", + "for x in ds_file:\n", + " print(x)" + ], + "execution_count": 0, + "outputs": [] + } + ] +} \ No newline at end of file -- cgit v1.2.3 From 2b13b7ac7253e6f0d7d96855b1b3e7fee49277a7 Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 3 Jul 2018 16:40:14 -0700 Subject: Update docs_src in 1.9 to match master --- tensorflow/docs_src/community/leftnav_files | 1 - tensorflow/docs_src/community/swift.md | 60 -- tensorflow/docs_src/get_started/_index.yaml | 249 ------- .../docs_src/get_started/basic_classification.md | 3 - .../docs_src/get_started/basic_regression.md | 3 - .../get_started/basic_text_classification.md | 3 - tensorflow/docs_src/get_started/eager.md | 3 - tensorflow/docs_src/get_started/leftnav_files | 10 - tensorflow/docs_src/get_started/next_steps.md | 36 - .../docs_src/get_started/overfit_and_underfit.md | 3 - .../get_started/save_and_restore_models.md | 3 - tensorflow/docs_src/guide/custom_estimators.md | 8 +- .../docs_src/guide/datasets_for_estimators.md | 6 +- tensorflow/docs_src/guide/debugger.md | 30 +- tensorflow/docs_src/guide/eager.md | 12 +- tensorflow/docs_src/guide/graphs.md | 2 +- tensorflow/docs_src/guide/keras.md | 24 +- tensorflow/docs_src/guide/saved_model.md | 9 +- .../docs_src/guide/tensorboard_histograms.md | 4 +- tensorflow/docs_src/install/install_c.md | 2 +- tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 24 +- tensorflow/docs_src/install/install_linux.md | 24 +- tensorflow/docs_src/install/install_mac.md | 13 +- tensorflow/docs_src/install/install_raspbian.md | 2 +- tensorflow/docs_src/install/install_sources.md | 36 +- tensorflow/docs_src/install/install_windows.md | 2 +- tensorflow/docs_src/mobile/leftnav_files | 1 + tensorflow/docs_src/mobile/linking_libs.md | 2 +- tensorflow/docs_src/mobile/mobile_intro.md | 3 +- tensorflow/docs_src/mobile/prepare_models.md | 4 +- tensorflow/docs_src/mobile/tflite/demo_android.md | 24 +- tensorflow/docs_src/mobile/tflite/devguide.md | 9 +- tensorflow/docs_src/mobile/tflite/index.md | 17 +- tensorflow/docs_src/mobile/tflite/performance.md | 174 +++++ tensorflow/docs_src/performance/quantization.md | 2 +- .../performance/xla/operation_semantics.md | 39 +- tensorflow/docs_src/tutorials/_index.yaml | 251 +++++++ tensorflow/docs_src/tutorials/_toc.yaml | 93 +++ tensorflow/docs_src/tutorials/audio_recognition.md | 631 ------------------ tensorflow/docs_src/tutorials/deep_cnn.md | 452 ------------- .../tutorials/eager/custom_training_walkthrough.md | 3 + tensorflow/docs_src/tutorials/eager/index.md | 13 + tensorflow/docs_src/tutorials/image_recognition.md | 456 ------------- tensorflow/docs_src/tutorials/image_retraining.md | 4 - tensorflow/docs_src/tutorials/images/deep_cnn.md | 446 +++++++++++++ .../docs_src/tutorials/images/image_recognition.md | 455 +++++++++++++ tensorflow/docs_src/tutorials/images/layers.md | 694 ++++++++++++++++++++ tensorflow/docs_src/tutorials/index.md | 59 -- .../tutorials/keras/basic_classification.md | 3 + .../docs_src/tutorials/keras/basic_regression.md | 3 + .../tutorials/keras/basic_text_classification.md | 3 + tensorflow/docs_src/tutorials/keras/index.md | 22 + .../tutorials/keras/overfit_and_underfit.md | 3 + 
.../tutorials/keras/save_and_restore_models.md | 3 + tensorflow/docs_src/tutorials/kernel_methods.md | 304 --------- tensorflow/docs_src/tutorials/layers.md | 727 --------------------- tensorflow/docs_src/tutorials/leftnav_files | 23 - tensorflow/docs_src/tutorials/linear.md | 237 ------- tensorflow/docs_src/tutorials/mandelbrot.md | 116 ---- tensorflow/docs_src/tutorials/next_steps.md | 36 + tensorflow/docs_src/tutorials/non-ml/mandelbrot.md | 116 ++++ tensorflow/docs_src/tutorials/non-ml/pdes.md | 140 ++++ tensorflow/docs_src/tutorials/pdes.md | 141 ---- tensorflow/docs_src/tutorials/recurrent.md | 232 ------- .../docs_src/tutorials/recurrent_quickdraw.md | 411 ------------ .../tutorials/representation/kernel_methods.md | 304 +++++++++ .../docs_src/tutorials/representation/linear.md | 237 +++++++ .../docs_src/tutorials/representation/wide.md | 461 +++++++++++++ .../tutorials/representation/wide_and_deep.md | 243 +++++++ .../docs_src/tutorials/representation/word2vec.md | 405 ++++++++++++ tensorflow/docs_src/tutorials/seq2seq.md | 5 - .../tutorials/sequences/audio_recognition.md | 631 ++++++++++++++++++ .../docs_src/tutorials/sequences/recurrent.md | 232 +++++++ .../tutorials/sequences/recurrent_quickdraw.md | 411 ++++++++++++ tensorflow/docs_src/tutorials/wide.md | 461 ------------- tensorflow/docs_src/tutorials/wide_and_deep.md | 243 ------- tensorflow/docs_src/tutorials/word2vec.md | 405 ------------ 78 files changed, 5561 insertions(+), 5403 deletions(-) delete mode 100644 tensorflow/docs_src/community/swift.md delete mode 100644 tensorflow/docs_src/get_started/_index.yaml delete mode 100644 tensorflow/docs_src/get_started/basic_classification.md delete mode 100644 tensorflow/docs_src/get_started/basic_regression.md delete mode 100644 tensorflow/docs_src/get_started/basic_text_classification.md delete mode 100644 tensorflow/docs_src/get_started/eager.md delete mode 100644 tensorflow/docs_src/get_started/leftnav_files delete mode 100644 tensorflow/docs_src/get_started/next_steps.md delete mode 100644 tensorflow/docs_src/get_started/overfit_and_underfit.md delete mode 100644 tensorflow/docs_src/get_started/save_and_restore_models.md create mode 100644 tensorflow/docs_src/mobile/tflite/performance.md create mode 100644 tensorflow/docs_src/tutorials/_index.yaml create mode 100644 tensorflow/docs_src/tutorials/_toc.yaml delete mode 100644 tensorflow/docs_src/tutorials/audio_recognition.md delete mode 100644 tensorflow/docs_src/tutorials/deep_cnn.md create mode 100644 tensorflow/docs_src/tutorials/eager/custom_training_walkthrough.md create mode 100644 tensorflow/docs_src/tutorials/eager/index.md delete mode 100644 tensorflow/docs_src/tutorials/image_recognition.md delete mode 100644 tensorflow/docs_src/tutorials/image_retraining.md create mode 100644 tensorflow/docs_src/tutorials/images/deep_cnn.md create mode 100644 tensorflow/docs_src/tutorials/images/image_recognition.md create mode 100644 tensorflow/docs_src/tutorials/images/layers.md delete mode 100644 tensorflow/docs_src/tutorials/index.md create mode 100644 tensorflow/docs_src/tutorials/keras/basic_classification.md create mode 100644 tensorflow/docs_src/tutorials/keras/basic_regression.md create mode 100644 tensorflow/docs_src/tutorials/keras/basic_text_classification.md create mode 100644 tensorflow/docs_src/tutorials/keras/index.md create mode 100644 tensorflow/docs_src/tutorials/keras/overfit_and_underfit.md create mode 100644 tensorflow/docs_src/tutorials/keras/save_and_restore_models.md delete mode 100644 
tensorflow/docs_src/tutorials/kernel_methods.md delete mode 100644 tensorflow/docs_src/tutorials/layers.md delete mode 100644 tensorflow/docs_src/tutorials/leftnav_files delete mode 100644 tensorflow/docs_src/tutorials/linear.md delete mode 100755 tensorflow/docs_src/tutorials/mandelbrot.md create mode 100644 tensorflow/docs_src/tutorials/next_steps.md create mode 100644 tensorflow/docs_src/tutorials/non-ml/mandelbrot.md create mode 100644 tensorflow/docs_src/tutorials/non-ml/pdes.md delete mode 100755 tensorflow/docs_src/tutorials/pdes.md delete mode 100644 tensorflow/docs_src/tutorials/recurrent.md delete mode 100644 tensorflow/docs_src/tutorials/recurrent_quickdraw.md create mode 100644 tensorflow/docs_src/tutorials/representation/kernel_methods.md create mode 100644 tensorflow/docs_src/tutorials/representation/linear.md create mode 100644 tensorflow/docs_src/tutorials/representation/wide.md create mode 100644 tensorflow/docs_src/tutorials/representation/wide_and_deep.md create mode 100644 tensorflow/docs_src/tutorials/representation/word2vec.md delete mode 100644 tensorflow/docs_src/tutorials/seq2seq.md create mode 100644 tensorflow/docs_src/tutorials/sequences/audio_recognition.md create mode 100644 tensorflow/docs_src/tutorials/sequences/recurrent.md create mode 100644 tensorflow/docs_src/tutorials/sequences/recurrent_quickdraw.md delete mode 100644 tensorflow/docs_src/tutorials/wide.md delete mode 100644 tensorflow/docs_src/tutorials/wide_and_deep.md delete mode 100644 tensorflow/docs_src/tutorials/word2vec.md diff --git a/tensorflow/docs_src/community/leftnav_files b/tensorflow/docs_src/community/leftnav_files index 2bae60d9dd..0bd1f14de9 100644 --- a/tensorflow/docs_src/community/leftnav_files +++ b/tensorflow/docs_src/community/leftnav_files @@ -6,4 +6,3 @@ groups.md documentation.md style_guide.md benchmarks.md -swift.md diff --git a/tensorflow/docs_src/community/swift.md b/tensorflow/docs_src/community/swift.md deleted file mode 100644 index d1625d3b93..0000000000 --- a/tensorflow/docs_src/community/swift.md +++ /dev/null @@ -1,60 +0,0 @@ -
- -# Swift for TensorFlow - -Welcome to the Swift for TensorFlow development community! - -Swift for TensorFlow is a new way to develop machine learning models. It -gives you the power of -[TensorFlow](https://www.tensorflow.org) directly -integrated into the [Swift programming language](https://swift.org/about). -With Swift, you can write the following imperative code, and Swift -automatically turns it into **a single TensorFlow Graph** and runs it -with the full performance of TensorFlow Sessions on CPU, GPU and -[TPU](https://cloud.google.com/tpu/docs/tpus). - -```swift -import TensorFlow - -var x = Tensor([[1, 2], [3, 4]]) - -for i in 1...5 { - x += x ⊗ x -} - -print(x) -``` - -Swift combines the flexibility of -[Eager Execution](https://www.tensorflow.org/programmers_guide/eager) with the -high performance of [Graphs and Sessions](https://www.tensorflow.org/programmers_guide/graphs). -Behind the scenes, Swift analyzes your Tensor code and automatically builds -graphs for you. Swift also catches type errors and shape mismatches before -running your code, and has [Automatic Differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation) -built right in. We believe that machine learning tools are so important that -they deserve **a first-class language and a compiler**. - -Note: Swift for TensorFlow is an early stage research project. It has been -released to enable open source development and is not yet ready for general use -by machine learning developers. - -## Open Source - -We have released Swift for TensorFlow as an open-source project on GitHub! - -Our [documentation repository](https://github.com/tensorflow/swift) contains a -[project overview](https://github.com/tensorflow/swift/blob/master/docs/DesignOverview.md) -and [technical papers](https://github.com/tensorflow/swift/tree/master/docs) -explaining specific areas in depth. There are also instructions for [installing -pre-built packages](https://github.com/tensorflow/swift/blob/master/Installation.md) -(for macOS and Ubuntu) as well as a simple -[usage tutorial](https://github.com/tensorflow/swift/blob/master/Usage.md). - -Moving forward, we will use an open design model and all discussions will be -public. - -[Sign up here to join the community Google -group](https://groups.google.com/a/tensorflow.org/d/forum/swift), which we will -use for announcements and general discussion. diff --git a/tensorflow/docs_src/get_started/_index.yaml b/tensorflow/docs_src/get_started/_index.yaml deleted file mode 100644 index 4060804892..0000000000 --- a/tensorflow/docs_src/get_started/_index.yaml +++ /dev/null @@ -1,249 +0,0 @@ -project_path: /_project.yaml -book_path: /_book.yaml -description: -landing_page: - show_side_navs: True - rows: - - description: > -

-  [Deleted landing-page YAML; its HTML markup was lost in extraction. Recoverable content follows.]
-  "Get Started with TensorFlow": TensorFlow is an open-source machine learning library for research and production, offering APIs for beginners and experts to develop for desktop, mobile, web, and cloud.
-  Section "Learn and use ML": the high-level Keras API provides building blocks to create and train deep learning models. Notebook links: Basic classification; Text classification; Regression; Overfitting and underfitting; Save and load. Code sample (with a "Run in a Notebook" link):
-        import tensorflow as tf
-        mnist = tf.keras.datasets.mnist
-
-        (x_train, y_train),(x_test, y_test) = mnist.load_data()
-        x_train, x_test = x_train / 255.0, x_test / 255.0
-
-        model = tf.keras.models.Sequential([
-          tf.keras.layers.Flatten(),
-          tf.keras.layers.Dense(512, activation=tf.nn.relu),
-          tf.keras.layers.Dropout(0.2),
-          tf.keras.layers.Dense(10, activation=tf.nn.softmax)
-        ])
-        model.compile(optimizer='adam',
-                      loss='sparse_categorical_crossentropy',
-                      metrics=['accuracy'])
-
-        model.fit(x_train, y_train, epochs=5)
-        model.evaluate(x_test, y_test)
-  Section "Research and experimentation": eager execution provides an imperative, define-by-run interface for advanced operations. Notebook links: Eager execution basics; Automatic differentiation and gradient tapes; Variables, models, and training; Custom layers; Custom training walkthrough; Example: Neural machine translation w/ attention.
-  Section "ML at production scale": Estimators can train large models on multiple machines in a production environment. Links: How to build a simple text classifier with TF-Hub; Classifying Higgs boson processes; Wide and deep learning using estimators.
-  Section "Google Colab: An easy way to learn and use TensorFlow": Colaboratory is a Jupyter notebook environment that requires no setup and runs entirely in the cloud.
-  Section "Build your first ML app": Web developers (TensorFlow.js); Mobile developers (TensorFlow Lite).
-  Section "Videos and updates": "Get started with TensorFlow's High-Level APIs" (youtube_id tjsHSIG8I08); "Eager execution" (youtube_id T8AW0fKP0Hs); "tf.data: Fast, flexible, and easy-to-use input pipelines":
- youtube_id: uIcqeP7MFH0 - buttons: - - label: Watch the video - path: https://www.youtube.com/watch?v=uIcqeP7MFH0 diff --git a/tensorflow/docs_src/get_started/basic_classification.md b/tensorflow/docs_src/get_started/basic_classification.md deleted file mode 100644 index 91bbd85b24..0000000000 --- a/tensorflow/docs_src/get_started/basic_classification.md +++ /dev/null @@ -1,3 +0,0 @@ -# Basic Classification - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_classification.ipynb) diff --git a/tensorflow/docs_src/get_started/basic_regression.md b/tensorflow/docs_src/get_started/basic_regression.md deleted file mode 100644 index a535f22f5a..0000000000 --- a/tensorflow/docs_src/get_started/basic_regression.md +++ /dev/null @@ -1,3 +0,0 @@ -# Basic Regression - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_regression.ipynb) diff --git a/tensorflow/docs_src/get_started/basic_text_classification.md b/tensorflow/docs_src/get_started/basic_text_classification.md deleted file mode 100644 index 7c5d4f7896..0000000000 --- a/tensorflow/docs_src/get_started/basic_text_classification.md +++ /dev/null @@ -1,3 +0,0 @@ -# Basic Text Classification - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_text_classification.ipynb) diff --git a/tensorflow/docs_src/get_started/eager.md b/tensorflow/docs_src/get_started/eager.md deleted file mode 100644 index ddf239485a..0000000000 --- a/tensorflow/docs_src/get_started/eager.md +++ /dev/null @@ -1,3 +0,0 @@ -# Custom Training Walkthrough - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/r1.9.0/samples/core/get_started/eager.ipynb) diff --git a/tensorflow/docs_src/get_started/leftnav_files b/tensorflow/docs_src/get_started/leftnav_files deleted file mode 100644 index 99d2b2c3e1..0000000000 --- a/tensorflow/docs_src/get_started/leftnav_files +++ /dev/null @@ -1,10 +0,0 @@ -### Learn and use ML -basic_classification.md: Basic classification -basic_text_classification.md: Text classification -basic_regression.md: Regression -overfit_and_underfit.md -save_and_restore_models.md -next_steps.md - -### Research and experimentation -eager.md diff --git a/tensorflow/docs_src/get_started/next_steps.md b/tensorflow/docs_src/get_started/next_steps.md deleted file mode 100644 index 01c9f7204a..0000000000 --- a/tensorflow/docs_src/get_started/next_steps.md +++ /dev/null @@ -1,36 +0,0 @@ -# Next steps - -## Learn more about TensorFlow - -* The [TensorFlow Guide](/guide) includes usage guides for the - high-level APIs, as well as advanced TensorFlow operations. -* [Premade Estimators](/guide/premade_estimators) are designed to - get results out of the box. Use TensorFlow without building your own models. -* [TensorFlow.js](https://js.tensorflow.org/) allows web developers to train and - deploy ML models in the browser and using Node.js. -* [TFLite](/mobile/tflite) allows mobile developers to do inference efficiently - on mobile devices. -* [TensorFlow Serving](/serving) is an open-source project that can put - TensorFlow models in production quickly. -* The [ecosystem](/ecosystem) contains more projects, including - [Magenta](https://magenta.tensorflow.org/), [TFX](/tfx), - [Swift for TensorFlow](https://github.com/tensorflow/swift), and more. 
- -## Learn more about machine learning - -Recommended resources include: - -* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/), - a course from Google that introduces machine learning concepts. -* [CS 20: Tensorflow for Deep Learning Research](http://web.stanford.edu/class/cs20si/), - notes from an intro course from Stanford. -* [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/), - a course that teaches how convolutional networks work. -* [Machine Learning Recipes](https://www.youtube.com/watch?v=cKxRvEZd3Mw&list=PLOU2XLYxmsIIuiBfYad6rFYQU_jL2ryal), - a video series that introduces basic machine learning concepts with few prerequisites. -* [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python), - a book by Francois Chollet about the Keras API, as well as an excellent hands on intro to Deep Learning. -* [Hands-on Machine Learning with Scikit-Learn and TensorFlow](https://github.com/ageron/handson-ml), - a book by Aurélien Geron's that is a clear getting-started guide to data science and deep learning. -* [Deep Learning](https://www.deeplearningbook.org/), a book by Ian Goodfellow et al. - that provides a technical dive into learning machine learning. diff --git a/tensorflow/docs_src/get_started/overfit_and_underfit.md b/tensorflow/docs_src/get_started/overfit_and_underfit.md deleted file mode 100644 index e5b5ae7b5a..0000000000 --- a/tensorflow/docs_src/get_started/overfit_and_underfit.md +++ /dev/null @@ -1,3 +0,0 @@ -# Overfitting and Underfitting - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/overfit_and_underfit.ipynb) diff --git a/tensorflow/docs_src/get_started/save_and_restore_models.md b/tensorflow/docs_src/get_started/save_and_restore_models.md deleted file mode 100644 index 44b3772945..0000000000 --- a/tensorflow/docs_src/get_started/save_and_restore_models.md +++ /dev/null @@ -1,3 +0,0 @@ -# Save and restore Models - -[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/save_and_restore_models.ipynb) diff --git a/tensorflow/docs_src/guide/custom_estimators.md b/tensorflow/docs_src/guide/custom_estimators.md index fb20b35c12..a63e2bafb3 100644 --- a/tensorflow/docs_src/guide/custom_estimators.md +++ b/tensorflow/docs_src/guide/custom_estimators.md @@ -362,10 +362,10 @@ model's loss. This is the that will be optimized. We can calculate the loss by calling @{tf.losses.sparse_softmax_cross_entropy}. -The value returned by this function will be lowest, approximately 0, -probability of the correct class (at index `label`) is near 1.0. The loss value -returned is progressively larger as the probability of the correct class -decreases. +The value returned by this function will be approximately 0 at lowest, +when the probability of the correct class (at index `label`) is near 1.0. +The loss value returned is progressively larger as the probability of the +correct class decreases. This function returns the average over the whole batch. diff --git a/tensorflow/docs_src/guide/datasets_for_estimators.md b/tensorflow/docs_src/guide/datasets_for_estimators.md index b04af78cd8..b55a5731a4 100644 --- a/tensorflow/docs_src/guide/datasets_for_estimators.md +++ b/tensorflow/docs_src/guide/datasets_for_estimators.md @@ -76,9 +76,9 @@ Let's walk through the `train_input_fn()`. 
The function starts by using the @{tf.data.Dataset.from_tensor_slices} function to create a @{tf.data.Dataset} representing slices of the array. The array is sliced across the first dimension. For example, an array containing the -@{$tutorials/layers$mnist training data} has a shape of `(60000, 28, 28)`. -Passing this to `from_tensor_slices` returns a `Dataset` object containing -60000 slices, each one a 28x28 image. +MNIST training data has a shape of `(60000, 28, 28)`. Passing this to +`from_tensor_slices` returns a `Dataset` object containing 60000 slices, each one +a 28x28 image. The code that returns this `Dataset` is as follows: diff --git a/tensorflow/docs_src/guide/debugger.md b/tensorflow/docs_src/guide/debugger.md index 6bd941886d..8d78fe6fbd 100644 --- a/tensorflow/docs_src/guide/debugger.md +++ b/tensorflow/docs_src/guide/debugger.md @@ -17,7 +17,7 @@ how to use the graphical user interface (GUI) of tfdbg, i.e., the Note: The TensorFlow debugger uses a [curses](https://en.wikipedia.org/wiki/Curses_\(programming_library\))-based text user interface. On Mac OS X, the `ncurses` library is required and can be -installed with `brew install homebrew/dupes/ncurses`. On Windows, curses isn't as +installed with `brew install ncurses`. On Windows, curses isn't as well supported, so a [readline](https://en.wikipedia.org/wiki/GNU_Readline)-based interface can be used with tfdbg by installing `pyreadline` with `pip`. If you use Anaconda3, you can install it with a command such as @@ -33,8 +33,9 @@ and [`inf`s](https://en.wikipedia.org/wiki/Infinity), a frequently-encountered type of bug in TensorFlow model development. The following example is for users who use the low-level [`Session`](https://www.tensorflow.org/api_docs/python/tf/Session) API of -TensorFlow. A later section of this document describes how to use **tfdbg** -with a higher-level API, namely `Estimator`s. +TensorFlow. Later sections of this document describe how to use **tfdbg** +with higher-level APIs of TensorFlow, including `tf.estimator`, +`tf.keras` / `keras` and `tf.contrib.slim`. To *observe* such an issue, run the following command without the debugger (the source code can be found [here](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/debug/examples/debug_mnist.py)): @@ -209,6 +210,7 @@ Try the following commands at the `tfdbg>` prompt (referencing the code at | **`config`** | | **Set or show persistent TFDBG UI configuration.** | | | | `set` | Set the value of a config item: {`graph_recursion_depth`, `mouse_mode`}. | `config set graph_recursion_depth 3` | | | `show` | Show current persistent UI configuration. | `config show` | +| **`version`** | | **Print the version of TensorFlow and its key dependencies.** | `version` | | **`help`** | | **Print general help information** | `help` | | | `help ` | Print help for given command. | `help lt` | @@ -461,7 +463,6 @@ predict_results = classifier.predict(predict_input_fn, hooks=hooks) ``` [debug_tflearn_iris.py](https://www.tensorflow.org/code/tensorflow/python/debug/examples/debug_tflearn_iris.py), -based on [tf-learn's iris tutorial](https://www.tensorflow.org/versions/r1.8/get_started/tflearn), contains a full example of how to use the tfdbg with `Estimator`s. To run this example, do: @@ -477,20 +478,31 @@ for more details. ## Debugging Keras Models with TFDBG -To use TFDBG with [Keras](https://keras.io/), let the Keras backend use -a TFDBG-wrapped Session object. 
For example, to use the CLI wrapper: +To use TFDBG with +[tf.keras](https://www.tensorflow.org/api_docs/python/tf/keras), +let the Keras backend use a TFDBG-wrapped Session object. For example, to use +the CLI wrapper: ``` python import tensorflow as tf -from keras import backend as keras_backend from tensorflow.python import debug as tf_debug -keras_backend.set_session(tf_debug.LocalCLIDebugWrapperSession(tf.Session())) +tf.keras.backend.set_session(tf_debug.LocalCLIDebugWrapperSession(tf.Session())) # Define your keras model, called "model". -model.fit(...) # This will break into the TFDBG CLI. + +# Calls to `fit()`, 'evaluate()` and `predict()` methods will break into the +# TFDBG CLI. +model.fit(...) +model.evaluate(...) +model.predict(...) ``` +With minor modification, the preceding code example also works for the +[non-TensorFlow version of Keras](https://keras.io/) running against a +TensorFlow backend. You just need to replace `tf.keras.backend` with +`keras.backend`. + ## Debugging tf-slim with TFDBG TFDBG supports debugging of training and evaluation with diff --git a/tensorflow/docs_src/guide/eager.md b/tensorflow/docs_src/guide/eager.md index 00d02b4455..003ca265fe 100644 --- a/tensorflow/docs_src/guide/eager.md +++ b/tensorflow/docs_src/guide/eager.md @@ -149,16 +149,17 @@ it to implement your own layer: ```py class MySimpleLayer(tf.keras.layers.Layer): def __init__(self, output_units): + super(MySimpleLayer, self).__init__() self.output_units = output_units - def build(self, input): + def build(self, input_shape): # The build method gets called the first time your layer is used. # Creating variables on build() allows you to make their shape depend - # on the input shape and hence remove the need for the user to specify + # on the input shape and hence removes the need for the user to specify # full shapes. It is possible to create variables during __init__() if # you already know their full shapes. self.kernel = self.add_variable( - "kernel", [input.shape[-1], self.output_units]) + "kernel", [input_shape[-1], self.output_units]) def call(self, input): # Override call() instead of __call__ so we can perform some bookkeeping. @@ -315,9 +316,8 @@ for (batch, (images, labels)) in enumerate(dataset): The following example creates a multi-layer model that classifies the standard -[MNIST handwritten digits](https://www.tensorflow.org/tutorials/layers). It -demonstrates the optimizer and layer APIs to build trainable graphs in an eager -execution environment. +MNIST handwritten digits. It demonstrates the optimizer and layer APIs to build +trainable graphs in an eager execution environment. ### Train a model diff --git a/tensorflow/docs_src/guide/graphs.md b/tensorflow/docs_src/guide/graphs.md index e6246ef148..492f97c191 100644 --- a/tensorflow/docs_src/guide/graphs.md +++ b/tensorflow/docs_src/guide/graphs.md @@ -486,7 +486,7 @@ subgraph inside. ![](../images/mnist_deep.png) For more information about visualizing your TensorFlow application with -TensorBoard, see the [TensorBoard tutorial](../get_started/summaries_and_tensorboard.md). +TensorBoard, see the [TensorBoard guide](./summaries_and_tensorboard.md). 
## Programming with multiple graphs diff --git a/tensorflow/docs_src/guide/keras.md b/tensorflow/docs_src/guide/keras.md index d584ebe945..1d846df104 100644 --- a/tensorflow/docs_src/guide/keras.md +++ b/tensorflow/docs_src/guide/keras.md @@ -221,7 +221,7 @@ To *evaluate* the inference-mode loss and metrics for the data provided: ```python model.evaluate(x, y, batch_size=32) -model.evaluate(dataset, steps=30 +model.evaluate(dataset, steps=30) ``` And to *predict* the output of the last layer in inference for the data provided, @@ -548,11 +548,9 @@ model.compile(optimizer=tf.train.RMSPropOptimizer(0.001), estimator = keras.estimator.model_to_estimator(model) ``` -Note: -* Enable [eager execution](./eager.md) for debugging +Note: Enable [eager execution](./eager.md) for debugging [Estimator input functions](./premade_estimators.md#create_input_functions) and inspecting data. -* Don't use batch normalization or try to finetune batch normalization models with estimators created from `tf.keras.estimator.model_to_estimator`. More details at [#17950](https://github.com/tensorflow/tensorflow/issues/17950) ### Multiple GPUs @@ -583,15 +581,6 @@ model.compile(loss='binary_crossentropy', optimizer=optimizer) model.summary() ``` -Convert the Keras model to a `tf.estimator.Estimator` instance: - -```python -keras_estimator = keras.estimator.model_to_estimator( - keras_model=model, - config=config, - model_dir='/tmp/model_dir') -``` - Define an *input pipeline*. The `input_fn` returns a `tf.data.Dataset` object used to distribute the data across multiple devices—with each device processing a slice of the input batch. @@ -617,6 +606,15 @@ strategy = tf.contrib.distribute.MirroredStrategy() config = tf.estimator.RunConfig(train_distribute=strategy) ``` +Convert the Keras model to a `tf.estimator.Estimator` instance: + +```python +keras_estimator = keras.estimator.model_to_estimator( + keras_model=model, + config=config, + model_dir='/tmp/model_dir') +``` + Finally, train the `Estimator` instance by providing the `input_fn` and `steps` arguments: diff --git a/tensorflow/docs_src/guide/saved_model.md b/tensorflow/docs_src/guide/saved_model.md index 27ef7bb0da..acc3d3ca0b 100644 --- a/tensorflow/docs_src/guide/saved_model.md +++ b/tensorflow/docs_src/guide/saved_model.md @@ -794,11 +794,12 @@ Here's the syntax: ``` usage: saved_model_cli run [-h] --dir DIR --tag_set TAG_SET --signature_def SIGNATURE_DEF_KEY [--inputs INPUTS] - [--input_exprs INPUT_EXPRS] [--outdir OUTDIR] + [--input_exprs INPUT_EXPRS] + [--input_examples INPUT_EXAMPLES] [--outdir OUTDIR] [--overwrite] [--tf_debug] ``` -The `run` command provides the following two ways to pass inputs to the model: +The `run` command provides the following three ways to pass inputs to the model: * `--inputs` option enables you to pass numpy ndarray in files. * `--input_exprs` option enables you to pass Python expressions. @@ -847,7 +848,7 @@ dictionary is stored in the pickle file and the value corresponding to the *variable_name* will be used. -#### `--inputs_exprs` +#### `--input_exprs` To pass inputs through Python expressions, specify the `--input_exprs` option. This can be useful for when you don't have data @@ -869,7 +870,7 @@ example: (Note that the `numpy` module is already available to you as `np`.) -#### `--inputs_examples` +#### `--input_examples` To pass `tf.train.Example` as inputs, specify the `--input_examples` option. 
For each input key, it takes a list of dictionary, where each dictionary is an diff --git a/tensorflow/docs_src/guide/tensorboard_histograms.md b/tensorflow/docs_src/guide/tensorboard_histograms.md index 918deda190..af8f2cadd1 100644 --- a/tensorflow/docs_src/guide/tensorboard_histograms.md +++ b/tensorflow/docs_src/guide/tensorboard_histograms.md @@ -13,8 +13,8 @@ TensorFlow has an op which is perfect for this purpose. As is usually the case with TensorBoard, we will ingest data using a summary op; in this case, ['tf.summary.histogram'](https://www.tensorflow.org/api_docs/python/tf/summary/histogram). -For a primer on how summaries work, please see the general -[TensorBoard tutorial](https://www.tensorflow.org/get_started/summaries_and_tensorboard). +For a primer on how summaries work, please see the +[TensorBoard guide](./summaries_and_tensorboard.md). Here is a code snippet that will generate some histogram summaries containing normally distributed data, where the mean of the distribution increases over diff --git a/tensorflow/docs_src/install/install_c.md b/tensorflow/docs_src/install/install_c.md index 9aebf2bfa4..2901848745 100644 --- a/tensorflow/docs_src/install/install_c.md +++ b/tensorflow/docs_src/install/install_c.md @@ -38,7 +38,7 @@ enable TensorFlow for C: OS="linux" # Change to "darwin" for macOS TARGET_DIRECTORY="/usr/local" curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 1907355341..2c126df5aa 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -38,7 +38,7 @@ steps to install this library and enable TensorFlow for Go: TF_TYPE="cpu" # Change to "gpu" for GPU support TARGET_DIRECTORY='/usr/local' curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc0.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index b9c9912816..692dfc9cef 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -36,7 +36,7 @@ following to the project's `pom.xml` to use the TensorFlow Java APIs: org.tensorflow tensorflow - 1.9.0-rc2 + 1.9.0-rc0 ``` @@ -65,7 +65,7 @@ As an example, these steps will create a Maven project that uses TensorFlow: org.tensorflow tensorflow - 1.9.0-rc2 + 1.9.0-rc0 @@ -124,12 +124,12 @@ instead: org.tensorflow libtensorflow - 1.9.0-rc2 + 1.9.0-rc0 org.tensorflow libtensorflow_jni_gpu - 1.9.0-rc2 + 1.9.0-rc0 ``` @@ -148,7 +148,7 @@ refer to the simpler instructions above instead. Take the following steps to install TensorFlow for Java on Linux or macOS: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), which is the TensorFlow Java Archive (JAR). 2. 
Decide whether you will run TensorFlow for Java on CPU(s) only or with @@ -167,7 +167,7 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: OS=$(uname -s | tr '[:upper:]' '[:lower:]') mkdir -p ./jni curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | tar -xz -C ./jni ### Install on Windows @@ -175,13 +175,13 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: Take the following steps to install TensorFlow for Java on Windows: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), which is the TensorFlow Java Archive (JAR). 2. Download the following Java Native Interface (JNI) file appropriate for - [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc2.zip). + [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc0.zip). 3. Extract this .zip file. - +__Note__: The native library (`tensorflow_jni.dll`) requires `msvcp140.dll` at runtime, which is included in the [Visual C++ 2015 Redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145) package. ### Validate the installation @@ -227,7 +227,7 @@ must be part of your `classpath`. For example, you can include the downloaded `.jar` in your `classpath` by using the `-cp` compilation flag as follows: -
javac -cp libtensorflow-1.9.0-rc2.jar HelloTF.java
+
javac -cp libtensorflow-1.9.0-rc0.jar HelloTF.java
### Running @@ -241,11 +241,11 @@ two files are available to the JVM: For example, the following command line executes the `HelloTF` program on Linux and macOS X: -
java -cp libtensorflow-1.9.0-rc2.jar:. -Djava.library.path=./jni HelloTF
+
java -cp libtensorflow-1.9.0-rc0.jar:. -Djava.library.path=./jni HelloTF
And the following command line executes the `HelloTF` program on Windows: -
java -cp libtensorflow-1.9.0-rc2.jar;. -Djava.library.path=jni HelloTF
+
java -cp libtensorflow-1.9.0-rc0.jar;. -Djava.library.path=jni HelloTF
If the program prints Hello from version, you've successfully installed TensorFlow for Java and are ready to use the API. If the program diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index ae3d50ff39..f21c073a1b 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -339,9 +339,7 @@ Docker will download the TensorFlow binary image the first time you launch it. #### GPU support -Prior to installing TensorFlow with GPU support, ensure that your system meets all -[NVIDIA software requirements](#NVIDIARequirements). To launch a Docker container -with NVidia GPU support, enter a command of the following format: +To launch a Docker container with NVidia GPU support, enter a command of the following format (this [does not require any local CUDA installation](https://github.com/nvidia/nvidia-docker/wiki/CUDA#requirements)):
 $ nvidia-docker run -it -p hostPort:containerPort TensorFlowGPUImage
@@ -438,7 +436,7 @@ Take the following steps to install TensorFlow in an Anaconda environment:
 
      
      (tensorflow)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+ https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
## Validate your installation @@ -491,7 +489,7 @@ TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). +To learn more, see the [TensorFlow tutorials](../tutorials/). ## TensorFlow GPU support @@ -678,14 +676,14 @@ This section documents the relevant values for Linux installations. CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp27-none-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp27-none-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -697,14 +695,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -716,14 +714,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
 
@@ -735,14 +733,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
 
diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index 3de6da1342..c6f0c17924 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -119,7 +119,7 @@ Take the following steps to install TensorFlow with Virtualenv: TensorFlow in the active Virtualenv is as follows:
 $ pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl If you encounter installation problems, see [Common Installation Problems](#common-installation-problems). @@ -242,7 +242,7 @@ take the following steps: issue the following command:
 $ sudo pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl 
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl If the preceding command fails, see [installation problems](#common-installation-problems). @@ -350,7 +350,7 @@ Take the following steps to install TensorFlow in an Anaconda environment: TensorFlow for Python 2.7:
 (targetDirectory)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl @@ -403,8 +403,7 @@ writing TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). - +To learn more, see the [TensorFlow tutorials](../tutorials/). ## Common installation problems @@ -518,7 +517,7 @@ The value you specify depends on your Python version.
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl
 
@@ -526,5 +525,5 @@ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-a
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl
 
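The install guides touched in this series all end with the same validation step. For reference, the short program they have you run is just the following (TF 1.x graph API; it should print the `Hello, TensorFlow!` greeting the guides check for):

```python
import tensorflow as tf

hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))  # Prints: Hello, TensorFlow!
```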
diff --git a/tensorflow/docs_src/install/install_raspbian.md b/tensorflow/docs_src/install/install_raspbian.md index 0caab6d335..46c4944ca7 100644 --- a/tensorflow/docs_src/install/install_raspbian.md +++ b/tensorflow/docs_src/install/install_raspbian.md @@ -230,7 +230,7 @@ problems, despite the log message. If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). +To learn more, see the [TensorFlow tutorials](../tutorials/). ## Common installation problems diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index 3520f97c9a..fc1f6d05bd 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -81,7 +81,7 @@ or [macOS](#PrepareMac) - + ## Prepare environment for Linux Before building TensorFlow on Linux, install the following build @@ -289,17 +289,27 @@ Note: If you're only interested in building the libraries for the TensorFlow C or Java APIs, see [Build the C or Java libraries](#BuildCorJava), you do not need to build the pip package in that case. -To build a pip package for TensorFlow with CPU-only support, -you would typically invoke the following command: +### CPU-only support + +To build a pip package for TensorFlow with CPU-only support: + +
+$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
+
+ +To build a pip package for TensorFlow with CPU-only support for the Intel® MKL-DNN:
-$ bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
+$ bazel build --config=mkl --config=opt //tensorflow/tools/pip_package:build_pip_package
 
-To build a pip package for TensorFlow with GPU support, -invoke the following command: +### GPU support + +To build a pip package for TensorFlow with GPU support: -
$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package 
+
+$ bazel build --config=opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
+
**NOTE on gcc 5 or later:** the binary pip packages available on the TensorFlow website are built with gcc 4, which uses the older ABI. To @@ -328,10 +338,10 @@ Invoke `pip install` to install that pip package. The filename of the `.whl` file depends on your platform. For example, the following command will install the pip package -for TensorFlow 1.9.0rc2 on Linux: +for TensorFlow 1.9.0rc0 on Linux:
-$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc2-py2-none-any.whl
+$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc0-py2-none-any.whl
 
## Validate your installation @@ -362,7 +372,7 @@ TensorFlow programs:
Hello, TensorFlow!
-To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). +To learn more, see the [TensorFlow tutorials](../tutorials/). If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). @@ -373,9 +383,9 @@ The build and installation problems you encounter typically depend on the operating system. See the "Common installation problems" section of one of the following guides: - * @{$install_linux#CommonInstallationProblems$Installing TensorFlow on Linux} - * @{$install_mac#CommonInstallationProblems$Installing TensorFlow on Mac OS} - * @{$install_windows#CommonInstallationProblems$Installing TensorFlow on Windows} + * @{$install_linux#common_installation_problems$Installing TensorFlow on Linux} + * @{$install_mac#common_installation_problems$Installing TensorFlow on Mac OS} + * @{$install_windows#common_installation_problems$Installing TensorFlow on Windows} Beyond the errors documented in those two guides, the following table notes additional errors specific to building TensorFlow. Note that we diff --git a/tensorflow/docs_src/install/install_windows.md b/tensorflow/docs_src/install/install_windows.md index 7fe94f0bc3..7b7b17ce81 100644 --- a/tensorflow/docs_src/install/install_windows.md +++ b/tensorflow/docs_src/install/install_windows.md @@ -157,7 +157,7 @@ TensorFlow programs: If the system outputs an error message instead of a greeting, see [Common installation problems](#common_installation_problems). -To learn more, see [Get Started with TensorFlow](https://www.tensorflow.org/get_started). +To learn more, see the [TensorFlow tutorials](../tutorials/). ## Common installation problems diff --git a/tensorflow/docs_src/mobile/leftnav_files b/tensorflow/docs_src/mobile/leftnav_files index 585470d5f0..97340ef7e1 100644 --- a/tensorflow/docs_src/mobile/leftnav_files +++ b/tensorflow/docs_src/mobile/leftnav_files @@ -4,6 +4,7 @@ tflite/index.md tflite/devguide.md tflite/demo_android.md tflite/demo_ios.md +tflite/performance.md >>> ### TensorFlow Mobile mobile_intro.md diff --git a/tensorflow/docs_src/mobile/linking_libs.md b/tensorflow/docs_src/mobile/linking_libs.md index cf0db59021..efef5dd0da 100644 --- a/tensorflow/docs_src/mobile/linking_libs.md +++ b/tensorflow/docs_src/mobile/linking_libs.md @@ -27,7 +27,7 @@ called `libandroid_tensorflow_inference_java.jar`. There are three ways to include this functionality in your program: 1. Include the jcenter AAR which contains it, as in this - [example app](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/build.gradle#L59-L65) + [example app](https://github.com/googlecodelabs/tensorflow-for-poets-2/blob/master/android/tfmobile/build.gradle#L59-L65) 2. Download the nightly precompiled version from [ci.tensorflow.org](http://ci.tensorflow.org/view/Nightly/job/nightly-android/lastSuccessfulBuild/artifact/out/). diff --git a/tensorflow/docs_src/mobile/mobile_intro.md b/tensorflow/docs_src/mobile/mobile_intro.md index 241f01d460..baad443308 100644 --- a/tensorflow/docs_src/mobile/mobile_intro.md +++ b/tensorflow/docs_src/mobile/mobile_intro.md @@ -38,7 +38,8 @@ speech-driven interface, and many of these require on-device processing. Most of the time a user isn’t giving commands, and so streaming audio continuously to a remote server would be a waste of bandwidth, since it would mostly be silence or background noises. 
To solve this problem it’s common to have a small neural -network running on-device @{$tutorials/audio_recognition$listening out for a particular keyword}. +network running on-device +[listening out for a particular keyword](../tutorials/sequences/audio_recognition). Once that keyword has been spotted, the rest of the conversation can be transmitted over to the server for further processing if more computing power is needed. diff --git a/tensorflow/docs_src/mobile/prepare_models.md b/tensorflow/docs_src/mobile/prepare_models.md index 8b22c04d87..2b84dbb973 100644 --- a/tensorflow/docs_src/mobile/prepare_models.md +++ b/tensorflow/docs_src/mobile/prepare_models.md @@ -105,8 +105,8 @@ inline constants so everything’s in one file. To handle the conversion, you need the `freeze_graph.py` script, that’s held in [`tensorflow/python/tools/freeze_graph.py`](https://www.tensorflow.org/code/tensorflow/python/tools/freeze_graph.py). You’ll run it like this: - bazel build tensorflow/tools:freeze_graph - bazel-bin/tensorflow/tools/freeze_graph \ + bazel build tensorflow/python/tools:freeze_graph + bazel-bin/tensorflow/python/tools/freeze_graph \ --input_graph=/tmp/model/my_graph.pb \ --input_checkpoint=/tmp/model/model.ckpt-1000 \ --output_graph=/tmp/frozen_graph.pb \ diff --git a/tensorflow/docs_src/mobile/tflite/demo_android.md b/tensorflow/docs_src/mobile/tflite/demo_android.md index 7f2f8882a2..fdf0bcf3c1 100644 --- a/tensorflow/docs_src/mobile/tflite/demo_android.md +++ b/tensorflow/docs_src/mobile/tflite/demo_android.md @@ -1,7 +1,7 @@ # Android Demo App An example Android application using TensorFLow Lite is available -[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/app). +[on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo). The demo is a sample camera app that classifies images continuously using either a quantized Mobilenet model or a floating point Inception-v3 model. To run the demo, a device running Android 5.0 ( API 21) or higher is required. @@ -44,20 +44,22 @@ app: Android Studio project. * Install all the Gradle extensions it requests. -To get a model, either: +Now you can build and run the demo app. -* Download the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip) - and unzip and copy `mobilenet_quant_v1_224.tflite` to the assets directory: - `tensorflow/contrib/lite/java/demo/app/src/main/assets/`. -* Or, download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip) - and unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets - directory. Change the chosen classifier in - [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)
+The build process downloads the quantized [Mobilenet TensorFlow Lite model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/mobilenet_v1_224_android_quant_2017_11_08.zip), and unzips it into the assets directory: `tensorflow/contrib/lite/java/demo/app/src/main/assets/`. + +Some additional details are available on the +[TF Lite Android App page](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/java/demo/README.md). + +### Using other models + +To use a different model: +* Download the floating point [Inception-v3 model](https://storage.googleapis.com/download.tensorflow.org/models/tflite/inception_v3_slim_2016_android_2017_11_10.zip). +* Unzip and copy `inceptionv3_non_slim_2015.tflite` to the assets directory. +* Change the chosen classifier in [Camera2BasicFragment.java](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/java/demo/app/src/main/java/com/example/android/tflitecamerademo/Camera2BasicFragment.java)
from: `classifier = new ImageClassifierQuantizedMobileNet(getActivity());`
to: `classifier = new ImageClassifierFloatInception(getActivity());`. -Now you can build and run the demo app. - ## Build TensorFlow Lite and the demo app from source diff --git a/tensorflow/docs_src/mobile/tflite/devguide.md b/tensorflow/docs_src/mobile/tflite/devguide.md index 4133bc172a..b168d6c183 100644 --- a/tensorflow/docs_src/mobile/tflite/devguide.md +++ b/tensorflow/docs_src/mobile/tflite/devguide.md @@ -54,10 +54,11 @@ both floating point and quantized inference. ### Train a custom model A developer may choose to train a custom model using Tensorflow (see the -@{$tutorials} for examples of building and training models). If you have already -written a model, the first step is to export this to a @{tf.GraphDef} file. This -is required because some formats do not store the model structure outside the -code, and we must communicate with other parts of the framework. See +[TensorFlow tutorials](../../tutorials/) for examples of building and training +models). If you have already written a model, the first step is to export this +to a @{tf.GraphDef} file. This is required because some formats do not store the +model structure outside the code, and we must communicate with other parts of the +framework. See [Exporting the Inference Graph](https://github.com/tensorflow/models/blob/master/research/slim/README.md) to create .pb file for the custom model. diff --git a/tensorflow/docs_src/mobile/tflite/index.md b/tensorflow/docs_src/mobile/tflite/index.md index 5622034827..3d1733024e 100644 --- a/tensorflow/docs_src/mobile/tflite/index.md +++ b/tensorflow/docs_src/mobile/tflite/index.md @@ -37,8 +37,9 @@ a custom (less-dynamic) memory allocator to ensure minimal load, initialization, and execution latency. TensorFlow Lite provides an interface to leverage hardware acceleration, if -available on the device. It does so via the Android Neural Networks library, -released as part of Android O-MR1. +available on the device. It does so via the +[Android Neural Networks API](https://developer.android.com/ndk/guides/neuralnetworks/index.html), +available on Android 8.1 (API level 27) and higher. ## Why do we need a new mobile-specific library? @@ -116,6 +117,10 @@ following: Wear](https://research.googleblog.com/2017/02/on-device-machine-intelligence.html) to all first-party and third-party apps. + Also see the complete list of + [TensorFlow Lite's supported models](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/lite/g3doc/models.md), + including the model sizes, performance numbers, and downloadable model files. + - Quantized versions of the MobileNet model, which runs faster than the non-quantized (float) version on CPU. @@ -131,10 +136,10 @@ compatibility with this release. ## Getting Started We recommend you try out TensorFlow Lite with the pre-tested models indicated -above. If you have an existing mode, you will need to test whether your model is -compatible with both the converter and the supported operator set. To test your -model, see the [documentation on -GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). +above. If you have an existing model, you will need to test whether your model +is compatible with both the converter and the supported operator set. To test +your model, see the +[documentation on GitHub](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite). 
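To make the converter-compatibility check mentioned above concrete, here is a minimal conversion sketch. It assumes the TF 1.9-era `tf.contrib.lite.toco_convert()` Python API and a trivial placeholder graph; treat it as an illustration, not part of the patch:

```python
import tensorflow as tf

img = tf.placeholder(name="img", dtype=tf.float32, shape=(1, 64, 64, 3))
out = tf.identity(tf.sigmoid(img), name="out")

with tf.Session() as sess:
  # Conversion fails loudly if the graph uses ops that the converter and
  # the TensorFlow Lite operator set do not support.
  tflite_model = tf.contrib.lite.toco_convert(sess.graph_def, [img], [out])

with open("converted_model.tflite", "wb") as f:
  f.write(tflite_model)
```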
### Retrain Inception-V3 or MobileNet for a custom data set
diff --git a/tensorflow/docs_src/mobile/tflite/performance.md b/tensorflow/docs_src/mobile/tflite/performance.md
new file mode 100644
index 0000000000..79bacaaa1b
--- /dev/null
+++ b/tensorflow/docs_src/mobile/tflite/performance.md
@@ -0,0 +1,174 @@
+# Performance
+
+This document lists TensorFlow Lite performance benchmarks when running well
+known models on some Android and iOS devices.
+
+These performance benchmark numbers were generated with the
+[Android TFLite benchmark binary](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark)
+and the [iOS benchmark app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios).
+
+# Android performance benchmarks
+
+For Android benchmarks, the CPU affinity is set to use big cores on the device to
+reduce variance (see [details](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#reducing-variance-between-runs-on-android)).
+
+It assumes that models were downloaded and unzipped to the
+`/data/local/tmp/tflite_models` directory. The benchmark binary is built
+using [these instructions](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark#on-android)
+and is assumed to be in the `/data/local/tmp` directory.
+
+To run the benchmark:
+
+```
+adb shell taskset ${CPU_MASK} /data/local/tmp/benchmark_model \
+  --num_threads=1 \
+  --graph=/data/local/tmp/tflite_models/${GRAPH} \
+  --warmup_runs=1 \
+  --num_runs=50 \
+  --use_nnapi=false
+```
+
+Here, `${GRAPH}` is the name of the model and `${CPU_MASK}` is the CPU affinity
+chosen according to the following table:
+
+Device   | CPU_MASK
+-------- | --------
+Pixel 2  | f0
+Pixel xl | 0c
+
+Model Name                | Device   | Mean inference time (std dev)
+------------------------- | -------- | -----------------------------
+Mobilenet_1.0_224 (float) | Pixel 2  | 166.5 ms (2.6 ms)
+                          | Pixel xl | 122.9 ms (1.8 ms)
+Mobilenet_1.0_224 (quant) | Pixel 2  | 69.5 ms (0.9 ms)
+                          | Pixel xl | 78.9 ms (2.2 ms)
+NASNet mobile             | Pixel 2  | 273.8 ms (3.5 ms)
+                          | Pixel xl | 210.8 ms (4.2 ms)
+SqueezeNet                | Pixel 2  | 234.0 ms (2.1 ms)
+                          | Pixel xl | 158.0 ms (2.1 ms)
+Inception_ResNet_V2       | Pixel 2  | 2846.0 ms (15.0 ms)
+                          | Pixel xl | 1973.0 ms (15.0 ms)
+Inception_V4              | Pixel 2  | 3180.0 ms (11.7 ms)
+                          | Pixel xl | 2262.0 ms (21.0 ms)
+
+# iOS benchmarks
+
+To run iOS benchmarks, the [benchmark
+app](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/lite/tools/benchmark/ios)
+was modified to include the appropriate model and `benchmark_params.json` was
+modified to set `num_threads` to 1.
+
+Model Name                | Device   | Mean inference time (std dev)
+------------------------- | -------- | -----------------------------
+Mobilenet_1.0_224 (float) | iPhone 8 | 32.2 ms (0.8 ms)
+Mobilenet_1.0_224 (quant) | iPhone 8 | 24.4 ms (0.8 ms)
+NASNet mobile             | iPhone 8 | 60.3 ms (0.6 ms)
+SqueezeNet                | iPhone 8 | 44.3 ms (0.7 ms)
+Inception_ResNet_V2       | iPhone 8 | 562.4 ms (18.2 ms)
+Inception_V4              | iPhone 8 | 661.0 ms (29.2 ms)
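As a rough, scripted counterpart to the tables above, a `.tflite` model can also be timed from Python. This is a sketch only: it assumes the `tf.contrib.lite.Interpreter` Python binding is available in your build, uses a hypothetical local model path, and will not match the tuned benchmark binaries used for these numbers:

```python
import time

import numpy as np
import tensorflow as tf

# Hypothetical path; point this at any .tflite model you have downloaded.
interpreter = tf.contrib.lite.Interpreter(model_path="mobilenet_quant_v1_224.tflite")
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
out = interpreter.get_output_details()[0]

# Random data of the expected shape and dtype is enough for timing.
data = np.random.random_sample(inp["shape"]).astype(inp["dtype"])

start = time.time()
for _ in range(50):
  interpreter.set_tensor(inp["index"], data)
  interpreter.invoke()
  _ = interpreter.get_tensor(out["index"])
print("mean inference time: %.1f ms" % ((time.time() - start) / 50 * 1000))
```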
diff --git a/tensorflow/docs_src/performance/quantization.md b/tensorflow/docs_src/performance/quantization.md
index 2fea02d861..c97f74139c 100644
--- a/tensorflow/docs_src/performance/quantization.md
+++ b/tensorflow/docs_src/performance/quantization.md
@@ -227,8 +227,8 @@ of 30.0f, and an 8-bit array, the quantized values represent the following:

Quantized | Float
--------- | -----
0         | -10.0
255       | 30.0
128       | 10.0

Table 2: Example quantized value range diff --git a/tensorflow/docs_src/performance/xla/operation_semantics.md b/tensorflow/docs_src/performance/xla/operation_semantics.md index 5887c3d88b..4c4f3f3934 100644 --- a/tensorflow/docs_src/performance/xla/operation_semantics.md +++ b/tensorflow/docs_src/performance/xla/operation_semantics.md @@ -581,12 +581,21 @@ Computes a sum across replicas. Arguments | Type | Semantics --------- | ------- | ----------------------------- `operand` | `XlaOp` | Array to sum across replicas. +| `replica_group_ids` | `int64` vector | Group ID for each replica. | The output shape is the same as the input shape. For example, if there are two replicas and the operand has the value `(1.0, 2.5)` and `(3.0, 5.25)` respectively on the two replicas, then the output value from this op will be `(4.0, 7.75)` on both replicas. +`replica_group_ids` identifies the group ID of each replica. The group ID must +either be empty (all replicas belong to a single group), or contain the same +number of elements as the number of replicas. For example, if +`replica_group_ids` = {0, 1, 2, 3, 0, 1, 2, 3} has eight replicas, there are +four subgroups of replica IDs: {0, 4}, {1, 5}, {2, 6}, and {3, 7}. The size of +each subgroup *must* be identical, so, for example, using: +`replica_group_ids` = {0, 1, 2, 0} for four replicas is invalid. + Computing the result of CrossReplicaSum requires having one input from each replica, so if one replica executes a CrossReplicaSum node more times than another, then the former replica will wait forever. Since the replicas are all @@ -1299,12 +1308,10 @@ See also : : : parameters of type T and M of : : : : arbitrary type : | `dimensions` | `int64` array | array of map dimensions | -| `static_operands` | sequence of M `XlaOp`s | M arrays of arbitrary type | Applies a scalar function over the given `operands` arrays, producing an array of the same dimensions where each element is the result of the mapped function -applied to the corresponding elements in the input arrays with `static_operands` -given as additional input to `computation`. +applied to the corresponding elements in the input arrays. The mapped function is an arbitrary computation with the restriction that it has N inputs of scalar type `T` and a single output with type `S`. The output has @@ -2003,13 +2010,35 @@ Slice(b, {2, 1}, {4, 3}) produces: See also [`XlaBuilder::Sort`](https://www.tensorflow.org/code/tensorflow/compiler/xla/client/xla_client/xla_builder.h). -Sorts the elements in the operand. +There are two versions of the Sort instruction: a single-operand and a +two-operand version. `Sort(operand)` +Arguments | Type | Semantics +--------- | ------- | -------------------- +`operand` | `XlaOp` | The operand to sort. + +Sorts the elements in the operand in ascending order. The operand must be rank-1. +If the operand's elements have floating point type, and the operand contains +NaN elements, the order of elements in the output is implementation-defined. + +`Sort(key, value)` + +Sorts both the key and the value operands. The keys are sorted as in the +single-operand version. The values are sorted according to the order of their +corresponding keys. For example, if the inputs are `keys = [3, 1]` and +`values = [42, 50]`, then the output of the sort is the tuple `{[1, 3], [50, 42]}`. +The sort is not guaranteed to be stable, that is, if the keys array contains +duplicates, the order of their corresponding values may not be preserved. 
+
Arguments | Type | Semantics
--------- | ------- | -------------------
-`operand` | `XlaOp` | The operand to sort
+`keys` | `XlaOp` | The sort keys.
+`values` | `XlaOp` | The values to sort.
+
+The `keys` and `values` operands must both be rank-1, and must have the same
+dimensions, but may have different element types.

## Transpose

diff --git a/tensorflow/docs_src/tutorials/_index.yaml b/tensorflow/docs_src/tutorials/_index.yaml
new file mode 100644
index 0000000000..6fc8155669
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/_index.yaml
@@ -0,0 +1,251 @@
+project_path: /_project.yaml
+book_path: /_book.yaml
+description:
+landing_page:
+  show_side_navs: True
+  rows:
+  - description: >

Get Started with TensorFlow

+

+ TensorFlow is an open-source machine learning library for research and + production. TensorFlow offers APIs for beginners and experts to develop + for desktop, mobile, web, and cloud. See the sections below to get + started. +

+ items: + - custom_html: > + +
+

Learn and use ML

+
+

+ The high-level Keras API provides building blocks to create and + train deep learning models. Start with these beginner-friendly + notebook examples, then read the + TensorFlow Keras guide. +

+
    +
  1. Basic classification
  2. +
  3. Text classification
  4. +
  5. Regression
  6. +
  7. Overfitting and underfitting
  8. +
  9. Save and load
  10. +
+
+ +
+ - classname: tfo-landing-row-item-code-block + code_block: | +
+        import tensorflow as tf
+        mnist = tf.keras.datasets.mnist
+
+        (x_train, y_train),(x_test, y_test) = mnist.load_data()
+        x_train, x_test = x_train / 255.0, x_test / 255.0
+
+        model = tf.keras.models.Sequential([
+          tf.keras.layers.Flatten(),
+          tf.keras.layers.Dense(512, activation=tf.nn.relu),
+          tf.keras.layers.Dropout(0.2),
+          tf.keras.layers.Dense(10, activation=tf.nn.softmax)
+        ])
+        model.compile(optimizer='adam',
+                      loss='sparse_categorical_crossentropy',
+                      metrics=['accuracy'])
+
+        model.fit(x_train, y_train, epochs=5)
+        model.evaluate(x_test, y_test)
+        
+ {% dynamic if request.tld != 'cn' %} + Run in a Notebook + {% dynamic endif %} + + - items: + - custom_html: > +
+

Research and experimentation

+
+

+ Eager execution provides an imperative, define-by-run interface for advanced operations. Write custom layers, forward passes, and training loops with auto‑differentiation. Start with + these notebooks, then read the eager execution guide. +

+
    +
  1. + {% dynamic if request.tld == 'cn' %} + Eager execution basics + {% dynamic else %} + Eager execution basics + {% dynamic endif %} +
  2. +
  3. + {% dynamic if request.tld == 'cn' %} + Automatic differentiation and gradient tape + {% dynamic else %} + Automatic differentiation and gradient tape + {% dynamic endif %} +
  4. +
  5. + {% dynamic if request.tld == 'cn' %} + Custom training: basics + {% dynamic else %} + Custom training: basics + {% dynamic endif %} +
  6. +
  7. + {% dynamic if request.tld == 'cn' %} + Custom layers + {% dynamic else %} + Custom layers + {% dynamic endif %} +
  8. +
  9. Custom training: walkthrough
  10. +
  11. + {% dynamic if request.tld == 'cn' %} + Example: Neural machine translation w/ attention + {% dynamic else %} + Example: Neural machine translation w/ attention + {% dynamic endif %} +
  12. +
+
+ +
+ - custom_html: > +
+

ML at production scale

+ + +
+ + - description: > +

Google Colab: An easy way to learn and use TensorFlow

+

+ Colaboratory + is a Google research project created to help disseminate machine learning + education and research. It's a Jupyter notebook environment that requires + no setup to use and runs entirely in the cloud. + Read the blog post. +

+ + - description: > +

Build your first ML app

+

Create and deploy TensorFlow models on web and mobile.

+ background: grey + items: + - custom_html: > +
+ +

Web developers

+
+
+ TensorFlow.js is a WebGL-accelerated JavaScript library to train and + deploy ML models in the browser and for Node.js.
+
+ - custom_html: > +
+ +

Mobile developers

+
+
+ TensorFlow Lite is a lightweight solution for mobile and embedded devices.
+
+ + - description: > +

Videos and updates

+

+ Subscribe to the TensorFlow + YouTube channel + and blog for + the latest videos and updates. +

+ items: + - description: > +

Get started with TensorFlow's High-Level APIs

+ youtube_id: tjsHSIG8I08 + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=tjsHSIG8I08 + - description: > +

Eager execution

+ youtube_id: T8AW0fKP0Hs + background: grey + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=T8AW0fKP0Hs + - description: > +

tf.data: Fast, flexible, and easy-to-use input pipelines

+ youtube_id: uIcqeP7MFH0 + buttons: + - label: Watch the video + path: https://www.youtube.com/watch?v=uIcqeP7MFH0 diff --git a/tensorflow/docs_src/tutorials/_toc.yaml b/tensorflow/docs_src/tutorials/_toc.yaml new file mode 100644 index 0000000000..d46d570a93 --- /dev/null +++ b/tensorflow/docs_src/tutorials/_toc.yaml @@ -0,0 +1,93 @@ +toc: +- title: Get started with TensorFlow + path: /tutorials/ + +- title: Learn and use ML + style: accordion + section: + - title: Overview + path: /tutorials/keras/ + - title: Basic classification + path: /tutorials/keras/basic_classification + - title: Text classification + path: /tutorials/keras/basic_text_classification + - title: Regression + path: /tutorials/keras/basic_regression + - title: Overfitting and underfitting + path: /tutorials/keras/overfit_and_underfit + - title: Save and restore models + path: /tutorials/keras/save_and_restore_models + +- title: Research and experimentation + style: accordion + section: + - title: Overview + path: /tutorials/eager/ + - title: Eager execution + path: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/eager_intro.ipynb + status: external + - title: Automatic differentiation + path: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb + status: external + - title: "Custom training: basics" + path: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb + status: external + - title: Custom layers + path: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb + status: external + - title: "Custom training: walkthrough" + path: /tutorials/eager/custom_training_walkthrough + - title: Neural machine translation + path: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb + status: external + +- title: Images + style: accordion + section: + - title: Build a CNN using Estimators + path: /tutorials/images/layers + - title: Image recognition + path: /tutorials/images/image_recognition + - title: Image retraining + path: /hub/tutorials/image_retraining + - title: Advanced CNN + path: /tutorials/images/deep_cnn + +- title: Sequences + style: accordion + section: + - title: Recurrent neural network + path: /tutorials/sequences/recurrent + - title: Drawing classification + path: /tutorials/sequences/recurrent_quickdraw + - title: Simple audio recognition + path: /tutorials/sequences/audio_recognition + - title: Neural machine translation + path: https://github.com/tensorflow/nmt + status: external + +- title: Data representation + style: accordion + section: + - title: Linear models + path: /tutorials/representation/wide + - title: Wide and deep learning + path: /tutorials/representation/wide_and_deep + - title: Vector representations of words + path: /tutorials/representation/word2vec + - title: Kernel methods + path: /tutorials/representation/kernel_methods + - title: Large-scale linear models + path: /tutorials/representation/linear + +- title: Non-ML + style: accordion + section: + - title: Mandelbrot set + path: /tutorials/non-ml/mandelbrot + - title: Partial differential equations + path: /tutorials/non-ml/pdes + +- break: True +- title: Next steps + path: /tutorials/next_steps diff --git a/tensorflow/docs_src/tutorials/audio_recognition.md 
b/tensorflow/docs_src/tutorials/audio_recognition.md deleted file mode 100644 index d7a8da6f96..0000000000 --- a/tensorflow/docs_src/tutorials/audio_recognition.md +++ /dev/null @@ -1,631 +0,0 @@ -# Simple Audio Recognition - -This tutorial will show you how to build a basic speech recognition network that -recognizes ten different words. It's important to know that real speech and -audio recognition systems are much more complex, but like MNIST for images, it -should give you a basic understanding of the techniques involved. Once you've -completed this tutorial, you'll have a model that tries to classify a one second -audio clip as either silence, an unknown word, "yes", "no", "up", "down", -"left", "right", "on", "off", "stop", or "go". You'll also be able to take this -model and run it in an Android application. - -## Preparation - -You should make sure you have TensorFlow installed, and since the script -downloads over 1GB of training data, you'll need a good internet connection and -enough free space on your machine. The training process itself can take several -hours, so make sure you have a machine available for that long. - -## Training - -To begin the training process, go to the TensorFlow source tree and run: - -```bash -python tensorflow/examples/speech_commands/train.py -``` - -The script will start off by downloading the [Speech Commands -dataset](https://storage.cloud.google.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz), -which consists of over 105,000 WAVE audio files of people saying thirty -different words. This data was collected by Google and released under a CC BY -license, and you can help improve it by [contributing five minutes of your own -voice](https://aiyprojects.withgoogle.com/open_speech_recording). The archive is -over 2GB, so this part may take a while, but you should see progress logs, and -once it's been downloaded once you won't need to do this step again. You can -find more information about this dataset in this -[Speech Commands paper](https://arxiv.org/abs/1804.03209). - -Once the downloading has completed, you'll see logging information that looks -like this: - -``` -I0730 16:53:44.766740 55030 train.py:176] Training from step: 1 -I0730 16:53:47.289078 55030 train.py:217] Step #1: rate 0.001000, accuracy 7.0%, cross entropy 2.611571 -``` - -This shows that the initialization process is done and the training loop has -begun. You'll see that it outputs information for every training step. Here's a -break down of what it means: - -`Step #1` shows that we're on the first step of the training loop. In this case -there are going to be 18,000 steps in total, so you can look at the step number -to get an idea of how close it is to finishing. - -`rate 0.001000` is the learning rate that's controlling the speed of the -network's weight updates. Early on this is a comparatively high number (0.001), -but for later training cycles it will be reduced 10x, to 0.0001. - -`accuracy 7.0%` is the how many classes were correctly predicted on this -training step. This value will often fluctuate a lot, but should increase on -average as training progresses. The model outputs an array of numbers, one for -each label, and each number is the predicted likelihood of the input being that -class. The predicted label is picked by choosing the entry with the highest -score. The scores are always between zero and one, with higher values -representing more confidence in the result. 
- -`cross entropy 2.611571` is the result of the loss function that we're using to -guide the training process. This is a score that's obtained by comparing the -vector of scores from the current training run to the correct labels, and this -should trend downwards during training. - -After a hundred steps, you should see a line like this: - -`I0730 16:54:41.813438 55030 train.py:252] Saving to -"/tmp/speech_commands_train/conv.ckpt-100"` - -This is saving out the current trained weights to a checkpoint file. If your -training script gets interrupted, you can look for the last saved checkpoint and -then restart the script with -`--start_checkpoint=/tmp/speech_commands_train/conv.ckpt-100` as a command line -argument to start from that point. - -## Confusion Matrix - -After four hundred steps, this information will be logged: - -``` -I0730 16:57:38.073667 55030 train.py:243] Confusion Matrix: - [[258 0 0 0 0 0 0 0 0 0 0 0] - [ 7 6 26 94 7 49 1 15 40 2 0 11] - [ 10 1 107 80 13 22 0 13 10 1 0 4] - [ 1 3 16 163 6 48 0 5 10 1 0 17] - [ 15 1 17 114 55 13 0 9 22 5 0 9] - [ 1 1 6 97 3 87 1 12 46 0 0 10] - [ 8 6 86 84 13 24 1 9 9 1 0 6] - [ 9 3 32 112 9 26 1 36 19 0 0 9] - [ 8 2 12 94 9 52 0 6 72 0 0 2] - [ 16 1 39 74 29 42 0 6 37 9 0 3] - [ 15 6 17 71 50 37 0 6 32 2 1 9] - [ 11 1 6 151 5 42 0 8 16 0 0 20]] -``` - -The first section is a [confusion -matrix](https://www.tensorflow.org/api_docs/python/tf/confusion_matrix). To -understand what it means, you first need to know the labels being used, which in -this case are "_silence_", "_unknown_", "yes", "no", "up", "down", "left", -"right", "on", "off", "stop", and "go". Each column represents a set of samples -that were predicted to be each label, so the first column represents all the -clips that were predicted to be silence, the second all those that were -predicted to be unknown words, the third "yes", and so on. - -Each row represents clips by their correct, ground truth labels. The first row -is all the clips that were silence, the second clips that were unknown words, -the third "yes", etc. - -This matrix can be more useful than just a single accuracy score because it -gives a good summary of what mistakes the network is making. In this example you -can see that all of the entries in the first row are zero, apart from the -initial one. Because the first row is all the clips that are actually silence, -this means that none of them were mistakenly labeled as words, so we have no -false negatives for silence. This shows the network is already getting pretty -good at distinguishing silence from words. - -If we look down the first column though, we see a lot of non-zero values. The -column represents all the clips that were predicted to be silence, so positive -numbers outside of the first cell are errors. This means that some clips of real -spoken words are actually being predicted to be silence, so we do have quite a -few false positives. - -A perfect model would produce a confusion matrix where all of the entries were -zero apart from a diagonal line through the center. Spotting deviations from -that pattern can help you figure out how the model is most easily confused, and -once you've identified the problems you can address them by adding more data or -cleaning up categories. - -## Validation - -After the confusion matrix, you should see a line like this: - -`I0730 16:57:38.073777 55030 train.py:245] Step 400: Validation accuracy = 26.3% -(N=3093)` - -It's good practice to separate your data set into three categories. 
The largest -(in this case roughly 80% of the data) is used for training the network, a -smaller set (10% here, known as "validation") is reserved for evaluation of the -accuracy during training, and another set (the last 10%, "testing") is used to -evaluate the accuracy once after the training is complete. - -The reason for this split is that there's always a danger that networks will -start memorizing their inputs during training. By keeping the validation set -separate, you can ensure that the model works with data it's never seen before. -The testing set is an additional safeguard to make sure that you haven't just -been tweaking your model in a way that happens to work for both the training and -validation sets, but not a broader range of inputs. - -The training script automatically separates the data set into these three -categories, and the logging line above shows the accuracy of model when run on -the validation set. Ideally, this should stick fairly close to the training -accuracy. If the training accuracy increases but the validation doesn't, that's -a sign that overfitting is occurring, and your model is only learning things -about the training clips, not broader patterns that generalize. - -## Tensorboard - -A good way to visualize how the training is progressing is using Tensorboard. By -default, the script saves out events to /tmp/retrain_logs, and you can load -these by running: - -`tensorboard --logdir /tmp/retrain_logs` - -Then navigate to [http://localhost:6006](http://localhost:6006) in your browser, -and you'll see charts and graphs showing your models progress. - -
- -
- -## Training Finished - -After a few hours of training (depending on your machine's speed), the script -should have completed all 18,000 steps. It will print out a final confusion -matrix, along with an accuracy score, all run on the testing set. With the -default settings, you should see an accuracy of between 85% and 90%. - -Because audio recognition is particularly useful on mobile devices, next we'll -export it to a compact format that's easy to work with on those platforms. To do -that, run this command line: - -``` -python tensorflow/examples/speech_commands/freeze.py \ ---start_checkpoint=/tmp/speech_commands_train/conv.ckpt-18000 \ ---output_file=/tmp/my_frozen_graph.pb -``` - -Once the frozen model has been created, you can test it with the `label_wav.py` -script, like this: - -``` -python tensorflow/examples/speech_commands/label_wav.py \ ---graph=/tmp/my_frozen_graph.pb \ ---labels=/tmp/speech_commands_train/conv_labels.txt \ ---wav=/tmp/speech_dataset/left/a5d485dc_nohash_0.wav -``` - -This should print out three labels: - -``` -left (score = 0.81477) -right (score = 0.14139) -_unknown_ (score = 0.03808) -``` - -Hopefully "left" is the top score since that's the correct label, but since the -training is random it may not for the first file you try. Experiment with some -of the other .wav files in that same folder to see how well it does. - -The scores are between zero and one, and higher values mean the model is more -confident in its prediction. - -## Running the Model in an Android App - -The easiest way to see how this model works in a real application is to download -[the prebuilt Android demo -applications](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#prebuilt-components) -and install them on your phone. You'll see 'TF Speech' appear in your app list, -and opening it will show you the same list of action words we've just trained -our model on, starting with "Yes" and "No". Once you've given the app permission -to use the microphone, you should be able to try saying those words and see them -highlighted in the UI when the model recognizes one of them. - -You can also build this application yourself, since it's open source and -[available as part of the TensorFlow repository on -github](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#building-in-android-studio-using-the-tensorflow-aar-from-jcenter). -By default it downloads [a pretrained model from -tensorflow.org](http://download.tensorflow.org/models/speech_commands_v0.02.zip), -but you can easily [replace it with a model you've trained -yourself](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install-model-files-optional). -If you do this, you'll need to make sure that the constants in [the main -SpeechActivity Java source -file](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android/src/org/tensorflow/demo/SpeechActivity.java) -like `SAMPLE_RATE` and `SAMPLE_DURATION` match any changes you've made to the -defaults while training. You'll also see that there's a [Java version of the -RecognizeCommands -module](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android/src/org/tensorflow/demo/RecognizeCommands.java) -that's very similar to the C++ version in this tutorial. If you've tweaked -parameters for that, you can also update them in SpeechActivity to get the same -results as in your server testing. 
- -The demo app updates its UI list of results automatically based on the labels -text file you copy into assets alongside your frozen graph, which means you can -easily try out different models without needing to make any code changes. You -will need to update `LABEL_FILENAME` and `MODEL_FILENAME` to point to the files -you've added if you change the paths though. - -## How does this Model Work? - -The architecture used in this tutorial is based on some described in the paper -[Convolutional Neural Networks for Small-footprint Keyword -Spotting](http://www.isca-speech.org/archive/interspeech_2015/papers/i15_1478.pdf). -It was chosen because it's comparatively simple, quick to train, and easy to -understand, rather than being state of the art. There are lots of different -approaches to building neural network models to work with audio, including -[recurrent networks](https://svds.com/tensorflow-rnn-tutorial/) or [dilated -(atrous) -convolutions](https://deepmind.com/blog/wavenet-generative-model-raw-audio/). -This tutorial is based on the kind of convolutional network that will feel very -familiar to anyone who's worked with image recognition. That may seem surprising -at first though, since audio is inherently a one-dimensional continuous signal -across time, not a 2D spatial problem. - -We solve that issue by defining a window of time we believe our spoken words -should fit into, and converting the audio signal in that window into an image. -This is done by grouping the incoming audio samples into short segments, just a -few milliseconds long, and calculating the strength of the frequencies across a -set of bands. Each set of frequency strengths from a segment is treated as a -vector of numbers, and those vectors are arranged in time order to form a -two-dimensional array. This array of values can then be treated like a -single-channel image, and is known as a -[spectrogram](https://en.wikipedia.org/wiki/Spectrogram). If you want to view -what kind of image an audio sample produces, you can run the `wav_to_spectrogram -tool: - -``` -bazel run tensorflow/examples/wav_to_spectrogram:wav_to_spectrogram -- \ ---input_wav=/tmp/speech_dataset/happy/ab00c4b2_nohash_0.wav \ ---output_image=/tmp/spectrogram.png -``` - -If you open up `/tmp/spectrogram.png` you should see something like this: - -
- -
- -Because of TensorFlow's memory order, time in this image is increasing from top -to bottom, with frequencies going from left to right, unlike the usual -convention for spectrograms where time is left to right. You should be able to -see a couple of distinct parts, with the first syllable "Ha" distinct from -"ppy". - -Because the human ear is more sensitive to some frequencies than others, it's -been traditional in speech recognition to do further processing to this -representation to turn it into a set of [Mel-Frequency Cepstral -Coefficients](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum), or MFCCs -for short. This is also a two-dimensional, one-channel representation so it can -be treated like an image too. If you're targeting general sounds rather than -speech you may find you can skip this step and operate directly on the -spectrograms. - -The image that's produced by these processing steps is then fed into a -multi-layer convolutional neural network, with a fully-connected layer followed -by a softmax at the end. You can see the definition of this portion in -[tensorflow/examples/speech_commands/models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py). - -## Streaming Accuracy - -Most audio recognition applications need to run on a continuous stream of audio, -rather than on individual clips. A typical way to use a model in this -environment is to apply it repeatedly at different offsets in time and average -the results over a short window to produce a smoothed prediction. If you think -of the input as an image, it's continuously scrolling along the time axis. The -words we want to recognize can start at any time, so we need to take a series of -snapshots to have a chance of having an alignment that captures most of the -utterance in the time window we feed into the model. If we sample at a high -enough rate, then we have a good chance of capturing the word in multiple -windows, so averaging the results improves the overall confidence of the -prediction. - -For an example of how you can use your model on streaming data, you can look at -[test_streaming_accuracy.cc](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/). -This uses the -[RecognizeCommands](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/recognize_commands.h) -class to run through a long-form input audio, try to spot words, and compare -those predictions against a ground truth list of labels and times. This makes it -a good example of applying a model to a stream of audio signals over time. - -You'll need a long audio file to test it against, along with labels showing -where each word was spoken. If you don't want to record one yourself, you can -generate some synthetic test data using the `generate_streaming_test_wav` -utility. By default this will create a ten minute .wav file with words roughly -every three seconds, and a text file containing the ground truth of when each -word was spoken. These words are pulled from the test portion of your current -dataset, mixed in with background noise. To run it, use: - -``` -bazel run tensorflow/examples/speech_commands:generate_streaming_test_wav -``` - -This will save a .wav file to `/tmp/speech_commands_train/streaming_test.wav`, -and a text file listing the labels to -`/tmp/speech_commands_train/streaming_test_labels.txt`. 
You can then run accuracy testing with:
-
-```
-bazel run tensorflow/examples/speech_commands:test_streaming_accuracy -- \
---graph=/tmp/my_frozen_graph.pb \
---labels=/tmp/speech_commands_train/conv_labels.txt \
---wav=/tmp/speech_commands_train/streaming_test.wav \
---ground_truth=/tmp/speech_commands_train/streaming_test_labels.txt \
---verbose
-```
-
-This will output information about the number of words correctly matched, how
-many were given the wrong labels, and how many times the model triggered when
-there was no real word spoken. There are various parameters that control how the
-signal averaging works, including `--average_window_ms` which sets the length of
-time to average results over, `--clip_stride_ms` which is the time between
-applications of the model, `--suppression_ms` which stops subsequent word
-detections from triggering for a certain time after an initial one is found, and
-`--detection_threshold`, which controls how high the average score must be
-before it's considered a solid result.
-
-You'll see that the streaming accuracy tool outputs three numbers, rather than
-just the one metric used in training. This is because different applications
-have varying requirements, with some being able to tolerate frequent incorrect
-results as long as real words are found (high recall), while others are very
-focused on ensuring the predicted labels are highly likely to be correct even
-if some aren't detected (high precision). The numbers from the tool give you an
-idea of how your model will perform in an application, and you can try tweaking
-the signal averaging parameters to tune it to give the kind of performance you
-want. To understand what the right parameters are for your application, you can
-look at generating an [ROC
-curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic) to help
-you understand the tradeoffs.
-
-## RecognizeCommands
-
-The streaming accuracy tool uses a simple decoder contained in a small C++ class
-called
-[RecognizeCommands](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/recognize_commands.h).
-This class is fed the output of running the TensorFlow model over time; it
-averages the signals and returns information about a label when it has enough
-evidence to think that a recognized word has been found. The implementation is
-fairly small, just keeping track of the last few predictions and averaging them,
-so it's easy to port to other platforms and languages as needed. For example,
-it's convenient to do something similar at the Java level on Android, or Python
-on the Raspberry Pi. As long as these implementations share the same logic, you
-can tune the parameters that control the averaging using the streaming test
-tool, and then transfer them over to your application to get similar results.
-
-## Advanced Training
-
-The defaults for the training script are designed to produce good end-to-end
-results in a comparatively small file, but there are a lot of options you can
-change to customize the results for your own requirements.
-
-### Custom Training Data
-
-By default the script will download the [Speech Commands
-dataset](https://download.tensorflow.org/data/speech_commands_v0.01.tgz), but
-you can also supply your own training data. To train on your own data, you
-should make sure that you have at least several hundred recordings of each sound
-you would like to recognize, and arrange them into folders by class.
-For example, if you were trying to recognize dog barks from cat miaows, you
-would create a root folder called `animal_sounds`, and then within that two
-sub-folders called `bark` and `miaow`. You would then organize your audio files
-into the appropriate folders.
-
-To point the script to your new audio files, you'll need to set `--data_url=` to
-disable downloading of the Speech Commands dataset, and
-`--data_dir=/your/data/folder/` to find the files you've just created.
-
-The files themselves should be in 16-bit little-endian PCM-encoded WAVE format.
-The sample rate defaults to 16,000, but as long as all your audio is
-consistently the same rate (the script doesn't support resampling) you can
-change this with the `--sample_rate` argument. The clips should also all be
-roughly the same duration. The default expected duration is one second, but you
-can set this with the `--clip_duration_ms` flag. If you have clips with
-variable amounts of silence at the start, you can look at word alignment tools
-to standardize them
-([here's a quick and dirty approach you can use
-too](https://petewarden.com/2017/07/17/a-quick-hack-to-align-single-word-audio-recordings/)).
-
-One issue to watch out for is that you may have very similar repetitions of the
-same sounds in your dataset, and these can give misleading metrics if they're
-spread across your training, validation, and test sets. For example, the Speech
-Commands set has people repeating the same word multiple times. Each one of
-those repetitions is likely to be pretty close to the others, so if training was
-overfitting and memorizing one, it could perform unrealistically well when it
-saw a very similar copy in the test set. To avoid this danger, Speech Commands
-tries to ensure that all clips featuring the same word spoken by a single person
-are put into the same partition. Clips are assigned to training, test, or
-validation sets based on a hash of their filename, to ensure that the
-assignments remain steady even as new clips are added and avoid any training
-samples migrating into the other sets. To make sure that all a given speaker's
-words are in the same bucket, [the hashing
-function](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/input_data.py)
-ignores anything in a filename after '_nohash_' when calculating the
-assignments. This means that if you have file names like `pete_nohash_0.wav` and
-`pete_nohash_1.wav`, they're guaranteed to be in the same set.
-
-### Unknown Class
-
-It's likely that your application will hear sounds that aren't in your training
-set, and you'll want the model to indicate that it doesn't recognize the noise
-in those cases. To help the network learn what sounds to ignore, you need to
-provide some clips of audio that belong to none of your classes. To do this,
-you'd create `quack`, `oink`, and `moo` subfolders and populate them with noises
-from other animals your users might encounter. The `--wanted_words` argument to
-the script defines which classes you care about; all the others mentioned in
-subfolder names will be used to populate an `_unknown_` class during training.
-The Speech Commands dataset has twenty words in its unknown classes, including
-the digits zero through nine and random names like "Sheila".
-
-By default 10% of the training examples are picked from the unknown classes, but
-you can control this with the `--unknown_percentage` flag.
-Increasing this will make the model less likely to mistake unknown words for
-wanted ones, but making it too large can backfire as the model might decide
-it's safest to categorize all words as unknown!
-
-### Background Noise
-
-Real applications have to recognize audio even when there are other irrelevant
-sounds happening in the environment. To build a model that's robust to this kind
-of interference, we need to train against recorded audio with similar
-properties. The files in the Speech Commands dataset were captured on a variety
-of devices by users in many different environments, not in a studio, so that
-helps add some realism to the training. To add even more, you can mix in random
-segments of environmental audio to the training inputs. In the Speech Commands
-set there's a special folder called `_background_noise_` which contains
-minute-long WAVE files with white noise and recordings of machinery and everyday
-household activity.
-
-Small snippets of these files are chosen at random and mixed at a low volume
-into clips during training. The loudness is also chosen randomly, and controlled
-by the `--background_volume` argument as a proportion where 0 is silence, and 1
-is full volume. Not all clips have background noise added, so the
-`--background_frequency` flag controls what proportion have it mixed in.
-
-Your own application might operate in its own environment with different
-background noise patterns than these defaults, so you can supply your own audio
-clips in the `_background_noise_` folder. These should be the same sample rate
-as your main dataset, but much longer in duration so that a good set of random
-segments can be selected from them.
-
-### Silence
-
-In most cases the sounds you care about will be intermittent and so it's
-important to know when there's no matching audio. To support this, there's a
-special `_silence_` label that indicates when the model detects nothing
-interesting. Because there's never complete silence in real environments, we
-actually have to supply examples with quiet and irrelevant audio. For this, we
-reuse the `_background_noise_` folder that's also mixed into real clips,
-pulling short sections of the audio data and feeding those in with the ground
-truth class of `_silence_`. By default 10% of the training data is supplied like
-this, but the `--silence_percentage` flag can be used to control the proportion.
-As with unknown words, setting this higher can weight the model results in favor
-of true positives for silence, at the expense of false negatives for words, but
-too large a proportion can cause it to fall into the trap of always guessing
-silence.
-
-### Time Shifting
-
-Adding in background noise is one way of distorting the training data
-realistically to effectively increase the size of the dataset, and so improve
-overall accuracy; time shifting is another. This involves a random offset in
-time of the training sample data, so that a small part of the start or end is
-cut off and the opposite section is padded with zeroes. This mimics the natural
-variations in starting time in the training data, and is controlled with the
-`--time_shift_ms` flag, which defaults to 100ms. Increasing this value will
-provide more variation, but at the risk of cutting off important parts of the
-audio. A related way of augmenting the data with realistic distortions is by
-using [time stretching and pitch
-scaling](https://en.wikipedia.org/wiki/Audio_time_stretching_and_pitch_scaling),
-but that's outside the scope of this tutorial.
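-
-To make the time-shifting operation concrete, here is a small numpy sketch of
-the random offset, crop, and zero-padding described above; the helper and its
-defaults are illustrative rather than taken from the training script:
-
-```python
-import numpy as np
-
-def random_time_shift(samples, sample_rate=16000, time_shift_ms=100.0):
-  """Randomly shifts a clip in time, zero-padding the gap it leaves."""
-  max_shift = int(sample_rate * time_shift_ms / 1000)
-  shift = np.random.randint(-max_shift, max_shift + 1)
-  shifted = np.zeros_like(samples)
-  if shift > 0:    # pad the start, crop the end
-    shifted[shift:] = samples[:len(samples) - shift]
-  elif shift < 0:  # crop the start, pad the end
-    shifted[:shift] = samples[-shift:]
-  else:
-    shifted[:] = samples
-  return shifted
-```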
-
-## Customizing the Model
-
-The default model used for this script is pretty large, taking over 800 million
-FLOPs for each inference and using 940,000 weight parameters. This runs at
-usable speeds on desktop machines or modern phones, but it involves too many
-calculations to run at interactive speeds on devices with more limited
-resources. To support these use cases, there are a couple of alternatives
-available:
-
-
-**low_latency_conv**
-Based on the 'cnn-one-fstride4' topology described in the [Convolutional
-Neural Networks for Small-footprint Keyword Spotting
-paper](http://www.isca-speech.org/archive/interspeech_2015/papers/i15_1478.pdf).
-The accuracy is slightly lower than 'conv' but the number of weight parameters
-is about the same, and it only needs 11 million FLOPs to run one prediction,
-making it much faster.
-
-To use this model, you specify `--model_architecture=low_latency_conv` on
-the command line. You'll also need to update the training rates and the number
-of steps, so the full command will look like:
-
-```
-python tensorflow/examples/speech_commands/train \
---model_architecture=low_latency_conv \
---how_many_training_steps=20000,6000 \
---learning_rate=0.01,0.001
-```
-
-This asks the script to train with a learning rate of 0.01 for 20,000 steps, and
-then do a fine-tuning pass of 6,000 steps with a 10x smaller rate.
-
-**low_latency_svdf**
-Based on the topology presented in the [Compressing Deep Neural Networks using a
-Rank-Constrained Topology paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43813.pdf).
-The accuracy is also lower than 'conv' but it only uses about 750 thousand
-parameters, and most significantly, it allows for an optimized execution at
-test time (i.e. when you will actually use it in your application), resulting
-in 750 thousand FLOPs.
-
-To use this model, you specify `--model_architecture=low_latency_svdf` on
-the command line, and update the training rates and the number
-of steps, so the full command will look like:
-
-```
-python tensorflow/examples/speech_commands/train \
---model_architecture=low_latency_svdf \
---how_many_training_steps=100000,35000 \
---learning_rate=0.01,0.005
-```
-
-Note that despite requiring a larger number of steps than the previous two
-topologies, the reduced number of computations means that training should take
-about the same time, and at the end reach an accuracy of around 85%.
-You can also further tune the topology fairly easily for computation and
-accuracy by changing these parameters in the SVDF layer:
-
-* `rank` - The rank of the approximation (higher is typically better, but
-  results in more computation).
-* `num_units` - Similar to other layer types, specifies the number of nodes in
-  the layer (more nodes gives better quality, and more computation).
-
-Regarding runtime, since the layer allows optimizations by caching some of the
-internal neural network activations, you need to make sure to use a consistent
-stride (e.g. the `--clip_stride_ms` flag) both when you freeze the graph, and
-when executing the model in streaming mode (e.g. `test_streaming_accuracy.cc`).
-
-**Other parameters to customize**
-If you want to experiment with customizing models, a good place to start is by
-tweaking the spectrogram creation parameters.
This has the effect of altering -the size of the input image to the model, and the creation code in -[models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py) -will adjust the number of computations and weights automatically to fit with -different dimensions. If you make the input smaller, the model will need fewer -computations to process it, so it can be a great way to trade off some accuracy -for improved latency. The `--window_stride_ms` controls how far apart each -frequency analysis sample is from the previous. If you increase this value, then -fewer samples will be taken for a given duration, and the time axis of the input -will shrink. The `--dct_coefficient_count` flag controls how many buckets are -used for the frequency counting, so reducing this will shrink the input in the -other dimension. The `--window_size_ms` argument doesn't affect the size, but -does control how wide the area used to calculate the frequencies is for each -sample. Reducing the duration of the training samples, controlled by -`--clip_duration_ms`, can also help if the sounds you're looking for are short, -since that also reduces the time dimension of the input. You'll need to make -sure that all your training data contains the right audio in the initial portion -of the clip though. - -If you have an entirely different model in mind for your problem, you may find -that you can plug it into -[models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py) -and have the rest of the script handle all of the preprocessing and training -mechanics. You would add a new clause to `create_model`, looking for the name of -your architecture and then calling a model creation function. This function is -given the size of the spectrogram input, along with other model information, and -is expected to create TensorFlow ops to read that in and produce an output -prediction vector, and a placeholder to control the dropout rate. The rest of -the script will handle integrating this model into a larger graph doing the -input calculations and applying softmax and a loss function to train it. - -One common problem when you're adjusting models and training hyper-parameters is -that not-a-number values can creep in, thanks to numerical precision issues. In -general you can solve these by reducing the magnitude of things like learning -rates and weight initialization functions, but if they're persistent you can -enable the `--check_nans` flag to track down the source of the errors. This will -insert check ops between most regular operations in TensorFlow, and abort the -training process with a useful error message when they're encountered. diff --git a/tensorflow/docs_src/tutorials/deep_cnn.md b/tensorflow/docs_src/tutorials/deep_cnn.md deleted file mode 100644 index 44a32d9d1d..0000000000 --- a/tensorflow/docs_src/tutorials/deep_cnn.md +++ /dev/null @@ -1,452 +0,0 @@ -# Convolutional Neural Networks - -> **NOTE:** This tutorial is intended for *advanced* users of TensorFlow -and assumes expertise and experience in machine learning. - -## Overview - -CIFAR-10 classification is a common benchmark problem in machine learning. The -problem is to classify RGB 32x32 pixel images across 10 categories: -``` -airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. 
-``` - -For more details refer to the [CIFAR-10 page](https://www.cs.toronto.edu/~kriz/cifar.html) -and a [Tech Report](https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf) -by Alex Krizhevsky. - -### Goals - -The goal of this tutorial is to build a relatively small [convolutional neural -network](https://en.wikipedia.org/wiki/Convolutional_neural_network) (CNN) for -recognizing images. In the process, this tutorial: - -1. Highlights a canonical organization for network architecture, -training and evaluation. -2. Provides a template for constructing larger and more sophisticated models. - -The reason CIFAR-10 was selected was that it is complex enough to exercise -much of TensorFlow's ability to scale to large models. At the same time, -the model is small enough to train fast, which is ideal for trying out -new ideas and experimenting with new techniques. - -### Highlights of the Tutorial -The CIFAR-10 tutorial demonstrates several important constructs for -designing larger and more sophisticated models in TensorFlow: - -* Core mathematical components including @{tf.nn.conv2d$convolution} -([wiki](https://en.wikipedia.org/wiki/Convolution)), -@{tf.nn.relu$rectified linear activations} -([wiki](https://en.wikipedia.org/wiki/Rectifier_(neural_networks))), -@{tf.nn.max_pool$max pooling} -([wiki](https://en.wikipedia.org/wiki/Convolutional_neural_network#Pooling_layer)) -and @{tf.nn.local_response_normalization$local response normalization} -(Chapter 3.3 in -[AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)). -* @{$summaries_and_tensorboard$Visualization} -of network activities during training, including input images, -losses and distributions of activations and gradients. -* Routines for calculating the -@{tf.train.ExponentialMovingAverage$moving average} -of learned parameters and using these averages -during evaluation to boost predictive performance. -* Implementation of a -@{tf.train.exponential_decay$learning rate schedule} -that systematically decrements over time. -* Prefetching @{tf.train.shuffle_batch$queues} -for input -data to isolate the model from disk latency and expensive image pre-processing. - -We also provide a [multi-GPU version](#training-a-model-using-multiple-gpu-cards) -of the model which demonstrates: - -* Configuring a model to train across multiple GPU cards in parallel. -* Sharing and updating variables among multiple GPUs. - -We hope that this tutorial provides a launch point for building larger CNNs for -vision tasks on TensorFlow. - -### Model Architecture - -The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of -alternating convolutions and nonlinearities. These layers are followed by fully -connected layers leading into a softmax classifier. The model follows the -architecture described by -[Alex Krizhevsky](https://code.google.com/p/cuda-convnet/), with a few -differences in the top few layers. - -This model achieves a peak performance of about 86% accuracy within a few hours -of training time on a GPU. Please see [below](#evaluating-a-model) and the code -for details. It consists of 1,068,298 learnable parameters and requires about -19.5M multiply-add operations to compute inference on a single image. - -## Code Organization - -The code for this tutorial resides in -[`models/tutorials/image/cifar10/`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/). 
- -File | Purpose ---- | --- -[`cifar10_input.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format. -[`cifar10.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10.py) | Builds the CIFAR-10 model. -[`cifar10_train.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU. -[`cifar10_multi_gpu_train.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs. -[`cifar10_eval.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model. - - -## CIFAR-10 Model - -The CIFAR-10 network is largely contained in -[`cifar10.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10.py). -The complete training -graph contains roughly 765 operations. We find that we can make the code most -reusable by constructing the graph with the following modules: - -1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add -operations that read and preprocess CIFAR images for evaluation and training, -respectively. -1. [**Model prediction:**](#model-prediction) `inference()` -adds operations that perform inference, i.e. classification, on supplied images. -1. [**Model training:**](#model-training) `loss()` and `train()` -add operations that compute the loss, -gradients, variable updates and visualization summaries. - -### Model Inputs - -The input part of the model is built by the functions `inputs()` and -`distorted_inputs()` which read images from the CIFAR-10 binary data files. -These files contain fixed byte length records, so we use -@{tf.FixedLengthRecordReader}. -See @{$reading_data#reading-from-files$Reading Data} to -learn more about how the `Reader` class works. - -The images are processed as follows: - -* They are cropped to 24 x 24 pixels, centrally for evaluation or - @{tf.random_crop$randomly} for training. -* They are @{tf.image.per_image_standardization$approximately whitened} - to make the model insensitive to dynamic range. - -For training, we additionally apply a series of random distortions to -artificially increase the data set size: - -* @{tf.image.random_flip_left_right$Randomly flip} the image from left to right. -* Randomly distort the @{tf.image.random_brightness$image brightness}. -* Randomly distort the @{tf.image.random_contrast$image contrast}. - -Please see the @{$python/image$Images} page for the list of -available distortions. We also attach an -@{tf.summary.image} to the images -so that we may visualize them in @{$summaries_and_tensorboard$TensorBoard}. -This is a good practice to verify that inputs are built correctly. - -
-
-*(image: distorted CIFAR-10 input images as shown in TensorBoard)*
-
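-As a rough sketch of how those distortions chain together, the steps above can
-be written with the ops just listed. The crop size matches the tutorial, but
-the brightness and contrast parameters here are illustrative assumptions rather
-than necessarily the exact values in `cifar10_input.py`:
-
-```python
-import tensorflow as tf
-
-def distort_for_training(image):
-  """Applies the random distortions described above to one 32x32x3 image."""
-  # Randomly crop a 24x24 section of the input image.
-  distorted = tf.random_crop(image, [24, 24, 3])
-  # Randomly flip the image horizontally.
-  distorted = tf.image.random_flip_left_right(distorted)
-  # Randomly perturb the brightness and contrast.
-  distorted = tf.image.random_brightness(distorted, max_delta=63)
-  distorted = tf.image.random_contrast(distorted, lower=0.2, upper=1.8)
-  # Approximate whitening: zero mean and unit variance per image.
-  return tf.image.per_image_standardization(distorted)
-```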
- -Reading images from disk and distorting them can use a non-trivial amount of -processing time. To prevent these operations from slowing down training, we run -them inside 16 separate threads which continuously fill a TensorFlow -@{tf.train.shuffle_batch$queue}. - -### Model Prediction - -The prediction part of the model is constructed by the `inference()` function -which adds operations to compute the *logits* of the predictions. That part of -the model is organized as follows: - -Layer Name | Description ---- | --- -`conv1` | @{tf.nn.conv2d$convolution} and @{tf.nn.relu$rectified linear} activation. -`pool1` | @{tf.nn.max_pool$max pooling}. -`norm1` | @{tf.nn.local_response_normalization$local response normalization}. -`conv2` | @{tf.nn.conv2d$convolution} and @{tf.nn.relu$rectified linear} activation. -`norm2` | @{tf.nn.local_response_normalization$local response normalization}. -`pool2` | @{tf.nn.max_pool$max pooling}. -`local3` | @{$python/nn$fully connected layer with rectified linear activation}. -`local4` | @{$python/nn$fully connected layer with rectified linear activation}. -`softmax_linear` | linear transformation to produce logits. - -Here is a graph generated from TensorBoard describing the inference operation: - -
-
-*(image: TensorBoard graph of the CIFAR-10 inference operations)*
-
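-To make the layer table concrete, here is a condensed sketch of that stack in
-raw TensorFlow ops. The weight shapes and dense-layer sizes shown are
-illustrative assumptions, not necessarily the values in `cifar10.py`, which
-remains the authoritative definition:
-
-```python
-import tensorflow as tf
-
-def inference_sketch(images):
-  """Condensed conv/pool/norm stack; images: [batch, 24, 24, 3]."""
-  def conv_relu(x, name, shape):
-    kernel = tf.get_variable(
-        name, shape, initializer=tf.truncated_normal_initializer(stddev=5e-2))
-    return tf.nn.relu(tf.nn.conv2d(x, kernel, [1, 1, 1, 1], padding='SAME'))
-
-  conv1 = conv_relu(images, 'conv1_weights', [5, 5, 3, 64])
-  pool1 = tf.nn.max_pool(conv1, [1, 3, 3, 1], [1, 2, 2, 1], padding='SAME')
-  norm1 = tf.nn.local_response_normalization(pool1)
-  conv2 = conv_relu(norm1, 'conv2_weights', [5, 5, 64, 64])
-  norm2 = tf.nn.local_response_normalization(conv2)
-  pool2 = tf.nn.max_pool(norm2, [1, 3, 3, 1], [1, 2, 2, 1], padding='SAME')
-  flat = tf.reshape(pool2, [-1, 6 * 6 * 64])  # 24x24 halved twice -> 6x6
-  local3 = tf.layers.dense(flat, 384, activation=tf.nn.relu, name='local3')
-  local4 = tf.layers.dense(local3, 192, activation=tf.nn.relu, name='local4')
-  # Linear transformation producing un-normalized logits.
-  return tf.layers.dense(local4, 10, name='softmax_linear')
-```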
-
-> **EXERCISE**: The outputs of `inference` are un-normalized logits. Try editing
-the network architecture to return normalized predictions using
-@{tf.nn.softmax}.
-
-The `inputs()` and `inference()` functions provide all the components
-necessary to perform an evaluation of a model. We now shift our focus towards
-building operations for training a model.
-
-> **EXERCISE:** The model architecture in `inference()` differs slightly from
-the CIFAR-10 model specified in
-[cuda-convnet](https://code.google.com/p/cuda-convnet/). In particular, the top
-layers of Alex's original model are locally connected and not fully connected.
-Try editing the architecture to exactly reproduce the locally connected
-architecture in the top layer.
-
-### Model Training
-
-The usual method for training a network to perform N-way classification is
-[multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
-also known as *softmax regression*. Softmax regression applies a
-@{tf.nn.softmax$softmax} nonlinearity to the
-output of the network and calculates the
-@{tf.nn.sparse_softmax_cross_entropy_with_logits$cross-entropy}
-between the normalized predictions and the label index.
-For regularization, we also apply the usual
-@{tf.nn.l2_loss$weight decay} losses to all learned
-variables. The objective function for the model is the sum of the cross entropy
-loss and all these weight decay terms, as returned by the `loss()` function.
-
-We visualize it in TensorBoard with a @{tf.summary.scalar}:
-
-![CIFAR-10 Loss](https://www.tensorflow.org/images/cifar_loss.png "CIFAR-10 Total Loss")
-
-We train the model using the standard
-[gradient descent](https://en.wikipedia.org/wiki/Gradient_descent)
-algorithm (see @{$python/train$Training} for other methods)
-with a learning rate that
-@{tf.train.exponential_decay$exponentially decays}
-over time.
-
-![CIFAR-10 Learning Rate Decay](https://www.tensorflow.org/images/cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")
-
-The `train()` function adds the operations needed to minimize the objective by
-calculating the gradient and updating the learned variables (see
-@{tf.train.GradientDescentOptimizer}
-for details). It returns an operation that executes all the calculations
-needed to train and update the model for one batch of images.
-
-## Launching and Training the Model
-
-Now that we have built the model, let's launch it and run the training
-operation with the script `cifar10_train.py`.
-
-```shell
-python cifar10_train.py
-```
-
-> **NOTE:** The first time you run any target in the CIFAR-10 tutorial,
-the CIFAR-10 dataset is automatically downloaded. The data set is ~160MB
-so you may want to grab a quick cup of coffee for your first run.
-
-You should see the output:
-
-```shell
-Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
-2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
-2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
-2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
-2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
-2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
-2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
-...
-```
-
-The script reports the total loss every 10 steps as well as the speed at which
-the last batch of data was processed.
A few comments: - -* The first batch of data can be inordinately slow (e.g. several minutes) as the -preprocessing threads fill up the shuffling queue with 20,000 processed CIFAR -images. - -* The reported loss is the average loss of the most recent batch. Remember that -this loss is the sum of the cross entropy and all weight decay terms. - -* Keep an eye on the processing speed of a batch. The numbers shown above were -obtained on a Tesla K40c. If you are running on a CPU, expect slower performance. - - -> **EXERCISE:** When experimenting, it is sometimes annoying that the first -training step can take so long. Try decreasing the number of images that -initially fill up the queue. Search for `min_fraction_of_examples_in_queue` -in `cifar10_input.py`. - -`cifar10_train.py` periodically @{tf.train.Saver$saves} -all model parameters in -@{$guide/saved_model$checkpoint files} -but it does *not* evaluate the model. The checkpoint file -will be used by `cifar10_eval.py` to measure the predictive -performance (see [Evaluating a Model](#evaluating-a-model) below). - - -If you followed the previous steps, then you have now started training -a CIFAR-10 model. [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) - -The terminal text returned from `cifar10_train.py` provides minimal insight into -how the model is training. We want more insight into the model during training: - -* Is the loss *really* decreasing or is that just noise? -* Is the model being provided appropriate images? -* Are the gradients, activations and weights reasonable? -* What is the learning rate currently at? - -@{$summaries_and_tensorboard$TensorBoard} provides this -functionality, displaying data exported periodically from `cifar10_train.py` via -a -@{tf.summary.FileWriter}. - -For instance, we can watch how the distribution of activations and degree of -sparsity in `local3` features evolve during training: - -
-
-*(images: TensorBoard charts of `local3` activations and sparsity)*
-
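-If you want to export this kind of data from your own training loop, here is a
-minimal, self-contained sketch of the summary mechanism described above; the
-`local3`-style placeholder and the fake activation values are stand-ins for the
-real model tensors:
-
-```python
-import numpy as np
-import tensorflow as tf
-
-# `activations` stands in for the model's `local3` tensor.
-activations = tf.placeholder(tf.float32, [None, 384], name='local3')
-tf.summary.histogram('local3/activations', activations)
-tf.summary.scalar('local3/sparsity', tf.nn.zero_fraction(activations))
-merged = tf.summary.merge_all()
-
-writer = tf.summary.FileWriter('/tmp/summary_demo', tf.get_default_graph())
-with tf.Session() as sess:
-  for step in range(10):
-    # Fake ReLU outputs so the demo runs standalone.
-    fake_relu_output = np.maximum(np.random.randn(128, 384), 0.0)
-    writer.add_summary(sess.run(merged, {activations: fake_relu_output}), step)
-writer.close()
-```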
-
-Individual loss functions, as well as the total loss, are particularly
-interesting to track over time. However, the loss exhibits a considerable amount
-of noise due to the small batch size employed during training. In practice we
-find it extremely useful to visualize their moving averages in addition to
-their raw values. See how the scripts use
-@{tf.train.ExponentialMovingAverage}
-for this purpose.
-
-## Evaluating a Model
-
-Let us now evaluate how well the trained model performs on a hold-out data set.
-The model is evaluated by the script `cifar10_eval.py`. It constructs the model
-with the `inference()` function and uses all 10,000 images in the evaluation set
-of CIFAR-10. It calculates the *precision at 1:* how often the top prediction
-matches the true label of the image.
-
-To monitor how the model improves during training, the evaluation script runs
-periodically on the latest checkpoint files created by `cifar10_train.py`.
-
-```shell
-python cifar10_eval.py
-```
-
-> Be careful not to run the evaluation and training binary on the same GPU or
-else you might run out of memory. Consider running the evaluation on
-a separate GPU if available or suspending the training binary while running
-the evaluation on the same GPU.
-
-You should see the output:
-
-```shell
-2015-11-06 08:30:44.391206: precision @ 1 = 0.860
-...
-```
-
-The script merely returns the precision @ 1 periodically -- in this case
-it returned 86% accuracy. `cifar10_eval.py` also
-exports summaries that may be visualized in TensorBoard. These summaries
-provide additional insight into the model during evaluation.
-
-The training script calculates the
-@{tf.train.ExponentialMovingAverage$moving average}
-version of all learned variables. The evaluation script substitutes
-all learned model parameters with the moving average version. This
-substitution boosts model performance at evaluation time.
-
-> **EXERCISE:** Employing averaged parameters may boost predictive performance
-by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ
-the averaged parameters for the model and verify that the predictive performance
-drops.
-
-
-## Training a Model Using Multiple GPU Cards
-
-Modern workstations may contain multiple GPUs for scientific computation.
-TensorFlow can leverage this environment to run the training operation
-concurrently across multiple cards.
-
-Training a model in a parallel, distributed fashion requires
-coordinating training processes. For what follows, we use the term
-*model replica* to mean one copy of a model training on a subset of data.
-
-Naively employing asynchronous updates of model parameters
-leads to sub-optimal training performance
-because an individual model replica might be trained on a stale
-copy of the model parameters. Conversely, employing fully synchronous
-updates will be as slow as the slowest model replica.
-
-In a workstation with multiple GPU cards, each GPU will have similar speed
-and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to
-design our training system in the following manner:
-
-* Place an individual model replica on each GPU.
-* Update model parameters synchronously by waiting for all GPUs to finish
-processing a batch of data.
-
-Here is a diagram of this model:
-
-
-*(diagram: per-GPU model towers with shared parameters stored on the CPU)*
-
-
-Note that each GPU computes inference as well as the gradients for a unique
-batch of data. This setup effectively permits dividing up a larger batch
-of data across the GPUs.
-
-This setup requires that all GPUs share the model parameters. A well-known
-fact is that transferring data to and from GPUs is quite slow. For this
-reason, we decide to store and update all model parameters on the CPU (see
-green box). A fresh set of model parameters is transferred to the GPU
-when a new batch of data is processed by all GPUs.
-
-The GPUs are synchronized in operation. All gradients are accumulated from
-the GPUs and averaged (see green box). The model parameters are updated with
-the gradients averaged across all model replicas.
-
-### Placing Variables and Operations on Devices
-
-Placing operations and variables on devices requires some special
-abstractions.
-
-The first abstraction we require is a function for computing inference and
-gradients for a single model replica. In the code we term this abstraction
-a "tower". We must set two attributes for each tower:
-
-* A unique name for all operations within a tower.
-@{tf.name_scope} provides
-this unique name by prepending a scope. For instance, all operations in
-the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`.
-
-* A preferred hardware device to run the operation within a tower.
-@{tf.device} specifies this. For
-instance, all operations in the first tower reside within `device('/device:GPU:0')`
-scope indicating that they should be run on the first GPU.
-
-A condensed sketch of this tower pattern appears at the end of this tutorial.
-
-All variables are pinned to the CPU and accessed via
-@{tf.get_variable}
-in order to share them in a multi-GPU version.
-See how-to on @{$variables$Sharing Variables}.
-
-### Launching and Training the Model on Multiple GPU Cards
-
-If you have several GPU cards installed on your machine you can use them to
-train the model faster with the `cifar10_multi_gpu_train.py` script. This
-version of the training script parallelizes the model across multiple GPU cards.
-
-```shell
-python cifar10_multi_gpu_train.py --num_gpus=2
-```
-
-Note that the number of GPU cards used defaults to 1. Additionally, if only 1
-GPU is available on your machine, all computations will be placed on it, even if
-you ask for more.
-
-> **EXERCISE:** The default setting for `cifar10_train.py` is a
-batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs
-with a batch size of 64 and compare the training speed.
-
-## Next Steps
-
-[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have
-completed the CIFAR-10 tutorial.
-
-If you are now interested in developing and training your own image
-classification system, we recommend forking this tutorial and replacing
-components to address your image classification problem.
-
-
-> **EXERCISE:** Download the
-[Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set.
-Fork the CIFAR-10 tutorial and swap in SVHN as the input data. Try adapting
-the network architecture to improve predictive performance.
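-
-As a footnote to the multi-GPU section above, here is a condensed sketch of the
-tower pattern (a unique name scope plus device placement per replica, with
-gradients averaged before a single update). The `tower_loss()` body is a
-hypothetical stand-in for the real inference-plus-loss construction, and unlike
-the actual script this sketch does not pin variables to the CPU:
-
-```python
-import tensorflow as tf
-
-NUM_GPUS = 2
-
-def tower_loss(images, labels):
-  """Hypothetical stand-in for one replica's inference + loss."""
-  logits = tf.layers.dense(tf.layers.flatten(images), 10, name='logits')
-  return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
-      labels=labels, logits=logits))
-
-images = tf.placeholder(tf.float32, [None, 24, 24, 3])
-labels = tf.placeholder(tf.int32, [None])
-image_splits = tf.split(images, NUM_GPUS)
-label_splits = tf.split(labels, NUM_GPUS)
-
-opt = tf.train.GradientDescentOptimizer(0.1)
-tower_grads = []
-with tf.variable_scope(tf.get_variable_scope()):
-  for i in range(NUM_GPUS):
-    with tf.device('/device:GPU:%d' % i), tf.name_scope('tower_%d' % i):
-      loss = tower_loss(image_splits[i], label_splits[i])
-      tf.get_variable_scope().reuse_variables()  # share weights across towers
-      tower_grads.append(opt.compute_gradients(loss))
-
-# Average each variable's gradient across the towers, then apply once.
-averaged = []
-for grads_and_vars in zip(*tower_grads):
-  grads = tf.stack([g for g, _ in grads_and_vars])
-  averaged.append((tf.reduce_mean(grads, 0), grads_and_vars[0][1]))
-train_op = opt.apply_gradients(averaged)
-```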
diff --git a/tensorflow/docs_src/tutorials/eager/custom_training_walkthrough.md b/tensorflow/docs_src/tutorials/eager/custom_training_walkthrough.md new file mode 100644 index 0000000000..b45fbefac0 --- /dev/null +++ b/tensorflow/docs_src/tutorials/eager/custom_training_walkthrough.md @@ -0,0 +1,3 @@ +# Custom training: walkthrough + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/eager.ipynb) diff --git a/tensorflow/docs_src/tutorials/eager/index.md b/tensorflow/docs_src/tutorials/eager/index.md new file mode 100644 index 0000000000..5445e0c343 --- /dev/null +++ b/tensorflow/docs_src/tutorials/eager/index.md @@ -0,0 +1,13 @@ +# Research and experimentation + +Eager execution provides an imperative, define-by-run interface for advanced +operations. Write custom layers, forward passes, and training loops with +auto differentiation. Start with these notebooks, then read the +[eager execution guide](../../guide/eager). + +1. [Eager execution](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/eager_intro.ipynb){:.external} +2. [Automatic differentiation and gradient tape](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/automatic_differentiation.ipynb){:.external} +3. [Custom training: basics](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/custom_training.ipynb){:.external} +4. [Custom layers](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/notebooks/custom_layers.ipynb){:.external} +5. [Custom training: walkthrough](/tutorials/eager/custom_training_walkthrough) +6. [Advanced example: Neural machine translation with attention](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/eager/python/examples/nmt_with_attention/nmt_with_attention.ipynb){:.external} diff --git a/tensorflow/docs_src/tutorials/image_recognition.md b/tensorflow/docs_src/tutorials/image_recognition.md deleted file mode 100644 index 332bcf54f0..0000000000 --- a/tensorflow/docs_src/tutorials/image_recognition.md +++ /dev/null @@ -1,456 +0,0 @@ -# Image Recognition - -Our brains make vision seem easy. It doesn't take any effort for humans to -tell apart a lion and a jaguar, read a sign, or recognize a human's face. -But these are actually hard problems to solve with a computer: they only -seem easy because our brains are incredibly good at understanding images. - -In the last few years, the field of machine learning has made tremendous -progress on addressing these difficult problems. In particular, we've -found that a kind of model called a deep -[convolutional neural network](https://colah.github.io/posts/2014-07-Conv-Nets-Modular/) -can achieve reasonable performance on hard visual recognition tasks -- -matching or exceeding human performance in some domains. - -Researchers have demonstrated steady progress -in computer vision by validating their work against -[ImageNet](http://www.image-net.org) -- an academic benchmark for computer vision. -Successive models continue to show improvements, each time achieving -a new state-of-the-art result: -[QuocNet], [AlexNet], [Inception (GoogLeNet)], [BN-Inception-v2]. -Researchers both internal and external to Google have published papers describing all -these models but the results are still hard to reproduce. 
-We're now taking the next step by releasing code for running image recognition -on our latest model, [Inception-v3]. - -[QuocNet]: https://static.googleusercontent.com/media/research.google.com/en//archive/unsupervised_icml2012.pdf -[AlexNet]: https://www.cs.toronto.edu/~fritz/absps/imagenet.pdf -[Inception (GoogLeNet)]: https://arxiv.org/abs/1409.4842 -[BN-Inception-v2]: https://arxiv.org/abs/1502.03167 -[Inception-v3]: https://arxiv.org/abs/1512.00567 - -Inception-v3 is trained for the [ImageNet] Large Visual Recognition Challenge -using the data from 2012. This is a standard task in computer vision, -where models try to classify entire -images into [1000 classes], like "Zebra", "Dalmatian", and "Dishwasher". -For example, here are the results from [AlexNet] classifying some images: - -
-
-*(image: example AlexNet classification results)*
-
-
-To compare models, we examine how often the model fails to predict the
-correct answer as one of its top 5 guesses -- termed "top-5 error rate".
-[AlexNet] achieved a top-5 error rate of 15.3% on the 2012
-validation data set; [Inception (GoogLeNet)] achieved 6.67%;
-[BN-Inception-v2] achieved 4.9%; [Inception-v3] reaches 3.46%.
-
-> How well do humans do on ImageNet Challenge? There's a [blog post] by
-Andrej Karpathy who attempted to measure his own performance. He reached
-5.1% top-5 error rate.
-
-[ImageNet]: http://image-net.org/
-[1000 classes]: http://image-net.org/challenges/LSVRC/2014/browse-synsets
-[blog post]: https://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/
-
-This tutorial will teach you how to use [Inception-v3]. You'll learn how to
-classify images into [1000 classes] in Python or C++. We'll also discuss how to
-extract higher-level features from this model, which may be reused for other
-vision tasks.
-
-We're excited to see what the community will do with this model.
-
-
-## Usage with Python API
-
-`classify_image.py` downloads the trained model from `tensorflow.org`
-when the program is run for the first time. You'll need about 200 MB of free
-space available on your hard disk.
-
-Start by cloning the [TensorFlow models repo](https://github.com/tensorflow/models) from GitHub. Run the following commands:
-
-    cd models/tutorials/image/imagenet
-    python classify_image.py
-
-The above command will classify a supplied image of a panda bear.
-
-
-*(image: a giant panda)*
-
- -If the model runs correctly, the script will produce the following output: - - giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.88493) - indri, indris, Indri indri, Indri brevicaudatus (score = 0.00878) - lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00317) - custard apple (score = 0.00149) - earthstar (score = 0.00127) - -If you wish to supply other JPEG images, you may do so by editing -the `--image_file` argument. - -> If you download the model data to a different directory, you -will need to point `--model_dir` to the directory used. - -## Usage with the C++ API - -You can run the same [Inception-v3] model in C++ for use in production -environments. You can download the archive containing the GraphDef that defines -the model like this (running from the root directory of the TensorFlow -repository): - -```bash -curl -L "https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz" | - tar -C tensorflow/examples/label_image/data -xz -``` - -Next, we need to compile the C++ binary that includes the code to load and run the graph. -If you've followed -@{$install_sources$the instructions to download the source installation of TensorFlow} -for your platform, you should be able to build the example by -running this command from your shell terminal: - -```bash -bazel build tensorflow/examples/label_image/... -``` - -That should create a binary executable that you can then run like this: - -```bash -bazel-bin/tensorflow/examples/label_image/label_image -``` - -This uses the default example image that ships with the framework, and should -output something similar to this: - -``` -I tensorflow/examples/label_image/main.cc:206] military uniform (653): 0.834306 -I tensorflow/examples/label_image/main.cc:206] mortarboard (668): 0.0218692 -I tensorflow/examples/label_image/main.cc:206] academic gown (401): 0.0103579 -I tensorflow/examples/label_image/main.cc:206] pickelhaube (716): 0.00800814 -I tensorflow/examples/label_image/main.cc:206] bulletproof vest (466): 0.00535088 -``` -In this case, we're using the default image of -[Admiral Grace Hopper](https://en.wikipedia.org/wiki/Grace_Hopper), and you can -see the network correctly identifies she's wearing a military uniform, with a high -score of 0.8. - - -
-
-*(image: Admiral Grace Hopper)*
-
-
-Next, try it out on your own images by supplying the `--image=` argument, e.g.
-
-```bash
-bazel-bin/tensorflow/examples/label_image/label_image --image=my_image.png
-```
-
-If you look inside the [`tensorflow/examples/label_image/main.cc`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc)
-file, you can find out how it works. We hope this code will help you integrate
-TensorFlow into your own applications, so we will walk step by step through the
-main functions:
-
-The command line flags control where the files are loaded from, and properties
-of the input images. The model expects to get square 299x299 RGB images, so
-those are the `input_width` and `input_height` flags. We also need to scale the
-pixel values from integers that are between 0 and 255 to the floating point
-values that the graph operates on. We control the scaling with the `input_mean`
-and `input_std` flags: we first subtract `input_mean` from each pixel value,
-then divide it by `input_std`.
-
-These values probably look somewhat magical, but they are just defined by the
-original model author based on what he/she wanted to use as input images for
-training. If you have a graph that you've trained yourself, you'll just need
-to adjust the values to match whatever you used during your training process.
-
-You can see how they're applied to an image in the
-[`ReadTensorFromImageFile()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc#L88)
-function.
-
-```C++
-// Given an image file name, read in the data, try to decode it as an image,
-// resize it to the requested size, and then scale the values as desired.
-Status ReadTensorFromImageFile(string file_name, const int input_height,
-                               const int input_width, const float input_mean,
-                               const float input_std,
-                               std::vector<Tensor>* out_tensors) {
-  tensorflow::GraphDefBuilder b;
-```
-We start by creating a `GraphDefBuilder`, which is an object we can use to
-specify a model to run or load.
-
-```C++
-  string input_name = "file_reader";
-  string output_name = "normalized";
-  tensorflow::Node* file_reader =
-      tensorflow::ops::ReadFile(tensorflow::ops::Const(file_name, b.opts()),
-                                b.opts().WithName(input_name));
-```
-We then start creating nodes for the small model we want to run
-to load, resize, and scale the pixel values to get the result the main model
-expects as its input. The first node we create is just a `Const` op that holds a
-tensor with the file name of the image we want to load. That's then passed as the
-first input to the `ReadFile` op. You might notice we're passing `b.opts()` as the last
-argument to all the op creation functions. The argument ensures that the node is added to
-the model definition held in the `GraphDefBuilder`. We also name the `ReadFile`
-operator by making the `WithName()` call to `b.opts()`. This gives a name to the node,
-which isn't strictly necessary since an automatic name will be assigned if you don't
-do this, but it does make debugging a bit easier.
-
-```C++
-  // Now try to figure out what kind of file it is and decode it.
-  const int wanted_channels = 3;
-  tensorflow::Node* image_reader;
-  if (tensorflow::StringPiece(file_name).ends_with(".png")) {
-    image_reader = tensorflow::ops::DecodePng(
-        file_reader,
-        b.opts().WithAttr("channels", wanted_channels).WithName("png_reader"));
-  } else {
-    // Assume if it's not a PNG then it must be a JPEG.
-    image_reader = tensorflow::ops::DecodeJpeg(
-        file_reader,
-        b.opts().WithAttr("channels", wanted_channels).WithName("jpeg_reader"));
-  }
-  // Now cast the image data to float so we can do normal math on it.
-  tensorflow::Node* float_caster = tensorflow::ops::Cast(
-      image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
-  // The convention for image ops in TensorFlow is that all images are expected
-  // to be in batches, so that they're four-dimensional arrays with indices of
-  // [batch, height, width, channel]. Because we only have a single image, we
-  // have to add a batch dimension of 1 to the start with ExpandDims().
-  tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(
-      float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
-  // Bilinearly resize the image to fit the required dimensions.
-  tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(
-      dims_expander, tensorflow::ops::Const({input_height, input_width},
-                                            b.opts().WithName("size")),
-      b.opts());
-  // Subtract the mean and divide by the scale.
-  tensorflow::ops::Div(
-      tensorflow::ops::Sub(
-          resized, tensorflow::ops::Const({input_mean}, b.opts()), b.opts()),
-      tensorflow::ops::Const({input_std}, b.opts()),
-      b.opts().WithName(output_name));
-```
-We then keep adding more nodes, to decode the file data as an image, to cast the
-integers into floating point values, to resize it, and then finally to run the
-subtraction and division operations on the pixel values.
-
-```C++
-  // This runs the GraphDef network definition that we've just constructed, and
-  // returns the results in the output tensor.
-  tensorflow::GraphDef graph;
-  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));
-```
-At the end of this we have a model definition stored in the `b` variable, which
-we turn into a full graph definition with the `ToGraphDef()` function.
-
-```C++
-  std::unique_ptr<tensorflow::Session> session(
-      tensorflow::NewSession(tensorflow::SessionOptions()));
-  TF_RETURN_IF_ERROR(session->Create(graph));
-  TF_RETURN_IF_ERROR(session->Run({}, {output_name}, {}, out_tensors));
-  return Status::OK();
-```
-Then we create a @{tf.Session}
-object, which is the interface to actually running the graph, and run it,
-specifying which node we want to get the output from, and where to put the
-output data.
-
-This gives us a vector of `Tensor` objects, which in this case we know will only be a
-single object long. You can think of a `Tensor` as a multi-dimensional array in this
-context, and it holds a 299 pixel high, 299 pixel wide, 3 channel image as float
-values. If you have your own image-processing framework in your product already, you
-should be able to use that instead, as long as you apply the same transformations
-before you feed images into the main graph.
-
-This is a simple example of creating a small TensorFlow graph dynamically in C++,
-but for the pre-trained Inception model we want to load a much larger definition from
-a file. You can see how we do that in the `LoadGraph()` function.
-
-```C++
-// Reads a model graph definition from disk, and creates a session object you
-// can use to run it.
-Status LoadGraph(string graph_file_name,
-                 std::unique_ptr<tensorflow::Session>* session) {
-  tensorflow::GraphDef graph_def;
-  Status load_graph_status =
-      ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
-  if (!load_graph_status.ok()) {
-    return tensorflow::errors::NotFound("Failed to load compute graph at '",
-                                        graph_file_name, "'");
-  }
-```
-If you've looked through the image loading code, a lot of the terms should seem
-familiar. Rather than using a `GraphDefBuilder` to produce a `GraphDef` object,
-we load a protobuf file that directly contains the `GraphDef`.
-
-```C++
-  session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
-  Status session_create_status = (*session)->Create(graph_def);
-  if (!session_create_status.ok()) {
-    return session_create_status;
-  }
-  return Status::OK();
-}
-```
-Then we create a Session object from that `GraphDef` and
-pass it back to the caller so that they can run it at a later time.
-
-The `GetTopLabels()` function is a lot like the image loading, except that in this case
-we want to take the results of running the main graph, and turn it into a sorted list
-of the highest-scoring labels. Just like the image loader, it creates a
-`GraphDefBuilder`, adds a couple of nodes to it, and then runs the short graph to get a
-pair of output tensors. In this case they represent the sorted scores and index
-positions of the highest results.
-
-```C++
-// Analyzes the output of the Inception graph to retrieve the highest scores and
-// their positions in the tensor, which correspond to categories.
-Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
-                    Tensor* indices, Tensor* scores) {
-  tensorflow::GraphDefBuilder b;
-  string output_name = "top_k";
-  tensorflow::ops::TopK(tensorflow::ops::Const(outputs[0], b.opts()),
-                        how_many_labels, b.opts().WithName(output_name));
-  // This runs the GraphDef network definition that we've just constructed, and
-  // returns the results in the output tensors.
-  tensorflow::GraphDef graph;
-  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));
-  std::unique_ptr<tensorflow::Session> session(
-      tensorflow::NewSession(tensorflow::SessionOptions()));
-  TF_RETURN_IF_ERROR(session->Create(graph));
-  // The TopK node returns two outputs, the scores and their original indices,
-  // so we have to append :0 and :1 to specify them both.
-  std::vector<Tensor> out_tensors;
-  TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
-                                  {}, &out_tensors));
-  *scores = out_tensors[0];
-  *indices = out_tensors[1];
-  return Status::OK();
-}
-```
-The `PrintTopLabels()` function takes those sorted results, and prints them out in a
-friendly way. The `CheckTopLabel()` function is very similar, but just makes sure that
-the top label is the one we expect, for debugging purposes.
-
-At the end, [`main()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc#L252)
-ties together all of these calls.
-
-```C++
-int main(int argc, char* argv[]) {
-  // We need to call this to set up global state for TensorFlow.
-  tensorflow::port::InitMain(argv[0], &argc, &argv);
-  Status s = tensorflow::ParseCommandLineFlags(&argc, argv);
-  if (!s.ok()) {
-    LOG(ERROR) << "Error parsing command line flags: " << s.ToString();
-    return -1;
-  }
-
-  // First we load and initialize the model.
-  std::unique_ptr<tensorflow::Session> session;
-  string graph_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_graph);
-  Status load_graph_status = LoadGraph(graph_path, &session);
-  if (!load_graph_status.ok()) {
-    LOG(ERROR) << load_graph_status;
-    return -1;
-  }
-```
-We load the main graph.
-
-```C++
-  // Get the image from disk as a float array of numbers, resized and normalized
-  // to the specifications the main graph expects.
-  std::vector<Tensor> resized_tensors;
-  string image_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_image);
-  Status read_tensor_status = ReadTensorFromImageFile(
-      image_path, FLAGS_input_height, FLAGS_input_width, FLAGS_input_mean,
-      FLAGS_input_std, &resized_tensors);
-  if (!read_tensor_status.ok()) {
-    LOG(ERROR) << read_tensor_status;
-    return -1;
-  }
-  const Tensor& resized_tensor = resized_tensors[0];
-```
-Load, resize, and process the input image.
-
-```C++
-  // Actually run the image through the model.
-  std::vector<Tensor> outputs;
-  Status run_status = session->Run({{FLAGS_input_layer, resized_tensor}},
-                                   {FLAGS_output_layer}, {}, &outputs);
-  if (!run_status.ok()) {
-    LOG(ERROR) << "Running model failed: " << run_status;
-    return -1;
-  }
-```
-Here we run the loaded graph with the image as an input.
-
-```C++
-  // This is for automated testing to make sure we get the expected result with
-  // the default settings. We know that label 866 (military uniform) should be
-  // the top label for the Admiral Hopper image.
-  if (FLAGS_self_test) {
-    bool expected_matches;
-    Status check_status = CheckTopLabel(outputs, 866, &expected_matches);
-    if (!check_status.ok()) {
-      LOG(ERROR) << "Running check failed: " << check_status;
-      return -1;
-    }
-    if (!expected_matches) {
-      LOG(ERROR) << "Self-test failed!";
-      return -1;
-    }
-  }
-```
-For testing purposes we can check to make sure we get the output we expect here.
-
-```C++
-  // Do something interesting with the results we've generated.
-  Status print_status = PrintTopLabels(outputs, FLAGS_labels);
-```
-Finally we print the labels we found.
-
-```C++
-  if (!print_status.ok()) {
-    LOG(ERROR) << "Running print failed: " << print_status;
-    return -1;
-  }
-```
-
-The error handling here uses TensorFlow's `Status`
-object, which is very convenient because it lets you know whether any error has
-occurred with the `ok()` checker, and then can be printed out to give a readable
-error message.
-
-In this case we are demonstrating object recognition, but you should be able to
-use very similar code on other models you've found or trained yourself, across
-all sorts of domains. We hope this small example gives you some ideas on how to
-use TensorFlow within your own products.
-
-> **EXERCISE**: Transfer learning is the idea that, if you know how to solve a task well, you
-should be able to transfer some of that understanding to solving related
-problems. One way to perform transfer learning is to remove the final
-classification layer of the network and extract
-the [next-to-last layer of the CNN](https://arxiv.org/abs/1310.1531), in this case a 2048 dimensional vector.
-There's a guide to doing this @{$image_retraining$in the how-to section}.
-
-
-## Resources for Learning More
-
-To learn about neural networks in general, Michael Nielsen's
-[free online book](http://neuralnetworksanddeeplearning.com/chap1.html)
-is an excellent resource.
For convolutional neural networks in particular, -Chris Olah has some -[nice blog posts](https://colah.github.io/posts/2014-07-Conv-Nets-Modular/), -and Michael Nielsen's book has a -[great chapter](http://neuralnetworksanddeeplearning.com/chap6.html) -covering them. - -To find out more about implementing convolutional neural networks, you can jump -to the TensorFlow @{$deep_cnn$deep convolutional networks tutorial}, -or start a bit more gently with our @{$layers$MNIST starter tutorial}. -Finally, if you want to get up to speed on research in this area, you can -read the recent work of all the papers referenced in this tutorial. - diff --git a/tensorflow/docs_src/tutorials/image_retraining.md b/tensorflow/docs_src/tutorials/image_retraining.md deleted file mode 100644 index 27784eef9c..0000000000 --- a/tensorflow/docs_src/tutorials/image_retraining.md +++ /dev/null @@ -1,4 +0,0 @@ -# How to Retrain Inception's Final Layer for New Categories - -**NOTE: This tutorial has moved to** -https://github.com/tensorflow/hub/tree/master/docs/tutorials/image_retraining.md diff --git a/tensorflow/docs_src/tutorials/images/deep_cnn.md b/tensorflow/docs_src/tutorials/images/deep_cnn.md new file mode 100644 index 0000000000..1590f15eb9 --- /dev/null +++ b/tensorflow/docs_src/tutorials/images/deep_cnn.md @@ -0,0 +1,446 @@ +# Advanced Convolutional Neural Networks + +## Overview + +CIFAR-10 classification is a common benchmark problem in machine learning. The +problem is to classify RGB 32x32 pixel images across 10 categories: +``` +airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. +``` + +For more details refer to the [CIFAR-10 page](https://www.cs.toronto.edu/~kriz/cifar.html) +and a [Tech Report](https://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf) +by Alex Krizhevsky. + +### Goals + +The goal of this tutorial is to build a relatively small [convolutional neural +network](https://en.wikipedia.org/wiki/Convolutional_neural_network) (CNN) for +recognizing images. In the process, this tutorial: + +1. Highlights a canonical organization for network architecture, +training and evaluation. +2. Provides a template for constructing larger and more sophisticated models. + +The reason CIFAR-10 was selected was that it is complex enough to exercise +much of TensorFlow's ability to scale to large models. At the same time, +the model is small enough to train fast, which is ideal for trying out +new ideas and experimenting with new techniques. + +### Highlights of the Tutorial +The CIFAR-10 tutorial demonstrates several important constructs for +designing larger and more sophisticated models in TensorFlow: + +* Core mathematical components including @{tf.nn.conv2d$convolution} +([wiki](https://en.wikipedia.org/wiki/Convolution)), +@{tf.nn.relu$rectified linear activations} +([wiki](https://en.wikipedia.org/wiki/Rectifier_(neural_networks))), +@{tf.nn.max_pool$max pooling} +([wiki](https://en.wikipedia.org/wiki/Convolutional_neural_network#Pooling_layer)) +and @{tf.nn.local_response_normalization$local response normalization} +(Chapter 3.3 in +[AlexNet paper](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)). +* @{$summaries_and_tensorboard$Visualization} +of network activities during training, including input images, +losses and distributions of activations and gradients. 
+* Routines for calculating the +@{tf.train.ExponentialMovingAverage$moving average} +of learned parameters and using these averages +during evaluation to boost predictive performance. +* Implementation of a +@{tf.train.exponential_decay$learning rate schedule} +that systematically decrements over time. +* Prefetching @{tf.train.shuffle_batch$queues} +for input +data to isolate the model from disk latency and expensive image pre-processing. + +We also provide a [multi-GPU version](#training-a-model-using-multiple-gpu-cards) +of the model which demonstrates: + +* Configuring a model to train across multiple GPU cards in parallel. +* Sharing and updating variables among multiple GPUs. + +We hope that this tutorial provides a launch point for building larger CNNs for +vision tasks on TensorFlow. + +### Model Architecture + +The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of +alternating convolutions and nonlinearities. These layers are followed by fully +connected layers leading into a softmax classifier. The model follows the +architecture described by +[Alex Krizhevsky](https://code.google.com/p/cuda-convnet/), with a few +differences in the top few layers. + +This model achieves a peak performance of about 86% accuracy within a few hours +of training time on a GPU. Please see [below](#evaluating-a-model) and the code +for details. It consists of 1,068,298 learnable parameters and requires about +19.5M multiply-add operations to compute inference on a single image. + +## Code Organization + +The code for this tutorial resides in +[`models/tutorials/image/cifar10/`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/). + +File | Purpose +--- | --- +[`cifar10_input.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format. +[`cifar10.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10.py) | Builds the CIFAR-10 model. +[`cifar10_train.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU. +[`cifar10_multi_gpu_train.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs. +[`cifar10_eval.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model. + + +## CIFAR-10 Model + +The CIFAR-10 network is largely contained in +[`cifar10.py`](https://www.tensorflow.org/code/tensorflow_models/tutorials/image/cifar10/cifar10.py). +The complete training +graph contains roughly 765 operations. We find that we can make the code most +reusable by constructing the graph with the following modules: + +1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add +operations that read and preprocess CIFAR images for evaluation and training, +respectively. +1. [**Model prediction:**](#model-prediction) `inference()` +adds operations that perform inference, i.e. classification, on supplied images. +1. [**Model training:**](#model-training) `loss()` and `train()` +add operations that compute the loss, +gradients, variable updates and visualization summaries. + +### Model Inputs + +The input part of the model is built by the functions `inputs()` and +`distorted_inputs()` which read images from the CIFAR-10 binary data files. 
+These files contain fixed byte length records, so we use +@{tf.FixedLengthRecordReader}. +See @{$reading_data#reading-from-files$Reading Data} to +learn more about how the `Reader` class works. + +The images are processed as follows: + +* They are cropped to 24 x 24 pixels, centrally for evaluation or + @{tf.random_crop$randomly} for training. +* They are @{tf.image.per_image_standardization$approximately whitened} + to make the model insensitive to dynamic range. + +For training, we additionally apply a series of random distortions to +artificially increase the data set size: + +* @{tf.image.random_flip_left_right$Randomly flip} the image from left to right. +* Randomly distort the @{tf.image.random_brightness$image brightness}. +* Randomly distort the @{tf.image.random_contrast$image contrast}. + +Please see the @{$python/image$Images} page for the list of +available distortions. We also attach an +@{tf.summary.image} to the images +so that we may visualize them in @{$summaries_and_tensorboard$TensorBoard}. +This is a good practice to verify that inputs are built correctly. + +
+[Figure: sample distorted CIFAR-10 training images, as rendered by the image summary in TensorBoard]
+
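+The distortion steps listed above can be sketched with the image ops the
+tutorial names. The real pipeline lives in `cifar10_input.py`; the concrete
+parameter values below are illustrative assumptions, not the tutorial's code:
+
+```python
+import tensorflow as tf
+
+def distort_for_training(image):
+  """Sketch of a CIFAR-10 style distortion pipeline for a [32, 32, 3] image."""
+  # Randomly crop a 24 x 24 section of the image.
+  distorted = tf.random_crop(image, [24, 24, 3])
+  # Randomly flip the image horizontally.
+  distorted = tf.image.random_flip_left_right(distorted)
+  # Randomly distort brightness and contrast (parameter values are examples).
+  distorted = tf.image.random_brightness(distorted, max_delta=63)
+  distorted = tf.image.random_contrast(distorted, lower=0.2, upper=1.8)
+  # Approximately whiten the image: roughly zero mean, unit norm per image.
+  return tf.image.per_image_standardization(distorted)
+```
+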
+ +Reading images from disk and distorting them can use a non-trivial amount of +processing time. To prevent these operations from slowing down training, we run +them inside 16 separate threads which continuously fill a TensorFlow +@{tf.train.shuffle_batch$queue}. + +### Model Prediction + +The prediction part of the model is constructed by the `inference()` function +which adds operations to compute the *logits* of the predictions. That part of +the model is organized as follows: + +Layer Name | Description +--- | --- +`conv1` | @{tf.nn.conv2d$convolution} and @{tf.nn.relu$rectified linear} activation. +`pool1` | @{tf.nn.max_pool$max pooling}. +`norm1` | @{tf.nn.local_response_normalization$local response normalization}. +`conv2` | @{tf.nn.conv2d$convolution} and @{tf.nn.relu$rectified linear} activation. +`norm2` | @{tf.nn.local_response_normalization$local response normalization}. +`pool2` | @{tf.nn.max_pool$max pooling}. +`local3` | @{$python/nn$fully connected layer with rectified linear activation}. +`local4` | @{$python/nn$fully connected layer with rectified linear activation}. +`softmax_linear` | linear transformation to produce logits. + +Here is a graph generated from TensorBoard describing the inference operation: + +
+[Figure: TensorBoard graph visualization of the CIFAR-10 inference operations]
+
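+As a rough illustration of how one such module is wired up from the low-level
+ops named in the table, a single conv/pool/norm block might look like the
+following. This is a simplified sketch: the tutorial's actual `inference()`
+code in `cifar10.py` also adds bias terms, weight decay, and summaries, and
+the filter shape and normalization parameters here are assumptions:
+
+```python
+import tensorflow as tf
+
+images = tf.placeholder(tf.float32, [None, 24, 24, 3])
+# `kernel` stands in for a learned [height, width, in_channels, out_channels]
+# filter variable; 64 is an illustrative output channel count.
+kernel = tf.get_variable('weights', [5, 5, 3, 64])
+conv1 = tf.nn.relu(tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME'))
+pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
+                       padding='SAME')
+norm1 = tf.nn.local_response_normalization(pool1, depth_radius=4, bias=1.0,
+                                           alpha=0.001 / 9.0, beta=0.75)
+```
+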
+
+> **EXERCISE**: The outputs of `inference` are un-normalized logits. Try editing
+the network architecture to return normalized predictions using
+@{tf.nn.softmax}.
+
+The `inputs()` and `inference()` functions provide all the components
+necessary to perform an evaluation of a model. We now shift our focus towards
+building operations for training a model.
+
+> **EXERCISE:** The model architecture in `inference()` differs slightly from
+the CIFAR-10 model specified in
+[cuda-convnet](https://code.google.com/p/cuda-convnet/). In particular, the top
+layers of Alex's original model are locally connected and not fully connected.
+Try editing the architecture to exactly reproduce the locally connected
+architecture in the top layer.
+
+### Model Training
+
+The usual method for training a network to perform N-way classification is
+[multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression),
+a.k.a. *softmax regression*. Softmax regression applies a
+@{tf.nn.softmax$softmax} nonlinearity to the
+output of the network and calculates the
+@{tf.nn.sparse_softmax_cross_entropy_with_logits$cross-entropy}
+between the normalized predictions and the label index.
+For regularization, we also apply the usual
+@{tf.nn.l2_loss$weight decay} losses to all learned
+variables. The objective function for the model is the sum of the cross entropy
+loss and all these weight decay terms, as returned by the `loss()` function.
+
+We visualize it in TensorBoard with a @{tf.summary.scalar}:
+
+![CIFAR-10 Loss](https://www.tensorflow.org/images/cifar_loss.png "CIFAR-10 Total Loss")
+
+We train the model using the standard
+[gradient descent](https://en.wikipedia.org/wiki/Gradient_descent)
+algorithm (see @{$python/train$Training} for other methods)
+with a learning rate that
+@{tf.train.exponential_decay$exponentially decays}
+over time.
+
+![CIFAR-10 Learning Rate Decay](https://www.tensorflow.org/images/cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")
+
+The `train()` function adds the operations needed to minimize the objective by
+calculating the gradient and updating the learned variables (see
+@{tf.train.GradientDescentOptimizer}
+for details). It returns an operation that executes all the calculations
+needed to train and update the model for one batch of images.
+
+## Launching and Training the Model
+
+We have built the model; let's now launch it and run the training operation with
+the script `cifar10_train.py`:
+
+```shell
+python cifar10_train.py
+```
+
+> **NOTE:** The first time you run any target in the CIFAR-10 tutorial,
+the CIFAR-10 dataset is automatically downloaded. The data set is ~160MB,
+so you may want to grab a quick cup of coffee for your first run.
+
+You should see the output:
+
+```shell
+Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
+2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
+2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
+2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
+2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
+2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
+2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
+...
+```
+
+The script reports the total loss every 10 steps as well as the speed at which
+the last batch of data was processed.
A few comments: + +* The first batch of data can be inordinately slow (e.g. several minutes) as the +preprocessing threads fill up the shuffling queue with 20,000 processed CIFAR +images. + +* The reported loss is the average loss of the most recent batch. Remember that +this loss is the sum of the cross entropy and all weight decay terms. + +* Keep an eye on the processing speed of a batch. The numbers shown above were +obtained on a Tesla K40c. If you are running on a CPU, expect slower performance. + + +> **EXERCISE:** When experimenting, it is sometimes annoying that the first +training step can take so long. Try decreasing the number of images that +initially fill up the queue. Search for `min_fraction_of_examples_in_queue` +in `cifar10_input.py`. + +`cifar10_train.py` periodically @{tf.train.Saver$saves} +all model parameters in +@{$guide/saved_model$checkpoint files} +but it does *not* evaluate the model. The checkpoint file +will be used by `cifar10_eval.py` to measure the predictive +performance (see [Evaluating a Model](#evaluating-a-model) below). + + +If you followed the previous steps, then you have now started training +a CIFAR-10 model. [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) + +The terminal text returned from `cifar10_train.py` provides minimal insight into +how the model is training. We want more insight into the model during training: + +* Is the loss *really* decreasing or is that just noise? +* Is the model being provided appropriate images? +* Are the gradients, activations and weights reasonable? +* What is the learning rate currently at? + +@{$summaries_and_tensorboard$TensorBoard} provides this +functionality, displaying data exported periodically from `cifar10_train.py` via +a +@{tf.summary.FileWriter}. + +For instance, we can watch how the distribution of activations and degree of +sparsity in `local3` features evolve during training: + +
+[Figure: TensorBoard histograms of `local3` activations and their sparsity evolving over training steps]
+
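+As a sketch of the mechanics behind these plots (the tutorial's scripts wire
+this up for you; the tensor names and log directory below are illustrative
+stand-ins), exporting such data takes only a few summary ops and a
+`FileWriter`:
+
+```python
+import tensorflow as tf
+
+loss = tf.get_variable('total_loss', [])        # stand-in for the real loss
+activations = tf.get_variable('local3', [384])  # stand-in for layer activations
+tf.summary.scalar('total_loss', loss)
+tf.summary.histogram('local3/activations', activations)
+merged = tf.summary.merge_all()
+
+with tf.Session() as sess:
+  sess.run(tf.global_variables_initializer())
+  writer = tf.summary.FileWriter('/tmp/cifar10_train', sess.graph)
+  writer.add_summary(sess.run(merged), global_step=0)
+  writer.close()
+```
+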
+ +Individual loss functions, as well as the total loss, are particularly +interesting to track over time. However, the loss exhibits a considerable amount +of noise due to the small batch size employed by training. In practice we find +it extremely useful to visualize their moving averages in addition to their raw +values. See how the scripts use +@{tf.train.ExponentialMovingAverage} +for this purpose. + +## Evaluating a Model + +Let us now evaluate how well the trained model performs on a hold-out data set. +The model is evaluated by the script `cifar10_eval.py`. It constructs the model +with the `inference()` function and uses all 10,000 images in the evaluation set +of CIFAR-10. It calculates the *precision at 1:* how often the top prediction +matches the true label of the image. + +To monitor how the model improves during training, the evaluation script runs +periodically on the latest checkpoint files created by the `cifar10_train.py`. + +```shell +python cifar10_eval.py +``` + +> Be careful not to run the evaluation and training binary on the same GPU or +else you might run out of memory. Consider running the evaluation on +a separate GPU if available or suspending the training binary while running +the evaluation on the same GPU. + +You should see the output: + +```shell +2015-11-06 08:30:44.391206: precision @ 1 = 0.860 +... +``` + +The script merely returns the precision @ 1 periodically -- in this case +it returned 86% accuracy. `cifar10_eval.py` also +exports summaries that may be visualized in TensorBoard. These summaries +provide additional insight into the model during evaluation. + +The training script calculates the +@{tf.train.ExponentialMovingAverage$moving average} +version of all learned variables. The evaluation script substitutes +all learned model parameters with the moving average version. This +substitution boosts model performance at evaluation time. + +> **EXERCISE:** Employing averaged parameters may boost predictive performance +by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ +the averaged parameters for the model and verify that the predictive performance +drops. + + +## Training a Model Using Multiple GPU Cards + +Modern workstations may contain multiple GPUs for scientific computation. +TensorFlow can leverage this environment to run the training operation +concurrently across multiple cards. + +Training a model in a parallel, distributed fashion requires +coordinating training processes. For what follows we term *model replica* +to be one copy of a model training on a subset of data. + +Naively employing asynchronous updates of model parameters +leads to sub-optimal training performance +because an individual model replica might be trained on a stale +copy of the model parameters. Conversely, employing fully synchronous +updates will be as slow as the slowest model replica. + +In a workstation with multiple GPU cards, each GPU will have similar speed +and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to +design our training system in the following manner: + +* Place an individual model replica on each GPU. +* Update model parameters synchronously by waiting for all GPUs to finish +processing a batch of data. + +Here is a diagram of this model: + +
+[Figure: diagram of the multi-GPU training setup, with one model replica ("tower") per GPU and shared parameters updated on the CPU]
+
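+A minimal sketch of the tower pattern behind this diagram follows. It is not
+the tutorial's code: `cifar10_multi_gpu_train.py` additionally pins the shared
+variables to the CPU and adds summaries, and `tower_loss` below is a
+hypothetical stand-in for the tutorial's `inference()` plus `loss()`:
+
+```python
+import tensorflow as tf
+
+NUM_GPUS = 2
+
+def tower_loss(images, labels):
+  # Stand-in model; the real code builds the full CIFAR-10 network here.
+  logits = tf.layers.dense(tf.layers.flatten(images), 10, name='model')
+  return tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
+
+opt = tf.train.GradientDescentOptimizer(0.1)
+images = tf.random_uniform([NUM_GPUS * 128, 24, 24, 3])
+labels = tf.random_uniform([NUM_GPUS * 128], maxval=10, dtype=tf.int32)
+image_shards = tf.split(images, NUM_GPUS)
+label_shards = tf.split(labels, NUM_GPUS)
+
+tower_grads = []
+with tf.variable_scope(tf.get_variable_scope()):
+  for i in range(NUM_GPUS):
+    with tf.device('/device:GPU:%d' % i), tf.name_scope('tower_%d' % i):
+      loss = tower_loss(image_shards[i], label_shards[i])
+      # Reuse the same variables for every tower after the first one.
+      tf.get_variable_scope().reuse_variables()
+      tower_grads.append(opt.compute_gradients(loss))
+
+# Average each variable's gradients across towers and apply a single update.
+average_grads = []
+for grads_and_vars in zip(*tower_grads):
+  grads = tf.stack([g for g, _ in grads_and_vars])
+  average_grads.append((tf.reduce_mean(grads, 0), grads_and_vars[0][1]))
+train_op = opt.apply_gradients(average_grads)
+```
+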
+ +Note that each GPU computes inference as well as the gradients for a unique +batch of data. This setup effectively permits dividing up a larger batch +of data across the GPUs. + +This setup requires that all GPUs share the model parameters. A well-known +fact is that transferring data to and from GPUs is quite slow. For this +reason, we decide to store and update all model parameters on the CPU (see +green box). A fresh set of model parameters is transferred to the GPU +when a new batch of data is processed by all GPUs. + +The GPUs are synchronized in operation. All gradients are accumulated from +the GPUs and averaged (see green box). The model parameters are updated with +the gradients averaged across all model replicas. + +### Placing Variables and Operations on Devices + +Placing operations and variables on devices requires some special +abstractions. + +The first abstraction we require is a function for computing inference and +gradients for a single model replica. In the code we term this abstraction +a "tower". We must set two attributes for each tower: + +* A unique name for all operations within a tower. +@{tf.name_scope} provides +this unique name by prepending a scope. For instance, all operations in +the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`. + +* A preferred hardware device to run the operation within a tower. +@{tf.device} specifies this. For +instance, all operations in the first tower reside within `device('/device:GPU:0')` +scope indicating that they should be run on the first GPU. + +All variables are pinned to the CPU and accessed via +@{tf.get_variable} +in order to share them in a multi-GPU version. +See how-to on @{$variables$Sharing Variables}. + +### Launching and Training the Model on Multiple GPU cards + +If you have several GPU cards installed on your machine you can use them to +train the model faster with the `cifar10_multi_gpu_train.py` script. This +version of the training script parallelizes the model across multiple GPU cards. + +```shell +python cifar10_multi_gpu_train.py --num_gpus=2 +``` + +Note that the number of GPU cards used defaults to 1. Additionally, if only 1 +GPU is available on your machine, all computations will be placed on it, even if +you ask for more. + +> **EXERCISE:** The default settings for `cifar10_train.py` is to +run on a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs +with a batch size of 64 and compare the training speed. + +## Next Steps + +If you are now interested in developing and training your own image +classification system, we recommend forking this tutorial and replacing +components to address your image classification problem. + + +> **EXERCISE:** Download the +[Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set. +Fork the CIFAR-10 tutorial and swap in the SVHN as the input data. Try adapting +the network architecture to improve predictive performance. diff --git a/tensorflow/docs_src/tutorials/images/image_recognition.md b/tensorflow/docs_src/tutorials/images/image_recognition.md new file mode 100644 index 0000000000..432d470d0c --- /dev/null +++ b/tensorflow/docs_src/tutorials/images/image_recognition.md @@ -0,0 +1,455 @@ +# Image Recognition + +Our brains make vision seem easy. It doesn't take any effort for humans to +tell apart a lion and a jaguar, read a sign, or recognize a human's face. 
+But these are actually hard problems to solve with a computer: they only +seem easy because our brains are incredibly good at understanding images. + +In the last few years, the field of machine learning has made tremendous +progress on addressing these difficult problems. In particular, we've +found that a kind of model called a deep +[convolutional neural network](https://colah.github.io/posts/2014-07-Conv-Nets-Modular/) +can achieve reasonable performance on hard visual recognition tasks -- +matching or exceeding human performance in some domains. + +Researchers have demonstrated steady progress +in computer vision by validating their work against +[ImageNet](http://www.image-net.org) -- an academic benchmark for computer vision. +Successive models continue to show improvements, each time achieving +a new state-of-the-art result: +[QuocNet], [AlexNet], [Inception (GoogLeNet)], [BN-Inception-v2]. +Researchers both internal and external to Google have published papers describing all +these models but the results are still hard to reproduce. +We're now taking the next step by releasing code for running image recognition +on our latest model, [Inception-v3]. + +[QuocNet]: https://static.googleusercontent.com/media/research.google.com/en//archive/unsupervised_icml2012.pdf +[AlexNet]: https://www.cs.toronto.edu/~fritz/absps/imagenet.pdf +[Inception (GoogLeNet)]: https://arxiv.org/abs/1409.4842 +[BN-Inception-v2]: https://arxiv.org/abs/1502.03167 +[Inception-v3]: https://arxiv.org/abs/1512.00567 + +Inception-v3 is trained for the [ImageNet] Large Visual Recognition Challenge +using the data from 2012. This is a standard task in computer vision, +where models try to classify entire +images into [1000 classes], like "Zebra", "Dalmatian", and "Dishwasher". +For example, here are the results from [AlexNet] classifying some images: + +
+[Figure: sample images with AlexNet's classification results]
+
+
+To compare models, we examine how often the model fails to predict the
+correct answer as one of its top 5 guesses -- termed "top-5 error rate".
+[AlexNet] achieved a top-5 error rate of 15.3% on the 2012
+validation data set; [Inception (GoogLeNet)] achieved 6.67%;
+[BN-Inception-v2] achieved 4.9%; [Inception-v3] reaches 3.46%.
+
+> How well do humans do on ImageNet Challenge? There's a [blog post] by
+Andrej Karpathy who attempted to measure his own performance. He reached
+a 5.1% top-5 error rate.
+
+[ImageNet]: http://image-net.org/
+[1000 classes]: http://image-net.org/challenges/LSVRC/2014/browse-synsets
+[blog post]: https://karpathy.github.io/2014/09/02/what-i-learned-from-competing-against-a-convnet-on-imagenet/
+
+This tutorial will teach you how to use [Inception-v3]. You'll learn how to
+classify images into [1000 classes] in Python or C++. We'll also discuss how to
+extract higher level features from this model which may be reused for other
+vision tasks.
+
+We're excited to see what the community will do with this model.
+
+
+## Usage with the Python API
+
+`classify_image.py` downloads the trained model from `tensorflow.org`
+when the program is run for the first time. You'll need about 200MB of free space
+available on your hard disk.
+
+Start by cloning the [TensorFlow models repo](https://github.com/tensorflow/models) from GitHub. Run the following commands:
+
+    cd models/tutorials/image/imagenet
+    python classify_image.py
+
+The above command will classify a supplied image of a panda bear.
+
+[Figure: the bundled example image of a giant panda]
+
+ +If the model runs correctly, the script will produce the following output: + + giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.88493) + indri, indris, Indri indri, Indri brevicaudatus (score = 0.00878) + lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00317) + custard apple (score = 0.00149) + earthstar (score = 0.00127) + +If you wish to supply other JPEG images, you may do so by editing +the `--image_file` argument. + +> If you download the model data to a different directory, you +will need to point `--model_dir` to the directory used. + +## Usage with the C++ API + +You can run the same [Inception-v3] model in C++ for use in production +environments. You can download the archive containing the GraphDef that defines +the model like this (running from the root directory of the TensorFlow +repository): + +```bash +curl -L "https://storage.googleapis.com/download.tensorflow.org/models/inception_v3_2016_08_28_frozen.pb.tar.gz" | + tar -C tensorflow/examples/label_image/data -xz +``` + +Next, we need to compile the C++ binary that includes the code to load and run the graph. +If you've followed +@{$install_sources$the instructions to download the source installation of TensorFlow} +for your platform, you should be able to build the example by +running this command from your shell terminal: + +```bash +bazel build tensorflow/examples/label_image/... +``` + +That should create a binary executable that you can then run like this: + +```bash +bazel-bin/tensorflow/examples/label_image/label_image +``` + +This uses the default example image that ships with the framework, and should +output something similar to this: + +``` +I tensorflow/examples/label_image/main.cc:206] military uniform (653): 0.834306 +I tensorflow/examples/label_image/main.cc:206] mortarboard (668): 0.0218692 +I tensorflow/examples/label_image/main.cc:206] academic gown (401): 0.0103579 +I tensorflow/examples/label_image/main.cc:206] pickelhaube (716): 0.00800814 +I tensorflow/examples/label_image/main.cc:206] bulletproof vest (466): 0.00535088 +``` +In this case, we're using the default image of +[Admiral Grace Hopper](https://en.wikipedia.org/wiki/Grace_Hopper), and you can +see the network correctly identifies she's wearing a military uniform, with a high +score of 0.8. + + +
+[Figure: photograph of Admiral Grace Hopper in military uniform]
+
+
+Next, try it out on your own images by supplying the `--image=` argument, e.g.
+
+```bash
+bazel-bin/tensorflow/examples/label_image/label_image --image=my_image.png
+```
+
+If you look inside the [`tensorflow/examples/label_image/main.cc`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc)
+file, you can find out
+how it works. We hope this code will help you integrate TensorFlow into
+your own applications, so we will walk step by step through the main functions:
+
+The command line flags control where the files are loaded from, and properties of the input images.
+The model expects to get square 299x299 RGB images, so those are the `input_width`
+and `input_height` flags. We also need to scale the pixel values from integers that
+are between 0 and 255 to the floating point values that the graph operates on.
+We control the scaling with the `input_mean` and `input_std` flags: we first subtract
+`input_mean` from each pixel value, then divide it by `input_std`.
+
+These values probably look somewhat magical, but they are just defined by the
+original model author based on what he/she wanted to use as input images for
+training. If you have a graph that you've trained yourself, you'll just need
+to adjust the values to match whatever you used during your training process.
+
+You can see how they're applied to an image in the
+[`ReadTensorFromImageFile()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc#L88)
+function.
+
+```C++
+// Given an image file name, read in the data, try to decode it as an image,
+// resize it to the requested size, and then scale the values as desired.
+Status ReadTensorFromImageFile(string file_name, const int input_height,
+                               const int input_width, const float input_mean,
+                               const float input_std,
+                               std::vector<Tensor>* out_tensors) {
+  tensorflow::GraphDefBuilder b;
+```
+We start by creating a `GraphDefBuilder`, which is an object we can use to
+specify a model to run or load.
+
+```C++
+  string input_name = "file_reader";
+  string output_name = "normalized";
+  tensorflow::Node* file_reader =
+      tensorflow::ops::ReadFile(tensorflow::ops::Const(file_name, b.opts()),
+                                b.opts().WithName(input_name));
+```
+We then start creating nodes for the small model we want to run
+to load, resize, and scale the pixel values to get the result the main model
+expects as its input. The first node we create is just a `Const` op that holds a
+tensor with the file name of the image we want to load. That's then passed as the
+first input to the `ReadFile` op. You might notice we're passing `b.opts()` as the last
+argument to all the op creation functions. The argument ensures that the node is added to
+the model definition held in the `GraphDefBuilder`. We also name the `ReadFile`
+operator by making the `WithName()` call to `b.opts()`. This gives a name to the node,
+which isn't strictly necessary since an automatic name will be assigned if you don't
+do this, but it does make debugging a bit easier.
+
+```C++
+  // Now try to figure out what kind of file it is and decode it.
+  const int wanted_channels = 3;
+  tensorflow::Node* image_reader;
+  if (tensorflow::StringPiece(file_name).ends_with(".png")) {
+    image_reader = tensorflow::ops::DecodePng(
+        file_reader,
+        b.opts().WithAttr("channels", wanted_channels).WithName("png_reader"));
+  } else {
+    // Assume if it's not a PNG then it must be a JPEG.
+    image_reader = tensorflow::ops::DecodeJpeg(
+        file_reader,
+        b.opts().WithAttr("channels", wanted_channels).WithName("jpeg_reader"));
+  }
+  // Now cast the image data to float so we can do normal math on it.
+  tensorflow::Node* float_caster = tensorflow::ops::Cast(
+      image_reader, tensorflow::DT_FLOAT, b.opts().WithName("float_caster"));
+  // The convention for image ops in TensorFlow is that all images are expected
+  // to be in batches, so that they're four-dimensional arrays with indices of
+  // [batch, height, width, channel]. Because we only have a single image, we
+  // have to add a batch dimension of 1 to the start with ExpandDims().
+  tensorflow::Node* dims_expander = tensorflow::ops::ExpandDims(
+      float_caster, tensorflow::ops::Const(0, b.opts()), b.opts());
+  // Bilinearly resize the image to fit the required dimensions.
+  tensorflow::Node* resized = tensorflow::ops::ResizeBilinear(
+      dims_expander, tensorflow::ops::Const({input_height, input_width},
+                                            b.opts().WithName("size")),
+      b.opts());
+  // Subtract the mean and divide by the scale.
+  tensorflow::ops::Div(
+      tensorflow::ops::Sub(
+          resized, tensorflow::ops::Const({input_mean}, b.opts()), b.opts()),
+      tensorflow::ops::Const({input_std}, b.opts()),
+      b.opts().WithName(output_name));
+```
+We then keep adding more nodes, to decode the file data as an image, to cast the
+integers into floating point values, to resize it, and then finally to run the
+subtraction and division operations on the pixel values.
+
+```C++
+  // This runs the GraphDef network definition that we've just constructed, and
+  // returns the results in the output tensor.
+  tensorflow::GraphDef graph;
+  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));
+```
+At the end of this we have
+a model definition stored in the `b` variable, which we turn into a full graph
+definition with the `ToGraphDef()` function.
+
+```C++
+  std::unique_ptr<tensorflow::Session> session(
+      tensorflow::NewSession(tensorflow::SessionOptions()));
+  TF_RETURN_IF_ERROR(session->Create(graph));
+  TF_RETURN_IF_ERROR(session->Run({}, {output_name}, {}, out_tensors));
+  return Status::OK();
+```
+Then we create a @{tf.Session}
+object, which is the interface to actually running the graph, and run it,
+specifying which node we want to get the output from, and where to put the
+output data.
+
+This gives us a vector of `Tensor` objects, which in this case we know will only be a
+single object long. You can think of a `Tensor` as a multi-dimensional array in this
+context, and it holds a 299 pixel high, 299 pixel wide, 3 channel image as float
+values. If you have your own image-processing framework in your product already, you
+should be able to use that instead, as long as you apply the same transformations
+before you feed images into the main graph.
+
+This is a simple example of creating a small TensorFlow graph dynamically in C++,
+but for the pre-trained Inception model we want to load a much larger definition from
+a file. You can see how we do that in the `LoadGraph()` function.
+
+```C++
+// Reads a model graph definition from disk, and creates a session object you
+// can use to run it.
+Status LoadGraph(string graph_file_name,
+                 std::unique_ptr<tensorflow::Session>* session) {
+  tensorflow::GraphDef graph_def;
+  Status load_graph_status =
+      ReadBinaryProto(tensorflow::Env::Default(), graph_file_name, &graph_def);
+  if (!load_graph_status.ok()) {
+    return tensorflow::errors::NotFound("Failed to load compute graph at '",
+                                        graph_file_name, "'");
+  }
+```
+If you've looked through the image loading code, a lot of the terms should seem familiar. Rather than
+using a `GraphDefBuilder` to produce a `GraphDef` object, we load a protobuf file that
+directly contains the `GraphDef`.
+
+```C++
+  session->reset(tensorflow::NewSession(tensorflow::SessionOptions()));
+  Status session_create_status = (*session)->Create(graph_def);
+  if (!session_create_status.ok()) {
+    return session_create_status;
+  }
+  return Status::OK();
+}
+```
+Then we create a Session object from that `GraphDef` and
+pass it back to the caller so that they can run it at a later time.
+
+The `GetTopLabels()` function is a lot like the image loading, except that in this case
+we want to take the results of running the main graph, and turn it into a sorted list
+of the highest-scoring labels. Just like the image loader, it creates a
+`GraphDefBuilder`, adds a couple of nodes to it, and then runs the short graph to get a
+pair of output tensors. In this case they represent the sorted scores and index
+positions of the highest results.
+
+```C++
+// Analyzes the output of the Inception graph to retrieve the highest scores and
+// their positions in the tensor, which correspond to categories.
+Status GetTopLabels(const std::vector<Tensor>& outputs, int how_many_labels,
+                    Tensor* indices, Tensor* scores) {
+  tensorflow::GraphDefBuilder b;
+  string output_name = "top_k";
+  tensorflow::ops::TopK(tensorflow::ops::Const(outputs[0], b.opts()),
+                        how_many_labels, b.opts().WithName(output_name));
+  // This runs the GraphDef network definition that we've just constructed, and
+  // returns the results in the output tensors.
+  tensorflow::GraphDef graph;
+  TF_RETURN_IF_ERROR(b.ToGraphDef(&graph));
+  std::unique_ptr<tensorflow::Session> session(
+      tensorflow::NewSession(tensorflow::SessionOptions()));
+  TF_RETURN_IF_ERROR(session->Create(graph));
+  // The TopK node returns two outputs, the scores and their original indices,
+  // so we have to append :0 and :1 to specify them both.
+  std::vector<Tensor> out_tensors;
+  TF_RETURN_IF_ERROR(session->Run({}, {output_name + ":0", output_name + ":1"},
+                                  {}, &out_tensors));
+  *scores = out_tensors[0];
+  *indices = out_tensors[1];
+  return Status::OK();
+}
+```
+The `PrintTopLabels()` function takes those sorted results, and prints them out in a
+friendly way. The `CheckTopLabel()` function is very similar, but just makes sure that
+the top label is the one we expect, for debugging purposes.
+
+At the end, [`main()`](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/main.cc#L252)
+ties together all of these calls.
+
+```C++
+int main(int argc, char* argv[]) {
+  // We need to call this to set up global state for TensorFlow.
+  tensorflow::port::InitMain(argv[0], &argc, &argv);
+  Status s = tensorflow::ParseCommandLineFlags(&argc, argv);
+  if (!s.ok()) {
+    LOG(ERROR) << "Error parsing command line flags: " << s.ToString();
+    return -1;
+  }
+
+  // First we load and initialize the model.
+  std::unique_ptr<tensorflow::Session> session;
+  string graph_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_graph);
+  Status load_graph_status = LoadGraph(graph_path, &session);
+  if (!load_graph_status.ok()) {
+    LOG(ERROR) << load_graph_status;
+    return -1;
+  }
+```
+We load the main graph.
+
+```C++
+  // Get the image from disk as a float array of numbers, resized and normalized
+  // to the specifications the main graph expects.
+  std::vector<Tensor> resized_tensors;
+  string image_path = tensorflow::io::JoinPath(FLAGS_root_dir, FLAGS_image);
+  Status read_tensor_status = ReadTensorFromImageFile(
+      image_path, FLAGS_input_height, FLAGS_input_width, FLAGS_input_mean,
+      FLAGS_input_std, &resized_tensors);
+  if (!read_tensor_status.ok()) {
+    LOG(ERROR) << read_tensor_status;
+    return -1;
+  }
+  const Tensor& resized_tensor = resized_tensors[0];
+```
+Load, resize, and process the input image.
+
+```C++
+  // Actually run the image through the model.
+  std::vector<Tensor> outputs;
+  Status run_status = session->Run({{FLAGS_input_layer, resized_tensor}},
+                                   {FLAGS_output_layer}, {}, &outputs);
+  if (!run_status.ok()) {
+    LOG(ERROR) << "Running model failed: " << run_status;
+    return -1;
+  }
+```
+Here we run the loaded graph with the image as an input.
+
+```C++
+  // This is for automated testing to make sure we get the expected result with
+  // the default settings. We know that label 866 (military uniform) should be
+  // the top label for the Admiral Hopper image.
+  if (FLAGS_self_test) {
+    bool expected_matches;
+    Status check_status = CheckTopLabel(outputs, 866, &expected_matches);
+    if (!check_status.ok()) {
+      LOG(ERROR) << "Running check failed: " << check_status;
+      return -1;
+    }
+    if (!expected_matches) {
+      LOG(ERROR) << "Self-test failed!";
+      return -1;
+    }
+  }
+```
+For testing purposes we can check to make sure we get the output we expect here.
+
+```C++
+  // Do something interesting with the results we've generated.
+  Status print_status = PrintTopLabels(outputs, FLAGS_labels);
+```
+Finally we print the labels we found.
+
+```C++
+  if (!print_status.ok()) {
+    LOG(ERROR) << "Running print failed: " << print_status;
+    return -1;
+  }
+```
+
+The error handling here is using TensorFlow's `Status`
+object, which is very convenient because it lets you know whether any error has
+occurred with the `ok()` checker, and then can be printed out to give a readable error
+message.
+
+In this case we are demonstrating object recognition, but you should be able to
+use very similar code on other models you've found or trained yourself, across
+all sorts of domains. We hope this small example gives you some ideas on how to
+use TensorFlow within your own products.
+
+> **EXERCISE**: Transfer learning is the idea that, if you know how to solve a task well, you
+should be able to transfer some of that understanding to solving related
+problems. One way to perform transfer learning is to remove the final
+classification layer of the network and extract
+the [next-to-last layer of the CNN](https://arxiv.org/abs/1310.1531), in this case a 2048-dimensional vector.
+
+
+## Resources for Learning More
+
+To learn about neural networks in general, Michael Nielsen's
+[free online book](http://neuralnetworksanddeeplearning.com/chap1.html)
+is an excellent resource. For convolutional neural networks in particular,
+Chris Olah has some
+[nice blog posts](https://colah.github.io/posts/2014-07-Conv-Nets-Modular/),
+and Michael Nielsen's book has a
+[great chapter](http://neuralnetworksanddeeplearning.com/chap6.html)
+covering them.
+ +To find out more about implementing convolutional neural networks, you can jump +to the TensorFlow @{$deep_cnn$deep convolutional networks tutorial}, +or start a bit more gently with our @{$layers$MNIST starter tutorial}. +Finally, if you want to get up to speed on research in this area, you can +read the recent work of all the papers referenced in this tutorial. + diff --git a/tensorflow/docs_src/tutorials/images/layers.md b/tensorflow/docs_src/tutorials/images/layers.md new file mode 100644 index 0000000000..12a215b50c --- /dev/null +++ b/tensorflow/docs_src/tutorials/images/layers.md @@ -0,0 +1,694 @@ +# Build a Convolutional Neural Network using Estimators + +The TensorFlow @{tf.layers$`layers` module} provides a high-level API that makes +it easy to construct a neural network. It provides methods that facilitate the +creation of dense (fully connected) layers and convolutional layers, adding +activation functions, and applying dropout regularization. In this tutorial, +you'll learn how to use `layers` to build a convolutional neural network model +to recognize the handwritten digits in the MNIST data set. + +![handwritten digits 0–9 from the MNIST data set](https://www.tensorflow.org/images/mnist_0-9.png) + +**The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) comprises 60,000 +training examples and 10,000 test examples of the handwritten digits 0–9, +formatted as 28x28-pixel monochrome images.** + +## Getting Started + +Let's set up the skeleton for our TensorFlow program. Create a file called +`cnn_mnist.py`, and add the following code: + +```python +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + +# Imports +import numpy as np +import tensorflow as tf + +tf.logging.set_verbosity(tf.logging.INFO) + +# Our application logic will be added here + +if __name__ == "__main__": + tf.app.run() +``` + +As you work through the tutorial, you'll add code to construct, train, and +evaluate the convolutional neural network. The complete, final code can be +[found here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/layers/cnn_mnist.py). + +## Intro to Convolutional Neural Networks + +Convolutional neural networks (CNNs) are the current state-of-the-art model +architecture for image classification tasks. CNNs apply a series of filters to +the raw pixel data of an image to extract and learn higher-level features, which +the model can then use for classification. CNNs contains three components: + +* **Convolutional layers**, which apply a specified number of convolution + filters to the image. For each subregion, the layer performs a set of + mathematical operations to produce a single value in the output feature map. + Convolutional layers then typically apply a + [ReLU activation function](https://en.wikipedia.org/wiki/Rectifier_\(neural_networks\)) to + the output to introduce nonlinearities into the model. + +* **Pooling layers**, which + [downsample the image data](https://en.wikipedia.org/wiki/Convolutional_neural_network#Pooling_layer) + extracted by the convolutional layers to reduce the dimensionality of the + feature map in order to decrease processing time. A commonly used pooling + algorithm is max pooling, which extracts subregions of the feature map + (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other + values. + +* **Dense (fully connected) layers**, which perform classification on the + features extracted by the convolutional layers and downsampled by the + pooling layers. 
In a dense layer, every node in the layer is connected to + every node in the preceding layer. + +Typically, a CNN is composed of a stack of convolutional modules that perform +feature extraction. Each module consists of a convolutional layer followed by a +pooling layer. The last convolutional module is followed by one or more dense +layers that perform classification. The final dense layer in a CNN contains a +single node for each target class in the model (all the possible classes the +model may predict), with a +[softmax](https://en.wikipedia.org/wiki/Softmax_function) activation function to +generate a value between 0–1 for each node (the sum of all these softmax values +is equal to 1). We can interpret the softmax values for a given image as +relative measurements of how likely it is that the image falls into each target +class. + +> Note: For a more comprehensive walkthrough of CNN architecture, see Stanford +> University's +> Convolutional Neural Networks for Visual Recognition course materials.
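+
+As a quick numeric illustration of the softmax values described above (plain
+NumPy, with made-up logit values):
+
+```python
+import numpy as np
+
+logits = np.array([2.0, 1.0, 0.1])      # raw scores from a final dense layer
+softmax = np.exp(logits) / np.sum(np.exp(logits))
+print(softmax)        # ~[0.659, 0.242, 0.099]: relative class likelihoods
+print(softmax.sum())  # 1.0
+```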

+ +## Building the CNN MNIST Classifier {#building_the_cnn_mnist_classifier} + +Let's build a model to classify the images in the MNIST dataset using the +following CNN architecture: + +1. **Convolutional Layer #1**: Applies 32 5x5 filters (extracting 5x5-pixel + subregions), with ReLU activation function +2. **Pooling Layer #1**: Performs max pooling with a 2x2 filter and stride of 2 + (which specifies that pooled regions do not overlap) +3. **Convolutional Layer #2**: Applies 64 5x5 filters, with ReLU activation + function +4. **Pooling Layer #2**: Again, performs max pooling with a 2x2 filter and + stride of 2 +5. **Dense Layer #1**: 1,024 neurons, with dropout regularization rate of 0.4 + (probability of 0.4 that any given element will be dropped during training) +6. **Dense Layer #2 (Logits Layer)**: 10 neurons, one for each digit target + class (0–9). + +The `tf.layers` module contains methods to create each of the three layer types +above: + +* `conv2d()`. Constructs a two-dimensional convolutional layer. Takes number + of filters, filter kernel size, padding, and activation function as + arguments. +* `max_pooling2d()`. Constructs a two-dimensional pooling layer using the + max-pooling algorithm. Takes pooling filter size and stride as arguments. +* `dense()`. Constructs a dense layer. Takes number of neurons and activation + function as arguments. + +Each of these methods accepts a tensor as input and returns a transformed tensor +as output. This makes it easy to connect one layer to another: just take the +output from one layer-creation method and supply it as input to another. + +Open `cnn_mnist.py` and add the following `cnn_model_fn` function, which +conforms to the interface expected by TensorFlow's Estimator API (more on this +later in [Create the Estimator](#create-the-estimator)). `cnn_mnist.py` takes +MNIST feature data, labels, and +@{tf.estimator.ModeKeys$model mode} (`TRAIN`, `EVAL`, `PREDICT`) as arguments; +configures the CNN; and returns predictions, loss, and a training operation: + +```python +def cnn_model_fn(features, labels, mode): + """Model function for CNN.""" + # Input Layer + input_layer = tf.reshape(features["x"], [-1, 28, 28, 1]) + + # Convolutional Layer #1 + conv1 = tf.layers.conv2d( + inputs=input_layer, + filters=32, + kernel_size=[5, 5], + padding="same", + activation=tf.nn.relu) + + # Pooling Layer #1 + pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2) + + # Convolutional Layer #2 and Pooling Layer #2 + conv2 = tf.layers.conv2d( + inputs=pool1, + filters=64, + kernel_size=[5, 5], + padding="same", + activation=tf.nn.relu) + pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2) + + # Dense Layer + pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64]) + dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu) + dropout = tf.layers.dropout( + inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN) + + # Logits Layer + logits = tf.layers.dense(inputs=dropout, units=10) + + predictions = { + # Generate predictions (for PREDICT and EVAL mode) + "classes": tf.argmax(input=logits, axis=1), + # Add `softmax_tensor` to the graph. It is used for PREDICT and by the + # `logging_hook`. 
+ "probabilities": tf.nn.softmax(logits, name="softmax_tensor") + } + + if mode == tf.estimator.ModeKeys.PREDICT: + return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions) + + # Calculate Loss (for both TRAIN and EVAL modes) + loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits) + + # Configure the Training Op (for TRAIN mode) + if mode == tf.estimator.ModeKeys.TRAIN: + optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001) + train_op = optimizer.minimize( + loss=loss, + global_step=tf.train.get_global_step()) + return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op) + + # Add evaluation metrics (for EVAL mode) + eval_metric_ops = { + "accuracy": tf.metrics.accuracy( + labels=labels, predictions=predictions["classes"])} + return tf.estimator.EstimatorSpec( + mode=mode, loss=loss, eval_metric_ops=eval_metric_ops) +``` + +The following sections (with headings corresponding to each code block above) +dive deeper into the `tf.layers` code used to create each layer, as well as how +to calculate loss, configure the training op, and generate predictions. If +you're already experienced with CNNs and @{$custom_estimators$TensorFlow `Estimator`s}, +and find the above code intuitive, you may want to skim these sections or just +skip ahead to ["Training and Evaluating the CNN MNIST Classifier"](#train_eval_mnist). + +### Input Layer + +The methods in the `layers` module for creating convolutional and pooling layers +for two-dimensional image data expect input tensors to have a shape of +[batch_size, image_height, image_width, +channels] by default. This behavior can be changed using the data_format parameter; defined as follows: + + +* _`batch_size`_. Size of the subset of examples to use when performing + gradient descent during training. +* _`image_height`_. Height of the example images. +* _`image_width`_. Width of the example images. +* _`channels`_. Number of color channels in the example images. For color + images, the number of channels is 3 (red, green, blue). For monochrome + images, there is just 1 channel (black). +* _`data_format`_. A string, one of `channels_last` (default) or `channels_first`. + `channels_last` corresponds to inputs with shape + `(batch, ..., channels)` while `channels_first` corresponds to + inputs with shape `(batch, channels, ...)`. + +Here, our MNIST dataset is composed of monochrome 28x28 pixel images, so the +desired shape for our input layer is [batch_size, 28, 28, +1]. + +To convert our input feature map (`features`) to this shape, we can perform the +following `reshape` operation: + +```python +input_layer = tf.reshape(features["x"], [-1, 28, 28, 1]) +``` + +Note that we've indicated `-1` for batch size, which specifies that this +dimension should be dynamically computed based on the number of input values in +`features["x"]`, holding the size of all other dimensions constant. This allows +us to treat `batch_size` as a hyperparameter that we can tune. For example, if +we feed examples into our model in batches of 5, `features["x"]` will contain +3,920 values (one value for each pixel in each image), and `input_layer` will +have a shape of `[5, 28, 28, 1]`. Similarly, if we feed examples in batches of +100, `features["x"]` will contain 78,400 values, and `input_layer` will have a +shape of `[100, 28, 28, 1]`. + +### Convolutional Layer #1 + +In our first convolutional layer, we want to apply 32 5x5 filters to the input +layer, with a ReLU activation function. 
We can use the `conv2d()` method in the +`layers` module to create this layer as follows: + +```python +conv1 = tf.layers.conv2d( + inputs=input_layer, + filters=32, + kernel_size=[5, 5], + padding="same", + activation=tf.nn.relu) +``` + +The `inputs` argument specifies our input tensor, which must have the shape +[batch_size, image_height, image_width, +channels]. Here, we're connecting our first convolutional layer +to `input_layer`, which has the shape [batch_size, 28, 28, +1]. + +> Note: conv2d() will instead accept a shape of +> [batch_size, channels, image_height, image_width] when passed the argument +> data_format=channels_first. + +The `filters` argument specifies the number of filters to apply (here, 32), and +`kernel_size` specifies the dimensions of the filters as [height, +width] (here, [5, 5]). + +

+> TIP: If filter height and width have the same value, you can instead specify
+> a single integer for `kernel_size`, e.g. `kernel_size=5`.

+ +The `padding` argument specifies one of two enumerated values +(case-insensitive): `valid` (default value) or `same`. To specify that the +output tensor should have the same height and width values as the input tensor, +we set `padding=same` here, which instructs TensorFlow to add 0 values to the +edges of the input tensor to preserve height and width of 28. (Without padding, +a 5x5 convolution over a 28x28 tensor will produce a 24x24 tensor, as there are +24x24 locations to extract a 5x5 tile from a 28x28 grid.) + +The `activation` argument specifies the activation function to apply to the +output of the convolution. Here, we specify ReLU activation with +@{tf.nn.relu}. + +Our output tensor produced by `conv2d()` has a shape of +[batch_size, 28, 28, 32]: the same height and width +dimensions as the input, but now with 32 channels holding the output from each +of the filters. + +### Pooling Layer #1 + +Next, we connect our first pooling layer to the convolutional layer we just +created. We can use the `max_pooling2d()` method in `layers` to construct a +layer that performs max pooling with a 2x2 filter and stride of 2: + +```python +pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2) +``` + +Again, `inputs` specifies the input tensor, with a shape of +[batch_size, image_height, image_width, +channels]. Here, our input tensor is `conv1`, the output from +the first convolutional layer, which has a shape of [batch_size, +28, 28, 32]. + +> Note: As with conv2d(), max_pooling2d() will instead +> accept a shape of [batch_size, channels, +> image_height, image_width] when passed the argument +> data_format=channels_first. + +The `pool_size` argument specifies the size of the max pooling filter as +[height, width] (here, `[2, 2]`). If both +dimensions have the same value, you can instead specify a single integer (e.g., +`pool_size=2`). + +The `strides` argument specifies the size of the stride. Here, we set a stride +of 2, which indicates that the subregions extracted by the filter should be +separated by 2 pixels in both the height and width dimensions (for a 2x2 filter, +this means that none of the regions extracted will overlap). If you want to set +different stride values for height and width, you can instead specify a tuple or +list (e.g., `stride=[3, 6]`). + +Our output tensor produced by `max_pooling2d()` (`pool1`) has a shape of +[batch_size, 14, 14, 32]: the 2x2 filter reduces height and width by 50% each. + +### Convolutional Layer #2 and Pooling Layer #2 + +We can connect a second convolutional and pooling layer to our CNN using +`conv2d()` and `max_pooling2d()` as before. For convolutional layer #2, we +configure 64 5x5 filters with ReLU activation, and for pooling layer #2, we use +the same specs as pooling layer #1 (a 2x2 max pooling filter with stride of 2): + +```python +conv2 = tf.layers.conv2d( + inputs=pool1, + filters=64, + kernel_size=[5, 5], + padding="same", + activation=tf.nn.relu) + +pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2) +``` + +Note that convolutional layer #2 takes the output tensor of our first pooling +layer (`pool1`) as input, and produces the tensor `conv2` as output. `conv2` +has a shape of [batch_size, 14, 14, 64], the same height and width as `pool1` (due to `padding="same"`), and 64 channels for the 64 +filters applied. + +Pooling layer #2 takes `conv2` as input, producing `pool2` as output. `pool2` +has shape [batch_size, 7, 7, 64] (50% reduction of height and width from `conv2`). 
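+
+A quick way to sanity-check this chain of shapes is to build the layers and
+inspect their static shapes (a standalone sketch, assuming the same layer
+parameters as above):
+
+```python
+import tensorflow as tf
+
+x = tf.placeholder(tf.float32, [None, 28, 28, 1])
+conv1 = tf.layers.conv2d(x, filters=32, kernel_size=[5, 5], padding="same",
+                         activation=tf.nn.relu)
+pool1 = tf.layers.max_pooling2d(conv1, pool_size=[2, 2], strides=2)
+conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=[5, 5], padding="same",
+                         activation=tf.nn.relu)
+pool2 = tf.layers.max_pooling2d(conv2, pool_size=[2, 2], strides=2)
+
+print(conv1.shape)  # (?, 28, 28, 32)
+print(pool1.shape)  # (?, 14, 14, 32)
+print(conv2.shape)  # (?, 14, 14, 64)
+print(pool2.shape)  # (?, 7, 7, 64)
+```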
+ +### Dense Layer + +Next, we want to add a dense layer (with 1,024 neurons and ReLU activation) to +our CNN to perform classification on the features extracted by the +convolution/pooling layers. Before we connect the layer, however, we'll flatten +our feature map (`pool2`) to shape [batch_size, +features], so that our tensor has only two dimensions: + +```python +pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64]) +``` + +In the `reshape()` operation above, the `-1` signifies that the *`batch_size`* +dimension will be dynamically calculated based on the number of examples in our +input data. Each example has 7 (`pool2` height) * 7 (`pool2` width) * 64 +(`pool2` channels) features, so we want the `features` dimension to have a value +of 7 * 7 * 64 (3136 in total). The output tensor, `pool2_flat`, has shape +[batch_size, 3136]. + +Now, we can use the `dense()` method in `layers` to connect our dense layer as +follows: + +```python +dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu) +``` + +The `inputs` argument specifies the input tensor: our flattened feature map, +`pool2_flat`. The `units` argument specifies the number of neurons in the dense +layer (1,024). The `activation` argument takes the activation function; again, +we'll use `tf.nn.relu` to add ReLU activation. + +To help improve the results of our model, we also apply dropout regularization +to our dense layer, using the `dropout` method in `layers`: + +```python +dropout = tf.layers.dropout( + inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN) +``` + +Again, `inputs` specifies the input tensor, which is the output tensor from our +dense layer (`dense`). + +The `rate` argument specifies the dropout rate; here, we use `0.4`, which means +40% of the elements will be randomly dropped out during training. + +The `training` argument takes a boolean specifying whether or not the model is +currently being run in training mode; dropout will only be performed if +`training` is `True`. Here, we check if the `mode` passed to our model function +`cnn_model_fn` is `TRAIN` mode. + +Our output tensor `dropout` has shape [batch_size, 1024]. + +### Logits Layer + +The final layer in our neural network is the logits layer, which will return the +raw values for our predictions. We create a dense layer with 10 neurons (one for +each target class 0–9), with linear activation (the default): + +```python +logits = tf.layers.dense(inputs=dropout, units=10) +``` + +Our final output tensor of the CNN, `logits`, has shape +[batch_size, 10]. + +### Generate Predictions {#generate_predictions} + +The logits layer of our model returns our predictions as raw values in a +[batch_size, 10]-dimensional tensor. Let's convert these +raw values into two different formats that our model function can return: + +* The **predicted class** for each example: a digit from 0–9. +* The **probabilities** for each possible target class for each example: the + probability that the example is a 0, is a 1, is a 2, etc. + +For a given example, our predicted class is the element in the corresponding row +of the logits tensor with the highest raw value. We can find the index of this +element using the @{tf.argmax} +function: + +```python +tf.argmax(input=logits, axis=1) +``` + +The `input` argument specifies the tensor from which to extract maximum +values—here `logits`. The `axis` argument specifies the axis of the `input` +tensor along which to find the greatest value. 
+
+We can derive probabilities from our logits layer by applying softmax activation
+using @{tf.nn.softmax}:
+
+```python
+tf.nn.softmax(logits, name="softmax_tensor")
+```
+
+> Note: We use the `name` argument to explicitly name this operation
+> `softmax_tensor`, so we can reference it later. (We'll set up logging for the
+> softmax values in ["Set Up a Logging Hook"](#set_up_a_logging_hook)).
+
+We compile our predictions in a dict, and return an `EstimatorSpec` object:
+
+```python
+predictions = {
+    "classes": tf.argmax(input=logits, axis=1),
+    "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
+}
+if mode == tf.estimator.ModeKeys.PREDICT:
+  return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)
+```
+
+### Calculate Loss {#calculating-loss}
+
+For both training and evaluation, we need to define a
+[loss function](https://en.wikipedia.org/wiki/Loss_function)
+that measures how closely the model's predictions match the target classes. For
+multiclass classification problems like MNIST,
+[cross entropy](https://en.wikipedia.org/wiki/Cross_entropy) is typically used
+as the loss metric. The following code calculates cross entropy when the model
+runs in either `TRAIN` or `EVAL` mode:
+
+```python
+loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
+```
+
+Let's take a closer look at what's happening above.
+
+Our `labels` tensor contains the class indices for our examples, e.g. `[1,
+9, ...]`. `logits` contains the linear outputs of our last layer.
+
+`tf.losses.sparse_softmax_cross_entropy` calculates the softmax cross-entropy
+(also known as categorical cross-entropy or negative log-likelihood) from these
+two inputs in an efficient, numerically stable way.
+
+
+### Configure the Training Op
+
+In the previous section, we defined loss for our CNN as the softmax
+cross-entropy of the logits layer and our labels. Let's configure our model to
+optimize this loss value during training. We'll use a learning rate of 0.001 and
+[stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)
+as the optimization algorithm:
+
+```python
+if mode == tf.estimator.ModeKeys.TRAIN:
+  optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
+  train_op = optimizer.minimize(
+      loss=loss,
+      global_step=tf.train.get_global_step())
+  return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
+```
+
+> Note: For a more in-depth look at configuring training ops for Estimator model
+> functions, see @{$custom_estimators#defining-the-training-op-for-the-model$"Defining the training op for the model"}
+> in the @{$custom_estimators$"Creating Estimators in tf.estimator"} tutorial.
+
+
+### Add evaluation metrics
+
+To add an accuracy metric to our model, we define an `eval_metric_ops` dict in
+`EVAL` mode as follows:
+
+```python
+eval_metric_ops = {
+    "accuracy": tf.metrics.accuracy(
+        labels=labels, predictions=predictions["classes"])}
+return tf.estimator.EstimatorSpec(
+    mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)
+```
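+
+A detail worth knowing: `tf.metrics.accuracy` returns a `(value, update_op)`
+pair, and evaluation runs the update op on every batch before reporting the
+final value. The following standalone sketch (not part of `cnn_mnist.py`)
+shows this behavior on toy data:
+
+```python
+import tensorflow as tf
+
+labels = tf.constant([1, 9, 3], dtype=tf.int32)
+predicted = tf.constant([1, 9, 5], dtype=tf.int32)
+accuracy, update_op = tf.metrics.accuracy(labels=labels, predictions=predicted)
+
+with tf.Session() as sess:
+  # The metric's running total and count live in local variables.
+  sess.run(tf.local_variables_initializer())
+  sess.run(update_op)        # accumulate one "batch" of results
+  print(sess.run(accuracy))  # ~0.6667 (2 of 3 predictions correct)
+```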
+
+## Training and Evaluating the CNN MNIST Classifier
+
+We've coded our MNIST CNN model function; now we're ready to train and evaluate
+it.
+
+### Load Training and Test Data
+
+First, let's load our training and test data. Add a `main()` function to
+`cnn_mnist.py` with the following code:
+
+```python
+def main(unused_argv):
+  # Load training and eval data
+  mnist = tf.contrib.learn.datasets.load_dataset("mnist")
+  train_data = mnist.train.images  # Returns np.array
+  train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
+  eval_data = mnist.test.images  # Returns np.array
+  eval_labels = np.asarray(mnist.test.labels, dtype=np.int32)
+```
+
+We store the training feature data (the raw pixel values for 55,000 images of
+hand-drawn digits) and training labels (the corresponding value from 0–9 for
+each image) as [numpy
+arrays](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html)
+in `train_data` and `train_labels`, respectively. Similarly, we store the
+evaluation feature data (10,000 images) and evaluation labels in `eval_data`
+and `eval_labels`, respectively.
+
+### Create the Estimator {#create-the-estimator}
+
+Next, let's create an `Estimator` (a TensorFlow class for performing high-level
+model training, evaluation, and inference) for our model. Add the following code
+to `main()`:
+
+```python
+# Create the Estimator
+mnist_classifier = tf.estimator.Estimator(
+    model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")
+```
+
+The `model_fn` argument specifies the model function to use for training,
+evaluation, and prediction; we pass it the `cnn_model_fn` we created in
+["Building the CNN MNIST Classifier."](#building-the-cnn-mnist-classifier) The
+`model_dir` argument specifies the directory where model data (checkpoints) will
+be saved (here, we specify the temp directory `/tmp/mnist_convnet_model`, but
+feel free to change to another directory of your choice).
+
+> Note: For an in-depth walkthrough of the TensorFlow `Estimator` API, see the
+> tutorial @{$custom_estimators$"Creating Estimators in tf.estimator."}
+
+### Set Up a Logging Hook {#set_up_a_logging_hook}
+
+Since CNNs can take a while to train, let's set up some logging so we can track
+progress during training. We can use TensorFlow's @{tf.train.SessionRunHook} to create a
+@{tf.train.LoggingTensorHook}
+that will log the probability values from the softmax layer of our CNN. Add the
+following to `main()`:
+
+```python
+# Set up logging for predictions
+tensors_to_log = {"probabilities": "softmax_tensor"}
+logging_hook = tf.train.LoggingTensorHook(
+    tensors=tensors_to_log, every_n_iter=50)
+```
+
+We store a dict of the tensors we want to log in `tensors_to_log`. Each key is a
+label of our choice that will be printed in the log output, and the
+corresponding value is the name of a `Tensor` in the TensorFlow graph. Here, our
+`probabilities` can be found in `softmax_tensor`, the name we gave our softmax
+operation earlier when we generated the probabilities in `cnn_model_fn`.
+
+> Note: If you don't explicitly assign a name to an operation via the `name`
+> argument, TensorFlow will assign a default name. A couple of easy ways to
+> discover the names applied to operations are to visualize your graph on
+> @{$graph_viz$TensorBoard} or to enable the
+> @{$guide/debugger$TensorFlow Debugger (tfdbg)}.
+
+Next, we create the `LoggingTensorHook`, passing `tensors_to_log` to the
+`tensors` argument. We set `every_n_iter=50`, which specifies that probabilities
+should be logged after every 50 steps of training.
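+
+If you want to confirm that a name refers to the tensor you expect, you can
+also look it up in the default graph (a standalone sketch, assuming the graph
+from `cnn_model_fn` has already been built; note the `:0` suffix that
+identifies the operation's first output):
+
+```python
+graph = tf.get_default_graph()
+probabilities = graph.get_tensor_by_name("softmax_tensor:0")
+print(probabilities)  # e.g. Tensor("softmax_tensor:0", shape=(?, 10), dtype=float32)
+```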
+
+### Train the Model
+
+Now we're ready to train our model, which we can do by creating `train_input_fn`
+and calling `train()` on `mnist_classifier`. Add the following to `main()`:
+
+```python
+# Train the model
+train_input_fn = tf.estimator.inputs.numpy_input_fn(
+    x={"x": train_data},
+    y=train_labels,
+    batch_size=100,
+    num_epochs=None,
+    shuffle=True)
+mnist_classifier.train(
+    input_fn=train_input_fn,
+    steps=20000,
+    hooks=[logging_hook])
+```
+
+In the `numpy_input_fn` call, we pass the training feature data and labels to
+`x` (as a dict) and `y`, respectively. We set a `batch_size` of `100` (which
+means that the model will train on minibatches of 100 examples at each step).
+`num_epochs=None` means that the model will train until the specified number of
+steps is reached. We also set `shuffle=True` to shuffle the training data.
+In the `train` call, we set `steps=20000`
+(which means the model will train for 20,000 steps total). We pass our
+`logging_hook` to the `hooks` argument, so that it will be triggered during
+training.
+
+### Evaluate the Model
+
+Once training is complete, we want to evaluate our model to determine its
+accuracy on the MNIST test set. We call the `evaluate` method, which evaluates
+the metrics we specified in the `eval_metric_ops` argument of `model_fn`.
+Add the following to `main()`:
+
+```python
+# Evaluate the model and print results
+eval_input_fn = tf.estimator.inputs.numpy_input_fn(
+    x={"x": eval_data},
+    y=eval_labels,
+    num_epochs=1,
+    shuffle=False)
+eval_results = mnist_classifier.evaluate(input_fn=eval_input_fn)
+print(eval_results)
+```
+
+To create `eval_input_fn`, we set `num_epochs=1`, so that the model evaluates
+the metrics over one epoch of data and returns the result. We also set
+`shuffle=False` to iterate through the data sequentially.
+
+### Run the Model
+
+We've coded the CNN model function, `Estimator`, and the training/evaluation
+logic; now let's see the results. Run `cnn_mnist.py`.
+
+> Note: Training CNNs is quite computationally intensive. Estimated completion
+> time of `cnn_mnist.py` will vary depending on your processor, but will likely
+> be upwards of 1 hour on CPU. To train more quickly, you can decrease the
+> number of `steps` passed to `train()`, but note that this will affect accuracy.
+
+As the model trains, you'll see log output like the following:
+
+```none
+INFO:tensorflow:loss = 2.36026, step = 1
+INFO:tensorflow:probabilities = [[ 0.07722801 0.08618255 0.09256398, ...]]
+...
+INFO:tensorflow:loss = 2.13119, step = 101
+INFO:tensorflow:global_step/sec: 5.44132
+...
+INFO:tensorflow:Loss for final step: 0.553216.
+
+INFO:tensorflow:Restored model from /tmp/mnist_convnet_model
+INFO:tensorflow:Eval steps [0,inf) for training step 20000.
+INFO:tensorflow:Input iterator is exhausted.
+INFO:tensorflow:Saving evaluation summary for step 20000: accuracy = 0.9733, loss = 0.0902271
+{'loss': 0.090227105, 'global_step': 20000, 'accuracy': 0.97329998}
+```
+
+Here, we've achieved an accuracy of 97.3% on our test data set.
+
+## Additional Resources
+
+To learn more about TensorFlow Estimators and CNNs in TensorFlow, see the
+following resources:
+
+* @{$custom_estimators$Creating Estimators in tf.estimator}
+  provides an introduction to the TensorFlow Estimator API. It walks through
+  configuring an Estimator, writing a model function, calculating loss, and
+  defining a training op.
+* @{$deep_cnn} walks through how to build an MNIST CNN classification model
+  *without estimators* using lower-level TensorFlow operations.
diff --git a/tensorflow/docs_src/tutorials/index.md b/tensorflow/docs_src/tutorials/index.md deleted file mode 100644 index 6bd3a3a897..0000000000 --- a/tensorflow/docs_src/tutorials/index.md +++ /dev/null @@ -1,59 +0,0 @@ -# Tutorials - - -This section contains tutorials demonstrating how to do specific tasks -in TensorFlow. If you are new to TensorFlow, we recommend reading -[Get Started with TensorFlow](/get_started/). - -## Images - -These tutorials cover different aspects of image recognition: - - * @{$layers$MNIST}, which introduces convolutional neural networks (CNNs) and - demonstrates how to build a CNN in TensorFlow. - * @{$image_recognition}, which introduces the field of image recognition and - uses a pre-trained model (Inception) for recognizing images. - * @{$image_retraining}, which has a wonderfully self-explanatory title. - * @{$deep_cnn}, which demonstrates how to build a small CNN for recognizing - images. This tutorial is aimed at advanced TensorFlow users. - - -## Sequences - -These tutorials focus on machine learning problems dealing with sequence data. - - * @{$recurrent}, which demonstrates how to use a - recurrent neural network to predict the next word in a sentence. - * @{$seq2seq}, which demonstrates how to use a - sequence-to-sequence model to translate text from English to French. - * @{$recurrent_quickdraw} - builds a classification model for drawings, directly from the sequence of - pen strokes. - * @{$audio_recognition}, which shows how to - build a basic speech recognition network. - -## Data representation - -These tutorials demonstrate various data representations that can be used in -TensorFlow. - - * @{$wide}, uses - @{tf.feature_column$feature columns} to feed a variety of data types - to linear model, to solve a classification problem. - * @{$wide_and_deep}, builds on the - above linear model tutorial, adding a deep feed-forward neural network - component and a DNN-compatible data representation. - * @{$word2vec}, which demonstrates how to - create an embedding for words. - * @{$kernel_methods}, - which shows how to improve the quality of a linear model by using explicit - kernel mappings. - -## Non Machine Learning - -Although TensorFlow specializes in machine learning, the core of TensorFlow is -a powerful numeric computation system which you can also use to solve other -kinds of math problems. 
For example: - - * @{$mandelbrot} - * @{$pdes} diff --git a/tensorflow/docs_src/tutorials/keras/basic_classification.md b/tensorflow/docs_src/tutorials/keras/basic_classification.md new file mode 100644 index 0000000000..91bbd85b24 --- /dev/null +++ b/tensorflow/docs_src/tutorials/keras/basic_classification.md @@ -0,0 +1,3 @@ +# Basic Classification + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_classification.ipynb) diff --git a/tensorflow/docs_src/tutorials/keras/basic_regression.md b/tensorflow/docs_src/tutorials/keras/basic_regression.md new file mode 100644 index 0000000000..a535f22f5a --- /dev/null +++ b/tensorflow/docs_src/tutorials/keras/basic_regression.md @@ -0,0 +1,3 @@ +# Basic Regression + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_regression.ipynb) diff --git a/tensorflow/docs_src/tutorials/keras/basic_text_classification.md b/tensorflow/docs_src/tutorials/keras/basic_text_classification.md new file mode 100644 index 0000000000..7c5d4f7896 --- /dev/null +++ b/tensorflow/docs_src/tutorials/keras/basic_text_classification.md @@ -0,0 +1,3 @@ +# Basic Text Classification + +[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/basic_text_classification.ipynb) diff --git a/tensorflow/docs_src/tutorials/keras/index.md b/tensorflow/docs_src/tutorials/keras/index.md new file mode 100644 index 0000000000..9d42281c8f --- /dev/null +++ b/tensorflow/docs_src/tutorials/keras/index.md @@ -0,0 +1,22 @@ +# Learn and use machine learning + +This notebook collection is inspired by the book +*[Deep Learning with Python](https://books.google.com/books?id=Yo3CAQAACAAJ)*. +These tutorials use `tf.keras`, TensorFlow's high-level Python API for building +and training deep learning models. To learn more about using Keras with +TensorFlow, see the [TensorFlow Keras Guide](../../guide/keras). + +Publisher's note: *Deep Learning with Python* introduces the field of deep +learning using the Python language and the powerful Keras library. Written by +Keras creator and Google AI researcher François Chollet, this book builds your +understanding through intuitive explanations and practical examples. + +To learn about machine learning fundamentals and concepts, consider taking the +[Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/). +Additional TensorFlow and machine learning resources are listed in [next steps](../next_steps). + +1. [Basic classification](./basic_classification) +2. [Text classification](./basic_text_classification) +3. [Regression](./basic_regression) +4. [Overfitting and underfitting](./overfit_and_underfit) +5. 
[Save and restore models](./save_and_restore_models)
diff --git a/tensorflow/docs_src/tutorials/keras/overfit_and_underfit.md b/tensorflow/docs_src/tutorials/keras/overfit_and_underfit.md
new file mode 100644
index 0000000000..e5b5ae7b5a
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/keras/overfit_and_underfit.md
@@ -0,0 +1,3 @@
+# Overfitting and Underfitting
+
+[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/overfit_and_underfit.ipynb)
diff --git a/tensorflow/docs_src/tutorials/keras/save_and_restore_models.md b/tensorflow/docs_src/tutorials/keras/save_and_restore_models.md
new file mode 100644
index 0000000000..44b3772945
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/keras/save_and_restore_models.md
@@ -0,0 +1,3 @@
+# Save and Restore Models
+
+[Colab notebook](https://colab.research.google.com/github/tensorflow/models/blob/master/samples/core/get_started/save_and_restore_models.ipynb)
diff --git a/tensorflow/docs_src/tutorials/kernel_methods.md b/tensorflow/docs_src/tutorials/kernel_methods.md
deleted file mode 100644
index 205e2a2d2c..0000000000
--- a/tensorflow/docs_src/tutorials/kernel_methods.md
+++ /dev/null
@@ -1,304 +0,0 @@
-# Improving Linear Models Using Explicit Kernel Methods
-
-Note: This document uses a deprecated version of @{tf.estimator},
-which has a @{tf.contrib.learn.Estimator$different interface}.
-It also uses other `contrib` methods whose
-@{$version_compat#not_covered$API may not be stable}.
-
-In this tutorial, we demonstrate how combining (explicit) kernel methods with
-linear models can drastically increase the latters' quality of predictions
-without significantly increasing training and inference times. Unlike dual
-kernel methods, explicit (primal) kernel methods scale well with the size of the
-training dataset both in terms of training/inference times and in terms of
-memory requirements.
-
-**Intended audience:** Even though we provide a high-level overview of concepts
-related to explicit kernel methods, this tutorial primarily targets readers who
-already have at least basic knowledge of kernel methods and Support Vector
-Machines (SVMs). If you are new to kernel methods, refer to either of the
-following sources for an introduction:
-
-* If you have a strong mathematical background:
-[Kernel Methods in Machine Learning](https://arxiv.org/pdf/math/0701907.pdf)
-* [Kernel method wikipedia page](https://en.wikipedia.org/wiki/Kernel_method)
-
-Currently, TensorFlow supports explicit kernel mappings for dense features only;
-TensorFlow will provide support for sparse features at a later release.
-
-This tutorial uses [tf.contrib.learn](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn)
-(TensorFlow's high-level Machine Learning API) Estimators for our ML models.
-If you are not familiar with this API, [tf.estimator Quickstart](https://www.tensorflow.org/get_started/estimator)
-is a good place to start. We will use the MNIST dataset. The tutorial consists
-of the following steps:
-
-* Load and prepare MNIST data for classification.
-* Construct a simple linear model, train it, and evaluate it on the eval data.
-* Replace the linear model with a kernelized linear model, re-train, and
-re-evaluate.
- -## Load and prepare MNIST data for classification -Run the following utility command to load the MNIST dataset: - -```python -data = tf.contrib.learn.datasets.mnist.load_mnist() -``` -The preceding method loads the entire MNIST dataset (containing 70K samples) and -splits it into train, validation, and test data with 55K, 5K, and 10K samples -respectively. Each split contains one numpy array for images (with shape -[sample_size, 784]) and one for labels (with shape [sample_size, 1]). In this -tutorial, we only use the train and validation splits to train and evaluate our -models respectively. - -In order to feed data to a `tf.contrib.learn Estimator`, it is helpful to convert -it to Tensors. For this, we will use an `input function` which adds Ops to the -TensorFlow graph that, when executed, create mini-batches of Tensors to be used -downstream. For more background on input functions, check -@{$premade_estimators#create_input_functions$this section on input functions}. -In this example, we will use the `tf.train.shuffle_batch` Op which, besides -converting numpy arrays to Tensors, allows us to specify the batch_size and -whether to randomize the input every time the input_fn Ops are executed -(randomization typically expedites convergence during training). The full code -for loading and preparing the data is shown in the snippet below. In this -example, we use mini-batches of size 256 for training and the entire sample -(5K entries) for evaluation. Feel free to experiment with different batch sizes. - -```python -import numpy as np -import tensorflow as tf - -def get_input_fn(dataset_split, batch_size, capacity=10000, min_after_dequeue=3000): - - def _input_fn(): - images_batch, labels_batch = tf.train.shuffle_batch( - tensors=[dataset_split.images, dataset_split.labels.astype(np.int32)], - batch_size=batch_size, - capacity=capacity, - min_after_dequeue=min_after_dequeue, - enqueue_many=True, - num_threads=4) - features_map = {'images': images_batch} - return features_map, labels_batch - - return _input_fn - -data = tf.contrib.learn.datasets.mnist.load_mnist() - -train_input_fn = get_input_fn(data.train, batch_size=256) -eval_input_fn = get_input_fn(data.validation, batch_size=5000) - -``` - -## Training a simple linear model -We can now train a linear model over the MNIST dataset. We will use the -@{tf.contrib.learn.LinearClassifier} estimator with 10 classes representing the -10 digits. The input features form a 784-dimensional dense vector which can -be specified as follows: - -```python -image_column = tf.contrib.layers.real_valued_column('images', dimension=784) -``` - -The full code for constructing, training and evaluating a LinearClassifier -estimator is as follows: - -```python -import time - -# Specify the feature(s) to be used by the estimator. -image_column = tf.contrib.layers.real_valued_column('images', dimension=784) -estimator = tf.contrib.learn.LinearClassifier(feature_columns=[image_column], n_classes=10) - -# Train. -start = time.time() -estimator.fit(input_fn=train_input_fn, steps=2000) -end = time.time() -print('Elapsed time: {} seconds'.format(end - start)) - -# Evaluate and report metrics. -eval_metrics = estimator.evaluate(input_fn=eval_input_fn, steps=1) -print(eval_metrics) -``` -The following table summarizes the results on the eval data. - -metric | value -:------------ | :------------ -loss | 0.25 to 0.30 -accuracy | 92.5% -training time | ~25 seconds on my machine - -Note: Metrics will vary depending on various factors. 
- -In addition to experimenting with the (training) batch size and the number of -training steps, there are a couple other parameters that can be tuned as well. -For instance, you can change the optimization method used to minimize the loss -by explicitly selecting another optimizer from the collection of -[available optimizers](https://www.tensorflow.org/code/tensorflow/python/training). -As an example, the following code constructs a LinearClassifier estimator that -uses the Follow-The-Regularized-Leader (FTRL) optimization strategy with a -specific learning rate and L2-regularization. - - -```python -optimizer = tf.train.FtrlOptimizer(learning_rate=5.0, l2_regularization_strength=1.0) -estimator = tf.contrib.learn.LinearClassifier( - feature_columns=[image_column], n_classes=10, optimizer=optimizer) -``` - -Regardless of the values of the parameters, the maximum accuracy a linear model -can achieve on this dataset caps at around **93%**. - -## Using explicit kernel mappings with the linear model. -The relatively high error (~7%) of the linear model over MNIST indicates that -the input data is not linearly separable. We will use explicit kernel mappings -to reduce the classification error. - -**Intuition:** The high-level idea is to use a non-linear map to transform the -input space to another feature space (of possibly higher dimension) where the -(transformed) features are (almost) linearly separable and then apply a linear -model on the mapped features. This is shown in the following figure: - -
-[Figure: a non-linear map transforms the input space into a feature space where
-the mapped features are (almost) linearly separable.]
- - -### Technical details -In this example we will use **Random Fourier Features**, introduced in the -["Random Features for Large-Scale Kernel Machines"](https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf) -paper by Rahimi and Recht, to map the input data. Random Fourier Features map a -vector \\(\mathbf{x} \in \mathbb{R}^d\\) to \\(\mathbf{x'} \in \mathbb{R}^D\\) -via the following mapping: - -$$ -RFFM(\cdot): \mathbb{R}^d \to \mathbb{R}^D, \quad -RFFM(\mathbf{x}) = \cos(\mathbf{\Omega} \cdot \mathbf{x}+ \mathbf{b}) -$$ - -where \\(\mathbf{\Omega} \in \mathbb{R}^{D \times d}\\), -\\(\mathbf{x} \in \mathbb{R}^d,\\) \\(\mathbf{b} \in \mathbb{R}^D\\) and the -cosine is applied element-wise. - -In this example, the entries of \\(\mathbf{\Omega}\\) and \\(\mathbf{b}\\) are -sampled from distributions such that the mapping satisfies the following -property: - -$$ -RFFM(\mathbf{x})^T \cdot RFFM(\mathbf{y}) \approx -e^{-\frac{\|\mathbf{x} - \mathbf{y}\|^2}{2 \sigma^2}} -$$ - -The right-hand-side quantity of the expression above is known as the RBF (or -Gaussian) kernel function. This function is one of the most-widely used kernel -functions in Machine Learning and implicitly measures similarity in a different, -much higher dimensional space than the original one. See -[Radial basis function kernel](https://en.wikipedia.org/wiki/Radial_basis_function_kernel) -for more details. - -### Kernel classifier -@{tf.contrib.kernel_methods.KernelLinearClassifier} is a pre-packaged -`tf.contrib.learn` estimator that combines the power of explicit kernel mappings -with linear models. Its constructor is almost identical to that of the -LinearClassifier estimator with the additional option to specify a list of -explicit kernel mappings to be applied to each feature the classifier uses. The -following code snippet demonstrates how to replace LinearClassifier with -KernelLinearClassifier. - - -```python -# Specify the feature(s) to be used by the estimator. This is identical to the -# code used for the LinearClassifier. -image_column = tf.contrib.layers.real_valued_column('images', dimension=784) -optimizer = tf.train.FtrlOptimizer( - learning_rate=50.0, l2_regularization_strength=0.001) - - -kernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper( - input_dim=784, output_dim=2000, stddev=5.0, name='rffm') -kernel_mappers = {image_column: [kernel_mapper]} -estimator = tf.contrib.kernel_methods.KernelLinearClassifier( - n_classes=10, optimizer=optimizer, kernel_mappers=kernel_mappers) - -# Train. -start = time.time() -estimator.fit(input_fn=train_input_fn, steps=2000) -end = time.time() -print('Elapsed time: {} seconds'.format(end - start)) - -# Evaluate and report metrics. -eval_metrics = estimator.evaluate(input_fn=eval_input_fn, steps=1) -print(eval_metrics) -``` -The only additional parameter passed to `KernelLinearClassifier` is a dictionary -from feature_columns to a list of kernel mappings to be applied to the -corresponding feature column. 
The following lines instruct the classifier to -first map the initial 784-dimensional images to 2000-dimensional vectors using -random Fourier features and then learn a linear model on the transformed -vectors: - -```python -kernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper( - input_dim=784, output_dim=2000, stddev=5.0, name='rffm') -kernel_mappers = {image_column: [kernel_mapper]} -estimator = tf.contrib.kernel_methods.KernelLinearClassifier( - n_classes=10, optimizer=optimizer, kernel_mappers=kernel_mappers) -``` -Notice the `stddev` parameter. This is the standard deviation (\\(\sigma\\)) of -the approximated RBF kernel and controls the similarity measure used in -classification. `stddev` is typically determined via hyperparameter tuning. - -The results of running the preceding code are summarized in the following table. -We can further increase the accuracy by increasing the output dimension of the -mapping and tuning the standard deviation. - -metric | value -:------------ | :------------ -loss | 0.10 -accuracy | 97% -training time | ~35 seconds on my machine - - -### stddev -The classification quality is very sensitive to the value of stddev. The -following table shows the accuracy of the classifier on the eval data for -different values of stddev. The optimal value is stddev=5.0. Notice how too -small or too high stddev values can dramatically decrease the accuracy of the -classification. - -stddev | eval accuracy -:----- | :------------ -1.0 | 0.1362 -2.0 | 0.4764 -4.0 | 0.9654 -5.0 | 0.9766 -8.0 | 0.9714 -16.0 | 0.8878 - -### Output dimension -Intuitively, the larger the output dimension of the mapping, the closer the -inner product of two mapped vectors approximates the kernel, which typically -translates to better classification accuracy. Another way to think about this is -that the output dimension equals the number of weights of the linear model; the -larger this dimension, the larger the "degrees of freedom" of the model. -However, after a certain threshold, higher output dimensions increase the -accuracy by very little, while making training take more time. This is shown in -the following two Figures which depict the eval accuracy as a function of the -output dimension and the training time, respectively. - -![image](https://www.tensorflow.org/versions/master/images/acc_vs_outdim.png) -![image](https://www.tensorflow.org/versions/master/images/acc-vs-trn_time.png) - - -## Summary -Explicit kernel mappings combine the predictive power of nonlinear models with -the scalability of linear models. Unlike traditional dual kernel methods, -explicit kernel methods can scale to millions or hundreds of millions of -samples. When using explicit kernel mappings, consider the following tips: - -* Random Fourier Features can be particularly effective for datasets with dense -features. -* The parameters of the kernel mapping are often data-dependent. Model quality -can be very sensitive to these parameters. Use hyperparameter tuning to find the -optimal values. -* If you have multiple numerical features, concatenate them into a single -multi-dimensional feature and apply the kernel mapping to the concatenated -vector. 
diff --git a/tensorflow/docs_src/tutorials/layers.md b/tensorflow/docs_src/tutorials/layers.md deleted file mode 100644 index 212e337637..0000000000 --- a/tensorflow/docs_src/tutorials/layers.md +++ /dev/null @@ -1,727 +0,0 @@ -# A Guide to TF Layers: Building a Convolutional Neural Network - -The TensorFlow @{tf.layers$`layers` module} provides a high-level API that makes -it easy to construct a neural network. It provides methods that facilitate the -creation of dense (fully connected) layers and convolutional layers, adding -activation functions, and applying dropout regularization. In this tutorial, -you'll learn how to use `layers` to build a convolutional neural network model -to recognize the handwritten digits in the MNIST data set. - -![handwritten digits 0–9 from the MNIST data set](https://www.tensorflow.org/images/mnist_0-9.png) - -**The [MNIST dataset](http://yann.lecun.com/exdb/mnist/) comprises 60,000 -training examples and 10,000 test examples of the handwritten digits 0–9, -formatted as 28x28-pixel monochrome images.** - -## Getting Started - -Let's set up the skeleton for our TensorFlow program. Create a file called -`cnn_mnist.py`, and add the following code: - -```python -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -# Imports -import numpy as np -import tensorflow as tf - -tf.logging.set_verbosity(tf.logging.INFO) - -# Our application logic will be added here - -if __name__ == "__main__": - tf.app.run() -``` - -As you work through the tutorial, you'll add code to construct, train, and -evaluate the convolutional neural network. The complete, final code can be -[found here](https://www.tensorflow.org/code/tensorflow/examples/tutorials/layers/cnn_mnist.py). - -## Intro to Convolutional Neural Networks - -Convolutional neural networks (CNNs) are the current state-of-the-art model -architecture for image classification tasks. CNNs apply a series of filters to -the raw pixel data of an image to extract and learn higher-level features, which -the model can then use for classification. CNNs contains three components: - -* **Convolutional layers**, which apply a specified number of convolution - filters to the image. For each subregion, the layer performs a set of - mathematical operations to produce a single value in the output feature map. - Convolutional layers then typically apply a - [ReLU activation function](https://en.wikipedia.org/wiki/Rectifier_\(neural_networks\)) to - the output to introduce nonlinearities into the model. - -* **Pooling layers**, which - [downsample the image data](https://en.wikipedia.org/wiki/Convolutional_neural_network#Pooling_layer) - extracted by the convolutional layers to reduce the dimensionality of the - feature map in order to decrease processing time. A commonly used pooling - algorithm is max pooling, which extracts subregions of the feature map - (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other - values. - -* **Dense (fully connected) layers**, which perform classification on the - features extracted by the convolutional layers and downsampled by the - pooling layers. In a dense layer, every node in the layer is connected to - every node in the preceding layer. - -Typically, a CNN is composed of a stack of convolutional modules that perform -feature extraction. Each module consists of a convolutional layer followed by a -pooling layer. The last convolutional module is followed by one or more dense -layers that perform classification. 
The final dense layer in a CNN contains a
-single node for each target class in the model (all the possible classes the
-model may predict), with a
-[softmax](https://en.wikipedia.org/wiki/Softmax_function) activation function to
-generate a value between 0–1 for each node (the sum of all these softmax values
-is equal to 1). We can interpret the softmax values for a given image as
-relative measurements of how likely it is that the image falls into each target
-class.
-
-> Note: For a more comprehensive walkthrough of CNN architecture, see Stanford
-> University's
-> Convolutional Neural Networks for Visual Recognition course materials.
- -## Building the CNN MNIST Classifier {#building_the_cnn_mnist_classifier} - -Let's build a model to classify the images in the MNIST dataset using the -following CNN architecture: - -1. **Convolutional Layer #1**: Applies 32 5x5 filters (extracting 5x5-pixel - subregions), with ReLU activation function -2. **Pooling Layer #1**: Performs max pooling with a 2x2 filter and stride of 2 - (which specifies that pooled regions do not overlap) -3. **Convolutional Layer #2**: Applies 64 5x5 filters, with ReLU activation - function -4. **Pooling Layer #2**: Again, performs max pooling with a 2x2 filter and - stride of 2 -5. **Dense Layer #1**: 1,024 neurons, with dropout regularization rate of 0.4 - (probability of 0.4 that any given element will be dropped during training) -6. **Dense Layer #2 (Logits Layer)**: 10 neurons, one for each digit target - class (0–9). - -The `tf.layers` module contains methods to create each of the three layer types -above: - -* `conv2d()`. Constructs a two-dimensional convolutional layer. Takes number - of filters, filter kernel size, padding, and activation function as - arguments. -* `max_pooling2d()`. Constructs a two-dimensional pooling layer using the - max-pooling algorithm. Takes pooling filter size and stride as arguments. -* `dense()`. Constructs a dense layer. Takes number of neurons and activation - function as arguments. - -Each of these methods accepts a tensor as input and returns a transformed tensor -as output. This makes it easy to connect one layer to another: just take the -output from one layer-creation method and supply it as input to another. - -Open `cnn_mnist.py` and add the following `cnn_model_fn` function, which -conforms to the interface expected by TensorFlow's Estimator API (more on this -later in [Create the Estimator](#create-the-estimator)). `cnn_mnist.py` takes -MNIST feature data, labels, and -@{tf.estimator.ModeKeys$model mode} (`TRAIN`, `EVAL`, `PREDICT`) as arguments; -configures the CNN; and returns predictions, loss, and a training operation: - -```python -def cnn_model_fn(features, labels, mode): - """Model function for CNN.""" - # Input Layer - input_layer = tf.reshape(features["x"], [-1, 28, 28, 1]) - - # Convolutional Layer #1 - conv1 = tf.layers.conv2d( - inputs=input_layer, - filters=32, - kernel_size=[5, 5], - padding="same", - activation=tf.nn.relu) - - # Pooling Layer #1 - pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2) - - # Convolutional Layer #2 and Pooling Layer #2 - conv2 = tf.layers.conv2d( - inputs=pool1, - filters=64, - kernel_size=[5, 5], - padding="same", - activation=tf.nn.relu) - pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2) - - # Dense Layer - pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64]) - dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu) - dropout = tf.layers.dropout( - inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN) - - # Logits Layer - logits = tf.layers.dense(inputs=dropout, units=10) - - predictions = { - # Generate predictions (for PREDICT and EVAL mode) - "classes": tf.argmax(input=logits, axis=1), - # Add `softmax_tensor` to the graph. It is used for PREDICT and by the - # `logging_hook`. 
- "probabilities": tf.nn.softmax(logits, name="softmax_tensor") - } - - if mode == tf.estimator.ModeKeys.PREDICT: - return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions) - - # Calculate Loss (for both TRAIN and EVAL modes) - loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits) - - # Configure the Training Op (for TRAIN mode) - if mode == tf.estimator.ModeKeys.TRAIN: - optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001) - train_op = optimizer.minimize( - loss=loss, - global_step=tf.train.get_global_step()) - return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op) - - # Add evaluation metrics (for EVAL mode) - eval_metric_ops = { - "accuracy": tf.metrics.accuracy( - labels=labels, predictions=predictions["classes"])} - return tf.estimator.EstimatorSpec( - mode=mode, loss=loss, eval_metric_ops=eval_metric_ops) -``` - -The following sections (with headings corresponding to each code block above) -dive deeper into the `tf.layers` code used to create each layer, as well as how -to calculate loss, configure the training op, and generate predictions. If -you're already experienced with CNNs and @{$custom_estimators$TensorFlow `Estimator`s}, -and find the above code intuitive, you may want to skim these sections or just -skip ahead to ["Training and Evaluating the CNN MNIST Classifier"](#train_eval_mnist). - -### Input Layer - -The methods in the `layers` module for creating convolutional and pooling layers -for two-dimensional image data expect input tensors to have a shape of -[batch_size, image_height, image_width, -channels] by default. This behavior can be changed using the data_format parameter; defined as follows: - - -* _`batch_size`_. Size of the subset of examples to use when performing - gradient descent during training. -* _`image_height`_. Height of the example images. -* _`image_width`_. Width of the example images. -* _`channels`_. Number of color channels in the example images. For color - images, the number of channels is 3 (red, green, blue). For monochrome - images, there is just 1 channel (black). -* _`data_format`_. A string, one of `channels_last` (default) or `channels_first`. - `channels_last` corresponds to inputs with shape - `(batch, ..., channels)` while `channels_first` corresponds to - inputs with shape `(batch, channels, ...)`. - -Here, our MNIST dataset is composed of monochrome 28x28 pixel images, so the -desired shape for our input layer is [batch_size, 28, 28, -1]. - -To convert our input feature map (`features`) to this shape, we can perform the -following `reshape` operation: - -```python -input_layer = tf.reshape(features["x"], [-1, 28, 28, 1]) -``` - -Note that we've indicated `-1` for batch size, which specifies that this -dimension should be dynamically computed based on the number of input values in -`features["x"]`, holding the size of all other dimensions constant. This allows -us to treat `batch_size` as a hyperparameter that we can tune. For example, if -we feed examples into our model in batches of 5, `features["x"]` will contain -3,920 values (one value for each pixel in each image), and `input_layer` will -have a shape of `[5, 28, 28, 1]`. Similarly, if we feed examples in batches of -100, `features["x"]` will contain 78,400 values, and `input_layer` will have a -shape of `[100, 28, 28, 1]`. - -### Convolutional Layer #1 - -In our first convolutional layer, we want to apply 32 5x5 filters to the input -layer, with a ReLU activation function. 
We can use the `conv2d()` method in the -`layers` module to create this layer as follows: - -```python -conv1 = tf.layers.conv2d( - inputs=input_layer, - filters=32, - kernel_size=[5, 5], - padding="same", - activation=tf.nn.relu) -``` - -The `inputs` argument specifies our input tensor, which must have the shape -[batch_size, image_height, image_width, -channels]. Here, we're connecting our first convolutional layer -to `input_layer`, which has the shape [batch_size, 28, 28, -1]. - -> Note: conv2d() will instead accept a shape of -> [batch_size, channels, image_height, image_width] when passed the argument -> data_format=channels_first. - -The `filters` argument specifies the number of filters to apply (here, 32), and -`kernel_size` specifies the dimensions of the filters as [height, -width] (here, [5, 5]). - -

-TIP: If filter height and width have the same value, you can instead specify a
-single integer for kernel_size—e.g., kernel_size=5.
- -The `padding` argument specifies one of two enumerated values -(case-insensitive): `valid` (default value) or `same`. To specify that the -output tensor should have the same height and width values as the input tensor, -we set `padding=same` here, which instructs TensorFlow to add 0 values to the -edges of the input tensor to preserve height and width of 28. (Without padding, -a 5x5 convolution over a 28x28 tensor will produce a 24x24 tensor, as there are -24x24 locations to extract a 5x5 tile from a 28x28 grid.) - -The `activation` argument specifies the activation function to apply to the -output of the convolution. Here, we specify ReLU activation with -@{tf.nn.relu}. - -Our output tensor produced by `conv2d()` has a shape of -[batch_size, 28, 28, 32]: the same height and width -dimensions as the input, but now with 32 channels holding the output from each -of the filters. - -### Pooling Layer #1 - -Next, we connect our first pooling layer to the convolutional layer we just -created. We can use the `max_pooling2d()` method in `layers` to construct a -layer that performs max pooling with a 2x2 filter and stride of 2: - -```python -pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2) -``` - -Again, `inputs` specifies the input tensor, with a shape of -[batch_size, image_height, image_width, -channels]. Here, our input tensor is `conv1`, the output from -the first convolutional layer, which has a shape of [batch_size, -28, 28, 32]. - -> Note: As with conv2d(), max_pooling2d() will instead -> accept a shape of [batch_size, channels, -> image_height, image_width] when passed the argument -> data_format=channels_first. - -The `pool_size` argument specifies the size of the max pooling filter as -[height, width] (here, `[2, 2]`). If both -dimensions have the same value, you can instead specify a single integer (e.g., -`pool_size=2`). - -The `strides` argument specifies the size of the stride. Here, we set a stride -of 2, which indicates that the subregions extracted by the filter should be -separated by 2 pixels in both the height and width dimensions (for a 2x2 filter, -this means that none of the regions extracted will overlap). If you want to set -different stride values for height and width, you can instead specify a tuple or -list (e.g., `stride=[3, 6]`). - -Our output tensor produced by `max_pooling2d()` (`pool1`) has a shape of -[batch_size, 14, 14, 32]: the 2x2 filter reduces height and width by 50% each. - -### Convolutional Layer #2 and Pooling Layer #2 - -We can connect a second convolutional and pooling layer to our CNN using -`conv2d()` and `max_pooling2d()` as before. For convolutional layer #2, we -configure 64 5x5 filters with ReLU activation, and for pooling layer #2, we use -the same specs as pooling layer #1 (a 2x2 max pooling filter with stride of 2): - -```python -conv2 = tf.layers.conv2d( - inputs=pool1, - filters=64, - kernel_size=[5, 5], - padding="same", - activation=tf.nn.relu) - -pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2) -``` - -Note that convolutional layer #2 takes the output tensor of our first pooling -layer (`pool1`) as input, and produces the tensor `conv2` as output. `conv2` -has a shape of [batch_size, 14, 14, 64], the same height and width as `pool1` (due to `padding="same"`), and 64 channels for the 64 -filters applied. - -Pooling layer #2 takes `conv2` as input, producing `pool2` as output. `pool2` -has shape [batch_size, 7, 7, 64] (50% reduction of height and width from `conv2`). 
- -### Dense Layer - -Next, we want to add a dense layer (with 1,024 neurons and ReLU activation) to -our CNN to perform classification on the features extracted by the -convolution/pooling layers. Before we connect the layer, however, we'll flatten -our feature map (`pool2`) to shape [batch_size, -features], so that our tensor has only two dimensions: - -```python -pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64]) -``` - -In the `reshape()` operation above, the `-1` signifies that the *`batch_size`* -dimension will be dynamically calculated based on the number of examples in our -input data. Each example has 7 (`pool2` height) * 7 (`pool2` width) * 64 -(`pool2` channels) features, so we want the `features` dimension to have a value -of 7 * 7 * 64 (3136 in total). The output tensor, `pool2_flat`, has shape -[batch_size, 3136]. - -Now, we can use the `dense()` method in `layers` to connect our dense layer as -follows: - -```python -dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu) -``` - -The `inputs` argument specifies the input tensor: our flattened feature map, -`pool2_flat`. The `units` argument specifies the number of neurons in the dense -layer (1,024). The `activation` argument takes the activation function; again, -we'll use `tf.nn.relu` to add ReLU activation. - -To help improve the results of our model, we also apply dropout regularization -to our dense layer, using the `dropout` method in `layers`: - -```python -dropout = tf.layers.dropout( - inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN) -``` - -Again, `inputs` specifies the input tensor, which is the output tensor from our -dense layer (`dense`). - -The `rate` argument specifies the dropout rate; here, we use `0.4`, which means -40% of the elements will be randomly dropped out during training. - -The `training` argument takes a boolean specifying whether or not the model is -currently being run in training mode; dropout will only be performed if -`training` is `True`. Here, we check if the `mode` passed to our model function -`cnn_model_fn` is `TRAIN` mode. - -Our output tensor `dropout` has shape [batch_size, 1024]. - -### Logits Layer - -The final layer in our neural network is the logits layer, which will return the -raw values for our predictions. We create a dense layer with 10 neurons (one for -each target class 0–9), with linear activation (the default): - -```python -logits = tf.layers.dense(inputs=dropout, units=10) -``` - -Our final output tensor of the CNN, `logits`, has shape -[batch_size, 10]. - -### Generate Predictions {#generate_predictions} - -The logits layer of our model returns our predictions as raw values in a -[batch_size, 10]-dimensional tensor. Let's convert these -raw values into two different formats that our model function can return: - -* The **predicted class** for each example: a digit from 0–9. -* The **probabilities** for each possible target class for each example: the - probability that the example is a 0, is a 1, is a 2, etc. - -For a given example, our predicted class is the element in the corresponding row -of the logits tensor with the highest raw value. We can find the index of this -element using the @{tf.argmax} -function: - -```python -tf.argmax(input=logits, axis=1) -``` - -The `input` argument specifies the tensor from which to extract maximum -values—here `logits`. The `axis` argument specifies the axis of the `input` -tensor along which to find the greatest value. 
Here, we want to find the largest -value along the dimension with index of 1, which corresponds to our predictions -(recall that our logits tensor has shape [batch_size, -10]). - -We can derive probabilities from our logits layer by applying softmax activation -using @{tf.nn.softmax}: - -```python -tf.nn.softmax(logits, name="softmax_tensor") -``` - -> Note: We use the `name` argument to explicitly name this operation -> `softmax_tensor`, so we can reference it later. (We'll set up logging for the -> softmax values in ["Set Up a Logging Hook"](#set-up-a-logging-hook)). - -We compile our predictions in a dict, and return an `EstimatorSpec` object: - -```python -predictions = { - "classes": tf.argmax(input=logits, axis=1), - "probabilities": tf.nn.softmax(logits, name="softmax_tensor") -} -if mode == tf.estimator.ModeKeys.PREDICT: - return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions) -``` - -### Calculate Loss {#calculating-loss} - -For both training and evaluation, we need to define a -[loss function](https://en.wikipedia.org/wiki/Loss_function) -that measures how closely the model's predictions match the target classes. For -multiclass classification problems like MNIST, -[cross entropy](https://en.wikipedia.org/wiki/Cross_entropy) is typically used -as the loss metric. The following code calculates cross entropy when the model -runs in either `TRAIN` or `EVAL` mode: - -```python -onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10) -loss = tf.losses.softmax_cross_entropy( - onehot_labels=onehot_labels, logits=logits) -``` - -Let's take a closer look at what's happening above. - -Our `labels` tensor contains a list of predictions for our examples, e.g. `[1, -9, ...]`. In order to calculate cross-entropy, first we need to convert `labels` -to the corresponding -[one-hot encoding](https://www.quora.com/What-is-one-hot-encoding-and-when-is-it-used-in-data-science): - -```none -[[0, 1, 0, 0, 0, 0, 0, 0, 0, 0], - [0, 0, 0, 0, 0, 0, 0, 0, 0, 1], - ...] -``` - -We use the @{tf.one_hot} function -to perform this conversion. `tf.one_hot()` has two required arguments: - -* `indices`. The locations in the one-hot tensor that will have "on - values"—i.e., the locations of `1` values in the tensor shown above. -* `depth`. The depth of the one-hot tensor—i.e., the number of target classes. - Here, the depth is `10`. - -The following code creates the one-hot tensor for our labels, `onehot_labels`: - -```python -onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10) -``` - -Because `labels` contains a series of values from 0–9, `indices` is just our -`labels` tensor, with values cast to integers. The `depth` is `10` because we -have 10 possible target classes, one for each digit. - -Next, we compute cross-entropy of `onehot_labels` and the softmax of the -predictions from our logits layer. `tf.losses.softmax_cross_entropy()` takes -`onehot_labels` and `logits` as arguments, performs softmax activation on -`logits`, calculates cross-entropy, and returns our `loss` as a scalar `Tensor`: - -```python -loss = tf.losses.softmax_cross_entropy( - onehot_labels=onehot_labels, logits=logits) -``` - -### Configure the Training Op - -In the previous section, we defined loss for our CNN as the softmax -cross-entropy of the logits layer and our labels. Let's configure our model to -optimize this loss value during training. 
We'll use a learning rate of 0.001 and -[stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) -as the optimization algorithm: - -```python -if mode == tf.estimator.ModeKeys.TRAIN: - optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001) - train_op = optimizer.minimize( - loss=loss, - global_step=tf.train.get_global_step()) - return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op) -``` - -> Note: For a more in-depth look at configuring training ops for Estimator model -> functions, see @{$custom_estimators#defining-the-training-op-for-the-model$"Defining the training op for the model"} -> in the @{$custom_estimators$"Creating Estimations in tf.estimator"} tutorial. - - -### Add evaluation metrics - -To add accuracy metric in our model, we define `eval_metric_ops` dict in EVAL -mode as follows: - -```python -eval_metric_ops = { - "accuracy": tf.metrics.accuracy( - labels=labels, predictions=predictions["classes"])} -return tf.estimator.EstimatorSpec( - mode=mode, loss=loss, eval_metric_ops=eval_metric_ops) -``` - - -## Training and Evaluating the CNN MNIST Classifier - -We've coded our MNIST CNN model function; now we're ready to train and evaluate -it. - -### Load Training and Test Data - -First, let's load our training and test data. Add a `main()` function to -`cnn_mnist.py` with the following code: - -```python -def main(unused_argv): - # Load training and eval data - mnist = tf.contrib.learn.datasets.load_dataset("mnist") - train_data = mnist.train.images # Returns np.array - train_labels = np.asarray(mnist.train.labels, dtype=np.int32) - eval_data = mnist.test.images # Returns np.array - eval_labels = np.asarray(mnist.test.labels, dtype=np.int32) -``` - -We store the training feature data (the raw pixel values for 55,000 images of -hand-drawn digits) and training labels (the corresponding value from 0–9 for -each image) as [numpy -arrays](https://docs.scipy.org/doc/numpy/reference/generated/numpy.array.html) -in `train_data` and `train_labels`, respectively. Similarly, we store the -evaluation feature data (10,000 images) and evaluation labels in `eval_data` -and `eval_labels`, respectively. - -### Create the Estimator {#create-the-estimator} - -Next, let's create an `Estimator` (a TensorFlow class for performing high-level -model training, evaluation, and inference) for our model. Add the following code -to `main()`: - -```python -# Create the Estimator -mnist_classifier = tf.estimator.Estimator( - model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model") -``` - -The `model_fn` argument specifies the model function to use for training, -evaluation, and prediction; we pass it the `cnn_model_fn` we created in -["Building the CNN MNIST Classifier."](#building-the-cnn-mnist-classifier) The -`model_dir` argument specifies the directory where model data (checkpoints) will -be saved (here, we specify the temp directory `/tmp/mnist_convnet_model`, but -feel free to change to another directory of your choice). - -> Note: For an in-depth walkthrough of the TensorFlow `Estimator` API, see the -> tutorial @{$custom_estimators$"Creating Estimators in tf.estimator."} - -### Set Up a Logging Hook {#set_up_a_logging_hook} - -Since CNNs can take a while to train, let's set up some logging so we can track -progress during training. We can use TensorFlow's @{tf.train.SessionRunHook} to create a -@{tf.train.LoggingTensorHook} -that will log the probability values from the softmax layer of our CNN. 
Add the -following to `main()`: - -```python -# Set up logging for predictions -tensors_to_log = {"probabilities": "softmax_tensor"} -logging_hook = tf.train.LoggingTensorHook( - tensors=tensors_to_log, every_n_iter=50) -``` - -We store a dict of the tensors we want to log in `tensors_to_log`. Each key is a -label of our choice that will be printed in the log output, and the -corresponding label is the name of a `Tensor` in the TensorFlow graph. Here, our -`probabilities` can be found in `softmax_tensor`, the name we gave our softmax -operation earlier when we generated the probabilities in `cnn_model_fn`. - -> Note: If you don't explicitly assign a name to an operation via the `name` -> argument, TensorFlow will assign a default name. A couple easy ways to -> discover the names applied to operations are to visualize your graph on -> @{$graph_viz$TensorBoard}) or to enable the -> @{$guide/debugger$TensorFlow Debugger (tfdbg)}. - -Next, we create the `LoggingTensorHook`, passing `tensors_to_log` to the -`tensors` argument. We set `every_n_iter=50`, which specifies that probabilities -should be logged after every 50 steps of training. - -### Train the Model - -Now we're ready to train our model, which we can do by creating `train_input_fn` -and calling `train()` on `mnist_classifier`. Add the following to `main()`: - -```python -# Train the model -train_input_fn = tf.estimator.inputs.numpy_input_fn( - x={"x": train_data}, - y=train_labels, - batch_size=100, - num_epochs=None, - shuffle=True) -mnist_classifier.train( - input_fn=train_input_fn, - steps=20000, - hooks=[logging_hook]) -``` - -In the `numpy_input_fn` call, we pass the training feature data and labels to -`x` (as a dict) and `y`, respectively. We set a `batch_size` of `100` (which -means that the model will train on minibatches of 100 examples at each step). -`num_epochs=None` means that the model will train until the specified number of -steps is reached. We also set `shuffle=True` to shuffle the training data. -In the `train` call, we set `steps=20000` -(which means the model will train for 20,000 steps total). We pass our -`logging_hook` to the `hooks` argument, so that it will be triggered during -training. - -### Evaluate the Model - -Once training is complete, we want to evaluate our model to determine its -accuracy on the MNIST test set. We call the `evaluate` method, which evaluates -the metrics we specified in `eval_metric_ops` argument in the `model_fn`. -Add the following to `main()`: - -```python -# Evaluate the model and print results -eval_input_fn = tf.estimator.inputs.numpy_input_fn( - x={"x": eval_data}, - y=eval_labels, - num_epochs=1, - shuffle=False) -eval_results = mnist_classifier.evaluate(input_fn=eval_input_fn) -print(eval_results) -``` - -To create `eval_input_fn`, we set `num_epochs=1`, so that the model evaluates -the metrics over one epoch of data and returns the result. We also set -`shuffle=False` to iterate through the data sequentially. - -### Run the Model - -We've coded the CNN model function, `Estimator`, and the training/evaluation -logic; now let's see the results. Run `cnn_mnist.py`. - -> Note: Training CNNs is quite computationally intensive. Estimated completion -> time of `cnn_mnist.py` will vary depending on your processor, but will likely -> be upwards of 1 hour on CPU. To train more quickly, you can decrease the -> number of `steps` passed to `train()`, but note that this will affect accuracy. 
-
-As the model trains, you'll see log output like the following:
-
-```python
-INFO:tensorflow:loss = 2.36026, step = 1
-INFO:tensorflow:probabilities = [[ 0.07722801 0.08618255 0.09256398, ...]]
-...
-INFO:tensorflow:loss = 2.13119, step = 101
-INFO:tensorflow:global_step/sec: 5.44132
-...
-INFO:tensorflow:Loss for final step: 0.553216.
-
-INFO:tensorflow:Restored model from /tmp/mnist_convnet_model
-INFO:tensorflow:Eval steps [0,inf) for training step 20000.
-INFO:tensorflow:Input iterator is exhausted.
-INFO:tensorflow:Saving evaluation summary for step 20000: accuracy = 0.9733, loss = 0.0902271
-{'loss': 0.090227105, 'global_step': 20000, 'accuracy': 0.97329998}
-```
-
-Here, we've achieved an accuracy of 97.3% on our test data set.
-
-## Additional Resources
-
-To learn more about TensorFlow Estimators and CNNs in TensorFlow, see the
-following resources:
-
-* @{$custom_estimators$Creating Estimators in tf.estimator}
-  provides an introduction to the TensorFlow Estimator API. It walks through
-  configuring an Estimator, writing a model function, calculating loss, and
-  defining a training op.
-* @{$deep_cnn} walks through how to build an MNIST CNN classification model
-  *without estimators* using lower-level TensorFlow operations.
diff --git a/tensorflow/docs_src/tutorials/leftnav_files b/tensorflow/docs_src/tutorials/leftnav_files
deleted file mode 100644
index 888052428f..0000000000
--- a/tensorflow/docs_src/tutorials/leftnav_files
+++ /dev/null
@@ -1,23 +0,0 @@
-index.md
-
-### Images
-layers.md: MNIST
-image_recognition.md: Image Recognition
-image_retraining.md: Image Retraining
-deep_cnn.md
-
-### Sequences
-recurrent.md
-seq2seq.md: Neural Machine Translation
-recurrent_quickdraw.md: Drawing Classification
-audio_recognition.md
-
-### Data Representation
-wide.md: Linear Models
-wide_and_deep.md: Wide & Deep Learning
-word2vec.md
-kernel_methods.md: Kernel Methods
-
-### Non-ML
-mandelbrot.md
-pdes.md
diff --git a/tensorflow/docs_src/tutorials/linear.md b/tensorflow/docs_src/tutorials/linear.md
deleted file mode 100644
index 3f247ade26..0000000000
--- a/tensorflow/docs_src/tutorials/linear.md
+++ /dev/null
@@ -1,237 +0,0 @@
-# Large-scale Linear Models with TensorFlow
-
-@{tf.estimator$Estimators} provides (among other things) a rich set of tools for
-working with linear models in TensorFlow. This document provides an overview of
-those tools. It explains:
-
- * What a linear model is.
- * Why you might want to use a linear model.
- * How Estimators make it easy to build linear models in TensorFlow.
- * How you can use Estimators to combine linear models with
   deep learning to get the advantages of both.
-
-Read this overview to decide whether the Estimator's linear model tools might
-be useful to you. Then do the @{$wide$Linear Models tutorial} to
-give it a try. This overview uses code samples from the tutorial, but the
-tutorial walks through the code in greater detail.
-
-To understand this overview it will help to have some familiarity
-with basic machine learning concepts, and also with
-@{$premade_estimators$Estimators}.
-
-[TOC]
-
-## What is a linear model?
-
-A **linear model** uses a single weighted sum of features to make a prediction.
-For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
-on age, years of education, and weekly hours of
-work for a population, a model can learn weights for each of those numbers so that
-their weighted sum estimates a person's salary.
You can also use linear models -for classification. - -Some linear models transform the weighted sum into a more convenient form. For -example, [**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression) plugs the weighted sum into the logistic -function to turn the output into a value between 0 and 1. But you still just -have one weight for each input feature. - -## Why would you want to use a linear model? - -Why would you want to use so simple a model when recent research has -demonstrated the power of more complex neural networks with many layers? - -Linear models: - - * train quickly, compared to deep neural nets. - * can work well on very large feature sets. - * can be trained with algorithms that don't require a lot of fiddling - with learning rates, etc. - * can be interpreted and debugged more easily than neural nets. - You can examine the weights assigned to each feature to figure out what's - having the biggest impact on a prediction. - * provide an excellent starting point for learning about machine learning. - * are widely used in industry. - -## How do Estimators help you build linear models? - -You can build a linear model from scratch in TensorFlow without the help of a -special API. But Estimators provides some tools that make it easier to build -effective large-scale linear models. - -### Feature columns and transformations - -Much of the work of designing a linear model consists of transforming raw data -into suitable input features. Tensorflow uses the `FeatureColumn` abstraction to -enable these transformations. - -A `FeatureColumn` represents a single feature in your data. A `FeatureColumn` -may represent a quantity like 'height', or it may represent a category like -'eye_color' where the value is drawn from a set of discrete possibilities like -{'blue', 'brown', 'green'}. - -In the case of both *continuous features* like 'height' and *categorical -features* like 'eye_color', a single value in the data might get transformed -into a sequence of numbers before it is input into the model. The -`FeatureColumn` abstraction lets you manipulate the feature as a single -semantic unit in spite of this fact. You can specify transformations and -select features to include without dealing with specific indices in the -tensors you feed into the model. - -#### Sparse columns - -Categorical features in linear models are typically translated into a sparse -vector in which each possible value has a corresponding index or id. For -example, if there are only three possible eye colors you can represent -'eye_color' as a length 3 vector: 'brown' would become [1, 0, 0], 'blue' would -become [0, 1, 0] and 'green' would become [0, 0, 1]. These vectors are called -"sparse" because they may be very long, with many zeros, when the set of -possible values is very large (such as all English words). - -While you don't need to use categorical columns to use the linear model tools -provided by Estimators, one of the strengths of linear models is their ability -to deal with large sparse vectors. Sparse features are a primary use case for -the linear model tools provided by Estimators. - -##### Encoding sparse columns - -`FeatureColumn` handles the conversion of categorical values into vectors -automatically, with code like this: - -```python -eye_color = tf.feature_column.categorical_column_with_vocabulary_list( - "eye_color", vocabulary_list=["blue", "brown", "green"]) -``` - -where `eye_color` is the name of a column in your source data. 
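-
-If you want to see that one-hot encoding materialized, for example to inspect it
-or to feed it to a deep model, the categorical column can be wrapped in an
-indicator column. A small sketch (`tf.feature_column.indicator_column` is the
-standard wrapper; note that the encoding order follows `vocabulary_list`, so
-'blue' maps to [1, 0, 0]):
-
-```python
-eye_color_one_hot = tf.feature_column.indicator_column(eye_color)
-```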
- -You can also generate `FeatureColumn`s for categorical features for which you -don't know all possible values. For this case you would use -`categorical_column_with_hash_bucket()`, which uses a hash function to assign -indices to feature values. - -```python -education = tf.feature_column.categorical_column_with_hash_bucket( - "education", hash_bucket_size=1000) -``` - -##### Feature Crosses - -Because linear models assign independent weights to separate features, they -can't learn the relative importance of specific combinations of feature -values. If you have a feature 'favorite_sport' and a feature 'home_city' and -you're trying to predict whether a person likes to wear red, your linear model -won't be able to learn that baseball fans from St. Louis especially like to -wear red. - -You can get around this limitation by creating a new feature -'favorite_sport_x_home_city'. The value of this feature for a given person is -just the concatenation of the values of the two source features: -'baseball_x_stlouis', for example. This sort of combination feature is called -a *feature cross*. - -The `crossed_column()` method makes it easy to set up feature crosses: - -```python -sport_x_city = tf.feature_column.crossed_column( - ["sport", "city"], hash_bucket_size=int(1e4)) -``` - -#### Continuous columns - -You can specify a continuous feature like so: - -```python -age = tf.feature_column.numeric_column("age") -``` - -Although, as a single real number, a continuous feature can often be input -directly into the model, Tensorflow offers useful transformations for this sort -of column as well. - -##### Bucketization - -*Bucketization* turns a continuous column into a categorical column. This -transformation lets you use continuous features in feature crosses, or learn -cases where specific value ranges have particular importance. - -Bucketization divides the range of possible values into subranges called -buckets: - -```python -age_buckets = tf.feature_column.bucketized_column( - age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) -``` - -The bucket into which a value falls becomes the categorical label for -that value. - -#### Input function - -`FeatureColumn`s provide a specification for the input data for your model, -indicating how to represent and transform the data. But they do not provide -the data itself. You provide the data through an input function. - -The input function must return a dictionary of tensors. Each key corresponds to -the name of a `FeatureColumn`. Each key's value is a tensor containing the -values of that feature for all data instances. See -@{$premade_estimators#input_fn} for a -more comprehensive look at input functions, and `input_fn` in the -[linear models tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py) -for an example implementation of an input function. - -The input function is passed to the `train()` and `evaluate()` calls that -initiate training and testing, as described in the next section. - -### Linear estimators - -Tensorflow estimator classes provide a unified training and evaluation harness -for regression and classification models. They take care of the details of the -training and evaluation loops and allow the user to focus on model inputs and -architecture. - -To build a linear estimator, you can use either the -`tf.estimator.LinearClassifier` estimator or the -`tf.estimator.LinearRegressor` estimator, for classification and -regression respectively. 
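-
-The `input_fn_train` and `input_fn_test` functions referenced in the example
-below are assumed to follow the input-function contract described above. As a
-minimal sketch of that contract, using the toy columns from earlier (the real
-tutorial reads the census data instead):
-
-```python
-import tensorflow as tf
-
-def input_fn_train():
-  # Each key matches the name of a FeatureColumn used by the model.
-  features = {
-      "age": tf.constant([[23.0], [31.0], [46.0]]),
-      "eye_color": tf.constant([["blue"], ["brown"], ["green"]]),
-  }
-  labels = tf.constant([[0], [1], [1]])
-  return features, labels
-```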
- -As with all tensorflow estimators, to run the estimator you just: - - 1. Instantiate the estimator class. For the two linear estimator classes, - you pass a list of `FeatureColumn`s to the constructor. - 2. Call the estimator's `train()` method to train it. - 3. Call the estimator's `evaluate()` method to see how it does. - -For example: - -```python -e = tf.estimator.LinearClassifier( - feature_columns=[ - native_country, education, occupation, workclass, marital_status, - race, age_buckets, education_x_occupation, - age_buckets_x_race_x_occupation], - model_dir=YOUR_MODEL_DIRECTORY) -e.train(input_fn=input_fn_train, steps=200) -# Evaluate for one step (one pass through the test data). -results = e.evaluate(input_fn=input_fn_test) - -# Print the stats for the evaluation. -for key in sorted(results): - print("%s: %s" % (key, results[key])) -``` - -### Wide and deep learning - -The `tf.estimator` module also provides an estimator class that lets you jointly -train a linear model and a deep neural network. This novel approach combines the -ability of linear models to "memorize" key features with the generalization -ability of neural nets. Use `tf.estimator.DNNLinearCombinedClassifier` to -create this sort of "wide and deep" model: - -```python -e = tf.estimator.DNNLinearCombinedClassifier( - model_dir=YOUR_MODEL_DIR, - linear_feature_columns=wide_columns, - dnn_feature_columns=deep_columns, - dnn_hidden_units=[100, 50]) -``` -For more information, see the @{$wide_and_deep$Wide and Deep Learning tutorial}. diff --git a/tensorflow/docs_src/tutorials/mandelbrot.md b/tensorflow/docs_src/tutorials/mandelbrot.md deleted file mode 100755 index 1c0a548129..0000000000 --- a/tensorflow/docs_src/tutorials/mandelbrot.md +++ /dev/null @@ -1,116 +0,0 @@ -# Mandelbrot Set - -Visualizing the [Mandelbrot set](https://en.wikipedia.org/wiki/Mandelbrot_set) -doesn't have anything to do with machine learning, but it makes for a fun -example of how one can use TensorFlow for general mathematics. This is -actually a pretty naive implementation of the visualization, but it makes the -point. (We may end up providing a more elaborate implementation down the line -to produce more truly beautiful images.) - - -## Basic Setup - -We'll need a few imports to get started. - -```python -# Import libraries for simulation -import tensorflow as tf -import numpy as np - -# Imports for visualization -import PIL.Image -from io import BytesIO -from IPython.display import Image, display -``` - -Now we'll define a function to actually display the image once we have -iteration counts. - -```python -def DisplayFractal(a, fmt='jpeg'): - """Display an array of iteration counts as a - colorful picture of a fractal.""" - a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1]) - img = np.concatenate([10+20*np.cos(a_cyclic), - 30+50*np.sin(a_cyclic), - 155-80*np.cos(a_cyclic)], 2) - img[a==a.max()] = 0 - a = img - a = np.uint8(np.clip(a, 0, 255)) - f = BytesIO() - PIL.Image.fromarray(a).save(f, fmt) - display(Image(data=f.getvalue())) -``` - -## Session and Variable Initialization - -For playing around like this, we often use an interactive session, but a regular -session would work as well. - -```python -sess = tf.InteractiveSession() -``` - -It's handy that we can freely mix NumPy and TensorFlow. - -```python -# Use NumPy to create a 2D array of complex numbers - -Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005] -Z = X+1j*Y -``` - -Now we define and initialize TensorFlow tensors. 
-
-```python
-xs = tf.constant(Z.astype(np.complex64))
-zs = tf.Variable(xs)
-ns = tf.Variable(tf.zeros_like(xs, tf.float32))
-```
-
-TensorFlow requires that you explicitly initialize variables before using them.
-
-```python
-tf.global_variables_initializer().run()
-```
-
-## Defining and Running the Computation
-
-Now we specify more of the computation...
-
-```python
-# Compute the new values of z: z^2 + x
-zs_ = zs*zs + xs
-
-# Have we diverged with this new value?
-not_diverged = tf.abs(zs_) < 4
-
-# Operation to update the zs and the iteration count.
-#
-# Note: We keep computing zs after they diverge! This
-#       is very wasteful! There are better, if a little
-#       less simple, ways to do this.
-#
-step = tf.group(
-  zs.assign(zs_),
-  ns.assign_add(tf.cast(not_diverged, tf.float32))
-  )
-```
-
-... and run it for a couple hundred steps
-
-```python
-for i in range(200): step.run()
-```
-
-Let's see what we've got.
-
-```python
-DisplayFractal(ns.eval())
-```
-
-![jpeg](https://www.tensorflow.org/images/mandelbrot_output.jpg)
-
-Not bad!
-
-
diff --git a/tensorflow/docs_src/tutorials/next_steps.md b/tensorflow/docs_src/tutorials/next_steps.md
new file mode 100644
index 0000000000..01c9f7204a
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/next_steps.md
@@ -0,0 +1,36 @@
+# Next steps
+
+## Learn more about TensorFlow
+
+* The [TensorFlow Guide](/guide) includes usage guides for the
+  high-level APIs, as well as advanced TensorFlow operations.
+* [Premade Estimators](/guide/premade_estimators) are designed to
+  get results out of the box. Use TensorFlow without building your own models.
+* [TensorFlow.js](https://js.tensorflow.org/) allows web developers to train and
+  deploy ML models in the browser and using Node.js.
+* [TFLite](/mobile/tflite) allows mobile developers to do inference efficiently
+  on mobile devices.
+* [TensorFlow Serving](/serving) is an open-source project that can put
+  TensorFlow models in production quickly.
+* The [ecosystem](/ecosystem) contains more projects, including
+  [Magenta](https://magenta.tensorflow.org/), [TFX](/tfx),
+  [Swift for TensorFlow](https://github.com/tensorflow/swift), and more.
+
+## Learn more about machine learning
+
+Recommended resources include:
+
+* [Machine Learning Crash Course](https://developers.google.com/machine-learning/crash-course/),
+  a course from Google that introduces machine learning concepts.
+* [CS 20: Tensorflow for Deep Learning Research](http://web.stanford.edu/class/cs20si/),
+  notes from an intro course from Stanford.
+* [CS231n: Convolutional Neural Networks for Visual Recognition](http://cs231n.stanford.edu/),
+  a course that teaches how convolutional networks work.
+* [Machine Learning Recipes](https://www.youtube.com/watch?v=cKxRvEZd3Mw&list=PLOU2XLYxmsIIuiBfYad6rFYQU_jL2ryal),
+  a video series that introduces basic machine learning concepts with few prerequisites.
+* [Deep Learning with Python](https://www.manning.com/books/deep-learning-with-python),
+  a book by Francois Chollet about the Keras API, as well as an excellent hands-on intro to deep learning.
+* [Hands-on Machine Learning with Scikit-Learn and TensorFlow](https://github.com/ageron/handson-ml),
+  a book by Aurélien Géron that is a clear getting-started guide to data science and deep learning.
+* [Deep Learning](https://www.deeplearningbook.org/), a book by Ian Goodfellow et al.
+  that provides a technical dive into machine learning.
diff --git a/tensorflow/docs_src/tutorials/non-ml/mandelbrot.md b/tensorflow/docs_src/tutorials/non-ml/mandelbrot.md new file mode 100644 index 0000000000..1c0a548129 --- /dev/null +++ b/tensorflow/docs_src/tutorials/non-ml/mandelbrot.md @@ -0,0 +1,116 @@ +# Mandelbrot Set + +Visualizing the [Mandelbrot set](https://en.wikipedia.org/wiki/Mandelbrot_set) +doesn't have anything to do with machine learning, but it makes for a fun +example of how one can use TensorFlow for general mathematics. This is +actually a pretty naive implementation of the visualization, but it makes the +point. (We may end up providing a more elaborate implementation down the line +to produce more truly beautiful images.) + + +## Basic Setup + +We'll need a few imports to get started. + +```python +# Import libraries for simulation +import tensorflow as tf +import numpy as np + +# Imports for visualization +import PIL.Image +from io import BytesIO +from IPython.display import Image, display +``` + +Now we'll define a function to actually display the image once we have +iteration counts. + +```python +def DisplayFractal(a, fmt='jpeg'): + """Display an array of iteration counts as a + colorful picture of a fractal.""" + a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1]) + img = np.concatenate([10+20*np.cos(a_cyclic), + 30+50*np.sin(a_cyclic), + 155-80*np.cos(a_cyclic)], 2) + img[a==a.max()] = 0 + a = img + a = np.uint8(np.clip(a, 0, 255)) + f = BytesIO() + PIL.Image.fromarray(a).save(f, fmt) + display(Image(data=f.getvalue())) +``` + +## Session and Variable Initialization + +For playing around like this, we often use an interactive session, but a regular +session would work as well. + +```python +sess = tf.InteractiveSession() +``` + +It's handy that we can freely mix NumPy and TensorFlow. + +```python +# Use NumPy to create a 2D array of complex numbers + +Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005] +Z = X+1j*Y +``` + +Now we define and initialize TensorFlow tensors. + +```python +xs = tf.constant(Z.astype(np.complex64)) +zs = tf.Variable(xs) +ns = tf.Variable(tf.zeros_like(xs, tf.float32)) +``` + +TensorFlow requires that you explicitly initialize variables before using them. + +```python +tf.global_variables_initializer().run() +``` + +## Defining and Running the Computation + +Now we specify more of the computation... + +```python +# Compute the new values of z: z^2 + x +zs_ = zs*zs + xs + +# Have we diverged with this new value? +not_diverged = tf.abs(zs_) < 4 + +# Operation to update the zs and the iteration count. +# +# Note: We keep computing zs after they diverge! This +# is very wasteful! There are better, if a little +# less simple, ways to do this. +# +step = tf.group( + zs.assign(zs_), + ns.assign_add(tf.cast(not_diverged, tf.float32)) + ) +``` + +... and run it for a couple hundred steps + +```python +for i in range(200): step.run() +``` + +Let's see what we've got. + +```python +DisplayFractal(ns.eval()) +``` + +![jpeg](https://www.tensorflow.org/images/mandelbrot_output.jpg) + +Not bad! + + diff --git a/tensorflow/docs_src/tutorials/non-ml/pdes.md b/tensorflow/docs_src/tutorials/non-ml/pdes.md new file mode 100644 index 0000000000..b5a0fa834a --- /dev/null +++ b/tensorflow/docs_src/tutorials/non-ml/pdes.md @@ -0,0 +1,140 @@ +# Partial Differential Equations + +TensorFlow isn't just for machine learning. Here we give a (somewhat +pedestrian) example of using TensorFlow for simulating the behavior of a +[partial differential equation]( +https://en.wikipedia.org/wiki/Partial_differential_equation). 
+We'll simulate the surface of a square pond as a few raindrops land on it.
+
+
+## Basic Setup
+
+A few imports we'll need.
+
+```python
+#Import libraries for simulation
+import tensorflow as tf
+import numpy as np
+
+#Imports for visualization
+import PIL.Image
+from io import BytesIO
+from IPython.display import clear_output, Image, display
+```
+
+A function for displaying the state of the pond's surface as an image.
+
+```python
+def DisplayArray(a, fmt='jpeg', rng=[0,1]):
+  """Display an array as a picture."""
+  a = (a - rng[0])/float(rng[1] - rng[0])*255
+  a = np.uint8(np.clip(a, 0, 255))
+  f = BytesIO()
+  PIL.Image.fromarray(a).save(f, fmt)
+  clear_output(wait=True)
+  display(Image(data=f.getvalue()))
+```
+
+Here we start an interactive TensorFlow session for convenience in playing
+around. A regular session would work as well if we were doing this in an
+executable .py file.
+
+```python
+sess = tf.InteractiveSession()
+```
+
+## Computational Convenience Functions
+
+
+```python
+def make_kernel(a):
+  """Transform a 2D array into a convolution kernel"""
+  a = np.asarray(a)
+  a = a.reshape(list(a.shape) + [1,1])
+  return tf.constant(a, dtype=tf.float32)
+
+def simple_conv(x, k):
+  """A simplified 2D convolution operation"""
+  x = tf.expand_dims(tf.expand_dims(x, 0), -1)
+  y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
+  return y[0, :, :, 0]
+
+def laplace(x):
+  """Compute the 2D laplacian of an array"""
+  laplace_k = make_kernel([[0.5, 1.0, 0.5],
+                           [1.0, -6., 1.0],
+                           [0.5, 1.0, 0.5]])
+  return simple_conv(x, laplace_k)
+```
+
+## Define the PDE
+
+Our pond is a perfect 500 x 500 square, as is the case for most ponds found in
+nature.
+
+```python
+N = 500
+```
+
+Here we create our pond and hit it with some rain drops.
+
+```python
+# Initial Conditions -- some rain drops hit a pond
+
+# Set everything to zero
+u_init = np.zeros([N, N], dtype=np.float32)
+ut_init = np.zeros([N, N], dtype=np.float32)
+
+# Some rain drops hit a pond at random points
+for n in range(40):
+  a,b = np.random.randint(0, N, 2)
+  u_init[a,b] = np.random.uniform()
+
+DisplayArray(u_init, rng=[-0.1, 0.1])
+```
+
+![jpeg](https://www.tensorflow.org/images/pde_output_1.jpg)
+
+
+Now let's specify the details of the differential equation.
+
+
+```python
+# Parameters:
+# eps -- time resolution
+# damping -- wave damping
+eps = tf.placeholder(tf.float32, shape=())
+damping = tf.placeholder(tf.float32, shape=())
+
+# Create variables for simulation state
+U = tf.Variable(u_init)
+Ut = tf.Variable(ut_init)
+
+# Discretized PDE update rules
+U_ = U + eps * Ut
+Ut_ = Ut + eps * (laplace(U) - damping * Ut)
+
+# Operation to update the state
+step = tf.group(
+  U.assign(U_),
+  Ut.assign(Ut_))
+```
+
+## Run The Simulation
+
+This is where it gets fun -- running time forward with a simple for loop.
+
+```python
+# Initialize state to initial conditions
+tf.global_variables_initializer().run()
+
+# Run 1000 steps of PDE
+for i in range(1000):
+  # Step simulation
+  step.run({eps: 0.03, damping: 0.04})
+  DisplayArray(U.eval(), rng=[-0.1, 0.1])
+```
+
+![jpeg](../../images/pde_output_2.jpg)
+
+Look! Ripples!
diff --git a/tensorflow/docs_src/tutorials/pdes.md b/tensorflow/docs_src/tutorials/pdes.md
deleted file mode 100755
index 425e8d7084..0000000000
--- a/tensorflow/docs_src/tutorials/pdes.md
+++ /dev/null
@@ -1,141 +0,0 @@
-# Partial Differential Equations
-
-TensorFlow isn't just for machine learning.
Here we give a (somewhat -pedestrian) example of using TensorFlow for simulating the behavior of a -[partial differential equation]( -https://en.wikipedia.org/wiki/Partial_differential_equation). -We'll simulate the surface of square pond as a few raindrops land on it. - - -## Basic Setup - -A few imports we'll need. - -```python -#Import libraries for simulation -import tensorflow as tf -import numpy as np - -#Imports for visualization -import PIL.Image -from io import BytesIO -from IPython.display import clear_output, Image, display -``` - -A function for displaying the state of the pond's surface as an image. - -```python -def DisplayArray(a, fmt='jpeg', rng=[0,1]): - """Display an array as a picture.""" - a = (a - rng[0])/float(rng[1] - rng[0])*255 - a = np.uint8(np.clip(a, 0, 255)) - f = BytesIO() - PIL.Image.fromarray(a).save(f, fmt) - clear_output(wait = True) - display(Image(data=f.getvalue())) -``` - -Here we start an interactive TensorFlow session for convenience in playing -around. A regular session would work as well if we were doing this in an -executable .py file. - -```python -sess = tf.InteractiveSession() -``` - -## Computational Convenience Functions - - -```python -def make_kernel(a): - """Transform a 2D array into a convolution kernel""" - a = np.asarray(a) - a = a.reshape(list(a.shape) + [1,1]) - return tf.constant(a, dtype=1) - -def simple_conv(x, k): - """A simplified 2D convolution operation""" - x = tf.expand_dims(tf.expand_dims(x, 0), -1) - y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME') - return y[0, :, :, 0] - -def laplace(x): - """Compute the 2D laplacian of an array""" - laplace_k = make_kernel([[0.5, 1.0, 0.5], - [1.0, -6., 1.0], - [0.5, 1.0, 0.5]]) - return simple_conv(x, laplace_k) -``` - -## Define the PDE - -Our pond is a perfect 500 x 500 square, as is the case for most ponds found in -nature. - -```python -N = 500 -``` - -Here we create our pond and hit it with some rain drops. - -```python -# Initial Conditions -- some rain drops hit a pond - -# Set everything to zero -u_init = np.zeros([N, N], dtype=np.float32) -ut_init = np.zeros([N, N], dtype=np.float32) - -# Some rain drops hit a pond at random points -for n in range(40): - a,b = np.random.randint(0, N, 2) - u_init[a,b] = np.random.uniform() - -DisplayArray(u_init, rng=[-0.1, 0.1]) -``` - -![jpeg](https://www.tensorflow.org/images/pde_output_1.jpg) - - -Now let's specify the details of the differential equation. - - -```python -# Parameters: -# eps -- time resolution -# damping -- wave damping -eps = tf.placeholder(tf.float32, shape=()) -damping = tf.placeholder(tf.float32, shape=()) - -# Create variables for simulation state -U = tf.Variable(u_init) -Ut = tf.Variable(ut_init) - -# Discretized PDE update rules -U_ = U + eps * Ut -Ut_ = Ut + eps * (laplace(U) - damping * Ut) - -# Operation to update the state -step = tf.group( - U.assign(U_), - Ut.assign(Ut_)) -``` - -## Run The Simulation - -This is where it gets fun -- running time forward with a simple for loop. - -```python -# Initialize state to initial conditions -tf.global_variables_initializer().run() - -# Run 1000 steps of PDE -for i in range(1000): - # Step simulation - step.run({eps: 0.03, damping: 0.04}) - DisplayArray(U.eval(), rng=[-0.1, 0.1]) -``` - -![jpeg](../images/pde_output_2.jpg) - -Look! Ripples! 
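-
-As an aside, the update rules used in this tutorial are one explicit Euler step
-of a damped wave equation. Writing \\(u\\) for the surface displacement (`U`)
-and \\(u_t\\) for its velocity (`Ut`), the PDE being simulated is
-
-$$\frac{\partial^2 u}{\partial t^2} = \nabla^2 u - d\,\frac{\partial u}{\partial t}$$
-
-and each step computes
-
-$$U \leftarrow U + \epsilon\,U_t, \qquad U_t \leftarrow U_t + \epsilon\,(\nabla^2 U - d\,U_t)$$
-
-with \\(\epsilon\\) = `eps` and \\(d\\) = `damping`.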
-
diff --git a/tensorflow/docs_src/tutorials/recurrent.md b/tensorflow/docs_src/tutorials/recurrent.md
deleted file mode 100644
index 14da2c8785..0000000000
--- a/tensorflow/docs_src/tutorials/recurrent.md
+++ /dev/null
@@ -1,232 +0,0 @@
-# Recurrent Neural Networks
-
-## Introduction
-
-Take a look at [this great article](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
-for an introduction to recurrent neural networks and LSTMs in particular.
-
-## Language Modeling
-
-In this tutorial we will show how to train a recurrent neural network on
-the challenging task of language modeling. The goal of the problem is to fit a
-probabilistic model which assigns probabilities to sentences. It does so by
-predicting next words in a text given a history of previous words. For this
-purpose we will use the [Penn Tree Bank](https://catalog.ldc.upenn.edu/ldc99t42)
-(PTB) dataset, which is a popular benchmark for measuring the quality of these
-models, whilst being small and relatively fast to train.
-
-Language modeling is key to many interesting problems such as speech
-recognition, machine translation, or image captioning. It is also fun --
-take a look [here](https://karpathy.github.io/2015/05/21/rnn-effectiveness/).
-
-For the purpose of this tutorial, we will reproduce the results from
-[Zaremba et al., 2014](https://arxiv.org/abs/1409.2329)
-([pdf](https://arxiv.org/pdf/1409.2329.pdf)), which achieves very good quality
-on the PTB dataset.
-
-## Tutorial Files
-
-This tutorial references the following files from `models/tutorials/rnn/ptb` in the [TensorFlow models repo](https://github.com/tensorflow/models):
-
-File | Purpose
---- | ---
-`ptb_word_lm.py` | The code to train a language model on the PTB dataset.
-`reader.py` | The code to read the dataset.
-
-## Download and Prepare the Data
-
-The data required for this tutorial is in the `data/` directory of the
-[PTB dataset from Tomas Mikolov's webpage](http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz).
-
-The dataset is already preprocessed and contains overall 10000 different words,
-including the end-of-sentence marker and a special symbol (`<unk>`) for rare
-words. In `reader.py`, we convert each word to a unique integer identifier,
-in order to make it easy for the neural network to process the data.
-
-## The Model
-
-### LSTM
-
-The core of the model consists of an LSTM cell that processes one word at a
-time and computes probabilities of the possible values for the next word in the
-sentence. The memory state of the network is initialized with a vector of zeros
-and gets updated after reading each word. For computational reasons, we will
-process data in mini-batches of size `batch_size`. In this example, it is
-important to note that `current_batch_of_words` does not correspond to a
-"sentence" of words. Every word in a batch should correspond to a time t.
-TensorFlow will automatically sum the gradients of each batch for you.
-
-For example:
-
-```
- t=0  t=1    t=2  t=3     t=4
-[The, brown, fox, is,     quick]
-[The, red,   fox, jumped, high]
-
-words_in_dataset[0] = [The, The]
-words_in_dataset[1] = [brown, red]
-words_in_dataset[2] = [fox, fox]
-words_in_dataset[3] = [is, jumped]
-words_in_dataset[4] = [quick, high]
-batch_size = 2, time_steps = 5
-```
-
-The basic pseudocode is as follows:
-
-```python
-words_in_dataset = tf.placeholder(tf.float32, [time_steps, batch_size, num_features])
-lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
-# Initial state of the LSTM memory.
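-# (The state is a pair of tensors, the memory cell contents and the hidden
-# activations; the two zero tensors below initialize that pair.)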
-hidden_state = tf.zeros([batch_size, lstm.state_size]) -current_state = tf.zeros([batch_size, lstm.state_size]) -state = hidden_state, current_state -probabilities = [] -loss = 0.0 -for current_batch_of_words in words_in_dataset: - # The value of state is updated after processing each batch of words. - output, state = lstm(current_batch_of_words, state) - - # The LSTM output can be used to make next word predictions - logits = tf.matmul(output, softmax_w) + softmax_b - probabilities.append(tf.nn.softmax(logits)) - loss += loss_function(probabilities, target_words) -``` - -### Truncated Backpropagation - -By design, the output of a recurrent neural network (RNN) depends on arbitrarily -distant inputs. Unfortunately, this makes backpropagation computation difficult. -In order to make the learning process tractable, it is common practice to create -an "unrolled" version of the network, which contains a fixed number -(`num_steps`) of LSTM inputs and outputs. The model is then trained on this -finite approximation of the RNN. This can be implemented by feeding inputs of -length `num_steps` at a time and performing a backward pass after each -such input block. - -Here is a simplified block of code for creating a graph which performs -truncated backpropagation: - -```python -# Placeholder for the inputs in a given iteration. -words = tf.placeholder(tf.int32, [batch_size, num_steps]) - -lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size) -# Initial state of the LSTM memory. -initial_state = state = tf.zeros([batch_size, lstm.state_size]) - -for i in range(num_steps): - # The value of state is updated after processing each batch of words. - output, state = lstm(words[:, i], state) - - # The rest of the code. - # ... - -final_state = state -``` - -And this is how to implement an iteration over the whole dataset: - -```python -# A numpy array holding the state of LSTM after each batch of words. -numpy_state = initial_state.eval() -total_loss = 0.0 -for current_batch_of_words in words_in_dataset: - numpy_state, current_loss = session.run([final_state, loss], - # Initialize the LSTM state from the previous iteration. - feed_dict={initial_state: numpy_state, words: current_batch_of_words}) - total_loss += current_loss -``` - -### Inputs - -The word IDs will be embedded into a dense representation (see the -@{$word2vec$Vector Representations Tutorial}) before feeding to -the LSTM. This allows the model to efficiently represent the knowledge about -particular words. It is also easy to write: - -```python -# embedding_matrix is a tensor of shape [vocabulary_size, embedding size] -word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids) -``` - -The embedding matrix will be initialized randomly and the model will learn to -differentiate the meaning of words just by looking at the data. - -### Loss Function - -We want to minimize the average negative log probability of the target words: - -$$ \text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i} $$ - -It is not very difficult to implement but the function -`sequence_loss_by_example` is already available, so we can just use it here. - -The typical measure reported in the papers is average per-word perplexity (often -just called perplexity), which is equal to - -$$e^{-\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i}} = e^{\text{loss}} $$ - -and we will monitor its value throughout the training process. - -### Stacking multiple LSTMs - -To give the model more expressive power, we can add multiple layers of LSTMs -to process the data. 
The output of the first layer will become the input of -the second and so on. - -We have a class called `MultiRNNCell` that makes the implementation seamless: - -```python -def lstm_cell(): - return tf.contrib.rnn.BasicLSTMCell(lstm_size) -stacked_lstm = tf.contrib.rnn.MultiRNNCell( - [lstm_cell() for _ in range(number_of_layers)]) - -initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32) -for i in range(num_steps): - # The value of state is updated after processing each batch of words. - output, state = stacked_lstm(words[:, i], state) - - # The rest of the code. - # ... - -final_state = state -``` - -## Run the Code - -Before running the code, download the PTB dataset, as discussed at the beginning -of this tutorial. Then, extract the PTB dataset underneath your home directory -as follows: - -```bsh -tar xvfz simple-examples.tgz -C $HOME -``` -_(Note: On Windows, you may need to use -[other tools](https://wiki.haskell.org/How_to_unpack_a_tar_file_in_Windows).)_ - -Now, clone the [TensorFlow models repo](https://github.com/tensorflow/models) -from GitHub. Run the following commands: - -```bsh -cd models/tutorials/rnn/ptb -python ptb_word_lm.py --data_path=$HOME/simple-examples/data/ --model=small -``` - -There are 3 supported model configurations in the tutorial code: "small", -"medium" and "large". The difference between them is in size of the LSTMs and -the set of hyperparameters used for training. - -The larger the model, the better results it should get. The `small` model should -be able to reach perplexity below 120 on the test set and the `large` one below -80, though it might take several hours to train. - -## What Next? - -There are several tricks that we haven't mentioned that make the model better, -including: - -* decreasing learning rate schedule, -* dropout between the LSTM layers. - -Study the code and modify it to improve the model even further. diff --git a/tensorflow/docs_src/tutorials/recurrent_quickdraw.md b/tensorflow/docs_src/tutorials/recurrent_quickdraw.md deleted file mode 100644 index 1afd861738..0000000000 --- a/tensorflow/docs_src/tutorials/recurrent_quickdraw.md +++ /dev/null @@ -1,411 +0,0 @@ -# Recurrent Neural Networks for Drawing Classification - -[Quick, Draw!]: http://quickdraw.withgoogle.com - -[Quick, Draw!] is a game where a player is challenged to draw a number of -objects and see if a computer can recognize the drawing. - -The recognition in [Quick, Draw!] is performed by a classifier that takes the -user input, given as a sequence of strokes of points in x and y, and recognizes -the object category that the user tried to draw. - -In this tutorial we'll show how to build an RNN-based recognizer for this -problem. The model will use a combination of convolutional layers, LSTM layers, -and a softmax output layer to classify the drawings: - -
![RNN model structure](../images/quickdraw_model.png)
-
-The figure above shows the structure of the model that we will build in this
-tutorial. The input is a drawing that is encoded as a sequence of strokes of
-points in x, y, and n, where n indicates whether the point is the first point
-in a new stroke.
-
-Then, a series of 1-dimensional convolutions is applied. Then LSTM layers are
-applied and the sum of the outputs of all LSTM steps is fed into a softmax layer
-to make a classification decision among the classes of drawings that we know.
-
-This tutorial uses the data from actual [Quick, Draw!] games [that is publicly
-available](https://quickdraw.withgoogle.com/data). This dataset contains 50M
-drawings in 345 categories.
-
-## Run the tutorial code
-
-To try the code for this tutorial:
-
-1.  @{$install$Install TensorFlow} if you haven't already.
-1.  Download the [tutorial code](https://github.com/tensorflow/models/tree/master/tutorials/rnn/quickdraw/train_model.py).
-1.  [Download the data](#download-the-data) in `TFRecord` format from
-    [here](http://download.tensorflow.org/data/quickdraw_tutorial_dataset_v1.tar.gz) and unzip it. More details about [how to
-    obtain the original Quick, Draw!
-    data](#optional_download_the_full_quick_draw_data) and [how to convert that
-    to `TFRecord` files](#optional_converting_the_data) are available below.
-
-1.  Execute the tutorial code with the following command to train the RNN-based
-    model described in this tutorial. Make sure to adjust the paths to point to
-    the unzipped data from the download in step 3.
-
-```shell
-  python train_model.py \
-    --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \
-    --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \
-    --classes_file=rnn_tutorial_data/training.tfrecord.classes
-```
-
-## Tutorial details
-
-### Download the data
-
-We make the data that we use in this tutorial available as `TFRecord` files
-containing `TFExamples`. You can download the data from here:
-
-http://download.tensorflow.org/data/quickdraw_tutorial_dataset_v1.tar.gz
-
-Alternatively you can download the original data in `ndjson` format from the
-Google cloud and convert it to the `TFRecord` files containing `TFExamples`
-yourself as described in the next section.
-
-### Optional: Download the full Quick Draw Data
-
-The full [Quick, Draw!](https://quickdraw.withgoogle.com)
-[dataset](https://quickdraw.withgoogle.com/data) is available on Google Cloud
-Storage as [ndjson](http://ndjson.org/) files separated by category. You can
-[browse the list of files in Cloud
-Console](https://console.cloud.google.com/storage/quickdraw_dataset).
-
-To download the data we recommend using
-[gsutil](https://cloud.google.com/storage/docs/gsutil_install#install) to
-download the entire dataset. Note that the original .ndjson files require
-downloading ~22GB.
-
-Then use the following command to check that your gsutil installation works and
-that you can access the data bucket:
-
-```shell
-gsutil ls -r "gs://quickdraw_dataset/full/simplified/*"
-```
-
-which will output a long list of files like the following:
-
-```shell
-gs://quickdraw_dataset/full/simplified/The Eiffel Tower.ndjson
-gs://quickdraw_dataset/full/simplified/The Great Wall of China.ndjson
-gs://quickdraw_dataset/full/simplified/The Mona Lisa.ndjson
-gs://quickdraw_dataset/full/simplified/aircraft carrier.ndjson
-...
-```
-
-Then create a folder and download the dataset there.
-
-```shell
-mkdir rnn_tutorial_data
-cd rnn_tutorial_data
-gsutil -m cp "gs://quickdraw_dataset/full/simplified/*" .
-``` - -This download will take a while and download a bit more than 23GB of data. - -### Optional: Converting the data - -To convert the `ndjson` files to -@{$python/python_io#TFRecords_Format_Details$TFRecord} files containing -[`tf.train.Example`](https://www.tensorflow.org/code/tensorflow/core/example/example.proto) -protos run the following command. - -```shell - python create_dataset.py --ndjson_path rnn_tutorial_data \ - --output_path rnn_tutorial_data -``` - -This will store the data in 10 shards of -@{$python/python_io#TFRecords_Format_Details$TFRecord} files with 10000 items -per class for the training data and 1000 items per class as eval data. - -This conversion process is described in more detail in the following. - -The original QuickDraw data is formatted as `ndjson` files where each line -contains a JSON object like the following: - -```json -{"word":"cat", - "countrycode":"VE", - "timestamp":"2017-03-02 23:25:10.07453 UTC", - "recognized":true, - "key_id":"5201136883597312", - "drawing":[ - [ - [130,113,99,109,76,64,55,48,48,51,59,86,133,154,170,203,214,217,215,208,186,176,162,157,132], - [72,40,27,79,82,88,100,120,134,152,165,184,189,186,179,152,131,114,100,89,76,0,31,65,70] - ],[ - [76,28,7], - [136,128,128] - ],[ - [76,23,0], - [160,164,175] - ],[ - [87,52,37], - [175,191,204] - ],[ - [174,220,246,251], - [134,132,136,139] - ],[ - [175,255], - [147,168] - ],[ - [171,208,215], - [164,198,210] - ],[ - [130,110,108,111,130,139,139,119], - [129,134,137,144,148,144,136,130] - ],[ - [107,106], - [96,113] - ] - ] -} -``` - -For our purpose of building a classifier we only care about the fields "`word`" -and "`drawing`". While parsing the ndjson files, we process them line by line -using a function that converts the strokes from the `drawing` field into a -tensor of size `[number of points, 3]` containing the differences of consecutive -points. This function also returns the class name as a string. - -```python -def parse_line(ndjson_line): - """Parse an ndjson line and return ink (as np array) and classname.""" - sample = json.loads(ndjson_line) - class_name = sample["word"] - inkarray = sample["drawing"] - stroke_lengths = [len(stroke[0]) for stroke in inkarray] - total_points = sum(stroke_lengths) - np_ink = np.zeros((total_points, 3), dtype=np.float32) - current_t = 0 - for stroke in inkarray: - for i in [0, 1]: - np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i] - current_t += len(stroke[0]) - np_ink[current_t - 1, 2] = 1 # stroke_end - # Preprocessing. - # 1. Size normalization. - lower = np.min(np_ink[:, 0:2], axis=0) - upper = np.max(np_ink[:, 0:2], axis=0) - scale = upper - lower - scale[scale == 0] = 1 - np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale - # 2. Compute deltas. - np_ink = np_ink[1:, 0:2] - np_ink[0:-1, 0:2] - return np_ink, class_name -``` - -Since we want the data to be shuffled for writing we read from each of the -category files in random order and write to a random shard. - -For the training data we read the first 10000 items for each class and for the -eval data we read the next 1000 items for each class. - -This data is then reformatted into a tensor of shape `[num_training_samples, -max_length, 3]`. Then we determine the bounding box of the original drawing in -screen coordinates and normalize the size such that the drawing has unit height. - -
![Size normalization](../images/quickdraw_sizenormalization.png)
- -Finally, we compute the differences between consecutive points and store these -as a `VarLenFeature` in a -[tensorflow.Example](https://www.tensorflow.org/code/tensorflow/core/example/example.proto) -under the key `ink`. In addition we store the `class_index` as a single entry -`FixedLengthFeature` and the `shape` of the `ink` as a `FixedLengthFeature` of -length 2. - -### Defining the model - -To define the model we create a new `Estimator`. If you want to read more about -estimators, we recommend @{$custom_estimators$this tutorial}. - -To build the model, we: - -1. reshape the input back into the original shape - where the mini batch is - padded to the maximal length of its contents. In addition to the ink data we - also have the lengths for each example and the target class. This happens in - the function [`_get_input_tensors`](#-get-input-tensors). - -1. pass the input through to a series of convolution layers in - [`_add_conv_layers`](#-add-conv-layers). - -1. pass the output of the convolutions into a series of bidirectional LSTM - layers in [`_add_rnn_layers`](#-add-rnn-layers). At the end of that, the - outputs for each time step are summed up to have a compact, fixed length - embedding of the input. - -1. classify this embedding using a softmax layer in - [`_add_fc_layers`](#-add-fc-layers). - -In code this looks like: - -```python -inks, lengths, targets = _get_input_tensors(features, targets) -convolved = _add_conv_layers(inks) -final_state = _add_rnn_layers(convolved, lengths) -logits =_add_fc_layers(final_state) -``` - -### _get_input_tensors - -To obtain the input features we first obtain the shape from the features dict -and then create a 1D tensor of size `[batch_size]` containing the lengths of the -input sequences. The ink is stored as a SparseTensor in the features dict which -we convert into a dense tensor and then reshape to be `[batch_size, ?, 3]`. And -finally, if targets were passed in we make sure they are stored as a 1D tensor -of size `[batch_size]` - -In code this looks like this: - -```python -shapes = features["shape"] -lengths = tf.squeeze( - tf.slice(shapes, begin=[0, 0], size=[params["batch_size"], 1])) -inks = tf.reshape( - tf.sparse_tensor_to_dense(features["ink"]), - [params["batch_size"], -1, 3]) -if targets is not None: - targets = tf.squeeze(targets) -``` - -### _add_conv_layers - -The desired number of convolution layers and the lengths of the filters is -configured through the parameters `num_conv` and `conv_len` in the `params` -dict. - -The input is a sequence where each point has dimensionality 3. We are going to -use 1D convolutions where we treat the 3 input features as channels. That means -that the input is a `[batch_size, length, 3]` tensor and the output will be a -`[batch_size, length, number_of_filters]` tensor. - -```python -convolved = inks -for i in range(len(params.num_conv)): - convolved_input = convolved - if params.batch_norm: - convolved_input = tf.layers.batch_normalization( - convolved_input, - training=(mode == tf.estimator.ModeKeys.TRAIN)) - # Add dropout layer if enabled and not first convolution layer. 
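-  # Dropout below is applied to the layer's input, so the raw ink features
-  # entering the first convolution layer are never dropped.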
- if i > 0 and params.dropout: - convolved_input = tf.layers.dropout( - convolved_input, - rate=params.dropout, - training=(mode == tf.estimator.ModeKeys.TRAIN)) - convolved = tf.layers.conv1d( - convolved_input, - filters=params.num_conv[i], - kernel_size=params.conv_len[i], - activation=None, - strides=1, - padding="same", - name="conv1d_%d" % i) -return convolved, lengths -``` - -### _add_rnn_layers - -We pass the output from the convolutions into bidirectional LSTM layers for -which we use a helper function from contrib. - -```python -outputs, _, _ = contrib_rnn.stack_bidirectional_dynamic_rnn( - cells_fw=[cell(params.num_nodes) for _ in range(params.num_layers)], - cells_bw=[cell(params.num_nodes) for _ in range(params.num_layers)], - inputs=convolved, - sequence_length=lengths, - dtype=tf.float32, - scope="rnn_classification") -``` - -see the code for more details and how to use `CUDA` accelerated implementations. - -To create a compact, fixed-length embedding, we sum up the output of the LSTMs. -We first zero out the regions of the batch where the sequences have no data. - -```python -mask = tf.tile( - tf.expand_dims(tf.sequence_mask(lengths, tf.shape(outputs)[1]), 2), - [1, 1, tf.shape(outputs)[2]]) -zero_outside = tf.where(mask, outputs, tf.zeros_like(outputs)) -outputs = tf.reduce_sum(zero_outside, axis=1) -``` - -### _add_fc_layers - -The embedding of the input is passed into a fully connected layer which we then -use as a softmax layer. - -```python -tf.layers.dense(final_state, params.num_classes) -``` - -### Loss, predictions, and optimizer - -Finally, we need to add a loss, a training op, and predictions to create the -`ModelFn`: - -```python -cross_entropy = tf.reduce_mean( - tf.nn.sparse_softmax_cross_entropy_with_logits( - labels=targets, logits=logits)) -# Add the optimizer. -train_op = tf.contrib.layers.optimize_loss( - loss=cross_entropy, - global_step=tf.train.get_global_step(), - learning_rate=params.learning_rate, - optimizer="Adam", - # some gradient clipping stabilizes training in the beginning. - clip_gradients=params.gradient_clipping_norm, - summaries=["learning_rate", "loss", "gradients", "gradient_norm"]) -predictions = tf.argmax(logits, axis=1) -return model_fn_lib.ModelFnOps( - mode=mode, - predictions={"logits": logits, - "predictions": predictions}, - loss=cross_entropy, - train_op=train_op, - eval_metric_ops={"accuracy": tf.metrics.accuracy(targets, predictions)}) -``` - -### Training and evaluating the model - -To train and evaluate the model we can rely on the functionalities of the -`Estimator` APIs and easily run training and evaluation with the `Experiment` -APIs: - -```python - estimator = tf.estimator.Estimator( - model_fn=model_fn, - model_dir=output_dir, - config=config, - params=model_params) - # Train the model. - tf.contrib.learn.Experiment( - estimator=estimator, - train_input_fn=get_input_fn( - mode=tf.contrib.learn.ModeKeys.TRAIN, - tfrecord_pattern=FLAGS.training_data, - batch_size=FLAGS.batch_size), - train_steps=FLAGS.steps, - eval_input_fn=get_input_fn( - mode=tf.contrib.learn.ModeKeys.EVAL, - tfrecord_pattern=FLAGS.eval_data, - batch_size=FLAGS.batch_size), - min_eval_frequency=1000) -``` - -Note that this tutorial is just a quick example on a relatively small dataset to -get you familiar with the APIs of recurrent neural networks and estimators. Such -models can be even more powerful if you try them on a large dataset. 
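-
-A practical note if you adapt this example: `Experiment` comes from
-`tf.contrib.learn` and has since been deprecated. Roughly the same training
-loop can be driven by the core API instead. A sketch, under the assumption that
-`model_fn` is rewritten to return a `tf.estimator.EstimatorSpec` and that
-`get_input_fn` accepts the core mode keys:
-
-```python
-train_spec = tf.estimator.TrainSpec(
-    input_fn=get_input_fn(
-        mode=tf.estimator.ModeKeys.TRAIN,
-        tfrecord_pattern=FLAGS.training_data,
-        batch_size=FLAGS.batch_size),
-    max_steps=FLAGS.steps)
-eval_spec = tf.estimator.EvalSpec(
-    input_fn=get_input_fn(
-        mode=tf.estimator.ModeKeys.EVAL,
-        tfrecord_pattern=FLAGS.eval_data,
-        batch_size=FLAGS.batch_size))
-tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
-```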
-
-When training the model for 1M steps you can expect to get an accuracy of
-approximately 70% on the top-1 candidate. Note that this accuracy is sufficient
-to build the quickdraw game because of the game dynamics: the user will be able
-to adjust their drawing until it is ready. Also, the game does not use the
-top-1 candidate only, but accepts a drawing as correct if the target category
-shows up with a score better than a fixed threshold.
diff --git a/tensorflow/docs_src/tutorials/representation/kernel_methods.md b/tensorflow/docs_src/tutorials/representation/kernel_methods.md
new file mode 100644
index 0000000000..f3c232c511
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/representation/kernel_methods.md
@@ -0,0 +1,304 @@
+# Improving Linear Models Using Explicit Kernel Methods
+
+Note: This document uses a deprecated version of @{tf.estimator},
+which has a @{tf.contrib.learn.Estimator$different interface}.
+It also uses other `contrib` methods whose
+@{$version_compat#not_covered$API may not be stable}.
+
+In this tutorial, we demonstrate how combining (explicit) kernel methods with
+linear models can drastically increase the latter's quality of predictions
+without significantly increasing training and inference times. Unlike dual
+kernel methods, explicit (primal) kernel methods scale well with the size of the
+training dataset both in terms of training/inference times and in terms of
+memory requirements.
+
+**Intended audience:** Even though we provide a high-level overview of concepts
+related to explicit kernel methods, this tutorial primarily targets readers who
+already have at least basic knowledge of kernel methods and Support Vector
+Machines (SVMs). If you are new to kernel methods, refer to either of the
+following sources for an introduction:
+
+* If you have a strong mathematical background:
+[Kernel Methods in Machine Learning](https://arxiv.org/pdf/math/0701907.pdf)
+* [Kernel method wikipedia page](https://en.wikipedia.org/wiki/Kernel_method)
+
+Currently, TensorFlow supports explicit kernel mappings for dense features only;
+TensorFlow will provide support for sparse features at a later release.
+
+This tutorial uses [tf.contrib.learn](https://www.tensorflow.org/code/tensorflow/contrib/learn/python/learn)
+(TensorFlow's high-level Machine Learning API) Estimators for our ML models.
+If you are not familiar with this API, the [Estimator guide](../../guide/estimators.md)
+is a good place to start. We will use the MNIST dataset. The tutorial consists
+of the following steps:
+
+* Load and prepare MNIST data for classification.
+* Construct a simple linear model, train it, and evaluate it on the eval data.
+* Replace the linear model with a kernelized linear model, re-train, and
+re-evaluate.
+
+## Load and prepare MNIST data for classification
+Run the following utility command to load the MNIST dataset:
+
+```python
+data = tf.contrib.learn.datasets.mnist.load_mnist()
+```
+The preceding method loads the entire MNIST dataset (containing 70K samples) and
+splits it into train, validation, and test data with 55K, 5K, and 10K samples
+respectively. Each split contains one numpy array for images (with shape
+[sample_size, 784]) and one for labels (with shape [sample_size, 1]). In this
+tutorial, we only use the train and validation splits to train and evaluate our
+models respectively.
+
+In order to feed data to a `tf.contrib.learn Estimator`, it is helpful to convert
+it to Tensors.
For this, we will use an `input function` which adds Ops to the +TensorFlow graph that, when executed, create mini-batches of Tensors to be used +downstream. For more background on input functions, check +@{$premade_estimators#create_input_functions$this section on input functions}. +In this example, we will use the `tf.train.shuffle_batch` Op which, besides +converting numpy arrays to Tensors, allows us to specify the batch_size and +whether to randomize the input every time the input_fn Ops are executed +(randomization typically expedites convergence during training). The full code +for loading and preparing the data is shown in the snippet below. In this +example, we use mini-batches of size 256 for training and the entire sample +(5K entries) for evaluation. Feel free to experiment with different batch sizes. + +```python +import numpy as np +import tensorflow as tf + +def get_input_fn(dataset_split, batch_size, capacity=10000, min_after_dequeue=3000): + + def _input_fn(): + images_batch, labels_batch = tf.train.shuffle_batch( + tensors=[dataset_split.images, dataset_split.labels.astype(np.int32)], + batch_size=batch_size, + capacity=capacity, + min_after_dequeue=min_after_dequeue, + enqueue_many=True, + num_threads=4) + features_map = {'images': images_batch} + return features_map, labels_batch + + return _input_fn + +data = tf.contrib.learn.datasets.mnist.load_mnist() + +train_input_fn = get_input_fn(data.train, batch_size=256) +eval_input_fn = get_input_fn(data.validation, batch_size=5000) + +``` + +## Training a simple linear model +We can now train a linear model over the MNIST dataset. We will use the +@{tf.contrib.learn.LinearClassifier} estimator with 10 classes representing the +10 digits. The input features form a 784-dimensional dense vector which can +be specified as follows: + +```python +image_column = tf.contrib.layers.real_valued_column('images', dimension=784) +``` + +The full code for constructing, training and evaluating a LinearClassifier +estimator is as follows: + +```python +import time + +# Specify the feature(s) to be used by the estimator. +image_column = tf.contrib.layers.real_valued_column('images', dimension=784) +estimator = tf.contrib.learn.LinearClassifier(feature_columns=[image_column], n_classes=10) + +# Train. +start = time.time() +estimator.fit(input_fn=train_input_fn, steps=2000) +end = time.time() +print('Elapsed time: {} seconds'.format(end - start)) + +# Evaluate and report metrics. +eval_metrics = estimator.evaluate(input_fn=eval_input_fn, steps=1) +print(eval_metrics) +``` +The following table summarizes the results on the eval data. + +metric | value +:------------ | :------------ +loss | 0.25 to 0.30 +accuracy | 92.5% +training time | ~25 seconds on my machine + +Note: Metrics will vary depending on various factors. + +In addition to experimenting with the (training) batch size and the number of +training steps, there are a couple other parameters that can be tuned as well. +For instance, you can change the optimization method used to minimize the loss +by explicitly selecting another optimizer from the collection of +[available optimizers](https://www.tensorflow.org/code/tensorflow/python/training). +As an example, the following code constructs a LinearClassifier estimator that +uses the Follow-The-Regularized-Leader (FTRL) optimization strategy with a +specific learning rate and L2-regularization. 
+
+
+```python
+optimizer = tf.train.FtrlOptimizer(learning_rate=5.0, l2_regularization_strength=1.0)
+estimator = tf.contrib.learn.LinearClassifier(
+    feature_columns=[image_column], n_classes=10, optimizer=optimizer)
+```
+
+Regardless of the values of the parameters, the maximum accuracy a linear model
+can achieve on this dataset caps at around **93%**.
+
+## Using explicit kernel mappings with the linear model
+The relatively high error (~7%) of the linear model over MNIST indicates that
+the input data is not linearly separable. We will use explicit kernel mappings
+to reduce the classification error.
+
+**Intuition:** The high-level idea is to use a non-linear map to transform the
+input space to another feature space (of possibly higher dimension) where the
+(transformed) features are (almost) linearly separable and then apply a linear
+model on the mapped features. This is shown in the following figure:
+
+*(Figure: a non-linear mapping transforms the input space into a feature space where the classes become linearly separable.)*
+ + +### Technical details +In this example we will use **Random Fourier Features**, introduced in the +["Random Features for Large-Scale Kernel Machines"](https://people.eecs.berkeley.edu/~brecht/papers/07.rah.rec.nips.pdf) +paper by Rahimi and Recht, to map the input data. Random Fourier Features map a +vector \\(\mathbf{x} \in \mathbb{R}^d\\) to \\(\mathbf{x'} \in \mathbb{R}^D\\) +via the following mapping: + +$$ +RFFM(\cdot): \mathbb{R}^d \to \mathbb{R}^D, \quad +RFFM(\mathbf{x}) = \cos(\mathbf{\Omega} \cdot \mathbf{x}+ \mathbf{b}) +$$ + +where \\(\mathbf{\Omega} \in \mathbb{R}^{D \times d}\\), +\\(\mathbf{x} \in \mathbb{R}^d,\\) \\(\mathbf{b} \in \mathbb{R}^D\\) and the +cosine is applied element-wise. + +In this example, the entries of \\(\mathbf{\Omega}\\) and \\(\mathbf{b}\\) are +sampled from distributions such that the mapping satisfies the following +property: + +$$ +RFFM(\mathbf{x})^T \cdot RFFM(\mathbf{y}) \approx +e^{-\frac{\|\mathbf{x} - \mathbf{y}\|^2}{2 \sigma^2}} +$$ + +The right-hand-side quantity of the expression above is known as the RBF (or +Gaussian) kernel function. This function is one of the most-widely used kernel +functions in Machine Learning and implicitly measures similarity in a different, +much higher dimensional space than the original one. See +[Radial basis function kernel](https://en.wikipedia.org/wiki/Radial_basis_function_kernel) +for more details. + +### Kernel classifier +@{tf.contrib.kernel_methods.KernelLinearClassifier} is a pre-packaged +`tf.contrib.learn` estimator that combines the power of explicit kernel mappings +with linear models. Its constructor is almost identical to that of the +LinearClassifier estimator with the additional option to specify a list of +explicit kernel mappings to be applied to each feature the classifier uses. The +following code snippet demonstrates how to replace LinearClassifier with +KernelLinearClassifier. + + +```python +# Specify the feature(s) to be used by the estimator. This is identical to the +# code used for the LinearClassifier. +image_column = tf.contrib.layers.real_valued_column('images', dimension=784) +optimizer = tf.train.FtrlOptimizer( + learning_rate=50.0, l2_regularization_strength=0.001) + + +kernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper( + input_dim=784, output_dim=2000, stddev=5.0, name='rffm') +kernel_mappers = {image_column: [kernel_mapper]} +estimator = tf.contrib.kernel_methods.KernelLinearClassifier( + n_classes=10, optimizer=optimizer, kernel_mappers=kernel_mappers) + +# Train. +start = time.time() +estimator.fit(input_fn=train_input_fn, steps=2000) +end = time.time() +print('Elapsed time: {} seconds'.format(end - start)) + +# Evaluate and report metrics. +eval_metrics = estimator.evaluate(input_fn=eval_input_fn, steps=1) +print(eval_metrics) +``` +The only additional parameter passed to `KernelLinearClassifier` is a dictionary +from feature_columns to a list of kernel mappings to be applied to the +corresponding feature column. 
The following lines instruct the classifier to
+first map the initial 784-dimensional images to 2000-dimensional vectors using
+random Fourier features and then learn a linear model on the transformed
+vectors:
+
+```python
+kernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper(
+    input_dim=784, output_dim=2000, stddev=5.0, name='rffm')
+kernel_mappers = {image_column: [kernel_mapper]}
+estimator = tf.contrib.kernel_methods.KernelLinearClassifier(
+    n_classes=10, optimizer=optimizer, kernel_mappers=kernel_mappers)
+```
+Notice the `stddev` parameter. This is the standard deviation (\\(\sigma\\)) of
+the approximated RBF kernel and controls the similarity measure used in
+classification. `stddev` is typically determined via hyperparameter tuning.
+
+The results of running the preceding code are summarized in the following table.
+We can further increase the accuracy by increasing the output dimension of the
+mapping and tuning the standard deviation.
+
+metric        | value
+:------------ | :------------
+loss          | 0.10
+accuracy      | 97%
+training time | ~35 seconds on my machine
+
+
+### stddev
+The classification quality is very sensitive to the value of `stddev`. The
+following table shows the accuracy of the classifier on the eval data for
+different values of `stddev`. The optimal value is `stddev=5.0`. Notice how
+values of `stddev` that are too small or too large can dramatically decrease
+the accuracy of the classification.
+
+stddev | eval accuracy
+:----- | :------------
+1.0    | 0.1362
+2.0    | 0.4764
+4.0    | 0.9654
+5.0    | 0.9766
+8.0    | 0.9714
+16.0   | 0.8878
+
+### Output dimension
+Intuitively, the larger the output dimension of the mapping, the closer the
+inner product of two mapped vectors approximates the kernel, which typically
+translates to better classification accuracy. Another way to think about this is
+that the output dimension equals the number of weights of the linear model; the
+larger this dimension, the larger the "degrees of freedom" of the model.
+However, after a certain threshold, higher output dimensions increase the
+accuracy by very little, while making training take more time. This is shown in
+the following two figures, which depict the eval accuracy as a function of the
+output dimension and the training time, respectively.
+
+![image](https://www.tensorflow.org/versions/master/images/acc_vs_outdim.png)
+![image](https://www.tensorflow.org/versions/master/images/acc-vs-trn_time.png)
+
+
+## Summary
+Explicit kernel mappings combine the predictive power of nonlinear models with
+the scalability of linear models. Unlike traditional dual kernel methods,
+explicit kernel methods can scale to millions or hundreds of millions of
+samples. When using explicit kernel mappings, consider the following tips:
+
+* Random Fourier Features can be particularly effective for datasets with dense
+features.
+* The parameters of the kernel mapping are often data-dependent. Model quality
+can be very sensitive to these parameters. Use hyperparameter tuning to find the
+optimal values.
+* If you have multiple numerical features, concatenate them into a single
+multi-dimensional feature and apply the kernel mapping to the concatenated
+vector, as sketched below. 
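+
+To make the last tip concrete, here is a minimal sketch of the concatenation
+step. The shapes, names, and hyperparameter values below are illustrative
+assumptions, not settings from the experiments above:
+
+```python
+import numpy as np
+import tensorflow as tf
+
+# Two hypothetical dense feature arrays for the same 1000 examples.
+x1 = np.random.rand(1000, 10).astype(np.float32)
+x2 = np.random.rand(1000, 5).astype(np.float32)
+combined = np.concatenate([x1, x2], axis=1)  # shape [1000, 15]
+
+# A single multi-dimensional column over the concatenation, with one
+# kernel mapper applied to the whole 15-dimensional vector.
+combined_column = tf.contrib.layers.real_valued_column('combined', dimension=15)
+kernel_mapper = tf.contrib.kernel_methods.RandomFourierFeatureMapper(
+    input_dim=15, output_dim=500, stddev=1.0, name='rffm_combined')
+estimator = tf.contrib.kernel_methods.KernelLinearClassifier(
+    n_classes=2, kernel_mappers={combined_column: [kernel_mapper]})
+```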
diff --git a/tensorflow/docs_src/tutorials/representation/linear.md b/tensorflow/docs_src/tutorials/representation/linear.md
new file mode 100644
index 0000000000..3f247ade26
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/representation/linear.md
@@ -0,0 +1,237 @@
+# Large-scale Linear Models with TensorFlow
+
+@{tf.estimator$Estimators} provides (among other things) a rich set of tools for
+working with linear models in TensorFlow. This document provides an overview of
+those tools. It explains:
+
+ * What a linear model is.
+ * Why you might want to use a linear model.
+ * How Estimators make it easy to build linear models in TensorFlow.
+ * How you can use Estimators to combine linear models with
+   deep learning to get the advantages of both.
+
+Read this overview to decide whether the Estimator's linear model tools might
+be useful to you. Then do the @{$wide$Linear Models tutorial} to
+give it a try. This overview uses code samples from the tutorial, but the
+tutorial walks through the code in greater detail.
+
+To understand this overview it will help to have some familiarity
+with basic machine learning concepts, and also with
+@{$premade_estimators$Estimators}.
+
+[TOC]
+
+## What is a linear model?
+
+A **linear model** uses a single weighted sum of features to make a prediction.
+For example, if you have [data](https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names)
+on age, years of education, and weekly hours of
+work for a population, a model can learn weights for each of those numbers so that
+their weighted sum estimates a person's salary. You can also use linear models
+for classification.
+
+Some linear models transform the weighted sum into a more convenient form. For
+example, [**logistic regression**](https://developers.google.com/machine-learning/glossary/#logistic_regression) plugs the weighted sum into the logistic
+function to turn the output into a value between 0 and 1. But you still just
+have one weight for each input feature.
+
+## Why would you want to use a linear model?
+
+Why would you want to use so simple a model when recent research has
+demonstrated the power of more complex neural networks with many layers?
+
+Linear models:
+
+ * train quickly, compared to deep neural nets.
+ * can work well on very large feature sets.
+ * can be trained with algorithms that don't require a lot of fiddling
+   with learning rates, etc.
+ * can be interpreted and debugged more easily than neural nets.
+   You can examine the weights assigned to each feature to figure out what's
+   having the biggest impact on a prediction.
+ * provide an excellent starting point for learning about machine learning.
+ * are widely used in industry.
+
+## How do Estimators help you build linear models?
+
+You can build a linear model from scratch in TensorFlow without the help of a
+special API. But Estimators provides some tools that make it easier to build
+effective large-scale linear models.
+
+### Feature columns and transformations
+
+Much of the work of designing a linear model consists of transforming raw data
+into suitable input features. TensorFlow uses the `FeatureColumn` abstraction to
+enable these transformations.
+
+A `FeatureColumn` represents a single feature in your data. A `FeatureColumn`
+may represent a quantity like 'height', or it may represent a category like
+'eye_color' where the value is drawn from a set of discrete possibilities like
+{'blue', 'brown', 'green'}. 
+ +In the case of both *continuous features* like 'height' and *categorical +features* like 'eye_color', a single value in the data might get transformed +into a sequence of numbers before it is input into the model. The +`FeatureColumn` abstraction lets you manipulate the feature as a single +semantic unit in spite of this fact. You can specify transformations and +select features to include without dealing with specific indices in the +tensors you feed into the model. + +#### Sparse columns + +Categorical features in linear models are typically translated into a sparse +vector in which each possible value has a corresponding index or id. For +example, if there are only three possible eye colors you can represent +'eye_color' as a length 3 vector: 'brown' would become [1, 0, 0], 'blue' would +become [0, 1, 0] and 'green' would become [0, 0, 1]. These vectors are called +"sparse" because they may be very long, with many zeros, when the set of +possible values is very large (such as all English words). + +While you don't need to use categorical columns to use the linear model tools +provided by Estimators, one of the strengths of linear models is their ability +to deal with large sparse vectors. Sparse features are a primary use case for +the linear model tools provided by Estimators. + +##### Encoding sparse columns + +`FeatureColumn` handles the conversion of categorical values into vectors +automatically, with code like this: + +```python +eye_color = tf.feature_column.categorical_column_with_vocabulary_list( + "eye_color", vocabulary_list=["blue", "brown", "green"]) +``` + +where `eye_color` is the name of a column in your source data. + +You can also generate `FeatureColumn`s for categorical features for which you +don't know all possible values. For this case you would use +`categorical_column_with_hash_bucket()`, which uses a hash function to assign +indices to feature values. + +```python +education = tf.feature_column.categorical_column_with_hash_bucket( + "education", hash_bucket_size=1000) +``` + +##### Feature Crosses + +Because linear models assign independent weights to separate features, they +can't learn the relative importance of specific combinations of feature +values. If you have a feature 'favorite_sport' and a feature 'home_city' and +you're trying to predict whether a person likes to wear red, your linear model +won't be able to learn that baseball fans from St. Louis especially like to +wear red. + +You can get around this limitation by creating a new feature +'favorite_sport_x_home_city'. The value of this feature for a given person is +just the concatenation of the values of the two source features: +'baseball_x_stlouis', for example. This sort of combination feature is called +a *feature cross*. + +The `crossed_column()` method makes it easy to set up feature crosses: + +```python +sport_x_city = tf.feature_column.crossed_column( + ["sport", "city"], hash_bucket_size=int(1e4)) +``` + +#### Continuous columns + +You can specify a continuous feature like so: + +```python +age = tf.feature_column.numeric_column("age") +``` + +Although, as a single real number, a continuous feature can often be input +directly into the model, Tensorflow offers useful transformations for this sort +of column as well. + +##### Bucketization + +*Bucketization* turns a continuous column into a categorical column. This +transformation lets you use continuous features in feature crosses, or learn +cases where specific value ranges have particular importance. 
+ +Bucketization divides the range of possible values into subranges called +buckets: + +```python +age_buckets = tf.feature_column.bucketized_column( + age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +``` + +The bucket into which a value falls becomes the categorical label for +that value. + +#### Input function + +`FeatureColumn`s provide a specification for the input data for your model, +indicating how to represent and transform the data. But they do not provide +the data itself. You provide the data through an input function. + +The input function must return a dictionary of tensors. Each key corresponds to +the name of a `FeatureColumn`. Each key's value is a tensor containing the +values of that feature for all data instances. See +@{$premade_estimators#input_fn} for a +more comprehensive look at input functions, and `input_fn` in the +[linear models tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py) +for an example implementation of an input function. + +The input function is passed to the `train()` and `evaluate()` calls that +initiate training and testing, as described in the next section. + +### Linear estimators + +Tensorflow estimator classes provide a unified training and evaluation harness +for regression and classification models. They take care of the details of the +training and evaluation loops and allow the user to focus on model inputs and +architecture. + +To build a linear estimator, you can use either the +`tf.estimator.LinearClassifier` estimator or the +`tf.estimator.LinearRegressor` estimator, for classification and +regression respectively. + +As with all tensorflow estimators, to run the estimator you just: + + 1. Instantiate the estimator class. For the two linear estimator classes, + you pass a list of `FeatureColumn`s to the constructor. + 2. Call the estimator's `train()` method to train it. + 3. Call the estimator's `evaluate()` method to see how it does. + +For example: + +```python +e = tf.estimator.LinearClassifier( + feature_columns=[ + native_country, education, occupation, workclass, marital_status, + race, age_buckets, education_x_occupation, + age_buckets_x_race_x_occupation], + model_dir=YOUR_MODEL_DIRECTORY) +e.train(input_fn=input_fn_train, steps=200) +# Evaluate for one step (one pass through the test data). +results = e.evaluate(input_fn=input_fn_test) + +# Print the stats for the evaluation. +for key in sorted(results): + print("%s: %s" % (key, results[key])) +``` + +### Wide and deep learning + +The `tf.estimator` module also provides an estimator class that lets you jointly +train a linear model and a deep neural network. This novel approach combines the +ability of linear models to "memorize" key features with the generalization +ability of neural nets. Use `tf.estimator.DNNLinearCombinedClassifier` to +create this sort of "wide and deep" model: + +```python +e = tf.estimator.DNNLinearCombinedClassifier( + model_dir=YOUR_MODEL_DIR, + linear_feature_columns=wide_columns, + dnn_feature_columns=deep_columns, + dnn_hidden_units=[100, 50]) +``` +For more information, see the @{$wide_and_deep$Wide and Deep Learning tutorial}. 
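+
+As a closing note on the input functions described above, here is a minimal,
+hypothetical sketch (the feature name and data values are made up) of building
+one from in-memory NumPy arrays with `tf.estimator.inputs.numpy_input_fn`:
+
+```python
+import numpy as np
+import tensorflow as tf
+
+# Toy in-memory data: one numeric feature and binary labels.
+features = {'age': np.array([23., 35., 52., 46.], dtype=np.float32)}
+labels = np.array([0, 1, 1, 0], dtype=np.int32)
+
+train_input_fn = tf.estimator.inputs.numpy_input_fn(
+    x=features, y=labels, batch_size=2, num_epochs=None, shuffle=True)
+
+age = tf.feature_column.numeric_column('age')
+e = tf.estimator.LinearClassifier(feature_columns=[age])
+e.train(input_fn=train_input_fn, steps=100)
+```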
diff --git a/tensorflow/docs_src/tutorials/representation/wide.md b/tensorflow/docs_src/tutorials/representation/wide.md
new file mode 100644
index 0000000000..27ce75a30d
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/representation/wide.md
@@ -0,0 +1,461 @@
+# TensorFlow Linear Model Tutorial
+
+In this tutorial, we will use the tf.estimator API in TensorFlow to solve a
+binary classification problem: Given census data about a person such as age,
+education, marital status, and occupation (the features), we will try to predict
+whether or not the person earns more than 50,000 dollars a year (the target
+label). We will train a **logistic regression** model, and given an individual's
+information our model will output a number between 0 and 1, which can be
+interpreted as the probability that the individual has an annual income of over
+50,000 dollars.
+
+## Setup
+
+To try the code for this tutorial:
+
+1. @{$install$Install TensorFlow} if you haven't already.
+
+2. Download [the tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/).
+
+3. Execute the data download script we provide to you:
+
+        $ python data_download.py
+
+4. Execute the tutorial code with the following command to train the linear
+model described in this tutorial:
+
+        $ python wide_deep.py --model_type=wide
+
+Read on to find out how this code builds its linear model.
+
+## Reading The Census Data
+
+The dataset we'll be using is the
+[Census Income Dataset](https://archive.ics.uci.edu/ml/datasets/Census+Income).
+We have provided
+[data_download.py](https://github.com/tensorflow/models/tree/master/official/wide_deep/data_download.py)
+which downloads the data and performs some additional cleanup.
+
+Since the task is a binary classification problem, we'll construct a label
+column named "label" whose value is 1 if the income is over 50K, and 0
+otherwise. For reference, see `input_fn` in
+[wide_deep.py](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py).
+
+Next, let's take a look at the dataframe and see which columns we can use to
+predict the target label. The columns can be grouped into two types, categorical
+and continuous:
+
+* A column is called **categorical** if its value can only be one of the
+  categories in a finite set. For example, the relationship status of a person
+  (wife, husband, unmarried, etc.) or the education level (high school,
+  college, etc.) are categorical columns.
+* A column is called **continuous** if its value can be any numerical value in
+  a continuous range. For example, the capital gain of a person (e.g. $14,084)
+  is a continuous column.
+
+Here's a list of columns available in the Census Income dataset:
+
+| Column Name    | Type        | Description                       |
+| -------------- | ----------- | --------------------------------- |
+| age            | Continuous  | The age of the individual         |
+| workclass      | Categorical | The type of employer the          |
+:                :             : individual has (government,       :
+:                :             : military, private, etc.).         :
+| fnlwgt         | Continuous  | The number of people the census   |
+:                :             : takers believe that observation   :
+:                :             : represents (sample weight). Final :
+:                :             : weight will not be used.          :
+| education      | Categorical | The highest level of education    |
+:                :             : achieved for that individual.     :
+| education_num  | Continuous  | The highest level of education in |
+:                :             : numerical form.                   :
+| marital_status | Categorical | Marital status of the individual. |
+| occupation     | Categorical | The occupation of the individual. 
| +| relationship | Categorical | Wife, Own-child, Husband, | +: : : Not-in-family, Other-relative, : +: : : Unmarried. : +| race | Categorical | Amer-Indian-Eskimo, Asian-Pac- | +: : : Islander, Black, White, Other. : +| gender | Categorical | Female, Male. | +| capital_gain | Continuous | Capital gains recorded. | +| capital_loss | Continuous | Capital Losses recorded. | +| hours_per_week | Continuous | Hours worked per week. | +| native_country | Categorical | Country of origin of the | +: : : individual. : +| income_bracket | Categorical | ">50K" or "<=50K", meaning | +: : : whether the person makes more : +: : : than $50,000 annually. : + +## Converting Data into Tensors + +When building a tf.estimator model, the input data is specified by means of an +Input Builder function. This builder function will not be called until it is +later passed to tf.estimator.Estimator methods such as `train` and `evaluate`. +The purpose of this function is to construct the input data, which is +represented in the form of @{tf.Tensor}s or @{tf.SparseTensor}s. +In more detail, the input builder function returns the following as a pair: + +1. `features`: A dict from feature column names to `Tensors` or + `SparseTensors`. +2. `labels`: A `Tensor` containing the label column. + +The keys of the `features` will be used to construct columns in the next +section. Because we want to call the `train` and `evaluate` methods with +different data, we define a method that returns an input function based on the +given data. Note that the returned input function will be called while +constructing the TensorFlow graph, not while running the graph. What it is +returning is a representation of the input data as the fundamental unit of +TensorFlow computations, a `Tensor` (or `SparseTensor`). + +Each continuous column in the train or test data will be converted into a +`Tensor`, which in general is a good format to represent dense data. For +categorical data, we must represent the data as a `SparseTensor`. This data +format is good for representing sparse data. Our `input_fn` uses the `tf.data` +API, which makes it easy to apply transformations to our dataset: + +```python +def input_fn(data_file, num_epochs, shuffle, batch_size): + """Generate an input function for the Estimator.""" + assert tf.gfile.Exists(data_file), ( + '%s not found. Please make sure you have either run data_download.py or ' + 'set both arguments --train_data and --test_data.' % data_file) + + def parse_csv(value): + print('Parsing', data_file) + columns = tf.decode_csv(value, record_defaults=_CSV_COLUMN_DEFAULTS) + features = dict(zip(_CSV_COLUMNS, columns)) + labels = features.pop('income_bracket') + return features, tf.equal(labels, '>50K') + + # Extract lines from input files using the Dataset API. + dataset = tf.data.TextLineDataset(data_file) + + if shuffle: + dataset = dataset.shuffle(buffer_size=_SHUFFLE_BUFFER) + + dataset = dataset.map(parse_csv, num_parallel_calls=5) + + # We call repeat after shuffling, rather than before, to prevent separate + # epochs from blending together. + dataset = dataset.repeat(num_epochs) + dataset = dataset.batch(batch_size) + + iterator = dataset.make_one_shot_iterator() + features, labels = iterator.get_next() + return features, labels +``` + +## Selecting and Engineering Features for the Model + +Selecting and crafting the right set of feature columns is key to learning an +effective model. 
A **feature column** can be either one of the raw columns in +the original dataframe (let's call them **base feature columns**), or any new +columns created based on some transformations defined over one or multiple base +columns (let's call them **derived feature columns**). Basically, "feature +column" is an abstract concept of any raw or derived variable that can be used +to predict the target label. + +### Base Categorical Feature Columns + +To define a feature column for a categorical feature, we can create a +`CategoricalColumn` using the tf.feature_column API. If you know the set of all +possible feature values of a column and there are only a few of them, you can +use `categorical_column_with_vocabulary_list`. Each key in the list will get +assigned an auto-incremental ID starting from 0. For example, for the +`relationship` column we can assign the feature string "Husband" to an integer +ID of 0 and "Not-in-family" to 1, etc., by doing: + +```python +relationship = tf.feature_column.categorical_column_with_vocabulary_list( + 'relationship', [ + 'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried', + 'Other-relative']) +``` + +What if we don't know the set of possible values in advance? Not a problem. We +can use `categorical_column_with_hash_bucket` instead: + +```python +occupation = tf.feature_column.categorical_column_with_hash_bucket( + 'occupation', hash_bucket_size=1000) +``` + +What will happen is that each possible value in the feature column `occupation` +will be hashed to an integer ID as we encounter them in training. See an example +illustration below: + +ID | Feature +--- | ------------- +... | +9 | `"Machine-op-inspct"` +... | +103 | `"Farming-fishing"` +... | +375 | `"Protective-serv"` +... | + +No matter which way we choose to define a `SparseColumn`, each feature string +will be mapped into an integer ID by looking up a fixed mapping or by hashing. +Note that hashing collisions are possible, but may not significantly impact the +model quality. Under the hood, the `LinearModel` class is responsible for +managing the mapping and creating `tf.Variable` to store the model parameters +(also known as model weights) for each feature ID. The model parameters will be +learned through the model training process we'll go through later. 
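+
+As a minimal illustration of the hashing idea (the strings below are sample
+occupation values; the resulting IDs are whatever the hash happens to produce,
+not the IDs from the table above), `tf.string_to_hash_bucket_fast` shows how a
+feature string deterministically maps to an integer bucket:
+
+```python
+import tensorflow as tf
+
+occupations = tf.constant(['Machine-op-inspct', 'Farming-fishing',
+                           'Protective-serv'])
+ids = tf.string_to_hash_bucket_fast(occupations, num_buckets=1000)
+
+with tf.Session() as sess:
+  print(sess.run(ids))  # three integer IDs in [0, 1000)
+```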
+
+We'll do a similar trick to define the other categorical features:
+
+```python
+education = tf.feature_column.categorical_column_with_vocabulary_list(
+    'education', [
+        'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
+        'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
+        '5th-6th', '10th', '1st-4th', 'Preschool', '12th'])
+
+marital_status = tf.feature_column.categorical_column_with_vocabulary_list(
+    'marital_status', [
+        'Married-civ-spouse', 'Divorced', 'Married-spouse-absent',
+        'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed'])
+
+relationship = tf.feature_column.categorical_column_with_vocabulary_list(
+    'relationship', [
+        'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried',
+        'Other-relative'])
+
+workclass = tf.feature_column.categorical_column_with_vocabulary_list(
+    'workclass', [
+        'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov',
+        'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked'])
+
+# To show an example of hashing:
+occupation = tf.feature_column.categorical_column_with_hash_bucket(
+    'occupation', hash_bucket_size=1000)
+```
+
+### Base Continuous Feature Columns
+
+Similarly, we can define a `NumericColumn` for each continuous feature column
+that we want to use in the model:
+
+```python
+age = tf.feature_column.numeric_column('age')
+education_num = tf.feature_column.numeric_column('education_num')
+capital_gain = tf.feature_column.numeric_column('capital_gain')
+capital_loss = tf.feature_column.numeric_column('capital_loss')
+hours_per_week = tf.feature_column.numeric_column('hours_per_week')
+```
+
+### Making Continuous Features Categorical through Bucketization
+
+Sometimes the relationship between a continuous feature and the label is not
+linear. As a hypothetical example, a person's income may grow with age in the
+early stage of one's career, then the growth may slow at some point, and finally
+the income decreases after retirement. In this scenario, using the raw `age` as
+a real-valued feature column might not be a good choice because the model can
+only learn one of the three cases:
+
+1. Income always increases at some rate as age grows (positive correlation),
+1. Income always decreases at some rate as age grows (negative correlation), or
+1. Income stays the same no matter at what age (no correlation).
+
+If we want to learn the fine-grained correlation between income and each age
+group separately, we can leverage **bucketization**. Bucketization is a process
+of dividing the entire range of a continuous feature into a set of consecutive
+bins/buckets, and then converting the original numerical feature into a bucket
+ID (as a categorical feature) depending on which bucket that value falls into.
+So, we can define a `bucketized_column` over `age` as:
+
+```python
+age_buckets = tf.feature_column.bucketized_column(
+    age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
+```
+
+where the `boundaries` is a list of bucket boundaries. In this case, there are
+10 boundaries, resulting in 11 age group buckets (from age 17 and below, 18-24,
+25-29, ..., to 65 and over).
+
+### Intersecting Multiple Columns with CrossedColumn
+
+Using each base feature column separately may not be enough to explain the data.
+For example, the correlation between education and the label (earning > 50,000
+dollars) may be different for different occupations. 
Therefore, if we only learn
+a single model weight for `education="Bachelors"` and `education="Masters"`, we
+won't be able to capture every single education-occupation combination (e.g.
+distinguishing between `education="Bachelors" AND occupation="Exec-managerial"`
+and `education="Bachelors" AND occupation="Craft-repair"`). To learn the
+differences between different feature combinations, we can add **crossed feature
+columns** to the model.
+
+```python
+education_x_occupation = tf.feature_column.crossed_column(
+    ['education', 'occupation'], hash_bucket_size=1000)
+```
+
+We can also create a `CrossedColumn` over more than two columns. Each
+constituent column can be either a base feature column that is categorical
+(`SparseColumn`), a bucketized real-valued feature column (`BucketizedColumn`),
+or even another `CrossedColumn`. Here's an example:
+
+```python
+age_buckets_x_education_x_occupation = tf.feature_column.crossed_column(
+    [age_buckets, 'education', 'occupation'], hash_bucket_size=1000)
+```
+
+## Defining The Logistic Regression Model
+
+After processing the input data and defining all the feature columns, we're now
+ready to put them all together and build a Logistic Regression model. In the
+previous section we've seen several types of base and derived feature columns,
+including:
+
+*   `CategoricalColumn`
+*   `NumericColumn`
+*   `BucketizedColumn`
+*   `CrossedColumn`
+
+All of these are subclasses of the abstract `FeatureColumn` class, and can be
+added to the `feature_columns` field of a model:
+
+```python
+base_columns = [
+    education, marital_status, relationship, workclass, occupation,
+    age_buckets,
+]
+crossed_columns = [
+    tf.feature_column.crossed_column(
+        ['education', 'occupation'], hash_bucket_size=1000),
+    tf.feature_column.crossed_column(
+        [age_buckets, 'education', 'occupation'], hash_bucket_size=1000),
+]
+
+model_dir = tempfile.mkdtemp()
+model = tf.estimator.LinearClassifier(
+    model_dir=model_dir, feature_columns=base_columns + crossed_columns)
+```
+
+The model also automatically learns a bias term, which controls the prediction
+one would make without observing any features (see the section "How Logistic
+Regression Works" for more explanations). The learned model files will be stored
+in `model_dir`.
+
+## Training and Evaluating Our Model
+
+After adding all the features to the model, now let's look at how to actually
+train the model. Training a model is just a single command using the
+tf.estimator API:
+
+```python
+model.train(input_fn=lambda: input_fn(train_data, num_epochs, True, batch_size))
+```
+
+After the model is trained, we can evaluate how good our model is at predicting
+the labels of the holdout data:
+
+```python
+results = model.evaluate(input_fn=lambda: input_fn(
+    test_data, 1, False, batch_size))
+for key in sorted(results):
+  print('%s: %s' % (key, results[key]))
+```
+
+The first line of the final output should be something like
+`accuracy: 0.83557522`, which means the accuracy is 83.6%. Feel free to try more
+features and transformations and see if you can do even better!
+
+After the model is evaluated, we can use it to predict whether an individual
+has an annual income of over 50,000 dollars given that individual's
+information. 
+```python
+pred_iter = model.predict(input_fn=lambda: input_fn(FLAGS.test_data, 1, False, 1))
+for pred in pred_iter:
+  print(pred['classes'])
+```
+
+The prediction output will look like `[b'1']` or `[b'0']`, indicating whether
+or not the corresponding individual has an annual income of over 50,000 dollars.
+
+If you'd like to see a working end-to-end example, you can download our
+[example code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py)
+and set the `model_type` flag to `wide`.
+
+## Adding Regularization to Prevent Overfitting
+
+Regularization is a technique used to avoid **overfitting**. Overfitting happens
+when your model does well on the data it is trained on, but worse on test data
+that the model has not seen before, such as live traffic. Overfitting generally
+occurs when a model is excessively complex, such as having too many parameters
+relative to the amount of training data observed. Regularization allows you to
+control your model's complexity and makes the model more generalizable to
+unseen data.
+
+In the Linear Model library, you can add L1 and L2 regularizations to the model
+as:
+
+```python
+model = tf.estimator.LinearClassifier(
+    model_dir=model_dir, feature_columns=base_columns + crossed_columns,
+    optimizer=tf.train.FtrlOptimizer(
+      learning_rate=0.1,
+      l1_regularization_strength=1.0,
+      l2_regularization_strength=1.0))
+```
+
+One important difference between L1 and L2 regularization is that L1
+regularization tends to make model weights stay at zero, creating sparser
+models, whereas L2 regularization also tries to make the model weights closer to
+zero but not necessarily zero. Therefore, if you increase the strength of L1
+regularization, you will have a smaller model size because many of the model
+weights will be zero. This is often desirable when the feature space is very
+large but sparse, and when there are resource constraints that prevent you from
+serving a model that is too large.
+
+In practice, you should try various combinations of L1 and L2 regularization
+strengths and find the parameters that best control overfitting and give you a
+desirable model size.
+
+## How Logistic Regression Works
+
+Finally, let's take a minute to talk about what the Logistic Regression model
+actually looks like in case you're not already familiar with it. We'll denote
+the label as \\(Y\\), and the set of observed features as a feature vector
+\\(\mathbf{x}=[x_1, x_2, ..., x_d]\\). We define \\(Y=1\\) if an individual
+earned > 50,000 dollars and \\(Y=0\\) otherwise. In Logistic Regression, the
+probability of the label being positive (\\(Y=1\\)) given the features
+\\(\mathbf{x}\\) is given as:
+
+$$ P(Y=1|\mathbf{x}) = \frac{1}{1+\exp(-(\mathbf{w}^T\mathbf{x}+b))}$$
+
+where \\(\mathbf{w}=[w_1, w_2, ..., w_d]\\) are the model weights for the
+features \\(\mathbf{x}=[x_1, x_2, ..., x_d]\\). \\(b\\) is a constant that is
+often called the **bias** of the model. The equation consists of two parts: a
+linear model and a logistic function:
+
+*   **Linear Model**: First, we can see that \\(\mathbf{w}^T\mathbf{x}+b = b +
+    w_1x_1 + ... +w_dx_d\\) is a linear model where the output is a linear
+    function of the input features \\(\mathbf{x}\\). The bias \\(b\\) is the
+    prediction one would make without observing any features. The model weight
+    \\(w_i\\) reflects how the feature \\(x_i\\) is correlated with the positive
+    label. 
If \\(x_i\\) is positively correlated with the positive label, the + weight \\(w_i\\) increases, and the probability \\(P(Y=1|\mathbf{x})\\) will + be closer to 1. On the other hand, if \\(x_i\\) is negatively correlated + with the positive label, then the weight \\(w_i\\) decreases and the + probability \\(P(Y=1|\mathbf{x})\\) will be closer to 0. + +* **Logistic Function**: Second, we can see that there's a logistic function + (also known as the sigmoid function) \\(S(t) = 1/(1+\exp(-t))\\) being + applied to the linear model. The logistic function is used to convert the + output of the linear model \\(\mathbf{w}^T\mathbf{x}+b\\) from any real + number into the range of \\([0, 1]\\), which can be interpreted as a + probability. + +Model training is an optimization problem: The goal is to find a set of model +weights (i.e. model parameters) to minimize a **loss function** defined over the +training data, such as logistic loss for Logistic Regression models. The loss +function measures the discrepancy between the ground-truth label and the model's +prediction. If the prediction is very close to the ground-truth label, the loss +value will be low; if the prediction is very far from the label, then the loss +value would be high. + +## Learn Deeper + +If you're interested in learning more, check out our +@{$wide_and_deep$Wide & Deep Learning Tutorial} where we'll show you how to +combine the strengths of linear models and deep neural networks by jointly +training them using the tf.estimator API. diff --git a/tensorflow/docs_src/tutorials/representation/wide_and_deep.md b/tensorflow/docs_src/tutorials/representation/wide_and_deep.md new file mode 100644 index 0000000000..44677a810b --- /dev/null +++ b/tensorflow/docs_src/tutorials/representation/wide_and_deep.md @@ -0,0 +1,243 @@ +# TensorFlow Wide & Deep Learning Tutorial + +In the previous @{$wide$TensorFlow Linear Model Tutorial}, we trained a logistic +regression model to predict the probability that the individual has an annual +income of over 50,000 dollars using the +[Census Income Dataset](https://archive.ics.uci.edu/ml/datasets/Census+Income). +TensorFlow is great for training deep neural networks too, and you might be +thinking which one you should choose—well, why not both? Would it be possible to +combine the strengths of both in one model? + +In this tutorial, we'll introduce how to use the tf.estimator API to jointly +train a wide linear model and a deep feed-forward neural network. This approach +combines the strengths of memorization and generalization. It's useful for +generic large-scale regression and classification problems with sparse input +features (e.g., categorical features with a large number of possible feature +values). If you're interested in learning more about how Wide & Deep Learning +works, please check out our [research paper](https://arxiv.org/abs/1606.07792). + +![Wide & Deep Spectrum of Models](https://www.tensorflow.org/images/wide_n_deep.svg "Wide & Deep") + +The figure above shows a comparison of a wide model (logistic regression with +sparse features and transformations), a deep model (feed-forward neural network +with an embedding layer and several hidden layers), and a Wide & Deep model +(joint training of both). At a high level, there are only 3 steps to configure a +wide, deep, or Wide & Deep model using the tf.estimator API: + +1. Select features for the wide part: Choose the sparse base columns and + crossed columns you want to use. +1. 
Select features for the deep part: Choose the continuous columns, the + embedding dimension for each categorical column, and the hidden layer sizes. +1. Put them all together in a Wide & Deep model + (`DNNLinearCombinedClassifier`). + +And that's it! Let's go through a simple example. + +## Setup + +To try the code for this tutorial: + +1. @{$install$Install TensorFlow} if you haven't already. + +2. Download [the tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/). + +3. Execute the data download script we provide to you: + + $ python data_download.py + +4. Execute the tutorial code with the following command to train the wide and +deep model described in this tutorial: + + $ python wide_deep.py + +Read on to find out how this code builds its model. + + +## Define Base Feature Columns + +First, let's define the base categorical and continuous feature columns that +we'll use. These base columns will be the building blocks used by both the wide +part and the deep part of the model. + +```python +import tensorflow as tf + +# Continuous columns +age = tf.feature_column.numeric_column('age') +education_num = tf.feature_column.numeric_column('education_num') +capital_gain = tf.feature_column.numeric_column('capital_gain') +capital_loss = tf.feature_column.numeric_column('capital_loss') +hours_per_week = tf.feature_column.numeric_column('hours_per_week') + +education = tf.feature_column.categorical_column_with_vocabulary_list( + 'education', [ + 'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college', + 'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school', + '5th-6th', '10th', '1st-4th', 'Preschool', '12th']) + +marital_status = tf.feature_column.categorical_column_with_vocabulary_list( + 'marital_status', [ + 'Married-civ-spouse', 'Divorced', 'Married-spouse-absent', + 'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed']) + +relationship = tf.feature_column.categorical_column_with_vocabulary_list( + 'relationship', [ + 'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried', + 'Other-relative']) + +workclass = tf.feature_column.categorical_column_with_vocabulary_list( + 'workclass', [ + 'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov', + 'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked']) + +# To show an example of hashing: +occupation = tf.feature_column.categorical_column_with_hash_bucket( + 'occupation', hash_bucket_size=1000) + +# Transformations. +age_buckets = tf.feature_column.bucketized_column( + age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) +``` + +## The Wide Model: Linear Model with Crossed Feature Columns + +The wide model is a linear model with a wide set of sparse and crossed feature +columns: + +```python +base_columns = [ + education, marital_status, relationship, workclass, occupation, + age_buckets, +] + +crossed_columns = [ + tf.feature_column.crossed_column( + ['education', 'occupation'], hash_bucket_size=1000), + tf.feature_column.crossed_column( + [age_buckets, 'education', 'occupation'], hash_bucket_size=1000), +] +``` + +You can also see the @{$wide$TensorFlow Linear Model Tutorial} for more details. + +Wide models with crossed feature columns can memorize sparse interactions +between features effectively. That being said, one limitation of crossed feature +columns is that they do not generalize to feature combinations that have not +appeared in the training data. Let's add a deep model with embeddings to fix +that. 
+
+## The Deep Model: Neural Network with Embeddings
+
+The deep model is a feed-forward neural network, as shown in the previous
+figure. Each of the sparse, high-dimensional categorical features is first
+converted into a low-dimensional and dense real-valued vector, often referred to
+as an embedding vector. These low-dimensional dense embedding vectors are
+concatenated with the continuous features, and then fed into the hidden layers
+of a neural network in the forward pass. The embedding values are initialized
+randomly, and are trained along with all other model parameters to minimize the
+training loss. If you're interested in learning more about embeddings, check out
+the TensorFlow tutorial on @{$word2vec$Vector Representations of Words} or
+[Word embedding](https://en.wikipedia.org/wiki/Word_embedding) on Wikipedia.
+
+Another way to represent categorical columns to feed into a neural network is
+via a one-hot or multi-hot representation. This is often appropriate for
+categorical columns with only a few possible values. As an example of a one-hot
+representation, for the relationship column, `"Husband"` can be represented as
+[1, 0, 0, 0, 0, 0], and `"Not-in-family"` as [0, 1, 0, 0, 0, 0], etc. This is a
+fixed representation, whereas embeddings are more flexible and calculated at
+training time.
+
+We'll configure the embeddings for the categorical columns using
+`embedding_column`, and concatenate them with the continuous columns.
+We also use `indicator_column` to create multi-hot representations of some
+categorical columns.
+
+```python
+deep_columns = [
+    age,
+    education_num,
+    capital_gain,
+    capital_loss,
+    hours_per_week,
+    tf.feature_column.indicator_column(workclass),
+    tf.feature_column.indicator_column(education),
+    tf.feature_column.indicator_column(marital_status),
+    tf.feature_column.indicator_column(relationship),
+    # To show an example of embedding
+    tf.feature_column.embedding_column(occupation, dimension=8),
+]
+```
+
+The higher the `dimension` of the embedding is, the more degrees of freedom the
+model will have to learn the representations of the features. For simplicity, we
+set the dimension to 8 for all feature columns here. Empirically, a more
+informed decision for the number of dimensions is to start with a value on the
+order of \\(\log_2(n)\\) or \\(k\sqrt[4]{n}\\), where \\(n\\) is the number of
+unique features in a feature column and \\(k\\) is a small constant (usually
+smaller than 10).
+
+Through dense embeddings, deep models can generalize better and make predictions
+on feature pairs that were previously unseen in the training data. However, it
+is difficult to learn effective low-dimensional representations for feature
+columns when the underlying interaction matrix between two feature columns is
+sparse and high-rank. In such cases, the interaction between most feature pairs
+should be zero except for a few, but dense embeddings will lead to nonzero
+predictions for all feature pairs, and thus can over-generalize. On the other
+hand, linear models with crossed features can memorize these “exception rules”
+effectively with fewer model parameters.
+
+Now, let's see how to jointly train wide and deep models and allow them to
+complement each other’s strengths and weaknesses.
+
+## Combining Wide and Deep Models into One
+
+The wide models and deep models are combined by summing up their final output
+log odds as the prediction, then feeding the prediction to a logistic loss
+function. 
All the graph definition and variable allocations have already been +handled for you under the hood, so you simply need to create a +`DNNLinearCombinedClassifier`: + +```python +model = tf.estimator.DNNLinearCombinedClassifier( + model_dir='/tmp/census_model', + linear_feature_columns=base_columns + crossed_columns, + dnn_feature_columns=deep_columns, + dnn_hidden_units=[100, 50]) +``` + +## Training and Evaluating The Model + +Before we train the model, let's read in the Census dataset as we did in the +@{$wide$TensorFlow Linear Model tutorial}. See `data_download.py` as well as +`input_fn` within +[`wide_deep.py`](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py). + +After reading in the data, you can train and evaluate the model: + +```python +# Train and evaluate the model every `FLAGS.epochs_per_eval` epochs. +for n in range(FLAGS.train_epochs // FLAGS.epochs_per_eval): + model.train(input_fn=lambda: input_fn( + FLAGS.train_data, FLAGS.epochs_per_eval, True, FLAGS.batch_size)) + + results = model.evaluate(input_fn=lambda: input_fn( + FLAGS.test_data, 1, False, FLAGS.batch_size)) + + # Display evaluation metrics + print('Results at epoch', (n + 1) * FLAGS.epochs_per_eval) + print('-' * 30) + + for key in sorted(results): + print('%s: %s' % (key, results[key])) +``` + +The final output accuracy should be somewhere around 85.5%. If you'd like to +see a working end-to-end example, you can download our +[example code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py). + +Note that this tutorial is just a quick example on a small dataset to get you +familiar with the API. Wide & Deep Learning will be even more powerful if you +try it on a large dataset with many sparse feature columns that have a large +number of possible feature values. Again, feel free to take a look at our +[research paper](https://arxiv.org/abs/1606.07792) for more ideas about how to +apply Wide & Deep Learning in real-world large-scale machine learning problems. diff --git a/tensorflow/docs_src/tutorials/representation/word2vec.md b/tensorflow/docs_src/tutorials/representation/word2vec.md new file mode 100644 index 0000000000..3fe7352bd2 --- /dev/null +++ b/tensorflow/docs_src/tutorials/representation/word2vec.md @@ -0,0 +1,405 @@ +# Vector Representations of Words + +In this tutorial we look at the word2vec model by +[Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) +This model is used for learning vector representations of words, called "word +embeddings". + +## Highlights + +This tutorial is meant to highlight the interesting, substantive parts of +building a word2vec model in TensorFlow. + +* We start by giving the motivation for why we would want to +represent words as vectors. +* We look at the intuition behind the model and how it is trained +(with a splash of math for good measure). +* We also show a simple implementation of the model in TensorFlow. +* Finally, we look at ways to make the naive version scale better. + +We walk through the code later during the tutorial, but if you'd prefer to dive +straight in, feel free to look at the minimalistic implementation in +[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py) +This basic example contains the code needed to download some data, train on it a +bit and visualize the result. 
Once you get comfortable with reading and running
+the basic version, you can graduate to
+[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py)
+which is a more serious implementation that showcases some more advanced
+TensorFlow principles about how to efficiently use threads to move data into a
+text model, how to checkpoint during training, etc.
+
+But first, let's look at why we would want to learn word embeddings in the first
+place. Feel free to skip this section if you're an Embedding Pro and you'd just
+like to get your hands dirty with the details.
+
+## Motivation: Why Learn Word Embeddings?
+
+Image and audio processing systems work with rich, high-dimensional datasets
+encoded as vectors of the individual raw pixel-intensities for image data, or
+e.g. power spectral density coefficients for audio data. For tasks like object
+or speech recognition we know that all the information required to successfully
+perform the task is encoded in the data (because humans can perform these tasks
+from the raw data). However, natural language processing systems traditionally
+treat words as discrete atomic symbols, and therefore 'cat' may be represented
+as `Id537` and 'dog' as `Id143`. These encodings are arbitrary, and provide
+no useful information to the system regarding the relationships that may exist
+between the individual symbols. This means that the model can leverage very
+little of what it has learned about 'cats' when it is processing data about
+'dogs' (for example, that they are both animals, four-legged, pets, etc.).
+Representing words as unique, discrete ids furthermore leads to data sparsity,
+and usually means that we may need more data in order to successfully train
+statistical models. Using vector representations can overcome some of these
+obstacles.
+
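+To make the contrast concrete, here is a minimal NumPy sketch (the vectors and
+similarity values are illustrative toy numbers, not learned embeddings): one-hot
+IDs are pairwise orthogonal, so they carry no similarity structure, while dense
+vectors can encode it.
+
+```python
+import numpy as np
+
+# One-hot encodings: every pair of distinct words is orthogonal.
+cat_id, dog_id = np.eye(5)[0], np.eye(5)[1]
+print(np.dot(cat_id, dog_id))  # 0.0 -- no notion of similarity
+
+# Dense embeddings (toy values): similarity becomes measurable.
+cat_vec = np.array([0.9, 0.1, 0.4])
+dog_vec = np.array([0.8, 0.2, 0.5])
+cosine = np.dot(cat_vec, dog_vec) / (
+    np.linalg.norm(cat_vec) * np.linalg.norm(dog_vec))
+print(cosine)  # close to 1 -- 'cat' and 'dog' end up near each other
+```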
+ +[Vector space models](https://en.wikipedia.org/wiki/Vector_space_model) (VSMs) +represent (embed) words in a continuous vector space where semantically +similar words are mapped to nearby points ('are embedded nearby each other'). +VSMs have a long, rich history in NLP, but all methods depend in some way or +another on the +[Distributional Hypothesis](https://en.wikipedia.org/wiki/Distributional_semantics#Distributional_Hypothesis), +which states that words that appear in the same contexts share +semantic meaning. The different approaches that leverage this principle can be +divided into two categories: *count-based methods* (e.g. +[Latent Semantic Analysis](https://en.wikipedia.org/wiki/Latent_semantic_analysis)), +and *predictive methods* (e.g. +[neural probabilistic language models](http://www.scholarpedia.org/article/Neural_net_language_models)). + +This distinction is elaborated in much more detail by +[Baroni et al.](http://clic.cimec.unitn.it/marco/publications/acl2014/baroni-etal-countpredict-acl2014.pdf), +but in a nutshell: Count-based methods compute the statistics of +how often some word co-occurs with its neighbor words in a large text corpus, +and then map these count-statistics down to a small, dense vector for each word. +Predictive models directly try to predict a word from its neighbors in terms of +learned small, dense *embedding vectors* (considered parameters of the +model). + +Word2vec is a particularly computationally-efficient predictive model for +learning word embeddings from raw text. It comes in two flavors, the Continuous +Bag-of-Words model (CBOW) and the Skip-Gram model (Section 3.1 and 3.2 in [Mikolov et al.](https://arxiv.org/pdf/1301.3781.pdf)). Algorithmically, these +models are similar, except that CBOW predicts target words (e.g. 'mat') from +source context words ('the cat sits on the'), while the skip-gram does the +inverse and predicts source context-words from the target words. This inversion +might seem like an arbitrary choice, but statistically it has the effect that +CBOW smoothes over a lot of the distributional information (by treating an +entire context as one observation). For the most part, this turns out to be a +useful thing for smaller datasets. However, skip-gram treats each context-target +pair as a new observation, and this tends to do better when we have larger +datasets. We will focus on the skip-gram model in the rest of this tutorial. + + +## Scaling up with Noise-Contrastive Training + +Neural probabilistic language models are traditionally trained using the +[maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood) (ML) +principle to maximize the probability of the next word \\(w_t\\) (for "target") +given the previous words \\(h\\) (for "history") in terms of a +[*softmax* function](https://en.wikipedia.org/wiki/Softmax_function), + +$$ +\begin{align} +P(w_t | h) &= \text{softmax}(\text{score}(w_t, h)) \\ + &= \frac{\exp \{ \text{score}(w_t, h) \} } + {\sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} } +\end{align} +$$ + +where \\(\text{score}(w_t, h)\\) computes the compatibility of word \\(w_t\\) +with the context \\(h\\) (a dot product is commonly used). We train this model +by maximizing its [log-likelihood](https://en.wikipedia.org/wiki/Likelihood_function) +on the training set, i.e. by maximizing + +$$ +\begin{align} + J_\text{ML} &= \log P(w_t | h) \\ + &= \text{score}(w_t, h) - + \log \left( \sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} \right). 
+\end{align}
+$$
+
+This yields a properly normalized probabilistic model for language modeling.
+However, this is very expensive, because we need to compute and normalize each
+probability using the score for all other \\(V\\) words \\(w'\\) in the current
+context \\(h\\), *at every training step*.
+
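+To see where the cost comes from, here is a minimal NumPy sketch (the
+vocabulary size and dimensions are made-up toy values): scoring the normalized
+probability of a single target word requires a dot product against every row of
+the output embedding matrix, so the work grows linearly with \\(V\\).
+
+```python
+import numpy as np
+
+V, d = 50000, 128          # hypothetical vocabulary size and embedding size
+h = np.random.randn(d)     # context representation
+W = np.random.randn(V, d)  # output embeddings, one row per vocabulary word
+
+scores = W @ h             # V dot products: cost scales with the vocabulary
+log_Z = np.log(np.sum(np.exp(scores - scores.max()))) + scores.max()
+log_p_target = scores[42] - log_Z  # normalized log-probability of word 42
+```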
+
+On the other hand, for feature learning in word2vec we do not need a full
+probabilistic model. The CBOW and skip-gram models are instead trained using a
+binary classification objective ([logistic regression](https://en.wikipedia.org/wiki/Logistic_regression))
+to discriminate the real target words \\(w_t\\) from \\(k\\) imaginary (noise)
+words \\(\tilde w\\) in the same context. We illustrate this below for a CBOW
+model. For skip-gram the direction is simply inverted.
+
+ +
+ +Mathematically, the objective (for each example) is to maximize + +$$J_\text{NEG} = \log Q_\theta(D=1 |w_t, h) + + k \mathop{\mathbb{E}}_{\tilde w \sim P_\text{noise}} + \left[ \log Q_\theta(D = 0 |\tilde w, h) \right]$$ + +where \\(Q_\theta(D=1 | w, h)\\) is the binary logistic regression probability +under the model of seeing the word \\(w\\) in the context \\(h\\) in the dataset +\\(D\\), calculated in terms of the learned embedding vectors \\(\theta\\). In +practice we approximate the expectation by drawing \\(k\\) contrastive words +from the noise distribution (i.e. we compute a +[Monte Carlo average](https://en.wikipedia.org/wiki/Monte_Carlo_integration)). + +This objective is maximized when the model assigns high probabilities +to the real words, and low probabilities to noise words. Technically, this is +called +[Negative Sampling](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf), +and there is good mathematical motivation for using this loss function: +The updates it proposes approximate the updates of the softmax function in the +limit. But computationally it is especially appealing because computing the +loss function now scales only with the number of *noise words* that we +select (\\(k\\)), and not *all words* in the vocabulary (\\(V\\)). This makes it +much faster to train. We will actually make use of the very similar +[noise-contrastive estimation (NCE)](https://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf) +loss, for which TensorFlow has a handy helper function `tf.nn.nce_loss()`. + +Let's get an intuitive feel for how this would work in practice! + +## The Skip-gram Model + +As an example, let's consider the dataset + +`the quick brown fox jumped over the lazy dog` + +We first form a dataset of words and the contexts in which they appear. We +could define 'context' in any way that makes sense, and in fact people have +looked at syntactic contexts (i.e. the syntactic dependents of the current +target word, see e.g. +[Levy et al.](https://levyomer.files.wordpress.com/2014/04/dependency-based-word-embeddings-acl-2014.pdf)), +words-to-the-left of the target, words-to-the-right of the target, etc. For now, +let's stick to the vanilla definition and define 'context' as the window +of words to the left and to the right of a target word. Using a window +size of 1, we then have the dataset + +`([the, brown], quick), ([quick, fox], brown), ([brown, jumped], fox), ...` + +of `(context, target)` pairs. Recall that skip-gram inverts contexts and +targets, and tries to predict each context word from its target word, so the +task becomes to predict 'the' and 'brown' from 'quick', 'quick' and 'fox' from +'brown', etc. Therefore our dataset becomes + +`(quick, the), (quick, brown), (brown, quick), (brown, fox), ...` + +of `(input, output)` pairs. The objective function is defined over the entire +dataset, but we typically optimize this with +[stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) +(SGD) using one example at a time (or a 'minibatch' of `batch_size` examples, +where typically `16 <= batch_size <= 512`). So let's look at one step of +this process. + +Let's imagine at training step \\(t\\) we observe the first training case above, +where the goal is to predict `the` from `quick`. 
We select `num_noise` noisy
+(contrastive) examples by drawing from some noise distribution,
+typically the unigram distribution, \\(P(w)\\). For simplicity, let's say
+`num_noise=1` and we select `sheep` as a noisy example. Next we compute the
+loss for this pair of observed and noisy examples, i.e. the objective at time
+step \\(t\\) becomes
+
+$$J^{(t)}_\text{NEG} = \log Q_\theta(D=1 | \text{the, quick}) +
+  \log(Q_\theta(D=0 | \text{sheep, quick}))$$
+
+The goal is to make an update to the embedding parameters \\(\theta\\) to improve
+(in this case, maximize) this objective function. We do this by deriving the
+gradient of the loss with respect to the embedding parameters \\(\theta\\), i.e.
+\\(\frac{\partial}{\partial \theta} J_\text{NEG}\\) (luckily TensorFlow provides
+easy helper functions for doing this!). We then perform an update to the
+embeddings by taking a small step in the direction of the gradient. When this
+process is repeated over the entire training set, it has the effect of
+'moving' the embedding vectors around for each word until the model is
+successful at discriminating real words from noise words.
+
+We can visualize the learned vectors by projecting them down to 2 dimensions
+using, for instance, the
+[t-SNE dimensionality reduction technique](https://lvdmaaten.github.io/tsne/).
+When we inspect these visualizations it becomes apparent that the vectors
+capture some general, and in fact quite useful, semantic information about
+words and their relationships to one another. It was very interesting when we
+first discovered that certain directions in the induced vector space specialize
+towards certain semantic relationships, e.g. *male-female*, *verb tense* and
+even *country-capital* relationships between words, as illustrated in the figure
+below (see also for example
+[Mikolov et al., 2013](https://www.aclweb.org/anthology/N13-1090)).
+
+<div style="width:100%; margin:auto; margin-bottom:10px; margin-top:20px;">
+ +
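+
+You can probe these directions directly with vector arithmetic. Below is a
+minimal NumPy sketch of the classic analogy query; it assumes a trained
+embedding matrix plus the `dictionary`/`reverse_dictionary` word-index
+mappings built in the tutorial code:
+
+```python
+import numpy as np
+
+def analogy(a, b, c, embeddings, dictionary, reverse_dictionary):
+  """Return the word closest to vec(b) - vec(a) + vec(c)."""
+  # Normalize rows so that dot products are cosine similarities.
+  norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
+  query = norm[dictionary[b]] - norm[dictionary[a]] + norm[dictionary[c]]
+  sims = norm.dot(query)
+  for word in (a, b, c):  # exclude the query words themselves
+    sims[dictionary[word]] = -np.inf
+  return reverse_dictionary[int(np.argmax(sims))]
+
+# With well-trained embeddings, analogy('man', 'king', 'woman', ...) will
+# often (though not always!) return 'queen'.
+```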
+ +This explains why these vectors are also useful as features for many canonical +NLP prediction tasks, such as part-of-speech tagging or named entity recognition +(see for example the original work by +[Collobert et al., 2011](https://arxiv.org/abs/1103.0398) +([pdf](https://arxiv.org/pdf/1103.0398.pdf)), or follow-up work by +[Turian et al., 2010](https://www.aclweb.org/anthology/P10-1040)). + +But for now, let's just use them to draw pretty pictures! + +## Building the Graph + +This is all about embeddings, so let's define our embedding matrix. +This is just a big random matrix to start. We'll initialize the values to be +uniform in the unit cube. + +```python +embeddings = tf.Variable( + tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0)) +``` + +The noise-contrastive estimation loss is defined in terms of a logistic regression +model. For this, we need to define the weights and biases for each word in the +vocabulary (also called the `output weights` as opposed to the `input +embeddings`). So let's define that. + +```python +nce_weights = tf.Variable( + tf.truncated_normal([vocabulary_size, embedding_size], + stddev=1.0 / math.sqrt(embedding_size))) +nce_biases = tf.Variable(tf.zeros([vocabulary_size])) +``` + +Now that we have the parameters in place, we can define our skip-gram model +graph. For simplicity, let's suppose we've already integerized our text corpus +with a vocabulary so that each word is represented as an integer (see +[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py) +for the details). The skip-gram model takes two inputs. One is a batch full of +integers representing the source context words, the other is for the target +words. Let's create placeholder nodes for these inputs, so that we can feed in +data later. + +```python +# Placeholders for inputs +train_inputs = tf.placeholder(tf.int32, shape=[batch_size]) +train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1]) +``` + +Now what we need to do is look up the vector for each of the source words in +the batch. TensorFlow has handy helpers that make this easy. + +```python +embed = tf.nn.embedding_lookup(embeddings, train_inputs) +``` + +Ok, now that we have the embeddings for each word, we'd like to try to predict +the target word using the noise-contrastive training objective. + +```python +# Compute the NCE loss, using a sample of the negative labels each time. +loss = tf.reduce_mean( + tf.nn.nce_loss(weights=nce_weights, + biases=nce_biases, + labels=train_labels, + inputs=embed, + num_sampled=num_sampled, + num_classes=vocabulary_size)) +``` + +Now that we have a loss node, we need to add the nodes required to compute +gradients and update the parameters, etc. For this we will use stochastic +gradient descent, and TensorFlow has handy helpers to make this easy as well. + +```python +# We use the SGD optimizer. +optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss) +``` + +## Training the Model + +Training the model is then as simple as using a `feed_dict` to push data into +the placeholders and calling +@{tf.Session.run} with this new data +in a loop. 
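+
+The loop below assumes a `generate_batch` helper that produces minibatches of
+`(input, label)` pairs. The full `word2vec_basic.py` defines one; a simplified
+sketch (it samples positions at random and skips the `num_skips` bookkeeping
+of the real implementation) might look like this:
+
+```python
+import random
+
+import numpy as np
+
+def generate_batch(data, batch_size, skip_window, num_batches):
+  """Yield skip-gram (inputs, labels) batches from an integerized corpus."""
+  for _ in range(num_batches):
+    inputs = np.zeros(batch_size, dtype=np.int32)
+    labels = np.zeros((batch_size, 1), dtype=np.int32)
+    for i in range(batch_size):
+      # Pick a target position, leaving room for the context window.
+      t = random.randint(skip_window, len(data) - skip_window - 1)
+      context = (list(range(t - skip_window, t)) +
+                 list(range(t + 1, t + skip_window + 1)))
+      inputs[i] = data[t]                          # the target word
+      labels[i, 0] = data[random.choice(context)]  # one of its context words
+    yield inputs, labels
+```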
+ +```python +for inputs, labels in generate_batch(...): + feed_dict = {train_inputs: inputs, train_labels: labels} + _, cur_loss = session.run([optimizer, loss], feed_dict=feed_dict) +``` + +See the full example code in +[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py). + +## Visualizing the Learned Embeddings + +After training has finished we can visualize the learned embeddings using +t-SNE. + +
+ +
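+
+A minimal sketch of how a plot like the one above can be produced, along the
+lines of what `word2vec_basic.py` does (it assumes the `final_embeddings` and
+`reverse_dictionary` variables from that script, plus scikit-learn and
+matplotlib):
+
+```python
+import matplotlib.pyplot as plt
+from sklearn.manifold import TSNE
+
+tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
+plot_only = 500  # only plot the most frequent words
+low_dim_embs = tsne.fit_transform(final_embeddings[:plot_only, :])
+
+plt.figure(figsize=(18, 18))
+for i in range(plot_only):
+  x, y = low_dim_embs[i, :]
+  plt.scatter(x, y)
+  plt.annotate(reverse_dictionary[i], xy=(x, y), xytext=(5, 2),
+               textcoords='offset points', ha='right', va='bottom')
+plt.savefig('tsne.png')
+```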
+
+Et voilà! As expected, words that are similar end up clustering nearby each
+other. For a more heavyweight implementation of word2vec that showcases more of
+the advanced features of TensorFlow, see the implementation in
+[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).
+
+## Evaluating Embeddings: Analogical Reasoning
+
+Embeddings are useful for a wide variety of prediction tasks in NLP. Short of
+training a full-blown part-of-speech model or named-entity model, one simple way
+to evaluate embeddings is to directly use them to predict syntactic and semantic
+relationships like `king is to queen as father is to ?`. This is called
+*analogical reasoning* and the task was introduced by
+[Mikolov and colleagues](https://www.aclweb.org/anthology/N13-1090).
+Download the dataset for this task from
+[download.tensorflow.org](http://download.tensorflow.org/data/questions-words.txt).
+
+To see how we do this evaluation, have a look at the `build_eval_graph()` and
+`eval()` functions in
+[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).
+
+The choice of hyperparameters can strongly influence the accuracy on this task.
+Achieving state-of-the-art performance on this task requires training over a
+very large dataset, carefully tuning the hyperparameters and making use of
+tricks like subsampling the data, which is outside the scope of this tutorial.
+
+
+## Optimizing the Implementation
+
+Our vanilla implementation showcases the flexibility of TensorFlow. For
+example, changing the training objective is as simple as swapping out the call
+to `tf.nn.nce_loss()` for an off-the-shelf alternative such as
+`tf.nn.sampled_softmax_loss()`. If you have a new idea for a loss function, you
+can manually write an expression for the new objective in TensorFlow and let
+the optimizer compute its derivatives. This flexibility is invaluable in the
+exploratory phase of machine learning model development, where we are trying
+out several different ideas and iterating quickly.
+
+Once you have a model structure you're satisfied with, it may be worth
+optimizing your implementation to run more efficiently (and cover more data in
+less time). For example, the naive code we used in this tutorial would be slow
+because we use Python for reading and feeding data items -- each of which
+requires very little work on the TensorFlow back-end. If you find
+your model is seriously bottlenecked on input data, you may want to implement a
+custom data reader for your problem, as described in
+@{$new_data_formats$New Data Formats}. For the case of Skip-Gram
+modeling, we've actually already done this for you as an example in
+[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).
+
+If your model is no longer I/O bound but you want still more performance, you
+can take things further by writing your own TensorFlow Ops, as described in
+@{$adding_an_op$Adding a New Op}. Again, we've provided an
+example of this for the Skip-Gram case in
+[models/tutorials/embedding/word2vec_optimized.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec_optimized.py).
+Feel free to benchmark these against each other to measure performance
+improvements at each stage.
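+
+As a concrete illustration of the loss swap mentioned above, this sketch
+replaces the `tf.nn.nce_loss()` call from the graph we built earlier; the
+surrounding tensors are unchanged:
+
+```python
+# Candidate-sampling softmax instead of NCE's logistic loss; the arguments
+# mirror the tf.nn.nce_loss() call above.
+loss = tf.reduce_mean(
+    tf.nn.sampled_softmax_loss(weights=nce_weights,
+                               biases=nce_biases,
+                               labels=train_labels,
+                               inputs=embed,
+                               num_sampled=num_sampled,
+                               num_classes=vocabulary_size))
+```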
+
+## Conclusion
+
+In this tutorial we covered the word2vec model, a computationally efficient
+model for learning word embeddings. We motivated why embeddings are useful,
+discussed efficient training techniques and showed how to implement all of this
+in TensorFlow. Overall, we hope that this has showcased how TensorFlow affords
+you the flexibility you need for early experimentation, and the control you
+later need for bespoke, optimized implementations.
diff --git a/tensorflow/docs_src/tutorials/seq2seq.md b/tensorflow/docs_src/tutorials/seq2seq.md
deleted file mode 100644
index 8928ba4f7d..0000000000
--- a/tensorflow/docs_src/tutorials/seq2seq.md
+++ /dev/null
@@ -1,5 +0,0 @@
-# Sequence-to-Sequence Models
-
-Please check out the
-[tensorflow neural machine translation tutorial](https://github.com/tensorflow/nmt)
-for building sequence-to-sequence models with the latest Tensorflow API.
diff --git a/tensorflow/docs_src/tutorials/sequences/audio_recognition.md b/tensorflow/docs_src/tutorials/sequences/audio_recognition.md
new file mode 100644
index 0000000000..d7a8da6f96
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/sequences/audio_recognition.md
@@ -0,0 +1,631 @@
+# Simple Audio Recognition
+
+This tutorial will show you how to build a basic speech recognition network that
+recognizes ten different words. It's important to know that real speech and
+audio recognition systems are much more complex, but like MNIST for images, it
+should give you a basic understanding of the techniques involved. Once you've
+completed this tutorial, you'll have a model that tries to classify a one second
+audio clip as either silence, an unknown word, "yes", "no", "up", "down",
+"left", "right", "on", "off", "stop", or "go". You'll also be able to take this
+model and run it in an Android application.
+
+## Preparation
+
+You should make sure you have TensorFlow installed, and since the script
+downloads over 2GB of training data, you'll need a good internet connection and
+enough free space on your machine. The training process itself can take several
+hours, so make sure you have a machine available for that long.
+
+## Training
+
+To begin the training process, go to the TensorFlow source tree and run:
+
+```bash
+python tensorflow/examples/speech_commands/train.py
+```
+
+The script will start off by downloading the [Speech Commands
+dataset](https://storage.cloud.google.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz),
+which consists of over 105,000 WAVE audio files of people saying thirty
+different words. This data was collected by Google and released under a CC BY
+license, and you can help improve it by [contributing five minutes of your own
+voice](https://aiyprojects.withgoogle.com/open_speech_recording). The archive is
+over 2GB, so this part may take a while, but you should see progress logs, and
+once it's been downloaded, you won't need to do this step again. You can
+find more information about this dataset in this
+[Speech Commands paper](https://arxiv.org/abs/1804.03209).
+
+Once the downloading has completed, you'll see logging information that looks
+like this:
+
+```
+I0730 16:53:44.766740 55030 train.py:176] Training from step: 1
+I0730 16:53:47.289078 55030 train.py:217] Step #1: rate 0.001000, accuracy 7.0%, cross entropy 2.611571
+```
+
+This shows that the initialization process is done and the training loop has
+begun. You'll see that it outputs information for every training step.
Here's a
+breakdown of what it means:
+
+`Step #1` shows that we're on the first step of the training loop. In this case
+there are going to be 18,000 steps in total, so you can look at the step number
+to get an idea of how close it is to finishing.
+
+`rate 0.001000` is the learning rate that's controlling the speed of the
+network's weight updates. Early on this is a comparatively high number (0.001),
+but for later training cycles it will be reduced 10x, to 0.0001.
+
+`accuracy 7.0%` shows how many of the examples in this training step were
+correctly classified. This value will often fluctuate a lot, but should increase on
+average as training progresses. The model outputs an array of numbers, one for
+each label, and each number is the predicted likelihood of the input being that
+class. The predicted label is picked by choosing the entry with the highest
+score. The scores are always between zero and one, with higher values
+representing more confidence in the result.
+
+`cross entropy 2.611571` is the result of the loss function that we're using to
+guide the training process. This is a score that's obtained by comparing the
+vector of scores from the current training run to the correct labels, and this
+should trend downwards during training.
+
+After a hundred steps, you should see a line like this:
+
+`I0730 16:54:41.813438 55030 train.py:252] Saving to
+"/tmp/speech_commands_train/conv.ckpt-100"`
+
+This is saving out the current trained weights to a checkpoint file. If your
+training script gets interrupted, you can look for the last saved checkpoint and
+then restart the script with
+`--start_checkpoint=/tmp/speech_commands_train/conv.ckpt-100` as a command-line
+argument to start from that point.
+
+## Confusion Matrix
+
+After four hundred steps, this information will be logged:
+
+```
+I0730 16:57:38.073667 55030 train.py:243] Confusion Matrix:
+ [[258   0   0   0   0   0   0   0   0   0   0   0]
+  [  7   6  26  94   7  49   1  15  40   2   0  11]
+  [ 10   1 107  80  13  22   0  13  10   1   0   4]
+  [  1   3  16 163   6  48   0   5  10   1   0  17]
+  [ 15   1  17 114  55  13   0   9  22   5   0   9]
+  [  1   1   6  97   3  87   1  12  46   0   0  10]
+  [  8   6  86  84  13  24   1   9   9   1   0   6]
+  [  9   3  32 112   9  26   1  36  19   0   0   9]
+  [  8   2  12  94   9  52   0   6  72   0   0   2]
+  [ 16   1  39  74  29  42   0   6  37   9   0   3]
+  [ 15   6  17  71  50  37   0   6  32   2   1   9]
+  [ 11   1   6 151   5  42   0   8  16   0   0  20]]
+```
+
+The first section is a [confusion
+matrix](https://www.tensorflow.org/api_docs/python/tf/confusion_matrix). To
+understand what it means, you first need to know the labels being used, which in
+this case are "_silence_", "_unknown_", "yes", "no", "up", "down", "left",
+"right", "on", "off", "stop", and "go". Each column represents a set of samples
+that were predicted to be each label, so the first column represents all the
+clips that were predicted to be silence, the second all those that were
+predicted to be unknown words, the third "yes", and so on.
+
+Each row represents clips by their correct, ground truth labels. The first row
+is all the clips that were silence, the second clips that were unknown words,
+the third "yes", etc.
+
+This matrix can be more useful than just a single accuracy score because it
+gives a good summary of what mistakes the network is making. In this example you
+can see that all of the entries in the first row are zero, apart from the
+initial one. Because the first row is all the clips that are actually silence,
+this means that none of them were mistakenly labeled as words, so we have no
+false negatives for silence.
This shows the network is already getting pretty
+good at distinguishing silence from words.
+
+If we look down the first column though, we see a lot of non-zero values. The
+column represents all the clips that were predicted to be silence, so positive
+numbers outside of the first cell are errors. This means that some clips of real
+spoken words are actually being predicted to be silence, so we do have quite a
+few false positives.
+
+A perfect model would produce a confusion matrix where all of the entries were
+zero apart from a diagonal line through the center. Spotting deviations from
+that pattern can help you figure out how the model is most easily confused, and
+once you've identified the problems you can address them by adding more data or
+cleaning up categories.
+
+## Validation
+
+After the confusion matrix, you should see a line like this:
+
+`I0730 16:57:38.073777 55030 train.py:245] Step 400: Validation accuracy = 26.3%
+(N=3093)`
+
+It's good practice to separate your data set into three categories. The largest
+(in this case roughly 80% of the data) is used for training the network, a
+smaller set (10% here, known as "validation") is reserved for evaluation of the
+accuracy during training, and another set (the last 10%, "testing") is used to
+evaluate the accuracy once after the training is complete.
+
+The reason for this split is that there's always a danger that networks will
+start memorizing their inputs during training. By keeping the validation set
+separate, you can ensure that the model works with data it's never seen before.
+The testing set is an additional safeguard to make sure that you haven't just
+been tweaking your model in a way that happens to work for both the training and
+validation sets, but not a broader range of inputs.
+
+The training script automatically separates the data set into these three
+categories, and the logging line above shows the accuracy of the model when run
+on the validation set. Ideally, this should stick fairly close to the training
+accuracy. If the training accuracy increases but the validation doesn't, that's
+a sign that overfitting is occurring, and your model is only learning things
+about the training clips, not broader patterns that generalize.
+
+## TensorBoard
+
+A good way to visualize how the training is progressing is using TensorBoard. By
+default, the script saves out events to `/tmp/retrain_logs`, and you can load
+these by running:
+
+`tensorboard --logdir /tmp/retrain_logs`
+
+Then navigate to [http://localhost:6006](http://localhost:6006) in your browser,
+and you'll see charts and graphs showing your model's progress.
+
+<div style="width:50%; margin:auto; margin-bottom:10px; margin-top:20px;">
+ +
+
+## Training Finished
+
+After a few hours of training (depending on your machine's speed), the script
+should have completed all 18,000 steps. It will print out a final confusion
+matrix, along with an accuracy score, all run on the testing set. With the
+default settings, you should see an accuracy of between 85% and 90%.
+
+Because audio recognition is particularly useful on mobile devices, next we'll
+export the model to a compact format that's easy to work with on those
+platforms. To do that, run this command:
+
+```
+python tensorflow/examples/speech_commands/freeze.py \
+--start_checkpoint=/tmp/speech_commands_train/conv.ckpt-18000 \
+--output_file=/tmp/my_frozen_graph.pb
+```
+
+Once the frozen model has been created, you can test it with the `label_wav.py`
+script, like this:
+
+```
+python tensorflow/examples/speech_commands/label_wav.py \
+--graph=/tmp/my_frozen_graph.pb \
+--labels=/tmp/speech_commands_train/conv_labels.txt \
+--wav=/tmp/speech_dataset/left/a5d485dc_nohash_0.wav
+```
+
+This should print out three labels:
+
+```
+left (score = 0.81477)
+right (score = 0.14139)
+_unknown_ (score = 0.03808)
+```
+
+Hopefully "left" is the top score since that's the correct label, but since the
+training is random it may not be for the first file you try. Experiment with some
+of the other .wav files in that same folder to see how well it does.
+
+The scores are between zero and one, and higher values mean the model is more
+confident in its prediction.
+
+## Running the Model in an Android App
+
+The easiest way to see how this model works in a real application is to download
+[the prebuilt Android demo
+applications](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#prebuilt-components)
+and install them on your phone. You'll see 'TF Speech' appear in your app list,
+and opening it will show you the same list of action words we've just trained
+our model on, starting with "Yes" and "No". Once you've given the app permission
+to use the microphone, you should be able to try saying those words and see them
+highlighted in the UI when the model recognizes one of them.
+
+You can also build this application yourself, since it's open source and
+[available as part of the TensorFlow repository on
+github](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#building-in-android-studio-using-the-tensorflow-aar-from-jcenter).
+By default it downloads [a pretrained model from
+tensorflow.org](http://download.tensorflow.org/models/speech_commands_v0.02.zip),
+but you can easily [replace it with a model you've trained
+yourself](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android#install-model-files-optional).
+If you do this, you'll need to make sure that the constants in [the main
+SpeechActivity Java source
+file](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android/src/org/tensorflow/demo/SpeechActivity.java)
+like `SAMPLE_RATE` and `SAMPLE_DURATION` match any changes you've made to the
+defaults while training. You'll also see that there's a [Java version of the
+RecognizeCommands
+module](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/android/src/org/tensorflow/demo/RecognizeCommands.java)
+that's very similar to the C++ version in this tutorial. If you've tweaked
+parameters for that, you can also update them in SpeechActivity to get the same
+results as in your server testing.
+
+The demo app updates its UI list of results automatically based on the labels
+text file you copy into assets alongside your frozen graph, which means you can
+easily try out different models without needing to make any code changes. You
+will need to update `LABEL_FILENAME` and `MODEL_FILENAME` to point to the files
+you've added if you change the paths though.
+
+## How Does This Model Work?
+
+The architecture used in this tutorial is based on one described in the paper
+[Convolutional Neural Networks for Small-footprint Keyword
+Spotting](http://www.isca-speech.org/archive/interspeech_2015/papers/i15_1478.pdf).
+It was chosen because it's comparatively simple, quick to train, and easy to
+understand, rather than being state of the art. There are lots of different
+approaches to building neural network models to work with audio, including
+[recurrent networks](https://svds.com/tensorflow-rnn-tutorial/) or [dilated
+(atrous)
+convolutions](https://deepmind.com/blog/wavenet-generative-model-raw-audio/).
+This tutorial is based on the kind of convolutional network that will feel very
+familiar to anyone who's worked with image recognition. That may seem surprising
+at first though, since audio is inherently a one-dimensional continuous signal
+across time, not a 2D spatial problem.
+
+We solve that issue by defining a window of time we believe our spoken words
+should fit into, and converting the audio signal in that window into an image.
+This is done by grouping the incoming audio samples into short segments, just a
+few milliseconds long, and calculating the strength of the frequencies across a
+set of bands. Each set of frequency strengths from a segment is treated as a
+vector of numbers, and those vectors are arranged in time order to form a
+two-dimensional array. This array of values can then be treated like a
+single-channel image, and is known as a
+[spectrogram](https://en.wikipedia.org/wiki/Spectrogram). If you want to view
+what kind of image an audio sample produces, you can run the `wav_to_spectrogram`
+tool:
+
+```
+bazel run tensorflow/examples/wav_to_spectrogram:wav_to_spectrogram -- \
+--input_wav=/tmp/speech_dataset/happy/ab00c4b2_nohash_0.wav \
+--output_image=/tmp/spectrogram.png
+```
+
+If you open up `/tmp/spectrogram.png` you should see something like this:
+
+<div style="width:50%; margin:auto; margin-bottom:10px; margin-top:20px;">
+ +
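+
+The conversion described above can be sketched in a few lines of NumPy. This is
+a simplified illustration, not the exact computation the `wav_to_spectrogram`
+tool performs (the window and stride values are plausible choices for 16 kHz
+audio, not the tool's defaults):
+
+```python
+import numpy as np
+
+def spectrogram(samples, window_size=480, stride=160):
+  """Turn a 1-D audio signal into a [time, frequency] array of strengths.
+
+  At a 16,000 Hz sample rate, 480 samples is a 30 ms window and 160 samples
+  is a 10 ms stride between windows.
+  """
+  frames = []
+  for start in range(0, len(samples) - window_size + 1, stride):
+    frame = samples[start:start + window_size] * np.hanning(window_size)
+    # The magnitude of the FFT gives the strength in each frequency band.
+    frames.append(np.abs(np.fft.rfft(frame)))
+  return np.stack(frames)  # can be treated as a one-channel image
+```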
+
+Because of TensorFlow's memory order, time in the spectrogram above is
+increasing from top to bottom, with frequencies going from left to right, unlike
+the usual convention for spectrograms where time is left to right. You should be
+able to see a couple of distinct parts, with the first syllable "Ha" distinct
+from "ppy".
+
+Because the human ear is more sensitive to some frequencies than others, it's
+been traditional in speech recognition to do further processing to this
+representation to turn it into a set of [Mel-Frequency Cepstral
+Coefficients](https://en.wikipedia.org/wiki/Mel-frequency_cepstrum), or MFCCs
+for short. This is also a two-dimensional, one-channel representation so it can
+be treated like an image too. If you're targeting general sounds rather than
+speech you may find you can skip this step and operate directly on the
+spectrograms.
+
+The image that's produced by these processing steps is then fed into a
+multi-layer convolutional neural network, with a fully-connected layer followed
+by a softmax at the end. You can see the definition of this portion in
+[tensorflow/examples/speech_commands/models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py).
+
+## Streaming Accuracy
+
+Most audio recognition applications need to run on a continuous stream of audio,
+rather than on individual clips. A typical way to use a model in this
+environment is to apply it repeatedly at different offsets in time and average
+the results over a short window to produce a smoothed prediction. If you think
+of the input as an image, it's continuously scrolling along the time axis. The
+words we want to recognize can start at any time, so we need to take a series of
+snapshots to have a chance of getting an alignment that captures most of the
+utterance in the time window we feed into the model. If we sample at a high
+enough rate, then we have a good chance of capturing the word in multiple
+windows, so averaging the results improves the overall confidence of the
+prediction.
+
+For an example of how you can use your model on streaming data, you can look at
+[test_streaming_accuracy.cc](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/).
+This uses the
+[RecognizeCommands](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/recognize_commands.h)
+class to run over a long audio recording, try to spot words, and compare
+those predictions against a ground truth list of labels and times. This makes it
+a good example of applying a model to a stream of audio signals over time.
+
+You'll need a long audio file to test it against, along with labels showing
+where each word was spoken. If you don't want to record one yourself, you can
+generate some synthetic test data using the `generate_streaming_test_wav`
+utility. By default this will create a ten minute .wav file with words roughly
+every three seconds, and a text file containing the ground truth of when each
+word was spoken. These words are pulled from the test portion of your current
+dataset, mixed in with background noise. To run it, use:
+
+```
+bazel run tensorflow/examples/speech_commands:generate_streaming_test_wav
+```
+
+This will save a .wav file to `/tmp/speech_commands_train/streaming_test.wav`,
+and a text file listing the labels to
+`/tmp/speech_commands_train/streaming_test_labels.txt`.
You can then run
+accuracy testing with:
+
+```
+bazel run tensorflow/examples/speech_commands:test_streaming_accuracy -- \
+--graph=/tmp/my_frozen_graph.pb \
+--labels=/tmp/speech_commands_train/conv_labels.txt \
+--wav=/tmp/speech_commands_train/streaming_test.wav \
+--ground_truth=/tmp/speech_commands_train/streaming_test_labels.txt \
+--verbose
+```
+
+This will output information about the number of words correctly matched, how
+many were given the wrong labels, and how many times the model triggered when
+there was no real word spoken. There are various parameters that control how the
+signal averaging works, including `--average_window_ms` which sets the length of
+time to average results over, `--clip_stride_ms` which is the time between
+applications of the model, `--suppression_ms` which stops subsequent word
+detections from triggering for a certain time after an initial one is found, and
+`--detection_threshold`, which controls how high the average score must be
+before it's considered a solid result.
+
+You'll see that the streaming accuracy tool outputs three numbers, rather than
+just the one metric used in training. This is because different applications have
+varying requirements, with some being able to tolerate frequent incorrect
+results as long as real words are found (high recall), while others are very
+focused on ensuring the predicted labels are highly likely to be correct even if
+some aren't detected (high precision). The numbers from the tool give you an idea
+of how your model will perform in an application, and you can try tweaking the
+signal averaging parameters to tune it to give the kind of performance you want.
+To understand what the right parameters are for your application, you can look
+at generating an [ROC
+curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic) to help
+you understand the tradeoffs.
+
+## RecognizeCommands
+
+The streaming accuracy tool uses a simple decoder contained in a small C++ class
+called
+[RecognizeCommands](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/recognize_commands.h).
+This class is fed the output of running the TensorFlow model over time; it
+averages the signals and returns information about a label when it has enough
+evidence to think that a recognized word has been found. The implementation is
+fairly small, just keeping track of the last few predictions and averaging them,
+so it's easy to port to other platforms and languages as needed. For example,
+it's convenient to do something similar at the Java level on Android, or Python
+on the Raspberry Pi. As long as these implementations share the same logic, you
+can tune the parameters that control the averaging using the streaming test
+tool, and then transfer them over to your application to get similar results.
+
+## Advanced Training
+
+The defaults for the training script are designed to produce good end-to-end
+results in a comparatively small file, but there are a lot of options you can
+change to customize the results for your own requirements.
+
+### Custom Training Data
+
+By default the script will download the [Speech Commands
+dataset](https://download.tensorflow.org/data/speech_commands_v0.01.tgz), but
+you can also supply your own training data. To train on your own data, you
+should make sure that you have at least several hundred recordings of each sound
+you would like to recognize, and arrange them into folders by class.
For
+example, if you were trying to distinguish dog barks from cat miaows, you would
+create a root folder called `animal_sounds`, and then within that two
+sub-folders called `bark` and `miaow`. You would then organize your audio files
+into the appropriate folders.
+
+To point the script to your new audio files, you'll need to set `--data_url=` to
+disable downloading of the Speech Commands dataset, and
+`--data_dir=/your/data/folder/` to find the files you've just created.
+
+The files themselves should be 16-bit little-endian PCM-encoded WAVE format. The
+sample rate defaults to 16,000, but as long as all your audio is consistently
+the same rate (the script doesn't support resampling), you can change this with
+the `--sample_rate` argument. The clips should also all be roughly the same
+duration. The default expected duration is one second, but you can set this with
+the `--clip_duration_ms` flag. If you have clips with variable amounts of
+silence at the start, you can look at word alignment tools to standardize them
+([here's a quick and dirty approach you can use
+too](https://petewarden.com/2017/07/17/a-quick-hack-to-align-single-word-audio-recordings/)).
+
+One issue to watch out for is that you may have very similar repetitions of the
+same sounds in your dataset, and these can give misleading metrics if they're
+spread across your training, validation, and test sets. For example, the Speech
+Commands set has people repeating the same word multiple times. Each one of
+those repetitions is likely to be pretty close to the others, so if training was
+overfitting and memorizing one, it could perform unrealistically well when it
+saw a very similar copy in the test set. To avoid this danger, Speech Commands
+tries to ensure that all clips featuring the same word spoken by a single person
+are put into the same partition. Clips are assigned to training, test, or
+validation sets based on a hash of their filename, to ensure that the
+assignments remain steady even as new clips are added, and to avoid any training
+samples migrating into the other sets. To make sure that all of a given speaker's
+words are in the same bucket, [the hashing
+function](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/input_data.py)
+ignores anything in a filename after '_nohash_' when calculating the
+assignments. This means that if you have file names like `pete_nohash_0.wav` and
+`pete_nohash_1.wav`, they're guaranteed to be in the same set.
+
+### Unknown Class
+
+It's likely that your application will hear sounds that aren't in your training
+set, and you'll want the model to indicate that it doesn't recognize the noise
+in those cases. To help the network learn what sounds to ignore, you need to
+provide some clips of audio that belong to neither of your classes. To do this,
+you'd create `quack`, `oink`, and `moo` subfolders and populate them with noises
+from other animals your users might encounter. The `--wanted_words` argument to
+the script defines which classes you care about; all the others mentioned in
+subfolder names will be used to populate an `_unknown_` class during training.
+The Speech Commands dataset has twenty words in its unknown class, including
+the digits zero through nine and random names like "Sheila".
+
+By default 10% of the training examples are picked from the unknown classes, but
+you can control this with the `--unknown_percentage` flag.
Increasing this will
+make the model less likely to mistake unknown words for wanted ones, but making
+it too large can backfire as the model might decide it's safest to categorize
+all words as unknown!
+
+### Background Noise
+
+Real applications have to recognize audio even when there are other irrelevant
+sounds happening in the environment. To build a model that's robust to this kind
+of interference, we need to train against recorded audio with similar
+properties. The files in the Speech Commands dataset were captured on a variety
+of devices by users in many different environments, not in a studio, so that
+helps add some realism to the training. To add even more, you can mix in random
+segments of environmental audio to the training inputs. In the Speech Commands
+set there's a special folder called `_background_noise_` which contains
+minute-long WAVE files with white noise and recordings of machinery and everyday
+household activity.
+
+Small snippets of these files are chosen at random and mixed at a low volume
+into clips during training. The loudness is also chosen randomly, and controlled
+by the `--background_volume` argument as a proportion where 0 is silence, and 1
+is full volume. Not all clips have background added, so the
+`--background_frequency` flag controls what proportion have it mixed in.
+
+Your own application might operate in its own environment with different
+background noise patterns than these defaults, so you can supply your own audio
+clips in the `_background_noise_` folder. These should be the same sample rate
+as your main dataset, but much longer in duration so that a good set of random
+segments can be selected from them.
+
+### Silence
+
+In most cases the sounds you care about will be intermittent and so it's
+important to know when there's no matching audio. To support this, there's a
+special `_silence_` label that indicates when the model detects nothing
+interesting. Because there's never complete silence in real environments, we
+actually have to supply examples with quiet and irrelevant audio. For this, we
+reuse the `_background_noise_` folder that's also mixed into real clips,
+pulling short sections of the audio data and feeding those in with the ground
+truth class of `_silence_`. By default 10% of the training data is supplied like
+this, but the `--silence_percentage` flag can be used to control the proportion.
+As with unknown words, setting this higher can weight the model results in favor
+of true positives for silence, at the expense of false negatives for words, but
+too large a proportion can cause it to fall into the trap of always guessing
+silence.
+
+### Time Shifting
+
+Adding in background noise is one way of distorting the training data in a
+realistic way to effectively increase the size of the dataset, and so increase
+overall accuracy; time shifting is another. This involves a random offset in
+time of the training sample data, so that a small part of the start or end is
+cut off and the opposite section is padded with zeroes. This mimics the natural
+variations in starting time in the training data, and is controlled with the
+`--time_shift_ms` flag, which defaults to 100ms. Increasing this value will
+provide more variation, but at the risk of cutting off important parts of the
+audio. A related way of augmenting the data with realistic distortions is by
+using [time stretching and pitch
+scaling](https://en.wikipedia.org/wiki/Audio_time_stretching_and_pitch_scaling),
+but that's outside the scope of this tutorial.
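+
+The padding-and-shifting operation itself is easy to picture in a few lines of
+NumPy (an illustrative sketch, not the tutorial's actual implementation):
+
+```python
+import numpy as np
+
+def time_shift(samples, shift_ms=100, sample_rate=16000):
+  """Randomly shift a clip in time, zero-padding whichever end is exposed."""
+  max_shift = int(sample_rate * shift_ms / 1000)
+  shift = np.random.randint(-max_shift, max_shift + 1)
+  shifted = np.zeros_like(samples)
+  if shift > 0:
+    shifted[shift:] = samples[:-shift]   # delay the clip, pad the start
+  elif shift < 0:
+    shifted[:shift] = samples[-shift:]   # advance the clip, pad the end
+  else:
+    shifted[:] = samples
+  return shifted
+```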
+
+## Customizing the Model
+
+The default model used for this script is pretty large, taking over 800 million
+FLOPs for each inference and using 940,000 weight parameters. This runs at
+usable speeds on desktop machines or modern phones, but it involves too many
+calculations to run at interactive speeds on devices with more limited
+resources. To support these use cases, there are a couple of alternatives
+available:
+
+
+**low_latency_conv**
+Based on the 'cnn-one-fstride4' topology described in the [Convolutional
+Neural Networks for Small-footprint Keyword Spotting
+paper](http://www.isca-speech.org/archive/interspeech_2015/papers/i15_1478.pdf).
+The accuracy is slightly lower than 'conv' but the number of weight parameters
+is about the same, and it only needs 11 million FLOPs to run one prediction,
+making it much faster.
+
+To use this model, you specify `--model_architecture=low_latency_conv` on
+the command line. You'll also need to update the training rates and the number
+of steps, so the full command will look like:
+
+```
+python tensorflow/examples/speech_commands/train.py \
+--model_architecture=low_latency_conv \
+--how_many_training_steps=20000,6000 \
+--learning_rate=0.01,0.001
+```
+
+This asks the script to train with a learning rate of 0.01 for 20,000 steps, and
+then do a fine-tuning pass of 6,000 steps with a 10x smaller rate.
+
+**low_latency_svdf**
+Based on the topology presented in the [Compressing Deep Neural Networks using a
+Rank-Constrained Topology paper](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43813.pdf).
+The accuracy is also lower than 'conv' but it only uses about 750 thousand
+parameters, and most significantly, it allows for an optimized execution at
+test time (i.e. when you will actually use it in your application), resulting
+in 750 thousand FLOPs.
+
+To use this model, you specify `--model_architecture=low_latency_svdf` on
+the command line, and update the training rates and the number
+of steps, so the full command will look like:
+
+```
+python tensorflow/examples/speech_commands/train.py \
+--model_architecture=low_latency_svdf \
+--how_many_training_steps=100000,35000 \
+--learning_rate=0.01,0.005
+```
+
+Note that despite requiring a larger number of steps than the previous two
+topologies, the reduced number of computations means that training should take
+about the same time, and at the end reach an accuracy of around 85%.
+You can also further tune the topology fairly easily for computation and
+accuracy by changing these parameters in the SVDF layer:
+
+* `rank` - The rank of the approximation (higher is typically better, but
+  results in more computation).
+* `num_units` - Similar to other layer types, specifies the number of nodes in
+  the layer (more nodes give better quality, but more computation).
+
+Regarding runtime, since the layer allows optimizations by caching some of the
+internal neural network activations, you need to make sure to use a consistent
+stride (e.g. the `--clip_stride_ms` flag) both when you freeze the graph, and
+when executing the model in streaming mode (e.g. in `test_streaming_accuracy.cc`).
+
+**Other parameters to customize**
+If you want to experiment with customizing models, a good place to start is by
+tweaking the spectrogram creation parameters.
This has the effect of altering
+the size of the input image to the model, and the creation code in
+[models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py)
+will adjust the number of computations and weights automatically to fit with
+different dimensions. If you make the input smaller, the model will need fewer
+computations to process it, so it can be a great way to trade off some accuracy
+for improved latency. The `--window_stride_ms` flag controls how far apart each
+frequency analysis sample is from the previous. If you increase this value, then
+fewer samples will be taken for a given duration, and the time axis of the input
+will shrink. The `--dct_coefficient_count` flag controls how many buckets are
+used for the frequency counting, so reducing this will shrink the input in the
+other dimension. The `--window_size_ms` argument doesn't affect the size, but
+does control how wide the area used to calculate the frequencies is for each
+sample. Reducing the duration of the training samples, controlled by
+`--clip_duration_ms`, can also help if the sounds you're looking for are short,
+since that also reduces the time dimension of the input. You'll need to make
+sure that all your training data contains the right audio in the initial portion
+of the clip though.
+
+If you have an entirely different model in mind for your problem, you may find
+that you can plug it into
+[models.py](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/speech_commands/models.py)
+and have the rest of the script handle all of the preprocessing and training
+mechanics. You would add a new clause to `create_model`, looking for the name of
+your architecture and then calling a model creation function. This function is
+given the size of the spectrogram input, along with other model information, and
+is expected to create TensorFlow ops to read that in and produce an output
+prediction vector, and a placeholder to control the dropout rate. The rest of
+the script will handle integrating this model into a larger graph doing the
+input calculations and applying softmax and a loss function to train it.
+
+One common problem when you're adjusting models and training hyper-parameters is
+that not-a-number values can creep in, thanks to numerical precision issues. In
+general you can solve these by reducing the magnitude of things like learning
+rates and weight initialization functions, but if they're persistent you can
+enable the `--check_nans` flag to track down the source of the errors. This will
+insert check ops between most regular operations in TensorFlow, and abort the
+training process with a useful error message when they're encountered.
diff --git a/tensorflow/docs_src/tutorials/sequences/recurrent.md b/tensorflow/docs_src/tutorials/sequences/recurrent.md
new file mode 100644
index 0000000000..715cc7856a
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/sequences/recurrent.md
@@ -0,0 +1,232 @@
+# Recurrent Neural Networks
+
+## Introduction
+
+See [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/){:.external}
+for an introduction to recurrent neural networks and LSTMs.
+
+## Language Modeling
+
+In this tutorial we will show how to train a recurrent neural network on
+the challenging task of language modeling. The goal of the problem is to fit a
+probabilistic model which assigns probabilities to sentences. It does so by
+predicting next words in a text given a history of previous words.
For this
+purpose we will use the [Penn Tree Bank](https://catalog.ldc.upenn.edu/ldc99t42)
+(PTB) dataset, which is a popular benchmark for measuring the quality of these
+models, whilst being small and relatively fast to train.
+
+Language modeling is key to many interesting problems such as speech
+recognition, machine translation, or image captioning. It is also fun --
+take a look [here](https://karpathy.github.io/2015/05/21/rnn-effectiveness/).
+
+For the purpose of this tutorial, we will reproduce the results from
+[Zaremba et al., 2014](https://arxiv.org/abs/1409.2329)
+([pdf](https://arxiv.org/pdf/1409.2329.pdf)), which achieves very good quality
+on the PTB dataset.
+
+## Tutorial Files
+
+This tutorial references the following files from `models/tutorials/rnn/ptb` in the [TensorFlow models repo](https://github.com/tensorflow/models):
+
+File | Purpose
+--- | ---
+`ptb_word_lm.py` | The code to train a language model on the PTB dataset.
+`reader.py` | The code to read the dataset.
+
+## Download and Prepare the Data
+
+The data required for this tutorial is in the `data/` directory of the
+[PTB dataset from Tomas Mikolov's webpage](http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz).
+
+The dataset is already preprocessed and contains 10,000 different words overall,
+including the end-of-sentence marker and a special symbol (`<unk>`) for rare
+words. In `reader.py`, we convert each word to a unique integer identifier,
+in order to make it easy for the neural network to process the data.
+
+## The Model
+
+### LSTM
+
+The core of the model consists of an LSTM cell that processes one word at a
+time and computes probabilities of the possible values for the next word in the
+sentence. The memory state of the network is initialized with a vector of zeros
+and gets updated after reading each word. For computational reasons, we will
+process data in mini-batches of size `batch_size`. In this example, it is
+important to note that `current_batch_of_words` does not correspond to a
+"sentence" of words. Every word in a batch should correspond to a time t.
+TensorFlow will automatically sum the gradients of each batch for you.
+
+For example:
+
+```
+ t=0  t=1    t=2  t=3     t=4
+[The, brown, fox, is,     quick]
+[The, red,   fox, jumped, high]
+
+words_in_dataset[0] = [The, The]
+words_in_dataset[1] = [brown, red]
+words_in_dataset[2] = [fox, fox]
+words_in_dataset[3] = [is, jumped]
+words_in_dataset[4] = [quick, high]
+batch_size = 2, time_steps = 5
+```
+
+The basic pseudocode is as follows:
+
+```python
+words_in_dataset = tf.placeholder(tf.float32, [time_steps, batch_size, num_features])
+lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size)
+# Initial state of the LSTM memory.
+hidden_state = tf.zeros([batch_size, lstm.state_size])
+current_state = tf.zeros([batch_size, lstm.state_size])
+state = hidden_state, current_state
+probabilities = []
+loss = 0.0
+for current_batch_of_words in words_in_dataset:
+    # The value of state is updated after processing each batch of words.
+    output, state = lstm(current_batch_of_words, state)
+
+    # The LSTM output can be used to make next word predictions.
+    logits = tf.matmul(output, softmax_w) + softmax_b
+    probabilities.append(tf.nn.softmax(logits))
+    loss += loss_function(probabilities, target_words)
+```
+
+### Truncated Backpropagation
+
+By design, the output of a recurrent neural network (RNN) depends on arbitrarily
+distant inputs. Unfortunately, this makes backpropagation computation difficult.
+In order to make the learning process tractable, it is common practice to create +an "unrolled" version of the network, which contains a fixed number +(`num_steps`) of LSTM inputs and outputs. The model is then trained on this +finite approximation of the RNN. This can be implemented by feeding inputs of +length `num_steps` at a time and performing a backward pass after each +such input block. + +Here is a simplified block of code for creating a graph which performs +truncated backpropagation: + +```python +# Placeholder for the inputs in a given iteration. +words = tf.placeholder(tf.int32, [batch_size, num_steps]) + +lstm = tf.contrib.rnn.BasicLSTMCell(lstm_size) +# Initial state of the LSTM memory. +initial_state = state = tf.zeros([batch_size, lstm.state_size]) + +for i in range(num_steps): + # The value of state is updated after processing each batch of words. + output, state = lstm(words[:, i], state) + + # The rest of the code. + # ... + +final_state = state +``` + +And this is how to implement an iteration over the whole dataset: + +```python +# A numpy array holding the state of LSTM after each batch of words. +numpy_state = initial_state.eval() +total_loss = 0.0 +for current_batch_of_words in words_in_dataset: + numpy_state, current_loss = session.run([final_state, loss], + # Initialize the LSTM state from the previous iteration. + feed_dict={initial_state: numpy_state, words: current_batch_of_words}) + total_loss += current_loss +``` + +### Inputs + +The word IDs will be embedded into a dense representation (see the +@{$word2vec$Vector Representations Tutorial}) before feeding to +the LSTM. This allows the model to efficiently represent the knowledge about +particular words. It is also easy to write: + +```python +# embedding_matrix is a tensor of shape [vocabulary_size, embedding size] +word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids) +``` + +The embedding matrix will be initialized randomly and the model will learn to +differentiate the meaning of words just by looking at the data. + +### Loss Function + +We want to minimize the average negative log probability of the target words: + +$$ \text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i} $$ + +It is not very difficult to implement but the function +`sequence_loss_by_example` is already available, so we can just use it here. + +The typical measure reported in the papers is average per-word perplexity (often +just called perplexity), which is equal to + +$$e^{-\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i}} = e^{\text{loss}} $$ + +and we will monitor its value throughout the training process. + +### Stacking multiple LSTMs + +To give the model more expressive power, we can add multiple layers of LSTMs +to process the data. The output of the first layer will become the input of +the second and so on. + +We have a class called `MultiRNNCell` that makes the implementation seamless: + +```python +def lstm_cell(): + return tf.contrib.rnn.BasicLSTMCell(lstm_size) +stacked_lstm = tf.contrib.rnn.MultiRNNCell( + [lstm_cell() for _ in range(number_of_layers)]) + +initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32) +for i in range(num_steps): + # The value of state is updated after processing each batch of words. + output, state = stacked_lstm(words[:, i], state) + + # The rest of the code. + # ... + +final_state = state +``` + +## Run the Code + +Before running the code, download the PTB dataset, as discussed at the beginning +of this tutorial. 
Then, extract the PTB dataset underneath your home directory
+as follows:
+
+```bash
+tar xvfz simple-examples.tgz -C $HOME
+```
+_(Note: On Windows, you may need to use
+[other tools](https://wiki.haskell.org/How_to_unpack_a_tar_file_in_Windows).)_
+
+Now, clone the [TensorFlow models repo](https://github.com/tensorflow/models)
+from GitHub. Run the following commands:
+
+```bash
+cd models/tutorials/rnn/ptb
+python ptb_word_lm.py --data_path=$HOME/simple-examples/data/ --model=small
+```
+
+There are 3 supported model configurations in the tutorial code: "small",
+"medium" and "large". The difference between them is in the size of the LSTMs
+and the set of hyperparameters used for training.
+
+The larger the model, the better results it should get. The `small` model should
+be able to reach perplexity below 120 on the test set and the `large` one below
+80, though it might take several hours to train.
+
+## What Next?
+
+There are several tricks that we haven't mentioned that make the model better,
+including:
+
+* a decreasing learning rate schedule,
+* dropout between the LSTM layers.
+
+Study the code and modify it to improve the model even further.
diff --git a/tensorflow/docs_src/tutorials/sequences/recurrent_quickdraw.md b/tensorflow/docs_src/tutorials/sequences/recurrent_quickdraw.md
new file mode 100644
index 0000000000..37bce5b76d
--- /dev/null
+++ b/tensorflow/docs_src/tutorials/sequences/recurrent_quickdraw.md
@@ -0,0 +1,411 @@
+# Recurrent Neural Networks for Drawing Classification
+
+[Quick, Draw!]: http://quickdraw.withgoogle.com
+
+[Quick, Draw!] is a game where a player is challenged to draw a number of
+objects and see if a computer can recognize the drawing.
+
+The recognition in [Quick, Draw!] is performed by a classifier that takes the
+user input, given as a sequence of strokes of points in x and y, and recognizes
+the object category that the user tried to draw.
+
+In this tutorial we'll show how to build an RNN-based recognizer for this
+problem. The model will use a combination of convolutional layers, LSTM layers,
+and a softmax output layer to classify the drawings:
+
+<center>
![RNN model structure](../../images/quickdraw_model.png)
+
+The figure above shows the structure of the model that we will build in this
+tutorial. The input is a drawing that is encoded as a sequence of strokes of
+points in x, y, and n, where n indicates whether the point is the first point
+in a new stroke.
+
+First, a series of 1-dimensional convolutions is applied. Then LSTM layers
+are applied, and the sum of the outputs of all LSTM steps is fed into a
+softmax layer to make a classification decision among the classes of
+drawings that we know.
+
+This tutorial uses data from actual [Quick, Draw!] games [that is publicly
+available](https://quickdraw.withgoogle.com/data). This dataset contains 50M
+drawings in 345 categories.
+
+## Run the tutorial code
+
+To try the code for this tutorial:
+
+1. @{$install$Install TensorFlow} if you haven't already.
+1. Download the
+   [tutorial code](https://github.com/tensorflow/models/tree/master/tutorials/rnn/quickdraw/train_model.py).
+1. [Download the data](#download-the-data) in `TFRecord` format from
+   [here](http://download.tensorflow.org/data/quickdraw_tutorial_dataset_v1.tar.gz)
+   and unzip it. More details about [how to obtain the original Quick, Draw!
+   data](#optional_download_the_full_quick_draw_data) and [how to convert
+   that to `TFRecord` files](#optional_converting_the_data) are available
+   below.
+
+1. Execute the tutorial code with the following command to train the
+   RNN-based model described in this tutorial. Make sure to adjust the paths
+   to point to the unzipped data from the download in step 3.
+
+```shell
+  python train_model.py \
+    --training_data=rnn_tutorial_data/training.tfrecord-?????-of-????? \
+    --eval_data=rnn_tutorial_data/eval.tfrecord-?????-of-????? \
+    --classes_file=rnn_tutorial_data/training.tfrecord.classes
+```
+
+## Tutorial details
+
+### Download the data
+
+We make the data that we use in this tutorial available as `TFRecord` files
+containing `TFExamples`. You can download the data from here:
+
+http://download.tensorflow.org/data/quickdraw_tutorial_dataset_v1.tar.gz
+
+Alternatively you can download the original data in `ndjson` format from
+Google Cloud and convert it to `TFRecord` files containing `TFExamples`
+yourself, as described in the next section.
+
+### Optional: Download the full Quick Draw Data
+
+The full [Quick, Draw!](https://quickdraw.withgoogle.com)
+[dataset](https://quickdraw.withgoogle.com/data) is available on Google Cloud
+Storage as [ndjson](http://ndjson.org/) files separated by category. You can
+[browse the list of files in Cloud
+Console](https://console.cloud.google.com/storage/quickdraw_dataset).
+
+To download the data, we recommend using
+[gsutil](https://cloud.google.com/storage/docs/gsutil_install#install) to
+fetch the entire dataset. Note that the original .ndjson files require
+downloading ~22GB.
+
+Then use the following command to check that your gsutil installation works
+and that you can access the data bucket:
+
+```shell
+gsutil ls -r "gs://quickdraw_dataset/full/simplified/*"
+```
+
+which will output a long list of files like the following:
+
+```shell
+gs://quickdraw_dataset/full/simplified/The Eiffel Tower.ndjson
+gs://quickdraw_dataset/full/simplified/The Great Wall of China.ndjson
+gs://quickdraw_dataset/full/simplified/The Mona Lisa.ndjson
+gs://quickdraw_dataset/full/simplified/aircraft carrier.ndjson
+...
+```
+
+Then create a folder and download the dataset there.
+
+```shell
+mkdir rnn_tutorial_data
+cd rnn_tutorial_data
+gsutil -m cp "gs://quickdraw_dataset/full/simplified/*" .
+```
+
+This download will take a while and fetch a bit more than 23GB of data.
+
+### Optional: Converting the data
+
+To convert the `ndjson` files to
+@{$python/python_io#TFRecords_Format_Details$TFRecord} files containing
+[`tf.train.Example`](https://www.tensorflow.org/code/tensorflow/core/example/example.proto)
+protos, run the following command.
+
+```shell
+   python create_dataset.py --ndjson_path rnn_tutorial_data \
+      --output_path rnn_tutorial_data
+```
+
+This will store the data in 10 shards of
+@{$python/python_io#TFRecords_Format_Details$TFRecord} files with 10000 items
+per class for the training data and 1000 items per class as eval data.
+
+The conversion process is described in more detail below.
+
+The original QuickDraw data is formatted as `ndjson` files where each line
+contains a JSON object like the following:
+
+```json
+{"word":"cat",
+ "countrycode":"VE",
+ "timestamp":"2017-03-02 23:25:10.07453 UTC",
+ "recognized":true,
+ "key_id":"5201136883597312",
+ "drawing":[
+   [
+     [130,113,99,109,76,64,55,48,48,51,59,86,133,154,170,203,214,217,215,208,186,176,162,157,132],
+     [72,40,27,79,82,88,100,120,134,152,165,184,189,186,179,152,131,114,100,89,76,0,31,65,70]
+   ],[
+     [76,28,7],
+     [136,128,128]
+   ],[
+     [76,23,0],
+     [160,164,175]
+   ],[
+     [87,52,37],
+     [175,191,204]
+   ],[
+     [174,220,246,251],
+     [134,132,136,139]
+   ],[
+     [175,255],
+     [147,168]
+   ],[
+     [171,208,215],
+     [164,198,210]
+   ],[
+     [130,110,108,111,130,139,139,119],
+     [129,134,137,144,148,144,136,130]
+   ],[
+     [107,106],
+     [96,113]
+   ]
+ ]
+}
+```
+
+For our purpose of building a classifier, we only care about the fields
+"`word`" and "`drawing`". While parsing the ndjson files, we process them
+line by line using a function that converts the strokes from the `drawing`
+field into a tensor of size `[number of points, 3]` containing the
+differences of consecutive points. This function also returns the class name
+as a string.
+
+```python
+def parse_line(ndjson_line):
+  """Parse an ndjson line and return ink (as np array) and classname."""
+  sample = json.loads(ndjson_line)
+  class_name = sample["word"]
+  inkarray = sample["drawing"]
+  stroke_lengths = [len(stroke[0]) for stroke in inkarray]
+  total_points = sum(stroke_lengths)
+  np_ink = np.zeros((total_points, 3), dtype=np.float32)
+  current_t = 0
+  for stroke in inkarray:
+    for i in [0, 1]:
+      np_ink[current_t:(current_t + len(stroke[0])), i] = stroke[i]
+    current_t += len(stroke[0])
+    np_ink[current_t - 1, 2] = 1  # stroke_end
+  # Preprocessing.
+  # 1. Size normalization.
+  lower = np.min(np_ink[:, 0:2], axis=0)
+  upper = np.max(np_ink[:, 0:2], axis=0)
+  scale = upper - lower
+  scale[scale == 0] = 1
+  np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
+  # 2. Compute deltas in place, so that the stroke_end bit in column 2
+  # is preserved in the returned [number of points, 3] array.
+  np_ink[1:, 0:2] -= np_ink[0:-1, 0:2]
+  np_ink = np_ink[1:, :]
+  return np_ink, class_name
+```
+
+Since we want the data to be shuffled for writing, we read from each of the
+category files in random order and write to a random shard.
+
+For the training data we read the first 10000 items for each class, and for
+the eval data we read the next 1000 items for each class.
+
+This data is then reformatted into a tensor of shape `[num_training_samples,
+max_length, 3]`. Then we determine the bounding box of the original drawing
+in screen coordinates and normalize the size such that the drawing has unit
+height.
+
![Size normalization](../../images/quickdraw_sizenormalization.png)
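+
+As a quick sanity check of the normalization step above, a few lines of numpy
+show the effect on a small, hypothetical three-point ink array (the points
+below are made up for illustration and are not part of the tutorial code):
+
+```python
+import numpy as np
+
+# Hypothetical ink array with columns (x, y, stroke_end) before normalization.
+np_ink = np.array([[130., 72., 0.], [99., 27., 0.], [170., 189., 1.]],
+                  dtype=np.float32)
+lower = np.min(np_ink[:, 0:2], axis=0)   # [ 99.  27.]
+upper = np.max(np_ink[:, 0:2], axis=0)   # [170. 189.]
+scale = upper - lower
+scale[scale == 0] = 1
+np_ink[:, 0:2] = (np_ink[:, 0:2] - lower) / scale
+print(np_ink)  # x and y now lie in [0, 1]; the stroke_end column is unchanged.
+```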
+
+Finally, we compute the differences between consecutive points and store
+these as a `VarLenFeature` in a
+[tensorflow.Example](https://www.tensorflow.org/code/tensorflow/core/example/example.proto)
+under the key `ink`. In addition we store the `class_index` as a single-entry
+`FixedLenFeature` and the `shape` of the `ink` as a `FixedLenFeature` of
+length 2.
+
+### Defining the model
+
+To define the model, we create a new `Estimator`. If you want to read more
+about estimators, we recommend @{$custom_estimators$this tutorial}.
+
+To build the model, we:
+
+1. reshape the input back into its original shape, where each mini-batch is
+   padded to the maximal length of its contents. In addition to the ink data
+   we also have the lengths for each example and the target class. This
+   happens in the function [`_get_input_tensors`](#-get-input-tensors).
+
+1. pass the input through a series of convolution layers in
+   [`_add_conv_layers`](#-add-conv-layers).
+
+1. pass the output of the convolutions into a series of bidirectional LSTM
+   layers in [`_add_rnn_layers`](#-add-rnn-layers). At the end of that, the
+   outputs for each time step are summed up to produce a compact,
+   fixed-length embedding of the input.
+
+1. classify this embedding using a softmax layer in
+   [`_add_fc_layers`](#-add-fc-layers).
+
+In code this looks like:
+
+```python
+inks, lengths, targets = _get_input_tensors(features, targets)
+convolved = _add_conv_layers(inks)
+final_state = _add_rnn_layers(convolved, lengths)
+logits = _add_fc_layers(final_state)
+```
+
+### _get_input_tensors
+
+To obtain the input features, we first obtain the shape from the features
+dict and then create a 1D tensor of size `[batch_size]` containing the
+lengths of the input sequences. The ink is stored as a `SparseTensor` in the
+features dict, which we convert into a dense tensor and then reshape to be
+`[batch_size, ?, 3]`. Finally, if targets were passed in, we make sure they
+are stored as a 1D tensor of size `[batch_size]`.
+
+In code this looks like this:
+
+```python
+shapes = features["shape"]
+lengths = tf.squeeze(
+    tf.slice(shapes, begin=[0, 0], size=[params["batch_size"], 1]))
+inks = tf.reshape(
+    tf.sparse_tensor_to_dense(features["ink"]),
+    [params["batch_size"], -1, 3])
+if targets is not None:
+  targets = tf.squeeze(targets)
+```
+
+### _add_conv_layers
+
+The desired number of convolution layers and the lengths of the filters are
+configured through the parameters `num_conv` and `conv_len` in the `params`
+dict.
+
+The input is a sequence where each point has dimensionality 3. We are going
+to use 1D convolutions where we treat the 3 input features as channels. That
+means that the input is a `[batch_size, length, 3]` tensor and the output
+will be a `[batch_size, length, number_of_filters]` tensor.
+
+```python
+convolved = inks
+for i in range(len(params.num_conv)):
+  convolved_input = convolved
+  if params.batch_norm:
+    convolved_input = tf.layers.batch_normalization(
+        convolved_input,
+        training=(mode == tf.estimator.ModeKeys.TRAIN))
+  # Add dropout layer if enabled and not first convolution layer.
+  if i > 0 and params.dropout:
+    convolved_input = tf.layers.dropout(
+        convolved_input,
+        rate=params.dropout,
+        training=(mode == tf.estimator.ModeKeys.TRAIN))
+  convolved = tf.layers.conv1d(
+      convolved_input,
+      filters=params.num_conv[i],
+      kernel_size=params.conv_len[i],
+      activation=None,
+      strides=1,
+      padding="same",
+      name="conv1d_%d" % i)
+return convolved, lengths
+```
+
+### _add_rnn_layers
+
+We pass the output from the convolutions into bidirectional LSTM layers, for
+which we use a helper function from contrib.
+
+```python
+outputs, _, _ = contrib_rnn.stack_bidirectional_dynamic_rnn(
+    cells_fw=[cell(params.num_nodes) for _ in range(params.num_layers)],
+    cells_bw=[cell(params.num_nodes) for _ in range(params.num_layers)],
+    inputs=convolved,
+    sequence_length=lengths,
+    dtype=tf.float32,
+    scope="rnn_classification")
+```
+
+See the code for more details and for how to use CUDA-accelerated
+implementations.
+
+To create a compact, fixed-length embedding, we sum up the output of the
+LSTMs. We first zero out the regions of the batch where the sequences have
+no data.
+
+```python
+mask = tf.tile(
+    tf.expand_dims(tf.sequence_mask(lengths, tf.shape(outputs)[1]), 2),
+    [1, 1, tf.shape(outputs)[2]])
+zero_outside = tf.where(mask, outputs, tf.zeros_like(outputs))
+outputs = tf.reduce_sum(zero_outside, axis=1)
+```
+
+### _add_fc_layers
+
+The embedding of the input is passed into a fully connected layer, which we
+then use as a softmax layer.
+
+```python
+tf.layers.dense(final_state, params.num_classes)
+```
+
+### Loss, predictions, and optimizer
+
+Finally, we need to add a loss, a training op, and predictions to create the
+`ModelFn`:
+
+```python
+cross_entropy = tf.reduce_mean(
+    tf.nn.sparse_softmax_cross_entropy_with_logits(
+        labels=targets, logits=logits))
+# Add the optimizer.
+train_op = tf.contrib.layers.optimize_loss(
+    loss=cross_entropy,
+    global_step=tf.train.get_global_step(),
+    learning_rate=params.learning_rate,
+    optimizer="Adam",
+    # Some gradient clipping stabilizes training in the beginning.
+    clip_gradients=params.gradient_clipping_norm,
+    summaries=["learning_rate", "loss", "gradients", "gradient_norm"])
+predictions = tf.argmax(logits, axis=1)
+return model_fn_lib.ModelFnOps(
+    mode=mode,
+    predictions={"logits": logits,
+                 "predictions": predictions},
+    loss=cross_entropy,
+    train_op=train_op,
+    eval_metric_ops={"accuracy": tf.metrics.accuracy(targets, predictions)})
+```
+
+### Training and evaluating the model
+
+To train and evaluate the model, we can rely on the `Estimator` API and run
+training and evaluation with the `Experiment` API:
+
+```python
+  estimator = tf.estimator.Estimator(
+      model_fn=model_fn,
+      model_dir=output_dir,
+      config=config,
+      params=model_params)
+  # Train and evaluate the model.
+  experiment = tf.contrib.learn.Experiment(
+      estimator=estimator,
+      train_input_fn=get_input_fn(
+          mode=tf.contrib.learn.ModeKeys.TRAIN,
+          tfrecord_pattern=FLAGS.training_data,
+          batch_size=FLAGS.batch_size),
+      train_steps=FLAGS.steps,
+      eval_input_fn=get_input_fn(
+          mode=tf.contrib.learn.ModeKeys.EVAL,
+          tfrecord_pattern=FLAGS.eval_data,
+          batch_size=FLAGS.batch_size),
+      min_eval_frequency=1000)
+  experiment.train_and_evaluate()
+```
+
+Note that this tutorial is just a quick example on a relatively small dataset
+to get you familiar with the APIs of recurrent neural networks and
+estimators. Such models can be even more powerful if you try them on a large
+dataset.
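+
+The training snippet above references a `get_input_fn` helper that is not
+shown in this tutorial; see the tutorial code for the full version. As a
+rough, hypothetical sketch only, such an input function could parse the
+feature spec described earlier (`ink` as a `VarLenFeature`, `shape` and
+`class_index` as fixed-length features) along these lines:
+
+```python
+def get_input_fn(mode, tfrecord_pattern, batch_size):
+  """Hypothetical sketch of an input function for the TFRecord data above."""
+
+  def _parse(example_proto):
+    feature_spec = {
+        "ink": tf.VarLenFeature(dtype=tf.float32),
+        "shape": tf.FixedLenFeature([2], dtype=tf.int64),
+        "class_index": tf.FixedLenFeature([1], dtype=tf.int64),
+    }
+    features = tf.parse_single_example(example_proto, feature_spec)
+    labels = features.pop("class_index")
+    return features, labels
+
+  def _input_fn():
+    files = tf.gfile.Glob(tfrecord_pattern)
+    dataset = tf.data.TFRecordDataset(files)
+    if mode == tf.contrib.learn.ModeKeys.TRAIN:
+      dataset = dataset.shuffle(buffer_size=10000).repeat()
+    # Batching the sparse `ink` feature relies on tf.data's SparseTensor
+    # support; the model densifies it later in _get_input_tensors.
+    dataset = dataset.map(_parse).batch(batch_size)
+    return dataset.make_one_shot_iterator().get_next()
+
+  return _input_fn
+```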
+
+When training the model for 1M steps, you can expect to get an accuracy of
+approximately 70% on the top-1 candidate. Note that this accuracy is
+sufficient to build the quickdraw game because the game dynamics allow the
+user to adjust their drawing until it is recognized. Also, the game does not
+use only the top-1 candidate; it accepts a drawing as correct if the target
+category shows up with a score better than a fixed threshold.
diff --git a/tensorflow/docs_src/tutorials/wide.md b/tensorflow/docs_src/tutorials/wide.md
deleted file mode 100644
index 27ce75a30d..0000000000
--- a/tensorflow/docs_src/tutorials/wide.md
+++ /dev/null
@@ -1,461 +0,0 @@
-# TensorFlow Linear Model Tutorial
-
-In this tutorial, we will use the tf.estimator API in TensorFlow to solve a
-binary classification problem: Given census data about a person such as age,
-education, marital status, and occupation (the features), we will try to
-predict whether or not the person earns more than 50,000 dollars a year (the
-target label). We will train a **logistic regression** model, and given an
-individual's information our model will output a number between 0 and 1,
-which can be interpreted as the probability that the individual has an
-annual income of over 50,000 dollars.
-
-## Setup
-
-To try the code for this tutorial:
-
-1. @{$install$Install TensorFlow} if you haven't already.
-
-2. Download [the tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/).
-
-3. Execute the data download script we provide to you:
-
-        $ python data_download.py
-
-4. Execute the tutorial code with the following command to train the linear
-model described in this tutorial:
-
-        $ python wide_deep.py --model_type=wide
-
-Read on to find out how this code builds its linear model.
-
-## Reading The Census Data
-
-The dataset we'll be using is the
-[Census Income Dataset](https://archive.ics.uci.edu/ml/datasets/Census+Income).
-We have provided
-[data_download.py](https://github.com/tensorflow/models/tree/master/official/wide_deep/data_download.py)
-which downloads the data and performs some additional cleanup.
-
-Since the task is a binary classification problem, we'll construct a label
-column named "label" whose value is 1 if the income is over 50K, and 0
-otherwise. For reference, see `input_fn` in
-[wide_deep.py](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py).
-
-Next, let's take a look at the dataframe and see which columns we can use to
-predict the target label. The columns can be grouped into two types:
-categorical and continuous columns.
-
-* A column is called **categorical** if its value can only be one of the
-  categories in a finite set. For example, the relationship status of a
-  person (wife, husband, unmarried, etc.) or the education level (high
-  school, college, etc.) are categorical columns.
-* A column is called **continuous** if its value can be any numerical value
-  in a continuous range. For example, the capital gain of a person (e.g.
-  $14,084) is a continuous column.
-
-Here's a list of columns available in the Census Income dataset:
-
-| Column Name    | Type        | Description                       |
-| -------------- | ----------- | --------------------------------- |
-| age            | Continuous  | The age of the individual         |
-| workclass      | Categorical | The type of employer the          |
-:                :             : individual has (government,       :
-:                :             : military, private, etc.).
: -| fnlwgt | Continuous | The number of people the census | -: : : takers believe that observation : -: : : represents (sample weight). Final : -: : : weight will not be used. : -| education | Categorical | The highest level of education | -: : : achieved for that individual. : -| education_num | Continuous | The highest level of education in | -: : : numerical form. : -| marital_status | Categorical | Marital status of the individual. | -| occupation | Categorical | The occupation of the individual. | -| relationship | Categorical | Wife, Own-child, Husband, | -: : : Not-in-family, Other-relative, : -: : : Unmarried. : -| race | Categorical | Amer-Indian-Eskimo, Asian-Pac- | -: : : Islander, Black, White, Other. : -| gender | Categorical | Female, Male. | -| capital_gain | Continuous | Capital gains recorded. | -| capital_loss | Continuous | Capital Losses recorded. | -| hours_per_week | Continuous | Hours worked per week. | -| native_country | Categorical | Country of origin of the | -: : : individual. : -| income_bracket | Categorical | ">50K" or "<=50K", meaning | -: : : whether the person makes more : -: : : than $50,000 annually. : - -## Converting Data into Tensors - -When building a tf.estimator model, the input data is specified by means of an -Input Builder function. This builder function will not be called until it is -later passed to tf.estimator.Estimator methods such as `train` and `evaluate`. -The purpose of this function is to construct the input data, which is -represented in the form of @{tf.Tensor}s or @{tf.SparseTensor}s. -In more detail, the input builder function returns the following as a pair: - -1. `features`: A dict from feature column names to `Tensors` or - `SparseTensors`. -2. `labels`: A `Tensor` containing the label column. - -The keys of the `features` will be used to construct columns in the next -section. Because we want to call the `train` and `evaluate` methods with -different data, we define a method that returns an input function based on the -given data. Note that the returned input function will be called while -constructing the TensorFlow graph, not while running the graph. What it is -returning is a representation of the input data as the fundamental unit of -TensorFlow computations, a `Tensor` (or `SparseTensor`). - -Each continuous column in the train or test data will be converted into a -`Tensor`, which in general is a good format to represent dense data. For -categorical data, we must represent the data as a `SparseTensor`. This data -format is good for representing sparse data. Our `input_fn` uses the `tf.data` -API, which makes it easy to apply transformations to our dataset: - -```python -def input_fn(data_file, num_epochs, shuffle, batch_size): - """Generate an input function for the Estimator.""" - assert tf.gfile.Exists(data_file), ( - '%s not found. Please make sure you have either run data_download.py or ' - 'set both arguments --train_data and --test_data.' % data_file) - - def parse_csv(value): - print('Parsing', data_file) - columns = tf.decode_csv(value, record_defaults=_CSV_COLUMN_DEFAULTS) - features = dict(zip(_CSV_COLUMNS, columns)) - labels = features.pop('income_bracket') - return features, tf.equal(labels, '>50K') - - # Extract lines from input files using the Dataset API. 
- dataset = tf.data.TextLineDataset(data_file) - - if shuffle: - dataset = dataset.shuffle(buffer_size=_SHUFFLE_BUFFER) - - dataset = dataset.map(parse_csv, num_parallel_calls=5) - - # We call repeat after shuffling, rather than before, to prevent separate - # epochs from blending together. - dataset = dataset.repeat(num_epochs) - dataset = dataset.batch(batch_size) - - iterator = dataset.make_one_shot_iterator() - features, labels = iterator.get_next() - return features, labels -``` - -## Selecting and Engineering Features for the Model - -Selecting and crafting the right set of feature columns is key to learning an -effective model. A **feature column** can be either one of the raw columns in -the original dataframe (let's call them **base feature columns**), or any new -columns created based on some transformations defined over one or multiple base -columns (let's call them **derived feature columns**). Basically, "feature -column" is an abstract concept of any raw or derived variable that can be used -to predict the target label. - -### Base Categorical Feature Columns - -To define a feature column for a categorical feature, we can create a -`CategoricalColumn` using the tf.feature_column API. If you know the set of all -possible feature values of a column and there are only a few of them, you can -use `categorical_column_with_vocabulary_list`. Each key in the list will get -assigned an auto-incremental ID starting from 0. For example, for the -`relationship` column we can assign the feature string "Husband" to an integer -ID of 0 and "Not-in-family" to 1, etc., by doing: - -```python -relationship = tf.feature_column.categorical_column_with_vocabulary_list( - 'relationship', [ - 'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried', - 'Other-relative']) -``` - -What if we don't know the set of possible values in advance? Not a problem. We -can use `categorical_column_with_hash_bucket` instead: - -```python -occupation = tf.feature_column.categorical_column_with_hash_bucket( - 'occupation', hash_bucket_size=1000) -``` - -What will happen is that each possible value in the feature column `occupation` -will be hashed to an integer ID as we encounter them in training. See an example -illustration below: - -ID | Feature ---- | ------------- -... | -9 | `"Machine-op-inspct"` -... | -103 | `"Farming-fishing"` -... | -375 | `"Protective-serv"` -... | - -No matter which way we choose to define a `SparseColumn`, each feature string -will be mapped into an integer ID by looking up a fixed mapping or by hashing. -Note that hashing collisions are possible, but may not significantly impact the -model quality. Under the hood, the `LinearModel` class is responsible for -managing the mapping and creating `tf.Variable` to store the model parameters -(also known as model weights) for each feature ID. The model parameters will be -learned through the model training process we'll go through later. 
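-
-As a quick, hypothetical illustration of the hashing trick (not part of the
-tutorial code), you can evaluate the hashing op directly; the IDs in the
-table above are illustrative, and the actual values depend on the hash
-function:
-
-```python
-import tensorflow as tf
-
-# Sketch: map raw feature strings to integer IDs in [0, 1000), roughly as a
-# categorical column with hash_bucket_size=1000 does under the hood.
-values = tf.constant(["Machine-op-inspct", "Farming-fishing", "Protective-serv"])
-ids = tf.string_to_hash_bucket_fast(values, num_buckets=1000)
-
-with tf.Session() as sess:
-  print(sess.run(ids))  # Three integer bucket IDs, e.g. [9, 103, 375].
-```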
-
-We'll use a similar trick to define the other categorical features:
-
-```python
-education = tf.feature_column.categorical_column_with_vocabulary_list(
-    'education', [
-        'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college',
-        'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school',
-        '5th-6th', '10th', '1st-4th', 'Preschool', '12th'])
-
-marital_status = tf.feature_column.categorical_column_with_vocabulary_list(
-    'marital_status', [
-        'Married-civ-spouse', 'Divorced', 'Married-spouse-absent',
-        'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed'])
-
-relationship = tf.feature_column.categorical_column_with_vocabulary_list(
-    'relationship', [
-        'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried',
-        'Other-relative'])
-
-workclass = tf.feature_column.categorical_column_with_vocabulary_list(
-    'workclass', [
-        'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov',
-        'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked'])
-
-# To show an example of hashing:
-occupation = tf.feature_column.categorical_column_with_hash_bucket(
-    'occupation', hash_bucket_size=1000)
-```
-
-### Base Continuous Feature Columns
-
-Similarly, we can define a `NumericColumn` for each continuous feature
-column that we want to use in the model:
-
-```python
-age = tf.feature_column.numeric_column('age')
-education_num = tf.feature_column.numeric_column('education_num')
-capital_gain = tf.feature_column.numeric_column('capital_gain')
-capital_loss = tf.feature_column.numeric_column('capital_loss')
-hours_per_week = tf.feature_column.numeric_column('hours_per_week')
-```
-
-### Making Continuous Features Categorical through Bucketization
-
-Sometimes the relationship between a continuous feature and the label is not
-linear. As a hypothetical example, a person's income may grow with age in
-the early stage of one's career, then the growth may slow at some point, and
-finally the income decreases after retirement. In this scenario, using the
-raw `age` as a real-valued feature column might not be a good choice because
-the model can only learn one of three cases:
-
-1. Income always increases at some rate as age grows (positive correlation),
-1. Income always decreases at some rate as age grows (negative correlation), or
-1. Income stays the same regardless of age (no correlation).
-
-If we want to learn the fine-grained correlation between income and each age
-group separately, we can leverage **bucketization**. Bucketization is the
-process of dividing the entire range of a continuous feature into a set of
-consecutive bins/buckets, and then converting the original numerical feature
-into a bucket ID (a categorical feature) depending on which bucket the value
-falls into. So, we can define a `bucketized_column` over `age` as:
-
-```python
-age_buckets = tf.feature_column.bucketized_column(
-    age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65])
-```
-
-where `boundaries` is a list of bucket boundaries. In this case, there are
-10 boundaries, resulting in 11 age group buckets (age 17 and below, 18-24,
-25-29, ..., and 65 and over).
-
-### Intersecting Multiple Columns with CrossedColumn
-
-Using each base feature column separately may not be enough to explain the
-data. For example, the correlation between education and the label (earning
-> 50,000 dollars) may be different for different occupations.
-Therefore, if we only learn a single model weight for `education="Bachelors"`
-and `education="Masters"`, we won't be able to capture every single
-education-occupation combination (e.g. distinguishing between
-`education="Bachelors" AND occupation="Exec-managerial"` and
-`education="Bachelors" AND occupation="Craft-repair"`). To learn the
-differences between different feature combinations, we can add **crossed
-feature columns** to the model.
-
-```python
-education_x_occupation = tf.feature_column.crossed_column(
-    ['education', 'occupation'], hash_bucket_size=1000)
-```
-
-We can also create a `CrossedColumn` over more than two columns. Each
-constituent column can be either a base feature column that is categorical
-(`SparseColumn`), a bucketized real-valued feature column
-(`BucketizedColumn`), or even another `CrossedColumn`. Here's an example:
-
-```python
-age_buckets_x_education_x_occupation = tf.feature_column.crossed_column(
-    [age_buckets, 'education', 'occupation'], hash_bucket_size=1000)
-```
-
-## Defining The Logistic Regression Model
-
-After processing the input data and defining all the feature columns, we're
-now ready to put them all together and build a Logistic Regression model. In
-the previous section we've seen several types of base and derived feature
-columns, including:
-
-* `CategoricalColumn`
-* `NumericColumn`
-* `BucketizedColumn`
-* `CrossedColumn`
-
-All of these are subclasses of the abstract `FeatureColumn` class, and can
-be added to the `feature_columns` field of a model:
-
-```python
-base_columns = [
-    education, marital_status, relationship, workclass, occupation,
-    age_buckets,
-]
-crossed_columns = [
-    tf.feature_column.crossed_column(
-        ['education', 'occupation'], hash_bucket_size=1000),
-    tf.feature_column.crossed_column(
-        [age_buckets, 'education', 'occupation'], hash_bucket_size=1000),
-]
-
-model_dir = tempfile.mkdtemp()
-model = tf.estimator.LinearClassifier(
-    model_dir=model_dir, feature_columns=base_columns + crossed_columns)
-```
-
-The model also automatically learns a bias term, which controls the
-prediction one would make without observing any features (see the section
-"How Logistic Regression Works" for more explanation). The learned model
-files will be stored in `model_dir`.
-
-## Training and Evaluating Our Model
-
-After adding all the features to the model, let's look at how to actually
-train the model. Training a model is just a single command using the
-tf.estimator API:
-
-```python
-model.train(input_fn=lambda: input_fn(train_data, num_epochs, True, batch_size))
-```
-
-After the model is trained, we can evaluate how good our model is at
-predicting the labels of the holdout data:
-
-```python
-results = model.evaluate(input_fn=lambda: input_fn(
-    test_data, 1, False, batch_size))
-for key in sorted(results):
-  print('%s: %s' % (key, results[key]))
-```
-
-The first line of the final output should be something like
-`accuracy: 0.83557522`, which means the accuracy is 83.6%. Feel free to try
-more features and transformations and see if you can do even better!
-
-After the model is evaluated, we can use it to predict whether an individual
-has an annual income of over 50,000 dollars given that individual's
-information.
-
-```python
-pred_iter = model.predict(input_fn=lambda: input_fn(FLAGS.test_data, 1, False, 1))
-for pred in pred_iter:
-  print(pred['classes'])
-```
-
-The model prediction output will look like `[b'1']` or `[b'0']`, indicating
-whether or not the corresponding individual has an annual income of over
-50,000 dollars.
-
-If you'd like to see a working end-to-end example, you can download our
-[example code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py)
-and set the `model_type` flag to `wide`.
-
-## Adding Regularization to Prevent Overfitting
-
-Regularization is a technique used to avoid **overfitting**. Overfitting
-happens when your model does well on the data it is trained on, but worse on
-test data that the model has not seen before, such as live traffic.
-Overfitting generally occurs when a model is excessively complex, such as
-having too many parameters relative to the number of observed training data.
-Regularization allows you to control your model's complexity and makes the
-model more generalizable to unseen data.
-
-In the Linear Model library, you can add L1 and L2 regularizations to the
-model as:
-
-```python
-model = tf.estimator.LinearClassifier(
-    model_dir=model_dir, feature_columns=base_columns + crossed_columns,
-    optimizer=tf.train.FtrlOptimizer(
-        learning_rate=0.1,
-        l1_regularization_strength=1.0,
-        l2_regularization_strength=1.0))
-```
-
-One important difference between L1 and L2 regularization is that L1
-regularization tends to make model weights stay at zero, creating sparser
-models, whereas L2 regularization also tries to make the model weights
-closer to zero but not necessarily zero. Therefore, if you increase the
-strength of L1 regularization, you will have a smaller model size because
-many of the model weights will be zero. This is often desirable when the
-feature space is very large but sparse, and when there are resource
-constraints that prevent you from serving a model that is too large.
-
-In practice, you should try various combinations of L1 and L2 regularization
-strengths and find the parameters that best control overfitting and give you
-a desirable model size.
-
-## How Logistic Regression Works
-
-Finally, let's take a minute to talk about what the Logistic Regression
-model actually looks like in case you're not already familiar with it. We'll
-denote the label as \\(Y\\), and the set of observed features as a feature
-vector \\(\mathbf{x}=[x_1, x_2, ..., x_d]\\). We define \\(Y=1\\) if an
-individual earned > 50,000 dollars and \\(Y=0\\) otherwise. In Logistic
-Regression, the probability of the label being positive (\\(Y=1\\)) given
-the features \\(\mathbf{x}\\) is given as:
-
-$$ P(Y=1|\mathbf{x}) = \frac{1}{1+\exp(-(\mathbf{w}^T\mathbf{x}+b))}$$
-
-where \\(\mathbf{w}=[w_1, w_2, ..., w_d]\\) are the model weights for the
-features \\(\mathbf{x}=[x_1, x_2, ..., x_d]\\), and \\(b\\) is a constant
-often called the **bias** of the model. The equation consists of two parts:
-a linear model and a logistic function.
-
-* **Linear Model**: First, we can see that \\(\mathbf{w}^T\mathbf{x}+b = b +
-  w_1x_1 + ... + w_dx_d\\) is a linear model where the output is a linear
-  function of the input features \\(\mathbf{x}\\). The bias \\(b\\) is the
-  prediction one would make without observing any features. The model weight
-  \\(w_i\\) reflects how the feature \\(x_i\\) is correlated with the
-  positive label.
If \\(x_i\\) is positively correlated with the positive label, the - weight \\(w_i\\) increases, and the probability \\(P(Y=1|\mathbf{x})\\) will - be closer to 1. On the other hand, if \\(x_i\\) is negatively correlated - with the positive label, then the weight \\(w_i\\) decreases and the - probability \\(P(Y=1|\mathbf{x})\\) will be closer to 0. - -* **Logistic Function**: Second, we can see that there's a logistic function - (also known as the sigmoid function) \\(S(t) = 1/(1+\exp(-t))\\) being - applied to the linear model. The logistic function is used to convert the - output of the linear model \\(\mathbf{w}^T\mathbf{x}+b\\) from any real - number into the range of \\([0, 1]\\), which can be interpreted as a - probability. - -Model training is an optimization problem: The goal is to find a set of model -weights (i.e. model parameters) to minimize a **loss function** defined over the -training data, such as logistic loss for Logistic Regression models. The loss -function measures the discrepancy between the ground-truth label and the model's -prediction. If the prediction is very close to the ground-truth label, the loss -value will be low; if the prediction is very far from the label, then the loss -value would be high. - -## Learn Deeper - -If you're interested in learning more, check out our -@{$wide_and_deep$Wide & Deep Learning Tutorial} where we'll show you how to -combine the strengths of linear models and deep neural networks by jointly -training them using the tf.estimator API. diff --git a/tensorflow/docs_src/tutorials/wide_and_deep.md b/tensorflow/docs_src/tutorials/wide_and_deep.md deleted file mode 100644 index 44677a810b..0000000000 --- a/tensorflow/docs_src/tutorials/wide_and_deep.md +++ /dev/null @@ -1,243 +0,0 @@ -# TensorFlow Wide & Deep Learning Tutorial - -In the previous @{$wide$TensorFlow Linear Model Tutorial}, we trained a logistic -regression model to predict the probability that the individual has an annual -income of over 50,000 dollars using the -[Census Income Dataset](https://archive.ics.uci.edu/ml/datasets/Census+Income). -TensorFlow is great for training deep neural networks too, and you might be -thinking which one you should choose—well, why not both? Would it be possible to -combine the strengths of both in one model? - -In this tutorial, we'll introduce how to use the tf.estimator API to jointly -train a wide linear model and a deep feed-forward neural network. This approach -combines the strengths of memorization and generalization. It's useful for -generic large-scale regression and classification problems with sparse input -features (e.g., categorical features with a large number of possible feature -values). If you're interested in learning more about how Wide & Deep Learning -works, please check out our [research paper](https://arxiv.org/abs/1606.07792). - -![Wide & Deep Spectrum of Models](https://www.tensorflow.org/images/wide_n_deep.svg "Wide & Deep") - -The figure above shows a comparison of a wide model (logistic regression with -sparse features and transformations), a deep model (feed-forward neural network -with an embedding layer and several hidden layers), and a Wide & Deep model -(joint training of both). At a high level, there are only 3 steps to configure a -wide, deep, or Wide & Deep model using the tf.estimator API: - -1. Select features for the wide part: Choose the sparse base columns and - crossed columns you want to use. -1. 
Select features for the deep part: Choose the continuous columns, the - embedding dimension for each categorical column, and the hidden layer sizes. -1. Put them all together in a Wide & Deep model - (`DNNLinearCombinedClassifier`). - -And that's it! Let's go through a simple example. - -## Setup - -To try the code for this tutorial: - -1. @{$install$Install TensorFlow} if you haven't already. - -2. Download [the tutorial code](https://github.com/tensorflow/models/tree/master/official/wide_deep/). - -3. Execute the data download script we provide to you: - - $ python data_download.py - -4. Execute the tutorial code with the following command to train the wide and -deep model described in this tutorial: - - $ python wide_deep.py - -Read on to find out how this code builds its model. - - -## Define Base Feature Columns - -First, let's define the base categorical and continuous feature columns that -we'll use. These base columns will be the building blocks used by both the wide -part and the deep part of the model. - -```python -import tensorflow as tf - -# Continuous columns -age = tf.feature_column.numeric_column('age') -education_num = tf.feature_column.numeric_column('education_num') -capital_gain = tf.feature_column.numeric_column('capital_gain') -capital_loss = tf.feature_column.numeric_column('capital_loss') -hours_per_week = tf.feature_column.numeric_column('hours_per_week') - -education = tf.feature_column.categorical_column_with_vocabulary_list( - 'education', [ - 'Bachelors', 'HS-grad', '11th', 'Masters', '9th', 'Some-college', - 'Assoc-acdm', 'Assoc-voc', '7th-8th', 'Doctorate', 'Prof-school', - '5th-6th', '10th', '1st-4th', 'Preschool', '12th']) - -marital_status = tf.feature_column.categorical_column_with_vocabulary_list( - 'marital_status', [ - 'Married-civ-spouse', 'Divorced', 'Married-spouse-absent', - 'Never-married', 'Separated', 'Married-AF-spouse', 'Widowed']) - -relationship = tf.feature_column.categorical_column_with_vocabulary_list( - 'relationship', [ - 'Husband', 'Not-in-family', 'Wife', 'Own-child', 'Unmarried', - 'Other-relative']) - -workclass = tf.feature_column.categorical_column_with_vocabulary_list( - 'workclass', [ - 'Self-emp-not-inc', 'Private', 'State-gov', 'Federal-gov', - 'Local-gov', '?', 'Self-emp-inc', 'Without-pay', 'Never-worked']) - -# To show an example of hashing: -occupation = tf.feature_column.categorical_column_with_hash_bucket( - 'occupation', hash_bucket_size=1000) - -# Transformations. -age_buckets = tf.feature_column.bucketized_column( - age, boundaries=[18, 25, 30, 35, 40, 45, 50, 55, 60, 65]) -``` - -## The Wide Model: Linear Model with Crossed Feature Columns - -The wide model is a linear model with a wide set of sparse and crossed feature -columns: - -```python -base_columns = [ - education, marital_status, relationship, workclass, occupation, - age_buckets, -] - -crossed_columns = [ - tf.feature_column.crossed_column( - ['education', 'occupation'], hash_bucket_size=1000), - tf.feature_column.crossed_column( - [age_buckets, 'education', 'occupation'], hash_bucket_size=1000), -] -``` - -You can also see the @{$wide$TensorFlow Linear Model Tutorial} for more details. - -Wide models with crossed feature columns can memorize sparse interactions -between features effectively. That being said, one limitation of crossed feature -columns is that they do not generalize to feature combinations that have not -appeared in the training data. Let's add a deep model with embeddings to fix -that. 
-
-## The Deep Model: Neural Network with Embeddings
-
-The deep model is a feed-forward neural network, as shown in the previous
-figure. Each of the sparse, high-dimensional categorical features is first
-converted into a low-dimensional and dense real-valued vector, often
-referred to as an embedding vector. These low-dimensional dense embedding
-vectors are concatenated with the continuous features, and then fed into the
-hidden layers of a neural network in the forward pass. The embedding values
-are initialized randomly, and are trained along with all other model
-parameters to minimize the training loss. If you're interested in learning
-more about embeddings, check out the TensorFlow tutorial on
-@{$word2vec$Vector Representations of Words} or
-[Word embedding](https://en.wikipedia.org/wiki/Word_embedding) on Wikipedia.
-
-Another way to represent categorical columns to feed into a neural network
-is via a one-hot or multi-hot representation. This is often appropriate for
-categorical columns with only a few possible values. As an example of a
-one-hot representation, for the relationship column, `"Husband"` can be
-represented as [1, 0, 0, 0, 0, 0], and `"Not-in-family"` as
-[0, 1, 0, 0, 0, 0], etc. This is a fixed representation, whereas embeddings
-are more flexible and calculated at training time.
-
-We'll configure the embeddings for the categorical columns using
-`embedding_column`, and concatenate them with the continuous columns.
-We also use `indicator_column` to create multi-hot representations of some
-categorical columns.
-
-```python
-deep_columns = [
-    age,
-    education_num,
-    capital_gain,
-    capital_loss,
-    hours_per_week,
-    tf.feature_column.indicator_column(workclass),
-    tf.feature_column.indicator_column(education),
-    tf.feature_column.indicator_column(marital_status),
-    tf.feature_column.indicator_column(relationship),
-    # To show an example of embedding
-    tf.feature_column.embedding_column(occupation, dimension=8),
-]
-```
-
-The higher the `dimension` of the embedding, the more degrees of freedom the
-model will have to learn the representations of the features. For
-simplicity, we set the dimension to 8 for all feature columns here.
-Empirically, a more informed choice for the number of dimensions is to start
-with a value on the order of \\(\log_2(n)\\) or \\(k\sqrt[4]{n}\\), where
-\\(n\\) is the number of unique features in a feature column and \\(k\\) is
-a small constant (usually smaller than 10).
-
-Through dense embeddings, deep models can generalize better and make
-predictions on feature pairs that were previously unseen in the training
-data. However, it is difficult to learn effective low-dimensional
-representations for feature columns when the underlying interaction matrix
-between two feature columns is sparse and high-rank. In such cases, the
-interaction between most feature pairs should be zero except for a few, but
-dense embeddings will lead to nonzero predictions for all feature pairs, and
-thus can over-generalize. On the other hand, linear models with crossed
-features can memorize these “exception rules” effectively with fewer model
-parameters.
-
-Now, let's see how to jointly train wide and deep models and allow them to
-complement each other’s strengths and weaknesses.
-
-## Combining Wide and Deep Models into One
-
-The wide models and deep models are combined by summing up their final
-output log odds as the prediction, then feeding the prediction to a logistic
-loss function.
All the graph definition and variable allocations have already been -handled for you under the hood, so you simply need to create a -`DNNLinearCombinedClassifier`: - -```python -model = tf.estimator.DNNLinearCombinedClassifier( - model_dir='/tmp/census_model', - linear_feature_columns=base_columns + crossed_columns, - dnn_feature_columns=deep_columns, - dnn_hidden_units=[100, 50]) -``` - -## Training and Evaluating The Model - -Before we train the model, let's read in the Census dataset as we did in the -@{$wide$TensorFlow Linear Model tutorial}. See `data_download.py` as well as -`input_fn` within -[`wide_deep.py`](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py). - -After reading in the data, you can train and evaluate the model: - -```python -# Train and evaluate the model every `FLAGS.epochs_per_eval` epochs. -for n in range(FLAGS.train_epochs // FLAGS.epochs_per_eval): - model.train(input_fn=lambda: input_fn( - FLAGS.train_data, FLAGS.epochs_per_eval, True, FLAGS.batch_size)) - - results = model.evaluate(input_fn=lambda: input_fn( - FLAGS.test_data, 1, False, FLAGS.batch_size)) - - # Display evaluation metrics - print('Results at epoch', (n + 1) * FLAGS.epochs_per_eval) - print('-' * 30) - - for key in sorted(results): - print('%s: %s' % (key, results[key])) -``` - -The final output accuracy should be somewhere around 85.5%. If you'd like to -see a working end-to-end example, you can download our -[example code](https://github.com/tensorflow/models/tree/master/official/wide_deep/wide_deep.py). - -Note that this tutorial is just a quick example on a small dataset to get you -familiar with the API. Wide & Deep Learning will be even more powerful if you -try it on a large dataset with many sparse feature columns that have a large -number of possible feature values. Again, feel free to take a look at our -[research paper](https://arxiv.org/abs/1606.07792) for more ideas about how to -apply Wide & Deep Learning in real-world large-scale machine learning problems. diff --git a/tensorflow/docs_src/tutorials/word2vec.md b/tensorflow/docs_src/tutorials/word2vec.md deleted file mode 100644 index 3fe7352bd2..0000000000 --- a/tensorflow/docs_src/tutorials/word2vec.md +++ /dev/null @@ -1,405 +0,0 @@ -# Vector Representations of Words - -In this tutorial we look at the word2vec model by -[Mikolov et al.](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf) -This model is used for learning vector representations of words, called "word -embeddings". - -## Highlights - -This tutorial is meant to highlight the interesting, substantive parts of -building a word2vec model in TensorFlow. - -* We start by giving the motivation for why we would want to -represent words as vectors. -* We look at the intuition behind the model and how it is trained -(with a splash of math for good measure). -* We also show a simple implementation of the model in TensorFlow. -* Finally, we look at ways to make the naive version scale better. - -We walk through the code later during the tutorial, but if you'd prefer to dive -straight in, feel free to look at the minimalistic implementation in -[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py) -This basic example contains the code needed to download some data, train on it a -bit and visualize the result. 
-Once you get comfortable with reading and running the basic version, you can
-graduate to
-[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py)
-which is a more serious implementation that showcases some more advanced
-TensorFlow principles, such as how to efficiently use threads to move data
-into a text model, how to checkpoint during training, etc.
-
-But first, let's look at why we would want to learn word embeddings in the
-first place. Feel free to skip this section if you're an Embedding Pro and
-you'd just like to get your hands dirty with the details.
-
-## Motivation: Why Learn Word Embeddings?
-
-Image and audio processing systems work with rich, high-dimensional datasets
-encoded as vectors of the individual raw pixel-intensities for image data,
-or e.g. power spectral density coefficients for audio data. For tasks like
-object or speech recognition we know that all the information required to
-successfully perform the task is encoded in the data (because humans can
-perform these tasks from the raw data). However, natural language processing
-systems traditionally treat words as discrete atomic symbols, and therefore
-'cat' may be represented as `Id537` and 'dog' as `Id143`. These encodings
-are arbitrary, and provide no useful information to the system regarding the
-relationships that may exist between the individual symbols. This means that
-the model can leverage very little of what it has learned about 'cats' when
-it is processing data about 'dogs' (for example, that they are both animals,
-four-legged, pets, etc.). Representing words as unique, discrete ids
-furthermore leads to data sparsity, and usually means that we may need more
-data in order to successfully train statistical models. Using vector
-representations can overcome some of these obstacles.
-
- -[Vector space models](https://en.wikipedia.org/wiki/Vector_space_model) (VSMs) -represent (embed) words in a continuous vector space where semantically -similar words are mapped to nearby points ('are embedded nearby each other'). -VSMs have a long, rich history in NLP, but all methods depend in some way or -another on the -[Distributional Hypothesis](https://en.wikipedia.org/wiki/Distributional_semantics#Distributional_Hypothesis), -which states that words that appear in the same contexts share -semantic meaning. The different approaches that leverage this principle can be -divided into two categories: *count-based methods* (e.g. -[Latent Semantic Analysis](https://en.wikipedia.org/wiki/Latent_semantic_analysis)), -and *predictive methods* (e.g. -[neural probabilistic language models](http://www.scholarpedia.org/article/Neural_net_language_models)). - -This distinction is elaborated in much more detail by -[Baroni et al.](http://clic.cimec.unitn.it/marco/publications/acl2014/baroni-etal-countpredict-acl2014.pdf), -but in a nutshell: Count-based methods compute the statistics of -how often some word co-occurs with its neighbor words in a large text corpus, -and then map these count-statistics down to a small, dense vector for each word. -Predictive models directly try to predict a word from its neighbors in terms of -learned small, dense *embedding vectors* (considered parameters of the -model). - -Word2vec is a particularly computationally-efficient predictive model for -learning word embeddings from raw text. It comes in two flavors, the Continuous -Bag-of-Words model (CBOW) and the Skip-Gram model (Section 3.1 and 3.2 in [Mikolov et al.](https://arxiv.org/pdf/1301.3781.pdf)). Algorithmically, these -models are similar, except that CBOW predicts target words (e.g. 'mat') from -source context words ('the cat sits on the'), while the skip-gram does the -inverse and predicts source context-words from the target words. This inversion -might seem like an arbitrary choice, but statistically it has the effect that -CBOW smoothes over a lot of the distributional information (by treating an -entire context as one observation). For the most part, this turns out to be a -useful thing for smaller datasets. However, skip-gram treats each context-target -pair as a new observation, and this tends to do better when we have larger -datasets. We will focus on the skip-gram model in the rest of this tutorial. - - -## Scaling up with Noise-Contrastive Training - -Neural probabilistic language models are traditionally trained using the -[maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood) (ML) -principle to maximize the probability of the next word \\(w_t\\) (for "target") -given the previous words \\(h\\) (for "history") in terms of a -[*softmax* function](https://en.wikipedia.org/wiki/Softmax_function), - -$$ -\begin{align} -P(w_t | h) &= \text{softmax}(\text{score}(w_t, h)) \\ - &= \frac{\exp \{ \text{score}(w_t, h) \} } - {\sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} } -\end{align} -$$ - -where \\(\text{score}(w_t, h)\\) computes the compatibility of word \\(w_t\\) -with the context \\(h\\) (a dot product is commonly used). We train this model -by maximizing its [log-likelihood](https://en.wikipedia.org/wiki/Likelihood_function) -on the training set, i.e. by maximizing - -$$ -\begin{align} - J_\text{ML} &= \log P(w_t | h) \\ - &= \text{score}(w_t, h) - - \log \left( \sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} \right). 
-\end{align}
-$$
-
-This yields a properly normalized probabilistic model for language modeling.
-However, this is very expensive, because we need to compute and normalize
-each probability using the score for all other \\(V\\) words \\(w'\\) in the
-current context \\(h\\), *at every training step*.
-
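-To make that cost concrete, here is a small, self-contained sketch (with
-made-up sizes; it is not part of the original tutorial) contrasting a full
-softmax loss, which scores all \\(V\\) words per step, with the sampled
-`tf.nn.nce_loss` discussed below, which scores only the target plus \\(k\\)
-noise words:
-
-```python
-import tensorflow as tf
-
-V, d, batch, k = 10000, 128, 16, 64  # vocabulary, embedding dim, batch, noise words
-h = tf.random_normal([batch, d])                  # context representations
-w_t = tf.random_uniform([batch], 0, V, tf.int64)  # target word ids
-weights = tf.random_normal([V, d])
-biases = tf.zeros([V])
-
-# Full softmax: materializes a [batch, V] logit matrix at every step, O(V).
-full_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
-    labels=w_t, logits=tf.matmul(h, weights, transpose_b=True) + biases)
-
-# Sampled alternative: only the target and k sampled noise words are scored, O(k).
-sampled_loss = tf.nn.nce_loss(weights=weights, biases=biases,
-                              labels=tf.expand_dims(w_t, 1), inputs=h,
-                              num_sampled=k, num_classes=V)
-```
-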
- -On the other hand, for feature learning in word2vec we do not need a full -probabilistic model. The CBOW and skip-gram models are instead trained using a -binary classification objective ([logistic regression](https://en.wikipedia.org/wiki/Logistic_regression)) -to discriminate the real target words \\(w_t\\) from \\(k\\) imaginary (noise) words \\(\tilde w\\), in the -same context. We illustrate this below for a CBOW model. For skip-gram the -direction is simply inverted. - -
-*(Figure: noise-contrastive training for a CBOW model.)*
- -Mathematically, the objective (for each example) is to maximize - -$$J_\text{NEG} = \log Q_\theta(D=1 |w_t, h) + - k \mathop{\mathbb{E}}_{\tilde w \sim P_\text{noise}} - \left[ \log Q_\theta(D = 0 |\tilde w, h) \right]$$ - -where \\(Q_\theta(D=1 | w, h)\\) is the binary logistic regression probability -under the model of seeing the word \\(w\\) in the context \\(h\\) in the dataset -\\(D\\), calculated in terms of the learned embedding vectors \\(\theta\\). In -practice we approximate the expectation by drawing \\(k\\) contrastive words -from the noise distribution (i.e. we compute a -[Monte Carlo average](https://en.wikipedia.org/wiki/Monte_Carlo_integration)). - -This objective is maximized when the model assigns high probabilities -to the real words, and low probabilities to noise words. Technically, this is -called -[Negative Sampling](https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf), -and there is good mathematical motivation for using this loss function: -The updates it proposes approximate the updates of the softmax function in the -limit. But computationally it is especially appealing because computing the -loss function now scales only with the number of *noise words* that we -select (\\(k\\)), and not *all words* in the vocabulary (\\(V\\)). This makes it -much faster to train. We will actually make use of the very similar -[noise-contrastive estimation (NCE)](https://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf) -loss, for which TensorFlow has a handy helper function `tf.nn.nce_loss()`. - -Let's get an intuitive feel for how this would work in practice! - -## The Skip-gram Model - -As an example, let's consider the dataset - -`the quick brown fox jumped over the lazy dog` - -We first form a dataset of words and the contexts in which they appear. We -could define 'context' in any way that makes sense, and in fact people have -looked at syntactic contexts (i.e. the syntactic dependents of the current -target word, see e.g. -[Levy et al.](https://levyomer.files.wordpress.com/2014/04/dependency-based-word-embeddings-acl-2014.pdf)), -words-to-the-left of the target, words-to-the-right of the target, etc. For now, -let's stick to the vanilla definition and define 'context' as the window -of words to the left and to the right of a target word. Using a window -size of 1, we then have the dataset - -`([the, brown], quick), ([quick, fox], brown), ([brown, jumped], fox), ...` - -of `(context, target)` pairs. Recall that skip-gram inverts contexts and -targets, and tries to predict each context word from its target word, so the -task becomes to predict 'the' and 'brown' from 'quick', 'quick' and 'fox' from -'brown', etc. Therefore our dataset becomes - -`(quick, the), (quick, brown), (brown, quick), (brown, fox), ...` - -of `(input, output)` pairs. The objective function is defined over the entire -dataset, but we typically optimize this with -[stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) -(SGD) using one example at a time (or a 'minibatch' of `batch_size` examples, -where typically `16 <= batch_size <= 512`). So let's look at one step of -this process. - -Let's imagine at training step \\(t\\) we observe the first training case above, -where the goal is to predict `the` from `quick`. 
The objective function is defined over the entire dataset, but we typically
optimize it with
[stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent)
(SGD) using one example at a time (or a 'minibatch' of `batch_size` examples,
where typically `16 <= batch_size <= 512`). So let's look at one step of
this process.

Let's imagine at training step \\(t\\) we observe the first training case
above, where the goal is to predict `the` from `quick`. We select `num_noise`
noisy (contrastive) examples by drawing from some noise distribution, typically
the unigram distribution \\(P(w)\\). For simplicity let's say `num_noise=1` and
we select `sheep` as a noisy example. Next we compute the loss for this pair of
observed and noisy examples, i.e. the objective at time step \\(t\\) becomes

$$J^{(t)}_\text{NEG} = \log Q_\theta(D=1 | \text{the, quick}) +
  \log(Q_\theta(D=0 | \text{sheep, quick}))$$

The goal is to make an update to the embedding parameters \\(\theta\\) to
improve (in this case, maximize) this objective function. We do this by
deriving the gradient of the loss with respect to the embedding parameters
\\(\theta\\), i.e. \\(\frac{\partial}{\partial \theta} J_\text{NEG}\\) (luckily
TensorFlow provides easy helper functions for doing this!). We then perform an
update to the embeddings by taking a small step in the direction of the
gradient. When this process is repeated over the entire training set, it has
the effect of 'moving' the embedding vectors around for each word until the
model is successful at discriminating real words from noise words.

We can visualize the learned vectors by projecting them down to 2 dimensions
using, for instance, the
[t-SNE dimensionality reduction technique](https://lvdmaaten.github.io/tsne/).
When we inspect these visualizations it becomes apparent that the vectors
capture some general, and in fact quite useful, semantic information about
words and their relationships to one another. It was very interesting when we
first discovered that certain directions in the induced vector space specialize
towards certain semantic relationships, e.g. *male-female*, *verb tense* and
even *country-capital* relationships between words, as illustrated in the
figure below (see also, for example,
[Mikolov et al., 2013](https://www.aclweb.org/anthology/N13-1090)).
*(Figure: directions in the learned embedding space capturing male-female,
verb-tense, and country-capital relationships.)*
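These regularities can be probed directly with vector arithmetic. As a toy
illustration (the 2-D embedding values below are made up to exhibit the linear
structure; real learned embeddings are higher-dimensional):

```python
import numpy as np

# Made-up embeddings, for illustration only.
emb = {
    'king':  np.array([0.9, 0.8]),
    'queen': np.array([0.9, 0.2]),
    'man':   np.array([0.1, 0.8]),
    'woman': np.array([0.1, 0.2]),
}

# "king is to queen as man is to ?", answered by vector arithmetic.
# (Real evaluations typically exclude the query words themselves.)
query = emb['king'] - emb['man'] + emb['woman']
closest = min(emb, key=lambda w: np.linalg.norm(emb[w] - query))
print(closest)  # queen
```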
This explains why these vectors are also useful as features for many canonical
NLP prediction tasks, such as part-of-speech tagging or named entity
recognition (see for example the original work by
[Collobert et al., 2011](https://arxiv.org/abs/1103.0398)
([pdf](https://arxiv.org/pdf/1103.0398.pdf)), or follow-up work by
[Turian et al., 2010](https://www.aclweb.org/anthology/P10-1040)).

But for now, let's just use them to draw pretty pictures!

## Building the Graph

This is all about embeddings, so let's define our embedding matrix. This is
just a big random matrix to start; we'll initialize the values to be uniform
in the unit cube.

```python
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
```

The noise-contrastive estimation loss is defined in terms of a logistic
regression model. For this, we need to define the weights and biases for each
word in the vocabulary (also called the `output weights`, as opposed to the
`input embeddings`). So let's define that.

```python
nce_weights = tf.Variable(
    tf.truncated_normal([vocabulary_size, embedding_size],
                        stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
```

Now that we have the parameters in place, we can define our skip-gram model
graph. For simplicity, let's suppose we've already integerized our text corpus
with a vocabulary so that each word is represented as an integer (see
[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py)
for the details). The skip-gram model takes two inputs. One is a batch full of
integers representing the source context words; the other is for the target
words. Let's create placeholder nodes for these inputs, so that we can feed in
data later.

```python
# Placeholders for inputs
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
```

Now what we need to do is look up the vector for each of the source words in
the batch. TensorFlow has handy helpers that make this easy.

```python
embed = tf.nn.embedding_lookup(embeddings, train_inputs)
```

OK, now that we have the embeddings for each word, we'd like to try to predict
the target word using the noise-contrastive training objective.

```python
# Compute the NCE loss, using a sample of the negative labels each time.
loss = tf.reduce_mean(
  tf.nn.nce_loss(weights=nce_weights,
                 biases=nce_biases,
                 labels=train_labels,
                 inputs=embed,
                 num_sampled=num_sampled,
                 num_classes=vocabulary_size))
```

Now that we have a loss node, we need to add the nodes required to compute
gradients and update the parameters. For this we will use stochastic gradient
descent, and TensorFlow has handy helpers to make this easy as well.

```python
# We use the SGD optimizer.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss)
```

## Training the Model

Training the model is then as simple as using a `feed_dict` to push data into
the placeholders and calling
@{tf.Session.run} with this new data
in a loop.
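The loop below draws batches from a `generate_batch(...)` helper, whose real
implementation lives in `word2vec_basic.py`. As a rough, hypothetical sketch of
what such a helper might look like (assuming the corpus is already a list of
integer word ids, and ignoring the actual implementation's sliding buffer and
`num_skips`/`skip_window` options):

```python
import numpy as np

# Hypothetical sketch only; see word2vec_basic.py for the real generator.
def generate_batch(data, batch_size=8, window=1):
    """Yields (inputs, labels) batches forever from a list of word ids."""
    offsets = list(range(-window, 0)) + list(range(1, window + 1))
    while True:
        inputs = np.zeros(batch_size, dtype=np.int32)
        labels = np.zeros((batch_size, 1), dtype=np.int32)
        for k in range(batch_size):
            # Pick a target position with a full window on both sides.
            i = np.random.randint(window, len(data) - window)
            inputs[k] = data[i]                                 # target word id
            labels[k, 0] = data[i + np.random.choice(offsets)]  # context word id
        yield inputs, labels
```

With a generator along these lines in hand, the training loop is simply: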
```python
for inputs, labels in generate_batch(...):
  feed_dict = {train_inputs: inputs, train_labels: labels}
  _, cur_loss = session.run([optimizer, loss], feed_dict=feed_dict)
```

See the full example code in
[tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py).

## Visualizing the Learned Embeddings

After training has finished we can visualize the learned embeddings using
t-SNE.
*(Figure: t-SNE visualization of the learned embeddings.)*
Et voilà! As expected, words that are similar end up clustering near one
another. For a more heavyweight implementation of word2vec that showcases more
of the advanced features of TensorFlow, see the implementation in
[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).

## Evaluating Embeddings: Analogical Reasoning

Embeddings are useful for a wide variety of prediction tasks in NLP. Short of
training a full-blown part-of-speech model or named-entity model, one simple
way to evaluate embeddings is to directly use them to predict syntactic and
semantic relationships like `king is to queen as father is to ?`. This is
called *analogical reasoning*, and the task was introduced by
[Mikolov and colleagues](https://www.aclweb.org/anthology/N13-1090).
Download the dataset for this task from
[download.tensorflow.org](http://download.tensorflow.org/data/questions-words.txt).

To see how we do this evaluation, have a look at the `build_eval_graph()` and
`eval()` functions in
[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).

The choice of hyperparameters can strongly influence the accuracy on this task.
Achieving state-of-the-art performance on this task requires training over a
very large dataset, carefully tuning the hyperparameters, and making use of
tricks like subsampling the data, which is beyond the scope of this tutorial.

## Optimizing the Implementation

Our vanilla implementation showcases the flexibility of TensorFlow. For
example, changing the training objective is as simple as swapping out the call
to `tf.nn.nce_loss()` for an off-the-shelf alternative such as
`tf.nn.sampled_softmax_loss()`. If you have a new idea for a loss function, you
can manually write an expression for the new objective in TensorFlow and let
the optimizer compute its derivatives. This flexibility is invaluable in the
exploratory phase of machine learning model development, where we are trying
out several different ideas and iterating quickly.

Once you have a model structure you're satisfied with, it may be worth
optimizing your implementation to run more efficiently (and cover more data in
less time). For example, the naive code we used in this tutorial would be slow
because we use Python for reading and feeding data items -- each of which
requires very little work on the TensorFlow back-end. If you find your model is
seriously bottlenecked on input data, you may want to implement a custom data
reader for your problem, as described in
@{$new_data_formats$New Data Formats}. For the case of Skip-Gram
modeling, we've actually already done this for you as an example in
[models/tutorials/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec.py).

If your model is no longer I/O bound but you still want more performance, you
can take things further by writing your own TensorFlow Ops, as described in
@{$adding_an_op$Adding a New Op}. Again, we've provided an example of this for
the Skip-Gram case in
[models/tutorials/embedding/word2vec_optimized.py](https://www.tensorflow.org/code/tensorflow_models/tutorials/embedding/word2vec_optimized.py).
Feel free to benchmark these against each other to measure performance
improvements at each stage, as sketched below.
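One simple, implementation-agnostic way to do that comparison is to time a
fixed number of training steps and report words processed per second. The
sketch below uses placeholder names (`run_step`, `words_per_batch`) rather than
either script's actual API:

```python
import time

# Hypothetical benchmarking helper: `run_step` stands in for whatever executes
# one training step (e.g. a session.run call), and `words_per_batch` for how
# many corpus words each step consumes.
def benchmark(run_step, num_steps=1000, words_per_batch=128):
    run_step()  # warm-up, so one-time setup costs are excluded
    start = time.time()
    for _ in range(num_steps):
        run_step()
    elapsed = time.time() - start
    print('%d steps in %.2fs (%.0f words/sec)' %
          (num_steps, elapsed, num_steps * words_per_batch / elapsed))
```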
- -## Conclusion - -In this tutorial we covered the word2vec model, a computationally efficient -model for learning word embeddings. We motivated why embeddings are useful, -discussed efficient training techniques and showed how to implement all of this -in TensorFlow. Overall, we hope that this has show-cased how TensorFlow affords -you the flexibility you need for early experimentation, and the control you -later need for bespoke optimized implementation. -- cgit v1.2.3 From b46fde9a42f97d66535a2dde60642ce22473f80c Mon Sep 17 00:00:00 2001 From: Billy Lamberta Date: Tue, 3 Jul 2018 16:56:01 -0700 Subject: fix rc2 --- tensorflow/docs_src/install/install_c.md | 2 +- tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 22 +++++++++++----------- tensorflow/docs_src/install/install_linux.md | 18 +++++++++--------- tensorflow/docs_src/install/install_mac.md | 10 +++++----- tensorflow/docs_src/install/install_sources.md | 4 ++-- 6 files changed, 29 insertions(+), 29 deletions(-) diff --git a/tensorflow/docs_src/install/install_c.md b/tensorflow/docs_src/install/install_c.md index 2901848745..9aebf2bfa4 100644 --- a/tensorflow/docs_src/install/install_c.md +++ b/tensorflow/docs_src/install/install_c.md @@ -38,7 +38,7 @@ enable TensorFlow for C: OS="linux" # Change to "darwin" for macOS TARGET_DIRECTORY="/usr/local" curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 2c126df5aa..1907355341 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -38,7 +38,7 @@ steps to install this library and enable TensorFlow for Go: TF_TYPE="cpu" # Change to "gpu" for GPU support TARGET_DIRECTORY='/usr/local' curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc2.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index 692dfc9cef..1fbdcc2b47 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -36,7 +36,7 @@ following to the project's `pom.xml` to use the TensorFlow Java APIs: org.tensorflow tensorflow - 1.9.0-rc0 + 1.9.0-rc2 ``` @@ -65,7 +65,7 @@ As an example, these steps will create a Maven project that uses TensorFlow: org.tensorflow tensorflow - 1.9.0-rc0 + 1.9.0-rc2 @@ -124,12 +124,12 @@ instead: org.tensorflow libtensorflow - 1.9.0-rc0 + 1.9.0-rc2 org.tensorflow libtensorflow_jni_gpu - 1.9.0-rc0 + 1.9.0-rc2 ``` @@ -148,7 +148,7 @@ refer to the simpler instructions above instead. Take the following steps to install TensorFlow for Java on Linux or macOS: 1. 
Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), which is the TensorFlow Java Archive (JAR). 2. Decide whether you will run TensorFlow for Java on CPU(s) only or with @@ -167,7 +167,7 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: OS=$(uname -s | tr '[:upper:]' '[:lower:]') mkdir -p ./jni curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc0.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | tar -xz -C ./jni ### Install on Windows @@ -175,10 +175,10 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: Take the following steps to install TensorFlow for Java on Windows: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc0.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), which is the TensorFlow Java Archive (JAR). 2. Download the following Java Native Interface (JNI) file appropriate for - [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc0.zip). + [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc2.zip). 3. Extract this .zip file. __Note__: The native library (`tensorflow_jni.dll`) requires `msvcp140.dll` at runtime, which is included in the [Visual C++ 2015 Redistributable](https://www.microsoft.com/en-us/download/details.aspx?id=48145) package. @@ -227,7 +227,7 @@ must be part of your `classpath`. For example, you can include the downloaded `.jar` in your `classpath` by using the `-cp` compilation flag as follows: -
javac -cp libtensorflow-1.9.0-rc0.jar HelloTF.java
+
javac -cp libtensorflow-1.9.0-rc2.jar HelloTF.java
### Running @@ -241,11 +241,11 @@ two files are available to the JVM: For example, the following command line executes the `HelloTF` program on Linux and macOS X: -
java -cp libtensorflow-1.9.0-rc0.jar:. -Djava.library.path=./jni HelloTF
+
java -cp libtensorflow-1.9.0-rc2.jar:. -Djava.library.path=./jni HelloTF
And the following command line executes the `HelloTF` program on Windows: -
java -cp libtensorflow-1.9.0-rc0.jar;. -Djava.library.path=jni HelloTF
+
java -cp libtensorflow-1.9.0-rc2.jar;. -Djava.library.path=jni HelloTF
If the program prints Hello from version, you've successfully installed TensorFlow for Java and are ready to use the API. If the program diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index f21c073a1b..8efa166073 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -436,7 +436,7 @@ Take the following steps to install TensorFlow in an Anaconda environment:
      (tensorflow)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+ https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl ## Validate your installation @@ -676,14 +676,14 @@ This section documents the relevant values for Linux installations. CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp27-none-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp27-none-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -695,14 +695,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -714,14 +714,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
 
@@ -733,14 +733,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc0-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
 
diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index c6f0c17924..5b593d1ca9 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -119,7 +119,7 @@ Take the following steps to install TensorFlow with Virtualenv: TensorFlow in the active Virtualenv is as follows:
 $ pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl If you encounter installation problems, see [Common Installation Problems](#common-installation-problems). @@ -242,7 +242,7 @@ take the following steps: issue the following command:
 $ sudo pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl 
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl If the preceding command fails, see [installation problems](#common-installation-problems). @@ -350,7 +350,7 @@ Take the following steps to install TensorFlow in an Anaconda environment: TensorFlow for Python 2.7:
 (targetDirectory)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl @@ -517,7 +517,7 @@ The value you specify depends on your Python version.
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
 
@@ -525,5 +525,5 @@ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py2-none-a
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc0-py3-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
 
diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index fc1f6d05bd..3801fc0f83 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -338,10 +338,10 @@ Invoke `pip install` to install that pip package. The filename of the `.whl` file depends on your platform. For example, the following command will install the pip package -for TensorFlow 1.9.0rc0 on Linux: +for TensorFlow 1.9.0rc2 on Linux:
-$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc0-py2-none-any.whl
+$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc2-py2-none-any.whl
 
## Validate your installation -- cgit v1.2.3 From d33bc55210478d58b858704bfa92316860b777fa Mon Sep 17 00:00:00 2001 From: Amit Patankar Date: Fri, 6 Jul 2018 09:27:31 -0700 Subject: Updating the version to 1.9.0 official. --- tensorflow/core/public/version.h | 2 +- tensorflow/docs_src/install/install_c.md | 2 +- tensorflow/docs_src/install/install_go.md | 2 +- tensorflow/docs_src/install/install_java.md | 22 +++++++++++----------- tensorflow/docs_src/install/install_linux.md | 18 +++++++++--------- tensorflow/docs_src/install/install_mac.md | 10 +++++----- tensorflow/docs_src/install/install_sources.md | 4 ++-- tensorflow/tools/pip_package/setup.py | 2 +- 8 files changed, 31 insertions(+), 31 deletions(-) diff --git a/tensorflow/core/public/version.h b/tensorflow/core/public/version.h index 0e4a61ac1f..cea5e8ffb0 100644 --- a/tensorflow/core/public/version.h +++ b/tensorflow/core/public/version.h @@ -24,7 +24,7 @@ limitations under the License. // TF_VERSION_SUFFIX is non-empty for pre-releases (e.g. "-alpha", "-alpha.1", // "-beta", "-rc", "-rc.1") -#define TF_VERSION_SUFFIX "-rc2" +#define TF_VERSION_SUFFIX "" #define TF_STR_HELPER(x) #x #define TF_STR(x) TF_STR_HELPER(x) diff --git a/tensorflow/docs_src/install/install_c.md b/tensorflow/docs_src/install/install_c.md index 9aebf2bfa4..362a03cd56 100644 --- a/tensorflow/docs_src/install/install_c.md +++ b/tensorflow/docs_src/install/install_c.md @@ -38,7 +38,7 @@ enable TensorFlow for C: OS="linux" # Change to "darwin" for macOS TARGET_DIRECTORY="/usr/local" curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-${OS}-x86_64-1.9.0.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_go.md b/tensorflow/docs_src/install/install_go.md index 1907355341..a4f2e5733b 100644 --- a/tensorflow/docs_src/install/install_go.md +++ b/tensorflow/docs_src/install/install_go.md @@ -38,7 +38,7 @@ steps to install this library and enable TensorFlow for Go: TF_TYPE="cpu" # Change to "gpu" for GPU support TARGET_DIRECTORY='/usr/local' curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-${TF_TYPE}-$(go env GOOS)-x86_64-1.9.0.tar.gz" | sudo tar -C $TARGET_DIRECTORY -xz The `tar` command extracts the TensorFlow C library into the `lib` diff --git a/tensorflow/docs_src/install/install_java.md b/tensorflow/docs_src/install/install_java.md index b9c9912816..643c3b715f 100644 --- a/tensorflow/docs_src/install/install_java.md +++ b/tensorflow/docs_src/install/install_java.md @@ -36,7 +36,7 @@ following to the project's `pom.xml` to use the TensorFlow Java APIs: org.tensorflow tensorflow - 1.9.0-rc2 + 1.9.0 ``` @@ -65,7 +65,7 @@ As an example, these steps will create a Maven project that uses TensorFlow: org.tensorflow tensorflow - 1.9.0-rc2 + 1.9.0 @@ -124,12 +124,12 @@ instead: org.tensorflow libtensorflow - 1.9.0-rc2 + 1.9.0 org.tensorflow libtensorflow_jni_gpu - 1.9.0-rc2 + 1.9.0 ``` @@ -148,7 +148,7 @@ refer to the simpler instructions above instead. Take the following steps to install TensorFlow for Java on Linux or macOS: 1. 
Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0.jar), which is the TensorFlow Java Archive (JAR). 2. Decide whether you will run TensorFlow for Java on CPU(s) only or with @@ -167,7 +167,7 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: OS=$(uname -s | tr '[:upper:]' '[:lower:]') mkdir -p ./jni curl -L \ - "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0-rc2.tar.gz" | + "https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-${TF_TYPE}-${OS}-x86_64-1.9.0.tar.gz" | tar -xz -C ./jni ### Install on Windows @@ -175,10 +175,10 @@ Take the following steps to install TensorFlow for Java on Linux or macOS: Take the following steps to install TensorFlow for Java on Windows: 1. Download - [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0-rc2.jar), + [libtensorflow.jar](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-1.9.0.jar), which is the TensorFlow Java Archive (JAR). 2. Download the following Java Native Interface (JNI) file appropriate for - [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0-rc2.zip). + [TensorFlow for Java on Windows](https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow_jni-cpu-windows-x86_64-1.9.0.zip). 3. Extract this .zip file. @@ -227,7 +227,7 @@ must be part of your `classpath`. For example, you can include the downloaded `.jar` in your `classpath` by using the `-cp` compilation flag as follows: -
javac -cp libtensorflow-1.9.0-rc2.jar HelloTF.java
+
javac -cp libtensorflow-1.9.0.jar HelloTF.java
### Running @@ -241,11 +241,11 @@ two files are available to the JVM: For example, the following command line executes the `HelloTF` program on Linux and macOS X: -
java -cp libtensorflow-1.9.0-rc2.jar:. -Djava.library.path=./jni HelloTF
+
java -cp libtensorflow-1.9.0.jar:. -Djava.library.path=./jni HelloTF
And the following command line executes the `HelloTF` program on Windows: -
java -cp libtensorflow-1.9.0-rc2.jar;. -Djava.library.path=jni HelloTF
+
java -cp libtensorflow-1.9.0.jar;. -Djava.library.path=jni HelloTF
If the program prints Hello from version, you've successfully installed TensorFlow for Java and are ready to use the API. If the program diff --git a/tensorflow/docs_src/install/install_linux.md b/tensorflow/docs_src/install/install_linux.md index ae3d50ff39..abec8ca072 100644 --- a/tensorflow/docs_src/install/install_linux.md +++ b/tensorflow/docs_src/install/install_linux.md @@ -438,7 +438,7 @@ Take the following steps to install TensorFlow in an Anaconda environment:
      (tensorflow)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+ https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0-cp34-cp34m-linux_x86_64.whl ## Validate your installation @@ -678,14 +678,14 @@ This section documents the relevant values for Linux installations. CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0-cp27-none-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp27-none-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0-cp27-none-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -697,14 +697,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0-cp34-cp34m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp34-cp34m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0-cp34-cp34m-linux_x86_64.whl
 
Note that GPU support requires the NVIDIA hardware and software described in @@ -716,14 +716,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0-cp35-cp35m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp35-cp35m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0-cp35-cp35m-linux_x86_64.whl
 
@@ -735,14 +735,14 @@ Note that GPU support requires the NVIDIA hardware and software described in CPU only:
-https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-1.9.0-cp36-cp36m-linux_x86_64.whl
 
GPU support:
-https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0rc2-cp36-cp36m-linux_x86_64.whl
+https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.9.0-cp36-cp36m-linux_x86_64.whl
 
diff --git a/tensorflow/docs_src/install/install_mac.md b/tensorflow/docs_src/install/install_mac.md index 3de6da1342..167d17adb4 100644 --- a/tensorflow/docs_src/install/install_mac.md +++ b/tensorflow/docs_src/install/install_mac.md @@ -119,7 +119,7 @@ Take the following steps to install TensorFlow with Virtualenv: TensorFlow in the active Virtualenv is as follows:
 $ pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0-py3-none-any.whl If you encounter installation problems, see [Common Installation Problems](#common-installation-problems). @@ -242,7 +242,7 @@ take the following steps: issue the following command:
 $ sudo pip3 install --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl 
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0-py3-none-any.whl If the preceding command fails, see [installation problems](#common-installation-problems). @@ -350,7 +350,7 @@ Take the following steps to install TensorFlow in an Anaconda environment: TensorFlow for Python 2.7:
 (targetDirectory)$ pip install --ignore-installed --upgrade \
-     https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
+ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0-py2-none-any.whl @@ -518,7 +518,7 @@ The value you specify depends on your Python version.
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0-py2-none-any.whl
 
@@ -526,5 +526,5 @@ https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py2-none-a
-https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0rc2-py3-none-any.whl
+https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.9.0-py3-none-any.whl
 
diff --git a/tensorflow/docs_src/install/install_sources.md b/tensorflow/docs_src/install/install_sources.md index 3520f97c9a..79da209928 100644 --- a/tensorflow/docs_src/install/install_sources.md +++ b/tensorflow/docs_src/install/install_sources.md @@ -328,10 +328,10 @@ Invoke `pip install` to install that pip package. The filename of the `.whl` file depends on your platform. For example, the following command will install the pip package -for TensorFlow 1.9.0rc2 on Linux: +for TensorFlow 1.9.0 on Linux:
-$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0rc2-py2-none-any.whl
+$ sudo pip install /tmp/tensorflow_pkg/tensorflow-1.9.0-py2-none-any.whl
 
## Validate your installation diff --git a/tensorflow/tools/pip_package/setup.py b/tensorflow/tools/pip_package/setup.py index 8c077580aa..dc9d059bab 100644 --- a/tensorflow/tools/pip_package/setup.py +++ b/tensorflow/tools/pip_package/setup.py @@ -45,7 +45,7 @@ DOCLINES = __doc__.split('\n') # This version string is semver compatible, but incompatible with pip. # For pip, we will remove all '-' characters from this string, and use the # result for pip. -_VERSION = '1.9.0-rc2' +_VERSION = '1.9.0' REQUIRED_PACKAGES = [ 'absl-py >= 0.1.6', -- cgit v1.2.3 From a522d458dacd3a34c4ff2e6b76556f623fe7dbd6 Mon Sep 17 00:00:00 2001 From: Gunhan Gulsoy Date: Fri, 29 Jun 2018 22:43:22 -0700 Subject: Remove unused gcp and hdfs config flags, as these are on by default now. PiperOrigin-RevId: 202753310 --- tensorflow/tools/ci_build/ci_parameterized_build.sh | 2 +- tensorflow/tools/ci_build/ci_sanity.sh | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tensorflow/tools/ci_build/ci_parameterized_build.sh b/tensorflow/tools/ci_build/ci_parameterized_build.sh index e621f85652..6aaeb14aee 100755 --- a/tensorflow/tools/ci_build/ci_parameterized_build.sh +++ b/tensorflow/tools/ci_build/ci_parameterized_build.sh @@ -132,7 +132,7 @@ BAZEL_CMD="bazel test" BAZEL_BUILD_ONLY_CMD="bazel build" BAZEL_CLEAN_CMD="bazel clean" -DEFAULT_BAZEL_CONFIGS="--config=gcp --config=hdfs" +DEFAULT_BAZEL_CONFIGS="" PIP_CMD="${CI_BUILD_DIR}/builds/pip.sh" PIP_TEST_TUTORIALS_FLAG="--test_tutorials" diff --git a/tensorflow/tools/ci_build/ci_sanity.sh b/tensorflow/tools/ci_build/ci_sanity.sh index 05676f9551..0dd32ad1a8 100755 --- a/tensorflow/tools/ci_build/ci_sanity.sh +++ b/tensorflow/tools/ci_build/ci_sanity.sh @@ -543,7 +543,7 @@ SANITY_STEPS=("do_pylint PYTHON2" "do_pylint PYTHON3" "do_check_futures_test" "d SANITY_STEPS_DESC=("Python 2 pylint" "Python 3 pylint" "Check that python files have certain __future__ imports" "buildifier check" "bazel nobuild" "pip: license check for external dependencies" "C library: license check for external dependencies" "Java Native Library: license check for external dependencies" "Pip Smoke Test: Checking py_test dependencies exist in pip package" "Check load py_test: Check that BUILD files with py_test target properly load py_test" "Code Link Check: Check there are no broken links" "Test entries in /tensorflow/contrib/cmake/python_{modules|protos|protos_cc}.txt for validity and consistency" "Check file names for cases") INCREMENTAL_FLAG="" -DEFAULT_BAZEL_CONFIGS="--config=hdfs --config=gcp" +DEFAULT_BAZEL_CONFIGS="" # Parse command-line arguments BAZEL_FLAGS=${DEFAULT_BAZEL_CONFIGS} -- cgit v1.2.3