author     Vincent Vanhoucke <vanhoucke@google.com>    2016-01-25 10:43:05 -0800
committer  Vijay Vasudevan <vrv@google.com>            2016-01-25 10:50:07 -0800
commit     e0a4afbd33387c8a3750a4df72f79b4af5db3880 (patch)
tree       41df2bb9d2c05d5db697f05931a1ac067fbeb173
parent     b168b1abafa1c80243b5b7bb02c83832a70f5a0c (diff)
Switch notebooks to v4 format. Should be a no-op functionally.
Change: 112965179
-rw-r--r--   tensorflow/examples/udacity/1_notmnist.ipynb         952
-rw-r--r--   tensorflow/examples/udacity/2_fullyconnected.ipynb   817
-rw-r--r--   tensorflow/examples/udacity/3_regularization.ipynb   449
-rw-r--r--   tensorflow/examples/udacity/4_convolutions.ipynb     641
-rw-r--r--   tensorflow/examples/udacity/5_word2vec.ipynb           3
-rw-r--r--   tensorflow/examples/udacity/6_lstm.ipynb            1373
6 files changed, 2861 insertions, 1374 deletions
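
The commit message does not record how the v3-to-v4 upgrade was performed; only the resulting JSON is committed. For context, here is a minimal sketch of how such a conversion is typically scripted with the nbformat library (the glob pattern simply matches the six files in the diffstat above; this is an illustration, not the tool used for this change):

    # Hypothetical upgrade script -- not part of this commit. Assumes the
    # nbformat package is installed (pip install nbformat).
    import glob
    import nbformat

    for path in glob.glob('tensorflow/examples/udacity/*.ipynb'):
        # Reading with as_version=4 upgrades older (v3) notebooks in memory.
        nb = nbformat.read(path, as_version=4)
        nbformat.validate(nb)     # sanity-check the upgraded structure
        nbformat.write(nb, path)  # write back as nbformat 4 JSON

Round-tripping through nbformat leaves cell sources and outputs intact, which is consistent with the "no-op functionally" claim above.
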
diff --git a/tensorflow/examples/udacity/1_notmnist.ipynb b/tensorflow/examples/udacity/1_notmnist.ipynb
index f39fc0f07a..e32e4bf030 100644
--- a/tensorflow/examples/udacity/1_notmnist.ipynb
+++ b/tensorflow/examples/udacity/1_notmnist.ipynb
@@ -1,396 +1,652 @@
{
- "worksheets": [
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "version": "0.3.2",
+ "views": {},
+ "default_view": {},
+ "name": "1_notmnist.ipynb",
+ "provenance": []
+ }
+ },
+ "cells": [
{
- "cells": [
- {
- "metadata": {
- "id": "5hIbr52I7Z7U",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Deep Learning\n=============\n\nAssignment 1\n------------\n\nThe objective of this assignment is to learn about simple data curation practices, and familiarize you with some of the data we'll be reusing later.\n\nThis notebook uses the [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) dataset to be used with python experiments. This dataset is designed to look like the classic [MNIST](http://yann.lecun.com/exdb/mnist/) dataset, while looking a little more like real data: it's a harder task, and the data is a lot less 'clean' than MNIST."
- },
- {
- "metadata": {
- "id": "apJbCsBHl-2A",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "# These are all the modules we'll be using later. Make sure you can import them\n# before proceeding further.\nimport matplotlib.pyplot as plt\nimport numpy as np\nimport os\nimport tarfile\nimport urllib\nfrom IPython.display import display, Image\nfrom scipy import ndimage\nfrom sklearn.linear_model import LogisticRegression\nimport cPickle as pickle",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "jNWGtZaXn-5j",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "First, we'll download the dataset to our local machine. The data consists of characters rendered in a variety of fonts on a 28x28 image. The labels are limited to 'A' through 'J' (10 classes). The training set has about 500k and the testset 19000 labelled examples. Given these sizes, it should be possible to train models quickly on any machine."
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5hIbr52I7Z7U",
+ "colab_type": "text"
+ },
+ "source": [
+ "Deep Learning\n",
+ "=============\n",
+ "\n",
+ "Assignment 1\n",
+ "------------\n",
+ "\n",
+ "The objective of this assignment is to learn about simple data curation practices, and familiarize you with some of the data we'll be reusing later.\n",
+ "\n",
+ "This notebook uses the [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) dataset to be used with python experiments. This dataset is designed to look like the classic [MNIST](http://yann.lecun.com/exdb/mnist/) dataset, while looking a little more like real data: it's a harder task, and the data is a lot less 'clean' than MNIST."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "apJbCsBHl-2A",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "EYRJ4ICW6-da",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 186058,
- "status": "ok",
- "timestamp": 1444485672507,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "2a0a5e044bb03b66",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "0d0f85df-155f-4a89-8e7e-ee32df36ec8d"
+ "cellView": "both"
+ },
+ "source": [
+ "# These are all the modules we'll be using later. Make sure you can import them\n",
+ "# before proceeding further.\n",
+ "import matplotlib.pyplot as plt\n",
+ "import numpy as np\n",
+ "import os\n",
+ "import tarfile\n",
+ "import urllib\n",
+ "from IPython.display import display, Image\n",
+ "from scipy import ndimage\n",
+ "from sklearn.linear_model import LogisticRegression\n",
+ "import cPickle as pickle"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "jNWGtZaXn-5j",
+ "colab_type": "text"
+ },
+ "source": [
+ "First, we'll download the dataset to our local machine. The data consists of characters rendered in a variety of fonts on a 28x28 image. The labels are limited to 'A' through 'J' (10 classes). The training set has about 500k and the testset 19000 labelled examples. Given these sizes, it should be possible to train models quickly on any machine."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "EYRJ4ICW6-da",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "url = 'http://yaroslavvb.com/upload/notMNIST/'\n\ndef maybe_download(filename, expected_bytes):\n \"\"\"Download a file if not present, and make sure it's the right size.\"\"\"\n if not os.path.exists(filename):\n filename, _ = urllib.urlretrieve(url + filename, filename)\n statinfo = os.stat(filename)\n if statinfo.st_size == expected_bytes:\n print 'Found and verified', filename\n else:\n raise Exception(\n 'Failed to verify' + filename + '. Can you get to it with a browser?')\n return filename\n\ntrain_filename = maybe_download('notMNIST_large.tar.gz', 247336696)\ntest_filename = maybe_download('notMNIST_small.tar.gz', 8458043)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Found and verified notMNIST_large.tar.gz\nFound and verified notMNIST_small.tar.gz\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "cC3p0oEyF8QT",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 186058,
+ "status": "ok",
+ "timestamp": 1444485672507,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "2a0a5e044bb03b66",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Extract the dataset from the compressed .tar.gz file.\nThis should give you a set of directories, labelled A through J."
+ "user_tz": 420
},
+ "outputId": "0d0f85df-155f-4a89-8e7e-ee32df36ec8d"
+ },
+ "source": [
+ "url = 'http://yaroslavvb.com/upload/notMNIST/'\n",
+ "\n",
+ "def maybe_download(filename, expected_bytes):\n",
+ " \"\"\"Download a file if not present, and make sure it's the right size.\"\"\"\n",
+ " if not os.path.exists(filename):\n",
+ " filename, _ = urllib.urlretrieve(url + filename, filename)\n",
+ " statinfo = os.stat(filename)\n",
+ " if statinfo.st_size == expected_bytes:\n",
+ " print 'Found and verified', filename\n",
+ " else:\n",
+ " raise Exception(\n",
+ " 'Failed to verify' + filename + '. Can you get to it with a browser?')\n",
+ " return filename\n",
+ "\n",
+ "train_filename = maybe_download('notMNIST_large.tar.gz', 247336696)\n",
+ "test_filename = maybe_download('notMNIST_small.tar.gz', 8458043)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "H8CBE-WZ8nmj",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 186055,
- "status": "ok",
- "timestamp": 1444485672525,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "2a0a5e044bb03b66",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "ef6c790c-2513-4b09-962e-27c79390c762"
+ "output_type": "stream",
+ "text": [
+ "Found and verified notMNIST_large.tar.gz\n",
+ "Found and verified notMNIST_small.tar.gz\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cC3p0oEyF8QT",
+ "colab_type": "text"
+ },
+ "source": [
+ "Extract the dataset from the compressed .tar.gz file.\n",
+ "This should give you a set of directories, labelled A through J."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "H8CBE-WZ8nmj",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "num_classes = 10\n\ndef extract(filename):\n tar = tarfile.open(filename)\n tar.extractall()\n tar.close()\n root = os.path.splitext(os.path.splitext(filename)[0])[0] # remove .tar.gz\n data_folders = [os.path.join(root, d) for d in sorted(os.listdir(root))]\n if len(data_folders) != num_classes:\n raise Exception(\n 'Expected %d folders, one per class. Found %d instead.' % (\n num_classes, len(data_folders)))\n print data_folders\n return data_folders\n \ntrain_folders = extract(train_filename)\ntest_folders = extract(test_filename)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "['notMNIST_large/A', 'notMNIST_large/B', 'notMNIST_large/C', 'notMNIST_large/D', 'notMNIST_large/E', 'notMNIST_large/F', 'notMNIST_large/G', 'notMNIST_large/H', 'notMNIST_large/I', 'notMNIST_large/J']\n['notMNIST_small/A', 'notMNIST_small/B', 'notMNIST_small/C', 'notMNIST_small/D', 'notMNIST_small/E', 'notMNIST_small/F', 'notMNIST_small/G', 'notMNIST_small/H', 'notMNIST_small/I', 'notMNIST_small/J']\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "4riXK3IoHgx6",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 1\n---------\n\nLet's take a peek at some of the data to make sure it looks sensible. Each exemplar should be an image of a character A through J rendered in a different font. Display a sample of the images that we just downloaded. Hint: you can use the package IPython.display.\n\n---"
- },
- {
- "metadata": {
- "id": "PBdkjESPK8tw",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 186055,
+ "status": "ok",
+ "timestamp": 1444485672525,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "2a0a5e044bb03b66",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Now let's load the data in a more manageable format.\n\nWe'll convert the entire dataset into a 3D array (image index, x, y) of floating point values, normalized to have approximately zero mean and standard deviation ~0.5 to make training easier down the road. The labels will be stored into a separate array of integers 0 through 9.\n\nA few images might not be readable, we'll just skip them."
+ "user_tz": 420
},
+ "outputId": "ef6c790c-2513-4b09-962e-27c79390c762"
+ },
+ "source": [
+ "num_classes = 10\n",
+ "\n",
+ "def extract(filename):\n",
+ " tar = tarfile.open(filename)\n",
+ " tar.extractall()\n",
+ " tar.close()\n",
+ " root = os.path.splitext(os.path.splitext(filename)[0])[0] # remove .tar.gz\n",
+ " data_folders = [os.path.join(root, d) for d in sorted(os.listdir(root))]\n",
+ " if len(data_folders) != num_classes:\n",
+ " raise Exception(\n",
+ " 'Expected %d folders, one per class. Found %d instead.' % (\n",
+ " num_classes, len(data_folders)))\n",
+ " print data_folders\n",
+ " return data_folders\n",
+ " \n",
+ "train_folders = extract(train_filename)\n",
+ "test_folders = extract(test_filename)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "h7q0XhG3MJdf",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 30
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 399874,
- "status": "ok",
- "timestamp": 1444485886378,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "2a0a5e044bb03b66",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "92c391bb-86ff-431d-9ada-315568a19e59"
+ "output_type": "stream",
+ "text": [
+ "['notMNIST_large/A', 'notMNIST_large/B', 'notMNIST_large/C', 'notMNIST_large/D', 'notMNIST_large/E', 'notMNIST_large/F', 'notMNIST_large/G', 'notMNIST_large/H', 'notMNIST_large/I', 'notMNIST_large/J']\n",
+ "['notMNIST_small/A', 'notMNIST_small/B', 'notMNIST_small/C', 'notMNIST_small/D', 'notMNIST_small/E', 'notMNIST_small/F', 'notMNIST_small/G', 'notMNIST_small/H', 'notMNIST_small/I', 'notMNIST_small/J']\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4riXK3IoHgx6",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 1\n",
+ "---------\n",
+ "\n",
+ "Let's take a peek at some of the data to make sure it looks sensible. Each exemplar should be an image of a character A through J rendered in a different font. Display a sample of the images that we just downloaded. Hint: you can use the package IPython.display.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "PBdkjESPK8tw",
+ "colab_type": "text"
+ },
+ "source": [
+ "Now let's load the data in a more manageable format.\n",
+ "\n",
+ "We'll convert the entire dataset into a 3D array (image index, x, y) of floating point values, normalized to have approximately zero mean and standard deviation ~0.5 to make training easier down the road. The labels will be stored into a separate array of integers 0 through 9.\n",
+ "\n",
+ "A few images might not be readable, we'll just skip them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "h7q0XhG3MJdf",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "image_size = 28 # Pixel width and height.\npixel_depth = 255.0 # Number of levels per pixel.\n\ndef load(data_folders, min_num_images, max_num_images):\n dataset = np.ndarray(\n shape=(max_num_images, image_size, image_size), dtype=np.float32)\n labels = np.ndarray(shape=(max_num_images), dtype=np.int32)\n label_index = 0\n image_index = 0\n for folder in data_folders:\n print folder\n for image in os.listdir(folder):\n if image_index >= max_num_images:\n raise Exception('More images than expected: %d >= %d' % (\n num_images, max_num_images))\n image_file = os.path.join(folder, image)\n try:\n image_data = (ndimage.imread(image_file).astype(float) -\n pixel_depth / 2) / pixel_depth\n if image_data.shape != (image_size, image_size):\n raise Exception('Unexpected image shape: %s' % str(image_data.shape))\n dataset[image_index, :, :] = image_data\n labels[image_index] = label_index\n image_index += 1\n except IOError as e:\n print 'Could not read:', image_file, ':', e, '- it\\'s ok, skipping.'\n label_index += 1\n num_images = image_index\n dataset = dataset[0:num_images, :, :]\n labels = labels[0:num_images]\n if num_images < min_num_images:\n raise Exception('Many fewer images than expected: %d < %d' % (\n num_images, min_num_images))\n print 'Full dataset tensor:', dataset.shape\n print 'Mean:', np.mean(dataset)\n print 'Standard deviation:', np.std(dataset)\n print 'Labels:', labels.shape\n return dataset, labels\ntrain_dataset, train_labels = load(train_folders, 450000, 550000)\ntest_dataset, test_labels = load(test_folders, 18000, 20000)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "notMNIST_large/A\nCould not read: notMNIST_large/A/SG90IE11c3RhcmQgQlROIFBvc3Rlci50dGY=.png : cannot identify image file - it's ok, skipping.\nCould not read: notMNIST_large/A/RnJlaWdodERpc3BCb29rSXRhbGljLnR0Zg==.png : cannot identify image file - it's ok, skipping.\nCould not read: notMNIST_large/A/Um9tYW5hIEJvbGQucGZi.png : cannot identify image file - it's ok, skipping.\nnotMNIST_large/B\nCould not read: notMNIST_large/B/TmlraXNFRi1TZW1pQm9sZEl0YWxpYy5vdGY=.png : cannot identify image file - it's ok, skipping.\nnotMNIST_large/C\nnotMNIST_large/D\nCould not read: notMNIST_large/D/VHJhbnNpdCBCb2xkLnR0Zg==.png : cannot identify image file - it's ok, skipping.\nnotMNIST_large/E\nnotMNIST_large/F\nnotMNIST_large/G\nnotMNIST_large/H\nnotMNIST_large/I\nnotMNIST_large/J\nFull dataset tensor: (529114, 28, 28)\nMean: -0.0816593\nStandard deviation: 0.454232\nLabels: (529114,)\nnotMNIST_small/A\nCould not read: notMNIST_small/A/RGVtb2NyYXRpY2FCb2xkT2xkc3R5bGUgQm9sZC50dGY=.png : cannot identify image file - it's ok, skipping.\nnotMNIST_small/B\nnotMNIST_small/C\nnotMNIST_small/D\nnotMNIST_small/E\nnotMNIST_small/F\nCould not read: notMNIST_small/F/Q3Jvc3NvdmVyIEJvbGRPYmxpcXVlLnR0Zg==.png : cannot identify image file - it's ok, skipping.\nnotMNIST_small/G\nnotMNIST_small/H\nnotMNIST_small/I\nnotMNIST_small/J\nFull dataset tensor: (18724, 28, 28)\nMean: -0.0746364\nStandard deviation: 0.458622\nLabels: (18724,)\n"
+ "item_id": 30
}
]
},
- {
- "metadata": {
- "id": "vUdbskYE2d87",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 399874,
+ "status": "ok",
+ "timestamp": 1444485886378,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "2a0a5e044bb03b66",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "---\nProblem 2\n---------\n\nLet's verify that the data still looks good. Displaying a sample of the labels and images from the ndarray. Hint: you can use matplotlib.pyplot.\n\n---"
+ "user_tz": 420
},
+ "outputId": "92c391bb-86ff-431d-9ada-315568a19e59"
+ },
+ "source": [
+ "image_size = 28 # Pixel width and height.\n",
+ "pixel_depth = 255.0 # Number of levels per pixel.\n",
+ "\n",
+ "def load(data_folders, min_num_images, max_num_images):\n",
+ " dataset = np.ndarray(\n",
+ " shape=(max_num_images, image_size, image_size), dtype=np.float32)\n",
+ " labels = np.ndarray(shape=(max_num_images), dtype=np.int32)\n",
+ " label_index = 0\n",
+ " image_index = 0\n",
+ " for folder in data_folders:\n",
+ " print folder\n",
+ " for image in os.listdir(folder):\n",
+ " if image_index >= max_num_images:\n",
+ " raise Exception('More images than expected: %d >= %d' % (\n",
+ " num_images, max_num_images))\n",
+ " image_file = os.path.join(folder, image)\n",
+ " try:\n",
+ " image_data = (ndimage.imread(image_file).astype(float) -\n",
+ " pixel_depth / 2) / pixel_depth\n",
+ " if image_data.shape != (image_size, image_size):\n",
+ " raise Exception('Unexpected image shape: %s' % str(image_data.shape))\n",
+ " dataset[image_index, :, :] = image_data\n",
+ " labels[image_index] = label_index\n",
+ " image_index += 1\n",
+ " except IOError as e:\n",
+ " print 'Could not read:', image_file, ':', e, '- it\\'s ok, skipping.'\n",
+ " label_index += 1\n",
+ " num_images = image_index\n",
+ " dataset = dataset[0:num_images, :, :]\n",
+ " labels = labels[0:num_images]\n",
+ " if num_images < min_num_images:\n",
+ " raise Exception('Many fewer images than expected: %d < %d' % (\n",
+ " num_images, min_num_images))\n",
+ " print 'Full dataset tensor:', dataset.shape\n",
+ " print 'Mean:', np.mean(dataset)\n",
+ " print 'Standard deviation:', np.std(dataset)\n",
+ " print 'Labels:', labels.shape\n",
+ " return dataset, labels\n",
+ "train_dataset, train_labels = load(train_folders, 450000, 550000)\n",
+ "test_dataset, test_labels = load(test_folders, 18000, 20000)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "GPTCnjIcyuKN",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Next, we'll randomize the data. It's important to have the labels well shuffled for the training and test distributions to match."
- },
- {
- "metadata": {
- "id": "6WZ2l2tN2zOL",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "np.random.seed(133)\ndef randomize(dataset, labels):\n permutation = np.random.permutation(labels.shape[0])\n shuffled_dataset = dataset[permutation,:,:]\n shuffled_labels = labels[permutation]\n return shuffled_dataset, shuffled_labels\ntrain_dataset, train_labels = randomize(train_dataset, train_labels)\ntest_dataset, test_labels = randomize(test_dataset, test_labels)",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "puDUTe6t6USl",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 3\n---------\nConvince yourself that the data is still good after shuffling!\n\n---"
- },
- {
- "metadata": {
- "id": "cYznx5jUwzoO",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 4\n---------\nAnother check: we expect the data to be balanced across classes. Verify that.\n\n---"
- },
- {
- "metadata": {
- "id": "LA7M7K22ynCt",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Prune the training data as needed. Depending on your computer setup, you might not be able to fit it all in memory, and you can tune train_size as needed.\n\nAlso create a validation dataset for hyperparameter tuning."
+ "output_type": "stream",
+ "text": [
+ "notMNIST_large/A\n",
+ "Could not read: notMNIST_large/A/SG90IE11c3RhcmQgQlROIFBvc3Rlci50dGY=.png : cannot identify image file - it's ok, skipping.\n",
+ "Could not read: notMNIST_large/A/RnJlaWdodERpc3BCb29rSXRhbGljLnR0Zg==.png : cannot identify image file - it's ok, skipping.\n",
+ "Could not read: notMNIST_large/A/Um9tYW5hIEJvbGQucGZi.png : cannot identify image file - it's ok, skipping.\n",
+ "notMNIST_large/B\n",
+ "Could not read: notMNIST_large/B/TmlraXNFRi1TZW1pQm9sZEl0YWxpYy5vdGY=.png : cannot identify image file - it's ok, skipping.\n",
+ "notMNIST_large/C\n",
+ "notMNIST_large/D\n",
+ "Could not read: notMNIST_large/D/VHJhbnNpdCBCb2xkLnR0Zg==.png : cannot identify image file - it's ok, skipping.\n",
+ "notMNIST_large/E\n",
+ "notMNIST_large/F\n",
+ "notMNIST_large/G\n",
+ "notMNIST_large/H\n",
+ "notMNIST_large/I\n",
+ "notMNIST_large/J\n",
+ "Full dataset tensor: (529114, 28, 28)\n",
+ "Mean: -0.0816593\n",
+ "Standard deviation: 0.454232\n",
+ "Labels: (529114,)\n",
+ "notMNIST_small/A\n",
+ "Could not read: notMNIST_small/A/RGVtb2NyYXRpY2FCb2xkT2xkc3R5bGUgQm9sZC50dGY=.png : cannot identify image file - it's ok, skipping.\n",
+ "notMNIST_small/B\n",
+ "notMNIST_small/C\n",
+ "notMNIST_small/D\n",
+ "notMNIST_small/E\n",
+ "notMNIST_small/F\n",
+ "Could not read: notMNIST_small/F/Q3Jvc3NvdmVyIEJvbGRPYmxpcXVlLnR0Zg==.png : cannot identify image file - it's ok, skipping.\n",
+ "notMNIST_small/G\n",
+ "notMNIST_small/H\n",
+ "notMNIST_small/I\n",
+ "notMNIST_small/J\n",
+ "Full dataset tensor: (18724, 28, 28)\n",
+ "Mean: -0.0746364\n",
+ "Standard deviation: 0.458622\n",
+ "Labels: (18724,)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vUdbskYE2d87",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 2\n",
+ "---------\n",
+ "\n",
+ "Let's verify that the data still looks good. Displaying a sample of the labels and images from the ndarray. Hint: you can use matplotlib.pyplot.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "GPTCnjIcyuKN",
+ "colab_type": "text"
+ },
+ "source": [
+ "Next, we'll randomize the data. It's important to have the labels well shuffled for the training and test distributions to match."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "6WZ2l2tN2zOL",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "s3mWgZLpyuzq",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 411281,
- "status": "ok",
- "timestamp": 1444485897869,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "2a0a5e044bb03b66",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "8af66da6-902d-4719-bedc-7c9fb7ae7948"
+ "cellView": "both"
+ },
+ "source": [
+ "np.random.seed(133)\n",
+ "def randomize(dataset, labels):\n",
+ " permutation = np.random.permutation(labels.shape[0])\n",
+ " shuffled_dataset = dataset[permutation,:,:]\n",
+ " shuffled_labels = labels[permutation]\n",
+ " return shuffled_dataset, shuffled_labels\n",
+ "train_dataset, train_labels = randomize(train_dataset, train_labels)\n",
+ "test_dataset, test_labels = randomize(test_dataset, test_labels)"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "puDUTe6t6USl",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 3\n",
+ "---------\n",
+ "Convince yourself that the data is still good after shuffling!\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "cYznx5jUwzoO",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 4\n",
+ "---------\n",
+ "Another check: we expect the data to be balanced across classes. Verify that.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "LA7M7K22ynCt",
+ "colab_type": "text"
+ },
+ "source": [
+ "Prune the training data as needed. Depending on your computer setup, you might not be able to fit it all in memory, and you can tune train_size as needed.\n",
+ "\n",
+ "Also create a validation dataset for hyperparameter tuning."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "s3mWgZLpyuzq",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "train_size = 200000\nvalid_size = 10000\n\nvalid_dataset = train_dataset[:valid_size,:,:]\nvalid_labels = train_labels[:valid_size]\ntrain_dataset = train_dataset[valid_size:valid_size+train_size,:,:]\ntrain_labels = train_labels[valid_size:valid_size+train_size]\nprint 'Training', train_dataset.shape, train_labels.shape\nprint 'Validation', valid_dataset.shape, valid_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training (200000, 28, 28) (200000,)\nValidation (10000, 28, 28) (10000,)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "tIQJaJuwg5Hw",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 411281,
+ "status": "ok",
+ "timestamp": 1444485897869,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "2a0a5e044bb03b66",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Finally, let's save the data for later reuse:"
+ "user_tz": 420
},
+ "outputId": "8af66da6-902d-4719-bedc-7c9fb7ae7948"
+ },
+ "source": [
+ "train_size = 200000\n",
+ "valid_size = 10000\n",
+ "\n",
+ "valid_dataset = train_dataset[:valid_size,:,:]\n",
+ "valid_labels = train_labels[:valid_size]\n",
+ "train_dataset = train_dataset[valid_size:valid_size+train_size,:,:]\n",
+ "train_labels = train_labels[valid_size:valid_size+train_size]\n",
+ "print 'Training', train_dataset.shape, train_labels.shape\n",
+ "print 'Validation', valid_dataset.shape, valid_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "QiR_rETzem6C",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "pickle_file = 'notMNIST.pickle'\n\ntry:\n f = open(pickle_file, 'wb')\n save = {\n 'train_dataset': train_dataset,\n 'train_labels': train_labels,\n 'valid_dataset': valid_dataset,\n 'valid_labels': valid_labels,\n 'test_dataset': test_dataset,\n 'test_labels': test_labels,\n }\n pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)\n f.close()\nexcept Exception as e:\n print 'Unable to save data to', pickle_file, ':', e\n raise",
- "language": "python",
- "outputs": []
+ "output_type": "stream",
+ "text": [
+ "Training (200000, 28, 28) (200000,)\n",
+ "Validation (10000, 28, 28) (10000,)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tIQJaJuwg5Hw",
+ "colab_type": "text"
+ },
+ "source": [
+ "Finally, let's save the data for later reuse:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "QiR_rETzem6C",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "hQbLjrW_iT39",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 413065,
- "status": "ok",
- "timestamp": 1444485899688,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "2a0a5e044bb03b66",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "b440efc6-5ee1-4cbc-d02d-93db44ebd956"
+ "cellView": "both"
+ },
+ "source": [
+ "pickle_file = 'notMNIST.pickle'\n",
+ "\n",
+ "try:\n",
+ " f = open(pickle_file, 'wb')\n",
+ " save = {\n",
+ " 'train_dataset': train_dataset,\n",
+ " 'train_labels': train_labels,\n",
+ " 'valid_dataset': valid_dataset,\n",
+ " 'valid_labels': valid_labels,\n",
+ " 'test_dataset': test_dataset,\n",
+ " 'test_labels': test_labels,\n",
+ " }\n",
+ " pickle.dump(save, f, pickle.HIGHEST_PROTOCOL)\n",
+ " f.close()\n",
+ "except Exception as e:\n",
+ " print 'Unable to save data to', pickle_file, ':', e\n",
+ " raise"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "hQbLjrW_iT39",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "statinfo = os.stat(pickle_file)\nprint 'Compressed pickle size:', statinfo.st_size",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Compressed pickle size: 718193801\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "gE_cRAQB33lk",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 413065,
+ "status": "ok",
+ "timestamp": 1444485899688,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "2a0a5e044bb03b66",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "---\nProblem 5\n---------\n\nBy construction, this dataset might contain a lot of overlapping samples, including training data that's also contained in the validation and test set! Overlap between training and test can skew the results if you expect to use your model in an environment where there is never an overlap, but are actually ok if you expect to see training samples recur when you use it.\nMeasure how much overlap there is between training, validation and test samples.\nOptional questions:\n- What about near duplicates between datasets? (images that are almost identical)\n- Create a sanitized validation and test set, and compare your accuracy on those in subsequent assignments.\n---"
+ "user_tz": 420
},
+ "outputId": "b440efc6-5ee1-4cbc-d02d-93db44ebd956"
+ },
+ "source": [
+ "statinfo = os.stat(pickle_file)\n",
+ "print 'Compressed pickle size:', statinfo.st_size"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "L8oww1s4JMQx",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 6\n---------\n\nLet's get an idea of what an off-the-shelf classifier can give you on this data. It's always good to check that there is something to learn, and that it's a problem that is not so trivial that a canned solution solves it.\n\nTrain a simple model on this data using 50, 100, 1000 and 5000 training samples. Hint: you can use the LogisticRegression model from sklearn.linear_model.\n\nOptional question: train an off-the-shelf model on all the data!\n\n---"
+ "output_type": "stream",
+ "text": [
+ "Compressed pickle size: 718193801\n"
+ ],
+ "name": "stdout"
}
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "gE_cRAQB33lk",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 5\n",
+ "---------\n",
+ "\n",
+ "By construction, this dataset might contain a lot of overlapping samples, including training data that's also contained in the validation and test set! Overlap between training and test can skew the results if you expect to use your model in an environment where there is never an overlap, but are actually ok if you expect to see training samples recur when you use it.\n",
+ "Measure how much overlap there is between training, validation and test samples.\n",
+ "Optional questions:\n",
+ "- What about near duplicates between datasets? (images that are almost identical)\n",
+ "- Create a sanitized validation and test set, and compare your accuracy on those in subsequent assignments.\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "L8oww1s4JMQx",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 6\n",
+ "---------\n",
+ "\n",
+ "Let's get an idea of what an off-the-shelf classifier can give you on this data. It's always good to check that there is something to learn, and that it's a problem that is not so trivial that a canned solution solves it.\n",
+ "\n",
+ "Train a simple model on this data using 50, 100, 1000 and 5000 training samples. Hint: you can use the LogisticRegression model from sklearn.linear_model.\n",
+ "\n",
+ "Optional question: train an off-the-shelf model on all the data!\n",
+ "\n",
+ "---"
]
}
- ],
- "metadata": {
- "name": "1_notmnist.ipynb",
- "colabVersion": "0.3.2",
- "colab_views": {},
- "colab_default_view": {}
- },
- "nbformat": 3,
- "nbformat_minor": 0
-}
+ ]
+} \ No newline at end of file
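
As the hunks above show, the v3-to-v4 change is mostly mechanical: the "worksheets" wrapper is dropped in favor of a top-level "cells" list, code cells store their code as a "source" list of lines instead of an "input" string plus a "language" field, stream outputs use "name" instead of "stream" (with "text" as a list of lines), and code cells gain an "execution_count" field. A rough sketch of that mapping for a single code cell, purely for illustration (a real conversion should go through nbformat rather than hand-written code like this):

    # Illustrative v3 -> v4 mapping for one code cell, mirroring the renames
    # visible in the diff above; not the converter used for this commit.
    def v3_code_cell_to_v4(cell_v3):
        return {
            'cell_type': 'code',
            'metadata': cell_v3.get('metadata', {}),
            # v3: code as one string under 'input'; v4: list of lines under 'source'.
            'source': cell_v3['input'].splitlines(keepends=True),
            'outputs': [
                {'output_type': 'stream',
                 'name': out['stream'],                        # 'stream' -> 'name'
                 'text': out['text'].splitlines(keepends=True)}
                if out.get('output_type') == 'stream' else out
                for out in cell_v3.get('outputs', [])
            ],
            'execution_count': None,  # new in v4 (null/0 when never executed)
        }
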
diff --git a/tensorflow/examples/udacity/2_fullyconnected.ipynb b/tensorflow/examples/udacity/2_fullyconnected.ipynb
index fb5be12ac0..5abcfd3be9 100644
--- a/tensorflow/examples/udacity/2_fullyconnected.ipynb
+++ b/tensorflow/examples/udacity/2_fullyconnected.ipynb
@@ -1,311 +1,584 @@
{
- "worksheets": [
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "version": "0.3.2",
+ "views": {},
+ "default_view": {},
+ "name": "2_fullyconnected.ipynb",
+ "provenance": []
+ }
+ },
+ "cells": [
{
- "cells": [
- {
- "metadata": {
- "id": "kR-4eNdK6lYS",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Deep Learning\n=============\n\nAssignment 2\n------------\n\nPreviously in `1_notmnist.ipynb`, we created a pickle with formatted datasets for training, development and testing on the [notMNIST dataset](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html).\n\nThe goal of this assignment is to progressively train deeper and more accurate models using TensorFlow."
- },
- {
- "metadata": {
- "id": "JLpLa8Jt7Vu4",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "# These are all the modules we'll be using later. Make sure you can import them\n# before proceeding further.\nimport cPickle as pickle\nimport numpy as np\nimport tensorflow as tf",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "1HrCK6e17WzV",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "First reload the data we generated in `1_notmist.ipynb`."
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kR-4eNdK6lYS",
+ "colab_type": "text"
+ },
+ "source": [
+ "Deep Learning\n",
+ "=============\n",
+ "\n",
+ "Assignment 2\n",
+ "------------\n",
+ "\n",
+ "Previously in `1_notmnist.ipynb`, we created a pickle with formatted datasets for training, development and testing on the [notMNIST dataset](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html).\n",
+ "\n",
+ "The goal of this assignment is to progressively train deeper and more accurate models using TensorFlow."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "JLpLa8Jt7Vu4",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "y3-cj1bpmuxc",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 19456,
- "status": "ok",
- "timestamp": 1449847956073,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "0ddb1607-1fc4-4ddb-de28-6c7ab7fb0c33"
+ "cellView": "both"
+ },
+ "source": [
+ "# These are all the modules we'll be using later. Make sure you can import them\n",
+ "# before proceeding further.\n",
+ "import cPickle as pickle\n",
+ "import numpy as np\n",
+ "import tensorflow as tf"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1HrCK6e17WzV",
+ "colab_type": "text"
+ },
+ "source": [
+ "First reload the data we generated in `1_notmist.ipynb`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "y3-cj1bpmuxc",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "pickle_file = 'notMNIST.pickle'\n\nwith open(pickle_file, 'rb') as f:\n save = pickle.load(f)\n train_dataset = save['train_dataset']\n train_labels = save['train_labels']\n valid_dataset = save['valid_dataset']\n valid_labels = save['valid_labels']\n test_dataset = save['test_dataset']\n test_labels = save['test_labels']\n del save # hint to help gc free up memory\n print 'Training set', train_dataset.shape, train_labels.shape\n print 'Validation set', valid_dataset.shape, valid_labels.shape\n print 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 28, 28) (200000,)\nValidation set (10000, 28, 28) (10000,)\nTest set (18724, 28, 28) (18724,)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "L7aHrm6nGDMB",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 19456,
+ "status": "ok",
+ "timestamp": 1449847956073,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "Reformat into a shape that's more adapted to the models we're going to train:\n- data as a flat matrix,\n- labels as float 1-hot encodings."
+ "user_tz": 480
},
+ "outputId": "0ddb1607-1fc4-4ddb-de28-6c7ab7fb0c33"
+ },
+ "source": [
+ "pickle_file = 'notMNIST.pickle'\n",
+ "\n",
+ "with open(pickle_file, 'rb') as f:\n",
+ " save = pickle.load(f)\n",
+ " train_dataset = save['train_dataset']\n",
+ " train_labels = save['train_labels']\n",
+ " valid_dataset = save['valid_dataset']\n",
+ " valid_labels = save['valid_labels']\n",
+ " test_dataset = save['test_dataset']\n",
+ " test_labels = save['test_labels']\n",
+ " del save # hint to help gc free up memory\n",
+ " print 'Training set', train_dataset.shape, train_labels.shape\n",
+ " print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ " print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "IRSyYiIIGIzS",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 19723,
- "status": "ok",
- "timestamp": 1449847956364,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "2ba0fc75-1487-4ace-a562-cf81cae82793"
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 28, 28) (200000,)\n",
+ "Validation set (10000, 28, 28) (10000,)\n",
+ "Test set (18724, 28, 28) (18724,)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "L7aHrm6nGDMB",
+ "colab_type": "text"
+ },
+ "source": [
+ "Reformat into a shape that's more adapted to the models we're going to train:\n",
+ "- data as a flat matrix,\n",
+ "- labels as float 1-hot encodings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "IRSyYiIIGIzS",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "image_size = 28\nnum_labels = 10\n\ndef reformat(dataset, labels):\n dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)\n # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]\n labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n return dataset, labels\ntrain_dataset, train_labels = reformat(train_dataset, train_labels)\nvalid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\ntest_dataset, test_labels = reformat(test_dataset, test_labels)\nprint 'Training set', train_dataset.shape, train_labels.shape\nprint 'Validation set', valid_dataset.shape, valid_labels.shape\nprint 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 784) (200000, 10)\nValidation set (10000, 784) (10000, 10)\nTest set (18724, 784) (18724, 10)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "nCLVqyQ5vPPH",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "We're first going to train a multinomial logistic regression using simple gradient descent.\n\nTensorFlow works like this:\n* First you describe the computation that you want to see performed: what the inputs, the variables, and the operations look like. These get created as nodes over a computation graph. This description is all contained within the block below:\n\n with graph.as_default():\n ...\n\n* Then you can run the operations on this graph as many times as you want by calling `session.run()`, providing it outputs to fetch from the graph that get returned. This runtime operation is all contained in the block below:\n\n with tf.Session(graph=graph) as session:\n ...\n\nLet's load all the data into TensorFlow and build the computation graph corresponding to our training:"
- },
- {
- "metadata": {
- "id": "Nfv39qvtvOl_",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 19723,
+ "status": "ok",
+ "timestamp": 1449847956364,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "code",
- "input": "# With gradient descent training, even this much data is prohibitive.\n# Subset the training data for faster turnaround.\ntrain_subset = 10000\n\ngraph = tf.Graph()\nwith graph.as_default():\n\n # Input data.\n # Load the training, validation and test data into constants that are\n # attached to the graph.\n tf_train_dataset = tf.constant(train_dataset[:train_subset, :])\n tf_train_labels = tf.constant(train_labels[:train_subset])\n tf_valid_dataset = tf.constant(valid_dataset)\n tf_test_dataset = tf.constant(test_dataset)\n \n # Variables.\n # These are the parameters that we are going to be training. The weight\n # matrix will be initialized using random valued following a (truncated)\n # normal distribution. The biases get initialized to zero.\n weights = tf.Variable(\n tf.truncated_normal([image_size * image_size, num_labels]))\n biases = tf.Variable(tf.zeros([num_labels]))\n \n # Training computation.\n # We multiply the inputs with the weight matrix, and add biases. We compute\n # the softmax and cross-entropy (it's one operation in TensorFlow, because\n # it's very common, and it can be optimized). We take the average of this\n # cross-entropy across all training examples: that's our loss.\n logits = tf.matmul(tf_train_dataset, weights) + biases\n loss = tf.reduce_mean(\n tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n \n # Optimizer.\n # We are going to find the minimum of this loss using gradient descent.\n optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)\n \n # Predictions for the training, validation, and test data.\n # These are not part of training, but merely here so that we can report\n # accuracy figures as we train.\n train_prediction = tf.nn.softmax(logits)\n valid_prediction = tf.nn.softmax(\n tf.matmul(tf_valid_dataset, weights) + biases)\n test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)",
- "language": "python",
- "outputs": []
+ "user_tz": 480
},
+ "outputId": "2ba0fc75-1487-4ace-a562-cf81cae82793"
+ },
+ "source": [
+ "image_size = 28\n",
+ "num_labels = 10\n",
+ "\n",
+ "def reformat(dataset, labels):\n",
+ " dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)\n",
+ " # Map 0 to [1.0, 0.0, 0.0 ...], 1 to [0.0, 1.0, 0.0 ...]\n",
+ " labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n",
+ " return dataset, labels\n",
+ "train_dataset, train_labels = reformat(train_dataset, train_labels)\n",
+ "valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\n",
+ "test_dataset, test_labels = reformat(test_dataset, test_labels)\n",
+ "print 'Training set', train_dataset.shape, train_labels.shape\n",
+ "print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ "print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "KQcL4uqISHjP",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Let's run this computation and iterate:"
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 784) (200000, 10)\n",
+ "Validation set (10000, 784) (10000, 10)\n",
+ "Test set (18724, 784) (18724, 10)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "nCLVqyQ5vPPH",
+ "colab_type": "text"
+ },
+ "source": [
+ "We're first going to train a multinomial logistic regression using simple gradient descent.\n",
+ "\n",
+ "TensorFlow works like this:\n",
+ "* First you describe the computation that you want to see performed: what the inputs, the variables, and the operations look like. These get created as nodes over a computation graph. This description is all contained within the block below:\n",
+ "\n",
+ " with graph.as_default():\n",
+ " ...\n",
+ "\n",
+ "* Then you can run the operations on this graph as many times as you want by calling `session.run()`, providing it outputs to fetch from the graph that get returned. This runtime operation is all contained in the block below:\n",
+ "\n",
+ " with tf.Session(graph=graph) as session:\n",
+ " ...\n",
+ "\n",
+ "Let's load all the data into TensorFlow and build the computation graph corresponding to our training:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "Nfv39qvtvOl_",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "z2cjdenH869W",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 9
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 57454,
- "status": "ok",
- "timestamp": 1449847994134,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "4c037ba1-b526-4d8e-e632-91e2a0333267"
+ "cellView": "both"
+ },
+ "source": [
+ "# With gradient descent training, even this much data is prohibitive.\n",
+ "# Subset the training data for faster turnaround.\n",
+ "train_subset = 10000\n",
+ "\n",
+ "graph = tf.Graph()\n",
+ "with graph.as_default():\n",
+ "\n",
+ " # Input data.\n",
+ " # Load the training, validation and test data into constants that are\n",
+ " # attached to the graph.\n",
+ " tf_train_dataset = tf.constant(train_dataset[:train_subset, :])\n",
+ " tf_train_labels = tf.constant(train_labels[:train_subset])\n",
+ " tf_valid_dataset = tf.constant(valid_dataset)\n",
+ " tf_test_dataset = tf.constant(test_dataset)\n",
+ " \n",
+ " # Variables.\n",
+ " # These are the parameters that we are going to be training. The weight\n",
+ " # matrix will be initialized using random valued following a (truncated)\n",
+ " # normal distribution. The biases get initialized to zero.\n",
+ " weights = tf.Variable(\n",
+ " tf.truncated_normal([image_size * image_size, num_labels]))\n",
+ " biases = tf.Variable(tf.zeros([num_labels]))\n",
+ " \n",
+ " # Training computation.\n",
+ " # We multiply the inputs with the weight matrix, and add biases. We compute\n",
+ " # the softmax and cross-entropy (it's one operation in TensorFlow, because\n",
+ " # it's very common, and it can be optimized). We take the average of this\n",
+ " # cross-entropy across all training examples: that's our loss.\n",
+ " logits = tf.matmul(tf_train_dataset, weights) + biases\n",
+ " loss = tf.reduce_mean(\n",
+ " tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n",
+ " \n",
+ " # Optimizer.\n",
+ " # We are going to find the minimum of this loss using gradient descent.\n",
+ " optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)\n",
+ " \n",
+ " # Predictions for the training, validation, and test data.\n",
+ " # These are not part of training, but merely here so that we can report\n",
+ " # accuracy figures as we train.\n",
+ " train_prediction = tf.nn.softmax(logits)\n",
+ " valid_prediction = tf.nn.softmax(\n",
+ " tf.matmul(tf_valid_dataset, weights) + biases)\n",
+ " test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KQcL4uqISHjP",
+ "colab_type": "text"
+ },
+ "source": [
+ "Let's run this computation and iterate:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "z2cjdenH869W",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "num_steps = 801\n\ndef accuracy(predictions, labels):\n return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n / predictions.shape[0])\n\nwith tf.Session(graph=graph) as session:\n # This is a one-time operation which ensures the parameters get initialized as\n # we described in the graph: random weights for the matrix, zeros for the\n # biases. \n tf.initialize_all_variables().run()\n print 'Initialized'\n for step in xrange(num_steps):\n # Run the computations. We tell .run() that we want to run the optimizer,\n # and get the loss value and the training predictions returned as numpy\n # arrays.\n _, l, predictions = session.run([optimizer, loss, train_prediction])\n if (step % 100 == 0):\n print 'Loss at step', step, ':', l\n print 'Training accuracy: %.1f%%' % accuracy(\n predictions, train_labels[:train_subset, :])\n # Calling .eval() on valid_prediction is basically like calling run(), but\n # just to get that one numpy array. Note that it recomputes all its graph\n # dependencies.\n print 'Validation accuracy: %.1f%%' % accuracy(\n valid_prediction.eval(), valid_labels)\n print 'Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Initialized\nLoss at step 0 : 17.2939\nTraining accuracy: 10.8%\nValidation accuracy: 13.8%\nLoss at step 100 : 2.26903\nTraining accuracy: 72.3%\nValidation accuracy: 71.6%\nLoss at step 200 : 1.84895\nTraining accuracy: 74.9%\nValidation accuracy: 73.9%\nLoss at step 300 : 1.60701\nTraining accuracy: 76.0%\nValidation accuracy: 74.5%\nLoss at step 400 : 1.43912\nTraining accuracy: 76.8%\nValidation accuracy: 74.8%\nLoss at step 500 : 1.31349\nTraining accuracy: 77.5%\nValidation accuracy: 75.0%\nLoss at step 600 : 1.21501\nTraining accuracy: 78.1%\nValidation accuracy: 75.4%\nLoss at step 700 : 1.13515\nTraining accuracy: 78.6%\nValidation accuracy: 75.4%\nLoss at step 800 : 1.0687\nTraining accuracy: 79.2%\nValidation accuracy: 75.6%\nTest accuracy: 82.9%\n"
+ "item_id": 9
}
]
},
- {
- "metadata": {
- "id": "x68f-hxRGm3H",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 57454,
+ "status": "ok",
+ "timestamp": 1449847994134,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "Let's now switch to stochastic gradient descent training instead, which is much faster.\n\nThe graph will be similar, except that instead of holding all the training data into a constant node, we create a `Placeholder` node which will be fed actual data at every call of `sesion.run()`."
+ "user_tz": 480
},
+ "outputId": "4c037ba1-b526-4d8e-e632-91e2a0333267"
+ },
+ "source": [
+ "num_steps = 801\n",
+ "\n",
+ "def accuracy(predictions, labels):\n",
+ " return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n",
+ " / predictions.shape[0])\n",
+ "\n",
+ "with tf.Session(graph=graph) as session:\n",
+ " # This is a one-time operation which ensures the parameters get initialized as\n",
+ " # we described in the graph: random weights for the matrix, zeros for the\n",
+ " # biases. \n",
+ " tf.initialize_all_variables().run()\n",
+ " print 'Initialized'\n",
+ " for step in xrange(num_steps):\n",
+ " # Run the computations. We tell .run() that we want to run the optimizer,\n",
+ " # and get the loss value and the training predictions returned as numpy\n",
+ " # arrays.\n",
+ " _, l, predictions = session.run([optimizer, loss, train_prediction])\n",
+ " if (step % 100 == 0):\n",
+ " print 'Loss at step', step, ':', l\n",
+ " print 'Training accuracy: %.1f%%' % accuracy(\n",
+ " predictions, train_labels[:train_subset, :])\n",
+ " # Calling .eval() on valid_prediction is basically like calling run(), but\n",
+ " # just to get that one numpy array. Note that it recomputes all its graph\n",
+ " # dependencies.\n",
+ " print 'Validation accuracy: %.1f%%' % accuracy(\n",
+ " valid_prediction.eval(), valid_labels)\n",
+ " print 'Test accuracy: %.1f%%' % accuracy(test_prediction.eval(), test_labels)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "qhPMzWYRGrzM",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "batch_size = 128\n\ngraph = tf.Graph()\nwith graph.as_default():\n\n # Input data. For the training data, we use a placeholder that will be fed\n # at run time with a training minibatch.\n tf_train_dataset = tf.placeholder(tf.float32,\n shape=(batch_size, image_size * image_size))\n tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))\n tf_valid_dataset = tf.constant(valid_dataset)\n tf_test_dataset = tf.constant(test_dataset)\n \n # Variables.\n weights = tf.Variable(\n tf.truncated_normal([image_size * image_size, num_labels]))\n biases = tf.Variable(tf.zeros([num_labels]))\n \n # Training computation.\n logits = tf.matmul(tf_train_dataset, weights) + biases\n loss = tf.reduce_mean(\n tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n \n # Optimizer.\n optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)\n \n # Predictions for the training, validation, and test data.\n train_prediction = tf.nn.softmax(logits)\n valid_prediction = tf.nn.softmax(\n tf.matmul(tf_valid_dataset, weights) + biases)\n test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "XmVZESmtG4JH",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Let's run it:"
+ "output_type": "stream",
+ "text": [
+ "Initialized\n",
+ "Loss at step 0 : 17.2939\n",
+ "Training accuracy: 10.8%\n",
+ "Validation accuracy: 13.8%\n",
+ "Loss at step 100 : 2.26903\n",
+ "Training accuracy: 72.3%\n",
+ "Validation accuracy: 71.6%\n",
+ "Loss at step 200 : 1.84895\n",
+ "Training accuracy: 74.9%\n",
+ "Validation accuracy: 73.9%\n",
+ "Loss at step 300 : 1.60701\n",
+ "Training accuracy: 76.0%\n",
+ "Validation accuracy: 74.5%\n",
+ "Loss at step 400 : 1.43912\n",
+ "Training accuracy: 76.8%\n",
+ "Validation accuracy: 74.8%\n",
+ "Loss at step 500 : 1.31349\n",
+ "Training accuracy: 77.5%\n",
+ "Validation accuracy: 75.0%\n",
+ "Loss at step 600 : 1.21501\n",
+ "Training accuracy: 78.1%\n",
+ "Validation accuracy: 75.4%\n",
+ "Loss at step 700 : 1.13515\n",
+ "Training accuracy: 78.6%\n",
+ "Validation accuracy: 75.4%\n",
+ "Loss at step 800 : 1.0687\n",
+ "Training accuracy: 79.2%\n",
+ "Validation accuracy: 75.6%\n",
+ "Test accuracy: 82.9%\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "x68f-hxRGm3H",
+ "colab_type": "text"
+ },
+ "source": [
+ "Let's now switch to stochastic gradient descent training instead, which is much faster.\n",
+ "\n",
+ "The graph will be similar, except that instead of holding all the training data into a constant node, we create a `Placeholder` node which will be fed actual data at every call of `sesion.run()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "qhPMzWYRGrzM",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "FoF91pknG_YW",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 6
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 66292,
- "status": "ok",
- "timestamp": 1449848003013,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "d255c80e-954d-4183-ca1c-c7333ce91d0a"
+ "cellView": "both"
+ },
+ "source": [
+ "batch_size = 128\n",
+ "\n",
+ "graph = tf.Graph()\n",
+ "with graph.as_default():\n",
+ "\n",
+ " # Input data. For the training data, we use a placeholder that will be fed\n",
+ " # at run time with a training minibatch.\n",
+ " tf_train_dataset = tf.placeholder(tf.float32,\n",
+ " shape=(batch_size, image_size * image_size))\n",
+ " tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))\n",
+ " tf_valid_dataset = tf.constant(valid_dataset)\n",
+ " tf_test_dataset = tf.constant(test_dataset)\n",
+ " \n",
+ " # Variables.\n",
+ " weights = tf.Variable(\n",
+ " tf.truncated_normal([image_size * image_size, num_labels]))\n",
+ " biases = tf.Variable(tf.zeros([num_labels]))\n",
+ " \n",
+ " # Training computation.\n",
+ " logits = tf.matmul(tf_train_dataset, weights) + biases\n",
+ " loss = tf.reduce_mean(\n",
+ " tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n",
+ " \n",
+ " # Optimizer.\n",
+ " optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)\n",
+ " \n",
+ " # Predictions for the training, validation, and test data.\n",
+ " train_prediction = tf.nn.softmax(logits)\n",
+ " valid_prediction = tf.nn.softmax(\n",
+ " tf.matmul(tf_valid_dataset, weights) + biases)\n",
+ " test_prediction = tf.nn.softmax(tf.matmul(tf_test_dataset, weights) + biases)"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "XmVZESmtG4JH",
+ "colab_type": "text"
+ },
+ "source": [
+ "Let's run it:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "FoF91pknG_YW",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "num_steps = 3001\n\nwith tf.Session(graph=graph) as session:\n tf.initialize_all_variables().run()\n print \"Initialized\"\n for step in xrange(num_steps):\n # Pick an offset within the training data, which has been randomized.\n # Note: we could use better randomization across epochs.\n offset = (step * batch_size) % (train_labels.shape[0] - batch_size)\n # Generate a minibatch.\n batch_data = train_dataset[offset:(offset + batch_size), :]\n batch_labels = train_labels[offset:(offset + batch_size), :]\n # Prepare a dictionary telling the session where to feed the minibatch.\n # The key of the dictionary is the placeholder node of the graph to be fed,\n # and the value is the numpy array to feed to it.\n feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}\n _, l, predictions = session.run(\n [optimizer, loss, train_prediction], feed_dict=feed_dict)\n if (step % 500 == 0):\n print \"Minibatch loss at step\", step, \":\", l\n print \"Minibatch accuracy: %.1f%%\" % accuracy(predictions, batch_labels)\n print \"Validation accuracy: %.1f%%\" % accuracy(\n valid_prediction.eval(), valid_labels)\n print \"Test accuracy: %.1f%%\" % accuracy(test_prediction.eval(), test_labels)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Initialized\nMinibatch loss at step 0 : 16.8091\nMinibatch accuracy: 12.5%\nValidation accuracy: 14.0%\nMinibatch loss at step 500 : 1.75256\nMinibatch accuracy: 77.3%\nValidation accuracy: 75.0%\nMinibatch loss at step 1000 : 1.32283\nMinibatch accuracy: 77.3%\nValidation accuracy: 76.6%\nMinibatch loss at step 1500 : 0.944533\nMinibatch accuracy: 83.6%\nValidation accuracy: 76.5%\nMinibatch loss at step 2000 : 1.03795\nMinibatch accuracy: 78.9%\nValidation accuracy: 77.8%\nMinibatch loss at step 2500 : 1.10219\nMinibatch accuracy: 80.5%\nValidation accuracy: 78.0%\nMinibatch loss at step 3000 : 0.758874\nMinibatch accuracy: 82.8%\nValidation accuracy: 78.8%\nTest accuracy: 86.1%\n"
+ "item_id": 6
}
]
},
- {
- "metadata": {
- "id": "7omWxtvLLxik",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 66292,
+ "status": "ok",
+ "timestamp": 1449848003013,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "---\nProblem\n-------\n\nTurn the logistic regression example with SGD into a 1-hidden layer neural network with rectified linear units (nn.relu()) and 1024 hidden nodes. This model should improve your validation / test accuracy.\n\n---"
+ "user_tz": 480
+ },
+ "outputId": "d255c80e-954d-4183-ca1c-c7333ce91d0a"
+ },
+ "source": [
+ "num_steps = 3001\n",
+ "\n",
+ "with tf.Session(graph=graph) as session:\n",
+ " tf.initialize_all_variables().run()\n",
+ " print \"Initialized\"\n",
+ " for step in xrange(num_steps):\n",
+ " # Pick an offset within the training data, which has been randomized.\n",
+ " # Note: we could use better randomization across epochs.\n",
+ " offset = (step * batch_size) % (train_labels.shape[0] - batch_size)\n",
+ " # Generate a minibatch.\n",
+ " batch_data = train_dataset[offset:(offset + batch_size), :]\n",
+ " batch_labels = train_labels[offset:(offset + batch_size), :]\n",
+ " # Prepare a dictionary telling the session where to feed the minibatch.\n",
+ " # The key of the dictionary is the placeholder node of the graph to be fed,\n",
+ " # and the value is the numpy array to feed to it.\n",
+ " feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}\n",
+ " _, l, predictions = session.run(\n",
+ " [optimizer, loss, train_prediction], feed_dict=feed_dict)\n",
+ " if (step % 500 == 0):\n",
+ " print \"Minibatch loss at step\", step, \":\", l\n",
+ " print \"Minibatch accuracy: %.1f%%\" % accuracy(predictions, batch_labels)\n",
+ " print \"Validation accuracy: %.1f%%\" % accuracy(\n",
+ " valid_prediction.eval(), valid_labels)\n",
+ " print \"Test accuracy: %.1f%%\" % accuracy(test_prediction.eval(), test_labels)"
+ ],
+ "outputs": [
+ {
+ "output_type": "stream",
+ "text": [
+ "Initialized\n",
+ "Minibatch loss at step 0 : 16.8091\n",
+ "Minibatch accuracy: 12.5%\n",
+ "Validation accuracy: 14.0%\n",
+ "Minibatch loss at step 500 : 1.75256\n",
+ "Minibatch accuracy: 77.3%\n",
+ "Validation accuracy: 75.0%\n",
+ "Minibatch loss at step 1000 : 1.32283\n",
+ "Minibatch accuracy: 77.3%\n",
+ "Validation accuracy: 76.6%\n",
+ "Minibatch loss at step 1500 : 0.944533\n",
+ "Minibatch accuracy: 83.6%\n",
+ "Validation accuracy: 76.5%\n",
+ "Minibatch loss at step 2000 : 1.03795\n",
+ "Minibatch accuracy: 78.9%\n",
+ "Validation accuracy: 77.8%\n",
+ "Minibatch loss at step 2500 : 1.10219\n",
+ "Minibatch accuracy: 80.5%\n",
+ "Validation accuracy: 78.0%\n",
+ "Minibatch loss at step 3000 : 0.758874\n",
+ "Minibatch accuracy: 82.8%\n",
+ "Validation accuracy: 78.8%\n",
+ "Test accuracy: 86.1%\n"
+ ],
+ "name": "stdout"
}
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7omWxtvLLxik",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem\n",
+ "-------\n",
+ "\n",
+ "Turn the logistic regression example with SGD into a 1-hidden layer neural network with rectified linear units (nn.relu()) and 1024 hidden nodes. This model should improve your validation / test accuracy.\n",
+ "\n",
+ "---"
]
}
- ],
- "metadata": {
- "name": "2_fullyconnected.ipynb",
- "colabVersion": "0.3.2",
- "colab_views": {},
- "colab_default_view": {}
- },
- "nbformat": 3,
- "nbformat_minor": 0
+ ]
} \ No newline at end of file
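
The problem cell at the end of the 2_fullyconnected.ipynb diff above asks to turn the SGD logistic regression into a 1-hidden-layer network with rectified linear units and 1024 hidden nodes. A minimal sketch of one way to do that, assuming the notebook's `batch_size`, `image_size`, `num_labels` and dataset arrays are already defined and using the same TensorFlow 0.x / Python 2 API the notebooks target; the names `w1`, `b1`, `w2`, `b2` are illustrative, not from the notebook:

    num_hidden = 1024

    graph = tf.Graph()
    with graph.as_default():
      # Input data: minibatch placeholders for training, constants for evaluation.
      tf_train_dataset = tf.placeholder(tf.float32,
                                        shape=(batch_size, image_size * image_size))
      tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))
      tf_valid_dataset = tf.constant(valid_dataset)
      tf_test_dataset = tf.constant(test_dataset)

      # Two layers of weights: input -> hidden and hidden -> output.
      w1 = tf.Variable(tf.truncated_normal([image_size * image_size, num_hidden]))
      b1 = tf.Variable(tf.zeros([num_hidden]))
      w2 = tf.Variable(tf.truncated_normal([num_hidden, num_labels]))
      b2 = tf.Variable(tf.zeros([num_labels]))

      # Training computation: one ReLU hidden layer, then a softmax classifier.
      hidden = tf.nn.relu(tf.matmul(tf_train_dataset, w1) + b1)
      logits = tf.matmul(hidden, w2) + b2
      loss = tf.reduce_mean(
          tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))
      optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

      # Predictions reuse the same weights for the validation and test sets.
      train_prediction = tf.nn.softmax(logits)
      valid_prediction = tf.nn.softmax(
          tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, w1) + b1), w2) + b2)
      test_prediction = tf.nn.softmax(
          tf.matmul(tf.nn.relu(tf.matmul(tf_test_dataset, w1) + b1), w2) + b2)

The minibatch training loop from the notebook can be reused unchanged, since only the graph definition differs.
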
diff --git a/tensorflow/examples/udacity/3_regularization.ipynb b/tensorflow/examples/udacity/3_regularization.ipynb
index e2e0b81922..a61f7f4859 100644
--- a/tensorflow/examples/udacity/3_regularization.ipynb
+++ b/tensorflow/examples/udacity/3_regularization.ipynb
@@ -1,196 +1,299 @@
{
- "worksheets": [
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "version": "0.3.2",
+ "views": {},
+ "default_view": {},
+ "name": "3_regularization.ipynb",
+ "provenance": []
+ }
+ },
+ "cells": [
{
- "cells": [
- {
- "metadata": {
- "id": "kR-4eNdK6lYS",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Deep Learning\n=============\n\nAssignment 3\n------------\n\nPreviously in `2_fullyconnected.ipynb`, you trained a logistic regression and a neural network model.\n\nThe goal of this assignment is to explore regularization techniques."
- },
- {
- "metadata": {
- "id": "JLpLa8Jt7Vu4",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "# These are all the modules we'll be using later. Make sure you can import them\n# before proceeding further.\nimport cPickle as pickle\nimport numpy as np\nimport tensorflow as tf",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "1HrCK6e17WzV",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "First reload the data we generated in _notmist.ipynb_."
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "kR-4eNdK6lYS",
+ "colab_type": "text"
+ },
+ "source": [
+ "Deep Learning\n",
+ "=============\n",
+ "\n",
+ "Assignment 3\n",
+ "------------\n",
+ "\n",
+ "Previously in `2_fullyconnected.ipynb`, you trained a logistic regression and a neural network model.\n",
+ "\n",
+ "The goal of this assignment is to explore regularization techniques."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "JLpLa8Jt7Vu4",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "y3-cj1bpmuxc",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 11777,
- "status": "ok",
- "timestamp": 1449849322348,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "e03576f1-ebbe-4838-c388-f1777bcc9873"
+ "cellView": "both"
+ },
+ "source": [
+ "# These are all the modules we'll be using later. Make sure you can import them\n",
+ "# before proceeding further.\n",
+ "import cPickle as pickle\n",
+ "import numpy as np\n",
+ "import tensorflow as tf"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1HrCK6e17WzV",
+ "colab_type": "text"
+ },
+ "source": [
+ "First reload the data we generated in _notmist.ipynb_."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "y3-cj1bpmuxc",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "pickle_file = 'notMNIST.pickle'\n\nwith open(pickle_file, 'rb') as f:\n save = pickle.load(f)\n train_dataset = save['train_dataset']\n train_labels = save['train_labels']\n valid_dataset = save['valid_dataset']\n valid_labels = save['valid_labels']\n test_dataset = save['test_dataset']\n test_labels = save['test_labels']\n del save # hint to help gc free up memory\n print 'Training set', train_dataset.shape, train_labels.shape\n print 'Validation set', valid_dataset.shape, valid_labels.shape\n print 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 28, 28) (200000,)\nValidation set (10000, 28, 28) (10000,)\nTest set (18724, 28, 28) (18724,)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "L7aHrm6nGDMB",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 11777,
+ "status": "ok",
+ "timestamp": 1449849322348,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "Reformat into a shape that's more adapted to the models we're going to train:\n- data as a flat matrix,\n- labels as float 1-hot encodings."
+ "user_tz": 480
},
+ "outputId": "e03576f1-ebbe-4838-c388-f1777bcc9873"
+ },
+ "source": [
+ "pickle_file = 'notMNIST.pickle'\n",
+ "\n",
+ "with open(pickle_file, 'rb') as f:\n",
+ " save = pickle.load(f)\n",
+ " train_dataset = save['train_dataset']\n",
+ " train_labels = save['train_labels']\n",
+ " valid_dataset = save['valid_dataset']\n",
+ " valid_labels = save['valid_labels']\n",
+ " test_dataset = save['test_dataset']\n",
+ " test_labels = save['test_labels']\n",
+ " del save # hint to help gc free up memory\n",
+ " print 'Training set', train_dataset.shape, train_labels.shape\n",
+ " print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ " print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "IRSyYiIIGIzS",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 11728,
- "status": "ok",
- "timestamp": 1449849322356,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "3f8996ee-3574-4f44-c953-5c8a04636582"
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 28, 28) (200000,)\n",
+ "Validation set (10000, 28, 28) (10000,)\n",
+ "Test set (18724, 28, 28) (18724,)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "L7aHrm6nGDMB",
+ "colab_type": "text"
+ },
+ "source": [
+ "Reformat into a shape that's more adapted to the models we're going to train:\n",
+ "- data as a flat matrix,\n",
+ "- labels as float 1-hot encodings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "IRSyYiIIGIzS",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "image_size = 28\nnum_labels = 10\n\ndef reformat(dataset, labels):\n dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)\n # Map 2 to [0.0, 1.0, 0.0 ...], 3 to [0.0, 0.0, 1.0 ...]\n labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n return dataset, labels\ntrain_dataset, train_labels = reformat(train_dataset, train_labels)\nvalid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\ntest_dataset, test_labels = reformat(test_dataset, test_labels)\nprint 'Training set', train_dataset.shape, train_labels.shape\nprint 'Validation set', valid_dataset.shape, valid_labels.shape\nprint 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 784) (200000, 10)\nValidation set (10000, 784) (10000, 10)\nTest set (18724, 784) (18724, 10)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "RajPLaL_ZW6w",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "def accuracy(predictions, labels):\n return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n / predictions.shape[0])",
- "language": "python",
- "outputs": []
- },
- {
- "metadata": {
- "id": "sgLbUAQ1CW-1",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 11728,
+ "status": "ok",
+ "timestamp": 1449849322356,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "---\nProblem 1\n---------\n\nIntroduce and tune L2 regularization for both logistic and neural network models. Remember that L2 amounts to adding a penalty on the norm of the weights to the loss. In TensorFlow, you can compue the L2 loss for a tensor `t` using `nn.l2_loss(t)`. The right amount of regularization should improve your validation / test accuracy.\n\n---"
+ "user_tz": 480
},
+ "outputId": "3f8996ee-3574-4f44-c953-5c8a04636582"
+ },
+ "source": [
+ "image_size = 28\n",
+ "num_labels = 10\n",
+ "\n",
+ "def reformat(dataset, labels):\n",
+ " dataset = dataset.reshape((-1, image_size * image_size)).astype(np.float32)\n",
+ " # Map 2 to [0.0, 1.0, 0.0 ...], 3 to [0.0, 0.0, 1.0 ...]\n",
+ " labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n",
+ " return dataset, labels\n",
+ "train_dataset, train_labels = reformat(train_dataset, train_labels)\n",
+ "valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\n",
+ "test_dataset, test_labels = reformat(test_dataset, test_labels)\n",
+ "print 'Training set', train_dataset.shape, train_labels.shape\n",
+ "print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ "print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "na8xX2yHZzNF",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 2\n---------\nLet's demonstrate an extreme case of overfitting. Restrict your training data to just a few batches. What happens?\n\n---"
- },
- {
- "metadata": {
- "id": "ww3SCBUdlkRc",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 3\n---------\nIntroduce Dropout on the hidden layer of the neural network. Remember: Dropout should only be introduced during training, not evaluation, otherwise your evaluation results would be stochastic as well. TensorFlow provides `nn.dropout()` for that, but you have to make sure it's only inserted during training.\n\nWhat happens to our extreme overfitting case?\n\n---"
- },
- {
- "metadata": {
- "id": "-b1hTz3VWZjw",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 4\n---------\n\nTry to get the best performance you can using a multi-layer model! The best reported test accuracy using a deep network is [97.1%](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html?showComment=1391023266211#c8758720086795711595).\n\nOne avenue you can explore is to add multiple layers.\n\nAnother one is to use learning rate decay:\n\n global_step = tf.Variable(0) # count the number of steps taken.\n learning_rate = tf.train.exponential_decay(0.5, step, ...)\n optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)\n \n ---\n"
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 784) (200000, 10)\n",
+ "Validation set (10000, 784) (10000, 10)\n",
+ "Test set (18724, 784) (18724, 10)\n"
+ ],
+ "name": "stdout"
}
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "RajPLaL_ZW6w",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
+ },
+ "cellView": "both"
+ },
+ "source": [
+ "def accuracy(predictions, labels):\n",
+ " return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n",
+ " / predictions.shape[0])"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "sgLbUAQ1CW-1",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 1\n",
+ "---------\n",
+ "\n",
+ "Introduce and tune L2 regularization for both logistic and neural network models. Remember that L2 amounts to adding a penalty on the norm of the weights to the loss. In TensorFlow, you can compue the L2 loss for a tensor `t` using `nn.l2_loss(t)`. The right amount of regularization should improve your validation / test accuracy.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "na8xX2yHZzNF",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 2\n",
+ "---------\n",
+ "Let's demonstrate an extreme case of overfitting. Restrict your training data to just a few batches. What happens?\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ww3SCBUdlkRc",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 3\n",
+ "---------\n",
+ "Introduce Dropout on the hidden layer of the neural network. Remember: Dropout should only be introduced during training, not evaluation, otherwise your evaluation results would be stochastic as well. TensorFlow provides `nn.dropout()` for that, but you have to make sure it's only inserted during training.\n",
+ "\n",
+ "What happens to our extreme overfitting case?\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "-b1hTz3VWZjw",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 4\n",
+ "---------\n",
+ "\n",
+ "Try to get the best performance you can using a multi-layer model! The best reported test accuracy using a deep network is [97.1%](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html?showComment=1391023266211#c8758720086795711595).\n",
+ "\n",
+ "One avenue you can explore is to add multiple layers.\n",
+ "\n",
+ "Another one is to use learning rate decay:\n",
+ "\n",
+ " global_step = tf.Variable(0) # count the number of steps taken.\n",
+ " learning_rate = tf.train.exponential_decay(0.5, step, ...)\n",
+ " optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)\n",
+ " \n",
+ " ---\n"
]
}
- ],
- "metadata": {
- "name": "3_regularization.ipynb",
- "colabVersion": "0.3.2",
- "colab_views": {},
- "colab_default_view": {}
- },
- "nbformat": 3,
- "nbformat_minor": 0
+ ]
} \ No newline at end of file
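
Problems 1 and 3 in the 3_regularization.ipynb diff above ask for L2 regularization and dropout on the hidden layer. A minimal sketch under the same assumptions as the previous sketch (the illustrative two-layer graph with `w1`, `b1`, `w2`, `b2`) and the same era's API; `beta` and `keep_prob` are hand-picked illustrative values, not values from the notebook:

    beta = 1e-3        # L2 penalty weight; tune it against the validation set
    keep_prob = 0.5    # dropout keep probability, training path only

    # Training path: ReLU hidden layer with dropout applied.
    hidden = tf.nn.relu(tf.matmul(tf_train_dataset, w1) + b1)
    hidden = tf.nn.dropout(hidden, keep_prob)
    logits = tf.matmul(hidden, w2) + b2

    # Cross-entropy loss plus an L2 penalty on both weight matrices.
    loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))
    loss += beta * (tf.nn.l2_loss(w1) + tf.nn.l2_loss(w2))

    # Evaluation paths use the full activations, i.e. no dropout.
    valid_prediction = tf.nn.softmax(
        tf.matmul(tf.nn.relu(tf.matmul(tf_valid_dataset, w1) + b1), w2) + b2)

Keeping dropout off the evaluation tensors is what Problem 3 means by inserting it only during training; for Problem 2, restricting the training data to a few batches (for example `train_dataset = train_dataset[:5 * batch_size]` with the matching labels) is enough to reproduce the extreme overfitting case.
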
diff --git a/tensorflow/examples/udacity/4_convolutions.ipynb b/tensorflow/examples/udacity/4_convolutions.ipynb
index 2ddc8ca30e..94266663b0 100644
--- a/tensorflow/examples/udacity/4_convolutions.ipynb
+++ b/tensorflow/examples/udacity/4_convolutions.ipynb
@@ -1,242 +1,463 @@
{
- "worksheets": [
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "version": "0.3.2",
+ "views": {},
+ "default_view": {},
+ "name": "4_convolutions.ipynb",
+ "provenance": []
+ }
+ },
+ "cells": [
{
- "cells": [
- {
- "metadata": {
- "id": "4embtkV0pNxM",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Deep Learning\n=============\n\nAssignment 4\n------------\n\nPreviously in `2_fullyconnected.ipynb` and `3_regularization.ipynb`, we trained fully connected networks to classify [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) characters.\n\nThe goal of this assignment is make the neural network convolutional."
- },
- {
- "metadata": {
- "id": "tm2CQN_Cpwj0",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "# These are all the modules we'll be using later. Make sure you can import them\n# before proceeding further.\nimport cPickle as pickle\nimport numpy as np\nimport tensorflow as tf",
- "language": "python",
- "outputs": []
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4embtkV0pNxM",
+ "colab_type": "text"
+ },
+ "source": [
+ "Deep Learning\n",
+ "=============\n",
+ "\n",
+ "Assignment 4\n",
+ "------------\n",
+ "\n",
+ "Previously in `2_fullyconnected.ipynb` and `3_regularization.ipynb`, we trained fully connected networks to classify [notMNIST](http://yaroslavvb.blogspot.com/2011/09/notmnist-dataset.html) characters.\n",
+ "\n",
+ "The goal of this assignment is make the neural network convolutional."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "tm2CQN_Cpwj0",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "y3-cj1bpmuxc",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 11948,
- "status": "ok",
- "timestamp": 1446658914837,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "016b1a51-0290-4b08-efdb-8c95ffc3cd01"
+ "cellView": "both"
+ },
+ "source": [
+ "# These are all the modules we'll be using later. Make sure you can import them\n",
+ "# before proceeding further.\n",
+ "import cPickle as pickle\n",
+ "import numpy as np\n",
+ "import tensorflow as tf"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "y3-cj1bpmuxc",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "pickle_file = 'notMNIST.pickle'\n\nwith open(pickle_file, 'rb') as f:\n save = pickle.load(f)\n train_dataset = save['train_dataset']\n train_labels = save['train_labels']\n valid_dataset = save['valid_dataset']\n valid_labels = save['valid_labels']\n test_dataset = save['test_dataset']\n test_labels = save['test_labels']\n del save # hint to help gc free up memory\n print 'Training set', train_dataset.shape, train_labels.shape\n print 'Validation set', valid_dataset.shape, valid_labels.shape\n print 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 28, 28) (200000,)\nValidation set (10000, 28, 28) (10000,)\nTest set (18724, 28, 28) (18724,)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "L7aHrm6nGDMB",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 11948,
+ "status": "ok",
+ "timestamp": 1446658914837,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "Reformat into a TensorFlow-friendly shape:\n- convolutions need the image data formatted as a cube (width by height by #channels)\n- labels as float 1-hot encodings."
+ "user_tz": 480
},
+ "outputId": "016b1a51-0290-4b08-efdb-8c95ffc3cd01"
+ },
+ "source": [
+ "pickle_file = 'notMNIST.pickle'\n",
+ "\n",
+ "with open(pickle_file, 'rb') as f:\n",
+ " save = pickle.load(f)\n",
+ " train_dataset = save['train_dataset']\n",
+ " train_labels = save['train_labels']\n",
+ " valid_dataset = save['valid_dataset']\n",
+ " valid_labels = save['valid_labels']\n",
+ " test_dataset = save['test_dataset']\n",
+ " test_labels = save['test_labels']\n",
+ " del save # hint to help gc free up memory\n",
+ " print 'Training set', train_dataset.shape, train_labels.shape\n",
+ " print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ " print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "IRSyYiIIGIzS",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 11952,
- "status": "ok",
- "timestamp": 1446658914857,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "650a208c-8359-4852-f4f5-8bf10e80ef6c"
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 28, 28) (200000,)\n",
+ "Validation set (10000, 28, 28) (10000,)\n",
+ "Test set (18724, 28, 28) (18724,)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "L7aHrm6nGDMB",
+ "colab_type": "text"
+ },
+ "source": [
+ "Reformat into a TensorFlow-friendly shape:\n",
+ "- convolutions need the image data formatted as a cube (width by height by #channels)\n",
+ "- labels as float 1-hot encodings."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "IRSyYiIIGIzS",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "image_size = 28\nnum_labels = 10\nnum_channels = 1 # grayscale\n\nimport numpy as np\n\ndef reformat(dataset, labels):\n dataset = dataset.reshape(\n (-1, image_size, image_size, num_channels)).astype(np.float32)\n labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n return dataset, labels\ntrain_dataset, train_labels = reformat(train_dataset, train_labels)\nvalid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\ntest_dataset, test_labels = reformat(test_dataset, test_labels)\nprint 'Training set', train_dataset.shape, train_labels.shape\nprint 'Validation set', valid_dataset.shape, valid_labels.shape\nprint 'Test set', test_dataset.shape, test_labels.shape",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Training set (200000, 28, 28, 1) (200000, 10)\nValidation set (10000, 28, 28, 1) (10000, 10)\nTest set (18724, 28, 28, 1) (18724, 10)\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "AgQDIREv02p1",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 11952,
+ "status": "ok",
+ "timestamp": 1446658914857,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "code",
- "input": "def accuracy(predictions, labels):\n return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n / predictions.shape[0])",
- "language": "python",
- "outputs": []
+ "user_tz": 480
},
+ "outputId": "650a208c-8359-4852-f4f5-8bf10e80ef6c"
+ },
+ "source": [
+ "image_size = 28\n",
+ "num_labels = 10\n",
+ "num_channels = 1 # grayscale\n",
+ "\n",
+ "import numpy as np\n",
+ "\n",
+ "def reformat(dataset, labels):\n",
+ " dataset = dataset.reshape(\n",
+ " (-1, image_size, image_size, num_channels)).astype(np.float32)\n",
+ " labels = (np.arange(num_labels) == labels[:,None]).astype(np.float32)\n",
+ " return dataset, labels\n",
+ "train_dataset, train_labels = reformat(train_dataset, train_labels)\n",
+ "valid_dataset, valid_labels = reformat(valid_dataset, valid_labels)\n",
+ "test_dataset, test_labels = reformat(test_dataset, test_labels)\n",
+ "print 'Training set', train_dataset.shape, train_labels.shape\n",
+ "print 'Validation set', valid_dataset.shape, valid_labels.shape\n",
+ "print 'Test set', test_dataset.shape, test_labels.shape"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "5rhgjmROXu2O",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Let's build a small network with two convolutional layers, followed by one fully connected layer. Convolutional networks are more expensive computationally, so we'll limit its depth and number of fully connected nodes."
+ "output_type": "stream",
+ "text": [
+ "Training set (200000, 28, 28, 1) (200000, 10)\n",
+ "Validation set (10000, 28, 28, 1) (10000, 10)\n",
+ "Test set (18724, 28, 28, 1) (18724, 10)\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "AgQDIREv02p1",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "IZYv70SvvOan",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "batch_size = 16\npatch_size = 5\ndepth = 16\nnum_hidden = 64\n\ngraph = tf.Graph()\n\nwith graph.as_default():\n\n # Input data.\n tf_train_dataset = tf.placeholder(\n tf.float32, shape=(batch_size, image_size, image_size, num_channels))\n tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))\n tf_valid_dataset = tf.constant(valid_dataset)\n tf_test_dataset = tf.constant(test_dataset)\n \n # Variables.\n layer1_weights = tf.Variable(tf.truncated_normal(\n [patch_size, patch_size, num_channels, depth], stddev=0.1))\n layer1_biases = tf.Variable(tf.zeros([depth]))\n layer2_weights = tf.Variable(tf.truncated_normal(\n [patch_size, patch_size, depth, depth], stddev=0.1))\n layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))\n layer3_weights = tf.Variable(tf.truncated_normal(\n [image_size / 4 * image_size / 4 * depth, num_hidden], stddev=0.1))\n layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))\n layer4_weights = tf.Variable(tf.truncated_normal(\n [num_hidden, num_labels], stddev=0.1))\n layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))\n \n # Model.\n def model(data):\n conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')\n hidden = tf.nn.relu(conv + layer1_biases)\n conv = tf.nn.conv2d(hidden, layer2_weights, [1, 2, 2, 1], padding='SAME')\n hidden = tf.nn.relu(conv + layer2_biases)\n shape = hidden.get_shape().as_list()\n reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])\n hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)\n return tf.matmul(hidden, layer4_weights) + layer4_biases\n \n # Training computation.\n logits = model(tf_train_dataset)\n loss = tf.reduce_mean(\n tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n \n # Optimizer.\n optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)\n \n # Predictions for the training, validation, and test data.\n train_prediction = tf.nn.softmax(logits)\n valid_prediction = tf.nn.softmax(model(tf_valid_dataset))\n test_prediction = tf.nn.softmax(model(tf_test_dataset))",
- "language": "python",
- "outputs": []
+ "cellView": "both"
+ },
+ "source": [
+ "def accuracy(predictions, labels):\n",
+ " return (100.0 * np.sum(np.argmax(predictions, 1) == np.argmax(labels, 1))\n",
+ " / predictions.shape[0])"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "5rhgjmROXu2O",
+ "colab_type": "text"
+ },
+ "source": [
+ "Let's build a small network with two convolutional layers, followed by one fully connected layer. Convolutional networks are more expensive computationally, so we'll limit its depth and number of fully connected nodes."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "IZYv70SvvOan",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "noKFb2UovVFR",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 37
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 63292,
- "status": "ok",
- "timestamp": 1446658966251,
- "user": {
- "color": "",
- "displayName": "",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "",
- "photoUrl": "",
- "sessionId": "0",
- "userId": ""
- },
- "user_tz": 480
- },
- "outputId": "28941338-2ef9-4088-8bd1-44295661e628"
+ "cellView": "both"
+ },
+ "source": [
+ "batch_size = 16\n",
+ "patch_size = 5\n",
+ "depth = 16\n",
+ "num_hidden = 64\n",
+ "\n",
+ "graph = tf.Graph()\n",
+ "\n",
+ "with graph.as_default():\n",
+ "\n",
+ " # Input data.\n",
+ " tf_train_dataset = tf.placeholder(\n",
+ " tf.float32, shape=(batch_size, image_size, image_size, num_channels))\n",
+ " tf_train_labels = tf.placeholder(tf.float32, shape=(batch_size, num_labels))\n",
+ " tf_valid_dataset = tf.constant(valid_dataset)\n",
+ " tf_test_dataset = tf.constant(test_dataset)\n",
+ " \n",
+ " # Variables.\n",
+ " layer1_weights = tf.Variable(tf.truncated_normal(\n",
+ " [patch_size, patch_size, num_channels, depth], stddev=0.1))\n",
+ " layer1_biases = tf.Variable(tf.zeros([depth]))\n",
+ " layer2_weights = tf.Variable(tf.truncated_normal(\n",
+ " [patch_size, patch_size, depth, depth], stddev=0.1))\n",
+ " layer2_biases = tf.Variable(tf.constant(1.0, shape=[depth]))\n",
+ " layer3_weights = tf.Variable(tf.truncated_normal(\n",
+ " [image_size / 4 * image_size / 4 * depth, num_hidden], stddev=0.1))\n",
+ " layer3_biases = tf.Variable(tf.constant(1.0, shape=[num_hidden]))\n",
+ " layer4_weights = tf.Variable(tf.truncated_normal(\n",
+ " [num_hidden, num_labels], stddev=0.1))\n",
+ " layer4_biases = tf.Variable(tf.constant(1.0, shape=[num_labels]))\n",
+ " \n",
+ " # Model.\n",
+ " def model(data):\n",
+ " conv = tf.nn.conv2d(data, layer1_weights, [1, 2, 2, 1], padding='SAME')\n",
+ " hidden = tf.nn.relu(conv + layer1_biases)\n",
+ " conv = tf.nn.conv2d(hidden, layer2_weights, [1, 2, 2, 1], padding='SAME')\n",
+ " hidden = tf.nn.relu(conv + layer2_biases)\n",
+ " shape = hidden.get_shape().as_list()\n",
+ " reshape = tf.reshape(hidden, [shape[0], shape[1] * shape[2] * shape[3]])\n",
+ " hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)\n",
+ " return tf.matmul(hidden, layer4_weights) + layer4_biases\n",
+ " \n",
+ " # Training computation.\n",
+ " logits = model(tf_train_dataset)\n",
+ " loss = tf.reduce_mean(\n",
+ " tf.nn.softmax_cross_entropy_with_logits(logits, tf_train_labels))\n",
+ " \n",
+ " # Optimizer.\n",
+ " optimizer = tf.train.GradientDescentOptimizer(0.05).minimize(loss)\n",
+ " \n",
+ " # Predictions for the training, validation, and test data.\n",
+ " train_prediction = tf.nn.softmax(logits)\n",
+ " valid_prediction = tf.nn.softmax(model(tf_valid_dataset))\n",
+ " test_prediction = tf.nn.softmax(model(tf_test_dataset))"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "noKFb2UovVFR",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "num_steps = 1001\n\nwith tf.Session(graph=graph) as session:\n tf.initialize_all_variables().run()\n print \"Initialized\"\n for step in xrange(num_steps):\n offset = (step * batch_size) % (train_labels.shape[0] - batch_size)\n batch_data = train_dataset[offset:(offset + batch_size), :, :, :]\n batch_labels = train_labels[offset:(offset + batch_size), :]\n feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}\n _, l, predictions = session.run(\n [optimizer, loss, train_prediction], feed_dict=feed_dict)\n if (step % 50 == 0):\n print \"Minibatch loss at step\", step, \":\", l\n print \"Minibatch accuracy: %.1f%%\" % accuracy(predictions, batch_labels)\n print \"Validation accuracy: %.1f%%\" % accuracy(\n valid_prediction.eval(), valid_labels)\n print \"Test accuracy: %.1f%%\" % accuracy(test_prediction.eval(), test_labels)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Initialized\nMinibatch loss at step 0 : 3.51275\nMinibatch accuracy: 6.2%\nValidation accuracy: 12.8%\nMinibatch loss at step 50 : 1.48703\nMinibatch accuracy: 43.8%\nValidation accuracy: 50.4%\nMinibatch loss at step 100 : 1.04377\nMinibatch accuracy: 68.8%\nValidation accuracy: 67.4%\nMinibatch loss at step 150 : 0.601682\nMinibatch accuracy: 68.8%\nValidation accuracy: 73.0%\nMinibatch loss at step 200 : 0.898649\nMinibatch accuracy: 75.0%\nValidation accuracy: 77.8%\nMinibatch loss at step 250 : 1.3637\nMinibatch accuracy: 56.2%\nValidation accuracy: 75.4%\nMinibatch loss at step 300 : 1.41968\nMinibatch accuracy: 62.5%\nValidation accuracy: 76.0%\nMinibatch loss at step 350 : 0.300648\nMinibatch accuracy: 81.2%\nValidation accuracy: 80.2%\nMinibatch loss at step 400 : 1.32092\nMinibatch accuracy: 56.2%\nValidation accuracy: 80.4%\nMinibatch loss at step 450 : 0.556701\nMinibatch accuracy: 81.2%\nValidation accuracy: 79.4%\nMinibatch loss at step 500 : 1.65595\nMinibatch accuracy: 43.8%\nValidation accuracy: 79.6%\nMinibatch loss at step 550 : 1.06995\nMinibatch accuracy: 75.0%\nValidation accuracy: 81.2%\nMinibatch loss at step 600 : 0.223684\nMinibatch accuracy: 100.0%\nValidation accuracy: 82.3%\nMinibatch loss at step 650 : 0.619602\nMinibatch accuracy: 87.5%\nValidation accuracy: 81.8%\nMinibatch loss at step 700 : 0.812091\nMinibatch accuracy: 75.0%\nValidation accuracy: 82.4%\nMinibatch loss at step 750 : 0.276302\nMinibatch accuracy: 87.5%\nValidation accuracy: 82.3%\nMinibatch loss at step 800 : 0.450241\nMinibatch accuracy: 81.2%\nValidation accuracy: 82.3%\nMinibatch loss at step 850 : 0.137139\nMinibatch accuracy: 93.8%\nValidation accuracy: 82.3%\nMinibatch loss at step 900 : 0.52664\nMinibatch accuracy: 75.0%\nValidation accuracy: 82.2%\nMinibatch loss at step 950 : 0.623835\nMinibatch accuracy: 87.5%\nValidation accuracy: 82.1%\nMinibatch loss at step 1000 : 0.243114\nMinibatch accuracy: 93.8%\nValidation accuracy: 82.9%\nTest accuracy: 90.0%\n"
+ "item_id": 37
}
]
},
- {
- "metadata": {
- "id": "KedKkn4EutIK",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 63292,
+ "status": "ok",
+ "timestamp": 1446658966251,
+ "user": {
+ "color": "",
+ "displayName": "",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "",
+ "photoUrl": "",
+ "sessionId": "0",
+ "userId": ""
},
- "cell_type": "markdown",
- "source": "---\nProblem 1\n---------\n\nThe convolutional model above uses convolutions with stride 2 to reduce the dimensionality. Replace the strides by a max pooling operation (`nn.max_pool()`) of stride 2 and kernel size 2.\n\n---"
+ "user_tz": 480
},
+ "outputId": "28941338-2ef9-4088-8bd1-44295661e628"
+ },
+ "source": [
+ "num_steps = 1001\n",
+ "\n",
+ "with tf.Session(graph=graph) as session:\n",
+ " tf.initialize_all_variables().run()\n",
+ " print \"Initialized\"\n",
+ " for step in xrange(num_steps):\n",
+ " offset = (step * batch_size) % (train_labels.shape[0] - batch_size)\n",
+ " batch_data = train_dataset[offset:(offset + batch_size), :, :, :]\n",
+ " batch_labels = train_labels[offset:(offset + batch_size), :]\n",
+ " feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels}\n",
+ " _, l, predictions = session.run(\n",
+ " [optimizer, loss, train_prediction], feed_dict=feed_dict)\n",
+ " if (step % 50 == 0):\n",
+ " print \"Minibatch loss at step\", step, \":\", l\n",
+ " print \"Minibatch accuracy: %.1f%%\" % accuracy(predictions, batch_labels)\n",
+ " print \"Validation accuracy: %.1f%%\" % accuracy(\n",
+ " valid_prediction.eval(), valid_labels)\n",
+ " print \"Test accuracy: %.1f%%\" % accuracy(test_prediction.eval(), test_labels)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "klf21gpbAgb-",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 2\n---------\n\nTry to get the best performance you can using a convolutional net. Look for example at the classic [LeNet5](http://yann.lecun.com/exdb/lenet/) architecture, adding Dropout, and/or adding learning rate decay.\n\n---"
+ "output_type": "stream",
+ "text": [
+ "Initialized\n",
+ "Minibatch loss at step 0 : 3.51275\n",
+ "Minibatch accuracy: 6.2%\n",
+ "Validation accuracy: 12.8%\n",
+ "Minibatch loss at step 50 : 1.48703\n",
+ "Minibatch accuracy: 43.8%\n",
+ "Validation accuracy: 50.4%\n",
+ "Minibatch loss at step 100 : 1.04377\n",
+ "Minibatch accuracy: 68.8%\n",
+ "Validation accuracy: 67.4%\n",
+ "Minibatch loss at step 150 : 0.601682\n",
+ "Minibatch accuracy: 68.8%\n",
+ "Validation accuracy: 73.0%\n",
+ "Minibatch loss at step 200 : 0.898649\n",
+ "Minibatch accuracy: 75.0%\n",
+ "Validation accuracy: 77.8%\n",
+ "Minibatch loss at step 250 : 1.3637\n",
+ "Minibatch accuracy: 56.2%\n",
+ "Validation accuracy: 75.4%\n",
+ "Minibatch loss at step 300 : 1.41968\n",
+ "Minibatch accuracy: 62.5%\n",
+ "Validation accuracy: 76.0%\n",
+ "Minibatch loss at step 350 : 0.300648\n",
+ "Minibatch accuracy: 81.2%\n",
+ "Validation accuracy: 80.2%\n",
+ "Minibatch loss at step 400 : 1.32092\n",
+ "Minibatch accuracy: 56.2%\n",
+ "Validation accuracy: 80.4%\n",
+ "Minibatch loss at step 450 : 0.556701\n",
+ "Minibatch accuracy: 81.2%\n",
+ "Validation accuracy: 79.4%\n",
+ "Minibatch loss at step 500 : 1.65595\n",
+ "Minibatch accuracy: 43.8%\n",
+ "Validation accuracy: 79.6%\n",
+ "Minibatch loss at step 550 : 1.06995\n",
+ "Minibatch accuracy: 75.0%\n",
+ "Validation accuracy: 81.2%\n",
+ "Minibatch loss at step 600 : 0.223684\n",
+ "Minibatch accuracy: 100.0%\n",
+ "Validation accuracy: 82.3%\n",
+ "Minibatch loss at step 650 : 0.619602\n",
+ "Minibatch accuracy: 87.5%\n",
+ "Validation accuracy: 81.8%\n",
+ "Minibatch loss at step 700 : 0.812091\n",
+ "Minibatch accuracy: 75.0%\n",
+ "Validation accuracy: 82.4%\n",
+ "Minibatch loss at step 750 : 0.276302\n",
+ "Minibatch accuracy: 87.5%\n",
+ "Validation accuracy: 82.3%\n",
+ "Minibatch loss at step 800 : 0.450241\n",
+ "Minibatch accuracy: 81.2%\n",
+ "Validation accuracy: 82.3%\n",
+ "Minibatch loss at step 850 : 0.137139\n",
+ "Minibatch accuracy: 93.8%\n",
+ "Validation accuracy: 82.3%\n",
+ "Minibatch loss at step 900 : 0.52664\n",
+ "Minibatch accuracy: 75.0%\n",
+ "Validation accuracy: 82.2%\n",
+ "Minibatch loss at step 950 : 0.623835\n",
+ "Minibatch accuracy: 87.5%\n",
+ "Validation accuracy: 82.1%\n",
+ "Minibatch loss at step 1000 : 0.243114\n",
+ "Minibatch accuracy: 93.8%\n",
+ "Validation accuracy: 82.9%\n",
+ "Test accuracy: 90.0%\n"
+ ],
+ "name": "stdout"
}
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KedKkn4EutIK",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 1\n",
+ "---------\n",
+ "\n",
+ "The convolutional model above uses convolutions with stride 2 to reduce the dimensionality. Replace the strides by a max pooling operation (`nn.max_pool()`) of stride 2 and kernel size 2.\n",
+ "\n",
+ "---"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "klf21gpbAgb-",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 2\n",
+ "---------\n",
+ "\n",
+ "Try to get the best performance you can using a convolutional net. Look for example at the classic [LeNet5](http://yann.lecun.com/exdb/lenet/) architecture, adding Dropout, and/or adding learning rate decay.\n",
+ "\n",
+ "---"
]
}
- ],
- "metadata": {
- "name": "4_convolutions.ipynb",
- "colabVersion": "0.3.2",
- "colab_views": {},
- "colab_default_view": {}
- },
- "nbformat": 3,
- "nbformat_minor": 0
-}
+ ]
+} \ No newline at end of file
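
Problem 1 in the 4_convolutions.ipynb diff above asks to replace the stride-2 convolutions with max pooling of stride 2 and kernel size 2. A minimal sketch of a modified `model` function, keeping the notebook's layer variables, switching the convolutions to stride 1 and letting `tf.nn.max_pool` do the spatial reduction; this is a sketch, not the notebook's own solution:

    def model(data):
      # Stride-1 convolutions; each 2x2 max pooling now halves the spatial size.
      conv = tf.nn.conv2d(data, layer1_weights, [1, 1, 1, 1], padding='SAME')
      hidden = tf.nn.relu(conv + layer1_biases)
      pool = tf.nn.max_pool(hidden, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')
      conv = tf.nn.conv2d(pool, layer2_weights, [1, 1, 1, 1], padding='SAME')
      hidden = tf.nn.relu(conv + layer2_biases)
      pool = tf.nn.max_pool(hidden, ksize=[1, 2, 2, 1],
                            strides=[1, 2, 2, 1], padding='SAME')
      # Two poolings reduce 28x28 to 7x7, so layer3_weights keeps its shape.
      shape = pool.get_shape().as_list()
      reshape = tf.reshape(pool, [shape[0], shape[1] * shape[2] * shape[3]])
      hidden = tf.nn.relu(tf.matmul(reshape, layer3_weights) + layer3_biases)
      return tf.matmul(hidden, layer4_weights) + layer4_biases
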
diff --git a/tensorflow/examples/udacity/5_word2vec.ipynb b/tensorflow/examples/udacity/5_word2vec.ipynb
index d2da9828d0..cf704cecb4 100644
--- a/tensorflow/examples/udacity/5_word2vec.ipynb
+++ b/tensorflow/examples/udacity/5_word2vec.ipynb
@@ -6,7 +6,8 @@
"version": "0.3.2",
"views": {},
"default_view": {},
- "name": "5_word2vec.ipynb"
+ "name": "5_word2vec.ipynb",
+ "provenance": []
}
},
"cells": [
diff --git a/tensorflow/examples/udacity/6_lstm.ipynb b/tensorflow/examples/udacity/6_lstm.ipynb
index 1db35d2315..8e755dfc95 100644
--- a/tensorflow/examples/udacity/6_lstm.ipynb
+++ b/tensorflow/examples/udacity/6_lstm.ipynb
@@ -1,433 +1,1066 @@
{
- "worksheets": [
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "version": "0.3.2",
+ "views": {},
+ "default_view": {},
+ "name": "6_lstm.ipynb",
+ "provenance": []
+ }
+ },
+ "cells": [
{
- "cells": [
- {
- "metadata": {
- "id": "8tQJd2YSCfWR",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": ""
- },
- {
- "metadata": {
- "id": "D7tqLMoKF6uq",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Deep Learning\n=============\n\nAssignment 6\n------------\n\nAfter training a skip-gram model in `5_word2vec.ipynb`, the goal of this notebook is to train a LSTM character model over [Text8](http://mattmahoney.net/dc/textdata) data."
- },
- {
- "metadata": {
- "id": "MvEblsgEXxrd",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "# These are all the modules we'll be using later. Make sure you can import them\n# before proceeding further.\nimport os\nimport numpy as np\nimport random\nimport string\nimport tensorflow as tf\nimport urllib\nimport zipfile",
- "language": "python",
- "outputs": []
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "8tQJd2YSCfWR",
+ "colab_type": "text"
+ },
+ "source": [
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "D7tqLMoKF6uq",
+ "colab_type": "text"
+ },
+ "source": [
+ "Deep Learning\n",
+ "=============\n",
+ "\n",
+ "Assignment 6\n",
+ "------------\n",
+ "\n",
+ "After training a skip-gram model in `5_word2vec.ipynb`, the goal of this notebook is to train a LSTM character model over [Text8](http://mattmahoney.net/dc/textdata) data."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "MvEblsgEXxrd",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "RJ-o3UBUFtCw",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 5993,
- "status": "ok",
- "timestamp": 1445965582896,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "d530534e-0791-4a94-ca6d-1c8f1b908a9e"
+ "cellView": "both"
+ },
+ "source": [
+ "# These are all the modules we'll be using later. Make sure you can import them\n",
+ "# before proceeding further.\n",
+ "import os\n",
+ "import numpy as np\n",
+ "import random\n",
+ "import string\n",
+ "import tensorflow as tf\n",
+ "import urllib\n",
+ "import zipfile"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "RJ-o3UBUFtCw",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "url = 'http://mattmahoney.net/dc/'\n\ndef maybe_download(filename, expected_bytes):\n \"\"\"Download a file if not present, and make sure it's the right size.\"\"\"\n if not os.path.exists(filename):\n filename, _ = urllib.urlretrieve(url + filename, filename)\n statinfo = os.stat(filename)\n if statinfo.st_size == expected_bytes:\n print 'Found and verified', filename\n else:\n print statinfo.st_size\n raise Exception(\n 'Failed to verify ' + filename + '. Can you get to it with a browser?')\n return filename\n\nfilename = maybe_download('text8.zip', 31344016)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Found and verified text8.zip\n"
+ "item_id": 1
}
]
},
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 5993,
+ "status": "ok",
+ "timestamp": 1445965582896,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
+ },
+ "user_tz": 420
+ },
+ "outputId": "d530534e-0791-4a94-ca6d-1c8f1b908a9e"
+ },
+ "source": [
+ "url = 'http://mattmahoney.net/dc/'\n",
+ "\n",
+ "def maybe_download(filename, expected_bytes):\n",
+ " \"\"\"Download a file if not present, and make sure it's the right size.\"\"\"\n",
+ " if not os.path.exists(filename):\n",
+ " filename, _ = urllib.urlretrieve(url + filename, filename)\n",
+ " statinfo = os.stat(filename)\n",
+ " if statinfo.st_size == expected_bytes:\n",
+ " print 'Found and verified', filename\n",
+ " else:\n",
+ " print statinfo.st_size\n",
+ " raise Exception(\n",
+ " 'Failed to verify ' + filename + '. Can you get to it with a browser?')\n",
+ " return filename\n",
+ "\n",
+ "filename = maybe_download('text8.zip', 31344016)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "Mvf09fjugFU_",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 5982,
- "status": "ok",
- "timestamp": 1445965582916,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "8f75db58-3862-404b-a0c3-799380597390"
+ "output_type": "stream",
+ "text": [
+ "Found and verified text8.zip\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
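The helper above downloads text8.zip once and verifies it only by byte count. As written it is Python 2 code (print statements, urllib.urlretrieve); a minimal sketch of the same helper under Python 3, assuming the same url and expected_bytes values, would be:

import os
from urllib.request import urlretrieve

url = 'http://mattmahoney.net/dc/'

def maybe_download(filename, expected_bytes):
    """Download a file if not present, then check that its size matches."""
    if not os.path.exists(filename):
        filename, _ = urlretrieve(url + filename, filename)
    statinfo = os.stat(filename)
    if statinfo.st_size != expected_bytes:
        raise Exception('Failed to verify ' + filename +
                        '. Can you get to it with a browser?')
    print('Found and verified', filename)
    return filename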
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "Mvf09fjugFU_",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "def read_data(filename):\n f = zipfile.ZipFile(filename)\n for name in f.namelist():\n return f.read(name)\n f.close()\n \ntext = read_data(filename)\nprint \"Data size\", len(text)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Data size 100000000\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "ga2CYACE-ghb",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 5982,
+ "status": "ok",
+ "timestamp": 1445965582916,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Create a small validation set."
+ "user_tz": 420
},
+ "outputId": "8f75db58-3862-404b-a0c3-799380597390"
+ },
+ "source": [
+ "def read_data(filename):\n",
+ " f = zipfile.ZipFile(filename)\n",
+ " for name in f.namelist():\n",
+ " return f.read(name)\n",
+ " f.close()\n",
+ " \n",
+ "text = read_data(filename)\n",
+ "print \"Data size\", len(text)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "w-oBpfFG-j43",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 6184,
- "status": "ok",
- "timestamp": 1445965583138,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "bdb96002-d021-4379-f6de-a977924f0d02"
+ "output_type": "stream",
+ "text": [
+ "Data size 100000000\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
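In read_data above, the f.close() after the return statement never runs, so the ZipFile handle is simply left open (harmless in a short-lived notebook, but easy to tidy). An equivalent sketch that closes the archive, assuming text8.zip contains a single member as it does here:

import zipfile

def read_data(filename):
    """Return the contents of the first member of the zip archive as a string."""
    with zipfile.ZipFile(filename) as f:
        return f.read(f.namelist()[0])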
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ga2CYACE-ghb",
+ "colab_type": "text"
+ },
+ "source": [
+ "Create a small validation set."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "w-oBpfFG-j43",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "valid_size = 1000\nvalid_text = text[:valid_size]\ntrain_text = text[valid_size:]\ntrain_size = len(train_text)\nprint train_size, train_text[:64]\nprint valid_size, valid_text[:64]",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "99999000 ons anarchists advocate social relations based upon voluntary as\n1000 anarchism originated as a term of abuse first used against earl\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "Zdw6i4F8glpp",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 6184,
+ "status": "ok",
+ "timestamp": 1445965583138,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Utility functions to map characters to vocabulary IDs and back."
+ "user_tz": 420
},
+ "outputId": "bdb96002-d021-4379-f6de-a977924f0d02"
+ },
+ "source": [
+ "valid_size = 1000\n",
+ "valid_text = text[:valid_size]\n",
+ "train_text = text[valid_size:]\n",
+ "train_size = len(train_text)\n",
+ "print train_size, train_text[:64]\n",
+ "print valid_size, valid_text[:64]"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "gAL1EECXeZsD",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 6276,
- "status": "ok",
- "timestamp": 1445965583249,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "88fc9032-feb9-45ff-a9a0-a26759cc1f2e"
+ "output_type": "stream",
+ "text": [
+ "99999000 ons anarchists advocate social relations based upon voluntary as\n",
+ "1000 anarchism originated as a term of abuse first used against earl\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Zdw6i4F8glpp",
+ "colab_type": "text"
+ },
+ "source": [
+ "Utility functions to map characters to vocabulary IDs and back."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "gAL1EECXeZsD",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "vocabulary_size = len(string.ascii_lowercase) + 1 # [a-z] + ' '\nfirst_letter = ord(string.ascii_lowercase[0])\n\ndef char2id(char):\n if char in string.ascii_lowercase:\n return ord(char) - first_letter + 1\n elif char == ' ':\n return 0\n else:\n print 'Unexpected character:', char\n return 0\n \ndef id2char(dictid):\n if dictid > 0:\n return chr(dictid + first_letter - 1)\n else:\n return ' '\n\nprint char2id('a'), char2id('z'), char2id(' '), char2id('\u00ef')\nprint id2char(1), id2char(26), id2char(0)",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "1 26 0 Unexpected character: \u00ef\n0\na z \n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "lFwoyygOmWsL",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 6276,
+ "status": "ok",
+ "timestamp": 1445965583249,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "Function to generate a training batch for the LSTM model."
+ "user_tz": 420
},
+ "outputId": "88fc9032-feb9-45ff-a9a0-a26759cc1f2e"
+ },
+ "source": [
+ "vocabulary_size = len(string.ascii_lowercase) + 1 # [a-z] + ' '\n",
+ "first_letter = ord(string.ascii_lowercase[0])\n",
+ "\n",
+ "def char2id(char):\n",
+ " if char in string.ascii_lowercase:\n",
+ " return ord(char) - first_letter + 1\n",
+ " elif char == ' ':\n",
+ " return 0\n",
+ " else:\n",
+ " print 'Unexpected character:', char\n",
+ " return 0\n",
+ " \n",
+ "def id2char(dictid):\n",
+ " if dictid > 0:\n",
+ " return chr(dictid + first_letter - 1)\n",
+ " else:\n",
+ " return ' '\n",
+ "\n",
+ "print char2id('a'), char2id('z'), char2id(' '), char2id('\u00ef')\n",
+ "print id2char(1), id2char(26), id2char(0)"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "d9wMtjy5hCj9",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 1
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 6473,
- "status": "ok",
- "timestamp": 1445965583467,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "3dd79c80-454a-4be0-8b71-4a4a357b3367"
+ "output_type": "stream",
+ "text": [
+ "1 26 0 Unexpected character: \u00ef\n",
+ "0\n",
+ "a z \n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
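The mapping above reserves id 0 for the space character and ids 1..26 for 'a'..'z'; anything else is reported and collapsed to 0. A tiny round-trip check, assuming only the two functions defined above:

import string

assert char2id(' ') == 0 and id2char(0) == ' '
assert [char2id(c) for c in 'abz'] == [1, 2, 26]
assert all(id2char(char2id(c)) == c for c in string.ascii_lowercase + ' ')
# Unmapped characters print a warning and fall back to the space id:
assert char2id('3') == 0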
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "lFwoyygOmWsL",
+ "colab_type": "text"
+ },
+ "source": [
+ "Function to generate a training batch for the LSTM model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "d9wMtjy5hCj9",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "batch_size=64\nnum_unrollings=10\n\nclass BatchGenerator(object):\n def __init__(self, text, batch_size, num_unrollings):\n self._text = text\n self._text_size = len(text)\n self._batch_size = batch_size\n self._num_unrollings = num_unrollings\n segment = self._text_size / batch_size\n self._cursor = [ offset * segment for offset in xrange(batch_size)]\n self._last_batch = self._next_batch()\n \n def _next_batch(self):\n \"\"\"Generate a single batch from the current cursor position in the data.\"\"\"\n batch = np.zeros(shape=(self._batch_size, vocabulary_size), dtype=np.float)\n for b in xrange(self._batch_size):\n batch[b, char2id(self._text[self._cursor[b]])] = 1.0\n self._cursor[b] = (self._cursor[b] + 1) % self._text_size\n return batch\n \n def next(self):\n \"\"\"Generate the next array of batches from the data. The array consists of\n the last batch of the previous array, followed by num_unrollings new ones.\n \"\"\"\n batches = [self._last_batch]\n for step in xrange(self._num_unrollings):\n batches.append(self._next_batch())\n self._last_batch = batches[-1]\n return batches\n\ndef characters(probabilities):\n \"\"\"Turn a 1-hot encoding or a probability distribution over the possible\n characters back into its (mostl likely) character representation.\"\"\"\n return [id2char(c) for c in np.argmax(probabilities, 1)]\n\ndef batches2string(batches):\n \"\"\"Convert a sequence of batches back into their (most likely) string\n representation.\"\"\"\n s = [''] * batches[0].shape[0]\n for b in batches:\n s = [''.join(x) for x in zip(s, characters(b))]\n return s\n\ntrain_batches = BatchGenerator(train_text, batch_size, num_unrollings)\nvalid_batches = BatchGenerator(valid_text, 1, 1)\n\nprint batches2string(train_batches.next())\nprint batches2string(train_batches.next())\nprint batches2string(valid_batches.next())\nprint batches2string(valid_batches.next())",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "['ons anarchi', 'when milita', 'lleria arch', ' abbeys and', 'married urr', 'hel and ric', 'y and litur', 'ay opened f', 'tion from t', 'migration t', 'new york ot', 'he boeing s', 'e listed wi', 'eber has pr', 'o be made t', 'yer who rec', 'ore signifi', 'a fierce cr', ' two six ei', 'aristotle s', 'ity can be ', ' and intrac', 'tion of the', 'dy to pass ', 'f certain d', 'at it will ', 'e convince ', 'ent told hi', 'ampaign and', 'rver side s', 'ious texts ', 'o capitaliz', 'a duplicate', 'gh ann es d', 'ine january', 'ross zero t', 'cal theorie', 'ast instanc', ' dimensiona', 'most holy m', 't s support', 'u is still ', 'e oscillati', 'o eight sub', 'of italy la', 's the tower', 'klahoma pre', 'erprise lin', 'ws becomes ', 'et in a naz', 'the fabian ', 'etchy to re', ' sharman ne', 'ised empero', 'ting in pol', 'd neo latin', 'th risky ri', 'encyclopedi', 'fense the a', 'duating fro', 'treet grid ', 'ations more', 'appeal of d', 'si have mad']\n['ists advoca', 'ary governm', 'hes nationa', 'd monasteri', 'raca prince', 'chard baer ', 'rgical lang', 'for passeng', 'the nationa', 'took place ', 'ther well k', 'seven six s', 'ith a gloss', 'robably bee', 'to recogniz', 'ceived the ', 'icant than ', 'ritic of th', 'ight in sig', 's uncaused ', ' lost as in', 'cellular ic', 'e size of t', ' him a stic', 'drugs confu', ' take to co', ' the priest', 'im to name ', 'd barred at', 'standard fo', ' such as es', 'ze on the g', 'e of the or', 'd hiver one', 'y eight mar', 'the lead ch', 'es classica', 'ce the non ', 'al analysis', 'mormons bel', 't or at lea', ' disagreed ', 'ing system ', 'btypes base', 'anguages th', 'r commissio', 'ess one nin', 'nux suse li', ' the first ', 'zi concentr', ' society ne', 'elatively s', 'etworks sha', 'or hirohito', 'litical ini', 'n most of t', 'iskerdoo ri', 'ic overview', 'air compone', 'om acnm acc', ' centerline', 'e than any ', 'devotional ', 'de such dev']\n[' a']\n['an']\n"
+ "item_id": 1
}
]
},
- {
- "metadata": {
- "id": "KyVd8FxT5QBc",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 6473,
+ "status": "ok",
+ "timestamp": 1445965583467,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
},
- "cell_type": "code",
- "input": "def logprob(predictions, labels):\n \"\"\"Log-probability of the true labels in a predicted batch.\"\"\"\n predictions[predictions < 1e-10] = 1e-10\n return np.sum(np.multiply(labels, -np.log(predictions))) / labels.shape[0]\n\ndef sample_distribution(distribution):\n \"\"\"Sample one element from a distribution assumed to be an array of normalized\n probabilities.\n \"\"\"\n r = random.uniform(0, 1)\n s = 0\n for i in xrange(len(distribution)):\n s += distribution[i]\n if s >= r:\n return i\n return len(distribution) - 1\n\ndef sample(prediction):\n \"\"\"Turn a (column) prediction into 1-hot encoded samples.\"\"\"\n p = np.zeros(shape=[1, vocabulary_size], dtype=np.float)\n p[0, sample_distribution(prediction[0])] = 1.0\n return p\n\ndef random_distribution():\n \"\"\"Generate a random column of probabilities.\"\"\"\n b = np.random.uniform(0.0, 1.0, size=[1, vocabulary_size])\n return b/np.sum(b, 1)[:,None]",
- "language": "python",
- "outputs": []
+ "user_tz": 420
},
+ "outputId": "3dd79c80-454a-4be0-8b71-4a4a357b3367"
+ },
+ "source": [
+ "batch_size=64\n",
+ "num_unrollings=10\n",
+ "\n",
+ "class BatchGenerator(object):\n",
+ " def __init__(self, text, batch_size, num_unrollings):\n",
+ " self._text = text\n",
+ " self._text_size = len(text)\n",
+ " self._batch_size = batch_size\n",
+ " self._num_unrollings = num_unrollings\n",
+ " segment = self._text_size / batch_size\n",
+ " self._cursor = [ offset * segment for offset in xrange(batch_size)]\n",
+ " self._last_batch = self._next_batch()\n",
+ " \n",
+ " def _next_batch(self):\n",
+ " \"\"\"Generate a single batch from the current cursor position in the data.\"\"\"\n",
+ " batch = np.zeros(shape=(self._batch_size, vocabulary_size), dtype=np.float)\n",
+ " for b in xrange(self._batch_size):\n",
+ " batch[b, char2id(self._text[self._cursor[b]])] = 1.0\n",
+ " self._cursor[b] = (self._cursor[b] + 1) % self._text_size\n",
+ " return batch\n",
+ " \n",
+ " def next(self):\n",
+ " \"\"\"Generate the next array of batches from the data. The array consists of\n",
+ " the last batch of the previous array, followed by num_unrollings new ones.\n",
+ " \"\"\"\n",
+ " batches = [self._last_batch]\n",
+ " for step in xrange(self._num_unrollings):\n",
+ " batches.append(self._next_batch())\n",
+ " self._last_batch = batches[-1]\n",
+ " return batches\n",
+ "\n",
+ "def characters(probabilities):\n",
+ " \"\"\"Turn a 1-hot encoding or a probability distribution over the possible\n",
+        "  characters back into its (most likely) character representation.\"\"\"\n",

+ " return [id2char(c) for c in np.argmax(probabilities, 1)]\n",
+ "\n",
+ "def batches2string(batches):\n",
+ " \"\"\"Convert a sequence of batches back into their (most likely) string\n",
+ " representation.\"\"\"\n",
+ " s = [''] * batches[0].shape[0]\n",
+ " for b in batches:\n",
+ " s = [''.join(x) for x in zip(s, characters(b))]\n",
+ " return s\n",
+ "\n",
+ "train_batches = BatchGenerator(train_text, batch_size, num_unrollings)\n",
+ "valid_batches = BatchGenerator(valid_text, 1, 1)\n",
+ "\n",
+ "print batches2string(train_batches.next())\n",
+ "print batches2string(train_batches.next())\n",
+ "print batches2string(valid_batches.next())\n",
+ "print batches2string(valid_batches.next())"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "K8f67YXaDr4C",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "Simple LSTM Model."
+ "output_type": "stream",
+ "text": [
+ "['ons anarchi', 'when milita', 'lleria arch', ' abbeys and', 'married urr', 'hel and ric', 'y and litur', 'ay opened f', 'tion from t', 'migration t', 'new york ot', 'he boeing s', 'e listed wi', 'eber has pr', 'o be made t', 'yer who rec', 'ore signifi', 'a fierce cr', ' two six ei', 'aristotle s', 'ity can be ', ' and intrac', 'tion of the', 'dy to pass ', 'f certain d', 'at it will ', 'e convince ', 'ent told hi', 'ampaign and', 'rver side s', 'ious texts ', 'o capitaliz', 'a duplicate', 'gh ann es d', 'ine january', 'ross zero t', 'cal theorie', 'ast instanc', ' dimensiona', 'most holy m', 't s support', 'u is still ', 'e oscillati', 'o eight sub', 'of italy la', 's the tower', 'klahoma pre', 'erprise lin', 'ws becomes ', 'et in a naz', 'the fabian ', 'etchy to re', ' sharman ne', 'ised empero', 'ting in pol', 'd neo latin', 'th risky ri', 'encyclopedi', 'fense the a', 'duating fro', 'treet grid ', 'ations more', 'appeal of d', 'si have mad']\n",
+ "['ists advoca', 'ary governm', 'hes nationa', 'd monasteri', 'raca prince', 'chard baer ', 'rgical lang', 'for passeng', 'the nationa', 'took place ', 'ther well k', 'seven six s', 'ith a gloss', 'robably bee', 'to recogniz', 'ceived the ', 'icant than ', 'ritic of th', 'ight in sig', 's uncaused ', ' lost as in', 'cellular ic', 'e size of t', ' him a stic', 'drugs confu', ' take to co', ' the priest', 'im to name ', 'd barred at', 'standard fo', ' such as es', 'ze on the g', 'e of the or', 'd hiver one', 'y eight mar', 'the lead ch', 'es classica', 'ce the non ', 'al analysis', 'mormons bel', 't or at lea', ' disagreed ', 'ing system ', 'btypes base', 'anguages th', 'r commissio', 'ess one nin', 'nux suse li', ' the first ', 'zi concentr', ' society ne', 'elatively s', 'etworks sha', 'or hirohito', 'litical ini', 'n most of t', 'iskerdoo ri', 'ic overview', 'air compone', 'om acnm acc', ' centerline', 'e than any ', 'devotional ', 'de such dev']\n",
+ "[' a']\n",
+ "['an']\n"
+ ],
+ "name": "stdout"
+ }
+ ],
+ "execution_count": 0
+ },
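The generator above cuts the text into batch_size equal segments and keeps one cursor per segment, so row b of every batch is the next character of segment b. Each call to next() returns num_unrollings + 1 one-hot arrays of shape (batch_size, vocabulary_size), and the first array repeats the last array of the previous call, which is why the printed 11-character strings from consecutive calls overlap by one character. A rough sketch of the cursor arithmetic, assuming the sizes printed above (Python 2, like the notebook):

text_size = 99999000           # len(train_text)
batch_size = 64
num_unrollings = 10

segment = text_size // batch_size          # 1562484 characters between row cursors
cursors = [b * segment for b in range(batch_size)]
print segment, cursors[:3]                 # 1562484 [0, 1562484, 3124968]
# Each call to next() advances every cursor by num_unrollings characters,
# wrapping modulo text_size.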
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "KyVd8FxT5QBc",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "Q5rxZK6RDuGe",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- }
- },
- "cellView": "both"
- },
- "cell_type": "code",
- "input": "num_nodes = 64\n\ngraph = tf.Graph()\nwith graph.as_default():\n \n # Parameters:\n # Input gate: input, previous output, and bias.\n ix = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n im = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n ib = tf.Variable(tf.zeros([1, num_nodes]))\n # Forget gate: input, previous output, and bias.\n fx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n fm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n fb = tf.Variable(tf.zeros([1, num_nodes]))\n # Memory cell: input, state and bias. \n cx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n cm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n cb = tf.Variable(tf.zeros([1, num_nodes]))\n # Output gate: input, previous output, and bias.\n ox = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n om = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n ob = tf.Variable(tf.zeros([1, num_nodes]))\n # Variables saving state across unrollings.\n saved_output = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False)\n saved_state = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False)\n # Classifier weights and biases.\n w = tf.Variable(tf.truncated_normal([num_nodes, vocabulary_size], -0.1, 0.1))\n b = tf.Variable(tf.zeros([vocabulary_size]))\n \n # Definition of the cell computation.\n def lstm_cell(i, o, state):\n \"\"\"Create a LSTM cell. See e.g.: http://arxiv.org/pdf/1402.1128v1.pdf\n Note that in this formulation, we omit the various connections between the\n previous state and the gates.\"\"\"\n input_gate = tf.sigmoid(tf.matmul(i, ix) + tf.matmul(o, im) + ib)\n forget_gate = tf.sigmoid(tf.matmul(i, fx) + tf.matmul(o, fm) + fb)\n update = tf.matmul(i, cx) + tf.matmul(o, cm) + cb\n state = forget_gate * state + input_gate * tf.tanh(update)\n output_gate = tf.sigmoid(tf.matmul(i, ox) + tf.matmul(o, om) + ob)\n return output_gate * tf.tanh(state), state\n\n # Input data.\n train_data = list()\n for _ in xrange(num_unrollings + 1):\n train_data.append(\n tf.placeholder(tf.float32, shape=[batch_size,vocabulary_size]))\n train_inputs = train_data[:num_unrollings]\n train_labels = train_data[1:] # labels are inputs shifted by one time step.\n\n # Unrolled LSTM loop.\n outputs = list()\n output = saved_output\n state = saved_state\n for i in train_inputs:\n output, state = lstm_cell(i, output, state)\n outputs.append(output)\n\n # State saving across unrollings.\n with tf.control_dependencies([saved_output.assign(output),\n saved_state.assign(state)]):\n # Classifier.\n logits = tf.nn.xw_plus_b(tf.concat(0, outputs), w, b)\n loss = tf.reduce_mean(\n tf.nn.softmax_cross_entropy_with_logits(\n logits, tf.concat(0, train_labels)))\n\n # Optimizer.\n global_step = tf.Variable(0)\n learning_rate = tf.train.exponential_decay(\n 10.0, global_step, 5000, 0.1, staircase=True)\n optimizer = tf.train.GradientDescentOptimizer(learning_rate)\n gradients, v = zip(*optimizer.compute_gradients(loss))\n gradients, _ = tf.clip_by_global_norm(gradients, 1.25)\n optimizer = optimizer.apply_gradients(\n zip(gradients, v), global_step=global_step)\n\n # Predictions.\n train_prediction = tf.nn.softmax(logits)\n \n # Sampling and validation eval: batch 1, no unrolling.\n sample_input = tf.placeholder(tf.float32, shape=[1, vocabulary_size])\n saved_sample_output = tf.Variable(tf.zeros([1, num_nodes]))\n saved_sample_state 
= tf.Variable(tf.zeros([1, num_nodes]))\n reset_sample_state = tf.group(\n saved_sample_output.assign(tf.zeros([1, num_nodes])),\n saved_sample_state.assign(tf.zeros([1, num_nodes])))\n sample_output, sample_state = lstm_cell(\n sample_input, saved_sample_output, saved_sample_state)\n with tf.control_dependencies([saved_sample_output.assign(sample_output),\n saved_sample_state.assign(sample_state)]):\n sample_prediction = tf.nn.softmax(tf.nn.xw_plus_b(sample_output, w, b))",
- "language": "python",
- "outputs": []
+ "cellView": "both"
+ },
+ "source": [
+ "def logprob(predictions, labels):\n",
+ " \"\"\"Log-probability of the true labels in a predicted batch.\"\"\"\n",
+ " predictions[predictions < 1e-10] = 1e-10\n",
+ " return np.sum(np.multiply(labels, -np.log(predictions))) / labels.shape[0]\n",
+ "\n",
+ "def sample_distribution(distribution):\n",
+ " \"\"\"Sample one element from a distribution assumed to be an array of normalized\n",
+ " probabilities.\n",
+ " \"\"\"\n",
+ " r = random.uniform(0, 1)\n",
+ " s = 0\n",
+ " for i in xrange(len(distribution)):\n",
+ " s += distribution[i]\n",
+ " if s >= r:\n",
+ " return i\n",
+ " return len(distribution) - 1\n",
+ "\n",
+ "def sample(prediction):\n",
+ " \"\"\"Turn a (column) prediction into 1-hot encoded samples.\"\"\"\n",
+ " p = np.zeros(shape=[1, vocabulary_size], dtype=np.float)\n",
+ " p[0, sample_distribution(prediction[0])] = 1.0\n",
+ " return p\n",
+ "\n",
+ "def random_distribution():\n",
+ " \"\"\"Generate a random column of probabilities.\"\"\"\n",
+ " b = np.random.uniform(0.0, 1.0, size=[1, vocabulary_size])\n",
+ " return b/np.sum(b, 1)[:,None]"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
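sample_distribution above is inverse-CDF sampling: draw r uniformly from [0, 1] and return the first index whose cumulative probability reaches r. logprob is the average cross-entropy per prediction, which the training loop below turns into perplexity via np.exp. A vectorized equivalent of the sampler, assuming the same normalized-probability input:

import numpy as np

def sample_distribution_np(distribution):
    """Inverse-CDF sampling; same behavior as the explicit loop above."""
    r = np.random.uniform(0.0, 1.0)
    idx = np.searchsorted(np.cumsum(distribution), r)
    return int(min(idx, len(distribution) - 1))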
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "K8f67YXaDr4C",
+ "colab_type": "text"
+ },
+ "source": [
+ "Simple LSTM Model."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "Q5rxZK6RDuGe",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
+ }
},
- {
- "metadata": {
- "id": "RD9zQCZTEaEm",
- "colab_type": "code",
- "colab": {
- "autoexec": {
- "startup": false,
- "wait_interval": 0
- },
- "output_extras": [
- {
- "item_id": 41
- },
- {
- "item_id": 80
- },
- {
- "item_id": 126
- },
- {
- "item_id": 144
- }
- ]
- },
- "cellView": "both",
- "executionInfo": {
- "elapsed": 199909,
- "status": "ok",
- "timestamp": 1445965877333,
- "user": {
- "color": "#1FA15D",
- "displayName": "Vincent Vanhoucke",
- "isAnonymous": false,
- "isMe": true,
- "permissionId": "05076109866853157986",
- "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
- "sessionId": "6f6f07b359200c46",
- "userId": "102167687554210253930"
- },
- "user_tz": 420
- },
- "outputId": "5e868466-2532-4545-ce35-b403cf5d9de6"
+ "cellView": "both"
+ },
+ "source": [
+ "num_nodes = 64\n",
+ "\n",
+ "graph = tf.Graph()\n",
+ "with graph.as_default():\n",
+ " \n",
+ " # Parameters:\n",
+ " # Input gate: input, previous output, and bias.\n",
+ " ix = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n",
+ " im = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n",
+ " ib = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " # Forget gate: input, previous output, and bias.\n",
+ " fx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n",
+ " fm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n",
+ " fb = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " # Memory cell: input, state and bias. \n",
+ " cx = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n",
+ " cm = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n",
+ " cb = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " # Output gate: input, previous output, and bias.\n",
+ " ox = tf.Variable(tf.truncated_normal([vocabulary_size, num_nodes], -0.1, 0.1))\n",
+ " om = tf.Variable(tf.truncated_normal([num_nodes, num_nodes], -0.1, 0.1))\n",
+ " ob = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " # Variables saving state across unrollings.\n",
+ " saved_output = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False)\n",
+ " saved_state = tf.Variable(tf.zeros([batch_size, num_nodes]), trainable=False)\n",
+ " # Classifier weights and biases.\n",
+ " w = tf.Variable(tf.truncated_normal([num_nodes, vocabulary_size], -0.1, 0.1))\n",
+ " b = tf.Variable(tf.zeros([vocabulary_size]))\n",
+ " \n",
+ " # Definition of the cell computation.\n",
+ " def lstm_cell(i, o, state):\n",
+ " \"\"\"Create a LSTM cell. See e.g.: http://arxiv.org/pdf/1402.1128v1.pdf\n",
+ " Note that in this formulation, we omit the various connections between the\n",
+ " previous state and the gates.\"\"\"\n",
+ " input_gate = tf.sigmoid(tf.matmul(i, ix) + tf.matmul(o, im) + ib)\n",
+ " forget_gate = tf.sigmoid(tf.matmul(i, fx) + tf.matmul(o, fm) + fb)\n",
+ " update = tf.matmul(i, cx) + tf.matmul(o, cm) + cb\n",
+ " state = forget_gate * state + input_gate * tf.tanh(update)\n",
+ " output_gate = tf.sigmoid(tf.matmul(i, ox) + tf.matmul(o, om) + ob)\n",
+ " return output_gate * tf.tanh(state), state\n",
+ "\n",
+ " # Input data.\n",
+ " train_data = list()\n",
+ " for _ in xrange(num_unrollings + 1):\n",
+ " train_data.append(\n",
+ " tf.placeholder(tf.float32, shape=[batch_size,vocabulary_size]))\n",
+ " train_inputs = train_data[:num_unrollings]\n",
+ " train_labels = train_data[1:] # labels are inputs shifted by one time step.\n",
+ "\n",
+ " # Unrolled LSTM loop.\n",
+ " outputs = list()\n",
+ " output = saved_output\n",
+ " state = saved_state\n",
+ " for i in train_inputs:\n",
+ " output, state = lstm_cell(i, output, state)\n",
+ " outputs.append(output)\n",
+ "\n",
+ " # State saving across unrollings.\n",
+ " with tf.control_dependencies([saved_output.assign(output),\n",
+ " saved_state.assign(state)]):\n",
+ " # Classifier.\n",
+ " logits = tf.nn.xw_plus_b(tf.concat(0, outputs), w, b)\n",
+ " loss = tf.reduce_mean(\n",
+ " tf.nn.softmax_cross_entropy_with_logits(\n",
+ " logits, tf.concat(0, train_labels)))\n",
+ "\n",
+ " # Optimizer.\n",
+ " global_step = tf.Variable(0)\n",
+ " learning_rate = tf.train.exponential_decay(\n",
+ " 10.0, global_step, 5000, 0.1, staircase=True)\n",
+ " optimizer = tf.train.GradientDescentOptimizer(learning_rate)\n",
+ " gradients, v = zip(*optimizer.compute_gradients(loss))\n",
+ " gradients, _ = tf.clip_by_global_norm(gradients, 1.25)\n",
+ " optimizer = optimizer.apply_gradients(\n",
+ " zip(gradients, v), global_step=global_step)\n",
+ "\n",
+ " # Predictions.\n",
+ " train_prediction = tf.nn.softmax(logits)\n",
+ " \n",
+ " # Sampling and validation eval: batch 1, no unrolling.\n",
+ " sample_input = tf.placeholder(tf.float32, shape=[1, vocabulary_size])\n",
+ " saved_sample_output = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " saved_sample_state = tf.Variable(tf.zeros([1, num_nodes]))\n",
+ " reset_sample_state = tf.group(\n",
+ " saved_sample_output.assign(tf.zeros([1, num_nodes])),\n",
+ " saved_sample_state.assign(tf.zeros([1, num_nodes])))\n",
+ " sample_output, sample_state = lstm_cell(\n",
+ " sample_input, saved_sample_output, saved_sample_state)\n",
+ " with tf.control_dependencies([saved_sample_output.assign(sample_output),\n",
+ " saved_sample_state.assign(sample_state)]):\n",
+ " sample_prediction = tf.nn.softmax(tf.nn.xw_plus_b(sample_output, w, b))"
+ ],
+ "outputs": [],
+ "execution_count": 0
+ },
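In terms of the variables above, lstm_cell computes the standard LSTM update: three sigmoid gates (input, forget, output) and a tanh cell candidate, each an affine function of the current input and the previous output, with the peephole connections from the previous state omitted as the docstring notes. For scale, a quick count of the weights held by the cell and the classifier, assuming vocabulary_size = 27 and num_nodes = 64 as set above:

vocabulary_size, num_nodes = 27, 64
per_gate = vocabulary_size * num_nodes + num_nodes * num_nodes + num_nodes  # e.g. ix + im + ib
classifier = num_nodes * vocabulary_size + vocabulary_size                  # w + b
print 4 * per_gate + classifier   # 25307 weights in total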
+ {
+ "cell_type": "code",
+ "metadata": {
+ "id": "RD9zQCZTEaEm",
+ "colab_type": "code",
+ "colab": {
+ "autoexec": {
+ "startup": false,
+ "wait_interval": 0
},
- "cell_type": "code",
- "input": "num_steps = 7001\nsummary_frequency = 100\n\nwith tf.Session(graph=graph) as session:\n tf.initialize_all_variables().run()\n print 'Initialized'\n mean_loss = 0\n for step in xrange(num_steps):\n batches = train_batches.next()\n feed_dict = dict()\n for i in xrange(num_unrollings + 1):\n feed_dict[train_data[i]] = batches[i]\n _, l, predictions, lr = session.run(\n [optimizer, loss, train_prediction, learning_rate], feed_dict=feed_dict)\n mean_loss += l\n if step % summary_frequency == 0:\n if step > 0:\n mean_loss = mean_loss / summary_frequency\n # The mean loss is an estimate of the loss over the last few batches.\n print 'Average loss at step', step, ':', mean_loss, 'learning rate:', lr\n mean_loss = 0\n labels = np.concatenate(list(batches)[1:])\n print 'Minibatch perplexity: %.2f' % float(\n np.exp(logprob(predictions, labels)))\n if step % (summary_frequency * 10) == 0:\n # Generate some samples.\n print '=' * 80\n for _ in xrange(5):\n feed = sample(random_distribution())\n sentence = characters(feed)[0]\n reset_sample_state.run()\n for _ in xrange(79):\n prediction = sample_prediction.eval({sample_input: feed})\n feed = sample(prediction)\n sentence += characters(feed)[0]\n print sentence\n print '=' * 80\n # Measure validation set perplexity.\n reset_sample_state.run()\n valid_logprob = 0\n for _ in xrange(valid_size):\n b = valid_batches.next()\n predictions = sample_prediction.eval({sample_input: b[0]})\n valid_logprob = valid_logprob + logprob(predictions, b[1])\n print 'Validation set perplexity: %.2f' % float(np.exp(\n valid_logprob / valid_size))",
- "language": "python",
- "outputs": [
+ "output_extras": [
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "Initialized\nAverage loss at step 0 : 3.29904174805 learning rate: 10.0\nMinibatch perplexity: 27.09\n================================================================================\nsrk dwmrnuldtbbgg tapootidtu xsciu sgokeguw hi ieicjq lq piaxhazvc s fht wjcvdlh\nlhrvallvbeqqquc dxd y siqvnle bzlyw nr rwhkalezo siie o deb e lpdg storq u nx o\nmeieu nantiouie gdys qiuotblci loc hbiznauiccb cqzed acw l tsm adqxplku gn oaxet\nunvaouc oxchywdsjntdh zpklaejvxitsokeerloemee htphisb th eaeqseibumh aeeyj j orw\nogmnictpycb whtup otnilnesxaedtekiosqet liwqarysmt arj flioiibtqekycbrrgoysj\n================================================================================\nValidation set perplexity: 19.99\nAverage loss at step 100 : 2.59553678274 learning rate: 10.0\nMinibatch perplexity: 9.57\nValidation set perplexity: 10.60\nAverage loss at step 200 : 2.24747137785 learning rate: 10.0\nMinibatch perplexity: 7.68\nValidation set perplexity: 8.84\nAverage loss at step 300 : 2.09438110709 learning rate: 10.0\nMinibatch perplexity: 7.41\nValidation set perplexity: 8.13\nAverage loss at step 400 : 1.99440989017 learning rate: 10.0\nMinibatch perplexity: 6.46\nValidation set perplexity: 7.58\nAverage loss at step 500 : 1.9320810616 learning rate: 10.0\nMinibatch perplexity: 6.30\nValidation set perplexity: 6.88\nAverage loss at step 600 : 1.90935629249 learning rate: 10.0\nMinibatch perplexity: 7.21\nValidation set perplexity: 6.91\nAverage loss at step 700 : 1.85583009005 learning rate: 10.0\nMinibatch perplexity: 6.13\nValidation set perplexity: 6.60\nAverage loss at step 800 : 1.82152368546 learning rate: 10.0\nMinibatch perplexity: 6.01\nValidation set perplexity: 6.37\nAverage loss at step 900 : 1.83169809818 learning rate: 10.0\nMinibatch perplexity: 7.20\nValidation set perplexity: 6.23\nAverage loss at step 1000 : 1.82217029214 learning rate: 10.0\nMinibatch perplexity: 6.73\n================================================================================\nle action b of the tert sy ofter selvorang previgned stischdy yocal chary the co\nle relganis networks partucy cetinning wilnchan sics rumeding a fulch laks oftes\nhian andoris ret the ecause bistory l pidect one eight five lack du that the ses\naiv dromery buskocy becomer worils resism disele retery exterrationn of hide in \nmer miter y sught esfectur of the upission vain is werms is vul ugher compted by\n================================================================================\nValidation set perplexity: 6.07\nAverage loss at step 1100 : 1.77301145077 learning rate: 10.0\nMinibatch perplexity: 6.03\nValidation set perplexity: 5.89\nAverage loss at step 1200 : 1.75306463003 learning rate: 10.0\nMinibatch perplexity: 6.50\nValidation set perplexity: 5.61\nAverage loss at step 1300 : 1.72937195778 learning rate: 10.0\nMinibatch perplexity: 5.00\nValidation set perplexity: 5.60\nAverage loss at step 1400 : 1.74773373723 learning rate: 10.0\nMinibatch perplexity: 6.48\nValidation set perplexity: 5.66\nAverage loss at step 1500 : 1.7368799901 learning rate: 10.0\nMinibatch perplexity: 5.22\nValidation set perplexity: 5.44\nAverage loss at step 1600 : 1.74528762937 learning rate: 10.0\nMinibatch perplexity: 5.85\nValidation set perplexity: 5.33\nAverage loss at step 1700 : 1.70881183743 learning rate: 10.0\nMinibatch perplexity: 5.33\nValidation set perplexity: 5.56\nAverage loss at step 1800 : 1.67776108027 learning rate: 10.0\nMinibatch perplexity: 5.33\nValidation set perplexity: 5.29\nAverage loss at step 1900 : 1.64935536742 
learning rate: 10.0\nMinibatch perplexity: 5.29\nValidation set perplexity: 5.15\nAverage loss at step"
+ "item_id": 41
},
{
- "output_type": "stream",
- "stream": "stdout",
- "text": " 2000 : 1.69528644681 learning rate: 10.0\nMinibatch perplexity: 5.13\n================================================================================\nvers soqually have one five landwing to docial page kagan lower with ther batern\nctor son alfortmandd tethre k skin the known purated to prooust caraying the fit\nje in beverb is the sournction bainedy wesce tu sture artualle lines digra forme\nm rousively haldio ourso ond anvary was for the seven solies hild buil s to te\nzall for is it is one nine eight eight one neval to the kime typer oene where he\n================================================================================\nValidation set perplexity: 5.25\nAverage loss at step 2100 : 1.68808053017 learning rate: 10.0\nMinibatch perplexity: 5.17\nValidation set perplexity: 5.01\nAverage loss at step 2200 : 1.68322490931 learning rate: 10.0\nMinibatch perplexity: 5.09\nValidation set perplexity: 5.15\nAverage loss at step 2300 : 1.64465074301 learning rate: 10.0\nMinibatch perplexity: 5.51\nValidation set perplexity: 5.00\nAverage loss at step 2400 : 1.66408578038 learning rate: 10.0\nMinibatch perplexity: 5.86\nValidation set perplexity: 4.80\nAverage loss at step 2500 : 1.68515402555 learning rate: 10.0\nMinibatch perplexity: 5.75\nValidation set perplexity: 4.82\nAverage loss at step 2600 : 1.65405208349 learning rate: 10.0\nMinibatch perplexity: 5.38\nValidation set perplexity: 4.85\nAverage loss at step 2700 : 1.65706222177 learning rate: 10.0\nMinibatch perplexity: 5.46\nValidation set perplexity: 4.78\nAverage loss at step 2800 : 1.65204829812 learning rate: 10.0\nMinibatch perplexity: 5.06\nValidation set perplexity: 4.64\nAverage loss at step 2900 : 1.65107253551 learning rate: 10.0\nMinibatch perplexity: 5.00\nValidation set perplexity: 4.61\nAverage loss at step 3000 : 1.6495274055 learning rate: 10.0\nMinibatch perplexity: 4.53\n================================================================================\nject covered in belo one six six to finsh that all di rozial sime it a the lapse\nble which the pullic bocades record r to sile dric two one four nine seven six f\n originally ame the playa ishaps the stotchational in a p dstambly name which as\nore volum to bay riwer foreal in nuily operety can and auscham frooripm however \nkan traogey was lacous revision the mott coupofiteditey the trando insended frop\n================================================================================\nValidation set perplexity: 4.76\nAverage loss at step 3100 : 1.63705502152 learning rate: 10.0\nMinibatch perplexity: 5.50\nValidation set perplexity: 4.76\nAverage loss at step 3200 : 1.64740695596 learning rate: 10.0\nMinibatch perplexity: 4.84\nValidation set perplexity: 4.67\nAverage loss at step 3300 : 1.64711504817 learning rate: 10.0\nMinibatch perplexity: 5.39\nValidation set perplexity: 4.57\nAverage loss at step 3400 : 1.67113256454 learning rate: 10.0\nMinibatch perplexity: 5.56\nValidation set perplexity: 4.71\nAverage loss at step 3500 : 1.65637169957 learning rate: 10.0\nMinibatch perplexity: 5.03\nValidation set perplexity: 4.80\nAverage loss at step 3600 : 1.66601825476 learning rate: 10.0\nMinibatch perplexity: 4.63\nValidation set perplexity: 4.52\nAverage loss at step 3700 : 1.65021387935 learning rate: 10.0\nMinibatch perplexity: 5.50\nValidation set perplexity: 4.56\nAverage loss at step 3800 : 1.64481814981 learning rate: 10.0\nMinibatch perplexity: 4.60\nValidation set perplexity: 4.54\nAverage loss at step 3900 : 1.642069453 learning rate: 
10.0\nMinibatch perplexity: 4.91\nValidation set perplexity: 4.54\nAverage loss at step 4000 : 1.65179730773 learning rate: 10.0\nMinibatch perplexity: 4.77\n================================================================================\nk s rasbonish roctes the nignese at heacle was sito of beho anarchys and with ro\njusar two sue wletaus of chistical in causations d ow trancic bruthing ha laters\nde and speacy pulted yoftret worksy zeatlating to eight d had to ie bue seven si"
+ "item_id": 80
},
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "\ns fiction of the feelly constive suq flanch earlied curauking bjoventation agent\nquen s playing it calana our seopity also atbellisionaly comexing the revideve i\n================================================================================\nValidation set perplexity: 4.58\nAverage loss at step 4100 : 1.63794238806 learning rate: 10.0\nMinibatch perplexity: 5.47\nValidation set perplexity: 4.79\nAverage loss at step 4200 : 1.63822438836 learning rate: 10.0\nMinibatch perplexity: 5.30\nValidation set perplexity: 4.54\nAverage loss at step 4300 : 1.61844664574 learning rate: 10.0\nMinibatch perplexity: 4.69\nValidation set perplexity: 4.54\nAverage loss at step 4400 : 1.61255454302 learning rate: 10.0\nMinibatch perplexity: 4.67\nValidation set perplexity: 4.54\nAverage loss at step 4500 : 1.61543365479 learning rate: 10.0\nMinibatch perplexity: 4.83\nValidation set perplexity: 4.69\nAverage loss at step 4600 : 1.61607327104 learning rate: 10.0\nMinibatch perplexity: 5.18\nValidation set perplexity: 4.64\nAverage loss at step 4700 : 1.62757282495 learning rate: 10.0\nMinibatch perplexity: 4.24\nValidation set perplexity: 4.66\nAverage loss at step 4800 : 1.63222063541 learning rate: 10.0\nMinibatch perplexity: 5.30\nValidation set perplexity: 4.53\nAverage loss at step 4900 : 1.63678096652 learning rate: 10.0\nMinibatch perplexity: 5.43\nValidation set perplexity: 4.64\nAverage loss at step 5000 : 1.610340662 learning rate: 1.0\nMinibatch perplexity: 5.10\n================================================================================\nin b one onarbs revieds the kimiluge that fondhtic fnoto cre one nine zero zero \n of is it of marking panzia t had wap ironicaghni relly deah the omber b h menba\nong messified it his the likdings ara subpore the a fames distaled self this int\ny advante authors the end languarle meit common tacing bevolitione and eight one\nzes that materly difild inllaring the fusts not panition assertian causecist bas\n================================================================================\nValidation set perplexity: 4.69\nAverage loss at step 5100 : 1.60593637228 learning rate: 1.0\nMinibatch perplexity: 4.69\nValidation set perplexity: 4.47\nAverage loss at step 5200 : 1.58993269444 learning rate: 1.0\nMinibatch perplexity: 4.65\nValidation set perplexity: 4.39\nAverage loss at step 5300 : 1.57930587292 learning rate: 1.0\nMinibatch perplexity: 5.11\nValidation set perplexity: 4.39\nAverage loss at step 5400 : 1.58022856832 learning rate: 1.0\nMinibatch perplexity: 5.19\nValidation set perplexity: 4.37\nAverage loss at step 5500 : 1.56654450059 learning rate: 1.0\nMinibatch perplexity: 4.69\nValidation set perplexity: 4.33\nAverage loss at step 5600 : 1.58013380885 learning rate: 1.0\nMinibatch perplexity: 5.13\nValidation set perplexity: 4.35\nAverage loss at step 5700 : 1.56974959254 learning rate: 1.0\nMinibatch perplexity: 5.00\nValidation set perplexity: 4.34\nAverage loss at step 5800 : 1.5839582932 learning rate: 1.0\nMinibatch perplexity: 4.88\nValidation set perplexity: 4.31\nAverage loss at step 5900 : 1.57129439116 learning rate: 1.0\nMinibatch perplexity: 4.66\nValidation set perplexity: 4.32\nAverage loss at step 6000 : 1.55144061089 learning rate: 1.0\nMinibatch perplexity: 4.55\n================================================================================\nutic clositical poopy stribe addi nixe one nine one zero zero eight zero b ha ex\nzerns b one internequiption of the secordy way anti proble akoping have fictiona\nphare 
united from has poporarly cities book ins sweden emperor a sass in origina\nquulk destrebinist and zeilazar and on low and by in science over country weilti\nx are holivia work missincis ons in the gages to starsle histon one icelanctrotu\n================================================================================\nValidation set perplexity: 4.30\nAverage loss at step 6100 : 1.56450940847 learning rate: 1.0\nMinibatch perplexity: 4.77\nValidation set perplexity: 4.27"
+ "item_id": 126
},
{
- "output_type": "stream",
- "stream": "stdout",
- "text": "\nAverage loss at step 6200 : 1.53433164835 learning rate: 1.0\nMinibatch perplexity: 4.77\nValidation set perplexity: 4.27\nAverage loss at step 6300 : 1.54773445129 learning rate: 1.0\nMinibatch perplexity: 4.76\nValidation set perplexity: 4.25\nAverage loss at step 6400 : 1.54021131516 learning rate: 1.0\nMinibatch perplexity: 4.56\nValidation set perplexity: 4.24\nAverage loss at step 6500 : 1.56153374553 learning rate: 1.0\nMinibatch perplexity: 5.43\nValidation set perplexity: 4.27\nAverage loss at step 6600 : 1.59556478739 learning rate: 1.0\nMinibatch perplexity: 4.92\nValidation set perplexity: 4.28\nAverage loss at step 6700 : 1.58076951623 learning rate: 1.0\nMinibatch perplexity: 4.77\nValidation set perplexity: 4.30\nAverage loss at step 6800 : 1.6070714438 learning rate: 1.0\nMinibatch perplexity: 4.98\nValidation set perplexity: 4.28\nAverage loss at step 6900 : 1.58413293839 learning rate: 1.0\nMinibatch perplexity: 4.61\nValidation set perplexity: 4.29\nAverage loss at step 7000 : 1.57905534983 learning rate: 1.0\nMinibatch perplexity: 5.08\n================================================================================\njague are officiencinels ored by film voon higherise haik one nine on the iffirc\noshe provision that manned treatists on smalle bodariturmeristing the girto in s\nkis would softwenn mustapultmine truativersakys bersyim by s of confound esc bub\nry of the using one four six blain ira mannom marencies g with fextificallise re\n one son vit even an conderouss to person romer i a lebapter at obiding are iuse\n================================================================================\nValidation set perplexity: 4.25\n"
+ "item_id": 144
}
]
},
- {
- "metadata": {
- "id": "pl4vtmFfa5nn",
- "colab_type": "text"
+ "cellView": "both",
+ "executionInfo": {
+ "elapsed": 199909,
+ "status": "ok",
+ "timestamp": 1445965877333,
+ "user": {
+ "color": "#1FA15D",
+ "displayName": "Vincent Vanhoucke",
+ "isAnonymous": false,
+ "isMe": true,
+ "permissionId": "05076109866853157986",
+ "photoUrl": "//lh6.googleusercontent.com/-cCJa7dTDcgQ/AAAAAAAAAAI/AAAAAAAACgw/r2EZ_8oYer4/s50-c-k-no/photo.jpg",
+ "sessionId": "6f6f07b359200c46",
+ "userId": "102167687554210253930"
},
- "cell_type": "markdown",
- "source": "---\nProblem 1\n---------\n\nYou might have noticed that the definition of the LSTM cell involves 4 matrix multiplications with the input, and 4 matrix multiplications with the output. Simplify the expression by using a single matrix multiply for each, and variables that are 4 times larger.\n\n---"
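One possible direction for Problem 1 (a sketch in the same TF API style as the cells above, not the notebook's solution): stack the four input-to-gate matrices into a single [vocabulary_size, 4 * num_nodes] variable, do the same for the recurrent matrices and the biases, perform one matmul per operand, and split the result into the four pre-activations.

# Hypothetical consolidated parameters; reuses vocabulary_size and num_nodes from above.
x_mat = tf.Variable(tf.truncated_normal([vocabulary_size, 4 * num_nodes], -0.1, 0.1))
o_mat = tf.Variable(tf.truncated_normal([num_nodes, 4 * num_nodes], -0.1, 0.1))
bias = tf.Variable(tf.zeros([1, 4 * num_nodes]))

def lstm_cell(i, o, state):
    # One matmul with the input and one with the previous output, then split
    # into input gate, forget gate, cell update, and output gate.
    all_gates = tf.matmul(i, x_mat) + tf.matmul(o, o_mat) + bias
    input_gate, forget_gate, update, output_gate = tf.split(1, 4, all_gates)
    state = tf.sigmoid(forget_gate) * state + tf.sigmoid(input_gate) * tf.tanh(update)
    return tf.sigmoid(output_gate) * tf.tanh(state), state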
+ "user_tz": 420
},
+ "outputId": "5e868466-2532-4545-ce35-b403cf5d9de6"
+ },
+ "source": [
+ "num_steps = 7001\n",
+ "summary_frequency = 100\n",
+ "\n",
+ "with tf.Session(graph=graph) as session:\n",
+ " tf.initialize_all_variables().run()\n",
+ " print 'Initialized'\n",
+ " mean_loss = 0\n",
+ " for step in xrange(num_steps):\n",
+ " batches = train_batches.next()\n",
+ " feed_dict = dict()\n",
+ " for i in xrange(num_unrollings + 1):\n",
+ " feed_dict[train_data[i]] = batches[i]\n",
+ " _, l, predictions, lr = session.run(\n",
+ " [optimizer, loss, train_prediction, learning_rate], feed_dict=feed_dict)\n",
+ " mean_loss += l\n",
+ " if step % summary_frequency == 0:\n",
+ " if step > 0:\n",
+ " mean_loss = mean_loss / summary_frequency\n",
+ " # The mean loss is an estimate of the loss over the last few batches.\n",
+ " print 'Average loss at step', step, ':', mean_loss, 'learning rate:', lr\n",
+ " mean_loss = 0\n",
+ " labels = np.concatenate(list(batches)[1:])\n",
+ " print 'Minibatch perplexity: %.2f' % float(\n",
+ " np.exp(logprob(predictions, labels)))\n",
+ " if step % (summary_frequency * 10) == 0:\n",
+ " # Generate some samples.\n",
+ " print '=' * 80\n",
+ " for _ in xrange(5):\n",
+ " feed = sample(random_distribution())\n",
+ " sentence = characters(feed)[0]\n",
+ " reset_sample_state.run()\n",
+ " for _ in xrange(79):\n",
+ " prediction = sample_prediction.eval({sample_input: feed})\n",
+ " feed = sample(prediction)\n",
+ " sentence += characters(feed)[0]\n",
+ " print sentence\n",
+ " print '=' * 80\n",
+ " # Measure validation set perplexity.\n",
+ " reset_sample_state.run()\n",
+ " valid_logprob = 0\n",
+ " for _ in xrange(valid_size):\n",
+ " b = valid_batches.next()\n",
+ " predictions = sample_prediction.eval({sample_input: b[0]})\n",
+ " valid_logprob = valid_logprob + logprob(predictions, b[1])\n",
+ " print 'Validation set perplexity: %.2f' % float(np.exp(\n",
+ " valid_logprob / valid_size))"
+ ],
+ "outputs": [
{
- "metadata": {
- "id": "4eErTCTybtph",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 2\n---------\n\nWe want to train a LSTM over bigrams, that is pairs of consecutive characters like 'ab' instead of single characters like 'a'. Since the number of possible bigrams is large, feeding them directly to the LSTM using 1-hot encodings will lead to a very sparse representation that is very wasteful computationally.\n\na- Introduce an embedding lookup on the inputs, and feed the embeddings to the LSTM cell instead of the inputs themselves.\n\nb- Write a bigram-based LSTM, modeled on the character LSTM above.\n\nc- Introduce Dropout. For best practices on how to use Dropout in LSTMs, refer to this [article](http://arxiv.org/abs/1409.2329).\n\n---"
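For part a of Problem 2, the usual pattern (sketched here with the notebook's TF API, not its solution) is to feed integer bigram ids instead of one-hot vectors and look up a trainable embedding per unrolling step; the names bigram_vocabulary_size and embedding_size below are hypothetical.

bigram_vocabulary_size = vocabulary_size * vocabulary_size  # 27 * 27 possible bigrams
embedding_size = 64                                         # hypothetical choice

embeddings = tf.Variable(
    tf.random_uniform([bigram_vocabulary_size, embedding_size], -1.0, 1.0))
bigram_inputs = [tf.placeholder(tf.int32, shape=[batch_size])
                 for _ in xrange(num_unrollings)]
embedded_inputs = [tf.nn.embedding_lookup(embeddings, ids) for ids in bigram_inputs]
# embedded_inputs[t] has shape [batch_size, embedding_size] and replaces the one-hot
# rows fed to lstm_cell. Per the cited article, dropout (tf.nn.dropout) goes on these
# embeddings and on the LSTM outputs, not on the recurrent state connections.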
+ "output_type": "stream",
+ "text": [
+ "Initialized\n",
+ "Average loss at step 0 : 3.29904174805 learning rate: 10.0\n",
+ "Minibatch perplexity: 27.09\n",
+ "================================================================================\n",
+ "srk dwmrnuldtbbgg tapootidtu xsciu sgokeguw hi ieicjq lq piaxhazvc s fht wjcvdlh\n",
+ "lhrvallvbeqqquc dxd y siqvnle bzlyw nr rwhkalezo siie o deb e lpdg storq u nx o\n",
+ "meieu nantiouie gdys qiuotblci loc hbiznauiccb cqzed acw l tsm adqxplku gn oaxet\n",
+ "unvaouc oxchywdsjntdh zpklaejvxitsokeerloemee htphisb th eaeqseibumh aeeyj j orw\n",
+ "ogmnictpycb whtup otnilnesxaedtekiosqet liwqarysmt arj flioiibtqekycbrrgoysj\n",
+ "================================================================================\n",
+ "Validation set perplexity: 19.99\n",
+ "Average loss at step 100 : 2.59553678274 learning rate: 10.0\n",
+ "Minibatch perplexity: 9.57\n",
+ "Validation set perplexity: 10.60\n",
+ "Average loss at step 200 : 2.24747137785 learning rate: 10.0\n",
+ "Minibatch perplexity: 7.68\n",
+ "Validation set perplexity: 8.84\n",
+ "Average loss at step 300 : 2.09438110709 learning rate: 10.0\n",
+ "Minibatch perplexity: 7.41\n",
+ "Validation set perplexity: 8.13\n",
+ "Average loss at step 400 : 1.99440989017 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.46\n",
+ "Validation set perplexity: 7.58\n",
+ "Average loss at step 500 : 1.9320810616 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.30\n",
+ "Validation set perplexity: 6.88\n",
+ "Average loss at step 600 : 1.90935629249 learning rate: 10.0\n",
+ "Minibatch perplexity: 7.21\n",
+ "Validation set perplexity: 6.91\n",
+ "Average loss at step 700 : 1.85583009005 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.13\n",
+ "Validation set perplexity: 6.60\n",
+ "Average loss at step 800 : 1.82152368546 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.01\n",
+ "Validation set perplexity: 6.37\n",
+ "Average loss at step 900 : 1.83169809818 learning rate: 10.0\n",
+ "Minibatch perplexity: 7.20\n",
+ "Validation set perplexity: 6.23\n",
+ "Average loss at step 1000 : 1.82217029214 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.73\n",
+ "================================================================================\n",
+ "le action b of the tert sy ofter selvorang previgned stischdy yocal chary the co\n",
+ "le relganis networks partucy cetinning wilnchan sics rumeding a fulch laks oftes\n",
+ "hian andoris ret the ecause bistory l pidect one eight five lack du that the ses\n",
+ "aiv dromery buskocy becomer worils resism disele retery exterrationn of hide in \n",
+ "mer miter y sught esfectur of the upission vain is werms is vul ugher compted by\n",
+ "================================================================================\n",
+ "Validation set perplexity: 6.07\n",
+ "Average loss at step 1100 : 1.77301145077 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.03\n",
+ "Validation set perplexity: 5.89\n",
+ "Average loss at step 1200 : 1.75306463003 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.50\n",
+ "Validation set perplexity: 5.61\n",
+ "Average loss at step 1300 : 1.72937195778 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.00\n",
+ "Validation set perplexity: 5.60\n",
+ "Average loss at step 1400 : 1.74773373723 learning rate: 10.0\n",
+ "Minibatch perplexity: 6.48\n",
+ "Validation set perplexity: 5.66\n",
+ "Average loss at step 1500 : 1.7368799901 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.22\n",
+ "Validation set perplexity: 5.44\n",
+ "Average loss at step 1600 : 1.74528762937 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.85\n",
+ "Validation set perplexity: 5.33\n",
+ "Average loss at step 1700 : 1.70881183743 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.33\n",
+ "Validation set perplexity: 5.56\n",
+ "Average loss at step 1800 : 1.67776108027 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.33\n",
+ "Validation set perplexity: 5.29\n",
+ "Average loss at step 1900 : 1.64935536742 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.29\n",
+ "Validation set perplexity: 5.15\n",
+ "Average loss at step"
+ ],
+ "name": "stdout"
},
{
- "metadata": {
- "id": "Y5tapX3kpcqZ",
- "colab_type": "text"
- },
- "cell_type": "markdown",
- "source": "---\nProblem 3\n---------\n\n(difficult!)\n\nWrite a sequence-to-sequence LSTM which mirrors all the words in a sentence. For example, if your input is:\n\n the quick brown fox\n \nthe model should attempt to output:\n\n eht kciuq nworb xof\n \nRefer to the lecture on how to put together a sequence-to-sequence model, as well as [this article](http://arxiv.org/abs/1409.3215) for best practices.\n\n---"
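A small, well-defined piece of Problem 3 is producing the mirrored target string for a given input; the sequence-to-sequence model itself follows the lecture and the cited article. A sketch of the target construction (Python 2, like the notebook):

def mirror_words(sentence):
    """Reverse each word in place: 'the quick brown fox' -> 'eht kciuq nworb xof'."""
    return ' '.join(word[::-1] for word in sentence.split(' '))

print mirror_words('the quick brown fox')   # eht kciuq nworb xof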
+ "output_type": "stream",
+ "text": [
+ " 2000 : 1.69528644681 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.13\n",
+ "================================================================================\n",
+ "vers soqually have one five landwing to docial page kagan lower with ther batern\n",
+ "ctor son alfortmandd tethre k skin the known purated to prooust caraying the fit\n",
+ "je in beverb is the sournction bainedy wesce tu sture artualle lines digra forme\n",
+ "m rousively haldio ourso ond anvary was for the seven solies hild buil s to te\n",
+ "zall for is it is one nine eight eight one neval to the kime typer oene where he\n",
+ "================================================================================\n",
+ "Validation set perplexity: 5.25\n",
+ "Average loss at step 2100 : 1.68808053017 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.17\n",
+ "Validation set perplexity: 5.01\n",
+ "Average loss at step 2200 : 1.68322490931 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.09\n",
+ "Validation set perplexity: 5.15\n",
+ "Average loss at step 2300 : 1.64465074301 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.51\n",
+ "Validation set perplexity: 5.00\n",
+ "Average loss at step 2400 : 1.66408578038 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.86\n",
+ "Validation set perplexity: 4.80\n",
+ "Average loss at step 2500 : 1.68515402555 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.75\n",
+ "Validation set perplexity: 4.82\n",
+ "Average loss at step 2600 : 1.65405208349 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.38\n",
+ "Validation set perplexity: 4.85\n",
+ "Average loss at step 2700 : 1.65706222177 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.46\n",
+ "Validation set perplexity: 4.78\n",
+ "Average loss at step 2800 : 1.65204829812 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.06\n",
+ "Validation set perplexity: 4.64\n",
+ "Average loss at step 2900 : 1.65107253551 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.00\n",
+ "Validation set perplexity: 4.61\n",
+ "Average loss at step 3000 : 1.6495274055 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.53\n",
+ "================================================================================\n",
+ "ject covered in belo one six six to finsh that all di rozial sime it a the lapse\n",
+ "ble which the pullic bocades record r to sile dric two one four nine seven six f\n",
+ " originally ame the playa ishaps the stotchational in a p dstambly name which as\n",
+ "ore volum to bay riwer foreal in nuily operety can and auscham frooripm however \n",
+ "kan traogey was lacous revision the mott coupofiteditey the trando insended frop\n",
+ "================================================================================\n",
+ "Validation set perplexity: 4.76\n",
+ "Average loss at step 3100 : 1.63705502152 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.50\n",
+ "Validation set perplexity: 4.76\n",
+ "Average loss at step 3200 : 1.64740695596 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.84\n",
+ "Validation set perplexity: 4.67\n",
+ "Average loss at step 3300 : 1.64711504817 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.39\n",
+ "Validation set perplexity: 4.57\n",
+ "Average loss at step 3400 : 1.67113256454 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.56\n",
+ "Validation set perplexity: 4.71\n",
+ "Average loss at step 3500 : 1.65637169957 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.03\n",
+ "Validation set perplexity: 4.80\n",
+ "Average loss at step 3600 : 1.66601825476 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.63\n",
+ "Validation set perplexity: 4.52\n",
+ "Average loss at step 3700 : 1.65021387935 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.50\n",
+ "Validation set perplexity: 4.56\n",
+ "Average loss at step 3800 : 1.64481814981 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.60\n",
+ "Validation set perplexity: 4.54\n",
+ "Average loss at step 3900 : 1.642069453 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.91\n",
+ "Validation set perplexity: 4.54\n",
+ "Average loss at step 4000 : 1.65179730773 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.77\n",
+ "================================================================================\n",
+ "k s rasbonish roctes the nignese at heacle was sito of beho anarchys and with ro\n",
+ "jusar two sue wletaus of chistical in causations d ow trancic bruthing ha laters\n",
+ "de and speacy pulted yoftret worksy zeatlating to eight d had to ie bue seven si"
+ ],
+ "name": "stdout"
+ },
+ {
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "s fiction of the feelly constive suq flanch earlied curauking bjoventation agent\n",
+ "quen s playing it calana our seopity also atbellisionaly comexing the revideve i\n",
+ "================================================================================\n",
+ "Validation set perplexity: 4.58\n",
+ "Average loss at step 4100 : 1.63794238806 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.47\n",
+ "Validation set perplexity: 4.79\n",
+ "Average loss at step 4200 : 1.63822438836 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.30\n",
+ "Validation set perplexity: 4.54\n",
+ "Average loss at step 4300 : 1.61844664574 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.69\n",
+ "Validation set perplexity: 4.54\n",
+ "Average loss at step 4400 : 1.61255454302 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.67\n",
+ "Validation set perplexity: 4.54\n",
+ "Average loss at step 4500 : 1.61543365479 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.83\n",
+ "Validation set perplexity: 4.69\n",
+ "Average loss at step 4600 : 1.61607327104 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.18\n",
+ "Validation set perplexity: 4.64\n",
+ "Average loss at step 4700 : 1.62757282495 learning rate: 10.0\n",
+ "Minibatch perplexity: 4.24\n",
+ "Validation set perplexity: 4.66\n",
+ "Average loss at step 4800 : 1.63222063541 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.30\n",
+ "Validation set perplexity: 4.53\n",
+ "Average loss at step 4900 : 1.63678096652 learning rate: 10.0\n",
+ "Minibatch perplexity: 5.43\n",
+ "Validation set perplexity: 4.64\n",
+ "Average loss at step 5000 : 1.610340662 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.10\n",
+ "================================================================================\n",
+ "in b one onarbs revieds the kimiluge that fondhtic fnoto cre one nine zero zero \n",
+ " of is it of marking panzia t had wap ironicaghni relly deah the omber b h menba\n",
+ "ong messified it his the likdings ara subpore the a fames distaled self this int\n",
+ "y advante authors the end languarle meit common tacing bevolitione and eight one\n",
+ "zes that materly difild inllaring the fusts not panition assertian causecist bas\n",
+ "================================================================================\n",
+ "Validation set perplexity: 4.69\n",
+ "Average loss at step 5100 : 1.60593637228 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.69\n",
+ "Validation set perplexity: 4.47\n",
+ "Average loss at step 5200 : 1.58993269444 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.65\n",
+ "Validation set perplexity: 4.39\n",
+ "Average loss at step 5300 : 1.57930587292 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.11\n",
+ "Validation set perplexity: 4.39\n",
+ "Average loss at step 5400 : 1.58022856832 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.19\n",
+ "Validation set perplexity: 4.37\n",
+ "Average loss at step 5500 : 1.56654450059 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.69\n",
+ "Validation set perplexity: 4.33\n",
+ "Average loss at step 5600 : 1.58013380885 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.13\n",
+ "Validation set perplexity: 4.35\n",
+ "Average loss at step 5700 : 1.56974959254 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.00\n",
+ "Validation set perplexity: 4.34\n",
+ "Average loss at step 5800 : 1.5839582932 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.88\n",
+ "Validation set perplexity: 4.31\n",
+ "Average loss at step 5900 : 1.57129439116 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.66\n",
+ "Validation set perplexity: 4.32\n",
+ "Average loss at step 6000 : 1.55144061089 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.55\n",
+ "================================================================================\n",
+ "utic clositical poopy stribe addi nixe one nine one zero zero eight zero b ha ex\n",
+ "zerns b one internequiption of the secordy way anti proble akoping have fictiona\n",
+ "phare united from has poporarly cities book ins sweden emperor a sass in origina\n",
+ "quulk destrebinist and zeilazar and on low and by in science over country weilti\n",
+ "x are holivia work missincis ons in the gages to starsle histon one icelanctrotu\n",
+ "================================================================================\n",
+ "Validation set perplexity: 4.30\n",
+ "Average loss at step 6100 : 1.56450940847 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.77\n",
+ "Validation set perplexity: 4.27"
+ ],
+ "name": "stdout"
+ },
+ {
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "Average loss at step 6200 : 1.53433164835 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.77\n",
+ "Validation set perplexity: 4.27\n",
+ "Average loss at step 6300 : 1.54773445129 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.76\n",
+ "Validation set perplexity: 4.25\n",
+ "Average loss at step 6400 : 1.54021131516 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.56\n",
+ "Validation set perplexity: 4.24\n",
+ "Average loss at step 6500 : 1.56153374553 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.43\n",
+ "Validation set perplexity: 4.27\n",
+ "Average loss at step 6600 : 1.59556478739 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.92\n",
+ "Validation set perplexity: 4.28\n",
+ "Average loss at step 6700 : 1.58076951623 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.77\n",
+ "Validation set perplexity: 4.30\n",
+ "Average loss at step 6800 : 1.6070714438 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.98\n",
+ "Validation set perplexity: 4.28\n",
+ "Average loss at step 6900 : 1.58413293839 learning rate: 1.0\n",
+ "Minibatch perplexity: 4.61\n",
+ "Validation set perplexity: 4.29\n",
+ "Average loss at step 7000 : 1.57905534983 learning rate: 1.0\n",
+ "Minibatch perplexity: 5.08\n",
+ "================================================================================\n",
+ "jague are officiencinels ored by film voon higherise haik one nine on the iffirc\n",
+ "oshe provision that manned treatists on smalle bodariturmeristing the girto in s\n",
+ "kis would softwenn mustapultmine truativersakys bersyim by s of confound esc bub\n",
+ "ry of the using one four six blain ira mannom marencies g with fextificallise re\n",
+ " one son vit even an conderouss to person romer i a lebapter at obiding are iuse\n",
+ "================================================================================\n",
+ "Validation set perplexity: 4.25\n"
+ ],
+ "name": "stdout"
}
+ ],
+ "execution_count": 0
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "pl4vtmFfa5nn",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 1\n",
+ "---------\n",
+ "\n",
+ "You might have noticed that the definition of the LSTM cell involves 4 matrix multiplications with the input, and 4 matrix multiplications with the output. Simplify the expression by using a single matrix multiply for each, and variables that are 4 times larger.\n",
+ "\n",
+ "---"
+ ]
+ },
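[Editor's note: for readers working through Problem 1, here is a minimal NumPy sketch of the fused-multiply idea. It is not the notebook's TensorFlow code; the names (`num_nodes`, `x_weights`, `m_weights`) and sizes below are chosen purely for illustration. The point is that the four per-gate products collapse into one product against a weight matrix four times as wide, whose result is then split back into the four gate pre-activations.]

```python
# Illustrative sketch only (plain NumPy, not the assignment's TensorFlow graph).
import numpy as np

num_nodes, vocabulary_size, batch_size = 64, 27, 8

# One fused weight matrix per stream instead of four separate ones.
x_weights = np.random.randn(vocabulary_size, 4 * num_nodes) * 0.1  # input -> all gates
m_weights = np.random.randn(num_nodes, 4 * num_nodes) * 0.1        # previous output -> all gates
biases = np.zeros(4 * num_nodes)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(i, o, state):
    """i: one-hot inputs [batch, vocab]; o: previous output; state: previous cell state."""
    gates = i @ x_weights + o @ m_weights + biases          # two multiplies per step in total
    in_gate, forget_gate, update, out_gate = np.split(gates, 4, axis=1)
    state = sigmoid(forget_gate) * state + sigmoid(in_gate) * np.tanh(update)
    return sigmoid(out_gate) * np.tanh(state), state

i = np.eye(vocabulary_size)[np.random.randint(vocabulary_size, size=batch_size)]
o = np.zeros((batch_size, num_nodes))
state = np.zeros((batch_size, num_nodes))
o, state = lstm_cell(i, o, state)
print(o.shape, state.shape)  # (8, 64) (8, 64)
```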
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "4eErTCTybtph",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 2\n",
+ "---------\n",
+ "\n",
+        "We want to train an LSTM over bigrams, that is, pairs of consecutive characters like 'ab' instead of single characters like 'a'. Since the number of possible bigrams is large, feeding them directly to the LSTM as 1-hot encodings would produce a very sparse representation that is computationally wasteful.\n",
+ "\n",
+ "a- Introduce an embedding lookup on the inputs, and feed the embeddings to the LSTM cell instead of the inputs themselves.\n",
+ "\n",
+ "b- Write a bigram-based LSTM, modeled on the character LSTM above.\n",
+ "\n",
+ "c- Introduce Dropout. For best practices on how to use Dropout in LSTMs, refer to this [article](http://arxiv.org/abs/1409.2329).\n",
+ "\n",
+ "---"
+ ]
+ },
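[Editor's note: a hedged sketch of the bigram-embedding and dropout pieces that Problem 2 asks for, again in plain NumPy rather than the assignment's TensorFlow graph. It assumes a 27-symbol character set ('a'-'z' plus space); every function and name below is illustrative only.]

```python
# Illustrative sketch only: dense embedding lookup for bigram ids plus inverted dropout.
import numpy as np

alphabet = 27                       # assumed: 'a'-'z' plus a catch-all/space symbol
bigram_vocab = alphabet * alphabet  # every pair of consecutive characters
embedding_size = 32
embeddings = np.random.randn(bigram_vocab, embedding_size) * 0.1

def bigram_ids(text):
    """Map consecutive character pairs to integer ids in [0, bigram_vocab)."""
    def char_id(c):
        return ord(c) - ord('a') + 1 if 'a' <= c <= 'z' else 0
    return [char_id(a) * alphabet + char_id(b) for a, b in zip(text, text[1:])]

def embed(ids):
    # Dense lookup instead of multiplying by a huge, mostly-zero one-hot matrix.
    return embeddings[ids]

def dropout(x, keep_prob, training=True):
    # Inverted dropout: scale at train time so nothing needs rescaling at test time.
    if not training:
        return x
    mask = (np.random.rand(*x.shape) < keep_prob) / keep_prob
    return x * mask

ids = bigram_ids("the quick brown fox")
inputs = dropout(embed(ids), keep_prob=0.8)
print(inputs.shape)  # (number of bigrams, embedding_size)
```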
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Y5tapX3kpcqZ",
+ "colab_type": "text"
+ },
+ "source": [
+ "---\n",
+ "Problem 3\n",
+ "---------\n",
+ "\n",
+ "(difficult!)\n",
+ "\n",
+ "Write a sequence-to-sequence LSTM which mirrors all the words in a sentence. For example, if your input is:\n",
+ "\n",
+ " the quick brown fox\n",
+ " \n",
+ "the model should attempt to output:\n",
+ "\n",
+ " eht kciuq nworb xof\n",
+ " \n",
+ "Refer to the lecture on how to put together a sequence-to-sequence model, as well as [this article](http://arxiv.org/abs/1409.3215) for best practices.\n",
+ "\n",
+ "---"
]
}
- ],
- "metadata": {
- "name": "6_lstm.ipynb",
- "colabVersion": "0.3.2",
- "colab_views": {},
- "colab_default_view": {}
- },
- "nbformat": 3,
- "nbformat_minor": 0
+ ]
} \ No newline at end of file