1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
|
# How to run TensorFlow on S3
This document describes how to run TensorFlow on S3 file system.
## S3
We assume that you are familiar with @{$reading_data$reading data}.
To use S3 with TensorFlow, change the file paths you use to read and write
data to an S3 path. For example:
```python
filenames = ["s3://bucketname/path/to/file1.tfrecord",
"s3://bucketname/path/to/file2.tfrecord"]
dataset = tf.data.TFRecordDataset(filenames)
```
When reading or writing data on S3 with your TensorFlow program, the behavior
could be controlled by various environmental variables:
* **AWS_REGION**: By default, regional endpoint is used for S3, with region
controlled by `AWS_REGION`. If `AWS_REGION` is not specified, then
`us-east-1` is used.
* **S3_ENDPOINT**: The endpoint could be overridden explicitly with
`S3_ENDPOINT` specified.
* **S3_USE_HTTPS**: HTTPS is used to access S3 by default, unless
`S3_USE_HTTPS=0`.
* **S3_VERIFY_SSL**: If HTTPS is used, SSL verification could be disabled
with `S3_VERIFY_SSL=0`.
To read or write objects in a bucket that is no publicly accessible,
AWS credentials must be provided through one of the following methods:
* Set credentials in the AWS credentials profile file on the local system,
located at: `~/.aws/credentials` on Linux, macOS, or Unix, or
`C:\Users\USERNAME\.aws\credentials` on Windows.
* Set the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment
variables.
* If TensorFlow is deployed on an EC2 instance, specify an IAM role and then
give the EC2 instance access to that role.
|