tensorflow/docs_src/deploy/s3.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

# How to run TensorFlow on S3

This document describes how to run TensorFlow on S3 file system.

## S3

We assume that you are familiar with @{$reading_data$reading data}.

To use S3 with TensorFlow, change the file paths you use to read and write
data to an S3 path. For example:

```python
filenames = ["s3://bucketname/path/to/file1.tfrecord",
             "s3://bucketname/path/to/file2.tfrecord"]
dataset = tf.data.TFRecordDataset(filenames)
```

When reading or writing data on S3 with your TensorFlow program, the behavior
could be controlled by various environmental variables:

*   **AWS_REGION**: By default, regional endpoint is used for S3, with region
    controlled by `AWS_REGION`. If `AWS_REGION` is not specified, then
    `us-east-1` is used.
*   **S3_ENDPOINT**: The endpoint could be overridden explicitly with
    `S3_ENDPOINT` specified.
*   **S3_USE_HTTPS**: HTTPS is used to access S3 by default, unless
    `S3_USE_HTTPS=0`.
*   **S3_VERIFY_SSL**: If HTTPS is used, SSL verification could be disabled
    with `S3_VERIFY_SSL=0`.

To read or write objects in a bucket that is no publicly accessible,
AWS credentials must be provided through one of the following methods:

*   Set credentials in the AWS credentials profile file on the local system,
    located at: `~/.aws/credentials` on Linux, macOS, or Unix, or
    `C:\Users\USERNAME\.aws\credentials` on Windows.
*   Set the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment
    variables.
*   If TensorFlow is deployed on an EC2 instance, specify an IAM role and then
    give the EC2 instance access to that role.