This special remote type stores file contents in a bucket in Amazon S3
or a similar service.

See [[walkthrough/using_Amazon_S3]] for usage examples.

## initremote parameters

A number of parameters can be passed to `git annex initremote` to configure
the S3 remote.

* `encryption` - Either "none" to disable encryption,
  or a value that can be looked up (using `gpg -k`) to find a gpg encryption
  key that will be given access to the remote. Note that additional gpg
  keys can be given access to a remote by rerunning `initremote` with
  the new key id.

* `datacenter` - Defaults to "US". Other values include "EU",
  "us-west-1", and "ap-southeast-1".

* `storageclass` - Default is "STANDARD". If you have configured git-annex
  to preserve multiple [[copies]], consider setting this to "REDUCED_REDUNDANCY"
  to save money.

* `host` and `port` - Specify these in order to use a different,
  S3-compatible service.
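
As a rough illustration of how the parameters above combine, here is a small
sketch that assembles an `initremote` command line. The helper
`initremote_command` and the remote name `mys3` are hypothetical and exist
only for this example; the sketch assumes each parameter is passed as a
`key=value` word after the remote name and `type=S3`. It only builds and
prints the command, it does not run it.

```python
import shlex


def initremote_command(name, **params):
    """Build the argument list for `git annex initremote` (hypothetical helper)."""
    args = ["git", "annex", "initremote", name, "type=S3"]
    # Each parameter is passed as a key=value word on the command line.
    args += [f"{key}={value}" for key, value in params.items()]
    return args


cmd = initremote_command(
    "mys3",  # hypothetical remote name
    encryption="none",
    datacenter="EU",
    storageclass="REDUCED_REDUNDANCY",
)
print(shlex.join(cmd))
```

Parameters that are left out, such as `host` and `port`, keep their defaults.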

## data security

When encryption=none, there is **no** protection against your data being read
as it is sent to/from S3, or by Amazon when it is stored in S3. This should
only be used for public data.

**Encryption is not yet supported.**

When encryption is enabled, all files stored in the bucket are
encrypted with gpg. Additionally, the filenames themselves are hashed
to obfuscate them. The size of the encrypted files, and access patterns of
the data, should be the only clues to what type of data you are storing in
S3.

[[!template id=note text="""
This scheme was originally developed by Lars Wirzenius et al.
[for Obnam](http://braawi.org/obnam/encryption/).
"""]]
The data stored in S3 is encrypted by gpg with a symmetric cipher. The
passphrase of the cipher is itself checked into your git repository,
encrypted using one or more gpg public keys. This scheme allows additional
gpg keys to be given access to a bucket's content after the bucket has been
created and is in use. The symmetric cipher is also hashed together with
the filenames used in the bucket, in order to obfuscate them.
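
The filename obfuscation can be sketched as follows. This is only an
illustration under stated assumptions: the text above says the cipher is
"hashed together" with filenames but does not name a construction, so
HMAC-SHA1 is assumed here, and `obfuscated_name` is a hypothetical helper,
not a git-annex function. A random byte string stands in for the shared
symmetric cipher.

```python
import hashlib
import hmac
import secrets


def obfuscated_name(cipher: bytes, filename: str) -> str:
    """Derive a bucket object name by keying a hash of the filename with the cipher.

    Assumption: HMAC-SHA1 is used purely for illustration.
    """
    return hmac.new(cipher, filename.encode("utf-8"), hashlib.sha1).hexdigest()


# Stand-in for the symmetric cipher that is stored, gpg-encrypted, in the git repository.
cipher = secrets.token_bytes(64)

name = obfuscated_name(cipher, "photos/beach.jpg")
print(name)
```

Anyone holding the cipher can recompute the same name for a given file,
while someone who sees only the bucket listing cannot map names back to
filenames without it.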