Support Amazon S3 as a file storage backend.

There's a Haskell library that looks good, but it is not yet in Debian.

Multiple ways of using S3 are possible. The current plan is to
have a special type of git remote (though git won't know how to use it;
only git-annex will) that uses an S3 bucket.

Something like:

	[remote "s3"]
		annex-s3bucket = mybucket
		annex-s3datacenter = Europe
		annex-uuid = 1a586cf6-45e9-11e0-ba9c-3b0a3397aec2
		annex-cost = 500

The UUID will be stored as a special file in the S3 bucket.

Using a different type of remote like this will allow S3 to be used
anywhere a regular remote would be used. `git annex get` will transparently
download a file from S3 if S3 has it and is the cheapest remote.

	git annex copy --to s3
	git annex move --from s3
	git annex drop --from s3 # not currently allowed, will need adding
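
The remote selection described above can be sketched roughly as follows. This is a hypothetical Python illustration, not git-annex code: the `Remote` type, the `has_key` callback, the key names, and the cost of 100 for the ordinary remote are all assumptions made for the example. Only the cost of 500 comes from the configuration above.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Remote:
    """Hypothetical stand-in for a configured remote (not a git-annex type)."""
    name: str
    cost: int                       # lower is cheaper, as with annex-cost
    has_key: Callable[[str], bool]  # assumed presence check


def cheapest_remote_with(key: str, remotes: Iterable[Remote]) -> Optional[Remote]:
    """Return the lowest-cost remote that has the key, or None if none does."""
    candidates = [r for r in remotes if r.has_key(key)]
    return min(candidates, key=lambda r: r.cost, default=None)


# Illustrative data: an ordinary remote and the S3 remote from the config above.
origin = Remote("origin", cost=100, has_key=lambda k: k in {"key-a"})
s3 = Remote("s3", cost=500, has_key=lambda k: k in {"key-a", "key-b"})
```

In this sketch `key-a` would come from `origin`, since it is cheaper, while `key-b` would be fetched from `s3`, the only remote that has it.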

Each S3 remote will count as one copy for numcopies handling, just like
any other remote.
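
A minimal sketch of what counting an S3 remote as one copy could mean when deciding whether a drop is safe. It assumes a simplified rule (content may be dropped from one place only if at least numcopies other places still have it). The function name and the location names are hypothetical, and this is not git-annex's implementation.

```python
def can_drop(key_locations: set, drop_from: str, numcopies: int) -> bool:
    """Simplified rule: after the drop, at least numcopies copies must remain."""
    remaining = key_locations - {drop_from}
    return len(remaining) >= numcopies
```

With numcopies set to 1, a file present locally and in the S3 remote could be dropped locally under this rule, because the S3 copy counts. With numcopies set to 2, it could not.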

## unused checking

One problem is `git annex unused`. Currently it only looks at the local
repository, not remotes. But if something is dropped from the local repo,
and you forget to drop it from S3, cruft can build up there.

This could be fixed by adding a hook to list all keys present in a remote.
Then `git annex unused` could scan remotes for keys and, if they are not
used locally, offer to drop them from the remote.
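
The proposed check amounts to a set difference between the keys a remote holds and the keys still used locally. A rough Python sketch, assuming a hypothetical `list_remote_keys` hook (no such hook exists yet) and made-up key names:

```python
from typing import Callable, Iterable, Set


def unused_on_remote(list_remote_keys: Callable[[], Iterable[str]],
                     keys_used_locally: Set[str]) -> Set[str]:
    """Keys present in the remote that nothing in the local repository uses."""
    return set(list_remote_keys()) - keys_used_locally


# Hypothetical hook: a real one would enumerate the objects in the S3 bucket.
def list_s3_keys():
    return ["key-a", "key-b", "key-c"]
```

If only `key-a` is still used locally, `unused_on_remote(list_s3_keys, {"key-a"})` reports `key-b` and `key-c` as candidates to drop from the remote.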