[[!comment format=mdwn
 username="joey"
 subject="""comment 1"""
 date="2015-06-02T17:24:10Z"
 content="""
The S3 remote already does this for the Internet Archive (IA) as a special
case -- it knows how to map from IA S3 buckets to archive.org urls, so the
special remote adds the url automatically. It would be quite easy to add
that for non-IA buckets; the user would just need to configure the url
where the files in the bucket can be found.
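
To illustrate the idea (the bucket url and key name below are made up), the
public url for a key would just be the configured base url with the key
name appended:

	# sketch only: build a credentials-less download url from a bucket's
	# public base url and a key name
	base=http://foo.s3.amazonaws.com/
	key=SHA256E-s12345--0123456789abcdef.tar
	echo "${base}${key}"
	# -> http://foo.s3.amazonaws.com/SHA256E-s12345--0123456789abcdef.tar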

As noted, this does increase storage needs, since the url is written to the
git-annex branch. Although I'd expect it to compress pretty well when git
gets around to packing it, since the added data is the static base of the
url, plus the key. Plus a little bit of overhead for hash directories and
timestamps. Oh, and plus modifying the location tracking info to indicate
that the file is present in the web special remote. Ok, maybe it's a lot of
overhead. :)

We could avoid writing this info to the git-annex branch, and instead just
use a function to generate urls for keys that are known to be in S3. The
`whereisKey` interface already exists -- it's how the web special remote
adds urls to the `whereis` display itself.

But I'm not sure what to do about the complication that the S3
special remote may not be enabled -- and if not, it doesn't get the
opportunity to register such a function.

Maybe this would need the S3 special remote to be enabled in read-only
mode? Although, if we're going to do that, it could just handle the
read-only retrieval itself, and not involve the web special remote at all.

So:

	git annex initremote S3 type=s3 ... bucket=foo readonlyurl=http://foo.s3.amazonaws.com/
	# elsewhere
	git annex enableremote S3 readonly=true
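
With that in place, getting a file on the credentials-less clone would
presumably just be the usual command, with the remote downloading from the
readonly url rather than talking to the S3 API:

	git annex get somefile --from S3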
"""]]