author Joey Hess <joey@kitenet.net> 2011-05-16 11:20:30 -0400
committer Joey Hess <joey@kitenet.net> 2011-05-16 11:20:35 -0400
commit 1d2984441c654f01e88e427f3289f8066cd2e6b0
tree 6e0232740696fc94e2f78becb262c18f45ef2506
parent 79c74bf27dfb9795ad35bc4e4c2061004212621d
add a few tweaks to make it easy to use the Internet Archive's variant of S3
In particular, munge key filenames to comply with the IA's filename limits, disable encryption, support their nonstandard way of creating buckets, and allow x-amz-* headers to be specified in initremote to set item metadata. Still TODO: initremote does not handle multiword metadata headers right.
Diffstat (limited to 'doc')
-rw-r--r-- doc/special_remotes/S3.mdwn 13
-rw-r--r-- doc/walkthrough/Internet_Archive_via_S3.mdwn 42
2 files changed, 32 insertions, 23 deletions
diff --git a/doc/special_remotes/S3.mdwn b/doc/special_remotes/S3.mdwn
index abd61ac79..d6a7229e3 100644
--- a/doc/special_remotes/S3.mdwn
+++ b/doc/special_remotes/S3.mdwn
@@ -5,6 +5,12 @@ See [[walkthrough/using_Amazon_S3]] for usage examples.
## configuration
+The standard environment variables `ANNEX_S3_ACCESS_KEY_ID` and
+`ANNEX_S3_SECRET_ACCESS_KEY` are used to supply login credentials
+for Amazon. When encryption is enabled, they are stored in encrypted form
+by `git annex initremote`, so you do not need to keep the environment
+variables set after the initial initialization of the remote.
+
A number of parameters can be passed to `git annex initremote` to configure
the S3 remote.
@@ -29,8 +35,5 @@ the S3 remote.
so by default, a bucket name is chosen based on the remote name
and UUID. This can be specified to pick a bucket name.
-The standard environment variables `ANNEX_S3_ACCESS_KEY_ID` and
-`ANNEX_S3_SECRET_ACCESS_KEY` can be used to supply login credentials
-for Amazon. When encryption is enabled, they are stored in encrypted form
-by `git annex initremote`, so you do not need to keep the environment
-variables set after the initial initialization of the remote.
+* `x-amz-*` parameters are passed through as HTTP headers when storing keys
+ in S3.
diff --git a/doc/walkthrough/Internet_Archive_via_S3.mdwn b/doc/walkthrough/Internet_Archive_via_S3.mdwn
index 3cd83a2e7..e0f8fafb4 100644
--- a/doc/walkthrough/Internet_Archive_via_S3.mdwn
+++ b/doc/walkthrough/Internet_Archive_via_S3.mdwn
@@ -15,20 +15,18 @@ Sign up for an account, and get your access keys here:
# export AWS_ACCESS_KEY_ID=blahblah
# export AWS_SECRET_ACCESS_KEY=xxxxxxx
-Now go to <http://www.archive.org/create/> and create the item.
-This allows you to fill in metadata which git-annex cannot provide to the
-Internet Archive. (It also works around a bug with bucket creation.)
-
-(Note that there seems to be a bug in either hS3 or the archive that
-breaks authentication when the item name contains spaces or upper-case
-letters.. use all lowercase and no spaces.)
-
-Specify `host=s3.us.archive.org` when doing initremote to set up
-a remote at the Archive. It does not make sense to use encryption.
-For the bucket name, specify the item name you created earlier.
-
- # git annex initremote panama type=S3 encryption=none host=s3.us.archive.org bucket=panama-canal-lock-blueprints
- initremote archive-panama (checking bucket) (creating bucket in US) ok
+Specify `host=s3.us.archive.org` when doing `initremote` to set up
+a remote at the Archive. This will enable a special Internet Archive mode:
+Encryption is not allowed; you are required to specify a bucket name
+rather than letting git-annex pick a random one; and you can optionally
+specify `x-archive-meta*` headers to add metadata as explained in their
+[documentation](http://www.archive.org/help/abouts3.txt).
+
+ # git annex initremote archive-panama type=S3 \
+ host=s3.us.archive.org bucket=panama-canal-lock-blueprints \
+ x-archive-meta-mediatype=texts x-archive-meta-language=eng \
+ x-archive-meta-title="original Panama Canal lock design blueprints"
+ initremote archive-panama (Internet Archive mode) (checking bucket) (creating bucket in US) ok
# git annex describe archive-panama "Internet Archive item for my grandfather's Panama Canal lock design blueprints"
describe archive-panama ok
@@ -36,11 +34,19 @@ Then you can annex files and copy them to the remote as usual:
# git annex add photo1.jpeg
add photo1.jpeg ok
- # git annex copy photo1.jpeg --to archive-panama
- copy (checking archive-panama...) (to archive-panama...) ok
+ # git annex copy photo1.jpeg --fast --to archive-panama
+ copy (to archive-panama...) ok
+
+-----
Note that it probably makes the most sense to use the WORM backend
for files, since that exposes the original filename in the key stored
in the Archive, which allows its special processing for sound files,
-movies, etc to be done. Also, the Internet Archive has restrictions
-on what is allowed in a filename; particularly no spaces are allowed.
+movies, etc to be done.
+
+Also, the Internet Archive has restrictions on what is allowed in a
+filename; particularly no spaces are allowed.
+
+There seems to be a bug in either hS3 or the archive that breaks
+authentication when the bucket name contains spaces or upper-case letters.
+Use all lowercase and no spaces when making the bucket with `initremote`.
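
The bucket-name caveat in the last paragraph can be checked before running `initremote`. The following is only a minimal sketch: `is_safe_bucket_name` is a hypothetical helper, not part of git-annex or hS3, and it tests only the two conditions mentioned in the note (upper-case letters and whitespace). The Internet Archive may impose further limits that it does not cover.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not a git-annex command):
# succeed only if the proposed bucket name is non-empty and contains
# no upper-case letters and no whitespace.
is_safe_bucket_name() {
    case "$1" in
        "") return 1 ;;                          # empty name
        *[[:upper:][:space:]]*) return 1 ;;      # upper-case letter or whitespace
        *) return 0 ;;
    esac
}
```

It could be used as a guard in front of the real command, for example `is_safe_bucket_name "$bucket" || echo "pick a lowercase name without spaces"`, where `$bucket` is whatever name you intend to pass as `bucket=`.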