aboutsummaryrefslogtreecommitdiff
path: root/doc/tips/Internet_Archive_via_S3.mdwn
diff options
context:
space:
mode:
Diffstat (limited to 'doc/tips/Internet_Archive_via_S3.mdwn')
-rw-r--r--doc/tips/Internet_Archive_via_S3.mdwn49
1 files changed, 49 insertions, 0 deletions
diff --git a/doc/tips/Internet_Archive_via_S3.mdwn b/doc/tips/Internet_Archive_via_S3.mdwn
new file mode 100644
index 000000000..8c0f2dde7
--- /dev/null
+++ b/doc/tips/Internet_Archive_via_S3.mdwn
@@ -0,0 +1,49 @@
+[The Internet Archive](http://www.archive.org/) allows members to upload
+collections using an Amazon S3
+[compatible API](http://www.archive.org/help/abouts3.txt), and this can
+be used with git-annex's [[special_remotes/S3]] support.
+
+So, you can locally archive things with git-annex, define remotes that
+correspond to "items" at the Internet Archive, and use git-annex to upload
+your files to there. Of course, your use of the Internet Archive must
+comply with their [terms of service](http://www.archive.org/about/terms.php).
+
+Sign up for an account, and get your access keys here:
+<http://www.archive.org/account/s3.php>
+
+ # export AWS_ACCESS_KEY_ID=blahblah
+ # export AWS_SECRET_ACCESS_KEY=xxxxxxx
+
+Specify `host=s3.us.archive.org` when doing `initremote` to set up
+a remote at the Archive. This will enable a special Internet Archive mode:
+Encryption is not allowed; you are required to specify a bucket name
+rather than having git-annex pick a random one; and you can optionally
+specify `x-archive-meta*` headers to add metadata as explained in their
+[documentation](http://www.archive.org/help/abouts3.txt).
+
+[[!template id=note text="""
+/!\ There seems to be a bug in either hS3 or the archive that breaks
+authentication when the bucket name contains spaces or upper-case letters..
+use all lowercase and no spaces when making the bucket with `initremote`.
+"""]]
+
+ # git annex initremote archive-panama type=S3 \
+ host=s3.us.archive.org bucket=panama-canal-lock-blueprints \
+ x-archive-meta-mediatype=texts x-archive-meta-language=eng \
+ x-archive-meta-title="original Panama Canal lock design blueprints"
+ initremote archive-panama (Internet Archive mode) ok
+ # git annex describe archive-panama "a man, a plan, a canal: panama"
+ describe archive-panama ok
+
+Then you can annex files and copy them to the remote as usual:
+
+ # git annex add photo1.jpeg --backend=SHA1E
+ add photo1.jpeg (checksum...) ok
+ # git annex copy photo1.jpeg --fast --to archive-panama
+ copy (to archive-panama...) ok
+
+Note the use of the SHA1E [[backend|backends]]. It makes most sense
+to use the WORM or SHA1E backend for files that will be stored in
+the Internet Archive, since the key name will be exposed as the filename
+there, and since the Archive does special processing of files based on
+their extension.