author Joey Hess <joeyh@joeyh.name> 2017-09-08 15:41:31 -0400
committer Joey Hess <joeyh@joeyh.name> 2017-09-08 15:46:24 -0400
commit 5ef1c9b5690057e5b18dc7dcc3627776b400c544 (patch)
tree f71d9ad13509977736bd55698cb4ccc18311e091 /doc
parent 23f55c0efdd58f8024d9b0c9e4b02db7b8d27b61 (diff)
S3 export (untested)

It opens a http connection per file exported, but then so does git annex
copy --to s3.

Decided not to munge exported filenames for IA. Too large a chance of the
munging having confusing results. Instead, export of files not supported
by IA, eg with spaces in their name, will fail.

This commit was supported by the NSF-funded DataLad project.

Diffstat (limited to 'doc')
-rw-r--r-- doc/special_remotes/S3.mdwn 4
-rw-r--r-- doc/tips/Internet_Archive_via_S3.mdwn 35
-rw-r--r-- doc/todo/export.mdwn 10
3 files changed, 21 insertions, 28 deletions
diff --git a/doc/special_remotes/S3.mdwn b/doc/special_remotes/S3.mdwn
index d526d35f5..138105f1d 100644
--- a/doc/special_remotes/S3.mdwn
+++ b/doc/special_remotes/S3.mdwn
@@ -66,6 +66,10 @@ the S3 remote.
so by default, a bucket name is chosen based on the remote name
and UUID. This can be specified to pick a bucket name.
+* `exporttree` - Set to "yes" to make this special remote usable
+ by [[git-annex-export]]. It will not be usable as a general-purpose
+ special remote.
+
* `public` - Set to "yes" to allow public read access to files sent
to the S3 remote. This is accomplished by setting an ACL when each
file is uploaded to the remote. So, changes to this setting will
diff --git a/doc/tips/Internet_Archive_via_S3.mdwn b/doc/tips/Internet_Archive_via_S3.mdwn
index 15f241c9f..20d14bdec 100644
--- a/doc/tips/Internet_Archive_via_S3.mdwn
+++ b/doc/tips/Internet_Archive_via_S3.mdwn
@@ -55,31 +55,14 @@ from it. Also, git-annex whereis will tell you a public url for the file
on archive.org. (It may take a while for archive.org to make the file
publically visibile.)
-Note the use of the SHA256E [[backend|backends]] when adding files. That is
-the default backend used by git-annex, but even if you don't normally use
-it, it makes most sense to use the WORM or SHA256E backend for files that
-will be stored in the Internet Archive, since the key name will be exposed
-as the filename there, and since the Archive does special processing of
-files based on their extension.
+## exporting trees
-## publishing only one subdirectory
+By default, files stored in the Internet Archive will show up there named
+by their git-annex key, not the original filename. If the filenames
+are important, you can run `git annex initremote` with an additional
+parameter "exporttree=yes", and then use [[git-annex-export]] to publish
+a tree of files to the Internet Archive.
-Perhaps you have a repository with lots of files in it, and only want
-to publish some of them to a particular Internet Archive item. Of course
-you can specify which files to send manually, but it's useful to
-configure [[preferred_content]] settings so git-annex knows what content
-you want to store in the Internet Archive.
-
-One way to do this is using the "public" repository type.
-
- git annex enableremote archive-panama preferreddir=panama
- git annex wanted archive-panama standard
- git annex group archive-panama public
-
-Now anything in a "panama" directory will be sent to that remote,
-and anything else won't. You can use `git annex copy --auto` or the
-assistant and it'll do the right thing.
-
-When setting up an Internet Archive item using the webapp, this
-configuration is automatically done, using an item name that the user
-enters as the name of the subdirectory.
+Note that the Internet Archive does not support filenames containing
+whitespace and some other characters. Exporting such problem filenames will
+fail; you can rename the file and re-export.
diff --git a/doc/todo/export.mdwn b/doc/todo/export.mdwn
index 7a94cd1c8..535678c2a 100644
--- a/doc/todo/export.mdwn
+++ b/doc/todo/export.mdwn
@@ -24,8 +24,14 @@ Work is in progress. Todo list:
export from another repository also doesn't work right, because the
export database is not populated. So, seems that the export database needs
to get populated based on the export log in these cases.
-* Support export to aditional special remotes (S3 etc)
-* Support export to external special remotes.
+* Support export to aditional special remotes (webdav etc)
+* Support export in the assistant (when eg setting up a S3 special remote).
+ Would need git-annex sync to export to the master tree?
+ This is similar to the little-used preferreddir= preferred content
+ setting and the "public" repository group.
+* Test S3 export.
+* Test export to IA via S3. In particualar, does removing an exported file
+ work?
Low priority:
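
The Internet Archive tip changed above says that exporting filenames containing whitespace will fail and that the file can be renamed and re-exported. As a rough illustration (not part of this commit or of git-annex), a pre-flight check along those lines might look like the sketch below. The function name `problem_filenames` and the sample paths are hypothetical, and only whitespace is checked, because the tip does not list the "other characters" the Internet Archive rejects.

```python
import re

# Assumption: only whitespace is checked. The tip mentions "some other
# characters" as well, but does not enumerate them, so they are not guessed here.
_WHITESPACE = re.compile(r"\s")


def problem_filenames(paths):
    """Return the paths that contain whitespace anywhere in them."""
    return [p for p in paths if _WHITESPACE.search(p)]


if __name__ == "__main__":
    # Hypothetical sample paths, for illustration only.
    candidates = ["panama/locks.pdf", "panama/canal map.png", "notes\ttabbed.txt"]
    for path in problem_filenames(candidates):
        print(f"rename before export: {path!r}")
```

Running such a check before an export would surface the files that need renaming up front, rather than discovering them one failed upload at a time.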