author    Joey Hess <joeyh@joeyh.name>    2017-09-08 16:19:38 -0400
committer Joey Hess <joeyh@joeyh.name>    2017-09-08 16:28:28 -0400
commit    4e44bd5314174c2e71d93d124ec5067052f2ec56 (patch)
tree      aa5fa2fae8890061ec90cf46cacc7ca4476e7a77
parent    5ef1c9b5690057e5b18dc7dcc3627776b400c544 (diff)
S3 export finalization
Fixed ACL issue, and updated some documentation.
-rw-r--r--  Remote/S3.hs                                                  |  6
-rw-r--r--  doc/tips/public_Amazon_S3_remote.mdwn                         |  6
-rw-r--r--  doc/tips/publishing_your_files_to_the_public.mdwn             | 99
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method.mdwn  | 88
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method/comment_1_48f545ce26dbec944f96796ed3b9204d._comment (renamed from doc/tips/publishing_your_files_to_the_public/comment_1_48f545ce26dbec944f96796ed3b9204d._comment) | 0
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method/comment_2_27a40806d009d617b3ad56873197bf87._comment (renamed from doc/tips/publishing_your_files_to_the_public/comment_2_27a40806d009d617b3ad56873197bf87._comment) | 0
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method/comment_3_2f5045629e40e8d881725876190c7846._comment (renamed from doc/tips/publishing_your_files_to_the_public/comment_3_2f5045629e40e8d881725876190c7846._comment) | 0
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method/comment_4_37405f20da790141187e9f780c999448._comment (renamed from doc/tips/publishing_your_files_to_the_public/comment_4_37405f20da790141187e9f780c999448._comment) | 0
-rw-r--r--  doc/tips/publishing_your_files_to_the_public/old_method/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment (renamed from doc/tips/publishing_your_files_to_the_public/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment) | 0
-rw-r--r--  doc/todo/export.mdwn                                          |  1
10 files changed, 120 insertions(+), 80 deletions(-)
diff --git a/Remote/S3.hs b/Remote/S3.hs
index 96d24d00e..f80a08bb2 100644
--- a/Remote/S3.hs
+++ b/Remote/S3.hs
@@ -357,14 +357,16 @@ checkPresentExportS3 r info _k loc =
go = withS3Handle (config r) (gitconfig r) (uuid r) $ \h -> do
checkKeyHelper info h (T.pack $ bucketExportLocation info loc)
+-- S3 has no move primitive; copy and delete.
renameExportS3 :: Remote -> S3Info -> Key -> ExportLocation -> ExportLocation -> Annex Bool
renameExportS3 r info _k src dest = catchNonAsync go (\e -> warning (show e) >> return False)
where
go = withS3Handle (config r) (gitconfig r) (uuid r) $ \h -> do
- -- S3 has no move primitive; copy and delete.
- void $ sendS3Handle h $ S3.copyObject (bucket info) dstobject
+ let co = S3.copyObject (bucket info) dstobject
(S3.ObjectId (bucket info) srcobject Nothing)
S3.CopyMetadata
+ -- ACL is not preserved by copy.
+ void $ sendS3Handle h $ co { S3.coAcl = acl info }
void $ sendS3Handle h $ S3.DeleteObject srcobject (bucket info)
return True
srcobject = T.pack $ bucketExportLocation info src
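The copy-then-delete rename above, and the ACL fix this commit makes, can be sketched against an in-memory stand-in for a bucket (illustrative Python model, not the real S3 API or the patch's Haskell):

```python
def rename_via_copy_delete(bucket, src, dest, acl):
    # S3 has no move primitive: copy to the destination key, then
    # delete the source. A copy does not preserve the ACL, so it must
    # be re-applied explicitly -- the bug this commit fixes.
    obj = dict(bucket[src])   # copy the object
    obj["acl"] = acl          # re-apply the ACL on the copy
    bucket[dest] = obj
    del bucket[src]           # delete the source object
    return True

bucket = {"old/path": {"data": b"hi", "acl": "private"}}
rename_via_copy_delete(bucket, "old/path", "new/path", "public-read")
```

Without the explicit ACL re-application, the renamed object would silently fall back to the bucket default and stop being publicly readable.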
diff --git a/doc/tips/public_Amazon_S3_remote.mdwn b/doc/tips/public_Amazon_S3_remote.mdwn
index d362fd75d..ce484adfb 100644
--- a/doc/tips/public_Amazon_S3_remote.mdwn
+++ b/doc/tips/public_Amazon_S3_remote.mdwn
@@ -2,6 +2,9 @@ Here's how to create an Amazon [[S3 special remote|special_remotes/S3]] that
can be read by anyone who gets a clone of your git-annex repository,
without them needing Amazon AWS credentials.
+If you want to publish files to S3 so they can be accessed without using
+git-annex, see [[publishing_your_files_to_the_public]].
+
Note: Bear in mind that Amazon will charge the owner of the bucket
for public downloads from that bucket.
@@ -52,6 +55,3 @@ who are not using git-annex. To find the url, use `git annex whereis`.
----
See [[special_remotes/S3]] for details about configuring S3 remotes.
-
-See [[publishing_your_files_to_the_public]] for other ways to use a public
-S3 bucket.
diff --git a/doc/tips/publishing_your_files_to_the_public.mdwn b/doc/tips/publishing_your_files_to_the_public.mdwn
index 5409dda0d..f7d332d57 100644
--- a/doc/tips/publishing_your_files_to_the_public.mdwn
+++ b/doc/tips/publishing_your_files_to_the_public.mdwn
@@ -1,88 +1,39 @@
# Creating a special S3 remote to hold files shareable by URL
-(In this example, I'll assume you'll be creating a bucket in S3 named **public-annex** and a special remote in git-annex, which will store its files in the previous bucket, named **public-s3**, but change these names if you are going to do the thing for real)
+In this example, I'll assume you'll be creating a bucket in Amazon S3 named
+$BUCKET and a special remote named public-s3. Be sure to replace $BUCKET
+with something like "public-bucket-joey" when you follow along in your
+shell.
-Set up your special [S3](http://git-annex.branchable.com/special_remotes/S3/) remote with (at least) these options:
+Set up your special [[S3 remote|special_remotes/S3]] with (at least) these options:
- git annex initremote public-s3 type=s3 encryption=none bucket=public-annex chunk=0 public=yes
+ git annex initremote public-s3 type=s3 encryption=none bucket=$BUCKET exporttree=yes public=yes
-This way git-annex will upload the files to this repo, (when you call `git
-annex copy [FILES...] --to public-s3`) without encrypting them and without
-chunking them. And, thanks to the public=yes, they will be
-accessible by anyone with the link.
+Then export the files in the master branch to the remote:
-(Note that public=yes was added in git-annex version 5.20150605.
-If you have an older version, it will be silently ignored, and you
-will instead need to use the AWS dashboard to configure a public get policy
-for the bucket.)
+ git annex export master --to public-s3
-Following the example, the files will be accessible at `http://public-annex.s3.amazonaws.com/KEY` where `KEY` is the file key created by git-annex and which you can discover running
+You can run that command again to update the export. See
+[[git-annex-export]] for details.
- git annex lookupkey FILEPATH
+Each exported file will be available to the public from
+`http://$BUCKET.s3.amazonaws.com/$file`
-This way you can share a link to each file you have at your S3 remote.
+Note: Bear in mind that Amazon will charge the owner of the bucket
+for public downloads from that bucket.
-## Sharing all links in a folder
+# Indexes
-To share all the links in a given folder, for example, you can go to that folder and run (this is an example with the _fish_ shell, but I'm sure you can do the same in _bash_, I just don't know exactly):
+By default, there is no index.html file exported, so if you open
+`http://$BUCKET.s3.amazonaws.com/` in a web browser, you'll see an
+XML document listing the files.
- for filename in (ls)
- echo $filename": https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)
- end
+For a nicer list of files, you can make an index.html file, check it into
+git, and export it to the bucket. You'll need to configure the bucket to
+use index.html as its index document, as
+[explained here](https://stackoverflow.com/questions/27899/is-there-a-way-to-have-index-html-functionality-with-content-hosted-on-s3).
-## Sharing all links matching certain metadata
+# Old method
-The same applies to all the filters you can do with git-annex.
-
-For example, let's share links to all the files whose _author_'s name starts with "Mario" and are, in fact, stored at your public-s3 remote.
-However, instead of just a list of links we will output a markdown-formatted list of the filenames linked to their S3 urls:
-
- for filename in (git annex find --metadata "author=Mario*" --and --in public-s3)
- echo "* ["$filename"](https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)")"
- end
-
-Very useful.
-
-## Sharing links with time-limited URLs
-
-By using pre-signed URLs it is possible to create limits on how long a URL is valid for retrieving an object.
-To enable use a private S3 bucket for the remotes and then pre-sign actual URL with the script in [AWS-Tools](https://github.com/gdbtek/aws-tools).
-Example:
-
- key=`git annex lookupkey "$fname"`; sign_s3_url.bash --region 'eu-west-1' --bucket 'mybuck' --file-path $key --aws-access-key-id XX --aws-secret-access-key XX --method 'GET' --minute-expire 10
-
-## Adding the S3 URL as a source
-
-Assuming all files in the current directory are available on S3, this will register the public S3 url for the file in git-annex, making it available for everyone *through git-annex*:
-
-<pre>
-git annex find --in public-s3 | while read file ; do
- key=$(git annex lookupkey $file)
- echo $key https://public-annex.s3.amazonaws.com/$key
-done | git annex registerurl
-</pre>
-
-`registerurl` was introduced in `5.20150317`.
-
-## Manually configuring a public get policy
-
-Here is how to manually configure a public get policy
-for a bucket, in the AWS dashboard.
-
- {
- "Version": "2008-10-17",
- "Statement": [
- {
- "Sid": "AllowPublicRead",
- "Effect": "Allow",
- "Principal": {
- "AWS": "*"
- },
- "Action": "s3:GetObject",
- "Resource": "arn:aws:s3:::public-annex/*"
- }
- ]
- }
-
-This should not be necessary if using a new enough version
-of git-annex, which can instead be configured with public=yet.
+To use `git annex export`, you need git-annex version 6.20170909 or
+newer. Before we had `git annex export` an [[old_method]] was used instead.
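The URL shape described above (`http://$BUCKET.s3.amazonaws.com/$file`) can be sketched as a small helper (illustrative only; bucket and path names are made up):

```python
def public_url(bucket, path):
    # Each exported file is served from the bucket's public
    # path-per-filename endpoint, per the exported-tree layout above.
    return f"http://{bucket}.s3.amazonaws.com/{path}"

print(public_url("public-bucket-joey", "index.html"))
```

Note that unlike the old method, the exported URL uses the file's name in the tree, not its git-annex key.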
diff --git a/doc/tips/publishing_your_files_to_the_public/old_method.mdwn b/doc/tips/publishing_your_files_to_the_public/old_method.mdwn
new file mode 100644
index 000000000..5409dda0d
--- /dev/null
+++ b/doc/tips/publishing_your_files_to_the_public/old_method.mdwn
@@ -0,0 +1,88 @@
+# Creating a special S3 remote to hold files shareable by URL
+
+(In this example, I'll assume you'll be creating an S3 bucket named **public-annex** and a git-annex special remote named **public-s3** that stores its files in that bucket; change these names if you are doing this for real.)
+
+Set up your special [S3](http://git-annex.branchable.com/special_remotes/S3/) remote with (at least) these options:
+
+ git annex initremote public-s3 type=s3 encryption=none bucket=public-annex chunk=0 public=yes
+
+This way git-annex will upload the files to this remote (when you call
+`git annex copy [FILES...] --to public-s3`) without encrypting them and
+without chunking them. And, thanks to public=yes, they will be
+accessible by anyone with the link.
+
+(Note that public=yes was added in git-annex version 5.20150605.
+If you have an older version, it will be silently ignored, and you
+will instead need to use the AWS dashboard to configure a public get policy
+for the bucket.)
+
+Following the example, the files will be accessible at `http://public-annex.s3.amazonaws.com/KEY`, where `KEY` is the key git-annex created for the file, which you can discover by running
+
+ git annex lookupkey FILEPATH
+
+This way you can share a link to each file you have at your S3 remote.
+
+## Sharing all links in a folder
+
+To share all the links in a given folder, for example, you can go to that folder and run (this example uses the _fish_ shell; an equivalent loop can be written in _bash_):
+
+ for filename in (ls)
+ echo $filename": https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)
+ end
+
+## Sharing all links matching certain metadata
+
+The same applies to all the filters you can do with git-annex.
+
+For example, let's share links to all the files whose _author_'s name starts with "Mario" and are, in fact, stored at your public-s3 remote.
+However, instead of just a list of links we will output a markdown-formatted list of the filenames linked to their S3 urls:
+
+ for filename in (git annex find --metadata "author=Mario*" --and --in public-s3)
+ echo "* ["$filename"](https://public-annex.s3.amazonaws.com/"(git annex lookupkey $filename)")"
+ end
+
+Very useful.
+
+## Sharing links with time-limited URLs
+
+By using pre-signed URLs, it is possible to limit how long a URL remains valid for retrieving an object.
+To enable this, use a private S3 bucket for the remote, and then pre-sign the actual URL with the script in [AWS-Tools](https://github.com/gdbtek/aws-tools).
+Example:
+
+ key=`git annex lookupkey "$fname"`; sign_s3_url.bash --region 'eu-west-1' --bucket 'mybuck' --file-path $key --aws-access-key-id XX --aws-secret-access-key XX --method 'GET' --minute-expire 10
+
+## Adding the S3 URL as a source
+
+Assuming all files in the current directory are available on S3, this will register the public S3 url for the file in git-annex, making it available for everyone *through git-annex*:
+
+<pre>
+git annex find --in public-s3 | while read file ; do
+ key=$(git annex lookupkey $file)
+ echo $key https://public-annex.s3.amazonaws.com/$key
+done | git annex registerurl
+</pre>
+
+`registerurl` was introduced in `5.20150317`.
+
+## Manually configuring a public get policy
+
+Here is how to manually configure a public get policy
+for a bucket, in the AWS dashboard.
+
+ {
+ "Version": "2008-10-17",
+ "Statement": [
+ {
+ "Sid": "AllowPublicRead",
+ "Effect": "Allow",
+ "Principal": {
+ "AWS": "*"
+ },
+ "Action": "s3:GetObject",
+ "Resource": "arn:aws:s3:::public-annex/*"
+ }
+ ]
+ }
+
+This should not be necessary if using a new enough version
+of git-annex, which can instead be configured with public=yes.
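The markdown-list loop in the old method above can be sketched as a helper that turns `git annex lookupkey` results into link lines (illustrative Python; the keys used here are made up, real annex keys look like `SHA256E-s...--...`):

```python
def key_url(bucket, key):
    # Old-method URLs point at the annex key, not the file name
    return f"https://{bucket}.s3.amazonaws.com/{key}"

def markdown_link_list(bucket, files):
    # files: (filename, key) pairs, as the `git annex lookupkey`
    # loops above would produce
    return "\n".join(f"* [{name}]({key_url(bucket, key)})"
                     for name, key in files)

print(markdown_link_list("public-annex", [("paper.pdf", "KEY1")]))
```

The same pairs, printed as `KEY URL` lines instead, are exactly what the `git annex registerurl` pipeline in the old method consumes.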
diff --git a/doc/tips/publishing_your_files_to_the_public/comment_1_48f545ce26dbec944f96796ed3b9204d._comment b/doc/tips/publishing_your_files_to_the_public/old_method/comment_1_48f545ce26dbec944f96796ed3b9204d._comment
index 6ee85367e..6ee85367e 100644
--- a/doc/tips/publishing_your_files_to_the_public/comment_1_48f545ce26dbec944f96796ed3b9204d._comment
+++ b/doc/tips/publishing_your_files_to_the_public/old_method/comment_1_48f545ce26dbec944f96796ed3b9204d._comment
diff --git a/doc/tips/publishing_your_files_to_the_public/comment_2_27a40806d009d617b3ad56873197bf87._comment b/doc/tips/publishing_your_files_to_the_public/old_method/comment_2_27a40806d009d617b3ad56873197bf87._comment
index 9cca4e2fa..9cca4e2fa 100644
--- a/doc/tips/publishing_your_files_to_the_public/comment_2_27a40806d009d617b3ad56873197bf87._comment
+++ b/doc/tips/publishing_your_files_to_the_public/old_method/comment_2_27a40806d009d617b3ad56873197bf87._comment
diff --git a/doc/tips/publishing_your_files_to_the_public/comment_3_2f5045629e40e8d881725876190c7846._comment b/doc/tips/publishing_your_files_to_the_public/old_method/comment_3_2f5045629e40e8d881725876190c7846._comment
index c76d3a30c..c76d3a30c 100644
--- a/doc/tips/publishing_your_files_to_the_public/comment_3_2f5045629e40e8d881725876190c7846._comment
+++ b/doc/tips/publishing_your_files_to_the_public/old_method/comment_3_2f5045629e40e8d881725876190c7846._comment
diff --git a/doc/tips/publishing_your_files_to_the_public/comment_4_37405f20da790141187e9f780c999448._comment b/doc/tips/publishing_your_files_to_the_public/old_method/comment_4_37405f20da790141187e9f780c999448._comment
index 2855c3fdd..2855c3fdd 100644
--- a/doc/tips/publishing_your_files_to_the_public/comment_4_37405f20da790141187e9f780c999448._comment
+++ b/doc/tips/publishing_your_files_to_the_public/old_method/comment_4_37405f20da790141187e9f780c999448._comment
diff --git a/doc/tips/publishing_your_files_to_the_public/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment b/doc/tips/publishing_your_files_to_the_public/old_method/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment
index bd77d03ce..bd77d03ce 100644
--- a/doc/tips/publishing_your_files_to_the_public/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment
+++ b/doc/tips/publishing_your_files_to_the_public/old_method/comment_5_29c3ee4aed6a5b53b6767a96a7b85ad9._comment
diff --git a/doc/todo/export.mdwn b/doc/todo/export.mdwn
index 535678c2a..ac77b3d72 100644
--- a/doc/todo/export.mdwn
+++ b/doc/todo/export.mdwn
@@ -29,7 +29,6 @@ Work is in progress. Todo list:
Would need git-annex sync to export to the master tree?
This is similar to the little-used preferreddir= preferred content
setting and the "public" repository group.
-* Test S3 export.
* Test export to IA via S3. In particular, does removing an exported file
work?