I started using a repo on S3, so that partially answered my question about how files are stored on S3.

author: wsha.code+ga@b38779424f41c5701bbe5937340be43ff1474b2d <wsha.code+ga@b38779424f41c5701bbe5937340be43ff1474b2d@web> 2015-12-08 10:02:06 +0000
committer: admin <admin@branchable.com> 2015-12-08 10:02:06 +0000
commit: 4e35c755f3e6a38469fa15438f0afa60cc54464e (patch)
tree: 28b21c485ea75b8c60e849b84b40f8343ee96837
parent: f2dd1ff901e2186171ae52447a0f82824e95cf37 (diff)
1 files changed, 2 insertions, 2 deletions
diff --git a/doc/forum/Future_proofing___47___disaster_recovery_with_an_encrypted_special_remote.mdwn b/doc/forum/Future_proofing___47___disaster_recovery_with_an_encrypted_special_remote.mdwn
index 8bc879de6..2c229d211 100644
--- a/doc/forum/Future_proofing___47___disaster_recovery_with_an_encrypted_special_remote.mdwn
+++ b/doc/forum/Future_proofing___47___disaster_recovery_with_an_encrypted_special_remote.mdwn
@@ -1,9 +1,9 @@
 I am interested in using `git annex` to manage encrypted backups to Amazon S3/Glacier. So `git annex` will be used with the main file directory in direct mode and an encrypted S3 or Glacier remote set up in archive mode and then `git annex add .` and `git annex sync` will be run periodically. The intent is for this set up to be a backup for catastrophic failure, so I want to make sure I take care of future-proofing and disaster recovery properly. So my basic question is what would I need to have backed up and what would I have to do if the computer with the main repository died. I try to break that out into more specific questions below.
 
-0. Do the S3/Glacier remotes just store the contents of `.git/annex/objects` in encrypted form and nothing else? So if I was left with nothing but the AWS bucket and couldn't get `git annex` to work for whatever reason, I could recover my files by hand if I had the encryption key (though I wouldn't know the file names or directory structure)?
+0. S3/Glacier remotes store the contents of `.git/annex/objects` in encrypted form with hashes for file names and nothing else (other than a uuid). The hashes do not match the keys in the main repo. Are they the same keys encrypted? Is there a way to look up the S3 file name corresponding to a file in the repo?
 
 1. For `shared` encryption, I see the cipher text in `remote.log` in the `git-annex` branch. Assuming I didn't have access to `git annex`, what would I need to do to convert that cipher text into a form that I could use with `gpg` to decrypt files?
    
 2. Same question but for `hybrid` encryption rather than `shared`. I assume the answer is similar but I need to decrypt the cipher first with my gpg key? How do I do that?
 
-3. Assuming I did have access to `git annex`, what would I need to create a new repo on a new computer with access to all of the files in the S3/Glacier bucket? I think I would need my Amazon credentials, my gpg key if using hybrid or public key encryption, and the `.git` folder as it was the last time files were pushed to the S3/Glacier remote (which would have the necessary decryption information for shared encryption). Is that right? I guess mainly I am checking that the remote does not store any metadata about the repo, so for `git annex` to be able to pull files back out I would need a backup of the `.git` directory and that back up would need to be up to date (can't just copy remote.log and have `git annex` work out the rest from the remote's contents). So for a full backup, my script would need to `tar` the `.git` directory, encrypt it, and push it to S3/Glacier separately after `git annex` does a sync. Then I could recover everything as long as I had a secure backup of my Amazon credentials and my encryption key(s).
+3. Assuming I did have access to `git annex`, what would I need to create a new repo on a new computer with access to all of the files in the S3/Glacier bucket? I think I would need my Amazon credentials (possibly already embedded in the git repo), my gpg key if using hybrid or public key encryption, and the `.git` folder as it was the last time files were pushed to the S3/Glacier remote (which would have the necessary decryption information for shared encryption). Is that right? I guess mainly I am checking that the remote does not store any metadata about the repo, so for `git annex` to be able to pull files back out I would need a backup of the `.git` directory and that back up would need to be up to date (can't just copy remote.log and have `git annex` work out the rest from the remote's contents). So for a full backup, my script would need to `tar` the `.git` directory, encrypt it, and push it to S3/Glacier separately after `git annex` does a sync. Then I could recover everything as long as I had a secure backup of my Amazon credentials and my encryption key(s).
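
The backup step proposed in question 3 (tar the `.git` directory, encrypt it, upload it separately after a sync) could be sketched roughly as follows. This is a minimal illustration, not a tested recovery procedure: the function names `archive_git_dir` and `gpg_encrypt_command`, the archive name, and the choice of public-key `gpg --encrypt` are all assumptions, and the upload to S3/Glacier is left out entirely. Note that outside direct mode `.git/annex/objects` holds the annexed file contents, so the tarball may be large.

```python
import tarfile
from pathlib import Path


def archive_git_dir(repo_root, archive_path):
    """Write a gzip-compressed tarball of <repo_root>/.git to archive_path.

    Hypothetical helper: the name and layout are illustrative only.
    """
    git_dir = Path(repo_root) / ".git"
    if not git_dir.is_dir():
        raise FileNotFoundError(f"no .git directory under {repo_root}")
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # Store entries under ".git/..." so the tarball unpacks in place.
        tar.add(str(git_dir), arcname=".git")
    return Path(archive_path)


def gpg_encrypt_command(archive_path, recipient):
    """Build (but do not run) a gpg command that encrypts the tarball.

    Assumes public-key encryption to `recipient`; the output is written
    next to the input with a ".gpg" suffix.
    """
    archive_path = str(archive_path)
    return [
        "gpg", "--encrypt", "--recipient", recipient,
        "--output", archive_path + ".gpg", archive_path,
    ]
```

The command list can be passed to `subprocess.run(..., check=True)` once the tarball exists; the resulting `.gpg` file would then be uploaded by whatever tool is used for the bucket.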