aboutsummaryrefslogtreecommitdiff
path: root/doc/forum/Future_proofing___47___disaster_recovery_with_an_encrypted_special_remote.mdwn
blob: 2c229d211bf5191e34ae8fac960c86fd89f52eeb (plain)
1
2
3
4
5
6
7
8
9
I am interested in using `git annex` to manage encrypted backups to Amazon S3/Glacier. So `git annex` will be used with the main file directory in direct mode and an encrypted S3 or Glacier remote set up in archive mode and then `git annex add .` and `git annex sync` will be run periodically. The intent is for this set up to be a backup for catastrophic failure, so I want to make sure I take care of future-proofing and disaster recovery properly. So my basic question is what would I need to have backed up and what would I have to do if the computer with the main repository died. I try to break that out into more specific questions below.

0. S3/Glacier remotes store the contents of `.git/annex/objects` in encrypted form with hashes for file names and nothing else (other than a uuid). The hashes do not match the keys in the main repo. Are they the same keys encrypted? Is there a way to look up the S3 file name corresponding to a file in the repo?

1. For `shared` encryption, I see the cipher text in `remote.log` in the `git-annex` branch. Assuming I didn't have access to `git annex`, what would I need to do to convert that cipher text into a form that I could use with `gpg` to decrypt files?
   
2. Same question but for `hybrid` encryption rather than `shared`. I assume the answer is similar but I need to decrypt the cipher first with my gpg key? How do I do that?

3. Assuming I did have access to `git annex`, what would I need to create a new repo on a new computer with access to all of the files in the S3/Glacier bucket? I think I would need my Amazon credentials (possibly already embedded in the git repo), my gpg key if using hybrid or public key encryption, and the `.git` folder as it was the last time files were pushed to the S3/Glacier remote (which would have the necessary decryption information for shared encryption). Is that right? I guess mainly I am checking that the remote does not store any metadata about the repo, so for `git annex` to be able to pull files back out I would need a backup of the `.git` directory and that back up would need to be up to date (can't just copy remote.log and have `git annex` work out the rest from the remote's contents). So for a full backup, my script would need to `tar` the `.git` directory, encrypt it, and push it to S3/Glacier separately after `git annex` does a sync. Then I could recover everything as long as I had a secure backup of my Amazon credentials and my encryption key(s).