aboutsummaryrefslogtreecommitdiff
path: root/doc/bugs/Glacier_remote_uploads_duplicates/comment_1_8aef582a0f0d0c7f764b425fc45de3b4._comment
blob: 9e42d4caec2ec3165ddeb55f31b6d97730cceaf6 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
[[!comment format=mdwn
 username="http://joeyh.name/"
 nickname="joey"
 subject="comment 1"
 date="2013-05-23T15:55:16Z"
 content="""
Please beware of the warning on the man page when using --trust-glacier-inventory:

> Be careful using this, especially if you or someone  else  might
> have  recently  removed  a file from Glacier. If you try to drop
> the only other copy of the file, and this switch is enabled, you
> could lose data!

While I'm inclined to want git-annex to store the necessary mappings from keys to glacier IDs in the git-annex branch, which would allow uploads/downloads from multiple repositories to the same glacier repository, it will not help with this problem. The git-annex branch can be out of date too.

It seems that what's needed is a separate form of the checkpresent hook, that's used when deciding whether to copy data to glacier.
We want this to trust the glacier inventory. But we don't want to trust the glacier inventory when moving data to glacier, or when running `git annex drop`! (unless --trust-glacier-inventory is specified). I think this would be easy to add. If you're up for testing a patch, I could do it today.

BTW, there does seem to be a workaround that avoids duplicate copies to glacier:

    git annex copy --to glacier --not --in glacier

While normally copy checks the inventory to see if a key has been sent to glacier, and so will re-send, the `--not --in glacier`
trusts the location tracking information, so if git-annex has sent the key before, it will skip the copy.
"""]]