doc/bugs/s3_special_remote_does_not_resume_uploads_even_with_new_chunking/comment_4_bd631d470ee0365a11483c9a2e563b32._comment


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30

[[!comment format=mdwn
 username="joey"
 subject="""comment 4"""
 date="2015-07-16T17:57:44Z"
 content="""
This should have been filed as a bug report... I will move the thread to
bugs after posting this comment.

In your obfuscated log, it tries to HEAD GPGHMACSHA1--1111111111
and when that fails, it PUTs GPGHMACSHA1--2222222222. From this, we can
deduce that GPGHMACSHA1--1111111111 is not the first chunk, but is the full
non-chunked file, and GPGHMACSHA1--2222222222 is actually the first chunk.

For testing, I modifed the S3 remote to make file uploads succeed, but then
report to git-annex that they failed. So, git annex copy uploads the 1st
chunk and then fails, same as it was interrupted there. Repeating the copy,
I see the same thing; it HEADs the full key, does not HEAD the first chunk,
and so doesn't notice it was uploaded before, and so re-uploads the first
chunk.

The HEAD of the full key is just done for backwards compatability reasons.
The problem is that it's not checking if the current chunk it's gonna
upload is present in the remote. But, there is code in seekResume that
is supposed to do that very check: `tryNonAsync (checker k)`

Aha, the problem seems to be in the checkpresent action that's passed to
that. Looks like it's passing in a dummy checkpresent action.

I've fixed this in git, and now it resumes properly in my test case.
"""]]