path: root/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment
author: Joey Hess <joeyh@joeyh.name> 2017-05-25 16:02:17 -0400
committer: Joey Hess <joeyh@joeyh.name> 2017-05-25 17:40:23 -0400
commit: 9785222714d65ded2274723c8b0a210c6152ea36
tree: 2cc0a99fbe0dd9f4924aa5b7e5bbbfe36e7cb80b /doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment
parent: bb803411fb99e482dca0c1c0aa740f28b4a98820
Fix transfer log file locking problem when running concurrent transfers.

orElse is great, but was not the right thing to use here because waitTakeLock could retry for other reasons than the lock being held, which made tryTakeLock fail when it shouldn't.

Instead, move the code to tryTakeLock and implement waitTakeLock using tryTakeLock and retry.

(Also, in runTransfer, when checkSaneLock fails, dropLock to avoid leaking a lock handle.)

This commit was supported by the NSF-funded DataLad project.
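
The shape of the fix described above can be illustrated with a minimal sketch. This is not the actual Utility.LockPool.STM code: the `LockPool` type, the `Bool` result, and `releaseLock` are simplified, hypothetical stand-ins, and only exclusive locks are modelled. The point it shows is that the non-blocking `tryTakeLock` holds the real logic and never calls `retry`, while the blocking `waitTakeLock` is just `tryTakeLock` plus `retry`.

```haskell
import Control.Concurrent.STM

-- Hypothetical, simplified pool: the list of lock files currently held.
type LockPool = TVar [FilePath]

-- Non-blocking: reports failure only when the lock is already held.
-- It never calls 'retry', so nothing else can make it give up.
tryTakeLock :: LockPool -> FilePath -> STM Bool
tryTakeLock pool file = do
  held <- readTVar pool
  if file `elem` held
    then return False
    else do
      writeTVar pool (file : held)
      return True

-- Blocking: built on tryTakeLock, using 'retry' to wait until the
-- pool changes and the lock may have become free.
waitTakeLock :: LockPool -> FilePath -> STM ()
waitTakeLock pool file = do
  ok <- tryTakeLock pool file
  if ok then return () else retry

-- Hypothetical helper to drop a lock from the pool.
releaseLock :: LockPool -> FilePath -> STM ()
releaseLock pool file = modifyTVar' pool (filter (/= file))

main :: IO ()
main = do
  pool <- newTVarIO []
  first <- atomically (tryTakeLock pool "transfer.lck")
  second <- atomically (tryTakeLock pool "transfer.lck")
  atomically (releaseLock pool "transfer.lck")
  -- The lock is free again, so this returns without blocking.
  atomically (waitTakeLock pool "transfer.lck")
  third <- atomically (tryTakeLock pool "transfer.lck")
  print (first, second, third)
```

In this sketch the only way `tryTakeLock` can report failure is the explicit membership check, which is the property the commit message asks for.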
Diffstat (limited to 'doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment')
-rw-r--r-- doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment | 40
1 files changed, 40 insertions, 0 deletions
diff --git a/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment
new file mode 100644
index 000000000..a004b75b9
--- /dev/null
+++ b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_3_6674e4dbc7437ce941bcef6272c3433b._comment
@@ -0,0 +1,40 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 3"""
+ date="2017-05-25T19:08:59Z"
+ content="""
+That looks like concurrent `git config` runs setting
+remote.origin.annex-uuid are failing.
+
+I have not reproduced the `.git/config` error, but with a local
+clone of a repository, I have been able to reproduce some intermittent
+"transfer already in progress, or unable to take transfer lock" failures
+with `git annex get -J5`, happening after remote.origin.annex-uuid has been
+cached.
+
+So, two distinct bugs, I think.
+
+---
+
+Debugging, the lock it fails to take always seems to be the lock on
+the remote side, which points to the local clone being involved somehow.
+
+Debugging further, Utility.LockPool.STM.tryTakeLock is what's failing.
+That's supposed to only fail when another thread holds a conflicting lock,
+but as it's implemented with `orElse`, if the main STM
+transaction retries due to other STM activity on the same TVar,
+`orElse` falls through and tryTakeLock gives up when it shouldn't.
+
+That's probably why this is happening under heavier concurrency loads;
+it makes that failure case much more likely. And with a local clone,
+twice as much locking is done.
+
+I've fixed this part of it!
+
+---
+
+The concurrent `git config` part remains.
+Since git-annex can potentially have multiple threads running different
+`git config` commands for their own reasons concurrently, it seems it will
+need to add its own locking around that.
+"""]]
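
One way the "own locking" mentioned at the end of the comment could look is a process-wide `MVar` that every thread takes before touching the config. This is a sketch under stated assumptions, not what git-annex does: `ConfigLock`, `withConfigLock`, and `fakeConfigUpdate` are hypothetical names, and an `IORef` read-modify-write stands in for running `git config`.

```haskell
import Control.Concurrent (forkIO, yield)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)
import Data.IORef

-- Hypothetical process-wide lock guarding config updates.
newtype ConfigLock = ConfigLock (MVar ())

newConfigLock :: IO ConfigLock
newConfigLock = ConfigLock <$> newMVar ()

-- Run an action while holding the lock; withMVar releases it
-- again even if the action throws an exception.
withConfigLock :: ConfigLock -> IO a -> IO a
withConfigLock (ConfigLock v) action = withMVar v (\_ -> action)

-- Stand-in for a `git config` call: a read-modify-write that
-- would lose updates if two threads interleaved.
fakeConfigUpdate :: IORef Int -> IO ()
fakeConfigUpdate ref = do
  n <- readIORef ref
  yield
  writeIORef ref (n + 1)

main :: IO ()
main = do
  lock <- newConfigLock
  ref <- newIORef 0
  done <- newEmptyMVar
  -- Five worker threads, as with -J5, each doing ten updates.
  forM_ [1 .. 5 :: Int] $ \_ -> forkIO $ do
    replicateM_ 10 (withConfigLock lock (fakeConfigUpdate ref))
    putMVar done ()
  replicateM_ 5 (takeMVar done)
  readIORef ref >>= print
```

Because every update runs inside `withConfigLock`, the updates are serialized and none are lost. A lock like this only protects against threads in the same process; separate git-annex processes would still need some other form of coordination.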