summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2017-05-25 18:27:01 -0400
committerGravatar Joey Hess <joeyh@joeyh.name>2017-05-25 18:28:23 -0400
commit893f5c6b0069bfed1750a58ee6aa38d2d98b1b18 (patch)
tree7b20f41fa2bd987888f08314c755d53d477c055b
parent9785222714d65ded2274723c8b0a210c6152ea36 (diff)
Avoid concurrent git-config setting problem when running concurrent threads.
See my comment. This only avoids the problem for -J; two git-annex processes started at the same time could still both try to write to .git/config and one fail. That would be very unlikely though, and it doesn't really seem worth adding an additional layer of locking around .git/config. This commit was supported by the NSF-funded DataLad project.
-rw-r--r--CHANGELOG2
-rw-r--r--CmdLine/Action.hs7
-rw-r--r--doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_.mdwn2
-rw-r--r--doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_4_480df575c68bfb37b8bb4fb43737726f._comment29
4 files changed, 40 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index bf9166722..1232b09f1 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -6,6 +6,8 @@ git-annex (6.20170520) UNRELEASED; urgency=medium
a transfer does not resume.
* Fix transfer log file locking problem when running concurrent
transfers.
+ * Avoid concurrent git-config setting problem when running concurrent
+ threads.
-- Joey Hess <id@joeyh.name> Wed, 24 May 2017 14:03:40 -0400
diff --git a/CmdLine/Action.hs b/CmdLine/Action.hs
index 27621e445..75c9e9471 100644
--- a/CmdLine/Action.hs
+++ b/CmdLine/Action.hs
@@ -16,6 +16,7 @@ import Types.Command
import Types.Concurrency
import Messages.Concurrent
import Types.Messages
+import Remote.List
import Control.Concurrent.Async
import Control.Exception (throwIO)
@@ -57,6 +58,12 @@ commandAction a = go =<< Annex.getState Annex.concurrency
ws <- Annex.getState Annex.workers
(st, ws') <- if null ws
then do
+ -- Generate the remote list now, to avoid
+ -- each thread generating it, which would
+ -- be more expensive and could cause
+ -- threads to contend over eg, calls to
+ -- setConfig.
+ _ <- remoteList
st <- dupState
return (st, replicate (n-1) (Left st))
else do
diff --git a/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_.mdwn b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_.mdwn
index 4725f9a07..10dbd5547 100644
--- a/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_.mdwn
+++ b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_.mdwn
@@ -29,3 +29,5 @@ SHA256E-s328--c2eb8088cdc71a0d4cbd660312bef5421a47ce7da3655efdb17712e7188be4a1.t
"""]]
[[!meta author=yoh]]
+
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_4_480df575c68bfb37b8bb4fb43737726f._comment b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_4_480df575c68bfb37b8bb4fb43737726f._comment
new file mode 100644
index 000000000..3827077c2
--- /dev/null
+++ b/doc/bugs/parallel_get_can_fail_some_downloads_and_require_re-getting_/comment_4_480df575c68bfb37b8bb4fb43737726f._comment
@@ -0,0 +1,29 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 4"""
+ date="2017-05-25T21:52:37Z"
+ content="""
+The .git/config concurrent access happens because the remote
+list is only generated on demand, and nothing demands it when running with
+-J until all the threads are spun up and each thread has its own state
+then, so each generates the remote list.
+
+There don't look to be any other git-config settings that
+would cause problems for concurrency other than ones run while
+generating the remote list.
+
+So, generating the remote list before starting concurrent threads
+would avoid that problem, and also leads to a slightly faster startup
+since the remote git config only has to be read once, etc.
+
+The only risk in doing that would be if generating a Remote
+opens some kind of handle, which can't be used concurrently, or
+is less efficient if only opened once and then used by multiple threads.
+
+I've audited all the Remote.*.gen methods, and they all
+seem ok. For example, Remote.External.gen sets up a worker pool
+of external special remote processes, and new ones are allocated as needed.
+And Remote.P2P.chainGen sets up a connection pool.
+
+Ok, gone ahead with this fix.
+"""]]