summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2016-04-18 14:04:20 -0400
committerGravatar Joey Hess <joeyh@joeyh.name>2016-04-18 14:04:50 -0400
commit71f7e6a178b9b5e694975729f276548fc5461c21 (patch)
tree6645ff0c45230878ed227f40fad5660ea87ed608
parentc6d028f5e85e8cf3dcc2c15cf2e9d40a6ad16b81 (diff)
fix drop hang reported by musicmatze
Fix hang when dropping content needs to lock the content on a ssh remote, which occurred when the remote has git-annex version 5.20151019 or newer. Analysis: `race` runs 2 threads at once, and the hGetLine finishes first. So, it tries to cancel the waitForProcess, but unfortunately that is making a foreign call and so cannot be canceled. The remote git-annex-shell is waiting for a line on stdin before it will exit. Deadlock. This only occurred sometimes; I reproduced it going from darkstar to elephant, but not from darkstar to darkstar. Not sure how that fits into the above analysis -- perhaps a race condition is also involved? Fixed by not using `race`; now the hGetLine will fail with an exception if the remote git-annex-shell exits without any output.
-rw-r--r--Remote/Git.hs21
-rw-r--r--debian/changelog4
2 files changed, 13 insertions, 12 deletions
diff --git a/Remote/Git.hs b/Remote/Git.hs
index 256428137..284d6a49c 100644
--- a/Remote/Git.hs
+++ b/Remote/Git.hs
@@ -57,7 +57,6 @@ import Types.NumCopies
import Control.Concurrent
import Control.Concurrent.MSampleVar
-import Control.Concurrent.Async
import qualified Data.Map as M
import Network.URI
@@ -387,17 +386,14 @@ lockKey r key callback
, std_out = CreatePipe
, std_err = UseHandle nullh
}
- -- Wait for either the process to exit, or for it to
- -- indicate the content is locked.
- v <- liftIO $ race
- (waitForProcess p)
- (hGetLine hout)
- let signaldone = void $ tryNonAsync $ liftIO $ do
- hPutStrLn hout ""
- hFlush hout
- hClose hin
- hClose hout
- void $ waitForProcess p
+ v <- liftIO $ tryIO $ hGetLine hout
+ let signaldone = void $ tryNonAsync $ liftIO $ mapM_ tryNonAsync
+ [ hPutStrLn hout ""
+ , hFlush hout
+ , hClose hin
+ , hClose hout
+ , void $ waitForProcess p
+ ]
let checkexited = not . isJust <$> getProcessExitCode p
case v of
Left _exited -> do
@@ -405,6 +401,7 @@ lockKey r key callback
liftIO $ do
hClose hin
hClose hout
+ void $ waitForProcess p
failedlock
Right l
| l == Ssh.contentLockedMarker -> bracket_
diff --git a/debian/changelog b/debian/changelog
index 588c34542..9c008669d 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -17,6 +17,10 @@ git-annex (6.20160413) UNRELEASED; urgency=medium
* fsck: Warn when core.sharedRepository is set and an annex object file's
write bit is not set and cannot be set due to the file being owned
by a different user.
+ * Fix hang when dropping content needs to lock the content on a
+ ssh remote, which occurred when the remote has git-annex version
+ 5.20151019 or newer. (The bug was in the client side; the remote
+ git-annex-shell does not need to be upgraded.)
-- Joey Hess <id@joeyh.name> Wed, 13 Apr 2016 13:30:32 -0400