summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2016-05-02 10:53:24 -0400
committerGravatar Joey Hess <joeyh@joeyh.name>2016-05-02 10:53:24 -0400
commitef2feac946c7b61f29486a5554e0d5e8a4a5b4e7 (patch)
tree59ef658566c544caa7b6463b59c19ef8f1c94b28
parent6e795acaa5735491080c8bd77fbc67b921871a96 (diff)
Fix bug that sometimes prevented git-annex smudge --clean from consuming all its input, which resulted in git add bypassing git-annex.
-rw-r--r--Command/Smudge.hs9
-rw-r--r--debian/changelog2
-rw-r--r--doc/bugs/git-annex_smudge_fails_on_git_add.mdwn2
-rw-r--r--doc/bugs/git-annex_smudge_fails_on_git_add/comment_7_dfcc8af09fb7c7658613bd7bb94a7723._comment48
4 files changed, 59 insertions, 2 deletions
diff --git a/Command/Smudge.hs b/Command/Smudge.hs
index 1e169c1bf..52367d499 100644
--- a/Command/Smudge.hs
+++ b/Command/Smudge.hs
@@ -73,8 +73,13 @@ clean file = do
if isJust (parseLinkOrPointer b)
then liftIO $ B.hPut stdout b
else ifM (shouldAnnex file)
- ( liftIO . emitPointer
- =<< go =<< ingest =<< lockDown cfg file
+ ( do
+ -- Even though we ingest the actual file,
+ -- and not stdin, we need to consume all
+ -- stdin, or git will get annoyed.
+ B.length b `seq` return ()
+ liftIO . emitPointer
+ =<< go =<< ingest =<< lockDown cfg file
, liftIO $ B.hPut stdout b
)
stop
diff --git a/debian/changelog b/debian/changelog
index d4c586bac..22ff0c5ca 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,5 +1,7 @@
git-annex (6.20160420) UNRELEASED; urgency=medium
+ * Fix bug that sometimes prevented git-annex smudge --clean from consuming
+ all its input, which resulted in git add bypassing git-annex.
* Fix build with directory-1.2.6.2.
-- Joey Hess <id@joeyh.name> Thu, 28 Apr 2016 13:17:04 -0400
diff --git a/doc/bugs/git-annex_smudge_fails_on_git_add.mdwn b/doc/bugs/git-annex_smudge_fails_on_git_add.mdwn
index 61e90d184..5b91e1ed3 100644
--- a/doc/bugs/git-annex_smudge_fails_on_git_add.mdwn
+++ b/doc/bugs/git-annex_smudge_fails_on_git_add.mdwn
@@ -55,3 +55,5 @@ error: external filter git-annex smudge --clean %f failed
### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
No, I'm a first-time user. Thanks for this neat piece of software.
+
+> [[fixed|done]] --[[Joey]]
diff --git a/doc/bugs/git-annex_smudge_fails_on_git_add/comment_7_dfcc8af09fb7c7658613bd7bb94a7723._comment b/doc/bugs/git-annex_smudge_fails_on_git_add/comment_7_dfcc8af09fb7c7658613bd7bb94a7723._comment
new file mode 100644
index 000000000..bf5e62256
--- /dev/null
+++ b/doc/bugs/git-annex_smudge_fails_on_git_add/comment_7_dfcc8af09fb7c7658613bd7bb94a7723._comment
@@ -0,0 +1,48 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 7"""
+ date="2016-05-02T14:19:43Z"
+ content="""
+Relevant parts of the strace:
+
+ [pid 21392] read(0, <unfinished ...>
+ [pid 21393] <... futex resumed> ) = 0
+ [pid 21392] <... read resumed> "\0\0009\214moov\0\0\0lmvhd\0\0\0\0\300\\\26\256\300\\\26\267\0\0\2X"..., 32752) = 32752
+ [pid 21392] write(1, "/annex/objects/SHA256E-s2589913-"..., 102 <unfinished...>
+ [pid 21389] <... read resumed> "/annex/objects/SHA256E-s2589913-"..., 2589913) = 102
+
+21392 is git-annex smudge, and 21389 is apparently a thread belonging to git.
+This shows git feeding 32752 bytes of content to git-annex smudge on stdin, and
+git-annex smudge responding with the annex object.
+
+Notice that git-annex calculates the actual file size to be 2589913 bytes, more
+than git has sent it at the time it replies. It can tell this because it stats
+and hashes the actual file.
+
+ [pid 21392] +++ exited with 0 +++
+ [pid 21390] write(9, "ml\3\23<\270\315\321s\322\306\341\301e\261\312[<\2\n\17\315\16a\v\245\215(q\207\314\23"..., 2495705) = -1 EPIPE (Broken pipe)
+ [pid 21390] --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=21389, si_uid=2001} ---
+ [pid 21390] close(9) = 0
+ [pid 21390] write(2, "error: cannot feed the input to "..., 76error: cannot feed the input to external filter git-annex smudge --clean %f
+
+Now git-annex exits, and then git, tries to feed it another 2495705 bytes
+of data after it's exited.
+
+So, it seems that the problem is git-annex is somehow doing a short read of its
+stdin, and not waiting for git to feed it the full content.
+
+And, I see this bug now in the code:
+
+ b <- liftIO $ B.hGetContents stdin
+ if isJust (parseLinkOrPointer b)
+
+`parseLinkOrPointer` is careful to *not* look at the whole file content,
+so nothing forces the lazy bytestring read to consume all of stdin.
+So, if git's writes are broken up as happened in this strace, git-annex
+will exit without consuming the whole stdin.
+
+I've committed a fix along with this comment, so since you're building
+from source, and able to reproduce this problem (which I have not
+so far been able to reproduce myself)
+you should be able to update and rebuild and verify the fix.
+"""]]