author Joey Hess <joey@kitenet.net> 2012-07-18 19:42:29 -0400
committer Joey Hess <joey@kitenet.net> 2012-07-18 19:42:29 -0400
commit 61695f9f41a90b05c61661f9d9d052afcb5783df
tree 65ad779e92a576ca5ea96b66f8aa58ca378b80f9
parent cfdd4d66029915c683a1653c941c05e45205b13a
blog for the day
-rw-r--r-- doc/design/assistant/blog/day_39__twice_is_enemy_action.mdwn 66
-rw-r--r-- doc/design/assistant/syncing.mdwn 14
-rw-r--r-- doc/todo/assistant_threaded_runtime.mdwn 5
3 files changed, 78 insertions, 7 deletions
diff --git a/doc/design/assistant/blog/day_39__twice_is_enemy_action.mdwn b/doc/design/assistant/blog/day_39__twice_is_enemy_action.mdwn
new file mode 100644
index 000000000..14896fcb1
--- /dev/null
+++ b/doc/design/assistant/blog/day_39__twice_is_enemy_action.mdwn
@@ -0,0 +1,66 @@
+Beating my head against the threaded runtime some more. I can reproduce
+one of the hangs consistently by running 1000 git annex add commands
+in a loop. It hangs around 1% of the time, reading from `git cat-file`.
+
+Interestingly, `git cat-file` is not yet running at this point --
+git-annex has forked a child process, but the child has not yet exec'd it.
+Stracing the child git-annex, I see it stuck in a futex. Adding tracing,
+I see the child never manages to run any code at all.
+
+This really looks like the problem is once again in MissingH, which uses
+`forkProcess`. Which happens to come with a big warning about being very
+unsafe, in very subtle ways. Looking at the C code that the newer `process`
+library uses when spawning a pipe to a process, it messes around with lots of
+things; blocking signals, stopping a timer, etc. Hundreds of lines of C
+code to safely start a child process, all doing things that MissingH omits.
+
+That's the second time I've seemingly isolated a hang in the GHC threaded
+runtime to MissingH.
+
+And so I've started converting git-annex to use the new `process` library,
+for running all its external commands. John Goerzen had mentioned `process`
+to me once before when I found a nasty bug in MissingH, as the cool new
+thing that would probably eliminate the `System.Cmd.Utils` part of MissingH,
+but I'd not otherwise heard much about it. (It also seems to have the
+benefit of supporting Windows.)
+
+This is a big change and it's early days, but each time I see a hang, I'm
+converting the code to use `process`, and so far the hangs have just gone
+away when I do that.
+
+---
+
+Hours later... I've converted *all* of git-annex to use `process`.
+
+In the er, process, the `--debug` switch stopped printing all the commands
+it runs. I may try to restore that later.
+
+I've not tested everything, but the test suite passes, even when
+using the threaded runtime. **MILESTONE**
+
+Looking forward to getting out of these weeds and back to useful work..
+
+---
+
+Hours later yet.... The `assistant` branch in git now uses the threaded
+runtime. It works beautifully, using proper threads to run file transfers
+in.
+
+That should fix the problem I was seeing on OSX yesterday. Too tired to
+test it now.
+
+---
+
+Amazingly, all the assistant's own dozen or so threads and thread
+synch variables etc all work great under the threaded runtime. I had
+assumed I'd see yet more concurrency problems there when switching to it,
+but it all looks good. (Or whatever problems there are are subtle ones?)
+
+I'm very relieved. The threaded logjam is broken! I had been getting
+increasingly worried that not having the threaded runtime available would
+make it very difficult to make the assistant perform really well, and cause
+problems with the webapp, perhaps preventing me from using Yesod.
+
+Now it looks like smooth sailing ahead. Still some hard problems, but
+it feels like with inotify and kqueue and the threaded runtime all
+dealt with, the really hard infrastructure-level problems are behind me.
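
As a rough illustration of the conversion described in the post, here is a minimal sketch of reading a command's output through the `process` library instead of a hand-rolled `forkProcess` pipe. The helper name `readCommandOutput` is hypothetical and is not taken from git-annex; only `createProcess`, `proc`, `std_out`, `CreatePipe`, and `waitForProcess` come from `System.Process`.

```haskell
import Control.Exception (evaluate)
import System.Exit (ExitCode (..))
import System.IO (hGetContents)
import System.Process
  ( CreateProcess (..), StdStream (CreatePipe)
  , createProcess, proc, waitForProcess )

-- | Run a command and return everything it writes to stdout.
-- Hypothetical helper for illustration; not git-annex's actual code.
readCommandOutput :: FilePath -> [String] -> IO String
readCommandOutput cmd args = do
  -- createProcess does the fork/exec and pipe setup inside the library,
  -- so no Haskell code has to run in the forked child.
  (_, Just hout, _, ph) <-
    createProcess (proc cmd args) { std_out = CreatePipe }
  out <- hGetContents hout
  -- Force the lazy string so the pipe is drained before waiting.
  _ <- evaluate (length out)
  code <- waitForProcess ph
  case code of
    ExitSuccess -> return out
    ExitFailure n ->
      ioError (userError (cmd ++ " exited with status " ++ show n))

main :: IO ()
main = readCommandOutput "git" ["--version"] >>= putStr
```

The point of the sketch is only that the child setup is delegated to `process`; how git-annex actually wraps these calls is not shown in this commit.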
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 3fe27d5ac..9b3e3b08e 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -10,13 +10,6 @@ all the other git clones, at both the git level and the key/value level.
on remotes, and transfer. But first, need to ensure that when a remote
receives content, and updates its location log, it syncs that update
out.
-* Transfer watching has a race on kqueue systems, which makes finished
- fast transfers not be noticed by the TransferWatcher. Which in turn
- prevents the transfer slot being freed and any further transfers
- from happening. So, this approach is too fragile to rely on for
- maintaining the TransferSlots. Instead, need [[todo/assistant_threaded_runtime]],
- which would allow running something for sure when a transfer thread
- finishes.
## longer-term TODO
@@ -106,3 +99,10 @@ anyway.
Annex state monad. **done**
* Write transfer control thread, which decides when to launch transfers.
**done**
+* Transfer watching has a race on kqueue systems, which makes finished
+ fast transfers not be noticed by the TransferWatcher. Which in turn
+ prevents the transfer slot being freed and any further transfers
+ from happening. So, this approach is too fragile to rely on for
+ maintaining the TransferSlots. Instead, need [[todo/assistant_threaded_runtime]],
+ which would allow running something for sure when a transfer thread
+ finishes. **done**
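
The idea of "running something for sure when a transfer thread finishes" can be sketched as follows. This is an assumption-laden illustration, not the assistant's real TransferSlots code: the slots are modelled with a plain `QSem`, and `runTransfer` and the `done` variable are hypothetical names. `forkFinally` is from `Control.Concurrent` in `base`.

```haskell
import Control.Concurrent (forkFinally, threadDelay)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.QSem (QSem, newQSem, signalQSem, waitQSem)
import Control.Monad (forM_, replicateM_)

-- | Run a transfer in its own thread, holding one slot while it runs.
-- The handler passed to forkFinally runs whether the transfer returns
-- normally or throws, so the slot is always given back.
runTransfer :: QSem -> MVar () -> IO () -> IO ()
runTransfer slots done transfer = do
  waitQSem slots                      -- block until a slot is free
  _ <- forkFinally transfer $ \_ -> do
    signalQSem slots                  -- free the slot first
    putMVar done ()                   -- then report completion
  return ()

main :: IO ()
main = do
  slots <- newQSem 2                  -- assumed limit: two transfers at once
  done <- newEmptyMVar
  forM_ [1 .. 5 :: Int] $ \_ ->
    runTransfer slots done (threadDelay 10000)
  replicateM_ 5 (takeMVar done)       -- wait for all five to finish
  putStrLn "all transfers finished"
```

Unlike watching transfer files for changes, the cleanup here is tied to the thread itself, so a fast transfer cannot finish unnoticed.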
diff --git a/doc/todo/assistant_threaded_runtime.mdwn b/doc/todo/assistant_threaded_runtime.mdwn
index edfa51669..3953cf062 100644
--- a/doc/todo/assistant_threaded_runtime.mdwn
+++ b/doc/todo/assistant_threaded_runtime.mdwn
@@ -13,6 +13,11 @@ When pulling, pushing, and merging, the assistant runs external git
commands, and this does block all other threads. The threaded runtime would
really help here.
+[[done]]; the assistant now builds with the threaded runtime.
+Some work still remains to run certain long-running external git commands
+in their own threads to prevent them blocking things, but that is easy to
+do, now. --[[Joey]]
+
---
Currently, git-annex seems unstable when built with the threaded runtime.
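
A minimal sketch of running a long-running external command in its own thread, as the note above proposes. `runInThread` is a hypothetical name used only for illustration; `readProcessWithExitCode` is from `System.Process`. Per the `process` documentation, the program needs to be built with `-threaded` for the wait on the child not to block the other Haskell threads.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (SomeException, try)
import System.Exit (ExitCode)
import System.Process (readProcessWithExitCode)

-- | Start a command in its own Haskell thread. The returned MVar is
-- filled with the exit code and output, or with the exception if the
-- command could not be run.
runInThread
  :: FilePath
  -> [String]
  -> IO (MVar (Either SomeException (ExitCode, String, String)))
runInThread cmd args = do
  result <- newEmptyMVar
  _ <- forkIO $ try (readProcessWithExitCode cmd args "") >>= putMVar result
  return result

main :: IO ()
main = do
  pending <- runInThread "git" ["--version"]
  putStrLn "other work can continue while git runs"
  outcome <- takeMVar pending
  case outcome of
    Right (code, out, _err) -> putStr out >> print code
    Left e -> putStrLn ("command could not be run: " ++ show e)
```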