author | 2012-06-10 16:33:42 -0400 | |
---|---|---|
committer | 2012-06-10 16:33:42 -0400 | |
commit | a0e29b214f9cb0d3772f9d97d152e2dd9e40adf5 (patch) | |
tree | 911750bfada290bc26154bacd24435332191fe3b /doc | |
parent | 3bb58afd594c7208a34749202a858644498acb6f (diff) |
blog for the day
Diffstat (limited to 'doc')
-rw-r--r-- | doc/design/assistant/blog/day_5__committing.mdwn | 57 |
1 files changed, 57 insertions, 0 deletions
diff --git a/doc/design/assistant/blog/day_5__committing.mdwn b/doc/design/assistant/blog/day_5__committing.mdwn
new file mode 100644
index 000000000..3840138c6
--- /dev/null
+++ b/doc/design/assistant/blog/day_5__committing.mdwn
@@ -0,0 +1,57 @@
+After a few days otherwise engaged, back to work today.
+
+My focus was on adding the committing thread mentioned in [[day_4__speed]].
+I got rather further than expected!
+
+First, I implemented a really dumb thread, that woke up once per second,
+checked if any changes had been made, and committed them. Of course, this
+rather sucked. In the middle of a large operation like untarring a tarball,
+or `rm -r` of a large directory tree, it made lots of commits and made
+things slow and ugly. This was not unexpected.
+
+So next, I added some smarts to it. First, I wanted to stop it waking up
+every second when there was nothing to do, and instead have it block until a
+change occurred. Secondly, I wanted it to know when past changes happened,
+so it could detect batch mode scenarios, and avoid committing too
+frequently.
+
+I played around with combinations of various Haskell thread communication
+tools to get that information to the committer thread: `MVar`, `Chan`,
+`QSem`, `QSemN`. Eventually, I realized all I needed was a simple channel
+through which the timestamps of changes could be sent. However, `Chan`
+wasn't quite suitable, and I had to add a dependency on
+[Software Transactional Memory](http://en.wikipedia.org/wiki/Software_Transactional_Memory),
+and use a `TChan`. Now I'm cooking with gas!
+
+With that data channel available to the committer thread, it quickly got
+some very nice smart behavior. Playing around with it, I find it commits
+*instantly* when I'm making some random change that I'd want the
+git-annex assistant to sync out instantly; and that its batch job detection
+works pretty well too.
+
+There's surely room for improvement, and I made this part of the code
+be an entirely pure function, so it's really easy to change the strategy.
+This part of the committer thread is so nice and clean, that here's the
+current code, for your viewing pleasure:
+
+[[format haskell """
+{- Decide if now is a good time to make a commit.
+ - Note that the list of change times has an undefined order.
+ -
+ - Current strategy: If there have been 10 changes within the past second,
+ - a batch activity is taking place, so wait for later.
+ -}
+shouldCommit :: UTCTime -> [UTCTime] -> Bool
+shouldCommit now changetimes
+    | len == 0 = False
+    | len > 4096 = True -- avoid bloating queue too much
+    | length (filter thisSecond changetimes) < 10 = True
+    | otherwise = False -- batch activity
+    where
+        len = length changetimes
+        thisSecond t = now `diffUTCTime` t <= 1
+"""]]
+
+Still some polishing to do to eliminate minor inefficiencies and deal
+with more races, but this part of the git-annex assistant is now very usable,
+and will be going out to my beta testers soon!
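
The post describes sending change timestamps to the committer thread over a `TChan` and then handing them to the pure `shouldCommit` function, but it only shows the pure half. The following is a minimal sketch of how the channel side might look, not the code from this commit. It assumes the `stm` and `time` packages; `recordChange` and `getChanges` are hypothetical names introduced here for illustration, and `shouldCommit` is repeated from the post so the block stands on its own.

```haskell
import Control.Concurrent.STM
import Data.Time.Clock

-- Hypothetical helper: whoever notices a change sends its timestamp
-- down the channel.
recordChange :: TChan UTCTime -> IO ()
recordChange chan = do
    t <- getCurrentTime
    atomically (writeTChan chan t)

-- Hypothetical helper: block until at least one change time is queued,
-- then take everything else that is already waiting, without blocking
-- again. The whole read happens in a single STM transaction.
getChanges :: TChan UTCTime -> IO [UTCTime]
getChanges chan = atomically $ do
    first <- readTChan chan -- retries (blocks) while the channel is empty
    rest <- drain
    return (first : rest)
  where
    drain = do
        next <- tryReadTChan chan
        case next of
            Nothing -> return []
            Just t -> (t :) <$> drain

-- The pure decision function from the post: commit unless 10 or more
-- changes fall within the past second, or if the queue has grown large.
shouldCommit :: UTCTime -> [UTCTime] -> Bool
shouldCommit now changetimes
    | len == 0 = False
    | len > 4096 = True -- avoid bloating queue too much
    | length (filter thisSecond changetimes) < 10 = True
    | otherwise = False -- batch activity
  where
    len = length changetimes
    thisSecond t = now `diffUTCTime` t <= 1
```

A committer loop built on this would call `getChanges`, fetch the current time, and commit only when `shouldCommit` returns `True`. What to do with the drained timestamps when it returns `False` (for example, holding them and waiting before trying again) is left out, since the post does not spell that part out.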