author | Joey Hess <joey@kitenet.net> | 2012-06-29 15:44:14 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2012-06-29 15:44:14 -0400 |
commit | 660f81d2b2d8393577771c5f51e9da5f0ba00e22 (patch) | |
tree | d7431322f18561e37931966793a2e33b4300a73b /doc | |
parent | c79625290a9e17e8c9f6f0ed93a0e23a5ef0126c (diff) |
blog for the day
Diffstat (limited to 'doc')
-rw-r--r-- | doc/design/assistant/blog/day_20__data_transfer_design.mdwn | 51 |
-rw-r--r-- | doc/design/assistant/progressbars.mdwn | 2 |
-rw-r--r-- | doc/design/assistant/syncing.mdwn | 4 |
3 files changed, 54 insertions, 3 deletions
diff --git a/doc/design/assistant/blog/day_20__data_transfer_design.mdwn b/doc/design/assistant/blog/day_20__data_transfer_design.mdwn
new file mode 100644
index 000000000..2733f09bc
--- /dev/null
+++ b/doc/design/assistant/blog/day_20__data_transfer_design.mdwn
@@ -0,0 +1,51 @@
+Today is a planning day. I have only a few days left before I'm off to
+Nicaragua for [DebConf](http://debconf12.debconf.org/), where I'll only
+have smaller chunks of time without interruptions. So it's important to get
+some well-defined smallish chunks designed that I can work on later. See
+bulleted action items below. Each should be around 1-2 hours unless it
+turns out to be 8 hours... :)
+
+First, worked on writing down a design, and some data types, for data transfer
+tracking (see [[syncing]] page). Found that writing down these simple data
+types before I started slinging code has clarified things a lot for me.
+
+Most importantly, I realized that I will need to modify `git-annex-shell`
+to record on disk what transfers it's doing, so the assistant can get that
+information and use it to both avoid redundant transfers (potentially a big
+problem!), and later to allow the user to control them using the web app.
+
+So these will be the first steps as I move toward implementing data
+transfer tracking and naive flood fill transferring.
+
+* on-disk transfers in progress information files (read/write/enumerate)
+* locking for the files, so redundant transfer races can be detected,
+  and failed transfers noticed
+* update files as transfers proceed. See [[progressbars]]
+  (updating for downloads is easy; for uploads is hard)
+* add Transfer queue TChan
+* enqueue Transfers (Uploads) as new files are added to the annex by
+  Watcher.
+* enqueue Transfers (Downloads) as new dangling symlinks are noticed by
+  Watcher.
+* add TransferInfo Map to DaemonStatus for tracking transfers in progress.
+* Poll transfer in progress info files for changes (use inotify again!
+  wow! hammer, meet nail..), and update the TransferInfo Map
+* Write basic Transfer handling thread. Multiple such threads need to be
+  able to be run at once. Each will need its own independent copy of the
+  Annex state monad.
+* Write transfer control thread, which decides when to launch transfers.
+* At startup, and possibly periodically, look for files we have that
+  location tracking indicates remotes do not, and enqueue Uploads for
+  them. Also, enqueue Downloads for any files we're missing.
+
+While eventually the user will be able to use the web app to prioritize
+transfers, stop and start, throttle, etc, it's important to get the default
+behavior right. So I'm thinking about things like how to prioritize uploads
+vs downloads, when it's appropriate to have multiple downloads running at
+once, etc.
+
+* Find a way to probe available outgoing bandwidth, to throttle so
+  we don't bufferbloat the network to death.
+* git-annex needs a simple speed control knob, which can be plumbed
+  through to, at least, rsync. A good job for an hour in an
+  airport somewhere.
diff --git a/doc/design/assistant/progressbars.mdwn b/doc/design/assistant/progressbars.mdwn
index 2ade05aa5..ee7384274 100644
--- a/doc/design/assistant/progressbars.mdwn
+++ b/doc/design/assistant/progressbars.mdwn
@@ -9,6 +9,6 @@ To get this info for downloads, git-annex can watch the file as it arrives
 and use its size.

 TODO: What about uploads? Will I have to parse rsync's progress output?
-Feed it via a named pipe? Ugh.
+Feed it via a named pipe? Ugh. Check into librsync.

 This is one of those potentially hidden but time consuming problems.
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 02811f07e..ce7f9673b 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -58,9 +58,9 @@ anyway.

 	data Transfer = Upload Key Remote | Download Key Remote
 	data TransferID = TransferThread ThreadID | TransferProcess Pid
-	type AmountComplete = Integer
+	type BytesComplete = Integer
 	type StartedTime = EpochTime
-	data TransferInfo = TransferInfo TransferID StartedTime AmountComplete
+	data TransferInfo = TransferInfo TransferID StartedTime BytesComplete
 	-- add (M.Map Transfer TransferInfo) to DaemonStatus

 	startTransfer :: Transfer -> Annex TransferID
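
The types in the `syncing.mdwn` hunk, together with the "Transfer queue TChan" and "TransferInfo Map" items from the blog entry, could fit together roughly as follows. This is only a minimal sketch under stated assumptions, not git-annex's code: `Key`, `Remote`, `TransferMap`, `recordStart`, `recordProgress`, `nextTransfer` and `demo` are hypothetical names, `Int` and `Integer` stand in for `Pid` and `EpochTime`, and the map lives in a bare `TVar` rather than in `DaemonStatus`.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Concurrent.STM
import qualified Data.Map.Strict as M

-- Stand-ins for git-annex's real Key and Remote types (assumptions).
newtype Key = Key String deriving (Eq, Ord, Show)
newtype Remote = Remote String deriving (Eq, Ord, Show)

-- Types as in the syncing.mdwn hunk, with Int and Integer standing in
-- for Pid and EpochTime.
data Transfer = Upload Key Remote | Download Key Remote
  deriving (Eq, Ord, Show)
data TransferID = TransferThread ThreadId | TransferProcess Int
  deriving (Eq, Show)
type BytesComplete = Integer
type StartedTime = Integer
data TransferInfo = TransferInfo TransferID StartedTime BytesComplete
  deriving (Eq, Show)

-- The map the design proposes adding to DaemonStatus.
type TransferMap = M.Map Transfer TransferInfo

-- | Record a transfer as started; Nothing if the same one is already running.
recordStart :: Transfer -> TransferInfo -> TransferMap -> Maybe TransferMap
recordStart t info m
  | M.member t m = Nothing
  | otherwise = Just (M.insert t info m)

-- | Update the byte count of a running transfer; unknown transfers are ignored.
recordProgress :: Transfer -> BytesComplete -> TransferMap -> TransferMap
recordProgress t n = M.adjust (\(TransferInfo i s _) -> TransferInfo i s n) t

-- | Take the next queued transfer (blocking while the queue is empty) and
-- mark it as in progress. A transfer that is already running is dropped
-- as redundant and Nothing is returned.
nextTransfer :: TChan Transfer -> TVar TransferMap -> TransferID
             -> StartedTime -> STM (Maybe Transfer)
nextTransfer queue status tid now = do
  t <- readTChan queue
  m <- readTVar status
  case recordStart t (TransferInfo tid now 0) m of
    Nothing -> return Nothing
    Just m' -> writeTVar status m' >> return (Just t)

-- | Queue the same upload twice: only the first dequeue starts it.
demo :: IO (Maybe Transfer, Maybe Transfer, Maybe TransferInfo)
demo = do
  queue <- newTChanIO
  status <- newTVarIO M.empty
  tid <- TransferThread <$> myThreadId
  let up = Upload (Key "k1") (Remote "origin")
  atomically $ writeTChan queue up >> writeTChan queue up
  first <- atomically $ nextTransfer queue status tid 0
  second <- atomically $ nextTransfer queue status tid 0
  atomically $ modifyTVar' status (recordProgress up 42)
  info <- M.lookup up <$> readTVarIO status
  return (first, second, info)
```

Doing the dequeue and the map update in one STM transaction means two handler threads cannot both start the same `Transfer`. It does nothing about races between separate processes such as `git-annex-shell`, which is what the on-disk info files and their locking are meant to cover.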