-rw-r--r-- | doc/design/assistant/syncing.mdwn | 56 |
1 files changed, 42 insertions, 14 deletions
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 898081574..83c5e9d22 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -3,15 +3,42 @@ all the other git clones, at both the git level and the key/value level.
 
 ## immediate action items
 
-* Sync with all available remotes on startup.
-* TransferScanner should avoid unnecessary scanning of remotes.
-  This is paricilarly important for scans queued by the NetWatcher,
-  which can be polling, or could be after a momentary blip in network
-  connectivity. The TransferScanner could check the remote's git-annex
-  branch; if it is not ahead of the local git-annex branch, then
-  there's nothing to transfer. **Except** if the tree was not already
-  up-to-date before the loss of connectivity. So doing this needs
-  tracking of when the tree is not yet fully up-to-date.
+* Optimisations in 5c3e14649ee7c404f86a1b82b648d896762cbbc2 temporarily
+  broke content syncing in some situations, which need to be added back.
+
+  Now syncing a disconnected remote only starts a transfer scan if the
+  remote's git-annex branch has diverged, which indicates it probably has
+  new files. But that leaves open the cases where the local repo has
+  new files; and where the two repos git branches are in sync, but the
+  content transfers are lagging behind; and where the transfer scan has
+  never been run.
+
+  Need to track locally whether we're believed to be in sync with a remote.
+  This includes:
+  * All local content has been transferred to it successfully.
+  * The remote has been scanned once for data to transfer from it, and all
+    transfers initiated by that scan succeeded.
+
+  Note the complication that, if it's initiated a transfer, our queued
+  transfer will be thrown out as unnecessary. But if its transfer then
+  fails, that needs to be noticed.
+
+  If we're going to track failed transfers, we could just set a flag,
+  and use that flag later to initiate a new transfer scan. We need a flag
+  in any case, to ensure that a transfer scan is run for each new remote.
+  The flag could be `.git/annex/transfer/scanned/uuid`.
+
+  But, if failed transfers are tracked, we could also record them, in
+  order to retry them later, without the scan. I'm thinking about a
+  directory like `.git/annex/transfer/failed/{upload,download}/uuid/`,
+  which failed transfer log files could be moved to.
+
+  Note that a remote may lose content it had before, so when requeuing
+  a failed download, should check the location log to see if it still has
+  the content, and if not, queue a download from elsewhere. (And, a remote
+  may get content we were uploading from elsewhere, so check the location
+  log when queuing a failed Upload too.)
+
 * Ensure that when a remote receives content, and updates its location log,
   it syncs that update back out. Prerequisite for:
 * After git sync, identify new content that we don't have that is now available
@@ -49,6 +76,10 @@ all the other git clones, at both the git level and the key/value level.
   that need to be done to sync with a remote. Currently it walks the git
   working copy and checks each file.
 
+## misc todo
+
+* --debug will show often unnecessary work being done. Optimise.
+
 ## data syncing
 
 There are two parts to data syncing. First, map the network and second,
@@ -163,8 +194,5 @@ redone to check it.
   finishes. **done**
 * Test MountWatcher on KDE, and add whatever dbus events KDE emits when
   drives are mounted. **done**
-* Possibly periodically, or when the network connection
-  changes, or some heuristic suggests that a remote was disconnected from
-  us for a while, queue remotes for processing by the TransferScanner.
-  **done**; both network-manager and wicd connection events are supported,
-  and it falls back to polling every 30 minutes when neither is available.
+* It would be nice if, when a USB drive is connected,
+  syncing starts automatically. Use dbus on Linux? **done**
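
The bookkeeping proposed in the first hunk (a per-remote "scanned" flag, a directory of failed transfers, and a location-log check before requeuing) could look roughly like the sketch below. It is an illustration only, not the assistant's actual code, which is written in Haskell. The two directory layouts are the ones proposed in the added text; every function name, the one-empty-file-per-key representation of a failed transfer, and the `locations` dict standing in for the location log are assumptions made for this example.

```python
from pathlib import Path

# All names below are hypothetical. `annex_dir` stands for `.git/annex`.

def scanned_flag(annex_dir: Path, uuid: str) -> Path:
    # Layout proposed in the design text: .git/annex/transfer/scanned/uuid
    return annex_dir / "transfer" / "scanned" / uuid

def needs_scan(annex_dir: Path, uuid: str) -> bool:
    # A remote with no flag has never had a completed transfer scan.
    return not scanned_flag(annex_dir, uuid).exists()

def mark_scanned(annex_dir: Path, uuid: str) -> None:
    flag = scanned_flag(annex_dir, uuid)
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.touch()

def failed_dir(annex_dir: Path, direction: str, uuid: str) -> Path:
    # Layout proposed in the design text:
    # .git/annex/transfer/failed/{upload,download}/uuid/
    return annex_dir / "transfer" / "failed" / direction / uuid

def record_failure(annex_dir: Path, direction: str, uuid: str, key: str) -> None:
    # Assumption: one empty file per failed key stands in for a transfer log file.
    d = failed_dir(annex_dir, direction, uuid)
    d.mkdir(parents=True, exist_ok=True)
    (d / key).touch()

def requeue_failed(annex_dir, direction, uuid, locations):
    """Return (direction, key, remote_uuid) transfers to queue again.

    `locations` maps a key to the set of remote uuids believed to hold its
    content; it is a stand-in for consulting the location log.
    """
    d = failed_dir(annex_dir, direction, uuid)
    queued = []
    for entry in sorted(d.iterdir()) if d.is_dir() else []:
        key = entry.name
        holders = locations.get(key, set())
        if direction == "download":
            if uuid in holders:
                queued.append(("download", key, uuid))
            elif holders:
                # The remote lost the content; fetch it from elsewhere.
                queued.append(("download", key, sorted(holders)[0]))
            else:
                # No known source; keep the failure recorded for later.
                continue
        elif uuid not in holders:
            queued.append(("upload", key, uuid))
        # Otherwise the remote already got the content from elsewhere,
        # so the failed upload is simply dropped.
        entry.unlink()
    return queued
```

With this shape, a sync pass for a remote would run a full transfer scan only when `needs_scan` is true and otherwise call `requeue_failed` for each direction, which is the cheaper retry path the added text is aiming for.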