author Joey Hess <joey@kitenet.net> 2012-08-22 15:45:20 -0400
committer Joey Hess <joey@kitenet.net> 2012-08-22 15:45:20 -0400
commit e43feeb5b439d3cc8078b52a18a5a8568ceee2ae
tree f134154812809f240111d242a6a33adc0367533f
parent 6873ca0c1b1c50d2208c5074a5e7fb7fc267f990
update
-rw-r--r-- doc/design/assistant/syncing.mdwn 56
1 files changed, 42 insertions, 14 deletions
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 898081574..83c5e9d22 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -3,15 +3,42 @@ all the other git clones, at both the git level and the key/value level.
## immediate action items
-* Sync with all available remotes on startup.
-* TransferScanner should avoid unnecessary scanning of remotes.
- This is paricilarly important for scans queued by the NetWatcher,
- which can be polling, or could be after a momentary blip in network
- connectivity. The TransferScanner could check the remote's git-annex
- branch; if it is not ahead of the local git-annex branch, then
- there's nothing to transfer. **Except** if the tree was not already
- up-to-date before the loss of connectivity. So doing this needs
- tracking of when the tree is not yet fully up-to-date.
+* Optimisations in 5c3e14649ee7c404f86a1b82b648d896762cbbc2 temporarily
+ broke content syncing in some situations, which need to be added back.
+
+ Now syncing a disconnected remote only starts a transfer scan if the
+ remote's git-annex branch has diverged, which indicates it probably has
+ new files. But that leaves open the cases where the local repo has
+ new files; and where the two repos git branches are in sync, but the
+ content transfers are lagging behind; and where the transfer scan has
+ never been run.
+
+ Need to track locally whether we're believed to be in sync with a remote.
+ This includes:
+ * All local content has been transferred to it successfully.
+ * The remote has been scanned once for data to transfer from it, and all
+ transfers initiated by that scan succeeded.
+
+ Note the complication that, if it's initiated a transfer, our queued
+ transfer will be thrown out as unnecessary. But if its transfer then
+ fails, that needs to be noticed.
+
+ If we're going to track failed transfers, we could just set a flag,
+ and use that flag later to initiate a new transfer scan. We need a flag
+ in any case, to ensure that a transfer scan is run for each new remote.
+ The flag could be `.git/annex/transfer/scanned/uuid`.
+
+ But, if failed transfers are tracked, we could also record them, in
+ order to retry them later, without the scan. I'm thinking about a
+ directory like `.git/annex/transfer/failed/{upload,download}/uuid/`,
+ which failed transfer log files could be moved to.
+
+ Note that a remote may lose content it had before, so when requeuing
+ a failed download, should check the location log to see if it still has
+ the content, and if not, queue a download from elsewhere. (And, a remote
+ may get content we were uploading from elsewhere, so check the location
+ log when queuing a failed Upload too.)
+
* Ensure that when a remote receives content, and updates its location log,
it syncs that update back out. Prerequisite for:
* After git sync, identify new content that we don't have that is now available
@@ -49,6 +76,10 @@ all the other git clones, at both the git level and the key/value level.
that need to be done to sync with a remote. Currently it walks the git
working copy and checks each file.
+## misc todo
+
+* --debug will show often unnecessary work being done. Optimise.
+
## data syncing
There are two parts to data syncing. First, map the network and second,
@@ -163,8 +194,5 @@ redone to check it.
finishes. **done**
* Test MountWatcher on KDE, and add whatever dbus events KDE emits when
drives are mounted. **done**
-* Possibly periodically, or when the network connection
- changes, or some heuristic suggests that a remote was disconnected from
- us for a while, queue remotes for processing by the TransferScanner.
- **done**; both network-manager and wicd connection events are supported,
- and it falls back to polling every 30 minutes when neither is available.
+* It would be nice if, when a USB drive is connected,
+ syncing starts automatically. Use dbus on Linux? **done**
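
The tracking scheme proposed in the added text (a per-remote "scanned" flag, plus per-direction directories of failed transfers that are rechecked against the location log before being requeued) could be sketched roughly as follows. This is an illustrative Python sketch, not git-annex code. The function names, the `locations` dict standing in for the location log, and the use of one empty file per failed key are all assumptions. Only the two directory layouts come from the design text.

```python
from pathlib import Path


def scanned_flag(annex_dir: Path, uuid: str) -> Path:
    # Layout from the design text: .git/annex/transfer/scanned/uuid
    return annex_dir / "transfer" / "scanned" / uuid


def failed_dir(annex_dir: Path, direction: str, uuid: str) -> Path:
    # Layout from the design text:
    # .git/annex/transfer/failed/{upload,download}/uuid/
    return annex_dir / "transfer" / "failed" / direction / uuid


def needs_scan(annex_dir: Path, uuid: str) -> bool:
    # A remote with no flag has never had a transfer scan run for it.
    return not scanned_flag(annex_dir, uuid).exists()


def mark_scanned(annex_dir: Path, uuid: str) -> None:
    flag = scanned_flag(annex_dir, uuid)
    flag.parent.mkdir(parents=True, exist_ok=True)
    flag.touch()


def record_failure(annex_dir: Path, direction: str, uuid: str, key: str) -> None:
    # Hypothetical: one empty file per failed key. The design text instead
    # suggests moving the existing transfer log file here.
    d = failed_dir(annex_dir, direction, uuid)
    d.mkdir(parents=True, exist_ok=True)
    (d / key).touch()


def plan_retries(annex_dir: Path, uuid: str, locations: dict) -> list:
    """Return (direction, key, remote_uuid) tuples to requeue.

    `locations` maps key -> set of uuids believed to hold the content;
    it is a stand-in for the location log (an assumption of this sketch).
    """
    plan = []
    for direction in ("upload", "download"):
        d = failed_dir(annex_dir, direction, uuid)
        if not d.is_dir():
            continue
        for entry in sorted(d.iterdir()):
            key = entry.name
            holders = locations.get(key, set())
            if direction == "download":
                if uuid in holders:
                    # The remote still has the content: retry from it.
                    plan.append(("download", key, uuid))
                else:
                    # The remote lost the content: try another holder.
                    others = sorted(holders - {uuid})
                    if others:
                        plan.append(("download", key, others[0]))
            elif uuid not in holders:
                # Skip uploads the remote has since received from elsewhere.
                plan.append(("upload", key, uuid))
    return plan
```

With this shape, a reconnecting remote would trigger a full scan only when `needs_scan` is true, and would otherwise just replay the output of `plan_retries`.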