author | 2012-05-26 21:11:19 -0400 | |
---|---|---|
committer | 2012-05-26 21:11:19 -0400 | |
commit | 45a01db6add3399ff6ca93f2e7c7d83dbf59992d (patch) | |
tree | 0616e646feb8e663c9c995be32c03260483554a8 /doc/design/assistant/syncing.mdwn | |
parent | f7524811e2e47ca8455d3038b6d0a8102e5a5044 (diff) |
add preliminary design
Diffstat (limited to 'doc/design/assistant/syncing.mdwn')
-rw-r--r-- | doc/design/assistant/syncing.mdwn | 31 |
1 files changed, 31 insertions, 0 deletions
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
new file mode 100644
index 000000000..0a081c101
--- /dev/null
+++ b/doc/design/assistant/syncing.mdwn
@@ -0,0 +1,31 @@
+Once files are added (or removed or moved), need to send those changes to
+all the other git clones, at both the git level and the key/value level.
+
+## git syncing
+
+1. At regular intervals, just run `git annex sync`, which already handles
+   bidirectional syncing.
+2. Investigate the XMPP approach like dvcs-autosync does, or other ways of
+   signaling a change out of band.
+3. Add a hook, so when there's a change to sync, a program can be run.
+
+## data syncing
+
+There are two parts to data syncing. First, map the network and second,
+decide what to sync when.
+
+Mapping the network can reuse code in `git annex map`. Once the map is
+built, we want to find paths through the network that reach all nodes
+eventually, with the least cost. This is a minimum spanning tree problem,
+except with a directed graph, so really a Arborescence problem.
+
+With the map, we can determine which nodes to push new content to. Then we
+need to control those data transfers, sending to the cheapest nodes first,
+and with appropriate rate limiting and control facilities.
+
+This probably will need lots of refinements to get working well.
+
+## other considerations
+
+This assumes the network is connected. It's often not, so the
+cloud needs to be used to bridge between LANs.
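
The "data syncing" section of the added file frames route selection as a minimum-cost arborescence problem on a directed graph. As a rough illustration only, the sketch below computes the total cost of such an arborescence with the Chu-Liu/Edmonds contraction algorithm. It is not taken from git-annex: the function name `min_arborescence_cost`, the integer node ids, and the edge costs are all hypothetical, and a real implementation would also need to recover the chosen edges, not just their total cost.

```python
def min_arborescence_cost(n, root, edges):
    """Total cost of a minimum arborescence rooted at `root`.

    Hypothetical helper, not git-annex code.
    n     -- number of nodes, labelled 0..n-1
    edges -- iterable of (src, dst, cost) directed edges
    Returns None when some node cannot be reached from the root.
    """
    INF = float("inf")
    edges = list(edges)
    total = 0
    while True:
        # Step 1: pick the cheapest incoming edge for every node.
        min_in = [INF] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < min_in[v]:
                min_in[v] = w
                pre[v] = u
        for v in range(n):
            if v != root and min_in[v] == INF:
                return None  # v has no incoming edge at all
        min_in[root] = 0

        # Step 2: add the chosen edges and look for cycles among them.
        comp = [-1] * n      # component id after contraction
        visited = [-1] * n   # which walk last touched each node
        count = 0
        for v in range(n):
            total += min_in[v]
            x = v
            while visited[x] != v and comp[x] == -1 and x != root:
                visited[x] = v
                x = pre[x]
            if x != root and comp[x] == -1:
                # The walk came back to x, so x lies on a new cycle.
                y = pre[x]
                while y != x:
                    comp[y] = count
                    y = pre[y]
                comp[x] = count
                count += 1
        if count == 0:
            return total  # chosen edges already form an arborescence

        # Step 3: contract each cycle to one node, reduce edge costs, repeat.
        for v in range(n):
            if comp[v] == -1:
                comp[v] = count
                count += 1
        edges = [
            (comp[u], comp[v], w - min_in[v])
            for u, v, w in edges
            if comp[u] != comp[v]
        ]
        root = comp[root]
        n = count


# Hypothetical network: node 0 holds the new content; nodes 1 and 2 can
# reach each other cheaply, but reaching either from node 0 is costly.
links = [(0, 1, 5), (1, 2, 1), (2, 1, 1), (0, 2, 6)]
print(min_arborescence_cost(3, 0, links))
```

In this made-up network the cheapest incoming edges for nodes 1 and 2 form a cycle between them, so the algorithm contracts that pair and then chooses the cheaper way into it from node 0. How edge costs would actually be derived from the output of `git annex map` is left open here, as it is in the design.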