Diffstat (limited to 'doc')
-rw-r--r-- doc/devblog/day_488__groundwork_for_using_p2pstdio.mdwn | 30
-rw-r--r-- doc/git-annex-shell.mdwn | 5
-rw-r--r-- doc/git-annex.mdwn | 2
-rw-r--r-- doc/todo/accellerate_ssh_remotes_with_git-annex-shell_mass_protocol.mdwn | 4
4 files changed, 39 insertions, 2 deletions
diff --git a/doc/devblog/day_488__groundwork_for_using_p2pstdio.mdwn b/doc/devblog/day_488__groundwork_for_using_p2pstdio.mdwn
new file mode 100644
index 000000000..e09da769a
--- /dev/null
+++ b/doc/devblog/day_488__groundwork_for_using_p2pstdio.mdwn
@@ -0,0 +1,30 @@
+Spent most of the day laying groundwork for using git-annex-shell p2pstdio.
+Implemented pools of ssh connections to it, and added uuid verification.
+Then generalized code from the p2p remote so it can be reused in the git
+remote. The types got super hairy in there, but the code reuse level is
+excellent.
+
+Finally it was time to convert the first ssh remote method
+to use the P2P protocol. I chose key removal, since benchmarking it doesn't
+involve the size of annexed objects.
+
+Here's the P2P protocol in action over ssh:
+
+	[2018-03-08 17:02:47.688627136] chat: ssh ["localhost","-S",".git/annex/ssh/localhost","-o","ControlMaster=auto","-o","ControlPersist=yes","-T","git-annex-shell 'p2pstdio' '/~/tmp/bench/a' '--debug' 'da72c285-2615-4a67-828f-eaae4f42fc3d' --uuid db017fac-eb8f-42d9-9d09-2780b193cef1"]
+	[2018-03-08 17:02:47.901897195] P2P < AUTH-SUCCESS db017fac-eb8f-42d9-9d09-2780b193cef1
+	[2018-03-08 17:02:47.902025504] P2P > REMOVE SHA256E-s4--97b912eb4a61df5f806ca6239dde3e1a4f51ad20aced1642cbb83dc510a5fa6b
+	[2018-03-08 17:02:47.910074003] P2P < SUCCESS
+	[2018-03-08 17:02:47.914181701] P2P > REMOVE SHA256E-s4--6af2f5b785a8930f0bd3edc833e18fa191167ab0535ef359b19a1982a6984e96
+	[2018-03-08 17:02:47.918699806] P2P < SUCCESS
+
+For a benchmark, I set up a repository with 1000 annexed files,
+and cloned it from localhost, then ran `git annex drop --from origin`.
+
+before: 41 seconds
+after: 10 seconds
+
+400% speedup for dropping is pretty great.. And when there's more latency
+than loopback has, the improvement should be more pronounced.
+Will test it this evening over my satellite internet. :)
+
+Today's work was sponsored by Trenton Cronholm on [Patreon](https://patreon.com/joeyh/).
diff --git a/doc/git-annex-shell.mdwn b/doc/git-annex-shell.mdwn
index cf72e091b..fc536e44b 100644
--- a/doc/git-annex-shell.mdwn
+++ b/doc/git-annex-shell.mdwn
@@ -90,12 +90,15 @@ first "/~/" or "/~user/" is expanded to the specified home directory.
 
   Sets up a repository as a gcrypt repository.
 
-* p2pstdio directory
+* p2pstdio directory uuid
 
   This causes git-annex-shell to communicate using the git-annex p2p
   protocol over stdio. When supported by git-annex-shell, this allows
   multiple actions to be run over a single connection, improving speed.
 
+  The uuid is the one belonging to the repository that will be
+  communicating with git-annex-shell.
+
 # OPTIONS
 
 Most options are the same as in git-annex. The ones specific
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index db8cfca61..7f89bdbf4 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -1244,7 +1244,7 @@ Here are all the supported configuration settings.
 
   git-annex caches UUIDs of remote repositories here.
 
-- `remote.<name>.annex-checkuuid`
+* `remote.<name>.annex-checkuuid`
 
   This only affects remotes that have their url pointing to a directory on
   the same system. git-annex normally checks the uuid of such
diff --git a/doc/todo/accellerate_ssh_remotes_with_git-annex-shell_mass_protocol.mdwn b/doc/todo/accellerate_ssh_remotes_with_git-annex-shell_mass_protocol.mdwn
index ff4b8c59d..a592e17a9 100644
--- a/doc/todo/accellerate_ssh_remotes_with_git-annex-shell_mass_protocol.mdwn
+++ b/doc/todo/accellerate_ssh_remotes_with_git-annex-shell_mass_protocol.mdwn
@@ -40,3 +40,7 @@ Implementation todos:
   git-annex-shell recvkey has a speed optimisation, when it's told the file
   being sent is locked, it can avoid an expensive verification.
 * Maybe similar for transfers in the other direction?
+* What happens when the assistant is running and some connections are open
+  and it moves between networks?
+* If it's unable to ssh to a host to run p2pstdio, it will fall back to the
+  old method. What if the host is down, does this double the timeout?
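
The benchmark figures quoted in the devblog entry added by this commit (1000 annexed files, 41 seconds before, 10 seconds after) can be turned into a speedup factor and a per-key cost with a short calculation. This is only an illustrative sketch: the function names below are made up for the example and are not part of git-annex.

```python
# Figures quoted in the devblog entry; everything else here is hypothetical.
FILES = 1000
BEFORE_SECONDS = 41.0
AFTER_SECONDS = 10.0


def speedup_factor(before, after):
    """How many times faster the new run is (old time / new time)."""
    return before / after


def percent_faster(before, after):
    """Throughput gain as a percentage over the old run."""
    return (before / after - 1.0) * 100.0


def per_item_ms(total_seconds, items):
    """Average cost per dropped key, in milliseconds."""
    return total_seconds / items * 1000.0


factor = speedup_factor(BEFORE_SECONDS, AFTER_SECONDS)
gain = percent_faster(BEFORE_SECONDS, AFTER_SECONDS)

print(f"speedup factor: {factor:.1f}x")
print(f"throughput gain: {gain:.0f}%")
print(f"per key before: {per_item_ms(BEFORE_SECONDS, FILES):.0f} ms")
print(f"per key after:  {per_item_ms(AFTER_SECONDS, FILES):.0f} ms")
```

By this arithmetic the new run is roughly four times as fast, about 41 ms per key before and about 10 ms per key after on loopback, which matches the few-millisecond REMOVE/SUCCESS round trips visible in the debug log.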