doc/design/assistant/telehash.mdwn


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157

[Telehash](http://telehash.org/) for secure P2P communication between
git-annex (assistant) repositories.

Or something similar like [Snow](http://www.trustiosity.com/snow/)
or [cjdns](https://github.com/cjdelisle/cjdns) or tor or i2p or [magic wormhole](http://magic-wormhole.io/).

## telehash implementation status

* node.js version seems almost complete
* C version currently lacks channel support and seems buggy (13 Jan 2014)
* No pure haskell implementation of telehash v2. There was one of
  telehash v1 (even that seems incomplete). I have pinged its author
  to see if he anticipates updating it.
* Rapid development, situation may change in a month or 2. (2014)  
  Not seeing many commits now (2015)
* Is it secure? A security review should be done by competant people
  (not Joey). See <https://github.com/telehash/telehash.org/issues/23>
* **Haskell version** 
  <https://github.com/alanz/htelehash/tree/v2/src/TeleHash>
  May support v2; v3 support seems not started yet, and not in active
  development at the moment, although there has been work on it a year ago.
* Not very convinced this is going to be usable anytime soon, so would like
  to find something that is. (2015)

## snow status

* Seems ready to use?
* NAT punching works per docs; relies on a DHT network for hole punching,
  and the reliabilty of that network is not known. I notice it has only 1
  pre-seeded peer in the source tree for the DHT, and that peer was not up
  when I tried it.
* Only provides network-layer transport, still need to implement some 
  file transfer protocol on top.

## cjdns status

* Has a network with "hundreds of active nodes"
* Is not pure P2P; there's a network that does routing
  of packets. This may be a good thing, or not.
* Seems to require manual configuration of a "friend"
  node that's already on the network, with address and password to connect
  to it, so if you can't find someone you know to connect to their node,
  you can't use it. Urk.

## tor status

* Awesome.
* Easy to install, use; very well known.
* May need root to set up a hidden service.
* There's been some [haskell packages developed recently](http://www.leonmergen.com/haskell/privacy/2015/05/30/on-anonymous-networking-in-haskell-announcing-tor-and-i2p-for-haskell.html) 
  to communicate with tor and set up onion addresses for a service.
  Could be used to make git-annex run as a hidden service.
  However, that relies on tor being configured with a ControlPort,
  without authentication. The normal tor configuration does not enable a
  ControlPort.

## i2p status

## magic wormhole

* simple file transfer protocol with out of band shared secret
* handles NAT transversal
* easy to use
* doesn't require a running daemon
* can transfer arbitrary blobs (strings, directories, files)

## general design

* Make address.log that contains (uuid, transport, address, Maybe authtoken)
* The authtoken is an additional guard, to protect against transports
  where the address might be able to be guessed, or observed by the rest of
  the network.
* Some addresses can be used with only the provided authtoken
  from the address.log. Remotes can be auto-enabled for these.
* Other addresses have Nothing povided for the authtoken, and one
  has to instead be provided during manual enabling of the remote.
* The remotedaemon runs, and/or communicates with the program implementing
  the network transport. For example for tor, the remotedaemon runs
  the hidden service, and also connects to the tor hidden services of
  other nodes.
* The remotedaemon handles both sides of git push over the transport.
* The remotedaemon may also support sending objects over the transport,
  depending on the transport.

## address discovery

The address is a public key, and the authtoken is some large chunk of data,
so won't want to type that in. Need discovery.

* Easy way is any set of repos that are already connected can communicate
  them via address.log.
* Address and authtoken can be communicated out of band (eg,
  via an OTR session or gpg encrypted mail or phone call),
  and pasted into the webapp.
* Use eg, electrum-mnemonic to encode the address+authtoken so that
  it can be read over the phone.
* Users may not have a way to communicate with perfect forward secrecy.
  So it would be good to have a address+authtoken that can only be used
  one time during pairing:

  1. Alice uses the webapp to generate a one-time address+authtoken,
     and sends it into a message to Bob.
  2. Bob enters it into his webapp.
  3. Bob's assistant contacts Alice's over the transport, presents the
     one-time authtoken. (Alice's assistant accepts it, and marks it as
     used so it cannot be used again.)
  4. Alice's webapp shows that it's ready to finish pairing; so does Bob's.
     Both wait for their users to confirm before proceeding.
  5. Alice's assistant generates a new, permanant use authtoken, sends it
     to Bob's assistant, which stores it and enables a remote using it.
  6. Bob's assistant generates a new, permanant use authtoken, sends it to
     Alice's assistant, which stores it and enables a remote using it.
  7. Alice and Bob's assistants are now paired.

  Note that this exchange can be actively MITMed. If Eve can intercept
  Alice's message to Bob, then Eve can pair with Alice. Or, if Eve can
  forge a message from Alice to Bob, Eve can trick Bob into pairing with
  her.

  If they make a phone call, it's much harder for Eve to MITM it.
  Eve would need to listen to Alice reading the authtoken and enter it
  before Bob does, so pairing with Alice. But as long as Alice waits
  for Bob to confirm he's ready to finish pairing, this will fail,
  because Bob won't get to that point if the authtoken is intercepted.

## local lan detection

At connection time, after authentication, the remote can send
(ip address, ssh host key). Try sshing to the ip address to check if
the host key matches. If so, can enable a ssh remote, which will
be cheaper than using the transport. Send the ssh public key back to the
remote to get it authorized.

## remotedaemon

See [[git-remote-daemon]] for its design.

Advantages:

* `git annex sync` could also use the running daemon
* `git-remote-$transport` could use the running daemon
* c-telehash might end up linked to openssl, which has licence combination
  problems with git-annex. A separate process not using git-annex's code
  would avoid this.
* Allows the daemon to be written in some other language if necessary
  (for example, if c-telehash development stalls and the nodejs version is
  already usable)
* Potentially could be generalized to handle other similar protocols.
  Or even the xmpp code moved into it. There could even be git-annex native
  exchange protocols implemented in such a daemon to allow SSH-less
  transfers.
* Security holes in telehash would not need to compromise the entire
  git-annex. daemon could be sandboxed in one way or another.

Disadvantages:

* Adds some complexity.