See [[todo/support_for_writing_external_special_remotes]] for motivation.

This is a design for a protocol to be used to communicate between git-annex
and a program implementing an external special remote.

The program has a name like `git-annex-remote-$bar`. When
`git annex initremote foo type=$bar` is run, git-annex finds the
appropriate program in PATH.
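
As a rough sketch of that lookup, assuming only that the program name is formed by appending the remote type to `git-annex-remote-` (the `find_remote_program` helper is hypothetical and is not something git-annex provides):

```sh
#!/bin/sh
# Hypothetical helper: print the path of the program implementing the
# external special remote type $1, or fail if it is not in PATH.
find_remote_program () {
	command -v "git-annex-remote-$1"
}

# Example: look up the program for type "bar".
if prog=$(find_remote_program bar); then
	echo "found: $prog"
else
	echo "git-annex-remote-bar not found in PATH" >&2
fi
```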

The program is started by git-annex when it needs to access the special
remote, and may be left running for a long period of time. This allows
it to perform expensive setup tasks, etc. Note that git-annex may choose to
start multiple instances of the program (eg, when multiple git-annex
commands are run concurrently in a repository).

Communication is via the program's stdin and stdout. The program must
therefore avoid prompting, and must not write anything else, such as
progress displays, to stdout. (Such output can be sent to stderr instead.)
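
One way to keep the two streams apart in a shell implementation is to route everything through two small helpers. This is only a sketch; `debug` is a hypothetical helper name, not part of the protocol:

```sh
#!/bin/sh
# Protocol messages go to stdout, which git-annex reads.
send () {
	echo "$@"
}

# Hypothetical helper: diagnostics go to stderr, so they can never be
# mistaken for protocol messages.
debug () {
	echo "debug: $*" >&2
}

debug "starting up"
send VERSION 0
```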

The protocol is line based. Messages are sent in either direction: from
git-annex to the program, and from the program to git-annex. No message
receives an immediate reply; instead, a later message can be sent in reply.

## example

For example, git-annex might request that a key be sent to the
remote (Key will be replaced with the key, and File with a file that has
the content to send):

	TRANSFER STORE Key File

Any number of messages can be sent back and forth while that upload
is going on. A common message for the program to send reports the
progress of the upload (in bytes):

	PROGRESS STORE Key 10240
	PROGRESS STORE Key 20480

Once the file has been sent, the program can reply with the result:

	TRANSFER-SUCCESS STORE Key
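
The exchange above could be produced by a helper along these lines. This is only a sketch under stated assumptions: `store_with_progress` is a hypothetical name, the "remote" is just a local destination path standing in for a real upload, and the 10240 byte chunk size is chosen to match the figures in the example.

```sh
#!/bin/sh
# Hypothetical helper: copy File ($2) to a destination path ($3) one
# chunk at a time, reporting the cumulative byte count for Key ($1).
store_with_progress () {
	key="$1"
	file="$2"
	dest="$3"
	chunk=10240
	# Arithmetic expansion strips any padding wc adds around the number.
	size=$(($(wc -c < "$file")))
	sent=0
	: > "$dest"
	while [ "$sent" -lt "$size" ]; do
		# Append the next chunk of the file to the destination.
		dd if="$file" bs="$chunk" skip=$((sent / chunk)) count=1 2>/dev/null >> "$dest"
		sent=$((sent + chunk))
		if [ "$sent" -gt "$size" ]; then
			sent=$size
		fi
		echo "PROGRESS STORE $key $sent"
	done
	echo "TRANSFER-SUCCESS STORE $key"
}
```

A real remote would replace the `dd` append with its own upload step and send `TRANSFER-FAILURE` if that step fails.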

## git-annex messages

These are the messages git-annex may send to the special remote program.

* `CONFIGURE KEY=VALUE ...`  
  Tells the remote its configuration. Any arbitrary KEY(s) can be passed.
  Sent only once, at startup.
* `INITREMOTE`  
  Request that the remote be initialized. CONFIGURE will be passed first.
  Note that this may be run repeatedly, as a remote is initialized in
  different repositories, or as the configuration of a remote is changed.
* `GETCOST`  
  Requests the remote return a use cost. Higher costs are more expensive.
  (See Config/Cost.hs for some standard costs.)
* `TRANSFER STORE|RETRIEVE Key File`  
  Requests the transfer of a key. For STORE, the File is the file to upload;
  for RETRIEVE, the File is where to store the download. Note that the File
  should not influence the filename used on the remote. The filename used
  should be derived from the Key.  
  Multiple transfers might be requested by git-annex, but it's fine for the 
  program to serialize them and only do one at a time.
* `HAS Key`  
  Requests the remote check if a key is present in it.
* `REMOVE Key`  
  Requests the remote remove a key's contents.
  

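The `KEY=VALUE` words of a `CONFIGURE` message could be picked apart as in the following sketch. The `config_get` helper and the `directory` and `encryption` keys are made up for illustration, and the sketch assumes values contain no whitespace, since this design does not say how such values would be quoted.

```sh
#!/bin/sh
# Hypothetical helper: print the VALUE for KEY $1, searching the
# remaining arguments, which are the KEY=VALUE words of a CONFIGURE
# message. Fails if the key is not present.
config_get () {
	want="$1"
	shift
	for kv in "$@"; do
		case "$kv" in
			"$want"=*)
				# Strip everything up to and including the first "=".
				echo "${kv#*=}"
				return 0
			;;
		esac
	done
	return 1
}

# Example: handle one CONFIGURE line.
line="CONFIGURE directory=/mnt/backup encryption=none"
set -- $line
# Drop the CONFIGURE verb, leaving only the KEY=VALUE words.
shift
config_get directory "$@"
```
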
## special remote messages

These are the messages the special remote program can send back to
git-annex.

* `VERSION Int`  
  Supported protocol version. The current version is 0. Must be sent first
  thing at startup.
* `ERROR ErrorMsg`  
  Generic error. Can be sent at any time if things get messed up.
  It would be a good idea to send this if git-annex sends a command
  you do not support. The program should exit after sending this, as
  git-annex will not talk to it any further.
* `TRANSFER-SUCCESS STORE|RETRIEVE Key`  
  Indicates the transfer completed successfully.
* `TRANSFER-FAILURE STORE|RETRIEVE Key ErrorMsg`  
  Indicates the transfer failed.
* `PROGRESS STORE|RETRIEVE Key Int`  
  Indicates the current progress of the transfer. May be repeated any
  number of times during the transfer process. This is highly recommended
  for STORE. (It is not necessary for RETRIEVE.)
* `HAS-SUCCESS Key`  
  Indicates that a key has been positively verified to be present in the
  remote.
* `HAS-FAILURE Key`  
  Indicates that a key has been positively verified to not be present in the
  remote.
* `HAS-UNKNOWN Key ErrorMsg`  
  Indicates that it is not currently possible to verify if the key is
  present in the remote. (Perhaps the remote cannot be contacted.)
* `REMOVE-SUCCESS Key`  
  Indicates the key has been removed from the remote. May also be returned if
  the remote did not have the key at the point removal was requested.
* `REMOVE-FAILURE Key`  
  Indicates that the key was unable to be removed from the remote.
* `COST Int`  
  Indicates the cost of the remote.
* `COST-UNKNOWN`  
  Indicates the remote has no opinion of its cost.
* `CONFIGURE-SUCCESS`  
  Indicates the CONFIGURE provided an acceptable configuration.
* `CONFIGURE-FAILURE ErrorMsg`  
  Indicates that CONFIGURE provided a bad configuration.
* `INITREMOTE-SUCCESS KEY=VALUE ...`  
  Indicates the INITREMOTE succeeded and the remote is ready to use.
  The keys and values can optionally be returned. They will be stored
  by git-annex, and sent back the next time it calls CONFIGURE.
* `INITREMOTE-FAILURE ErrorMsg`  
  Indicates that INITREMOTE failed.

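The three possible replies to `HAS` can be illustrated with a sketch of a directory-backed remote. The `has_key` helper, and the layout where each key is stored as a file named after the key, are assumptions made for this example only.

```sh
#!/bin/sh
# Hypothetical helper for a directory-backed remote: each key is
# assumed to be stored as a file named after the key in the directory
# given as $2.
has_key () {
	key="$1"
	remote_dir="$2"
	if [ ! -d "$remote_dir" ]; then
		# The remote itself cannot be reached, so presence is unknown.
		echo "HAS-UNKNOWN $key remote directory not available"
	elif [ -e "$remote_dir/$key" ]; then
		# Positively verified to be present.
		echo "HAS-SUCCESS $key"
	else
		# The remote was reachable and the key is not there.
		echo "HAS-FAILURE $key"
	fi
}
```
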
## Simple shell example

[[!format sh """
#!/bin/sh
set -e

send () {
	echo "$@"
}

send VERSION 0

while read -r line; do
	set -- $line
	case "$1" in
		CONFIGURE)
			send CONFIGURE-SUCCESS
		;;
		INITREMOTE)
			send INITREMOTE-SUCCESS
		;;
		GETCOST)
			send COST-UNKNOWN
		;;
		TRANSFER)
			key="$3"
			file="$4"
			case "$2" in
				STORE)
					# XXX upload file here
					# XXX when possible, send PROGRESS
					send TRANSFER-SUCCESS STORE "$key"
				;;
				RETRIEVE)
					# XXX download file here
					send TRANSFER-SUCCESS RETRIEVE "$key"
				;;
				*)
					# neither STORE nor RETRIEVE
					send ERROR "unknown transfer direction: $2"
					exit 1
				;;
			esac
		;;
		HAS)
			key="$2"
			send HAS-UNKNOWN "$key" "not implemented"
		;;
		REMOVE)
			key="$2"
			# XXX remove key here
			send REMOVE-SUCCESS "$key"
		;;
		*)
			send ERROR "unknown command received: $line"
			exit 1
		;;
	esac	
done
"""]]

## TODO

* Communicate when the network connection may have changed, so long-running
  remotes can reconnect.
* Provide a way for remotes to set/get the content of a per-key
  file in the git-annex branch. Needed for eg, storing urls, or access keys
  used to retrieve a given key.
* Support for splitting files into chunks.
* git-annex hash directory lookup for a key?
* Use same verbs as used in special remote interface (instead of different
  verbs used in Types.Remote).