From 69653852585c7bb56de0baeb482211a9fbe07877 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 10 Jul 2017 14:06:49 -0400 Subject: thoughts --- ...ent_4_da8b5b8c9fc371fe138590744b2adce3._comment | 51 ++++++++++++++++++++++ 1 file changed, 51 insertions(+) create mode 100644 doc/todo/export/comment_4_da8b5b8c9fc371fe138590744b2adce3._comment diff --git a/doc/todo/export/comment_4_da8b5b8c9fc371fe138590744b2adce3._comment b/doc/todo/export/comment_4_da8b5b8c9fc371fe138590744b2adce3._comment new file mode 100644 index 000000000..1d694e0e2 --- /dev/null +++ b/doc/todo/export/comment_4_da8b5b8c9fc371fe138590744b2adce3._comment @@ -0,0 +1,51 @@ +[[!comment format=mdwn + username="joey" + subject="""comment 4""" + date="2017-07-10T17:30:23Z" + content=""" +Yoh suggested storing a treeish associated with the export to a special +remote. Pointed out that the diff from that treeish to the next one could +be used to update the export. + +(That does seem very close to [[todo/dumb, unsafe, human-readable_backend]].) + +Would need to somehow deal with partial uploads. There are two ways +an upload can be partial: + +* Some of the files in the treeish have been uploaded; others have not. +* A file has been partially uploaded. + +These two cases need to be disentangled somehow in order to handle +them. It could use the location log, so once a key gets uploaded +to the special remote, its content is marked present. However, using the +location log does not seem sufficient to handle all cases (eg two files +swapping names between two treeishes, where one of the files has been +uploaded only partially to the special remote). + +It seems promosing to keep track of two separate treeishes: + +1. The treeish that is the current goal to have exported to the special + remote. +2. The treeish for the current state of the special remote. Note that + after even an interrupted transfer, this treeish needs to be updated to + contain the current state of the special remote, which would mean + constructing a new treeish. (Perhaps a per-remote index file could be + used.) + +Having these two treeishes allows: + +* Detecting renames of exported files, which some special remotes can do + more efficiently. +* Determining the key that a given file on the special remote is + storing the content of. +* Resuming an interrupted export, without re-uploading all the files. +* Detecting a partially uploaded file, because the current state treeish + for the remote should not contain that file. +* Knowing what key was in the process of being sent to a partially uploaded + file, and so resuming that upload. Look at the goal treeish and see what + key it has for the file; as long as the goal treeish is the same goal + that was used for the interrupted export, that's the key. (This needs a + way to track if the goal has changed.) +* Optionally, making `git annex sync` and the assistant upload as needed + to satisfy goal treeishes for special remotes. +"""]] -- cgit v1.2.3