summaryrefslogtreecommitdiff
path: root/doc/design/assistant
diff options
context:
space:
mode:
authorGravatar Joey Hess <joey@kitenet.net>2014-08-06 13:45:19 -0400
committerGravatar Joey Hess <joey@kitenet.net>2014-08-06 13:45:19 -0400
commitcc2606ce542e04c22cf229d536ad621f9e25c12d (patch)
tree27601f447942bcefbfcb89c6588f330d7580098c /doc/design/assistant
parent5cd02fd1a52b3f1c19b01f4a021a7c63b96796dd (diff)
pushed checkPresent exception handling out of Remote implementations
I tend to prefer moving toward explicit exception handling, not away from it, but in this case, I think there are good reasons to let checkPresent throw exceptions: 1. They can all be caught in one place (Remote.hasKey), and we know every possible exception is caught there now, which we didn't before. 2. It simplified the code of the Remotes. I think it makes sense for Remotes to be able to be implemented without needing to worry about catching exceptions inside them. (Mostly.) 3. Types.StoreRetrieve.Preparer can only work on things that return a Bool, which all the other relevant remote methods already did. I do not see a good way to generalize that type; my previous attempts failed miserably.
Diffstat (limited to 'doc/design/assistant')
-rw-r--r--doc/design/assistant/chunks.mdwn16
1 files changed, 8 insertions, 8 deletions
diff --git a/doc/design/assistant/chunks.mdwn b/doc/design/assistant/chunks.mdwn
index a9709a778..0aa389899 100644
--- a/doc/design/assistant/chunks.mdwn
+++ b/doc/design/assistant/chunks.mdwn
@@ -91,7 +91,7 @@ cannot tell when we've gotten the last chunk. (Also, we cannot strip
padding.) Note that `addurl` sometimes generates keys w/o size info
(particularly, it does so by design when using quvi).
-Problem: Also, this makes `hasKey` hard to implement: How can it know if
+Problem: Also, this makes `checkPresent` hard to implement: How can it know if
all the chunks are present, if the key size is not known?
Problem: Also, this makes it difficult to download encrypted keys, because
@@ -111,7 +111,7 @@ So, SHA256-1048576-c1--xxxxxxx for the first chunk of 1 megabyte.
Before any chunks are stored, write a chunkcount file, eg
SHA256-s12345-c0--xxxxxxx. Note that this key is the same as the original
object's key, except with chunk number set to 0. This file contains both
-the number of chunks, and also the chunk size used. `hasKey` downloads this
+the number of chunks, and also the chunk size used. `checkPresent` downloads this
file, and then verifies that each chunk is present, looking for keys with
the expected chunk numbers and chunk size.
@@ -126,7 +126,7 @@ Note: This design lets an attacker with logs tell the (appoximate) size of
objects, by finding the small files that contain a chunk count, and
correlating when that is written/read and when other files are
written/read. That could be solved by padding the chunkcount key up to the
-size of the rest of the keys, but that's very innefficient; `hasKey` is not
+size of the rest of the keys, but that's very innefficient; `checkPresent` is not
designed to need to download large files.
# design 3
@@ -139,7 +139,7 @@ This seems difficult; attacker could probably tell where the first encrypted
part stops and the next encrypted part starts by looking for gpg headers,
and so tell which files are the first chunks.
-Also, `hasKey` would need to download some or all of the first file.
+Also, `checkPresent` would need to download some or all of the first file.
If all, that's a lot more expensive. If only some is downloaded, an
attacker can guess that the file that was partially downloaded is the
first chunk in a series, and wait for a time when it's fully downloaded to
@@ -163,7 +163,7 @@ The location log does not record locations of individual chunk keys
(too space-inneficient). Instead, look at a chunk log in the
git-annex branch to get the chunk count and size for a key.
-`hasKey` would check if any of the logged sets of chunks is
+`checkPresent` would check if any of the logged sets of chunks is
present on the remote. It would also check if the non-chunked key is
present, as a fallback.
@@ -225,7 +225,7 @@ Reasons:
Note that this means that the chunks won't exactly match the configured
chunk size. gpg does compression, which might make them a
-lot smaller. Or gpg overhead could make them slightly larger. So `hasKey`
+lot smaller. Or gpg overhead could make them slightly larger. So `checkPresent`
cannot check exact file sizes.
If padding is enabled, gpg compression should be disabled, to not leak
@@ -250,10 +250,10 @@ and skip forward to the next needed chunk. Easy.
Uploads: Check if the 1st chunk is present. If so, check the second chunk,
etc. Once the first missing chunk is found, start uploading from there.
-That adds one extra hasKey call per upload. Probably a win in most cases.
+That adds one extra checkPresent call per upload. Probably a win in most cases.
Can be improved by making special remotes open a persistent
connection that is used for transferring all chunks, as well as for
-checking hasKey.
+checking checkPresent.
Note that this is safe to do only as long as the Key being transferred
cannot possibly have 2 different contents in different repos. Notably not