summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2017-02-24 11:17:07 -0400
committerGravatar Joey Hess <joeyh@joeyh.name>2017-02-24 11:22:15 -0400
commit52249bd6c6c27df089969e956d9952b1a432ca45 (patch)
tree83c4df66d20abf6929b6face397e21906e02c341
parent845cce262c5f593e00e5cc5ebd7619595d848f36 (diff)
make file2key reject E* backend keys with a long extension
I am not happy that I had to put backend-specific code in file2key. But it would be very difficult to avoid this layering violation. Most of the time, when parsing a Key from a symlink target, git-annex never looks up its Backend at all, so adding this check to a method of the Backend object would not work. The Key could be made to contain the appropriate Backend, but since Backend is parameterized on an "a" that is fixed to the Annex monad later, that would need Key to change to "Key a". The only way to clean this up that I can see would be to have the Key contain a LowlevelBackend, and put the validation in LowlevelBackend. Perhaps later, but that would be an extensive change, so let's not do it in this commit which may want to cherry-pick to backports. This commit was sponsored by Ethan Aubin.
-rw-r--r--CHANGELOG7
-rw-r--r--Types/Key.hs29
-rw-r--r--doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn7
3 files changed, 36 insertions, 7 deletions
diff --git a/CHANGELOG b/CHANGELOG
index e69ec3d3c..4d6bc19f3 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -33,9 +33,10 @@ git-annex (6.20170215) UNRELEASED; urgency=medium
to wget, since curl is able to display only errors to stderr, unlike
wget.
* status: Pass --ignore-submodules=when option on to git status.
- * Tighten key parser to not accept keys containing non-numeric
- fields, which could be used to embed data useful for a SHA1
- attack against git.
+ * Tighten key parser to mitigate against hypothetical SHA1 preimage
+ attacks. This ensures that signed git commits of annexed files
+ will remain secure, even against the worst possible future SHA1
+ attacks, as long as git-annex is using a secure hashing backend.
-- Joey Hess <id@joeyh.name> Tue, 14 Feb 2017 15:54:25 -0400
diff --git a/Types/Key.hs b/Types/Key.hs
index 23648dd03..d4a4d3728 100644
--- a/Types/Key.hs
+++ b/Types/Key.hs
@@ -1,6 +1,6 @@
{- git-annex Key data type
-
- - Copyright 2011-2016 Joey Hess <id@joeyh.name>
+ - Copyright 2011-2017 Joey Hess <id@joeyh.name>
-
- Licensed under the GNU GPL version 3 or higher.
-}
@@ -104,7 +104,7 @@ file2key s
_ -> Nothing
findfields (c:v) (Just k)
- | c == fieldSep = Just $ k { keyName = v }
+ | c == fieldSep = addkeyname k v
| otherwise = sepfield k v $ addfield c
findfields _ v = v
@@ -134,6 +134,31 @@ file2key s
_ -> Nothing
addfield _ _ _ = Nothing
+ addkeyname k v
+ | validKeyName k v = Just $ k { keyName = v }
+ | otherwise = Nothing
+
+{- A key with a backend ending in "E" is an extension preserving key,
+ - using some hash.
+ -
+ - The length of the extension is limited in order to mitigate against
+ - SHA1 collision attacks (specifically, chosen-prefix attacks).
+ - In such an attack, the extension of the key could be made to contain
+ - the collision generation data, with the result that a signed git commit
+ - including such keys would not be secure.
+ -
+ - The maximum extension length ever generated for such a key was 8
+ - characters; 20 is used here to give a little future wiggle-room.
+ - The SHA1 common-prefix attack used 128 bytes of data.
+ -
+ - This code is here, and not in Backend.Hash (where it really belongs)
+ - so that file2key can check it whenever a Key is constructed.
+ -}
+validKeyName :: Key -> String -> Bool
+validKeyName k v
+ | end (keyBackendName k) == "E" = length (takeExtensions v) <= 20
+ | otherwise = True
+
instance ToJSON Key where
toJSON = toJSON . key2file
diff --git a/doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn b/doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn
index 16bcdbc7d..c3ecde01a 100644
--- a/doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn
+++ b/doc/todo/sha1_collision_embedding_in_git-annex_keys.mdwn
@@ -36,12 +36,15 @@ A few other potential problems:
git-annex has ever supported (probably < 20 bytes or so), which would
be less than the size of the data needed for current SHA1 collision
attacks. Presumably aa chosen-prefix attack would need a similar amount of
- data.
+ data. Update: Now done; git-annex refuses to use keys with super
+ long extensions.
* It might be possible to embed colliding data in a specially constructed
key name with an extra field in it, eg "SHA256-cXXXXXXXXXXXXXXX-...".
Need to review the code and see if such extra fields are allowed.
Update: All fields are numeric, but could contain arbitrary data
- after the number. This has been fixed; git-annex refuses to parse
+ after the number. Could have been used in a chosen-prefix attack
+ (posibly; would require field to come after key name data) or
+ preimage attack. This has been fixed; git-annex refuses to parse
such fields, so it won't work with files that try to exploit this.