field | value | date / path |
---|---|---|
author | Joey Hess <joey@kitenet.net> | 2011-11-04 15:21:45 -0400 |
committer | Joey Hess <joey@kitenet.net> | 2011-11-04 15:51:01 -0400 |
commit | ef3457196ace3669ddfa93039f2d3c15baf54713 | |
tree | 391787de35537c71068cdd8e2fc882109a2c3b79 | /doc/backends.mdwn |
parent | 1089e85d48a0d3c455fc2f4139b82484b94b5bbe | |
use SHA256 by default

To get the old behavior, add a .gitattributes containing:

    * annex.backend=WORM

I feel that SHA256 is a better default for most people, as long as their
systems are fast enough that checksumming their files isn't a problem.
git-annex should default to preserving the integrity of data as well as git
does. Checksum backends also work better with editing files via
unlock/lock.

I considered just using SHA1, but since that hash is believed to be somewhat
near to being broken, and git-annex deals with large files which would be a
perfect exploit medium, I decided to go to a SHA-2 hash.

SHA512 is annoyingly long when displayed, and git-annex displays it in a
few places (notably, it is shown in ls -l), so I picked the shorter
hash. I considered SHA224 as it's even shorter, but feel it's a bit weird.

I expect git-annex will use SHA-3 at some point in the future, but
probably not soon!

Note that systems without a sha256sum (or sha256) program will fall back to
defaulting to SHA1.
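
The fallback described in the note above can be sketched roughly as follows. This is an illustration only, not git-annex's actual code (git-annex is not written in Python): the function name `default_backend`, the `CHECKSUM_PROGRAMS` tuple, and the injectable `which` parameter are all hypothetical names made up for this sketch. The only behavior taken from the commit message is "use SHA256 if a `sha256sum` or `sha256` program exists, otherwise SHA1".

```python
import shutil
from typing import Callable, Optional

# Hypothetical names for illustration; not taken from git-annex itself.
# The commit message names these two programs as the ones that are checked.
CHECKSUM_PROGRAMS = ("sha256sum", "sha256")


def default_backend(which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return "SHA256" if a SHA-256 program is on PATH, else fall back to "SHA1".

    `which` is injectable so the decision can be exercised without
    depending on what happens to be installed on the current machine.
    """
    if any(which(program) is not None for program in CHECKSUM_PROGRAMS):
        return "SHA256"
    return "SHA1"


if __name__ == "__main__":
    # Prints the backend this sketch would pick on the current system.
    print(default_backend())
```

Passing a stub for `which` makes both branches easy to check; for example, `default_backend(lambda program: None)` takes the fallback path.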
Diffstat (limited to 'doc/backends.mdwn')
-rw-r--r-- | doc/backends.mdwn | 32 |
1 files changed, 18 insertions, 14 deletions
diff --git a/doc/backends.mdwn b/doc/backends.mdwn
index ebcdedc2a..2030d107a 100644
--- a/doc/backends.mdwn
+++ b/doc/backends.mdwn
@@ -5,17 +5,19 @@ to retrieve the file's content (its value).
 Multiple pluggable key-value backends are supported, and a single repository
 can use different ones for different files.
 
-* `WORM` ("Write Once, Read Many") This assumes that any file with
-  the same basename, size, and modification time has the same content.
-  This is the default, and the least expensive backend.
-* `SHA1` -- This uses a key based on a sha1 checksum. This allows
+* `SHA256` -- The default backend for new files. This allows
   verifying that the file content is right, and can avoid duplicates of
   files with the same content. Its need to generate checksums
-  can make it slower for large files.
-* `SHA512`, `SHA384`, `SHA256`, `SHA224` -- Like SHA1, but larger
-  checksums. Mostly useful for the very paranoid, or anyone who is
-  researching checksum collisions and wants to annex their colliding data. ;)
-* `SHA1E`, `SHA512E`, etc -- Variants that preserve filename extension as
+  can make it slower for large files. 
+* `WORM` ("Write Once, Read Many") This assumes that any file with
+  the same basename, size, and modification time has the same content.
+  This is the the least expensive backend, recommended for really large
+  files or slow systems.
+* `SHA512` -- Best currently available hash, for the very paranoid.
+* `SHA1` -- Smaller hash than `SHA256` for those who want a checksum
+  but are not concerned about security.
+* `SHA384`, `SHA224` -- Hashes for people who like unusual sizes.
+* `SHA256E`, `SHA1E`, etc -- Variants that preserve filename extension as
   part of the key. Useful for archival tasks where the filename extension
   contains metadata that should be preserved.
 
@@ -27,9 +29,11 @@ For finer control of what backend is used when adding different types of
 files, the `.gitattributes` file can be used. The `annex.backend`
 attribute can be set to the name of the backend to use for matching files.
 
-For example, to use the SHA1 backend for sound files, which tend to be
-smallish and might be modified or copied over time, you could set in
-`.gitattributes`:
+For example, to use the SHA256 backend for sound files, which tend to be
+smallish and might be modified or copied over time,
+while using the WORM backend for everything else, you could set
+in `.gitattributes`:
 
-    *.mp3 annex.backend=SHA1
-    *.ogg annex.backend=SHA1
+    * annex.backend=WORM
+    *.mp3 annex.backend=SHA256
+    *.ogg annex.backend=SHA256
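
To make the effect of the three-line `.gitattributes` example concrete, here is a rough sketch of how such rules resolve to a backend for a given file. It leans on the gitattributes rule that a later matching line overrides an earlier one, which is why the catch-all `*` line comes first. The names `RULES` and `backend_for` are hypothetical. The matching is deliberately simplified: it only handles slash-free patterns compared against the file's basename, and it is not how git or git-annex implement attribute lookup.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List, Optional, Tuple

# The rules from the documentation example, in file order.
# RULES and backend_for are illustrative names, not part of git-annex.
RULES: List[Tuple[str, str]] = [
    ("*", "WORM"),
    ("*.mp3", "SHA256"),
    ("*.ogg", "SHA256"),
]


def backend_for(path: str, rules: List[Tuple[str, str]] = RULES) -> Optional[str]:
    """Return the backend chosen by the last rule whose pattern matches.

    Simplification: patterns are assumed to contain no slash, so they are
    matched against the basename only.
    """
    name = PurePosixPath(path).name
    chosen: Optional[str] = None
    for pattern, backend in rules:
        if fnmatchcase(name, pattern):
            chosen = backend  # a later matching line overrides an earlier one
    return chosen


if __name__ == "__main__":
    for example in ("music/song.mp3", "music/song.ogg", "video/disk.iso"):
        print(example, backend_for(example))
```

With these rules, sound files resolve to `SHA256` and everything else to `WORM`. Moving the `*` line to the end would make it win for every file, so the order matters.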