From bbba6c19bd03f2b4a4ce8a38a2423c794826b1c5 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Sun, 28 Aug 2011 16:28:38 -0400
Subject: update documentation for new, neutered key-value backends

Backends are now only used to generate keys (and check them); they are
not arbitrary key-value stores for data, because it turned out such a
store is better modeled as a special remote. Updated docs to not imply
backends do more than they do now. Sometimes I'm tempted to rename
"backend" to "keytype" or something, which would really be more clear.
But it would be an annoying transition for users, with annex.backends
etc.
---
 doc/backends.mdwn                              | 22 +++++++---------------
 doc/copies.mdwn                                |  6 +++---
 doc/git-annex.mdwn                             | 26 +++++++++++++-------------
 doc/walkthrough/fsck:_verifying_your_data.mdwn | 10 +++++-----
 doc/walkthrough/unused_data.mdwn               |  2 +-
 5 files changed, 29 insertions(+), 37 deletions(-)

diff --git a/doc/backends.mdwn b/doc/backends.mdwn
index 9e698032d..f2f1c5580 100644
--- a/doc/backends.mdwn
+++ b/doc/backends.mdwn
@@ -1,24 +1,16 @@
-git-annex uses a key-value abstraction layer to allow file contents to be
-stored in different ways. In theory, any key-value storage system could be
-used to store file contents.
-
 When a file is annexed, a key is generated from its content and/or metadata.
 The file checked into git symlinks to the key. This key can later be used
 to retrieve the file's content (its value).
 
-Multiple pluggable backends are supported, and a single repository
-can use different backends for different files.
-
-These backends can transfer file contents between configured git remotes.
-It's also possible to use [[special_remotes]], such as Amazon S3 with
-these backends.
+Multiple pluggable key-value backends are supported, and a single repository
+can use different ones for different files.
 
-* `WORM` ("Write Once, Read Many") This backend assumes that any file with
-  the same basename, size, and modification time has the same content. So with
-  this backend, files can be moved around, but should never be added to
+* `WORM` ("Write Once, Read Many") This assumes that any file with
+  the same basename, size, and modification time has the same content. So
+  files can be moved around, but should never be added to
   or changed. This is the default, and the least expensive backend.
-* `SHA1` -- This backend uses a key based on a sha1 checksum. This backend
-  allows modifications of files to be tracked. Its need to generate checksums
+* `SHA1` -- This uses a key based on a sha1 checksum. This allows
+  modifications of files to be tracked. Its need to generate checksums
   can make it slower for large files.
 * `SHA512`, `SHA384`, `SHA256`, `SHA224` -- Like SHA1, but larger
   checksums. Mostly useful for the very paranoid, or anyone who is
diff --git a/doc/copies.mdwn b/doc/copies.mdwn
index 16eba19c8..93cbd8ea8 100644
--- a/doc/copies.mdwn
+++ b/doc/copies.mdwn
@@ -1,8 +1,8 @@
-The WORM and SHA1 key-value [[backends]] store data inside
-your git repository's `.git` directory, not in some external data store.
+Annexed data is stored inside your git repository's `.git/annex` directory.
+Some [[special_remotes]] can store annexed data elsewhere.
 
 It's important that data not get lost by an ill-considered `git annex drop`
-command. So, then using those backends, git-annex can be configured to try
+command. So, git-annex can be configured to try
 to keep N copies of a file's content available across all repositories.
 (Although [[untrusted_repositories|trust]] don't count toward this total.)
 
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index a262d465f..52599611e 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -72,15 +72,15 @@ Many git-annex commands will stage changes for later `git commit` by you.
 
 * get [path ...]
 
-  Makes the content of annexed files available in this repository. Depending
-  on the backend used, this will involve copying them from another repository,
-  or downloading them, or transferring them from some kind of key-value store.
+  Makes the content of annexed files available in this repository. This
+  will involve copying them from another repository, or downloading them,
+  or transferring them from some kind of key-value store.
 
 * drop [path ...]
 
   Drops the content of annexed files from this repository.
 
-  git-annex may refuse to drop content if the backend does not think
+  git-annex may refuse to drop content if it does not think
   it is safe to do so, typically because of the setting of annex.numcopies.
 
 * move [path ...]
@@ -207,14 +207,14 @@ Many git-annex commands will stage changes for later `git commit` by you.
 
 * migrate [path ...]
 
-  Changes the specified annexed files to store their content in the
-  default backend (or the one specified with --backend). Only files whose
-  content is currently available are migrated.
+  Changes the specified annexed files to use the default key-value backend
+  (or the one specified with --backend). Only files whose content
+  is currently available are migrated.
 
-  Note that the content is not removed from the backend it was previously in.
-  Use `git annex unused` to find and remove such content.
+  Note that the content is also still available using the old key after
+  migration. Use `git annex unused` to find and remove the old key.
 
-  Normally, nothing will be done to files already in the backend.
+  Normally, nothing will be done to files already using the new backend.
   However, if a backend changes the information it uses to construct a key,
   this can also be used to migrate files to use the new key format.
 
@@ -293,7 +293,7 @@ Many git-annex commands will stage changes for later `git commit` by you.
 * fromkey file
 
   This plumbing-level command can be used to manually set up a file
-  to link to a specified key in the key-value backend.
+  in the git repository to link to a specified key.
 
 * dropkey [key ...]
 
@@ -500,8 +500,8 @@ Here are all the supported configuration settings.
 
 # CONFIGURATION VIA .gitattributes
 
-The backend used when adding a new file to the annex can be configured
-on a per-file-type basis via `.gitattributes` files. In the file,
+The key-value backend used when adding a new file to the annex can be
+configured on a per-file-type basis via `.gitattributes` files. In the file,
 the `annex.backend` attribute can be set to the name of the backend to
 use. For example, this here's how to use the WORM backend by default,
 but the SHA1 backend for ogg files:
diff --git a/doc/walkthrough/fsck:_verifying_your_data.mdwn b/doc/walkthrough/fsck:_verifying_your_data.mdwn
index 7e05469a1..d036332fb 100644
--- a/doc/walkthrough/fsck:_verifying_your_data.mdwn
+++ b/doc/walkthrough/fsck:_verifying_your_data.mdwn
@@ -1,8 +1,8 @@
-You can use the fsck subcommand to check for problems in your data.
-What can be checked depends on the [[backend|backends]] you've used to store
-the data. For example, when you use the SHA1 backend, fsck will verify that
-the checksums of your files are good. Fsck also checks that the annex.numcopies
-setting is satisfied for all files.
+You can use the fsck subcommand to check for problems in your data. What
+can be checked depends on the key-value [[backend|backends]] you've used
+for the data. For example, when you use the SHA1 backend, fsck will verify
+that the checksums of your files are good. Fsck also checks that the
+annex.numcopies setting is satisfied for all files.
 
 	# git annex fsck
 	fsck some_file (checksum...) ok
diff --git a/doc/walkthrough/unused_data.mdwn b/doc/walkthrough/unused_data.mdwn
index fb8419303..e142b576c 100644
--- a/doc/walkthrough/unused_data.mdwn
+++ b/doc/walkthrough/unused_data.mdwn
@@ -2,7 +2,7 @@ It's possible for data to accumulate in the annex that no files point to
 anymore. One way it can happen is if you `git rm` a file without first
 calling `git annex drop`. And, when you modify an annexed file, the old
 content of the file remains in the annex. Another way is when migrating
-between backends.
+between key-value [[backends|backend]].
 
 This might be historical data you want to preserve, so git-annex defaults to
 preserving it. So from time to time, you may want to check for such data and
-- 
cgit v1.2.3
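
The distinction that the patched doc/backends.mdwn draws between `WORM` and `SHA1` keys can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not git-annex's implementation: the function names `worm_key` and `sha1_key` and the key layout shown are hypothetical, chosen only to show which inputs each kind of key depends on.

```python
import hashlib
import os


def worm_key(path):
    """Hypothetical WORM-style key: built from basename, size, and mtime only.

    The file's content is never read, so this is cheap even for large files.
    """
    st = os.stat(path)
    return "WORM-s{}-m{}--{}".format(
        st.st_size, int(st.st_mtime), os.path.basename(path)
    )


def sha1_key(path, chunk_size=1 << 20):
    """Hypothetical SHA1-style key: built from the size and a content checksum.

    The whole file is read in chunks, which is why checksum-based keys
    can be slower to generate for large files.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "SHA1-s{}--{}".format(os.path.getsize(path), digest.hexdigest())
```

Under these assumptions, rewriting a file with different bytes of the same length and restoring its modification time leaves the `worm_key` result unchanged while the `sha1_key` result changes, which matches the documented warning that files keyed with WORM should never be added to or changed.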