author Joey Hess <joeyh@joeyh.name> 2015-06-16 20:17:17 -0400
committer Joey Hess <joeyh@joeyh.name> 2015-06-16 20:17:17 -0400
commit ffa6812d083f158e7afe6978ed8f06f2ff9ebdf6
tree e76ea6c470040e89aab9ec22296483ea51152dd8
parent 6d6111dd8c244285c484e416a84ed4c4a89313e1
rewrite so it's understandable without knowing about the related command-line options
-rw-r--r-- doc/preferred_content.mdwn 260
1 files changed, 158 insertions, 102 deletions
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
index e285a6a7c..1dbc4b60b 100644
--- a/doc/preferred_content.mdwn
+++ b/doc/preferred_content.mdwn
@@ -39,8 +39,8 @@ files that `git annex get --auto` will want to get, and `git annex find
will want to drop.
The expressions are very similar to the matching options documented
-on the [[git-annex]] man page. At the command line, you can use those
-options in commands like this:
+on the [[git-annex-matching-options]] man page.
+At the command line, you can use those options in commands like this:
git annex get --include='*.mp3' --and -'(' --not --largerthan=100mb -')'
@@ -48,152 +48,208 @@ The equivalent preferred content expression looks like this:
include=*.mp3 and (not largerthan=100mb)
-So, just remove the dashes, basically. However, there are some differences
-from the command line options to keep in mind:
+So, just remove the dashes, basically. But, there are some differences
+between the command line options and expressions, so see the documentation
+below to get the full story.
-### difference: file matching
+## expressions
-While --include and --exclude match files relative to the current
-directory, preferred content expressions always match files relative to the
-top of the git repository.
+* `include=glob` and `exclude=glob`
-For example, suppose you put files into `archive` directories
-when you're done with them. Then you could configure your laptop to prefer
-to not retain those files, like this:
+ Match files to include, or exclude.
+
+ While --include=glob and --exclude=glob match files relative to the current
+ directory, preferred content expressions always match files relative to the
+ top of the git repository.
+
+ For example, suppose you put files into `archive` directories
+ when you're done with them. Then you could configure your laptop to prefer
+ to not retain those files, like this:
exclude=*/archive/*
-### difference: no "in="
+* `copies=number`
-Preferred content expressions have no direct equivalent to `--in`.
+ Matches only files that git-annex believes to have the specified number
+ of copies, or more. Note that it does not check remotes to verify that
+ the copies still exist.
-Often, it's best to add repositories to groups, and match against
-the groups in a preferred content expression. So rather than
-`--in=usbdrive`, put all the USB drives into a "transfer" group,
-and use "copies=transfer:1"
+ To decide if content should be dropped, git-annex evaluates the preferred
+ content expression under the assumption that the content has *already* been
+ dropped. If the content would not be wanted then, the drop can be done.
+ So, for example, `copies=2` in a preferred content expression lets
+ content be dropped only when there are currently 3 copies of it, including
+ the repo it's being dropped from. This is different than running `git annex
+ drop --copies=2`, which will drop files that currently have 2 copies.
-### difference: dropping
+* `copies=trustlevel:number`
-To decide if content should be dropped, git-annex evaluates the preferred
-content expression under the assumption that the content has *already* been
-dropped. If the content would not be wanted then, the drop can be done.
-So, for example, `copies=2` in a preferred content expression lets
-content be dropped only when there are currently 3 copies of it, including
-the repo it's being dropped from. This is different than running `git annex
-drop --copies=2`, which will drop files that currently have 2 copies.
+ Matches only files that git-annex believes have the specified number of
+ copies, on remotes with the specified trust level. For example,
+ `copies=trusted:2`
-### difference: "present"
+ To match any trust level at or higher than a given level,
+ use 'trustlevel+'. For example, `copies=semitrusted+:2`
-There's a special "present" keyword you can use in a preferred content
-expression. This means that content is wanted if it's present,
-and not otherwise. This leaves it up to you to use git-annex manually
-to move content around. You can use this to avoid preferred content
-settings from affecting a subdirectory. For example:
+* `copies=groupname:number`
- auto/* or (include=ad-hoc/* and present)
+ Matches only files that git-annex believes have the specified number of
+ copies, on remotes in the specified group. For example,
+ `copies=archive:2`
+
+ Preferred content expressions have no equivalent to the `--in`
+ option, but groups can accomplish similar things. You can add
+ repositories to groups, and match against the groups in a
+ preferred content expression. So rather than `--in=usbdrive`,
+ put all the USB drives into a "transfer" group, and use
+ `copies=transfer:1`
+
+* `lackingcopies=number`
+
+ Matches only files that git-annex believes need the specified number or
+ more additional copies to be made in order to satisfy their numcopies
+ settings.
+
+* `approxlackingcopies=number`
+
+ Like lackingcopies, but does not look at .gitattributes annex.numcopies
+ settings. This makes it significantly faster.
+
+* `inbackend=name`
+
+ Matches only files whose content is stored using the specified key-value
+ backend.
+
+* `inallgroup=groupname`
+
+ Matches only files that git-annex believes are present in all repositories
+ in the specified group.
-Note that `not present` is a very bad thing to put in a preferred content
-expression. It'll make it want to get content that's not present, and
-drop content that is present! Don't go there..
+* `smallerthan=size` and `largerthan=size`
-### difference: "inpreferreddir"
+ Matches only files whose content is smaller than, or larger than the
+ specified size.
-There's a special "inpreferreddir" keyword you can use in a
-preferred content expression of a special remote. This means that the
-content is preferred if it's in a directory (located anywhere in the tree)
-with a special name.
+ The size can be specified with any commonly used units, for example,
+ "0.5 gb" or "100 KiloBytes"
-The name of the directory can be configured using
-`git annex enableremote $remote preferreddir=$dirname`
+* `metadata=field=glob`
-(If no directory name is configured, it uses "public" by default.)
+ Matches only files that have a metadata field attached with a value that
+ matches the glob. The values of metadata fields are matched case
+ insensitively.
-### difference: "standard"
+ To match a tag "done", use `metadata=tag=done`
-git-annex comes with some built-in preferred content expressions, that
-can be used with repositories that are in some [[standard_groups]].
+ To match author metadata, use `metadata=author=* Smith`
-When a repository is in exactly one such group, you can use the "standard"
-keyword in its preferred content expression, to match whatever content
-the group's expression matches.
-(If a repository is put into multiple standard
-groups, "standard" will match anything.. so don't do that!)
+* `present`
-Most often, the whole preferred content expression is simply "standard".
-But, you can do more complicated things, for example:
-"`standard or include=otherdir/*`"
+ Makes content be wanted if it's present, but not otherwise.
-### difference: "groupwanted"
+ This leaves it up to you to use git-annex manually
+ to move content around. You can use this to keep preferred content
+ settings from affecting a subdirectory. For example:
-The "groupwanted" keyword can be used to refer to a preferred content
-expression that is associated with a group. This is like the "standard"
-keyword, but you can configure the preferred content expressions
-using `git annex groupwanted`.
+ auto/* or (include=ad-hoc/* and present)
+
+ Note that `not present` is a very bad thing to put in a preferred content
+ expression. It'll make it want to get content that's not present, and
+ drop content that is present! Don't go there..
+
+* `inpreferreddir`
+
+ Makes content be preferred if it's in a directory (located anywhere
+ in the tree) with a particular name.
+
+ The name of the directory can be configured using
+ `git annex enableremote $remote preferreddir=$dirname`
+
+ (If no directory name is configured, it uses "public" by default.)
-Note that when writing a groupwanted preferred content expression,
-you can use all of the keywords listed above, including "standard".
-(But not "groupwanted".)
+* `standard`
-For example, to make a variant of the standard client preferred content
-expression that does not want files in the "out" directory, you
-could run: `git annex groupwanted client "standard and exclude=out/*"`
+ git-annex comes with some built-in preferred content expressions, that
+ can be used with repositories that are in some [[standard_groups]].
-Then repositories that are in the client group and have their preferred
-content expression set to "groupwanted" will use that, while
-other client repositories that have their preferred content expression
-set to "standard" will use the standard expression.
+ When a repository is in exactly one such group, you can use the "standard"
+ keyword in its preferred content expression, to match whatever content
+ the group's expression matches.
+ (If a repository is put into multiple standard
+ groups, "standard" will match anything.. so don't do that!)
-Or, you could make a new group, with your own custom preferred content
-expression tuned for your needs, and every repository you put in this
-group and make its preferred content be "groupwanted" will use it.
+ Most often, the whole preferred content expression is simply "standard".
+ But, you can do more complicated things, for example:
+ `standard or include=otherdir/*`
-For example, the archive group only wants to archive 1 copy of each file,
-spread among every repository in the group.
-Here's how to configure a group named redundantarchive, that instead
-wants to contain 3 copies of each file:
+* `groupwanted`
+ The "groupwanted" keyword can be used to refer to a preferred content
+ expression that is associated with a group. This is like the "standard"
+ keyword, but you can configure the preferred content expressions
+ using `git annex groupwanted`.
+
+ Note that when writing a groupwanted preferred content expression,
+ you can use all of the keywords listed above, including "standard".
+ (But not "groupwanted".)
+
+ For example, to make a variant of the standard client preferred content
+ expression that does not want files in the "out" directory, you
+ could run: `git annex groupwanted client "standard and exclude=out/*"`
+
+ Then repositories that are in the client group and have their preferred
+ content expression set to "groupwanted" will use that, while
+ other client repositories that have their preferred content expression
+ set to "standard" will use the standard expression.
+
+ Or, you could make a new group, with your own custom preferred content
+ expression tuned for your needs, and every repository you put in this
+ group and make its preferred content be "groupwanted" will use it.
+
+ For example, the archive group only wants to archive 1 copy of each file,
+ spread among every repository in the group.
+ Here's how to configure a group named redundantarchive, that instead
+ wants to contain 3 copies of each file:
+
git annex groupwanted redundantarchive "not (copies=redundantarchive:3)"
for repo in foo bar baz; do
git annex group $repo redundantarchive
git annex wanted $repo groupwanted
done
-### difference: metadata matching
-
-This:
+* `unused`
- git annex get --metadata tag=done
+ Matches only keys that `git annex unused` has determined to be unused.
-becomes
+ This is related to the --unused option.
+ However, putting `unused` in a preferred content expression
+ doesn't make git-annex consider those unused keys. So when git-annex is
+ only checking preferred content expressions against files in the
+ repository (which are obviously used), `unused` in a preferred
+ content expression won't match anything.
- metadata=tag=done
+ So when is `unused` useful in a preferred content expression?
-### difference: unused
+ 1. Using `git annex sync --content --all` will operate on all files,
+ including unused ones, and take `unused` in preferred content expressions
+ into account.
+ 2. The git-annex assistant periodically scans for unused files, and
+ moves them to some repository whose preferred content expression
+ matches "unused". (Or, if annex.expireunused is set, it may just delete
+ them.)
-The --unused option makes git-annex operate on every key that `git annex
-unused` has determined to be unused. The corresponding `unused` keyword
-in a preferred content expression also matches those keys.
+* `anything`
-However, using `unused` in a preferred content expression
-doesn't make git-annex consider those keys. So when git-annex is
-only checking preferred content expressions against files in the
-repository (which are obviously used), `unused` in a preferred
-content expression won't match anything.
+ Matches any version of any file.
-So when is `unused` useful in a preferred content expression?
+* `not expression`
-* The git-annex assistant periodically scans for unused files, and
- moves them to some repository whose preferred content expression
- matches "unused". (Or, if annex.expireunused is set, it may just delete
- them.)
-* Using `git annex sync --content --all` will operate on all files,
- including unused ones, and take `unused` in preferred content expressions
- into account.
+ Inverts what the expression matches. For example, `not include=archive/*`
+ is the same as `exclude=archive/*`
-### difference: anything
+* `and` / `or` / `( expression )`
-The "anything" keyword can be used in a preferred content expression
-to match any version of any file.
+ These can be used to build up more complicated expressions.
## upgrades
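
The drop rule described under `copies=number` in this diff is easy to misread, so here is a minimal Python sketch of the arithmetic. It is only a model, not git-annex's implementation. It assumes the preferred content expression is `not copies=2` (the same shape as the `redundantarchive` example in the diff), and the function names are hypothetical. The point it illustrates: because the expression is evaluated as if the content were already dropped, the `copies=2` term only matches at drop time when there are currently 3 copies, while `--copies=2` on the command line counts copies as they are now.

```python
def copies_term(n: int, known_copies: int) -> bool:
    """Model of a `copies=n` term: true when at least n copies are believed to exist."""
    return known_copies >= n


def wanted(n: int, known_copies: int) -> bool:
    """Model of the assumed preferred content expression `not copies=n`."""
    return not copies_term(n, known_copies)


def drop_allowed(n: int, current_copies: int) -> bool:
    """Evaluate the expression as if the local copy were already gone.

    The drop can be done only if the content would not be wanted afterwards.
    """
    copies_after_drop = current_copies - 1
    return not wanted(n, copies_after_drop)


def cli_copies_matches(n: int, current_copies: int) -> bool:
    """Model of `--copies=n` on the command line: counts copies as they are now."""
    return copies_term(n, current_copies)


if __name__ == "__main__":
    # Columns: current copies, drop allowed by `not copies=2`, matched by `--copies=2`
    for current in (1, 2, 3, 4):
        print(current, drop_allowed(2, current), cli_copies_matches(2, current))
```

With 2 current copies the command-line option already matches, but the preferred content expression does not yet permit a drop. That is the difference the documentation text calls out.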