author | Joey Hess <joeyh@joeyh.name> | 2016-01-22 12:24:15 -0400 |
---|---|---|
committer | Joey Hess <joeyh@joeyh.name> | 2016-01-22 12:24:15 -0400 |
commit | 85ceb10aca1ac40abde321c6279b246ecb240167 | |
tree | 5a762b7cf5cf43f30b80688cf196a8849595f7e5 | |
parent | 73223a202e026b55b6f67e9443ae4491b19bc15f |
move preferred content terminals docs to man page
-rw-r--r-- | doc/git-annex-preferred-content.mdwn | 230 |
-rw-r--r-- | doc/preferred_content.mdwn | 221 |
2 files changed, 221 insertions, 230 deletions
diff --git a/doc/git-annex-preferred-content.mdwn b/doc/git-annex-preferred-content.mdwn
index 49512f465..8ac8df022 100644
--- a/doc/git-annex-preferred-content.mdwn
+++ b/doc/git-annex-preferred-content.mdwn
@@ -1,6 +1,6 @@
 # NAME

-git-annex-preferred-content -
+git-annex-preferred-content - which files are wanted in a repository

 # DESCRIPTION

@@ -17,27 +17,225 @@ doing so would violate its required content settings.
 A repository's required content can be configured using `git annex vicfg` or `git annex required`.

-Preferred content expressions are similar, but not identical to
-the [[git-annex-matching-options]](1), just without the dashes.
+# SYNTAX
+
+Preferred content expressions use a similar syntax to
+the [[git-annex-matching-options]](1), without the dashes.
 For example:

     exclude=archive/* and (include=*.mp3 or smallerthan=1mb)

-The main differences are that `exclude=` and `include=` always
-match relative to the top of the git repository, and that there is
-no equivalent to `--in`.
+The idea is that you write an expression that files are matched against. If
+a file matches, the repository wants to store its content. If it doesn't,
+the repository wants to drop its content (if there are enough copies
+elsewhere to allow removing it).
+
+# EXPRESSIONS
+
+* `include=glob` and `exclude=glob`
+
+  Match files to include, or exclude.
+
+  While --include=glob and --exclude=glob match files relative to the current
+  directory, preferred content expressions always match files relative to the
+  top of the git repository.
+
+  For example, suppose you put files into `archive` directories
+  when you're done with them. Then you could configure your laptop to prefer
+  to not retain those files, like this: `exclude=*/archive/*`
+
+* `copies=number`
+
+  Matches only files that git-annex believes to have the specified number
+  of copies, or more. Note that it does not check remotes to verify that
+  the copies still exist.
+
+  To decide if content should be dropped, git-annex evaluates the preferred
+  content expression under the assumption that the content has *already* been
+  dropped. If the content would not be wanted then, the drop can be done.
+  So, for example, `copies=2` in a preferred content expression lets
+  content be dropped only when there are currently 3 copies of it, including
+  the repo it's being dropped from. This is different than running `git annex
+  drop --copies=2`, which will drop files that currently have 2 copies.
+
+* `copies=trustlevel:number`
+
+  Matches only files that git-annex believes have the specified number
+  copies, on remotes with the specified trust level. For example,
+  `copies=trusted:2`
+
+  To match any trust level at or higher than a given level,
+  use `trustlevel+`. For example, `copies=semitrusted+:2`
+
+* `copies=groupname:number`
+
+  Matches only files that git-annex believes have the specified number of
+  copies, on remotes in the specified group. For example,
+  `copies=archive:2`
+
+  Preferred content expressions have no equivalent to the `--in`
+  option, but groups can accomplish similar things. You can add
+  repositories to groups, and match against the groups in a
+  preferred content expression. So rather than `--in=usbdrive`,
+  put all the USB drives into a "transfer" group, and use
+  `copies=transfer:1`
+
+* `lackingcopies=number`
+
+  Matches only files that git-annex believes need the specified number or
+  more additional copies to be made in order to satisfy their numcopies
+  settings.
+
+* `approxlackingcopies=number`
+
+  Like lackingcopies, but does not look at .gitattributes annex.numcopies
+  settings. This makes it significantly faster.
+
+* `inbackend=name`
+
+  Matches only files whose content is stored using the specified key-value
+  backend.
+
+* `inallgroup=groupname`
+
+  Matches only files that git-annex believes are present in all repositories
+  in the specified group.
+
+* `smallerthan=size` and `largerthan=size`
+
+  Matches only files whose content is smaller than, or larger than the
+  specified size.
+
+  The size can be specified with any commonly used units, for example,
+  "0.5 gb" or "100 KiloBytes"
+
+* `metadata=field=glob`
+
+  Matches only files that have a metadata field attached with a value that
+  matches the glob. The values of metadata fields are matched case
+  insensitively.
+
+  To match a tag "done", use `metadata=tag=done`
+
+  To match author metadata, use `metadata=author=*Smith`
+
+* `present`

-For more details about preferred content expressions, see
-See <https://git-annex.branchable.com/preferred_content/>
+  Makes content be wanted if it's present, but not otherwise.

-When a repository is in one of the standard predefined groups, like "backup"
-and "client", setting its preferred content to "standard" will use a
-built-in preferred content expression developed for that group.
-See <https://git-annex.branchable.com/preferred_content/standard_groups/>
+  This leaves it up to you to use git-annex manually
+  to move content around. You can use this to avoid preferred content
+  settings from affecting a subdirectory. For example:
+  `auto/* or (include=ad-hoc/* and present)`

-If you have set a groupwanted expression for a group, it will be used
-when a repository in the group has its preferred content set to
-"groupwanted".
+  Note that `not present` is a very bad thing to put in a preferred content
+  expression. It'll make it want to get content that's not present, and
+  drop content that is present! Don't go there..
+
+* `inpreferreddir`
+
+  Makes content be preferred if it's in a directory (located anywhere
+  in the tree) with a particular name.
+
+  The name of the directory can be configured using
+  `git annex enableremote $remote preferreddir=$dirname`
+
+  (If no directory name is configured, it uses "public" by default.)
+
+* `standard`
+
+  git-annex comes with some built-in preferred content expressions, that
+  can be used with repositories that are in some [[standard groups]].
+
+  When a repository is in exactly one such group, you can use the "standard"
+  keyword in its preferred content expression, to match whatever content
+  the group's expression matches.
+  (If a repository is put into multiple standard
+  groups, "standard" will match anything.. so don't do that!)
+
+  Most often, the whole preferred content expression is simply "standard".
+  But, you can do more complicated things, for example:
+  `standard or include=otherdir/*`
+
+* `groupwanted`
+
+  The "groupwanted" keyword can be used to refer to a preferred content
+  expression that is associated with a group. This is like the "standard"
+  keyword, but you can configure the preferred content expressions
+  using `git annex groupwanted`.
+
+  Note that when writing a groupwanted preferred content expression,
+  you can use all of the keywords listed above, including "standard".
+  (But not "groupwanted".)
+
+  For example, to make a variant of the standard client preferred content
+  expression that does not want files in the "out" directory, you
+  could run: `git annex groupwanted client "standard and exclude=out/*"`
+
+  Then repositories that are in the client group and have their preferred
+  content expression set to "groupwanted" will use that, while
+  other client repositories that have their preferred content expression
+  set to "standard" will use the standard expression.
+
+  Or, you could make a new group, with your own custom preferred content
+  expression tuned for your needs, and every repository you put in this
+  group and make its preferred content be "groupwanted" will use it.
+
+  For example, the archive group only wants to archive 1 copy of each file,
+  spread among every repository in the group.
+  Here's how to configure a group named redundantarchive, that instead
+  wants to contain 3 copies of each file:
+
+        git annex groupwanted redundantarchive "not (copies=redundantarchive:3)"
+        for repo in foo bar baz; do
+            git annex group $repo redundantarchive
+            git annex wanted $repo groupwanted
+        done
+
+* `unused`
+
+  Matches only keys that `git annex unused` has determined to be unused.
+
+  This is related the the --unused option.
+  However, putting `unused` in a preferred content expression
+  doesn't make git-annex consider those unused keys. So when git-annex is
+  only checking preferred content expressions against files in the
+  repository (which are obviously used), `unused` in a preferred
+  content expression won't match anything.
+
+  So when is `unused` useful in a preferred content expression?
+
+  Using `git annex sync --content --all` will operate on all files,
+  including unused ones, and take `unused` in preferred content expressions
+  into account.
+
+  The git-annex assistant periodically scans for unused files, and
+  moves them to some repository whose preferred content expression
+  says it wants them. (Or, if annex.expireunused is set, it may just delete
+  them.)
+
+* `anything`
+
+  Matches any version of any file.
+
+* `not expression`
+
+  Inverts what the expression matches. For example, `not include=archive/*`
+  is the same as `exclude=archive/*`
+
+* `and` / `or` / `( expression )`
+
+  These can be used to build up more complicated expressions.
+
+# TESTING
+
+To check at the command line which files are matched by a repository's
+preferred content settings, you can use the --want-get and --want-drop
+options.
+
+For example, git annex find --want-get --not --in . will find all the files
+that git annex get --auto will want to get, and git annex find --want-drop --in
+. will find all the files that git annex drop --auto will want to drop.
 # SEE ALSO

@@ -47,6 +245,8 @@ when a repository in the group has its preferred content set to

 [[git-annex-wanted]](1)

+<https://git-annex.branchable.com/preferred_content/>
+
 # AUTHOR

 Joey Hess <id@joeyh.name>
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
index e6244bde5..26db05491 100644
--- a/doc/preferred_content.mdwn
+++ b/doc/preferred_content.mdwn
@@ -18,16 +18,6 @@ If a file matches, the repository wants to store its content. If it doesn't,
 the repository wants to drop its content (if there are enough copies
 elsewhere to allow removing it).

-## finding preferred content
-
-To check at the command line which files are matched by preferred content
-settings, you can use the --want-get and --want-drop options.
-
-For example, `git annex find --want-get --not --in .` will find all the
-files that `git annex get --auto` will want to get, and `git annex find
---want-drop --in .` will find all the files that `git annex drop --auto`
-will want to drop.
-
 ## writing expressions

 [[!template id=note text="""
@@ -42,214 +32,15 @@ and simply setting its preferred content to "standard" to match whatever
 is standard for that group. See [[standard_groups]] for a list.
 """]]
+See the man page [[git-annex-preferred-content]] for details on the syntax
+of preferred content expressions.

-The expressions are very similar to the matching options documented
-on the [[git-annex-matching-options]] man page.
-At the command line, you can use those options in commands like this:
-
-    git annex get --include='*.mp3' --and -'(' --not --largerthan=100mb -')'
-
-The equivalent preferred content expression looks like this:
-
-    include=*.mp3 and (not largerthan=100mb)
-
-So, just remove the dashes, basically. But, there are some differences
-between the command line options and expressions, so see the documentation
-below to get the full story.
-
-* `include=glob` and `exclude=glob`
-
-  Match files to include, or exclude.
-
-  While --include=glob and --exclude=glob match files relative to the current
-  directory, preferred content expressions always match files relative to the
-  top of the git repository.
-
-  For example, suppose you put files into `archive` directories
-  when you're done with them. Then you could configure your laptop to prefer
-  to not retain those files, like this: `exclude=*/archive/*`
-
-* `copies=number`
-
-  Matches only files that git-annex believes to have the specified number
-  of copies, or more. Note that it does not check remotes to verify that
-  the copies still exist.
-
-  To decide if content should be dropped, git-annex evaluates the preferred
-  content expression under the assumption that the content has *already* been
-  dropped. If the content would not be wanted then, the drop can be done.
-  So, for example, `copies=2` in a preferred content expression lets
-  content be dropped only when there are currently 3 copies of it, including
-  the repo it's being dropped from. This is different than running `git annex
-  drop --copies=2`, which will drop files that currently have 2 copies.
-
-* `copies=trustlevel:number`
-
-  Matches only files that git-annex believes have the specified number
-  copies, on remotes with the specified trust level. For example,
-  `copies=trusted:2`
-
-  To match any trust level at or higher than a given level,
-  use `trustlevel+`. For example, `--copies=semitrusted+:2`
-
-* `copies=groupname:number`
-
-  Matches only files that git-annex believes have the specified number of
-  copies, on remotes in the specified group. For example,
-  `copies=archive:2`
-
-  Preferred content expressions have no equivalent to the `--in`
-  option, but groups can accomplish similar things. You can add
-  repositories to groups, and match against the groups in a
-  preferred content expression. So rather than `--in=usbdrive`,
-  put all the USB drives into a "transfer" group, and use
-  `copies=transfer:1`
-
-* `lackingcopies=number`
-
-  Matches only files that git-annex believes need the specified number or
-  more additional copies to be made in order to satisfy their numcopies
-  settings.
-
-* `approxlackingcopies=number`
-
-  Like lackingcopies, but does not look at .gitattributes annex.numcopies
-  settings. This makes it significantly faster.
-
-* `inbackend=name`
-
-  Matches only files whose content is stored using the specified key-value
-  backend.
-
-* `inallgroup=groupname`
-
-  Matches only files that git-annex believes are present in all repositories
-  in the specified group.
-
-* `smallerthan=size` and `largerthan=size`
-
-  Matches only files whose content is smaller than, or larger than the
-  specified size.
-
-  The size can be specified with any commonly used units, for example,
-  "0.5 gb" or "100 KiloBytes"
-
-* `metadata=field=glob`
-
-  Matches only files that have a metadata field attached with a value that
-  matches the glob. The values of metadata fields are matched case
-  insensitively.
-
-  To match a tag "done", use `metadata=tag=done`
-
-  To match author metadata, use `metadata=author=* Smith`
-
-* `present`
-
-  Makes content be wanted if it's present, but not otherwise.
-
-  This leaves it up to you to use git-annex manually
-  to move content around. You can use this to avoid preferred content
-  settings from affecting a subdirectory. For example:
-  `auto/* or (include=ad-hoc/* and present)`
-
-  Note that `not present` is a very bad thing to put in a preferred content
-  expression. It'll make it want to get content that's not present, and
-  drop content that is present! Don't go there..
-
-* `inpreferreddir`
-
-  Makes content be preferred if it's in a directory (located anywhere
-  in the tree) with a particular name.
-
-  The name of the directory can be configured using
-  `git annex enableremote $remote preferreddir=$dirname`
-
-  (If no directory name is configured, it uses "public" by default.)
-
-* `standard`
-
-  git-annex comes with some built-in preferred content expressions, that
-  can be used with repositories that are in some [[standard_groups]].
-
-  When a repository is in exactly one such group, you can use the "standard"
-  keyword in its preferred content expression, to match whatever content
-  the group's expression matches.
-  (If a repository is put into multiple standard
-  groups, "standard" will match anything.. so don't do that!)
-
-  Most often, the whole preferred content expression is simply "standard".
-  But, you can do more complicated things, for example:
-  `standard or include=otherdir/*`
-
-* `groupwanted`
-
-  The "groupwanted" keyword can be used to refer to a preferred content
-  expression that is associated with a group. This is like the "standard"
-  keyword, but you can configure the preferred content expressions
-  using `git annex groupwanted`.
-
-  Note that when writing a groupwanted preferred content expression,
-  you can use all of the keywords listed above, including "standard".
-  (But not "groupwanted".)
-
-  For example, to make a variant of the standard client preferred content
-  expression that does not want files in the "out" directory, you
-  could run: `git annex groupwanted client "standard and exclude=out/*"`
-
-  Then repositories that are in the client group and have their preferred
-  content expression set to "groupwanted" will use that, while
-  other client repositories that have their preferred content expression
-  set to "standard" will use the standard expression.
-
-  Or, you could make a new group, with your own custom preferred content
-  expression tuned for your needs, and every repository you put in this
-  group and make its preferred content be "groupwanted" will use it.
-
-  For example, the archive group only wants to archive 1 copy of each file,
-  spread among every repository in the group.
-  Here's how to configure a group named redundantarchive, that instead
-  wants to contain 3 copies of each file:
-
-        git annex groupwanted redundantarchive "not (copies=redundantarchive:3)"
-        for repo in foo bar baz; do
-            git annex group $repo redundantarchive
-            git annex wanted $repo groupwanted
-        done
-
-* `unused`
-
-  Matches only keys that `git annex unused` has determined to be unused.
-
-  This is related the the --unused option.
-  However, putting `unused` in a preferred content expression
-  doesn't make git-annex consider those unused keys. So when git-annex is
-  only checking preferred content expressions against files in the
-  repository (which are obviously used), `unused` in a preferred
-  content expression won't match anything.
-
-  So when is `unused` useful in a preferred content expression?
-
-  1. Using `git annex sync --content --all` will operate on all files,
-     including unused ones, and take `unused` in preferred content expressions
-     into account.
-  2. The git-annex assistant periodically scans for unused files, and
-     moves them to some repository whose preferred content expression
-     says it wants them. (Or, if annex.expireunused is set, it may just delete
-     them.)
-
-* `anything`
-
-  Matches any version of any file.
-
-* `not expression`
-
-  Inverts what the expression matches. For example, `not include=archive/*`
-  is the same as `exclude=archive/*`
+An example:

-* `and` / `or` / `( expression )`
+    include=*.mp3 and (not largerthan=100mb) and exclude=old/*

-  These can be used to build up more complicated expressions.
+This makes all .mp3 files, and all other files that are less than 100 mb in
+size be preferred content. It excludes all files under the "old" directory.

 ## upgrades