summaryrefslogtreecommitdiff
path: root/doc/preferred_content/standard_groups.mdwn
blob: b05394a4657988788b2cc6bacaf9c2a9a40e72a1 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
git-annex comes with some built-in [[preferred_content]] settings, that can
be used with repositories that are in special groups. To make a
repository use one of these, just set its preferred content expression
to "standard", and put it in one of these groups. For example, to put the current
repository in the manual group and use this group to control the preferred
content, run

    git annex wanted . standard
    git annex group . manual

In the webapp, just edit the repository and select the group.

(Note that most of these standard expressions also make the repository
want to get any content that is only currently available on untrusted and
dead repositories. So if an untrusted repository gets connected,
any repository that can will back it up.)

The following standard groups are available:

### client

All content in the current working tree is wanted, unless it's for a
file in a "archive" directory, which has reached an archive repository.

`(include=* and ((exclude=*/archive/* and exclude=archive/*) or (not (copies=archive:1 or copies=smallarchive:1)))) or approxlackingcopies=1`

### transfer

Use for repositories that are used to transfer data between other
repositories, but do not need to retain data themselves. For
example, a repository on a server, or in the cloud, or a small
USB drive used in a sneakernet.

The preferred content expression for these causes them to get and retain
data until all clients have a copy.

`not (inallgroup=client and copies=client:2) and ($client)`

(Where $client is a copy of the preferred content expression used for
clients.)

The "copies=client:2" part of the above handles the case where
there is only one client repository. It makes a transfer repository
speculatively  prefer content in this case, even though it as of yet
has nowhere to transfer it to. Presumably, another client repository
will be added later.

### backup

All content is wanted. Even content of old/deleted files.

`anything`

### incrementalbackup

Only wants content that's not already backed up to another backup
or incremental backup repository.

`((not copies=backup:1) and (not copies=incrementalbackup:1)) or approxlackingcopies=1`

### smallarchive

Only wants content that's located in an "archive" directory, and
only if it's not already been archived somewhere else.

`((include=*/archive/* or include=archive/*) and not (copies=archive:1 or copies=smallarchive:1)) or approxlackingcopies=1`

### archive

All content is wanted, unless it's already been archived somewhere else.

`(not (copies=archive:1 or copies=smallarchive:1)) or approxlackingcopies=1`

Note that if you want to archive multiple copies (not a bad idea!),
you can set `git-annex groupwanted archive` to a version of 
the above preferred content expression with a larger number of copies
than 1. Then make the archive repositories have a preferred
content expression of "groupwanted" in order to use your modified
version.

### source

Use for repositories where files are often added, but that do not need to
retain files for local use. For example, a repository on a camera, where
it's desirable to remove photos as soon as they're transferred elsewhere.

The preferred content expression for these causes them to only retain
data until a copy has been sent to some other repository.

`not (copies=1)`

### manual

This gives you nearly full manual control over what content is stored in the
repository. This allows using the [[assistant]] without it trying to keep a
local copy of every file. Instead, you can manually run `git annex get`,
`git annex drop`, etc to manage content. Only content that is already
present is wanted.

The exception to this manual control is that content that a client
repository would not want is not wanted. So, files in archive
directories are not wanted once their content has 
reached an archive repository.

`present and ($client)`

(Where $client is a copy of the preferred content expression used for
clients.)

### public

This is used for publishing information to a repository that can be
publically accessed. Only files in a directory with a particular name
will be published. (The directory can be located anywhere in the
repository.)

`inpreferreddir`

The name of the directory can be configured using
`git annex enableremote $remote preferreddir=$dirname`

### unwanted

Use for repositories that you don't want to exist. This will result
in any content on them being moved away to other repositories. (Works
best when the unwanted repository is also marked as untrusted or dead.)

`not anything`