# NAME

git-annex fsck - check for problems

# SYNOPSIS

git annex fsck `[path ...]`

# DESCRIPTION

With no parameters, this command checks the whole annex for consistency,
and warns about or fixes any problems found. This is a good complement to
`git fsck`.

With parameters, only the specified files are checked.
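
As a minimal sketch of the two forms described above, the invocations can be wrapped in shell functions. The function names are hypothetical and exist only for illustration; the `git annex fsck` invocations themselves follow the synopsis.

```shell
# Hypothetical helper names, for illustration only.

# Check the whole annex for consistency.
fsck_whole_annex() {
    git annex fsck
}

# Check only the paths given as arguments.
fsck_only() {
    git annex fsck "$@"
}
```

For instance, `fsck_only photos notes.txt` would check just those two paths (both names are made up here).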

# OPTIONS

* `--from=remote`

  Check a remote, rather than the local repository.

  Note that by default, files will be copied from the remote to check
  their contents. To avoid this expensive transfer, and only
  verify that the remote still has the files that are expected to be on it,
  add the `--fast` option.

* `--fast`

  Avoids expensive checksum calculations (and expensive transfers when
  fscking a remote).

* `--incremental`

  Start a new incremental fsck pass. An incremental fsck can be interrupted
  at any time, for example with ctrl-c.

* `--more`

  Continue the last incremental fsck pass, where it left off.

* `--incremental-schedule=time`

  Start a new incremental fsck pass only once the specified time period
  has elapsed since the last incremental fsck pass was started.

  The time is in the form "10d" or "300h".

  Maybe you'd like to run a fsck for 5 hours at night, picking up each
  night where it left off. You'd like this to continue until all files
  have been fscked. And once it's done, you'd like a new fsck pass to start,
  but no more often than once a month. Then put this in a nightly cron job:

	git annex fsck --incremental-schedule 30d --time-limit 5h

* `--distributed`

  Normally, fsck only fixes the git-annex location logs when an inconsistency
  is detected. In distributed mode, each file that is checked will result
  in a location log update noting the time that it was present.

  This is useful in situations where repositories cannot be trusted to
  continue to exist. By running a periodic distributed fsck, those
  repositories can verify that they still exist and that the information
  about their contents is still accurate.

  This is not the default mode, because each distributed fsck increases
  the size of the git-annex branch. While it takes care to log identical
  location tracking lines for all keys, which will delta-compress well,
  there is still overhead in committing the changes. If this causes
  the git-annex branch to grow too big, it can be pruned using
  [[git-annex-forget]](1).

* `--expire="[repository:]time..."`

  This option makes the fsck check for location logs of the specified
  repository that have not been updated by a distributed fsck within the
  specified time period. Such stale location logs are then thrown out, so
  git-annex will no longer think that a repository contains data, if it is
  not participating in distributed fscking.
  
  The repository can be specified using the name of a remote,
  or the description or uuid of the repository. If a time is specified
  without a repository, it is used as the default value for all
  repositories. Note that location logs for the current repository are
  never expired, since they can be verified directly.

  The time is in the form "60d" or "1y". A time of "never" will disable
  expiration.

  Note that a remote can always run `fsck` later on to re-update the
  location log if it was expired in error.

* `--numcopies=N`

  Override the normally configured number of copies. 

  To verify data integrity only, disregarding the required number of copies,
  use `--numcopies=1`.

* `--all`

  Normally only the files in the currently checked out branch
  are fscked. This option causes all versions of all files to be fscked.

  This is the default behavior when running git-annex in a bare repository.

* `--unused`

  Operate on files found by the last run of `git annex unused`.

* `--key=keyname`

  Use this option to fsck a specified key.
  
* file matching options

  The [[git-annex-matching-options]](1)
  can be used to specify files to fsck.
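
A few of the options above can be sketched as small shell helpers. This is only an illustration built from the options as this page documents them; the function names are hypothetical, and the remote name is whatever the caller passes in.

```shell
# Hypothetical helper names; each wraps one invocation documented above.

# Verify that a remote still has the files it is expected to have.
# --fast avoids copying the file contents back from the remote.
fsck_remote_presence() {
    git annex fsck --from="$1" --fast
}

# Start a new incremental pass; it can be interrupted at any time.
fsck_incremental_start() {
    git annex fsck --incremental
}

# Continue the last incremental pass where it left off.
fsck_incremental_resume() {
    git annex fsck --more
}

# Verify data integrity only, disregarding the required number of copies.
fsck_integrity_only() {
    git annex fsck --numcopies=1 "$@"
}
```

For instance, `fsck_remote_presence origin` would check a remote named `origin` (an assumed name) without transferring its files.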

# SEE ALSO

[[git-annex]](1)

# AUTHOR

Joey Hess <id@joeyh.name>

Warning: Automatically converted into a man page by mdwn2man. Edit with care.