Finish the `git annex watch` command, which runs in the background, watching
for changes via inotify and automatically annexing new files, etc.

There is a `watch` branch in git that adds such a command. To make this
really useful, it needs to:

- on startup, add any files that have appeared since last run **done**
- on startup, fix the symlinks for any renamed links **done**
- on startup, stage any files that have been deleted since last run
  (seems to require a `git commit -a` on startup, or at least a
  `git add --update`, which will notice deleted files) **done**
- notice new files, and git annex add **done**
- notice renamed files, auto-fix the symlink, and stage the new file location
  **done**
- handle cases where directories are moved outside the repo, and stop
  watching them **done**
- when a whole directory is deleted or moved, stage removal of its
  contents from the index **done**
- notice deleted files and stage the deletion
  (tricky; there's a race with add since it replaces the file with a symlink..)
  **done**
- periodically auto-commit staged changes (avoid autocommitting when
  lots of changes are coming in)
- tunable delays before adding new files, etc
- Coalesce related add/rm events. See commit
  cbdaccd44aa8f0ca30afba23fc06dd244c242075 for some details of the problems
  with doing this.
- don't annex `.gitignore` and `.gitattributes` files, but do auto-stage
  changes to them
- configurable option to only annex files meeting certain size or
  filename criteria
- honor .gitignore, not adding files it excludes (difficult, probably
  needs my own .gitignore parser to avoid excessive running of git commands
  to check for ignored files)
- Possibly, when a directory is moved out of the annex location,
  unannex its contents.
- Gracefully handle when the default limit of 8192 inotified directories
  is exceeded. This can be tuned by root, so help the user fix it.
- Support OSes other than Linux; it only uses inotify currently.
  OSX and FreeBSD share a common mechanism (kqueue), and there is a
  Haskell interface for it.
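
For the inotify limit item above, here is a minimal Python sketch of how the problem could be detected and explained to the user. It is not code from the `watch` branch. The helper names (`read_watch_limit`, `count_directories`, `limit_advice`) are made up for illustration, and it assumes the limit is the one Linux exposes as `fs.inotify.max_user_watches` under `/proc/sys`, and that the `.git` directory does not need to be watched.

```python
import os


def read_watch_limit(proc_path="/proc/sys/fs/inotify/max_user_watches"):
    """Return the per-user inotify watch limit, or None if it cannot be read."""
    try:
        with open(proc_path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def count_directories(top):
    """Count `top` and every directory below it, pruning .git (an assumption)."""
    total = 0
    for _dirpath, dirnames, _filenames in os.walk(top):
        if ".git" in dirnames:
            dirnames.remove(".git")  # do not descend into the git directory
        total += 1
    return total


def limit_advice(needed, limit):
    """Return a hint for the user when `needed` watches exceed `limit`, else None."""
    if limit is None or needed <= limit:
        return None
    return (
        f"Need {needed} inotify watches but the limit is {limit}. "
        f"As root, try: sysctl -w fs.inotify.max_user_watches={needed * 2}"
    )
```

Checking before any watches are set up, e.g. `limit_advice(count_directories("."), read_watch_limit())`, would let the hint be shown up front rather than after inotify starts failing. Doubling the needed count is an arbitrary choice here, just to leave headroom.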

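The auto-commit item in the list above (commit periodically, but not while lots of changes are coming in) amounts to waiting for a quiet period. A rough Python sketch of that idea follows. The `CommitScheduler` class and both of its tunables are hypothetical, not something the `watch` branch is known to contain, and the `max_wait_seconds` cap is an extra assumption added so that a steady stream of changes cannot postpone commits forever.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommitScheduler:
    """Decide when staged changes should be committed (hypothetical helper)."""

    quiet_seconds: float = 1.0      # assumed tunable: lull required before committing
    max_wait_seconds: float = 60.0  # assumed tunable: never hold a batch longer than this
    pending: List[str] = field(default_factory=list)
    first_change: float = 0.0
    last_change: float = 0.0

    def record_change(self, path: str, now: float) -> None:
        """Note that `path` was staged at time `now` (seconds)."""
        if not self.pending:
            self.first_change = now
        self.pending.append(path)
        self.last_change = now

    def should_commit(self, now: float) -> bool:
        """True once changes have gone quiet, or the batch has waited too long."""
        if not self.pending:
            return False
        quiet = now - self.last_change >= self.quiet_seconds
        overdue = now - self.first_change >= self.max_wait_seconds
        return quiet or overdue

    def take_batch(self) -> List[str]:
        """Return the pending paths and reset, ready for the next batch."""
        batch, self.pending = self.pending, []
        return batch
```

The watcher would call `record_change` from its event handlers and poll `should_commit` on a timer, committing whenever it returns true. Passing the time in explicitly keeps the logic easy to test without sleeping.
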
## the races

Many races need to be dealt with by this code. Here are some of them.

* File is added and then removed before the add event starts.

  Not a problem; The add event does nothing since the file is not present.

* File is added and then removed before the add event has finished
  processing it.
  
  **Minor problem**; When the add's processing of the file (checksum and so
  on) fails due to it going away, there is an ugly error message, but
  things are otherwise ok.

* File is added and then replaced with another file before the annex add
  moves its content into the annex.

  **Currently unfixed**; The new content will be moved to the annex under the
  old checksum, and fsck will later catch this inconsistency.

  Possible fix: Move content someplace before doing checksumming.

* File is added and then replaced with another file before the annex add
  makes its symlink.

  **Minor problem**; The annex add will fail creating its symlink since
  the file exists. There is an ugly error message, but the second add
  event will add the new file.

* File is added and then replaced with another file before the annex add
  stages the symlink in git.

  **Currently unfixed**; `git add` will be run on the new file, which is
  not at all good when it's big. Could be dealt with by using `git
  update-index` to manually put the symlink into the index without git
  looking at what's currently on disk.

* Link is moved, fixed link is written by fix event, but then that is
  removed by the user and replaced with a file before the event finishes.

  **Currently unfixed**; `git add` will be run on the file. Basically the
  same effect as the previous race.

* File is removed and then re-added before the removal event starts.

  Not a problem; The removal event does nothing since the file exists,
  and the add event replaces it in git with the new one.

* File is removed and then re-added before the removal event finishes.

  Not a problem; The removal event removes the old file from the index, and
  the add event adds the new one.
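
The possible fix noted for the replaced-before-move race ("Move content someplace before doing checksumming") can be sketched in Python as below. This only illustrates the ordering, and is not how git-annex stores content: the `ingest` function, the staging directory, and naming the stored file after its bare SHA-256 digest are all assumptions made for the example. It also assumes the staging directory is on the same filesystem as the file, so that the rename is atomic.

```python
import hashlib
import os
import tempfile


def ingest(path, annex_dir):
    """Move `path` out of the working tree first, then checksum the moved copy.

    Returns (digest, stored_path). Hypothetical helper, for illustration only.
    """
    os.makedirs(annex_dir, exist_ok=True)
    staging = tempfile.mkdtemp(dir=annex_dir, prefix="tmp-")
    staged = os.path.join(staging, "content")

    # After this rename, a new file created at `path` is a different file,
    # so it cannot end up stored under the old content's checksum.
    os.rename(path, staged)

    h = hashlib.sha256()
    with open(staged, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    digest = h.hexdigest()

    stored = os.path.join(annex_dir, digest)
    os.replace(staged, stored)
    os.rmdir(staging)
    return digest, stored
```

This does not help if a process still holds the original file open for writing, since it can keep modifying the moved copy while it is being checksummed. That case would need separate handling.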