path: root/Command/Watch.hs
Commit message (author, date)
* full autostart support (Joey Hess, 2012-08-02)
  git annex assistant --autostart will start separate daemons in each listed autostart repo.

  Running the webapp outside any git-annex repo will open it on the first listed autostart repo.
* much better webapp startup of the assistant (Joey Hess, 2012-07-27)
  This avoids forking another process, avoids polling, fixes a race, and avoids a rare forkProcess thread hang that I saw once when starting the webapp.
* add assistant command (Joey Hess, 2012-06-22)
  like watch, but more magic
* reorganize (Joey Hess, 2012-06-13)
* optimise link staging at startup (Joey Hess, 2012-06-13)
  Now it starts really, really fast! Down from 15 minutes or so on my big tree to around 1 minute.

  The trick is to remember the last time the daemon was running. Links with a ctime from before that point don't need to be restaged on startup (as long as they are correct), since the old daemon would have handled them already.

  We also assume that if the daemon has never run before, any links that already exist are good. The pre-commit hook fixes links, so this should be a safe assumption.

  Adds another MVar holding a DaemonStatus data structure. Also allowed getting rid of the Annex.Fast hack. This data structure will probably grow a lot of details about the daemon's status, which will later be used by the webapp's UI.

  The code to actually track when the daemon was last running is not written yet. It's 3 am.
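The ctime rule described in the commit above can be sketched as a small pure function. This is only an illustration: the `DaemonStatus` record here is a hypothetical one-field stand-in (the log does not show the real structure), `Timestamp` and `needsRestage` are invented names, and the "as long as they are correct" check on the link target is left out.

```haskell
-- Seconds since the epoch; a stand-in for the ctime reported by stat.
type Timestamp = Integer

-- Hypothetical, trimmed-down status record. The real DaemonStatus
-- structure is not shown in the log.
data DaemonStatus = DaemonStatus
    { lastRunning :: Maybe Timestamp -- Nothing if the daemon has never run
    }

-- Decide whether a symlink seen by the startup scan must be restaged.
needsRestage :: DaemonStatus -> Timestamp -> Bool
needsRestage status linkCtime = case lastRunning status of
    -- Daemon never ran: existing links are assumed to be good.
    Nothing -> False
    -- Links changed before the daemon last ran were already handled.
    Just t -> linkCtime >= t
```

Keeping the decision pure means the startup scan only needs one stat result per link plus a snapshot of the status.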
* plumb file status through to event handlers (Joey Hess, 2012-06-13)
  The idea, not yet done, is to use this to detect when a file has an old change time, and avoid expensive restaging of the file.

  If git-annex watch keeps track of the last time it finished a full scan, then any symlink that is older than that time must have been scanned before, so need not be added. (Relying on moving, copying, etc. of a file all updating its change time.)

  Anyway, this info is available for free since inotify already checks it, so it might as well make it available.
* move comment (Joey Hess, 2012-06-13)
* tweak (Joey Hess, 2012-06-12)
* do fewer commits during long batch jobs (Joey Hess, 2012-06-12)
  10 thousand queue size does not use appreciable memory in my testing.
* better optimisation of add check (Joey Hess, 2012-06-12)
  Now really only done in the startup scan.

  It turns out to be quite hard for event handlers to know when the startup scan is complete. I tried to make addWatch pass that info, but found threading the state very difficult. For now, a quick hack, using the fast flag.

  Note that it's actually possible for inotify events to come in while the startup scan is still ongoing. Due to my hack, the expensive check will be done for files added in such inotify events.
* fix bug that turned files already in git into symlinks (Joey Hess, 2012-06-12)
  This requires a relatively expensive test at file add time to see if it's in git already. But it can be optimised to only happen during the startup scan.
* add a flag indicating if an event was synthesized during initial dir scan (Joey Hess, 2012-06-12)
* cleanup (Joey Hess, 2012-06-12)
* hlint (Joey Hess, 2012-06-12)
* update (Joey Hess, 2012-06-11)
* avoid using STM while the MVar is held (Joey Hess, 2012-06-11)
  I thought this might be a lock conflict that explains the deadlock when built with -threaded, but it seems not.. it still locks! It even locks without the committer thread. Indeed, it locks when running "git annex add"! -threaded is exposing some other problem.

  Still, this seems conceptually cleaner and did not add any inefficiencies.

  Also added some high-level documentation about the threads used.
* tweak (Joey Hess, 2012-06-11)
* git annex watch --stop (Joey Hess, 2012-06-11)
* add a pid file (Joey Hess, 2012-06-11)
  Writes pid to a file. Is supposed to take an exclusive lock, but that's not working, and it's too late for me to understand why.
* daemonize git annex watch (Joey Hess, 2012-06-11)
* crazy optimisation (Joey Hess, 2012-06-10)
  Crazy like a fox..
* run git add --update after inotify is started (Joey Hess, 2012-06-10)
  This way, there's no window where deleted files won't be noticed.
* fixed the double commits problem (Joey Hess, 2012-06-10)
* avoid running pre-commit hook from watch commits (Joey Hess, 2012-06-10)
* tweak (Joey Hess, 2012-06-10)
* smart commit thread (Joey Hess, 2012-06-10)
  The commit thread now has access to a channel containing the times of all uncommitted changes. This lets it be smart about detecting busy times when a batch job is running (such as rm -rf, or untarring something, etc), and avoid committing until it's done. While at the same time, instantly committing one-off changes that the user is going to expect to see immediately.

  I had to use STM to implement the channel, because of http://hackage.haskell.org/trac/ghc/ticket/4154

  While this adds a dependency, I always wanted to use STM, so this actually makes me happy. ;)

  Also happy that shouldCommit is a pure function, so other commit smartness strategies can easily be played with. Although the current one seems pretty good.

  There is one bug: for some reason it does double commits, every time.
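Because shouldCommit is described above as a pure function over the times of uncommitted changes, one possible strategy is easy to sketch. The rule and both thresholds below are assumptions made for illustration, not the heuristic the commit actually used, and `Timestamp` is an invented alias.

```haskell
-- Seconds since the epoch.
type Timestamp = Integer

-- Hypothetical rule: commit at once when only a few changes are pending;
-- when many are pending (a batch job), wait until no change has arrived
-- for a quiet period.
shouldCommit :: Timestamp -> [Timestamp] -> Bool
shouldCommit _ [] = False -- nothing to commit
shouldCommit now changes
    | length changes <= burstThreshold = True
    | otherwise = now - maximum changes >= quietSeconds
  where
    burstThreshold = 10 -- assumed value
    quietSeconds = 2 -- assumed value
```

Since the function takes the current time as an argument rather than reading the clock, alternative strategies can be swapped in and checked without running the daemon.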
* add a thread to commit changes (Joey Hess, 2012-06-10)
  Currently the stupidest possible version, just wakes up every second, and may make empty commits sometimes.
* generalize and improve state MVar code (Joey Hess, 2012-06-10)
* stage deletions directly using update-index (Joey Hess, 2012-06-10)
  no need to run git-rm separately
* fix non-linux build (Joey Hess, 2012-06-09)
* refactor and function name cleanup (Joey Hess, 2012-06-08)
  (oops, I had a calcMerge and a calc_merge!)
* use git queue for rm too (Joey Hess, 2012-06-07)
* make watch use the queue (Joey Hess, 2012-06-07)
  May not work. Certainly needs to flush the queue from time to time when only symlink changes are being made.
* tweak (Joey Hess, 2012-06-06)
* refactor (Joey Hess, 2012-06-06)
* build watch on non-linux, just don't do anything (Joey Hess, 2012-06-06)
* handle running out of watch descriptors (Joey Hess, 2012-06-06)
* ignore .gitignore and .gitattributes (Joey Hess, 2012-06-06)
* close the git add race (Joey Hess, 2012-06-06)
  There's a race adding a new file to the annex: The file is moved to the annex and replaced with a symlink, and then we git add the symlink. If someone comes along in the meantime and replaces the symlink with something else, such as a new large file, we add that instead. Which could be bad..

  This race is fixed by avoiding using git add, instead the symlink is directly staged into the index.

  It would be nice to make `git annex add` use this same technique. I have not done so yet because it currently runs git update-index once per file, which would slow down `git annex add`. A future enhancement would be to extend the Git.Queue to include the ability to run update-index with a list of Streamers.
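One way to stage a symlink without going through git add is to write the link target to the object database and then hand git update-index the resulting object id. The sketch below only builds the argument list; it assumes the `--cacheinfo` form of update-index, which may not be the mechanism the commit used, and `symlinkMode` and `stageSymlinkArgs` are invented names.

```haskell
-- Git's file mode for a symbolic link entry in the index.
symlinkMode :: String
symlinkMode = "120000"

-- Arguments for git that stage a symlink entry whose target has already
-- been stored as a blob (for example with `git hash-object -w --stdin`).
-- The working tree is never consulted, so a file swapped in at that path
-- in the meantime cannot be staged by mistake.
stageSymlinkArgs :: String -> FilePath -> [String]
stageSymlinkArgs sha file =
    ["update-index", "--add", "--cacheinfo", symlinkMode, sha, file]
```

The list can be passed to a process runner such as `System.Process.callProcess "git"`.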
* run event handlers all in the same Annex monad (Joey Hess, 2012-06-04)
  Uses a MVar again, as there seems no other way to thread the state through inotify events. This is a rather unsatisfactory result. I had wanted to run them in the same monad so that the git queue could be used to coalesce git commands and speed things up. But, that led to fragility: If several files are added, and one is removed before queue flush, git add will fail to add any of them. So, the queue is still explicitly flushed after each add for now.

  TODO: Investigate using git add --ignore-errors. This would need to be done in Command.Add. And, git add still exits nonzero with it, so would need to avoid crashing on queue flush.
* avoid explicit queue flush (Joey Hess, 2012-06-04)
  The queue is still flushed on add, because each add event is handled by a separate Annex monad. That needs to be fixed to speed up add a lot.
* ignore-unmatch when removing a staged file (Joey Hess, 2012-06-04)
  When a file is added, and then deleted before the add action runs, the delete event was unhappy that the file never did get staged.
* refactor (Joey Hess, 2012-06-04)
* notice deleted files on startup (Joey Hess, 2012-06-04)
* deletion (Joey Hess, 2012-06-04)
  When a new file is annexed, a deletion event occurs when it's moved away to be replaced by a symlink. Most of the time, there is no problematic race, because the same thread runs the add event as the deletion event. So, once the symlink is in place, the deletion code won't run at all, due to existing checks that a deleted file is really gone.

  But there is a race at startup, as then the inotify thread is running at the same time as the main thread, which does the initial tree walking and annexing. It would be possible for the deletion inotify to run in a perfect race with the addition, and remove the newly added symlink from the git cache.

  To solve this race, added event serialization via a MVar. We putMVar before running each event, which blocks if an event is already running. And when an event finishes (or crashes!), we takeMVar to free the lock.

  Also, make rm -rf not spew warnings by passing --ignore-unmatch when deleting directories.
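The putMVar/takeMVar locking described above can be sketched with `bracket_`, which also covers the "or crashes!" case by releasing the lock when the handler throws. The name `serialized` is invented for this sketch and is not taken from the git-annex source.

```haskell
import Control.Concurrent.MVar (MVar, newEmptyMVar, putMVar, takeMVar)
import Control.Exception (bracket_)

-- Run one event handler at a time. The MVar starts out empty: filling it
-- claims the lock (and blocks while another handler holds it), emptying it
-- releases the lock, even if the handler threw an exception.
serialized :: MVar () -> IO a -> IO a
serialized lock = bracket_ (putMVar lock ()) (takeMVar lock)
```

A caller would create the lock once with `newEmptyMVar` and wrap every inotify callback and every startup-scan action in `serialized lock`.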
* suppress "recording state in git" message during add (Joey Hess, 2012-06-04)
* add handling of symlink addition events (Joey Hess, 2012-06-04)
  And just like that, annexed files can be moved and copied around within the tree, and are automatically fixed to point to the content, and staged in git.

  Huzzah!

  Delete still remains TODO, with its troublesome race during add..
* handle directory deletion (Joey Hess, 2012-06-04)
  When a directory is deleted, or moved away, git rm -r it to stage the deletion.
* add events for symlink creation and directory removal (Joey Hess, 2012-06-04)
  Improved the inotify code, so it will also notice directory removal and symlink creation.

  In the watch code, optimised away a stat of a file that's being added, that's done by Command.Add.start. This is the reason symlink creation is handled separately from file creation, since during initial tree walk at startup, a stat was already done, and can be reused.
* watch subcommand (Joey Hess, 2012-04-12)
  So far this only handles auto-annexing new files that are created inside the repository while it's running. To make this really useful, it needs to at least:
  - notice deleted files and stage the deletion (tricky; there's a race with add..)
  - notice renamed files, auto-fix the symlink, and stage the new file location
  - periodically auto-commit staged changes
  - honor .gitignore, not adding files it excludes

  Also nice to have would be:
  - Somehow sync remotes, possibly using a push sync like dvcs-autosync does, so they are immediately updated.
  - Somehow get content that is unavailable. This is problematic with inotify, since we only get an event once the user has tried (and failed) to read from the file. Perhaps instead, automatically copy content that is added out to remotes, with the goal of all repos eventually getting a copy, if df allows.
  - Drop files that have not been used lately, or meet some other criteria (as long as there's a copy elsewhere).
  - Perhaps automatically dropunused files that have been deleted, although I cannot see a way to do that, since by the time the inotify deletion event arrives, the file is deleted, and we cannot see what its symlink pointed to! Alternatively, perhaps automatically do an expensive unused/dropunused cleanup process.

  Some of this probably needs the currently stateless threads to maintain a common state.