summaryrefslogtreecommitdiff
path: root/Annex.hs
Commit message (Collapse)AuthorAge
* Fix zombie leak and general inneficiency when copying files to a local git repo.Gravatar Joey Hess2014-03-06
| | | | | | Benchmarking this with 1000 small files being copied, the time reduced from 15.98s to 14.64s -- an 8% improvement in the non-data-transfer overhead of git-annex copy.
* pre-commit-annex hook script to automatically extract metadata from lots of ↵Gravatar Joey Hess2014-03-02
| | | | | | | | | | | | | | | | | | types of files Using the extract(1) program to do the heavy lifting. Decided to make git-annex run pre-commit-annex when committing. Since git-annex pre-commit also runs it, it'll be run when git commit is run too, via the pre-commit hook. This basically gives back the pre-commit hook that git-annex took away. The implementation avoids repeatedly looking for the hook script when the assistant is running and committing repeatedly; only checks if the hook is available once. To make the script simpler, made git-annex metadata -s field?=value only set a field when it's not already got a value. This commit was sponsored by bak.
* Probe for quvi version at run time.Gravatar Joey Hess2014-02-28
| | | | | Overhead: git annex addurl runs quvi --version once. And more bloat to Annex state..
* metacata command can now operate on many files at onceGravatar Joey Hess2014-02-13
|
* use locking on WindowsGravatar Joey Hess2014-01-28
| | | | This is all the easy cases, where there was already a separate lock file.
* add "unused" preferred content expressionGravatar Joey Hess2014-01-22
| | | | | | | With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.
* numcopies cleanup, part 2Gravatar Joey Hess2014-01-21
| | | | This includes several bug fixes.
* reorganize numcopies code (no behavior changes)Gravatar Joey Hess2014-01-21
| | | | | | | Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.
* global numcopies settingGravatar Joey Hess2014-01-20
| | | | | | | | | | | | | | | | | | | | | | | * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.
* fix inversion of control in CommandSeek (no behavior changes)Gravatar Joey Hess2014-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.
* improve matcher data type to allow matching Keys, instead of just files (no ↵Gravatar Joey Hess2014-01-18
| | | | behavior changes)
* fix reversion in relative paths to local remotes of direct mode reposGravatar Joey Hess2013-11-26
| | | | | | 52ad9a1528bc51f65411aca263def97615367943 broke support for local remotes from direct mode repos, because the relative path was taken to be from the gitdir, rather than from the work tree.
* restart on upgrade now fully workingGravatar Joey Hess2013-11-22
|
* fix standalone build of this moduleGravatar Joey Hess2013-11-22
|
* automatically set and unset core.bare when switching to/from direct modeGravatar Joey Hess2013-11-05
|
* support direct mode repositories with core.bare=true (not yet default)Gravatar Joey Hess2013-11-05
| | | | | | | | | | | Direct mode repositories can now have core.bare=true set, to prevent accidentally running git commands that try to operate on the work tree, and so do the wrong thing. This is not yet the default, and it causes known problems for git-annex sync due to receive.denyCurrentBranch not working in bare repositories. This commit was sponsored by Richard Hartmann.
* cleanupGravatar Joey Hess2013-10-18
|
* Send a git-annex user-agent when downloading urls.Gravatar Joey Hess2013-09-28
| | | | | | | | | Overridable with --user-agent option. Not yet done for S3 or WebDAV due to limitations of libraries used -- nether allows a user-agent header to be specified. This commit sponsored by Michael Zehrer.
* gitignore support for the assistant and watcherGravatar Joey Hess2013-08-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Requires git 1.8.4 or newer. When it's installed, a background git check-ignore process is run, and used to efficiently check ignores whenever a new file is added. Thanks to Adam Spiers, for getting the necessary support into git for this. A complication is what to do about files that are gitignored but have been checked into git anyway. git commands assume the ignore has been overridden in this case, and not need any more overriding to commit a changed version. However, for the assistant to do the same, it would have to run git ls-files to check if the ignored file is in git. This is somewhat expensive. Or it could use the running git-cat-file process to query the file that way, but that requires transferring the whole file content over a pipe, so it can be quite expensive too, for files that are not git-annex symlinks. Now imagine if the user knows that a file or directory tree will be getting frequent changes, and doesn't want the assistant to sync it, so gitignores it. The assistant could overload the system with repeated ls-files checks! So, I've decided that the assistant will not automatically commit changes to files that are gitignored. This is a tradeoff. Hopefully it won't be a problem to adjust .gitignore settings to not ignore files you want the assistant to autocommit, or to manually git annex add files that are listed in .gitignore. (This could be revisited if git-annex gets access to an interface to check the content of the index w/o forking a git command. This could be libgit2, or perhaps a separate git cat-file --batch-check process, so it wouldn't need to ship over the whole file content.) This commit was sponsored by Francois Marier. Thanks!
* Make --numcopies override annex.numcopies set in .gitattributes.Gravatar Joey Hess2013-07-09
|
* assistant: Work around git-cat-file's not reloading the index after files ↵Gravatar Joey Hess2013-05-25
| | | | | | are staged. Argh.
* refactorGravatar Joey Hess2013-05-24
|
* Switch to MonadCatchIO-transformers for better handling of state while ↵Gravatar Joey Hess2013-05-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | catching exceptions. As seen in this bug report, the lifted exception handling using the StateT monad throws away state changes when an action throws an exception. http://git-annex.branchable.com/bugs/git_annex_fork_bombs_on_gpg_file/ .. Which can result in cached values being redundantly calculated, or other possibly worse bugs when the annex state gets out of sync with reality. This switches from a StateT AnnexState to a ReaderT (MVar AnnexState). All changes to the state go via the MVar. So when an Annex action is running inside an exception handler, and it makes some changes, they immediately go into affect in the MVar. If it then throws an exception (or even crashes its thread!), the state changes are still in effect. The MonadCatchIO-transformers change is actually only incidental. I could have kept on using lifted-base for the exception handling. However, I'd have needed to write a new instance of MonadBaseControl for the new monad.. and I didn't write the old instance.. I begged Bas and he kindly sent it to me. Happily, MonadCatchIO-transformers is able to derive a MonadCatchIO instance for my monad. This is a deep level change. It passes the test suite! What could it break? Well.. The most likely breakage would be to code that runs an Annex action in an exception handler, and *wants* state changes to be thrown away. Perhaps the state changes leaves the state inconsistent, or wrong. Since there are relatively few places in git-annex that catch exceptions in the Annex monad, and the AnnexState is generally just used to cache calculated data, this is unlikely to be a problem. Oh yeah, this change also makes Assistant.Types.ThreadedMonad a bit redundant. It's now entirely possible to run concurrent Annex actions in different threads, all sharing access to the same state! The ThreadedMonad just adds some extra work on top of that, with its own MVar, and avoids such actions possibly stepping on one-another's toes. I have not gotten rid of it, but might try that later. Being able to run concurrent Annex actions would simplify parts of the Assistant code.
* start one git-cat-file per index fileGravatar Joey Hess2013-05-15
| | | | | | | This reverts a5031031f0d596b2381a785925beb574d90a862e and properly fixes the issue discussed there. This makes git-annex behave much nicer in direct mode.
* turn on PackageImports globallyGravatar Joey Hess2013-04-13
| | | | | | | This will make it easier to use the Evil Splicer, when it needs to add package qualified imports And there's no real downside.
* Use lower case hash directories for storing files on crippled filesystems, ↵Gravatar Joey Hess2013-04-04
| | | | | | | | | | | | | | | same as is already done for bare repositories. * since this is a crippled filesystem anyway, git-annex doesn't use symlinks on it * so there's no reason to use the mixed case hash directories that we're stuck using to avoid breaking everyone's symlinks to the content * so we can do what is already done for all bare repos, and make non-bare repos on crippled filesystems use the all-lower case hash directories * which are, happily, all 3 letters long, so they cannot conflict with mixed case hash directories * so I was able to 100% fix this and even resuming `git annex add` in the test case will recover and it will all just work.
* Bugfix: Fix bug in inode cache sentinal check, which broke copying to local ↵Gravatar Joey Hess2013-03-12
| | | | repos if the repo being copied from had moved to a different filesystem or otherwise changed all its inodes'
* Direct mode: Support filesystems like FAT which can change their inodes each ↵Gravatar Joey Hess2013-02-19
| | | | time they are mounted.
* remove unused itemGravatar Joey Hess2013-02-14
| | | | This moved to annexDirect in GitConfig.
* type based git config handling for remotesGravatar Joey Hess2013-01-01
| | | | | Still a couple of places that use git config ad-hoc, but this is most of it done.
* type based git config handlingGravatar Joey Hess2012-12-29
| | | | | | | | | | | Now there's a Config type, that's extracted from the git config at startup. Note that laziness means that individual config values are only looked up and parsed on demand, and so we get implicit memoization for all of them. So this is not only prettier and more type safe, it optimises several places that didn't have explicit memoization before. As well as getting rid of the ugly explicit memoization code. Not yet done for annex.<remote>.* configuration settings.
* memoize parsing of annex.direct config settingGravatar Joey Hess2012-12-29
| | | | | | It occurs to me that all config settings should be parsed once at startup, into a proper ADT, rather than all this ad-hoc parsing and memoization. One day..
* indentation foo, and a new coding style page. no code changesGravatar Joey Hess2012-10-28
|
* deal with mtl/monads-tf conflictGravatar Joey Hess2012-10-24
| | | | | | I had been using -ignore-package monads-tf to deal with this, but the XMPP library uses monads-tf, so that also ignores it. Instead, use PackageImports to force use of mtl in my own code.
* add ConfigMonitor threadGravatar Joey Hess2012-10-20
| | | | | | | | | | | | | | | | | | | | Monitors git-annex branch for changes, which are noticed by the Merger thread whenever the branch ref is changed (either due to an incoming push, or a local change), and refreshes cached config values for modified config files. Rate limited to run no more often than once per minute. This is important because frequent git-annex branch changes happen when files are being added, or transferred, etc. A primary use case is that, when preferred content changes are made, and get pushed to remotes, the remotes start honoring those settings. Other use cases include propigating repository description and trust changes to remotes, and learning when a remote has added a new special remote, so the webapp can present the GUI to enable that special remote locally. Also added a uuid.log cache. All other config files already had caches.
* Preferred content path matching bugfix.Gravatar Joey Hess2012-10-17
| | | | | | | When in a subdir, both the normal filepath, and the filepath relative to the top of the git repo are needed for matching. The former for key lookup, and the latter for include/exclude to match against. Previously, key lookup didn't work in this situation.
* add AssumeNotPresent parameter to limitsGravatar Joey Hess2012-10-05
| | | | | | | | | | | | | Solves the issue with preferred content expressions and dropping that I mentioned yesterday. My solution was to add a parameter to specify a set of repositories where content should be assumed not to be present. When deciding whether to drop, it can put the current repository in, and then if the expression fails to match, the content can be dropped. Using yesterday's example "(not copies=trusted:2) and (not in=usbdrive)", when the local repo is one of the 2 trusted copies, the drop check will see only 1 trusted copy, so the expression matches, and so the content will not be dropped.
* added preferred-content log, and allow editing it with vicfgGravatar Joey Hess2012-10-04
| | | | | | | | | | | | | | This includes a full parser for the boolean expressions in the log, that compiles them into Matchers. Those matchers are not used yet. A complication is that matching against an expression should never crash git-annex with an error. Instead, vicfg checks that the expressions parse. If a bad expression (or an expression understood by some future git-annex version) gets into the log, it'll be ignored. Most of the code in Limit couldn't fail anyway, but I did have to make limitCopies check its parameter first, and return an error if it's bad, rather than erroring at runtime.
* group, ungroup: New commands to indicate groups of repositories.Gravatar Joey Hess2012-10-01
|
* pointlessnessGravatar Joey Hess2012-06-29
|
* run event handlers all in the same Annex monadGravatar Joey Hess2012-06-04
| | | | | | | | | | | | | | | | Uses a MVar again, as there seems no other way to thread the state through inotify events. This is a rather unsatisfactory result. I had wanted to run them in the same monad so that the git queue could be used to coleasce git commands and speed things up. But, that led to fragility: If several files are added, and one is removed before queue flush, git add will fail to add any of them. So, the queue is still explicitly flushed after each add for now. TODO: Investigate using git add --ignore-errors. This would need to be done in Command.Add. And, git add still exits nonzero with it, so would need to avoid crashing on queue flush.
* Add support for core.worktree, and fix support for GIT_WORK_TREE and GIT_DIR.Gravatar Joey Hess2012-05-18
| | | | | | | The environment needs to override git-config. Changed when git config is read, and avoid rereading it once it's been read. chdir for both worktree settings.
* fix test suite buildGravatar Joey Hess2012-04-30
|
* Added shared cipher mode to encryptable special remotes.Gravatar Joey Hess2012-04-29
| | | | | | This option avoids gpg key distribution, at the expense of flexability, and with the requirement that all clones of the git repository be equally trusted.
* display "Recording state in git..." when staging the journalGravatar Joey Hess2012-04-27
| | | | | | | | A bit tricky to avoid printing it twice in a row when there are queued git commands to run and journal to stage. Added a generic way to run an action that may output multiple side messages, with only the first displayed.
* cache parsed core.sharedrepositoryGravatar Joey Hess2012-04-21
|
* do a cleanup commit after moving data from or to a git remoteGravatar Joey Hess2012-02-25
| | | | | | | | Added Annex.cleanup, which is a general purpose interface for adding actions to run at the end. Remotes with the old git-annex-shell will commit every time, and have no commit command, so hide stderr when running the commit command.
* Added a annex.queuesize settingGravatar Joey Hess2012-02-15
| | | | | | | | | | useful when adding hundreds of thousands of files on a system with plenty of memory. git add gets quite slow in such a large repository, so if the system has more than the ~32 mb of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.
* rework git check-attr interfaceGravatar Joey Hess2012-02-13
| | | | | | | | | | | | | | | Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file. The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that cad8824852aa0623dc41eac02a9e2bae47d88ec4 was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones. The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplfying the types for several other Seek methods that had a Backend tupled in.
* switch to the strict state monadGravatar Joey Hess2012-01-29
| | | | | | | | | | I had not realized what a memory leak the lazy state monad could be, although I have not seen much evidence of actual leaking in git-annex. However, if running git-annex on a great many files, this could matter. The additional Utility.State.changeState adds even more strictness, avoiding a problem I saw in github-backup where repeatedly modifying state built up a huge pile of thunks.