summaryrefslogtreecommitdiff
path: root/Annex
Commit message (Collapse)AuthorAge
* fix transfers of key with no associated fileGravatar Joey Hess2014-01-23
| | | | | | | | | | | | | | | | | Several places assumed this would not happen, and when the AssociatedFile was Nothing, did nothing. As part of this, preferred content checks pass the Key around. Note that checkMatcher is sometimes now called with Just Key and Just File. It currently constructs a FileMatcher, ignoring the Key. However, if it constructed a FileKeyMatcher, which contained both, then it might be possible to speed up parts of Limit, which currently call the somewhat expensive lookupFileKey to get the Key. I have not made this optimisation yet, because I am not sure if the key is always the same. Will need some significant checking to satisfy myself that's the case..
* add "unused" preferred content expressionGravatar Joey Hess2014-01-22
| | | | | | | With a really nice optimisation that keeps it from having any overhead in normal operation! This commit was sponsored by Ulises Vitulli.
* benchmarked numcopies .gitattributes in preferred contentGravatar Joey Hess2014-01-21
| | | | | | | | | | | Checking .gitattributes adds a full minute to a git annex find looking for files that don't have enough copies. 2:25 increasts to 3:27. I feel this is too much of a slowdown to justify making it the default. So, exposed two versions of the preferred content expression, a slow one and a fast but approximate one. I'm using the approximate one in the default preferred content expressions to avoid slowing down the assistant.
* reorgGravatar Joey Hess2014-01-21
|
* numcopies cleanup, part 2Gravatar Joey Hess2014-01-21
| | | | This includes several bug fixes.
* reorganize numcopies code (no behavior changes)Gravatar Joey Hess2014-01-21
| | | | | | | Move stuff into Logs.NumCopies. Add a NumCopies newtype. Better names for various serialization classes that are specific to one thing or another.
* Add and use numcopiesneeded preferred content expression.Gravatar Joey Hess2014-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Add numcopiesneeded preferred content expression. * Client, transfer, incremental backup, and archive repositories now want to get content that does not yet have enough copies. This means the asssistant will make copies of files that don't yet meet the configured numcopies, even to places that would not normally want the file. For example, if numcopies is 4, and there are 2 client repos and 2 transfer repos, and 2 removable backup drives, the file will be sent to both transfer repos in order to make 4 copies. Once a removable drive get a copy of the file, it will be dropped from one transfer repo or the other (but not both). Another example, numcopies is 3 and there is a client that has a backup removable drive and two small archive repos. Normally once one of the small archives has a file, it will not be put into the other one. But, to satisfy numcopies, the assistant will duplicate it into the other small archive too, if the backup repo is not available to receive the file. I notice that these examples are fairly unlikely setups .. the old behavior was not too bad, but it's nice to finally have it really correct. .. Almost. I have skipped checking the annex.numcopies .gitattributes out of fear it will be too slow. This commit was sponsored by Florian Schlegel.
* global numcopies settingGravatar Joey Hess2014-01-20
| | | | | | | | | | | | | | | | | | | | | | | * numcopies: New command, sets global numcopies value that is seen by all clones of a repository. * The annex.numcopies git config setting is deprecated. Once the numcopies command is used to set the global number of copies, any annex.numcopies git configs will be ignored. * assistant: Make the prefs page set the global numcopies. This global numcopies setting is needed to let preferred content expressions operate on numcopies. It's also convenient, because typically if you want git-annex to preserve N copies of files in a repo, you want it to do that no matter which repo it's running in. Making it global avoids needing to warn the user about gotchas involving inconsistent annex.numcopies settings. (See changes to doc/numcopies.mdwn.) Added a new variety of git-annex branch log file, that holds only 1 value. Will probably be useful for other stuff later. This commit was sponsored by Nicolas Pouillard.
* much better command action handling for sync --contentGravatar Joey Hess2014-01-20
|
* fix inversion of control in CommandSeek (no behavior changes)Gravatar Joey Hess2014-01-20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | I've been disliking how the command seek actions were written for some time, with their inversion of control and ugly workarounds. The last straw to fix it was sync --content, which didn't fit the Annex [CommandStart] interface well at all. I have not yet made it take advantage of the changed interface though. The crucial change, and probably why I didn't do it this way from the beginning, is to make each CommandStart action be run with exceptions caught, and if it fails, increment a failure counter in annex state. So I finally remove the very first code I wrote for git-annex, which was before I had exception handling in the Annex monad, and so ran outside that monad, passing state explicitly as it ran each CommandStart action. This was a real slog from 1 to 5 am. Test suite passes. Memory usage is lower than before, sometimes by a couple of megabytes, and remains constant, even when running in a large repo, and even when repeatedly failing and incrementing the error counter. So no accidental laziness space leaks. Wall clock speed is identical, even in large repos. This commit was sponsored by an anonymous bitcoiner.
* sync --content: New option that makes the content of annexed files be ↵Gravatar Joey Hess2014-01-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | transferred. Similar to the assistant, this honors any configured preferred content expressions. I am not entirely happpy with the implementation. It would be nicer if the seek function returned a list of actions which included the individual file gets and copies and drops, rather than the current list of calls to syncContent. This would allow getting rid of the somewhat reundant display of "sync file [ok|failed]" after the get/put display. But, do that, withFilesInGit would need to somehow be able to construct such a mixed action list. And it would be less efficient than the current implementation, which is able to reuse several values between eg get and drop. Note that currently this does not try to satisfy numcopies when getting/putting files (numcopies are of course checked when dropping files!) This makes it like the assistant, and unlike get --auto and copy --auto, which do duplicate files when numcopies is not yet satisfied. I don't know if this is the right decision; it only seemed to make sense to have this parallel the assistant as far as possible to start with, since I know the assistant works. This commit was sponsored by Øyvind Andersen Holm.
* improve matcher data type to allow matching Keys, instead of just files (no ↵Gravatar Joey Hess2014-01-18
| | | | behavior changes)
* avoid needing a build-dep on hxt for Data.AssocListGravatar Joey Hess2014-01-14
|
* Fix a long-standing bug that could cause the wrong index file to be used ↵Gravatar Joey Hess2014-01-14
| | | | when committing to the git-annex branch, if GIT_INDEX_FILE is set in the environment. This typically resulted in git-annex branch log files being committed to the master branch and later showing up in the work tree. (These log files can be safely removed.)
* also check diskreserve for quvi downloadsGravatar Joey Hess2014-01-04
|
* addurl, importfeed: Honor annex.diskreserve as long as the size of the url ↵Gravatar Joey Hess2014-01-04
| | | | | | | | can be checked. This adds a http HEAD before the download is done. That was already the case when the assistant was running, and it seems worth it to avoid filling up the whole disk, like happened to my server today.
* add remote state logsGravatar Joey Hess2014-01-03
| | | | | | | | | | | | | | | | | | | | | | | | | This allows a remote to store a piece of arbitrary state associated with a key. This is needed to support Tahoe, where the file-cap is calculated from the data stored in it, and used to retrieve a key later. Glacier also would be much improved by using this. GETSTATE and SETSTATE are added to the external special remote protocol. Note that the state is left as-is even when a key is removed from a remote. It's up to the remote to decide when it wants to clear the state. The remote state log, $KEY.log.rmt, is a UUID-based log. However, rather than using the old UUID-based log format, I created a new variant of that format. The new varient is more space efficient (since it lacks the "timestamp=" hack, and easier to parse (and the parser doesn't mess with whitespace in the value), and avoids compatability cruft in the old one. This seemed worth cleaning up for these new files, since there could be a lot of them, while before UUID-based logs were only used for a few log files at the top of the git-annex branch. The transition code has also been updated to handle these new UUID-based logs. This commit was sponsored by Daniel Hofer.
* Auto-upgrade v3 indirect repos to v5 with no changes. This also fixes a ↵Gravatar Joey Hess2013-12-29
| | | | problem when a direct mode repo was somehow set to v3 rather than v4, and so the automatic direct mode upgrade to v5 was not done.
* Add plumbing-level lookupkey command.Gravatar Joey Hess2013-12-15
|
* Fix direct mode's handling when modifications to non-annexed files are ↵Gravatar Joey Hess2013-12-12
| | | | pulled from a remote. A bug prevented the files from being updated in the work tree, and this caused the modification to be reverted.
* format commentGravatar Joey Hess2013-12-12
|
* Avoid using git commit in direct mode, since in some situations it will read ↵Gravatar Joey Hess2013-12-01
| | | | | | | | | | | | the full contents of files in the tree. The assistant's commit code also always avoids git commit, for simplicity. Indirect mode sync still does a git commit -a to catch unstaged changes. Note that this means that direct mode sync no longer runs the pre-commit hook or any other hooks git commit might call. The git annex pre-commit hook action for direct mode is however explicitly run. (The assistant already ran git commit with hooks disabled, so no change there.)
* fix reversion in relative paths to local remotes of direct mode reposGravatar Joey Hess2013-11-26
| | | | | | 52ad9a1528bc51f65411aca263def97615367943 broke support for local remotes from direct mode repos, because the relative path was taken to be from the gitdir, rather than from the work tree.
* move programPath out of Config.Files to Annex.PathGravatar Joey Hess2013-11-24
| | | | | | | | This works around horribleness in the Mavericks cpp, which falls over on the #if when configure is running. Moving it avoids the file being built at that point. But it's also a location that makes sense..
* fsck distribution keyGravatar Joey Hess2013-11-23
|
* fix standalone build of this moduleGravatar Joey Hess2013-11-22
|
* Ensure that core.sharedrepository is honored when creating the .git/annex ↵Gravatar Joey Hess2013-11-18
| | | | directory.
* Ensure execute bit is set on directories when core.sharedrepsitory is set.Gravatar Joey Hess2013-11-18
|
* fix windows buildGravatar Joey Hess2013-11-18
|
* Direct mode .git/annex/objects directories are no longer left writableGravatar Joey Hess2013-11-15
| | | | | | | | | Because that allowed writing to symlinks of files that are not present, which followed the link and put bad content in an object location. fsck: Fix up .git/annex/object directory permissions. This commit was sponsored by an anonymous bitcoin donor.
* Fix direct mode merge bug when a direct mode file was deleted and replaced ↵Gravatar Joey Hess2013-11-15
| | | | with a directory. An ordering problem caused the directory to not get created in this case. Thanks to Tim for the test cases.
* add new status commandGravatar Joey Hess2013-11-07
| | | | | | | | | | | | | | | This works for both direct and indirect mode. It may need some performance tuning. Note that unlike git status, it only shows the status of the work tree, not the status of the index. So only one status letter, not two .. and since files that have been added and not yet committed do not differ between the work tree and the index, they are not shown. Might want to add display of the index vs the last commit eventually. This commit was sponsored by an unknown bitcoin contributor, whose contribution as been going up lately! ;)
* Merge branch 'master' into directguardGravatar Joey Hess2013-11-06
|\
| * typoGravatar Joey Hess2013-11-06
| |
| * Fix exception handling bug that could cause .git/annex/index to be used for ↵Gravatar Joey Hess2013-11-06
| | | | | | | | git commits outside the git-annex branch. Known to affect git-annex when used with the git shipped with Ubuntu 13.10.
* | work around lack of receive.denyCurrentBranch in direct modeGravatar Joey Hess2013-11-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that direct mode sets core.bare=true, git's normal prohibition about pushing into the currently checked out branch doesn't work. A simple fix for this would be an update hook which blocks the pushes.. but git hooks must be executable, and git-annex needs to be usable on eg, FAT, which lacks x bits. Instead, enabling direct mode switches the branch (eg master) to a special purpose branch (eg annex/direct/master). This branch is not pushed when syncing; instead any changes that git annex sync commits get written to master, and it's pushed (along with synced/master) to the remote. Note that initialization has been changed to always call setDirect, even if it's just setDirect False for indirect mode. This is needed because if the user has just cloned a direct mode repo, that nothing has synced with before, it may have no master branch, and only a annex/direct/master. Resulting in that branch being checked out locally too. Calling setDirect False for indirect mode moves back out of this branch, to a new master branch, and ensures that a manual "git push" doesn't push changes directly to the annex/direct/master of the remote. (It's possible that the user makes a commit w/o using git-annex and pushes it, but nothing I can do about that really.) This commit was sponsored by Jonathan Harrington.
* | v5 for direct mode, with automatic upgradeGravatar Joey Hess2013-11-05
| | | | | | | | | | This includes storing the current state of the HEAD ref, which git annex sync is going to need, but does not make sync use it.
* | refactored hook setupGravatar Joey Hess2013-11-05
|/
* add --want-get and --want-drop optionsGravatar Joey Hess2013-10-28
| | | | | New --want-get and --want-drop options which can be used to test preferred content settings. For example, "git annex find --in . --want-drop"
* refactorGravatar Joey Hess2013-10-28
|
* repair command: add handling of git-annex branch and indexGravatar Joey Hess2013-10-23
|
* git-recover-repository 1/2 doneGravatar Joey Hess2013-10-20
|
* update for DiffTree type change (which fixes assistant in subdir confusion bug)Gravatar Joey Hess2013-10-17
|
* ensure merge directory is empty before starting mergeGravatar Joey Hess2013-10-16
| | | | Don't want some past failed merge to lead to bad results, potentially.
* queue downloads of keys that fsck finds with bad contentGravatar Joey Hess2013-10-10
|
* run ssh in the directory with its socket when stoppingGravatar Joey Hess2013-10-06
| | | | | | | | | | | This guarantees that stopping an existing socket never fails. This might be the route out of the mess of needing to worry about socket lengths in general. However, it would need quite a lot of refactoring to make every place in git-annex that runs ssh run it with a cwd that was determined by the location of its connection caching socket. If this wasn't already such a mess, I'd consider even the thought of that API a bad idea..
* work around ssh brain-damangeGravatar Joey Hess2013-10-06
| | | | | | | | | | | | | The control socket path passed to ssh needs to be 17 characters shorter than the maximum unix domain socket length, because ssh appends stuff to it to make a temporary filename. Closes: #725512 Also, take the shorter of the relative and the absolute paths to the socket. Typically the relative path will be a lot shorter (unless deep inside a subdirectory of the repository), and so using it will avoid flirting with the maximum safe socket lenghts in more situations, and so lead to less breakage if all my attempts at fixing this are still buggy.
* Automatically and safely detect and recover from dangling ↵Gravatar Joey Hess2013-10-03
| | | | .git/annex/index.lock files, which would prevent git from committing to the git-annex branch, eg after a crash.
* rename confusing functionGravatar Joey Hess2013-10-03
| | | | | The index.lck file is not a lock file. Kept the historical name for now as changing it would be work.
* ensure that commitBranch is only called when the journal is lockedGravatar Joey Hess2013-10-03
| | | | | This is not strictly a requirement, since it does not actually update the journal. But it's a nice invariant to enforce.