path: root/Annex
Commit message (Author, Age)
* status: Fixed to run in nearly constant space. (Joey Hess, 2012-03-11)
|   Before, it leaked space due to caching lists of keys. Now all necessary data about keys is calculated as they stream in.
|
|   The "nearly constant" is due to getKeysPresent, which builds up a lot of [] thunks as it traverses .git/annex/objects/. Will deal with it later.
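The idea of computing per-key data as keys stream in, rather than caching a list of them, can be sketched as follows. This is an illustration in Python, not git-annex's Haskell code; `stream_keys`, `summarize` and `KeyStats` are hypothetical names, and the key sizes stand in for whatever data the real status command gathers.

```python
from typing import Iterable, Iterator, NamedTuple


class KeyStats(NamedTuple):
    count: int
    total_size: int


def stream_keys(sizes: Iterable[int]) -> Iterator[int]:
    """Hypothetical stand-in for a traversal that yields one key size at a time."""
    for size in sizes:
        yield size


def summarize(keys: Iterable[int]) -> KeyStats:
    """Fold over the stream, keeping only running totals and never a list of keys."""
    count = 0
    total = 0
    for size in keys:
        count += 1
        total += size
    return KeyStats(count, total)
```

Because `summarize` holds only two integers, its memory use does not grow with the number of keys, provided the producer is itself lazy.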
* syscall optimisation (Joey Hess, 2012-03-06)
|
* configure: Check if ssh connection caching is supported by the installed version of ssh and default annex.sshcaching accordingly. (Joey Hess, 2012-02-25)
* improve alwayscommit=false mode (Joey Hess, 2012-02-25)
|   Now changes are staged into the branch's index, but not committed, which avoids growing a large journal. And sync and merge always explicitly commit, ensuring that even when they do nothing else, they commit the staged changes.
|
|   Added a flag file to indicate that the branch's journal contains uncommitted changes. (Could use git ls-files, but don't want to run that every time.)
|
|   In the future, this ability to have uncommitted changes staged in the journal might be used on remotes after a series of oneshot commands.
* add annex.alwayscommit option (Joey Hess, 2012-02-25)
|   To avoid commits of data to the git-annex branch after each command is run, set annex.alwayscommit=false. Its data will then be committed less frequently, when a merge or sync is done.
* Deal with NFS problem that caused a failure to remove a directory when removing content from the annex. (Joey Hess, 2012-02-24)
|   I was able to reproduce this on linux using the kernel's nfs server and mounting localhost:/. Determined that removing the directory fails when the just-deleted file in it was locked.
|
|   Considered dropping the lock before removing the directory, but this would complicate parts of the code that should not need to worry about locking. So instead, ignore the failure to remove the directory in this case.
|
|   While I was at it, made it attempt to remove both levels of hash directories, in case they're empty.
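The approach described above (delete the file, then try to prune the parent directories and tolerate failure) might look roughly like this. It is a Python sketch under assumptions: `remove_content` is a hypothetical name, and the layout `objects/<hash1>/<hash2>/<file>` is a simplification chosen for illustration, not a statement of git-annex's actual directory structure.

```python
import os


def remove_content(path: str) -> None:
    """Delete an object file, then try to prune its two parent hash directories."""
    os.remove(path)
    parent = os.path.dirname(path)
    for directory in (parent, os.path.dirname(parent)):
        try:
            os.rmdir(directory)
        except OSError:
            # The directory is not empty, or (as in the NFS case) cannot be
            # removed right now. Ignore the failure and stop pruning.
            break
```

Ignoring the error keeps lock handling out of this code path, which is the trade-off the commit message settles on.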
* hlint (Joey Hess, 2012-02-16)
|
* Added an annex.queuesize setting (Joey Hess, 2012-02-15)
|   useful when adding hundreds of thousands of files on a system with plenty of memory.
|
|   git add gets quite slow in such a large repository, so if the system has more than the ~32 MB of memory the queue can use by default, it's a useful optimisation to increase the queue size, in order to decrease the number of times git add is run.
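Why a larger queue means fewer git add runs can be seen in a small model. This is a Python sketch with hypothetical names (`CommandQueue`, `run`, `max_size`); it counts queued paths rather than memory, and it does not reflect how git-annex sizes or flushes its real queue.

```python
from typing import Callable, List


class CommandQueue:
    """Batch file names and hand them to `run` whenever `max_size` is reached."""

    def __init__(self, run: Callable[[List[str]], None], max_size: int) -> None:
        self.run = run
        self.max_size = max_size
        self.pending: List[str] = []

    def add(self, path: str) -> None:
        self.pending.append(path)
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self) -> None:
        """Run the command on whatever is queued, then start a fresh batch."""
        if self.pending:
            self.run(self.pending)
            self.pending = []
```

With seven files, a `max_size` of 3 invokes `run` three times, while a `max_size` of 10 invokes it once at the final flush.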
* tweak (Joey Hess, 2012-02-14)
|
* fix memory leak when staging the journal (Joey Hess, 2012-02-14)
|   The list of files had to be retained until the end so it could be deleted. Also, a list of update-index lines was generated and only then fed into it. Now everything streams in constant space.
* Fixed a memory leak due to excessive strictness when committing journal files. (Joey Hess, 2012-02-14)
|   When hashing the files, the entire list of shas was read strictly. That was entirely unnecessary, since there's a cleanup action run after they're consumed.
* rework git check-attr interface (Joey Hess, 2012-02-13)
|   Now gitattributes are looked up, efficiently, in only the places that really need them, using the same approach used for cat-file.
|
|   The old CheckAttr code seemed very fragile, in the way it streamed files through git check-attr. I actually found that cad8824852aa0623dc41eac02a9e2bae47d88ec4 was still deadlocking with ghc 7.4, at the end of adding a lot of files. This should fix that problem, and avoid future ones.
|
|   The best part is that this removes withAttrFilesInGit and withNumCopies, which were complicated Seek methods, as well as simplifying the types for several other Seek methods that had a Backend tupled in.
* Fix teardown of stale cached ssh connections. (Joey Hess, 2012-02-09)
|
* IO exception rework (Joey Hess, 2012-02-03)
|   ghc 7.4 complains about use of System.IO.Error to catch exceptions. Ok, use Control.Exception, with variants specialized to only catch IO exceptions.
* Avoid repeated location log commits when a remote is receiving files. (Joey Hess, 2012-01-28)
|   Done by adding a oneshot mode, in which location log changes are written to the journal, but not committed. Taking advantage of git-annex's existing ability to recover in this situation.
|
|   This is used by git-annex-shell and other places where changes are made to a remote's location log.
* rename readMaybe to readish (Joey Hess, 2012-01-23)
|   a stricter (but also partial) readMaybe is getting added to base
* order user provided params after connection caching params (Joey Hess, 2012-01-20)
|   So the user can override them.
* add annex.sshcaching config setting (Joey Hess, 2012-01-20)
|
* ssh connection caching (Joey Hess, 2012-01-20)
|   Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits.
|
|   Note: The rsync special remote does not yet participate in the ssh connection caching.
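The ssh connection caching commits above rely on OpenSSH's connection sharing. A minimal Python sketch of assembling such a command line follows, assuming the standard OpenSSH options `-S`, `ControlMaster` and `ControlPersist`. The function name `ssh_params`, the socket path layout and the exact option set are assumptions for illustration, not git-annex's actual parameters. User-provided options are placed after the caching options, mirroring the ordering commit.

```python
from typing import List, Optional


def ssh_params(host: str, socket_dir: Optional[str], user_opts: List[str]) -> List[str]:
    """Build ssh arguments; caching options come first, user options after them."""
    caching: List[str] = []
    if socket_dir is not None:
        # One control socket per host, so every ssh call to that host can
        # share a single connection (hypothetical path layout).
        caching = [
            "-S", f"{socket_dir}/{host}",
            "-o", "ControlMaster=auto",
            "-o", "ControlPersist=yes",
        ]
    return caching + user_opts + [host]
```

Passing `socket_dir=None` models a setup where caching is disabled, for example when the installed ssh does not support it.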
* fsck --from remote --fast (Joey Hess, 2012-01-20)
|   Avoids expensive file transfers, at the expense of checking file size and/or contents.
|
|   Required some reworking of the remote code.
* optimise fsck --from normal git remotes (Joey Hess, 2012-01-19)
|   For a local git remote, can symlink the file. For a git remote using rsync, can preseed any local content.
|
|   There are a few reasons to use fsck --from on a normal git remote. One is if it's using gitosis or similar, and you don't have shell access to run git annex locally. Another reason could be if you just want to fsck certain files of a bare remote.
* add a configure check for StatFS (Joey Hess, 2012-01-15)
|   This way, the build log will indicate whether StatFS can be relied on.
|
|   I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense.
|
|   Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain.
|
|   Also, the Makefile was missing the sources target used when building with cabal.
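The "return Nothing rather than Just nonsense" behaviour, and the complaint when a disk reserve is configured but free space is unknown, can be modelled like this. It is a Python sketch, not the Haskell StatFS code; `free_bytes` and `reserve_warning` are hypothetical names, and `None` plays the role of `Nothing`.

```python
import os
from typing import Optional


def free_bytes(path: str) -> Optional[int]:
    """Return free space available at `path`, or None when it cannot be determined."""
    try:
        st = os.statvfs(path)
    except (OSError, AttributeError):
        # OSError: the call failed; AttributeError: no statvfs on this platform.
        return None
    return st.f_bavail * st.f_frsize


def reserve_warning(path: str, reserve: int) -> Optional[str]:
    """Complain if a reserve is configured but free space is unknown."""
    if reserve > 0 and free_bytes(path) is None:
        return "disk reserve is set, but free space cannot be determined here"
    return None
```

Returning an explicit "unknown" lets callers choose between warning and proceeding, instead of acting on a bogus number.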
* tweak (Joey Hess, 2012-01-14)
|
* avoid multiple unnecessary stats of the index file (Joey Hess, 2012-01-14)
|   Up to one per file processed.
* tweaks (Joey Hess, 2012-01-11)
|
* reorg (Joey Hess, 2012-01-10)
|
* log: New command that displays the location log for file, showing each repository they were added to and removed from. (Joey Hess, 2012-01-06)
|   This needs to run git log on the location log files to get at all past versions of the file, which tends to be a bit slow. It would be possible to make a version optimised for showing the location logs for every key. That would only need to run git log once, so would be faster, but it would need to process an enormous amount of data, so would not speed up the individual file case.
|
|   In the future it would be nice to support log --format. log --json also doesn't work right yet.
* Added remote.name.annex-web-options configuration setting, which can be used to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same). (Joey Hess, 2012-01-02)
* Merge branch 'master' into autosync (Joey Hess, 2011-12-30)
|\
* | refactor (Joey Hess, 2011-12-30)
| |
* | add forceUpdate (Joey Hess, 2011-12-30)
| |   This code is picked from my tweak-fetch branch, which already did the needed refactoring.
| * Merge branch 'new-monad-control' (Joey Hess, 2011-12-24)
|/|
* | use Common in a few more modules (Joey Hess, 2011-12-20)
| |
* | split out Git/Command.hs (Joey Hess, 2011-12-14)
| |
* | split more stuff out of Git.hs (Joey Hess, 2011-12-14)
| |
* | move commit to Git.Branch (Joey Hess, 2011-12-13)
| |
* | split out three modules from Git (Joey Hess, 2011-12-13)
| |   Constructors and configuration make sense in separate modules. A separate Git.Types is needed to avoid cycles.
* | avoid closing pipe before all the shas are read from it (Joey Hess, 2011-12-12)
| |   Could have just used hGetContentsStrict here, but that would require storing all the shas in memory. Since this is called at the end of a git-annex run, it may have created a *lot* of shas, so I avoid that memory use and stream them out like before.
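Streaming lines out of a pipe while making sure it is closed only once the consumer is done can be sketched as below. This is a Python illustration of the general technique, not the Haskell code from the commit; `stream_lines` and `DEMO_CMD` are hypothetical, and the demo command merely prints three lines in place of real shas.

```python
import subprocess
import sys
from typing import Iterator, List

# Stand-in for a command that prints one sha per line (assumption for the demo).
DEMO_CMD: List[str] = [sys.executable, "-c", "for i in range(3): print(i)"]


def stream_lines(cmd: List[str]) -> Iterator[str]:
    """Yield output lines one at a time instead of reading them all into memory."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    assert proc.stdout is not None
    try:
        for line in proc.stdout:
            yield line.rstrip("\n")
    finally:
        # Runs once the consumer has finished (or abandoned) the stream,
        # so the pipe is never closed while lines are still being read.
        proc.stdout.close()
        proc.wait()
```

A caller can iterate with `for sha in stream_lines(DEMO_CMD): ...` and process each line as it arrives.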
* | broke out Git/HashObject.hs (Joey Hess, 2011-12-12)
| |
* | broke out Git/Branch.hs and reorganized (Joey Hess, 2011-12-12)
| |
* | split out Git/Ref.hs (Joey Hess, 2011-12-12)
| |
* | split out Annex/Journal.hs (Joey Hess, 2011-12-12)
| |
* | split out Annex/BranchState.hs (Joey Hess, 2011-12-12)
| |
* | update comment (Joey Hess, 2011-12-12)
| |
* | optimisation (Joey Hess, 2011-12-12)
| |   avoids a redundant call to git show-ref
* | optimisation (Joey Hess, 2011-12-12)
| |   avoids a useless diff from git-annex..refs/heads/git-annex
* | cleanup (Joey Hess, 2011-12-12)
| |
* | avoid redundant call to updateIndex (Joey Hess, 2011-12-11)
| |   commitBranch calls updateIndex
* | detect and recover from branch push/commit race (Joey Hess, 2011-12-11)
| |   Dealing with a race without using locking is exceedingly difficult and tricky. Fully tested, I hope.
| |
| |   There are three places left where the branch can be updated, that are not covered by the race recovery code. Let's prove they're all immune to the race:
| |
| |   1. tryFastForwardTo checks to see if a fast-forward can be done, and then does git-update-ref on the branch to fast-forward it.
| |
| |      If a push comes in before the check, then either no fast-forward will be done (ok), or the push set the branch to a ref that can still be fast-forwarded (also ok).
| |
| |      If a push comes in after the check, the git-update-ref will undo the ref change made by the push. It's as if the push did not come in, and the next git-push will see this, and try to re-do it. (acceptable)
| |
| |   2. When creating the branch for the very first time, an empty index is created, and a commit of it made to the branch. The commit's ref is recorded as the current state of the index. If a push came in during that, it will be noticed the next time a commit is made to the branch, since the branch will have changed. (ok)
| |
| |   3. Creating the branch from an existing remote branch involves making the branch, and then getting its ref, and recording that the index reflects that ref.
| |
| |      If a push creates the branch first, git-branch will fail (ok).
| |
| |      If the branch is created and a racing push is then able to change it (highly unlikely!) we're still ok, because it first records the ref into the index.lck, and then updates the index. The race can cause the index.lck to have the old branch ref, while the index has the newly pushed branch merged into it, but that only results in an unnecessary update of the index file later on.
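The detection idea that runs through the cases above (record which ref the index reflects, and notice at commit time that the branch has moved) can be modelled in a few lines. This Python sketch is a heavily simplified, single-threaded model built on assumptions: `BranchState`, `commit_with_recovery` and `merge_into_index` are hypothetical names, and the sketch does not itself close the window between the check and the update.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BranchState:
    branch_ref: str    # where the branch currently points
    recorded_ref: str  # ref the index was last recorded as reflecting


def commit_with_recovery(
    state: BranchState,
    merge_into_index: Callable[[str], None],
    new_ref: str,
) -> bool:
    """Commit `new_ref`; return True if a racing push was detected and merged first."""
    raced = state.branch_ref != state.recorded_ref
    if raced:
        # The branch moved since the index was recorded, so bring the
        # pushed changes into the index before committing on top of them.
        merge_into_index(state.branch_ref)
    state.branch_ref = new_ref
    state.recorded_ref = new_ref
    return raced
```

When the recorded ref is merely stale, the only cost in this model is a merge that changes nothing, which corresponds to the "unnecessary update of the index file" mentioned in case 3.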
| * Merge branch 'master' into new-monad-control (Joey Hess, 2011-12-11)
| |\
| |/
|/|
| |   Conflicts: git-annex.cabal