summaryrefslogtreecommitdiff
path: root/Annex
Commit message (Collapse)AuthorAge
...
* add annex.sshcaching config settingGravatar Joey Hess2012-01-20
|
* ssh connection cachingGravatar Joey Hess2012-01-20
| | | | | | | | | | | Ssh connection caching is now enabled automatically by git-annex. Only one ssh connection is made to each host per git-annex run, which can speed some things up a lot, as well as avoiding repeated password prompts. Concurrent git-annex processes also share ssh connections. Cached ssh connections are shut down when git-annex exits. Note: The rsync special remote does not yet participate in the ssh connection caching.
* fsck --from remote --fastGravatar Joey Hess2012-01-20
| | | | | | | Avoids expensive file transfers, at the expense of checking file size and/or contents. Required some reworking of the remote code.
* optimise fsck --from normal git remotesGravatar Joey Hess2012-01-19
| | | | | | | | | | For a local git remote, can symlink the file. For a git remote using rsync, can preseed any local content. There are a few reasons to use fsck --from on a normal git remote. One is if it's using gitosis or similar, and you don't have shell access to run git annex locally. Another reason could be if you just want to fsck certian files of a bare remote.
* add a configure check for StatFSGravatar Joey Hess2012-01-15
| | | | | | | | | | | | This way, the build log will indicate whether StatFS can be relied on. I've tested all the failing architectures now, and on all of them, the StatFS code now returns Nothing, rather than Just nonsense. Also, if annex.diskreserve is set on a platform where StatFS is not working, git-annex will complain. Also, the Makefile was missing the sources target used when building with cabal.
* tweakGravatar Joey Hess2012-01-14
|
* avoid multiple unnecessary stats of the index fileGravatar Joey Hess2012-01-14
| | | | Up to one per file processed.
* tweaksGravatar Joey Hess2012-01-11
|
* reorgGravatar Joey Hess2012-01-10
|
* log: New command that displays the location log for file, showing each ↵Gravatar Joey Hess2012-01-06
| | | | | | | | | | | | | | | repository they were added to and removed from. This needs to run git log on the location log files to get at all past versions of the file, which tends to be a bit slow. It would be possible to make a version optimised for showing the location logs for every key. That would only need to run git log once, so would be faster, but it would need to process an enormous amount of data, so would not speed up the individual file case. In the future it would be nice to support log --format. log --json also doesn't work right yet.
* Added remote.name.annex-web-options configuration setting, which can be used ↵Gravatar Joey Hess2012-01-02
| | | | to provide parameters to whichever of wget or curl git-annex uses (depends on which is available, but most of their important options suitable for use here are the same).
* Merge branch 'master' into autosyncGravatar Joey Hess2011-12-30
|\
* | refactorGravatar Joey Hess2011-12-30
| |
* | add forceUpdateGravatar Joey Hess2011-12-30
| | | | | | | | | | This code is picked from my tweak-fetch branch, which already did the needed refactoring.
| * Merge branch 'new-monad-control'Gravatar Joey Hess2011-12-24
|/|
* | use Common in a few more modulesGravatar Joey Hess2011-12-20
| |
* | split out Git/Command.hsGravatar Joey Hess2011-12-14
| |
* | split more stuff out of Git.hsGravatar Joey Hess2011-12-14
| |
* | move commit to Git.BranchGravatar Joey Hess2011-12-13
| |
* | split out three modules from GitGravatar Joey Hess2011-12-13
| | | | | | | | | | Constructors and configuration make sense in separate modules. A separate Git.Types is needed to avoid cycles.
* | avoid closing pipe before all the shas are read from itGravatar Joey Hess2011-12-12
| | | | | | | | | | | | | | Could have just used hGetContentsStrict here, but that would require storing all the shas in memory. Since this is called at the end of a git-annex run, it may have created a *lot* of shas, so I avoid that memory use and stream them out like before.
* | broke out Git/HashObject.hsGravatar Joey Hess2011-12-12
| |
* | broke out Git/Branch.hs and reorganizedGravatar Joey Hess2011-12-12
| |
* | split out Git/Ref.hsGravatar Joey Hess2011-12-12
| |
* | split out Annex/Journal.hsGravatar Joey Hess2011-12-12
| |
* | split out Annex/BranchState.hsGravatar Joey Hess2011-12-12
| |
* | update commentGravatar Joey Hess2011-12-12
| |
* | optimisationGravatar Joey Hess2011-12-12
| | | | | | | | avoids a redundant call to git show-ref
* | optimisationGravatar Joey Hess2011-12-12
| | | | | | | | avoids a useless diff from git-annex..refs/heads/git-annex
* | cleanupGravatar Joey Hess2011-12-12
| |
* | avoid redundant call to updateIndexGravatar Joey Hess2011-12-11
| | | | | | | | commitBranch calls updateIndex
* | detect and recover from branch push/commit raceGravatar Joey Hess2011-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Dealing with a race without using locking is exceedingly difficult and tricky. Fully tested, I hope. There are three places left where the branch can be updated, that are not covered by the race recovery code. Let's prove they're all immune to the race: 1. tryFastForwardTo checks to see if a fast-forward can be done, and then does git-update-ref on the branch to fast-forward it. If a push comes in before the check, then either no fast-forward will be done (ok), or the push set the branch to a ref that can still be fast-forwarded (also ok) If a push comes in after the check, the git-update-ref will undo the ref change made by the push. It's as if the push did not come in, and the next git-push will see this, and try to re-do it. (acceptable) 2. When creating the branch for the very first time, an empty index is created, and a commit of it made to the branch. The commit's ref is recorded as the current state of the index. If a push came in during that, it will be noticed the next time a commit is made to the branch, since the branch will have changed. (ok) 3. Creating the branch from an existing remote branch involves making the branch, and then getting its ref, and recording that the index reflects that ref. If a push creates the branch first, git-branch will fail (ok). If the branch is created and a racing push is then able to change it (highly unlikely!) we're still ok, because it first records the ref into the index.lck, and then updating the index. The race can cause the index.lck to have the old branch ref, while the index has the newly pushed branch merged into it, but that only results in an unnecessary update of the index file later on.
| * Merge branch 'master' into new-monad-controlGravatar Joey Hess2011-12-11
| |\ | |/ |/| | | | | Conflicts: git-annex.cabal
* | optimize index updatingGravatar Joey Hess2011-12-11
| | | | | | | | | | | | | | | | | | | | | | The last branch ref that the index was updated to is stored in .git/annex/index.lck, and the index only updated when the current branch ref differs. (The .lck file should later be used for locking too.) Some more optimization is still needed, since there is some redundancy in calls to git show-ref.
* | slow, stupid, and safe index updatingGravatar Joey Hess2011-12-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Always merge the git-annex branch into .git/annex/index before making a commit from the index. This ensures that, when the branch has been changed in any way (by a push being received, or changes pulled directly into it, or even by the user checking it out, and committing a change), the index reflects those changes. This is much too slow; it needs to be optimised to only update the index when the branch has really changed, not every time. Also, there is an unhandled race, when a change is made to the branch right after the index gets updated. I left it in for now because it's unlikely and I didn't want to complicate things with additional locking yet.
* | move a file location to Locations.hsGravatar Joey Hess2011-12-11
| |
* | no need to show, it's a stringGravatar Joey Hess2011-12-10
| |
* | hslintGravatar Joey Hess2011-12-09
| |
| * adjust to build with monad-control-0.3Gravatar Joey Hess2011-12-05
|/ | | | | | | | | I had to, I hope temporarily, lose my nice Annex newtype, and use a type synonym. This because I cannot find a way to derive a MonadBaseControl instance of the Annex newtype. I've emailed Bas van Dijk in hope he can help get the newtype back. Otherwise appears to build & work.
* cleanupGravatar Joey Hess2011-11-30
|
* add support for using hashDirLower in addition to hashDirMixedGravatar Joey Hess2011-11-28
| | | | | | | | | | Supporting multiple directory hash types will allow converting to a different one, without a flag day. gitAnnexLocation now checks which of the possible locations have a file. This means more statting of files. Several places currently use gitAnnexLocation and immediately check if the returned file exists; those need to be optimised.
* support .git/annex on a different disk than the rest of the repoGravatar Joey Hess2011-11-28
| | | | | | | | | | | | | | | | | | The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, but this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unncessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.
* tweaksGravatar Joey Hess2011-11-19
|
* tweakGravatar Joey Hess2011-11-19
|
* ensure branch exists before trying to update itGravatar Joey Hess2011-11-16
| | | | | | The branch may not exist, if .git/annex has been copied over from another repo (or a corrupted repo). I suppose it could also have gotten deleted somehow. Without this, there is a confusing failure.
* improve type signatures with a Ref newtypeGravatar Joey Hess2011-11-16
| | | | | | | | | | | In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for those. Note that this does not prevent mixing up of eg, refs and branches at the type level. Since git really doesn't care, except rare cases like git update-ref, or git tag -d, that seems ok for now. There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref may or may not be a tree-ish, depending on the object type, so there seems no point in trying to represent it at the type level.
* better nameGravatar Joey Hess2011-11-16
|
* merge: Now runs in constant space.Gravatar Joey Hess2011-11-15
| | | | | | | | | | | | | | | Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.
* Optimised union merging; now only runs git cat-file once.Gravatar Joey Hess2011-11-12
|
* avoid unnecessary auto-merge when only changing a file in the branch.Gravatar Joey Hess2011-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivilant: 1. pull from remote 2. union merge 3. read file from branch 4. modify file and write to branch vs. 1. read file from branch 2. modify file and write to branch 3. pull from remote 4. union merge After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).