summaryrefslogtreecommitdiff
path: root/Annex
Commit message (Collapse)AuthorAge
* support .git/annex on a different disk than the rest of the repoGravatar Joey Hess2011-11-28
| | | | | | | | | | | | | | | | | | The only fully supported thing is to have the main repository on one disk, and .git/annex on another. Only commands that move data in/out of the annex will need to copy it across devices. There is only partial support for putting arbitrary subdirectories of .git/annex on different devices. For one thing, but this can require more copies to be done. For example, when .git/annex/tmp is on one device, and .git/annex/journal on another, every journal write involves a call to mv(1). Also, there are a few places that make hard links between various subdirectories of .git/annex with createLink, that are not handled. In the common case without cross-device, the new moveFile is actually faster than renameFile, avoiding an unncessary stat to check that a file (not a directory) is being moved. Of course if a cross-device move is needed, it is as slow as mv(1) of the data.
* tweaksGravatar Joey Hess2011-11-19
|
* tweakGravatar Joey Hess2011-11-19
|
* ensure branch exists before trying to update itGravatar Joey Hess2011-11-16
| | | | | | The branch may not exist, if .git/annex has been copied over from another repo (or a corrupted repo). I suppose it could also have gotten deleted somehow. Without this, there is a confusing failure.
* improve type signatures with a Ref newtypeGravatar Joey Hess2011-11-16
| | | | | | | | | | | In git, a Ref can be a Sha, or a Branch, or a Tag. I added type aliases for those. Note that this does not prevent mixing up of eg, refs and branches at the type level. Since git really doesn't care, except rare cases like git update-ref, or git tag -d, that seems ok for now. There's also a tree-ish, but let's just use Ref for it. A given Sha or Ref may or may not be a tree-ish, depending on the object type, so there seems no point in trying to represent it at the type level.
* better nameGravatar Joey Hess2011-11-16
|
* merge: Now runs in constant space.Gravatar Joey Hess2011-11-15
| | | | | | | | | | | | | | | Before, a merge was first calculated, by running various actions that called git and built up a list of lines, which were at the end sent to git update-index. This necessarily used space proportional to the size of the diff between the trees being merged. Now, lines are streamed into git update-index from each of the actions in turn. Runtime size of git-annex merge when merging 50000 location log files drops from around 100 mb to a constant 4 mb. Presumably it runs quite a lot faster, too.
* Optimised union merging; now only runs git cat-file once.Gravatar Joey Hess2011-11-12
|
* avoid unnecessary auto-merge when only changing a file in the branch.Gravatar Joey Hess2011-11-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | Avoids doing auto-merging in commands that don't need fully current information from the git-annex branch. In particular, git annex add no longer needs to auto-merge. Affected commands: Anything that doesn't look up data from the branch, but does write a change to it. It might seem counterintuitive that we can change a value without first making sure we have the current value. This optimisation works because these two sequences are equivilant: 1. pull from remote 2. union merge 3. read file from branch 4. modify file and write to branch vs. 1. read file from branch 2. modify file and write to branch 3. pull from remote 4. union merge After either sequence, the git-annex branch contains the same logical content for the modified file. (Possibly with lines in a different order or additional old lines of course).
* merge: Improve commit messages to mention what was merged.Gravatar Joey Hess2011-11-12
|
* lintGravatar Joey Hess2011-11-11
|
* factored out some useful error catching methodsGravatar Joey Hess2011-11-10
|
* better message when content is lockedGravatar Joey Hess2011-11-10
|
* exclusive locks, ughGravatar Joey Hess2011-11-09
|
* content lockingGravatar Joey Hess2011-11-09
| | | | | I've tested that this solves the cyclic drop problem. Have not looked at cyclic move, etc.
* safer inannex checkingGravatar Joey Hess2011-11-09
| | | | | | | | git-annex-shell inannex now returns always 0, 1, or 100 (the last when it's unclear if content is currently in the index due to it currently being moved or dropped). (Actual locking code still not yet written.)
* reorg to allow taking content lockGravatar Joey Hess2011-11-09
| | | | | | | The lock will only persist during the perform stage, so the content must be removed from the annex then, rather than in the cleanup stage. (No lock is actually taken yet.)
* cleanupGravatar Joey Hess2011-11-09
|
* reorder repo parameters lastGravatar Joey Hess2011-11-08
| | | | | | | | | | | | | Many functions took the repo as their first parameter. Changing it consistently to be the last parameter allows doing some useful things with currying, that reduce boilerplate. In particular, g <- gitRepo is almost never needed now, instead use inRepo to run an IO action in the repo, and fromRepo to get a value from the repo. This also provides more opportunities to use monadic and applicative combinators.
* clean up read/show abuseGravatar Joey Hess2011-11-08
| | | | | | | Avoid ever using read to parse a non-haskell formatted input string. show :: Key is arguably still show abuse, but displaying Keys as filenames is just too useful to give up.
* add a UUID typeGravatar Joey Hess2011-11-07
| | | | Should have done this a long time ago.
* optimizationGravatar Joey Hess2011-11-06
| | | | | The last commit added some git-log calls to a merge. This removes some, by only merging branches that have unique refs.
* merge: Use fast-forward merges when possible.Gravatar Joey Hess2011-11-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Thanks Valentin Haenel for a test case showing how non-fast-forward merges could result in an ongoing pull/merge/push cycle. While the git-annex branch is fast-forwarded, git-annex's index file is still updated using the union merge strategy as before. There's no other way to update the index that would be any faster. It is possible that a union merge and a fast-forward result in different file contents: Files should have the same lines, but a union merge may change their order. If this happens, the next commit made to the git-annex branch will have some unnecessary changes to line orders, but the consistency of data should be preserved. Note that when the journal contains changes, a fast-forward is never attempted, which is fine, because committing those changes would be vanishingly unlikely to leave the git-annex branch at a commit that already exists in one of the remotes. The real difficulty is handling the case where multiple remotes have all changed. git-annex does find the best (ie, newest) one and fast forwards to it. If the remotes are diverged, no fast-forward is done at all. It would be possible to pick one, fast forward to it, and make a merge commit to the rest, I see no benefit to adding that complexity. Determining the best of N changed remotes requires N*2+1 calls to git-log, but these are fast git-log calls, and N is typically small. Also, typically some or all of the remote refs will be the same, and git-log is not called to compare those. In the real world I expect this will almost always add only 1 git-log call to the merge process. (Which already makes N anyway.)
* ensure directory exists when locking journalGravatar Joey Hess2011-11-02
| | | | | Fixes git annex init in a bare repository that already has a git-annex branch.
* cleanupGravatar Joey Hess2011-10-27
|
* Sped up some operations on remotes that are on the same host.Gravatar Joey Hess2011-10-27
| | | | | | | | | | | | | | Specifically, disabled trying to update the git-annex branch on the remote, since that data is never used by operations that act on such remotes. Also, when copying content to such a remote, skip committing the presence information changes to its git-annex branch. Leaving it in the journal there is ok: Any command run on the remote that needs the info will flush the journal. This may partially solve this bug: http://git-annex.branchable.com/bugs/fails_to_handle_lot_of_files/ Although I still see unreaped git processes piling up when doing a copy --to.
* clean Annex stuff out of Utility/Gravatar Joey Hess2011-10-16
|
* break out non-log stuff to separate moduleGravatar Joey Hess2011-10-15
|
* reorganize log modulesGravatar Joey Hess2011-10-15
| | | | no code changes
* minor syntax changesGravatar Joey Hess2011-10-11
|
* tweaksGravatar Joey Hess2011-10-10
|
* fix a raceGravatar Joey Hess2011-10-09
| | | | | | | | | Another process may stage journalled files before the lock is taken, so need to get the list of journalled files afterwards. It's unfortunate this means getting the directory contents twice, but it seems better to do that than sometimes take the lock unnecessarily.
* better layoutGravatar Joey Hess2011-10-07
| | | | | And a theoretical fix to branchstate cache invalidation, but not a bug that could actually happen.
* performance fixGravatar Joey Hess2011-10-07
| | | | | It was checking if it needed to merge on every branch access, fix it to only check once.
* avoid merging multiple branches that point to the same treeGravatar Joey Hess2011-10-07
| | | | avoids git warning "error: duplicate parent xxx ignored"
* faster union merge of multiple branches into indexGravatar Joey Hess2011-10-07
| | | | only write index once
* renameGravatar Joey Hess2011-10-05
|
* renameGravatar Joey Hess2011-10-04
|
* factor out Annex exception handling moduleGravatar Joey Hess2011-10-04