path: root/Annex
Commit message (author, date)
* Always use filesystem encoding for all file and handle reads and writes. (Joey Hess, 2016-12-24)
  This is a big scary change. I have convinced myself it should be safe. I hope!
* Revert ServerAliveInterval (Joey Hess, 2016-12-13)
  Revert the ServerAliveInterval change in 6.20161111, which caused problems with too many old versions of ssh and unusual ssh configurations. It should not have been needed anyway, since ssh is supposed to have TCPKeepAlive enabled by default.
* make tor hidden service work when directory watching is not available (Joey Hess, 2016-12-09)
  Avoid crashing when built w/o inotify.
* refactor ref change watching (Joey Hess, 2016-12-09)
  Added change notification to the P2P protocol.
  Switched to a TBChan so that a single long-running thread can be started, and serve perhaps intermittent requests for change notifications, without buffering all changes in memory.
  The P2P runner currently starts up a new thread each time it waits for a change, but that should allow later reusing a thread. Although each connection from a peer will still need a new watcher thread to run.
  The dependency on stm-chans is more or less free; some stuff in yesod uses it, so it was already indirectly pulled in when building with the webapp.
  This commit was sponsored by Francois Marier on Patreon.
* update progress logs in remotedaemon send/receive (Joey Hess, 2016-12-08)
* plumb associated files through P2P protocol for updating transfer logs (Joey Hess, 2016-12-02)
  ReadContent can't update the log, since it reads lazily. This part of the P2P monad will need to be rethought.
  Associated files are heavily sanitized when received from a peer; they could be an exploit vector.
  This commit was sponsored by Jochen Bartl on Patreon.
* implement p2p command (Joey Hess, 2016-11-30)
* Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. (Joey Hess, 2016-11-15)
  ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it.
  Notably, commands like git annex drop that failed due to, e.g., numcopies used to use error, and so had a backtrace.
  This commit was sponsored by Ethan Aubin.
* Make .git/annex/ssh.config file work with versions of ssh older than 7.3, which don't support Include. (Joey Hess, 2016-11-07)
  When used with an older version of ssh, any ServerAliveInterval in ~/.ssh/config will be overridden by .git/annex/ssh.config.
  This commit was sponsored by Josh Taylor on Patreon.
* Run ssh with ServerAliveInterval 60 (Joey Hess, 2016-10-26)
  So that stalled transfers will be noticed within about 3 minutes, even if TCPKeepAlive is disabled or doesn't work.
  Rather than setting it with -o, use -F with another config file, so that any settings in ~/.ssh/config or /etc/ssh/ssh_config override this.
* Improve ssh socket cleanup code to skip over the cruft that NFS sometimes puts in a directory when a file is being deleted. (Joey Hess, 2016-10-26)
* upgrade: Handle upgrade to v6 when the repository already contains v6 unlocked files whose content is already present. (Joey Hess, 2016-10-17)
  Closes https://github.com/datalad/datalad/issues/1020
  The use of runWriter in scanUnlockedFiles broke due to this change; it failed with "blocked indefinitely in mvar", because the database write handle was taken while linkFromAnnex needed to also write to it (to update the inode cache).
  So, switched to using a separate runWriter for each call to addAssociatedFileFast. A little less efficient, but not greatly; the writes should all still be cached.
* refactor (Joey Hess, 2016-10-17)
* lock: Fix edge cases where data loss could occur in v6 mode. (Joey Hess, 2016-10-17)
  In the case where the pointer file is in place, and not the content of the object, lock's performNew was called with filemodified=True, which caused it to try to repopulate the object from an unmodified associated file, of which there were none. So, the content of the object got thrown away incorrectly. This was the cause (although not the root cause) of data loss in https://github.com/datalad/datalad/issues/1020
  The same problem could also occur when the work tree file is modified, but the object is not, and lock is called with --force. Added a test case for this, since it's exercising the same code path and is easier to set up than the problem above.
  Note that this only occurred when the keys database did not have an inode cache recorded for the annex object. Normally, the annex object would be in there, but there are of course circumstances where the inode cache is out of sync with reality, since it's only a cache.
  Fixed by checking if the object is unmodified; if so we don't need to try to repopulate it. This does add an additional checksum to the unlock path, but it's already checksumming the worktree file in another case, so it doesn't slow it down overall.
  Further investigation found that a similar problem occurred when smudge --clean is called on a file and the inode cache is not populated. cleanOldKeys deleted the unmodified old object file in this case. This was also fixed by checking if the object is unmodified.
  In general, use of getInodeCaches and sameInodeCache is potentially dangerous if the inode cache has not gotten populated for some reason. Better to use isUnmodified. I briefly audited other places that check the inode cache, and did not see any immediate problems, but it would be easy to miss this kind of problem.
* Support using v3 repositories without upgrading them to v5. (Joey Hess, 2016-10-05)
  An easy change now that supportedVersions is a list. Since v3 and v5 are identical other than version number, just add v3 to the list.
  This commit was sponsored by andrea rota.
* When auto-upgrading a v3 remote, avoid upgrading to version 6, instead keep it at version 5. (Joey Hess, 2016-10-05)
  Fixes a bug introduced with v6 mode that I didn't notice until now. Probably not many v3 repos left out there, and upgrading them to v6 mode is not disastrous, only a little premature.
  This commit was sponsored by Riku Voipio.
* Avoid using a lot of memory when large objects are present in the git repository (Joey Hess, 2016-10-05)
  .. and have to be checked to see if they are a pointer to an annexed file.
  Cases where such memory use could occur included, but were not limited to:
  - git commit -a of a large unlocked file (in v5 mode)
  - git-annex adjust when a large file was checked into git directly
  Generally, any use of catKey was a potential problem. Fix by using git cat-file --batch-check to check size before catting. This adds another git batch process, which is included in the CatFileHandle for simplicity.
  There could be a performance impact anywhere catKey is used. It is particularly likely to affect adjusted branch generation speed, and operations on unlocked files in v6 mode. Hopefully, since --batch-check and --batch read the same data, disk buffering will avoid most of the overhead, leaving only the overhead of talking to the process over the pipe and whatever computation --batch-check needs to do.
  This commit was sponsored by Bruno BEAUFILS on Patreon.
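The size check described in the commit above can be sketched as follows. This is an illustrative Python sketch, not git-annex's Haskell code: it only parses the default `git cat-file --batch-check` output format (`<sha> <type> <size>`, or `<object> missing`), and the function names and the 8192-byte limit are assumptions made up for the example.

```python
from typing import Optional, Tuple

# Assumed threshold for the example; pointer files are small, so anything
# much larger cannot be one. The real limit used by git-annex may differ.
MAX_POINTER_SIZE = 8192


def parse_batch_check(line: str) -> Optional[Tuple[str, str, int]]:
    """Parse one line of default `git cat-file --batch-check` output.

    Returns (sha, type, size), or None when git reports the object missing.
    """
    line = line.rstrip("\n")
    if line.endswith(" missing"):
        return None
    sha, objtype, size = line.split(" ")
    return sha, objtype, int(size)


def worth_catting(line: str, limit: int = MAX_POINTER_SIZE) -> bool:
    """Decide from the size alone whether the object's content should be read."""
    info = parse_batch_check(line)
    return info is not None and info[1] == "blob" and info[2] <= limit
```

With a check like this, a multi-gigabyte blob is rejected from the one-line size report, and its content never has to be read into memory.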
* Optimisations to git-annex branch query and setting, avoiding repeated copies of the environment. (Joey Hess, 2016-09-29)
  Speeds up commands like "git-annex find --in remote" by over 50%.
  Profiling showed that adjustGitEnv was 21% of the time and 37% of the allocations of that command. It copied the environment each time with getEnvironment.
  The only repeated use of adjustGitEnv is in withIndexFile, which tends to be run at least once per file. So, it was optimised by keeping a cache of the environment, which can be reused.
  There could be other, better ways to optimise this. Maybe get the whole environment once at startup. But then it would have to be serialized back out each time a child process is run, so I doubt that would be a net win.
  It might be better to cache a version of the environment that is pre-modified to use .git-annex/index. But profiling doesn't show that modifying the environment is taking any significant time.
* followup (Joey Hess, 2016-09-29)
* Optimisations to time it takes git-annex to walk working tree and find files to work on. (Joey Hess, 2016-09-26)
  Sped up by around 18%.
  key2file and file2key were top cost centers according to profiling. The repeated use of replace was not efficient. This new approach is quite a lot more efficient.
  This commit was sponsored by Denis Dzyubenko on Patreon.
* fix bugs in handling of deep branches with sync and adjusted branches (Joey Hess, 2016-09-21)
  * sync: Previously, when run in a branch with a slash in its name, such as "foo/bar", the sync branch was "synced/bar". That conflicted with the sync branch used for branch "bar", so has been changed to "synced/foo/bar".
  * adjust: Previously, when adjusting a branch with a slash in its name, such as "foo/bar", the adjusted branch was "adjusted/bar(unlocked)". That conflicted with the adjusted branch used for branch "bar", so has been changed to "adjusted/foo/bar(unlocked)".
  * Also, running sync in an adjusted branch did not correctly sync changes back to the parent branch when it had a slash in its name. This bug has been fixed.
  Eliminate use of Git.Ref.under and Git.Ref.basename; using Git.Ref.underBase and Git.Ref.base makes everything handle deep branches correctly.
  Probably no one was adjusting deep branches, and v6 is still experimental anyway, so I'm not going to worry about the mess that was left by that bug.
  In the case of git-annex sync, using a fixed git-annex with an old unfixed one will mean they use different sync branches for a deep branch, and so they may stop syncing until the old one is upgraded. However, that's only a problem when syncing between repositories without going via a central bare repository. Added a warning about this to the CHANGELOG, but it's probably not going to affect many people at all.
  This commit was sponsored by Riku Voipio.
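The naming collision fixed in the commit above is easy to reproduce in a few lines. The following is a hedged Python sketch, not the Haskell in Git.Ref: `basename`, `base`, and the two `sync_branch_*` helpers are hypothetical stand-ins written for this example, and the assumption that the fixed code strips only a `refs/heads/` or `refs/remotes/` prefix is mine.

```python
def basename(ref: str) -> str:
    """Keep only the last path component (the old, lossy behaviour)."""
    return ref.rsplit("/", 1)[-1]


def base(ref: str) -> str:
    """Strip only a leading refs/heads/ or refs/remotes/ prefix (assumed),
    keeping every deeper component of the branch name."""
    for prefix in ("refs/heads/", "refs/remotes/"):
        if ref.startswith(prefix):
            return ref[len(prefix):]
    return ref


def sync_branch_old(ref: str) -> str:
    # "foo/bar" and "bar" both map to "synced/bar": a collision.
    return "synced/" + basename(ref)


def sync_branch_new(ref: str) -> str:
    # The full branch name survives, so deep branches stay distinct.
    return "synced/" + base(ref)
```

Under the old naming, `refs/heads/foo/bar` and `refs/heads/bar` share one sync branch; under the new naming they get `synced/foo/bar` and `synced/bar` respectively.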
* make --json-progress work for url downloads (Joey Hess, 2016-09-09)
* disentangle concurrency and message type (Joey Hess, 2016-09-09)
  This makes -Jn work with --json and --quiet, where before setting -Jn disabled those options.
  Concurrent json output is currently a mess though, since threads output chunks over top of one another.
* get -J: Download different files from different remotes when the remotes have the same costs. (Joey Hess, 2016-09-06)
  Only done in -J mode because only if there's concurrency can downloading from two remotes be faster. Without concurrency, it's likely the case that sequential downloads from the same remote are faster than switching back and forth between two remotes.
  There is some hairy MVar code here, but basically it just keeps the activeremotes MVar full except when deciding which remote to assign to a thread.
  Also affects gets by sync --content -J
  This commit was sponsored by Jochen Bartl.
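One way to picture the assignment step described above is the Python sketch below. It is not the MVar code from the commit: the `RemoteAssigner` class, its method names, and the "least busy among the cheapest remotes" rule are assumptions made for illustration, with a lock standing in for the MVar that is held only while a remote is being chosen.

```python
import threading
from collections import Counter
from typing import Iterable, Tuple


class RemoteAssigner:
    """Hand out remotes to worker threads, spreading load across
    remotes that share the lowest cost (hypothetical helper)."""

    def __init__(self, remotes: Iterable[Tuple[str, int]]):
        self._remotes = list(remotes)      # (name, cost) pairs
        self._active = Counter()           # name -> threads currently using it
        self._lock = threading.Lock()      # held only while deciding

    def acquire(self) -> str:
        with self._lock:
            lowest = min(cost for _, cost in self._remotes)
            candidates = [n for n, c in self._remotes if c == lowest]
            # Prefer the candidate with the fewest active transfers;
            # ties go to the remote listed first.
            chosen = min(candidates, key=lambda n: self._active[n])
            self._active[chosen] += 1
            return chosen

    def release(self, name: str) -> None:
        with self._lock:
            self._active[name] -= 1
```

A worker would call `acquire()` before a download and `release()` afterwards; with two equal-cost remotes and two workers, each worker ends up on a different remote, and a more expensive remote is never chosen.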
* remove TransferObserver (Joey Hess, 2016-08-03)
  unused after last commit
* Re-enable accumulating transfer failure log files for command-line actions (Joey Hess, 2016-08-03)
  This was disabled in commit 7ca8bf3321d1b62ea4e817e28914ed2fa56afe30, because only the assistant used them, and they were clutter. But, now --failed also uses them.
  Remove the failure log files after successful transfers. Should avoid most of the clutter problems.
  Commit 7ca8bf3321d1b62ea4e817e28914ed2fa56afe30 mentions a subtle behavior change, which has now been reverted:
  There is one behavior change from this. If glacier is being used, and a manual git annex get --from glacier fails because the file isn't available yet, the assistant will no longer later see that failed transfer file and retry the get.
* get, move, copy, mirror: Added --failed switch which retries failed copies/moves (Joey Hess, 2016-08-03)
  Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement.
  Noisy due to some refactoring into Types/
* Added metadata --batch option, which allows getting, setting, deleting, and modifying metadata for multiple files/keys. (Joey Hess, 2016-07-27)
* When built with uuid-1.3.12, generate more random UUIDs than before (Joey Hess, 2016-07-27)
  Use nextRandom to generate the random UUID, rather than using randomIO. This gets fixes for the following two bugs in the uuid library. However, this did not impact git-annex much, so a hard dependency has not been added on uuid-1.3.12.
  https://github.com/aslatter/uuid/issues/15 "v4 UUIDs are not that random"
  This doesn't greatly affect git-annex, because even with only 2^64 possible UUIDs, the chance that two git-annex repositories that are clones of the same git repo get the same UUID is minuscule. And, git-annex generates only one UUID per run, so predicting subsequent UUIDs is not a problem.
  https://github.com/aslatter/uuid/issues/16 "Remove Random instance for UUID, or mark it as deprecated"
  git-annex was using that instance; let's stop before it gets deprecated or removed.
* --branch, stage 2 (Joey Hess, 2016-07-20)
  Show branch:file that is being operated on.
  I had to make ActionItem a type and not a type class because withKeyOptions' passed two different types of values when using the type class, and I could not get the type checker to accept that.
* Avoid any access to keys database in v5 mode repositories, which are not supposed to use that database. (Joey Hess, 2016-07-19)
* Speed up startup time by caching the refs that have been merged into the git-annex branch. (Joey Hess, 2016-07-17)
  This can speed up git-annex commands by as much as a second, depending on the number of remotes.
* handle SomeAsyncException same as AsyncException (Joey Hess, 2016-06-20)
  This new class was added to base a while ago; I don't know what uses it, but it's intended to be an async exception, so make sure we don't catch it.
* fix build on windows (Joey Hess, 2016-06-13)
* v6: Fix bad merge in an adjusted branch that resulted in an empty tree. (Joey Hess, 2016-06-13)
* Make git clean filter preserve the backend that was used for a file. (Joey Hess, 2016-06-09)
* Fix bug in initialization of clone from a repo with an adjusted branch that had not been synced back to master. (Joey Hess, 2016-06-09)
  This bug caused broken tree objects to get built by a later git annex sync.
  This is a somewhat unlikely but not impossible situation, and the test suite's union_merge_regression test tickled it when it was run on FAT.
* also avoid crashing in most circumstances if unable to determine the username (Joey Hess, 2016-06-08)
  Mostly the username is only used for the git committer or other display purposes, and we can just fall back to a dummy value in these cases.
  The only remaining place where an error is thrown is when starting local pairing, which needs the username to be known.
* Fix bad automatic merge conflict resolution between an annexed file and a directory with the same name when in an adjusted branch. (Joey Hess, 2016-06-07)
  When running in an overlay work tree, all unchanged files show as deleted, so this code that stages deletions should not run.
* withAltRepo needs a separate queue of changes (Joey Hess, 2016-06-03)
  The queue could potentially contain changes from before withAltRepo, and get flushed inside the call, which would apply the changes to the modified repo.
  Or, changes could be queued in withAltRepo that were intended to affect the modified repo, but don't get flushed until later.
  I don't know of any cases where either happens, but better safe than sorry.
  Note that this affects withIndexFile, which is used in git-annex branch updates. So, it potentially makes things slower. It should not be by much; the overhead consists only of querying the current queue a couple of times, and potentially flushing changes queued within withAltRepo earlier, which could have maybe been bundled with other later changes.
  Notice in particular that the existing queue is not flushed when calling withAltRepo. So, e.g., when git annex add needs to stage files in the index, it will still bundle them together efficiently.
* Fix initialization of a bare clone of a repo that has an adjusted branch checked out. (Joey Hess, 2016-06-02)
* refactor isBareRepo (Joey Hess, 2016-06-02)
* better avoid switching to direct mode in clone of adjusted branch repo (Joey Hess, 2016-06-02)
* avoid switching to direct mode in clone of adjusted branch repo (Joey Hess, 2016-06-02)
* Automatically enable v6 mode when initializing in a clone from a repo that has an adjusted branch checked out. (Joey Hess, 2016-06-02)
  The clone also has the adjusted branch checked out, so it needs to be initialized to a version that supports that.
* sync --content: Fix bug that caused transfers of files to be made to a git remote that does not have a UUID. (Joey Hess, 2016-06-02)
  This particularly impacted clones from gcrypt repositories.
  Added guard in Annex.Transfer to prevent this problem at a deeper level.
  I'm unhappy with NoUUID, but having Maybe UUID instead wouldn't help either if nothing checked that there was a UUID. Since there legitimately need to be Remotes that do not have a UUID, I can't see a way to fix it at the type level, short of making there be two separate types of Remotes.
* minor typo fixes throughout (Yaroslav Halchenko, 2016-06-02)
  problematic flexibility
* include 3 in upgradableVersions (Joey Hess, 2016-05-24)
  Does not change behavior, only git annex version output
* Pass the various gnupg-options configs to gpg in several cases where they were not before. (Joey Hess, 2016-05-23)
  Removed the instance LensGpgEncParams RemoteConfig because it encouraged code that does not take the RemoteGitConfig into account.
  RemoteType's setup was changed to take a RemoteGitConfig, although the only place that is able to provide a non-empty one is enableremote, when it's changing an existing remote. This led to several follow-on changes, and got RemoteGitConfig plumbed through.
* fix recent test suite reversion (Joey Hess, 2016-05-23)
  git annex adjust --force will overwrite any current adjusted branch. I didn't document this because for the user, deleting the branch is just as good.