path: root/Remote/Git.hs
Commit message (Author, Date)
* use P2P protocol for drop (Joey Hess, 2018-03-08)
  Not yet used for everything else, but this is enough to verify that it works, and do some benchmarking. Some bugfixes included, which got it working. Also fallback to old actions has been verified to work correctly.
  Benchmarked dropping one thousand files from a ssh remote on localhost. Using the old git-annex: 40.867 seconds. With the P2P protocol: 9.905 seconds!
  This commit was sponsored by Jochen Bartl on Patreon.
* refactor p2p remote action code (Joey Hess, 2018-03-08)
  Make a Remote.Helper.P2P using code that was in Remote.P2P, converted to use generic protocol runner actions. This will allow it to be reused in Remote.Git.
  This commit was sponsored by mo on Patreon.
* p2p ssh connection pools (Joey Hess, 2018-03-08)
  Much like Remote.P2P, there's a pool of connections to a peer, in order to support concurrent operations.
  Deals with old git-annex-ssh on the remote that does not support p2pstdio, by only trying once to use it, and remembering if it's not supported.
  Made p2pstdio send an AUTH_SUCCESS with its uuid, which serves the dual purposes of something to detect to see that the connection is working, and a way to verify that it's connected to the right uuid. (There's a redundant uuid check since the uuid field is sent by git_annex_shell, but I anticipate that being removed later when the legacy git-annex-shell stuff gets removed.)
  Not entirely happy with Remote.Git.runSsh's behavior when the proto action fails. Running the fallback will work ok, but what will we do when the fallbacks later get removed? It might be better to try to reconnect, in case the connection got closed.
  This commit was sponsored by Boyd Stephen Smith Jr. on Patreon.
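The pooling and "only try p2pstdio once" behavior described above can be sketched roughly as follows. This is an illustration only, not the code in Remote.Git: `Conn`, `Pool` and `withConn` are hypothetical names, the connection is a stand-in value, and exception handling is left out. For brevity it also holds the pool lock while opening a connection and treats any failed open as "unsupported", which a real implementation would likely avoid.

```haskell
import Control.Concurrent.MVar

-- Hypothetical stand-in for an open p2pstdio connection.
newtype Conn = Conn Int
    deriving (Eq, Show)

-- Pool state for one peer (hypothetical type).
data Pool
    = Unsupported -- p2pstdio did not work once; never try it again
    | Idle [Conn] -- connections that are open and not in use
    deriving (Eq, Show)

-- Run an action with a pooled connection, opening one if none is idle.
-- When opening fails, remember that and run the fallback instead.
withConn :: MVar Pool -> IO (Maybe Conn) -> (Conn -> IO a) -> IO a -> IO a
withConn poolv open action fallback = do
    got <- modifyMVar poolv $ \pool -> case pool of
        Unsupported -> return (Unsupported, Nothing)
        Idle (c:cs) -> return (Idle cs, Just c)
        Idle [] -> do
            mc <- open
            return $ case mc of
                Nothing -> (Unsupported, Nothing)
                Just c -> (Idle [], Just c)
    case got of
        Nothing -> fallback
        Just c -> do
            r <- action c
            -- Hand the connection back so a later caller can reuse it.
            modifyMVar_ poolv $ \pool -> return $ case pool of
                Idle cs -> Idle (c:cs)
                Unsupported -> Unsupported
            return r
```

Once the pool is marked `Unsupported`, every later call goes straight to the fallback without attempting to open a connection again.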
* make sure that lockContentShared is always paired with an inAnnex check (Joey Hess, 2018-03-07)
  lockContentShared had a screwy caveat that it didn't verify that the content was present when locking it, but in the most common case, eg indirect mode, it failed to lock when the content is not present. That led to a few callers forgetting to check inAnnex when using it, but the potential data loss was unlikely to be noticed because it only affected direct mode I think.
  Fix data loss bug when the local repository uses direct mode, and a locally modified file is dropped from a remote repository. The bug caused the modified file to be counted as a copy of the original file. (This is not a severe bug because in such a situation, dropping from the remote and then modifying the file is allowed and has the same end result.)
  And, in content locking over tor, when the remote repository is in direct mode, it neglected to check that the content was actually present when locking it. This could cause git annex drop to remove the only copy of a file when it thought the tor remote had a copy.
  So, make lockContentShared do its own inAnnex check. This could perhaps be optimised for direct mode, to avoid the check then, since locking the content necessarily verifies it exists there, but I have not bothered with that.
  This commit was sponsored by Jeff Goeke-Smith on Patreon.
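A minimal sketch of the fix's idea, that the locking function itself refuses to proceed when the content is absent. The names match the commit message, but the types and bodies here are assumptions made for illustration: the repository is modeled as a list of present keys, and no real file locking is done.

```haskell
import Data.IORef

type Key = String

-- Hypothetical model: the keys whose content is present in the repository.
inAnnex :: IORef [Key] -> Key -> IO Bool
inAnnex present k = elem k <$> readIORef present

-- Runs the action only when the content is verified to be present, so
-- callers can no longer forget the check. Real locking is omitted.
lockContentShared :: IORef [Key] -> Key -> IO a -> IO (Maybe a)
lockContentShared present k action = do
    ok <- inAnnex present k
    if ok
        then Just <$> action
        else return Nothing
```

A caller that gets `Nothing` back must not count that repository as holding a copy.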
* add remote.<name>.annex-checkuuid (Joey Hess, 2018-01-10)
  Added remote.<name>.annex-checkuuid config, which can be set to false to disable the default checking of the uuid of remotes that point to directories. This can be useful to avoid unnecessary drive spin-ups and automounting.
  Note that the UUID check is still done before writing to the repository, to avoid writing to the wrong repository if it got relocated. Check is also done before checkPresent to avoid getting confused about what is in which repo. This is effectively the same as the use of git-annex-shell with a uuid to check that the remote repository is the expected one.
  Did not bother with the check for retrieveKeyFile because it doesn't matter if the wrong repo is used then.
  This commit was sponsored by Trenton Cronholm on Patreon.
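The policy described above can be written down as a small decision function. This is a sketch under assumptions: `Operation` and `uuidCheckBefore` are hypothetical names, treating key removal as "writing to the repository" is a guess, and the case where annex-checkuuid is left at its default is modeled as needing no per-operation check because the uuid was already verified up front.

```haskell
-- Hypothetical operation names; only the policy follows the commit message.
data Operation = StoreKey | RemoveKey | CheckPresent | RetrieveKeyFile
    deriving (Eq, Show)

-- Should the remote's uuid be verified right before this operation,
-- given the value of remote.<name>.annex-checkuuid?
uuidCheckBefore :: Bool -> Operation -> Bool
uuidCheckBefore checkuuid op
    | checkuuid = False -- assumed: already verified when the remote was set up
    | otherwise = op `elem` [StoreKey, RemoveKey, CheckPresent]
```

With the config set to false, `RetrieveKeyFile` is the one operation that runs without a check, matching the last paragraph of the commit message.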
* Improve startup time for commands that do not operate on remotes (Joey Hess, 2018-01-09)
  And for tab completion, by not unnecessarily statting paths to remotes, which used to cause eg, spin-up of removable drives.
  Got rid of the remotes member of Git.Repo. This was a bit painful. Remote.Git modifies the list of remotes as it reads their configs, so still need a persistent list of remotes. So, put it in as Annex.gitremotes. It's only populated by getGitRemotes, so commands like examinekey that don't care about remotes won't do so.
  This commit was sponsored by Jake Vosloo on Patreon.
* Display progress meter when uploading a key without size information (Joey Hess, 2017-11-14)
  Getting the size by statting the content file.
  This commit was supported by the NSF-funded DataLad project.
* fix process and FD leak (Joey Hess, 2017-09-29)
  Fix process and file descriptor leak that was exposed when git-annex was built with ghc 8.2.1. Apparently ghc has changed its behavior of GC of open file handles that are pipes to running processes. That broke git-annex test on OSX due to running out of FDs.
  Audited for all uses of Annex.new and made stopCoProcesses be called once it's done with the state. Fixed several places that might have leaked in other situations than running the test suite.
  This commit was sponsored by Ewen McNeill.
* prevent exporttree=yes on remotes that don't support exports (Joey Hess, 2017-09-07)
  Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail.
  This commit was supported by the NSF-funded DataLad project.
* implement exporttree=yes configuration (Joey Hess, 2017-09-04)
  * Only export to remotes that were initialized to support it.
  * Prevent storing key/value on export remotes.
  * Prevent enabling exporttree=yes and encryption in the same remote.
  SetupStage Enable was changed to take the old RemoteConfig. This allowed only setting exporttree when initially setting up a remote, and not configuring it later after stuff might already be stored in the remote.
  Went with =yes rather than =true for consistency with other parts of git-annex. Changed docs accordingly.
  This commit was supported by the NSF-funded DataLad project.
* refactor ExportActions (Joey Hess, 2017-09-01)
  This will allow disabling exports for remotes that are not configured to allow them.
  Also, exportSupported will be useful for the external special remote to probe.
  This commit was supported by the NSF-funded DataLad project.
* add API for exporting (Joey Hess, 2017-08-29)
  Implemented so far for the directory special remote.
  Several remotes don't make sense to export to. Regular Git remotes, obviously, do not. Bup remotes almost certainly do not, since bup would need to be used to extract the export; same story for Ddar. Web and Bittorrent are download-only. GCrypt is always encrypted so exporting to it would be pointless. There's probably no point complicating the Hook remotes with exporting at this point. External, S3, Glacier, WebDAV, Rsync, and possibly Tahoe should be modified to support export.
  Thought about trying to reuse the storeKey/retrieveKeyFile/removeKey interface, rather than adding a new interface. But, it seemed better to keep it separate, to avoid a complicated interface that sometimes encrypts/chunks key/value storage and sometimes uses non-key/value storage. Any common parts can be factored out.
  Note that storeExport is not atomic. doc/design/exporting_trees_to_special_remotes.mdwn has some things in the "resuming exports" section that bear on this decision. Basically, I don't think, at this time, that an atomic storeExport would help with resuming, because exports are not key/value storage, and we can't be sure that a partially uploaded file is the same content we're currently trying to export.
  Also, note that ExportLocation will always use unix path separators. This is important, because users may export from a mix of windows and unix, and it avoids complicating the API with path conversions, and ensures that in such a mix, they always use the same locations for exports.
  This commit was sponsored by Bruno BEAUFILS on Patreon.
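The point about ExportLocation always using unix path separators could look roughly like this. It is a sketch only: the type name comes from the commit message, but `mkExportLocation` and its body are assumptions. The conversion is only meaningful for paths known to be Windows-native, since a backslash is a legal filename character on unix.

```haskell
-- Name taken from the commit message; the representation is assumed.
newtype ExportLocation = ExportLocation FilePath
    deriving (Eq, Show)

-- Hypothetical smart constructor for a Windows-native relative path:
-- normalizes separators so every platform refers to the same location.
mkExportLocation :: FilePath -> ExportLocation
mkExportLocation = ExportLocation . map toUnix
  where
    toUnix '\\' = '/'
    toUnix c = c
```

Going through a single constructor keeps path conversion out of the export API itself, which is the design goal the commit message states.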
* add annex-ignore-command and annex-sync-command configs (Joey Hess, 2017-08-17)
  Added remote configuration settings annex-ignore-command and annex-sync-command, which are dynamic equivalents of the annex-ignore and annex-sync configurations.
  For this I needed a new DynamicConfig infrastructure. Its implementation should be as fast as before when there is no dynamic config, and it caches so shell commands are only run once.
  Note that annex-ignore-command exits nonzero when the remote should be ignored. While that may seem backwards, it allows using the same command for it as for annex-sync-command when you want to disable both.
  This commit was sponsored by Trenton Cronholm on Patreon.
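The caching behavior described above can be sketched like this. It is not the actual DynamicConfig from git-annex: the constructors, `mkDynamicConfig`, `getDynamicConfig` and the two exit-code helpers are hypothetical, and the shell command is abstracted as an `IO` action so the sketch needs nothing beyond base. That a nonzero exit of annex-sync-command disables syncing is inferred from the "same command disables both" remark.

```haskell
import Data.IORef
import System.Exit (ExitCode(..))

-- A config value that is either fixed, or computed on first use and
-- cached afterwards (hypothetical representation).
data DynamicConfig a
    = StaticConfig a
    | DynamicConfig (IORef (Maybe a)) (IO a)

mkDynamicConfig :: IO a -> IO (DynamicConfig a)
mkDynamicConfig run = do
    cache <- newIORef Nothing
    return (DynamicConfig cache run)

-- The static case does no extra work; the dynamic case runs its
-- action at most once.
getDynamicConfig :: DynamicConfig a -> IO a
getDynamicConfig (StaticConfig v) = return v
getDynamicConfig (DynamicConfig cache run) = do
    cached <- readIORef cache
    case cached of
        Just v -> return v
        Nothing -> do
            v <- run
            writeIORef cache (Just v)
            return v

-- annex-ignore-command: nonzero exit means the remote is ignored.
ignoreFromExit :: ExitCode -> Bool
ignoreFromExit = (/= ExitSuccess)

-- annex-sync-command: nonzero exit means syncing is disabled (inferred).
syncFromExit :: ExitCode -> Bool
syncFromExit = (== ExitSuccess)
```

A command that exits nonzero therefore yields "ignored" and "do not sync" at the same time.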
* de-Maybe remoteGitConfig (Joey Hess, 2017-05-11)
  It's always set, so does not need to be a Maybe.
* When a http remote does not expose an annex.uuid config, only warn about it once, not every time git-annex is run. (Joey Hess, 2017-03-29)
  Same behavior as for a ssh remote.
* AssociatedFile newtype (Joey Hess, 2017-03-10)
  To prevent any further mistakes like 1a497cefb47557f0b4788c606f9071be422b2511
  This commit was sponsored by Francois Marier on Patreon.
* sync hack to make updateInstead work on eg FAT (Joey Hess, 2017-02-17)
  sync: When syncing with a local repository located on a crippled filesystem, run the post-receive hook there, since it wouldn't get run otherwise. This makes pushing to repos on FAT-formatted removable drives update them when receive.denyCurrentBranch=updateInstead.
  Made Remote.Git export onLocal, which was cleaned up to not have so many caveats about its use.
  This commit was sponsored by Jeff Goeke-Smith on Patreon.
* have onLocal stop any coprocesses, not only cat-file (Joey Hess, 2017-02-17)
  I have not seen any other coprocesses being started, but let's avoid problems if any do for whatever reason.
* Run ssh with -n whenever input is not being piped into it (Joey Hess, 2017-02-15)
  ... to avoid it consuming stdin that it shouldn't.
  This fixes git-annex-checkpresentkey --batch remote, which didn't output results for all keys passed into it. Other git-annex commands that communicate with a remote over ssh may also have been consuming stdin that they shouldn't have, which could have impacted using them in eg, shell scripts. For example, a shell script reading files from stdin and passing them to git annex drop would be impacted by this bug, whenever git annex drop ran git-annex-shell checkpresent, it would consume part/all of the stdin that the shell script was supposed to consume.
  Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which is used throughout git-annex to run ssh (in order for ssh connection caching to work). Every call site was checked to see if it used CreatePipe for stdin, and if not was marked NoConsumeStdin.
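The shape of the fix can be illustrated with a small pure function. The `ConsumeStdin` and `NoConsumeStdin` names come from the commit message; `sshParams` and its argument list are a hypothetical simplification, not the signature of Annex.Ssh.sshOptions. The `-n` flag is the standard ssh option that redirects stdin from /dev/null.

```haskell
-- Constructor names from the commit message.
data ConsumeStdin = ConsumeStdin | NoConsumeStdin
    deriving (Eq, Show)

-- Hypothetical helper: build an ssh command line, adding -n unless the
-- caller really is piping input into ssh.
sshParams :: ConsumeStdin -> String -> [String] -> [String]
sshParams cs host cmd = stdinFlag ++ [host] ++ cmd
  where
    stdinFlag = case cs of
        NoConsumeStdin -> ["-n"]
        ConsumeStdin -> []
```

Making the parameter mandatory forces every call site to state whether it feeds stdin to ssh, which is how the audit described above was carried out.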
* add SetupStage parameter to RemoteType.setup (Joey Hess, 2017-02-07)
  Most remotes have an idempotent setup that can be reused for enableremote, but in a few cases, it needs to tell which one is running, and whether a UUID provided to setup was used.
  This is groundwork for making initremote be able to provide a UUID. It should not change any behavior.
  Note that it would be nice to make the UUID always be provided to setup, and make setup not need to generate and return a UUID. What prevented this simplification is Remote.Git.gitSetup, which needs to reuse the UUID of the git remote when setting it up, and so has to return that UUID.
  This commit was sponsored by Thom May on Patreon.
* git-annex-shell, remotedaemon, git remote: Fix some memory DOS attacks. (Joey Hess, 2016-12-09)
  The attacker could just send a very large amount of data, with no \n and it would all be buffered in memory until the kernel killed git-annex or perhaps OOM killed some other more valuable process.
  This is a low impact security hole, only affecting communication between local git-annex and git-annex-shell on the remote system. (With either able to be the attacker). Only those with the right ssh key can do it. And, there are probably lots of ways to construct git repositories that make git use a lot of memory in various ways, which would have similar impact as this attack.
  The fix in P2P/IO.hs would have been higher impact, if it had made it to a released version, since it would have allowed DOSing the tor hidden service without needing to authenticate.
  (The LockContent and NotifyChanges instances may not be really exploitable; since the line is read and ignored, it probably gets read lazily and does not end up staying buffered in memory.)
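One way to guard against this kind of attack is to bound how much is read while looking for the newline. The function below is a sketch of that idea on a lazy `String`, not the fix that was committed; `boundedLine` and its limit parameter are hypothetical.

```haskell
-- Split off the first line, giving up as soon as more than `limit`
-- characters have been seen without a newline. Works on lazy input, so
-- an endless stream with no newline is rejected after limit + 1 characters.
boundedLine :: Int -> String -> Maybe (String, String)
boundedLine limit = go 0 []
  where
    go :: Int -> String -> String -> Maybe (String, String)
    go n acc rest
        | n > limit = Nothing
        | otherwise = case rest of
            [] -> Just (reverse acc, [])
            ('\n' : more) -> Just (reverse acc, more)
            (c : more) -> go (n + 1) (c : acc) more
```

For example, `boundedLine 5 "hello\nworld"` gives `Just ("hello", "world")`, while `boundedLine 3 (repeat 'x')` gives `Nothing` instead of buffering forever.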
* fix (Joey Hess, 2016-12-09)
* make clear that log is only updated after successful removal (Joey Hess, 2016-12-09)
  This does not change behavior, because an exception is thrown on unsuccessful removal. But is clearer.
* stub Remote.P2P (Joey Hess, 2016-12-06)
  Similar to GCrypt remotes, P2P remotes have an url, so Remote.Git has to separate them out and handle them, passing off to Remote.P2P.
  This commit was sponsored by Ignacio on Patreon.
* Avoid backtraces on expected failures when built with ghc 8; only use backtraces for unexpected errors. (Joey Hess, 2016-11-15)
  ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for an error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it.
  Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace.
  This commit was sponsored by Ethan Aubin.
* enable forwardRetry for command-line transfers (Joey Hess, 2016-10-26)
  If a transfer fails for some reason, but some data managed to be sent, the transfer will be retried. (The assistant already did this.)
  Possible impacts:
  * More ssh prompts if ssh needs to prompt for a password to connect to a host, or is prompting about some other problem like a ssh key mismatch.
  * More data transfer due to retrying, especially when a remote does not support resuming a transfer. In the worst case, a lot of data will be transferred but it fails before the end, and then all that data gets transferred again plus one byte more; repeat until it manages to get the whole file.
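The retry rule described above (retry only while a failed attempt still moved data forward) can be sketched as a small loop. This illustrates the rule only and is not git-annex's forwardRetry: `retryWhileProgress` and the shape of the `attempt` callback are assumptions.

```haskell
-- `attempt` is given the number of bytes transferred so far and reports
-- whether the transfer completed, plus the new total of bytes transferred.
-- A failed attempt is retried only if it made forward progress.
retryWhileProgress :: Monad m => (Integer -> m (Bool, Integer)) -> m Bool
retryWhileProgress attempt = go 0
  where
    go sofar = do
        (ok, now) <- attempt sofar
        if ok
            then return True
            else if now > sofar
                then go now -- some data got through; try again
                else return False -- no progress at all; give up
```

The worst case from the commit message corresponds to an `attempt` that advances by a single byte per call, so the loop repeats once per byte until the file is complete.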
* make --json-progress update meter when getting from git remote with rsync (Joey Hess, 2016-09-09)
* remove TransferObserver (Joey Hess, 2016-08-03)
  unused after last commit
* fix warning (Joey Hess, 2016-05-27)
* enableremote: Remove annex-ignore configuration from a remote. (Joey Hess, 2016-05-24)
* Pass the various gnupg-options configs to gpg in several cases where they were not before. (Joey Hess, 2016-05-23)
  Removed the instance LensGpgEncParams RemoteConfig because it encouraged code that does not take the RemoteGitConfig into account.
  RemoteType's setup was changed to take a RemoteGitConfig, although the only place that is able to provide a non-empty one is enableremote, when it's changing an existing remote. This led to several follow-on changes, and got RemoteGitConfig plumbed through.
* Improve behavior when a just added http remote is not available during uuid probe. Do not mark it as annex-ignore, so it will be tried again later. (Joey Hess, 2016-05-03)
* Fix duplicate progress meter display when downloading from a git remote over http with -J. (Joey Hess, 2016-04-19)
* fix drop hang reported by musicmatze (Joey Hess, 2016-04-18)
  Fix hang when dropping content needs to lock the content on a ssh remote, which occurred when the remote has git-annex version 5.20151019 or newer.
  Analysis: `race` runs 2 threads at once, and the hGetLine finishes first. So, it tries to cancel the waitForProcess, but unfortunately that is making a foreign call and so cannot be canceled. The remote git-annex-shell is waiting for a line on stdin before it will exit. Deadlock.
  This only occurred sometimes; I reproduced it going from darkstar to elephant, but not from darkstar to darkstar. Not sure how that fits into the above analysis -- perhaps a race condition is also involved?
  Fixed by not using `race`; now the hGetLine will fail with an exception if the remote git-annex-shell exits without any output.
* hard links on windows (Joey Hess, 2016-04-08)
  * annex.thin and annex.hardlink are now supported on Windows.
  * unannex --fast now makes hard links on Windows.
* remove 163 lines of code without changing anything except imports (Joey Hess, 2016-01-20)
* avoid hard linking object from other repository when annex.thin is set (Joey Hess, 2016-01-13)
  This is simpler and less expensive than checking if the src file has a link count >= 2, and also is unlocked.
* remove redundant isDirect check (Joey Hess, 2016-01-13)
  Already checked in wantHardLink
* typo (Joey Hess, 2015-12-26)
* deal with unlocked files when calling rsyncParamsRemote (Joey Hess, 2015-12-26)
  In copyFromRemote, it used to check isDirect, but that was not needed; the remote is sending the file, so it doesn't matter if the local, receiving repository is in direct mode or not. And, since the content is not present, yet, it's certainly not unlocked. Note that, the remote may indeed be sending an unlocked file, but sendkey uses sendAnnex, which will detect if the file is modified before or during transfer, and will exit nonzero, aborting the upload. So, the receiver doesn't need any checks.
  In copyToRemote, it forces recvkey to verify content whenever it's being sent from a v6 repository. recvkey is almost always going to verify content anyway, unless annex.verify is not set. So, this doesn't make it any more expensive, except for in that unusual configuration.
  The alternative would be to change the recvkey interface, so that the sender checks afterwards if what it was sending changed, and the receiver then throws out the bad transfer. That would be less expensive for the receiver, as it would not need to do a checksum verification. But, it would mean another network round trip, and since rsync closes the connection, it would need to open another ssh connection to do this. Even with connection caching, that would add latency to uploads. It would also complicate the interface, especially because an older git-annex-shell would not have the new interface available. For these reasons, I prefer punting on that at this time, and instead someone might set annex.verify=false and be unhappy that it still verifies.
  (One other gotcha not dealt with is that a v5 repo could be upgraded to v6 while an upload is in progress, and a file unlocked and modified.)
  (Also, I double-checked Remote.GCrypt's calls to rsyncParamsRemote, and they're fine. When a file is being uploaded to gcrypt, or any other special repository, it is mediated by sendAnnex, so changes will be detected at that level and the special remote implementation doesn't need to worry about them.)
* check inode cache in prepSendAnnex (Joey Hess, 2015-12-10)
  This does mean one query of the database every time an object is sent. May impact performance.
* Display progress meter in -J mode when downloading from the web. (Joey Hess, 2015-11-16)
  Including in addurl, and get --from web, but also in S3 and External special remotes when a web url is known for content in those remotes.
* refactor (Joey Hess, 2015-11-16)
* Display progress meter in -J mode when copying from a local git repo, to a local git repo, and from a remote git repo. (Joey Hess, 2015-11-16)
  Had everything available, just didn't combine the progress meter with the other places progress is sent to update it. (And to a remote repo already did show progress.)
  Most special remotes should already display progress meters with -J, same as without it. One exception to this is the web, since it relies on wget/curl progress display without -J. Still todo.
* concurrent-output, first pass (Joey Hess, 2015-11-04)
  Output without -Jn should be unchanged from before. With -Jn, concurrent-output is used for messages, but regions are not used yet, so it's a mess.
* Avoid displaying network transport warning when a ssh remote does not yet have an annex.uuid set. (Joey Hess, 2015-10-15)
  Instead, only display transport error if the configlist output doesn't include an annex.uuid line, even an empty one.
  A recent change made git-annex init try to get all the remote uuids, and so the transport error would be displayed by it. It was also displayed when eg, copying files to a remote that had no uuid yet.
* fix various build warnings, mostly on Windows (Joey Hess, 2015-10-13)
  And some when S3 is disabled
* add inAnnex check to local lockKey (Joey Hess, 2015-10-09)
* improve display when lockcontent fails (Joey Hess, 2015-10-09)
  /dev/null stderr; ssh is still able to display a password prompt despite this
  Show some messages so the user knows it's locking a remote, and knows if that locking failed.
* implement lockContent for ssh remotes (Joey Hess, 2015-10-09)