aboutsummaryrefslogtreecommitdiff
path: root/Remote/Rsync.hs
Commit message (Collapse)AuthorAge
* Support exporttree=yes for rsync special remotes.Gravatar Joey Hess2018-02-28
| | | | | | | | | | | | | | | | | Renaming is not supported; it might be possible to use --fuzzy to get rsync to notice the file is being renamed, but that is a bit ..fuzzy. On the other hand, interrupted transfers of an exported file are resumed, since rsync is great at that. Had to adjust the exporttree docs, which said interrupted transfers would restart. Note that remove no longer makes the empty directory dummy, instead sending the top-level empty directory. This works just as well and I noticed the dummy was unncessary when refactoring it into removeGeneric. Verified that behavior of remove is not changed, and git annex testremote does pass. This commit was sponsored by Brock Spratlen on Patreon.
* finally really add back custom-setup stanzaGravatar Joey Hess2017-12-31
| | | | | | | | | | | | Fourth or fifth try at this and finally found a way to make it work. Absurd amount of busy-work forced on me by change in cabal's behavior. Split up Utility modules that need posix stuff out of ones used by Setup. Various other hacks around inability for Setup to use anything that ifdefs a use of unix. Probably lost a full day of my life to this. This is how build systems make their users hate them. Just saying.
* prevent exporttree=yes on remotes that don't support exportsGravatar Joey Hess2017-09-07
| | | | | | | | | Don't allow "exporttree=yes" to be set when the special remote does not support exports. That would be confusing since the user would set up a special remote for exports, but `git annex export` to it would later fail. This commit was supported by the NSF-funded DataLad project.
* refactor ExportActionsGravatar Joey Hess2017-09-01
| | | | | | | | This will allow disabling exports for remotes that are not configured to allow them. Also, exportSupported will be useful for the external special remote to probe. This commit was supported by the NSF-funded DataLad project
* add API for exportingGravatar Joey Hess2017-08-29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Implemented so far for the directory special remote. Several remotes don't make sense to export to. Regular Git remotes, obviously, do not. Bup remotes almost certianly do not, since bup would need to be used to extract the export; same store for Ddar. Web and Bittorrent are download-only. GCrypt is always encrypted so exporting to it would be pointless. There's probably no point complicating the Hook remotes with exporting at this point. External, S3, Glacier, WebDAV, Rsync, and possibly Tahoe should be modified to support export. Thought about trying to reuse the storeKey/retrieveKeyFile/removeKey interface, rather than adding a new interface. But, it seemed better to keep it separate, to avoid a complicated interface that sometimes encrypts/chunks key/value storage and sometimes users non-key/value storage. Any common parts can be factored out. Note that storeExport is not atomic. doc/design/exporting_trees_to_special_remotes.mdwn has some things in the "resuming exports" section that bear on this decision. Basically, I don't think, at this time, that an atomic storeExport would help with resuming, because exports are not key/value storage, and we can't be sure that a partially uploaded file is the same content we're currently trying to export. Also, note that ExportLocation will always use unix path separators. This is important, because users may export from a mix of windows and unix, and it avoids complicating the API with path conversions, and ensures that in such a mix, they always use the same locations for exports. This commit was sponsored by Bruno BEAUFILS on Patreon.
* avoid the dashed ssh hostname class of security holesGravatar Joey Hess2017-08-17
| | | | | | | | | | | | | | | | | | | | | | | | Security fix: Disallow hostname starting with a dash, which would get passed to ssh and be treated an option. This could be used by an attacker who provides a crafted ssh url (for eg a git remote) to execute arbitrary code via ssh -oProxyCommand. No CVE has yet been assigned for this hole. The same class of security hole recently affected git itself, CVE-2017-1000117. Method: Identified all places where ssh is run, by git grep '"ssh"' Converted them all to use a SshHost, if they did not already, for specifying the hostname. SshHost was made a data type with a smart constructor, which rejects hostnames starting with '-'. Note that git-annex already contains extensive use of Utility.SafeCommand, which fixes a similar class of problem where a filename starting with a dash gets passed to a program which treats it as an option. This commit was sponsored by Jochen Bartl on Patreon.
* Support GIT_SSH and GIT_SSH_COMMANDGravatar Joey Hess2017-03-17
| | | | | | | | | | | | | | | | | | | | They are handled close the same as they are by git. However, unlike git, git-annex sometimes needs to pass the -n parameter when using these. So, this has the potential for breaking some setup, and perhaps there ought to be a ANNEX_USE_GIT_SSH=1 needed to use these. But I'd rather avoid that if possible, so let's see if anyone complains. Almost all places where "ssh" was run have been changed to support the env vars. Anything still calling sshOptions does not support them. In particular, rsync special remotes don't. Seems that annex-rsync-transport already gives sufficient control there. (Fixed in passing: Remote.Helper.Ssh.toRepo used to extract remoteAnnexSshOptions and pass them to sshOptions, which was redundant since sshOptions also extracts those.) This commit was sponsored by Jeff Goeke-Smith on Patreon.
* Run ssh with -n whenever input is not being piped into itGravatar Joey Hess2017-02-15
| | | | | | | | | | | | | | | | | | | | ... to avoid it consuming stdin that it shouldn't. This fixes git-annex-checkpresentkey --batch remote, which didn't output results for all keys passed into it. Other git-annex commands that communicate with a remote over ssh may also have been consuming stdin that they shouldn't have, which could have impacted using them in eg, shell scripts. For example, a shell script reading files from stdin and passing them to git annex drop would be impacted by this bug, whenever git annex drop ran git-annex-shell checkpresent, it would consume part/all of the stdin that the shell script was supposed to consume. Fixed by adding a ConsumeStdin parameter to Annex.Ssh.sshOptions, which is used throughout git-annex to run ssh (in order for ssh connection caching to work). Every call site was checked to see if it used CreatePipe for stdin, and if not was marked NoConsumeStdin.
* add SetupStage parameter to RemoteType.setupGravatar Joey Hess2017-02-07
| | | | | | | | | | | | | | | | | Most remotes have an idempotent setup that can be reused for enableremote, but in a few cases, it needs to tell which, and whether a UUID was provided to setup was used. This is groundwork for making initremote be able to provide a UUID. It should not change any behavior. Note that it would be nice to make the UUID always be provided to setup, and make setup not need to generate and return a UUID. What prevented this simplification is Remote.Git.gitSetup, which needs to reuse the UUID of the git remote when setting it up, and so has to return that UUID. This commit was sponsored by Thom May on Patreon.
* Avoid backtraces on expected failures when built with ghc 8; only use ↵Gravatar Joey Hess2016-11-15
| | | | | | | | | | | | | backtraces for unexpected errors. ghc 8 added backtraces on uncaught errors. This is great, but git-annex was using error in many places for a error message targeted at the user, in some known problem case. A backtrace only confuses such a message, so omit it. Notably, commands like git annex drop that failed due to eg, numcopies, used to use error, so had a backtrace. This commit was sponsored by Ethan Aubin.
* get, move, copy, mirror: Added --failed switch which retries failed copies/movesGravatar Joey Hess2016-08-03
| | | | | | | | | Note that get --from foo --failed will get things that a previous get --from bar tried and failed to get, etc. I considered making --failed only retry transfers from the same remote, but it was easier, and seems more useful, to not have the same remote requirement. Noisy due to some refactoring into Types/
* plumb RemoteGitConfig through to decryptCipherGravatar Joey Hess2016-05-23
|
* Pass the various gnupg-options configs to gpg in several cases where they ↵Gravatar Joey Hess2016-05-23
| | | | | | | | | | | | were not before. Removed the instance LensGpgEncParams RemoteConfig because it encouraged code that does not take the RemoteGitConfig into account. RemoteType's setup was changed to take a RemoteGitConfig, although the only place that is able to provide a non-empty one is enableremote, when it's changing an existing remote. This led to several folow-on changes, and got RemoteGitConfig plumbed through.
* remove 163 lines of code without changing anything except importsGravatar Joey Hess2016-01-20
|
* add removeKey action to RemoteGravatar Joey Hess2015-10-08
| | | | | Not implemented for any remotes yet; probably the git remote is the only one that will ever implement it.
* refactorGravatar Joey Hess2015-08-17
|
* Simplify setup process for a ssh remote.Gravatar Joey Hess2015-08-05
| | | | | | | | | | | | | | | | | | | | | | Now it suffices to run git remote add, followed by git-annex sync. Now the remote is automatically initialized for use by git-annex, where before the git-annex branch had to manually be pushed before using git-annex sync. Note that this involved changes to git-annex-shell, so if the remote is using an old version, the manual push is still needed. Implementation required git-annex-shell be changed, so configlist can autoinit a repository even when no git-annex branch has been pushed yet. Unfortunate because we'll have to wait for it to get deployed to servers before being able to rely on this change in the documentation. Did consider making git-annex sync push the git-annex branch to repos that didn't have a uuid, but this seemed difficult to do without complicating it in messy ways. It would be cleaner to split a command out from configlist to handle the initialization. But this is difficult without sacrificing backwards compatability, for users of old git-annex versions which would not use the new command.
* remove unused importsGravatar Joey Hess2015-07-31
|
* Fix rsync special remote to work when -Jn is used for concurrent uploads.Gravatar Joey Hess2015-07-30
|
* remove Params constructor from Utility.SafeCommandGravatar Joey Hess2015-06-01
| | | | | | | | | | | | | | | | | | This removes a bit of complexity, and should make things faster (avoids tokenizing Params string), and probably involve less garbage collection. In a few places, it was useful to use Params to avoid needing a list, but that is easily avoided. Problems noticed while doing this conversion: * Some uses of Params "oneword" which was entirely unnecessary overhead. * A few places that built up a list of parameters with ++ and then used Params to split it! Test suite passes.
* add filename to progress bar, and display ok/failed at endGravatar Joey Hess2015-04-14
| | | | This needed plumbing an AssociatedFile through retrieveKeyFileCheap.
* well along the way to fully quiet --quietGravatar Joey Hess2015-04-04
| | | | | | | Came up with a generic way to filter out progress messages while keeping errors, for commands that use stderr for both. --json mode will disable command outputs too.
* WIP on making --quiet silence progress, and infra for concurrent progress barsGravatar Joey Hess2015-04-03
|
* The ssh-options git config is now used by gcrypt, rsync, and ddar special ↵Gravatar Joey Hess2015-02-12
| | | | remotes that use ssh as a transport.
* test suite found a problem with today's workGravatar Joey Hess2015-01-28
| | | | ". def" did not do what I thought it would, at all.
* implement annex.tune.objecthashlowerGravatar Joey Hess2015-01-28
| | | | Split out Annex.DirHashes which never really belonged in Locations.
* import Data.Default in CommonGravatar Joey Hess2015-01-28
|
* groundwork for parameterizing hash depthGravatar Joey Hess2015-01-28
|
* update my email address and homepage urlGravatar Joey Hess2015-01-21
|
* revert parentDir changeGravatar Joey Hess2015-01-09
| | | | | | | | Reverts 2bba5bc22d049272d3328bfa6c452d3e2e50e86c Unfortunately, this caused breakage on Windows, and possibly elsewhere, because parentDir and takeDirectory do not behave the same when there is a trailing directory separator.
* made parentDir return a Maybe FilePath; removed most uses of itGravatar Joey Hess2015-01-06
| | | | | | | | parentDir is less safe than takeDirectory, especially when working with relative FilePaths. It's really only useful in loops that want to terminate at / This commit was sponsored by Audric SCHILTKNECHT.
* Expand checkurl to support recommended filename, and multi-file-urlsGravatar Joey Hess2014-12-11
| | | | This commit was sponsored by an anonymous bitcoiner.
* Urls can now be claimed by remotes. This will allow creating, for example, a ↵Gravatar Joey Hess2014-12-08
| | | | external special remote that handles magnet: and *.torrent urls.
* add stub claimUrlGravatar Joey Hess2014-12-08
|
* add per-remote-type infoGravatar Joey Hess2014-10-21
| | | | | | | | | | Now `git annex info $remote` shows info specific to the type of the remote, for example, it shows the rsync url. Remote types that support encryption or chunking also include that in their info. This commit was sponsored by Ævar Arnfjörð Bjarmason.
* fix some mixed space+tab indentationGravatar Joey Hess2014-10-09
| | | | | | | | | This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line. Done as a background task while listening to episode 2 of the Type Theory podcast.
* glacier, S3: Fix bug that caused embedded creds to not be encypted using the ↵Gravatar Joey Hess2014-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | remote's key. encryptionSetup must be called before setRemoteCredPair. Otherwise, the RemoteConfig doesn't have the cipher in it, and so no cipher is used to encrypt the embedded creds. This is a security fix for non-shared encryption methods! For encryption=shared, there's no security problem, just an inconsistentency in whether the embedded creds are encrypted. This is very important to get right, so used some types to help ensure that setRemoteCredPair is only run after encryptionSetup. Note that the external special remote bypasses the type safety, since creds can be set after the initial remote config, if the external special remote program requests it. Also note that IA remotes never use encryption, so encryptionSetup is not run for them at all, and again the type safety is bypassed. This leaves two open questions: 1. What to do about S3 and glacier remotes that were set up using encryption=pubkey/hybrid with embedcreds? Such a git repo has a security hole embedded in it, and this needs to be communicated to the user. Is the changelog enough? 2. enableremote won't work in such a repo, because git-annex will try to decrypt the embedded creds, which are not encrypted, so fails. This needs to be dealt with, especially for ecryption=shared repos, which are not really broken, just inconsistently configured. Noticing that problem for encryption=shared is what led to commit cc54ff9e49260cd94f938e69e926a273e231ef4e, which tried to fix the problem by not decrypting the embedded creds. This commit was sponsored by Josh Taylor.
* The annex-rsync-transport configuration is now also used when checking if a ↵Gravatar Joey Hess2014-09-11
| | | | key is present on a rsync remote, and when dropping a key from the remote.
* testremote: Add testing of behavior when remote is not availableGravatar Joey Hess2014-08-10
| | | | | | | | | | | | | | | | | | | | Added a mkUnavailable method, which a Remote can use to generate a version of itself that is not available. Implemented for several, but not yet all remotes. This allows testing that checkPresent properly throws an exceptions when it cannot check if a key is present or not. It also allows testing that the other methods don't throw exceptions in these circumstances. This immediately found several bugs, which this commit also fixes! * git remotes using ssh accidentially had checkPresent return an exception, rather than throwing it * The chunking code accidentially returned False rather than propigating an exception when there were no chunks and checkPresent threw an exception for the non-chunked key. This commit was sponsored by Carlo Matteo Capocasa.
* run Preparer to get Remover and CheckPresent actionsGravatar Joey Hess2014-08-06
| | | | | | | | | | | | | | | | | | | | | | | | This will allow special remotes to eg, open a http connection and reuse it, while checking if chunks are present, or removing chunks. S3 and WebDAV both need this to support chunks with reasonable speed. Note that a special remote might want to cache a http connection across multiple requests. A simple case of this is that CheckPresent is typically called before Store or Remove. A remote using this interface can certianly use a Preparer that eg, uses a MVar to cache a http connection. However, it's up to the remote to then deal with things like stale or stalled http connections when eg, doing a series of downloads from a remote and other places. There could be long delays between calls to a remote, which could lead to eg, http connection stalls; the machine might even move to a new network, etc. It might be nice to improve this interface later to allow the simple case without needing to handle the full complex case. One way to do it would be to have a `Transaction SpecialRemote cache`, where SpecialRemote contains methods for Storer, Retriever, Remover, and CheckPresent, that all expect to be passed a `cache`.
* pushed checkPresent exception handling out of Remote implementationsGravatar Joey Hess2014-08-06
| | | | | | | | | | | | | | | | I tend to prefer moving toward explicit exception handling, not away from it, but in this case, I think there are good reasons to let checkPresent throw exceptions: 1. They can all be caught in one place (Remote.hasKey), and we know every possible exception is caught there now, which we didn't before. 2. It simplified the code of the Remotes. I think it makes sense for Remotes to be able to be implemented without needing to worry about catching exceptions inside them. (Mostly.) 3. Types.StoreRetrieve.Preparer can only work on things that return a Bool, which all the other relevant remote methods already did. I do not see a good way to generalize that type; my previous attempts failed miserably.
* convert gcrypt to new regime, including chunkingGravatar Joey Hess2014-08-03
| | | | Some reorg of Remote.Rsync code to export the things gcrypt needs.
* finish making rsync support chunkingGravatar Joey Hess2014-08-03
| | | | | This breaks gcrypt, which relies on some internals of the rsync remote. To fix next..
* rsync special remote: Fix slashes when used on Windows.Gravatar Joey Hess2014-03-18
|
* Put non-object tmp files in .git/annex/misctmp, leaving .git/annex/tmp for ↵Gravatar Joey Hess2014-02-26
| | | | | | | | | | | | | | | | | | | | only partially transferred objects. This allows eg, putting .git/annex/tmp on a ram disk, if the disk IO of temp object files is too annoying (and if you don't want to keep partially transferred objects across reboots). .git/annex/misctmp must be on the same filesystem as the git work tree, since files are moved to there in a way that will not work cross-device, as well as symlinked into there. I first wanted to put the tmp objects in .git/annex/objects/tmp, but that would pose transition problems on upgrade when partially transferred objects existed. git annex info does not currently show the size of .git/annex/misctemp, since it should stay small. It would also be ok to make something clean it out, periodically.
* Fix handling of rsync remote urls containing a username, including rsync.net.Gravatar Joey Hess2014-02-21
| | | | | | | | | | | This breakage seems to have been caused way back in 0957b771, but I am pretty sure rsync.net support has not been entirely broken since last April. AFAICS, the generated .ssh/config has not changed since then -- it has never included a Username setting line. So, I am puzzled at when this reversion was introduced. Note that the breakage only affected checkpresent and remove. Upload and download use the ssh connection caching, which includes a -l username.
* cleanup thanks to Utility.PIDGravatar Joey Hess2014-02-11
|
* plumb creds from webapp to initremoteGravatar Joey Hess2014-02-11
| | | | | Avoids abusing setting environment variables, which was always a hack and won't work on windows.
* Added ways to configure rsync options to be used only when uploading or ↵Gravatar Joey Hess2014-02-02
| | | | downloading from a remote. Useful to eg limit upload bandwidth.
* change a few renameFile's to renameGravatar Joey Hess2014-01-29
| | | | | AFAIK, none of these ever operate on directories, but nor do I want to explicitly check if they're files and fail if not.