summaryrefslogtreecommitdiff
path: root/Remote/Bup.hs
Commit message (Collapse)AuthorAge
* add per-remote-type infoGravatar Joey Hess2014-10-21
| | | | | | | | | | Now `git annex info $remote` shows info specific to the type of the remote, for example, it shows the rsync url. Remote types that support encryption or chunking also include that in their info. This commit was sponsored by Ævar Arnfjörð Bjarmason.
* glacier, S3: Fix bug that caused embedded creds to not be encypted using the ↵Gravatar Joey Hess2014-09-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | remote's key. encryptionSetup must be called before setRemoteCredPair. Otherwise, the RemoteConfig doesn't have the cipher in it, and so no cipher is used to encrypt the embedded creds. This is a security fix for non-shared encryption methods! For encryption=shared, there's no security problem, just an inconsistentency in whether the embedded creds are encrypted. This is very important to get right, so used some types to help ensure that setRemoteCredPair is only run after encryptionSetup. Note that the external special remote bypasses the type safety, since creds can be set after the initial remote config, if the external special remote program requests it. Also note that IA remotes never use encryption, so encryptionSetup is not run for them at all, and again the type safety is bypassed. This leaves two open questions: 1. What to do about S3 and glacier remotes that were set up using encryption=pubkey/hybrid with embedcreds? Such a git repo has a security hole embedded in it, and this needs to be communicated to the user. Is the changelog enough? 2. enableremote won't work in such a repo, because git-annex will try to decrypt the embedded creds, which are not encrypted, so fails. This needs to be dealt with, especially for ecryption=shared repos, which are not really broken, just inconsistently configured. Noticing that problem for encryption=shared is what led to commit cc54ff9e49260cd94f938e69e926a273e231ef4e, which tried to fix the problem by not decrypting the embedded creds. This commit was sponsored by Josh Taylor.
* testremote: Add testing of behavior when remote is not availableGravatar Joey Hess2014-08-10
| | | | | | | | | | | | | | | | | | | | Added a mkUnavailable method, which a Remote can use to generate a version of itself that is not available. Implemented for several, but not yet all remotes. This allows testing that checkPresent properly throws an exceptions when it cannot check if a key is present or not. It also allows testing that the other methods don't throw exceptions in these circumstances. This immediately found several bugs, which this commit also fixes! * git remotes using ssh accidentially had checkPresent return an exception, rather than throwing it * The chunking code accidentially returned False rather than propigating an exception when there were no chunks and checkPresent threw an exception for the non-chunked key. This commit was sponsored by Carlo Matteo Capocasa.
* run Preparer to get Remover and CheckPresent actionsGravatar Joey Hess2014-08-06
| | | | | | | | | | | | | | | | | | | | | | | | This will allow special remotes to eg, open a http connection and reuse it, while checking if chunks are present, or removing chunks. S3 and WebDAV both need this to support chunks with reasonable speed. Note that a special remote might want to cache a http connection across multiple requests. A simple case of this is that CheckPresent is typically called before Store or Remove. A remote using this interface can certianly use a Preparer that eg, uses a MVar to cache a http connection. However, it's up to the remote to then deal with things like stale or stalled http connections when eg, doing a series of downloads from a remote and other places. There could be long delays between calls to a remote, which could lead to eg, http connection stalls; the machine might even move to a new network, etc. It might be nice to improve this interface later to allow the simple case without needing to handle the full complex case. One way to do it would be to have a `Transaction SpecialRemote cache`, where SpecialRemote contains methods for Storer, Retriever, Remover, and CheckPresent, that all expect to be passed a `cache`.
* pushed checkPresent exception handling out of Remote implementationsGravatar Joey Hess2014-08-06
| | | | | | | | | | | | | | | | I tend to prefer moving toward explicit exception handling, not away from it, but in this case, I think there are good reasons to let checkPresent throw exceptions: 1. They can all be caught in one place (Remote.hasKey), and we know every possible exception is caught there now, which we didn't before. 2. It simplified the code of the Remotes. I think it makes sense for Remotes to be able to be implemented without needing to worry about catching exceptions inside them. (Mostly.) 3. Types.StoreRetrieve.Preparer can only work on things that return a Bool, which all the other relevant remote methods already did. I do not see a good way to generalize that type; my previous attempts failed miserably.
* roll ChunkedEncryptable into Special and improve interfaceGravatar Joey Hess2014-08-03
| | | | Allow disabling progress displays, for eg, rsync.
* better byteRetrieverGravatar Joey Hess2014-08-03
| | | | | | | | | | | | | | Make the byteRetriever be passed the callback that consumes the bytestring. This way, there's no worries about the lazy bytestring not all being read when the resource that's creating it is closed. Which in turn lets bup, ddar, and S3 each switch from using an unncessary fileRetriver to a byteRetriever. So, more efficient on chunks and encrypted files. The only remaining fileRetrievers are hook and external, which really do retrieve to files.
* convert bup to new ChunkedEncryptable API (but do not support chunking)Gravatar Joey Hess2014-08-02
| | | | | | | | | | | | | | | | | | bup already splits files and does rolling deltas, so there is no reason to use chunking here. The new API made it easier to add progress support for storeKey, so that's done. Unfortunately, bup-split still outputs its own progress with -q, so a little ugly, but not too bad. Made dropping remove the branch for an object, for two reasons: 1. The new API calls removeKey to roll back a storeKey when the content changed unexpectedly. 2. So that testremote will be happy. Also, fixed a bug that caused a crash when removing the branch for an object in rollback.
* factor out getRemoteGitConfigGravatar Joey Hess2014-05-16
|
* execute remote.<name>.annex-shell on remote, if setGravatar Fraser Tweedale2014-05-16
| | | | | | | | It is useful to be able to specify an alternative git-annex-shell program to execute on the remote, e.g., to run a version not on the PATH. Use remote.<name>.annex-shell if specified, instead of the default "git-annex-shell" i.e., first so-named executable on the PATH.
* plumb creds from webapp to initremoteGravatar Joey Hess2014-02-11
| | | | | Avoids abusing setting environment variables, which was always a hack and won't work on windows.
* fix failing test case on WindowsGravatar Joey Hess2014-02-03
| | | | ensure file being modified is all read before it's opened for write
* avoid using openFile when withFile can be usedGravatar Joey Hess2014-02-03
| | | | | | Potentially fixes some FD leak if an action on an opened file handle fails for some reason. There have been some hard to reproduce reports of git-annex leaking FDs, and this may solve them.
* add GETAVAILABILITY to external special remote protocolGravatar Joey Hess2014-01-13
| | | | | And some reworking of types, and added an annex-availability git config setting.
* assistant: Support repairing git remotes that are locally accessibleGravatar Joey Hess2013-10-27
| | | | | | | | (eg, on removable drives) gcrypt remotes are not yet handled. This commit was sponsored by Sören Brunk.
* add remote fsck interfaceGravatar Joey Hess2013-10-11
| | | | | | | | | | | | | | | | | | | | Currently only implemented for local git remotes. May try to add support to git-annex-shell for ssh remotes later. Could concevably also be supported by some special remote, although that seems unlikely. Cronner user this when available, and when not falls back to fsck --fast --from remote git annex fsck --from does not itself use this interface. To do so, I would need to pass --fast and all other options that influence fsck on to the git annex fsck that it runs inside the remote. And that seems like a lot of work for a result that would be no better than cd remote; git annex fsck This may need to be revisited if git-annex-shell gets support, since it may be the case that the user cannot ssh to the server to run git-annex fsck there, but can run git-annex-shell there. This commit was sponsored by Damien Diederen.
* factor out more ssh stuff from git remoteGravatar Joey Hess2013-09-24
| | | | | This has the dual benefits of making Remote.Git shorter, and letting Remote.GCrypt use these utilities.
* Use cryptohash rather than SHA for hashing.Gravatar Joey Hess2013-09-22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a massive win on OSX, which doesn't have a sha256sum normally. Only use external hash commands when the file is > 1 mb, since cryptohash is quite close to them in speed. SHA is still used to calculate HMACs. I don't quite understand cryptohash's API for those. Used the following benchmark to arrive at the 1 mb number. 1 mb file: benchmarking sha256/internal mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950 std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950 found 5 outliers among 100 samples (5.0%) 4 (4.0%) high mild 1 (1.0%) high severe variance introduced by outliers: 10.415% variance is moderately inflated by outliers benchmarking sha256/external mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950 std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950 found 3 outliers among 100 samples (3.0%) 2 (2.0%) high mild 1 (1.0%) high severe 2 mb file: benchmarking sha256/internal mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950 std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950 variance introduced by outliers: 35.540% variance is moderately inflated by outliers benchmarking sha256/external mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950 std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950 found 6 outliers among 100 samples (6.0%) import Crypto.Hash import Data.ByteString.Lazy as L import Criterion.Main import Common testfile :: FilePath testfile = "/run/shm/data" -- on ram disk main = defaultMain [ bgroup "sha256" [ bench "internal" $ whnfIO internal , bench "external" $ whnfIO external ] ] sha256 :: L.ByteString -> Digest SHA256 sha256 = hashlazy internal :: IO String internal = show . sha256 <$> L.readFile testfile external :: IO String external = do s <- readProcess "sha256sum" [testfile] return $ fst $ separate (== ' ') s
* Support hot-swapping of removable drives containing gcrypt repositories.Gravatar Joey Hess2013-09-12
| | | | | | | | | | | To support this, a core.gcrypt-id is stored by git-annex inside the git config of a local gcrypt repository, when setting it up. That is compared with the remote's cached gcrypt-id. When different, a drive has been changed. git-annex then looks up the remote config for the uuid mapped from the core.gcrypt-id, and tweaks the configuration appropriately. When there is no known config for the uuid, it will refuse to use the remote.
* partially complete gcrypt remote (local send done; rest not)Gravatar Joey Hess2013-09-07
| | | | | | | | | | | | | | | | | | | | | | | | This is a git-remote-gcrypt encrypted special remote. Only sending files in to the remote works, and only for local repositories. Most of the work so far has involved making initremote work. A particular problem is that remote setup in this case needs to generate its own uuid, derivied from the gcrypt-id. That required some larger changes in the code to support. For ssh remotes, this will probably just reuse Remote.Rsync's code, so should be easy enough. And for downloading from a web remote, I will need to factor out the part of Remote.Git that does that. One particular thing that will need work is supporting hot-swapping a local gcrypt remote. I think it needs to store the gcrypt-id in the git config of the local remote, so that it can check it every time, and compare with the cached annex-uuid for the remote. If there is a mismatch, it can change both the cached annex-uuid and the gcrypt-id. That should work, and I laid some groundwork for it by already reading the remote's config when it's local. (Also needed for other reasons.) This commit was sponsored by Daniel Callahan.
* Allow public-key encryption of file content.Gravatar guilhem2013-09-03
| | | | | | | | | | | | With the initremote parameters "encryption=pubkey keyid=788A3F4C". /!\ Adding or removing a key has NO effect on files that have already been copied to the remote. Hence using keyid+= and keyid-= with such remotes should be used with care, and make little sense unless the point is to replace a (sub-)key by another. /!\ Also, a test case has been added to ensure that the cipher and file contents are encrypted as specified by the chosen encryption scheme.
* Strip leading /~/ from bup relatively pathed bup remotesGravatar Oliver Matthews2013-06-21
|
* expose Control.Monad.joinGravatar Joey Hess2013-04-22
| | | | | I think I've been looking for that function for some time. Ie, I remember wanting to collapse Just Nothing to Nothing.
* connect existing meters to the transfer log for downloadsGravatar Joey Hess2013-04-11
| | | | | | | | | | | | | | Most remotes have meters in their implementations of retrieveKeyFile already. Simply hooking these up to the transfer log makes that information available. Easy peasy. This is particularly valuable information for encrypted remotes, which otherwise bypass the assistant's polling of temp files, and so don't have good progress bars yet. Still some work to do here (see progressbars.mdwn changes), but this is entirely an improvement from the lack of progress bars for encrypted downloads.
* webapp: Progess bar fixes for many types of special remotes.Gravatar Joey Hess2013-03-28
| | | | | | | | | | | | | There was confusion in different parts of the progress bar code about whether an update contained the total number of bytes transferred, or the number of bytes transferred since the last update. One way this bug showed up was progress bars that seemed to stick at zero for a long time. In order to fix it comprehensively, I add a new BytesProcessed data type, that is explicitly a total quantity of bytes, not a delta. Note that this doesn't necessarily fix every problem with progress bars. Particularly, buffering can now cause progress bars to seem to run ahead of transfers, reaching 100% when data is still being uploaded.
* add globallyAvailable to remotesGravatar Joey Hess2013-03-15
|
* split cost out into its own moduleGravatar Joey Hess2013-03-13
| | | | | Added a function to insert a new cost into a list, which could be used to asjust costs after a drag and drop.
* GnuPG options for symmetric encryption.Gravatar guilhem2013-03-11
|
* git subcommand cleanupGravatar Joey Hess2013-03-03
| | | | | | Pass subcommand as a regular param, which allows passing git parameters like -c before it. This was already done in the pipeing set of functions, but not the command running set.
* Special remotes now all rollback storage of keys that get modified during ↵Gravatar Joey Hess2013-01-09
| | | | the transfer, which can happen in direct mode.
* Fix transferring files to special remotes in direct mode.Gravatar Joey Hess2013-01-06
|
* type based git config handling for remotesGravatar Joey Hess2013-01-01
| | | | | Still a couple of places that use git config ad-hoc, but this is most of it done.
* whitespace fixesGravatar Joey Hess2012-12-13
|
* avoid unnecessary MaybeGravatar Joey Hess2012-11-30
|
* better streaming while encrypting/decryptingGravatar Joey Hess2012-11-18
| | | | | | Both the directory and webdav special remotes used to have to buffer the whole file contents before it could be decrypted, as they read from chunks. Now the chunks are streamed through gpg with no buffering.
* where indentingGravatar Joey Hess2012-11-11
|
* Use USER and HOME environment when set, and only fall back to getpwent, ↵Gravatar Joey Hess2012-10-25
| | | | which doesn't work with LDAP or NIS.
* fix buildGravatar Joey Hess2012-10-24
|
* bup: Don't pass - to bup-split to make it read stdinGravatar Joey Hess2012-10-23
| | | | | | | bup 0.25 does not accept that; and bup split reads from stdin by default if no file is given. I'm not sure what version of bup changed this. This only affected bup special remotes that were encrypted.
* unify typesGravatar Joey Hess2012-09-21
|
* add a progress callback to storeKey, and threaded it all the way throughGravatar Joey Hess2012-09-19
| | | | | | | | Transfer info files are updated when the callback is called, updating the number of bytes transferred. Left unused p variables at every place the callback should be used. Which is rather a lot..
* add support for readonly remotesGravatar Joey Hess2012-08-26
| | | | | | | Currently only the web special remote is readonly, but it'd be possible to also have readonly drives, or other remotes. These are handled in the assistant by only downloading from them, and never trying to upload to them.
* tweak field nameGravatar Joey Hess2012-08-26
|
* add routes to pause/start/cancel transfersGravatar Joey Hess2012-08-08
| | | | | | | | | | | | | | | | This commit includes a paydown on technical debt incurred two years ago, when I didn't know that it was bad to make custom Read and Show instances for types. As the routes need Read and Show for Transfer, which includes a Key, and deriving my own Read instance of key was not practical, I had to finally clean that up. So the compact Key read and show functions are now file2key and key2file, and Read and Show are now derived instances. Changed all code that used the old instances, compiler checked. (There were a few places, particularly in Command.Unused, and the test suite where the Show instance continue to be used for legitimate comparisons; ie show key_x == show key_y (though really in a bloom filter))
* add a path field to remotesGravatar Joey Hess2012-07-22
| | | | | Also broke out some helper functions around constructing remotes, to be used later.
* add back debug loggingGravatar Joey Hess2012-07-19
| | | | | | | | | | | | | Make Utility.Process wrap the parts of System.Process that I use, and add debug logging to them. Also wrote some higher-level code that allows running an action with handles to a processes stdin or stdout (or both), and checking its exit status, all in a single function call. As a bonus, the debug logging now indicates whether the process is being run to read from it, feed it data, chat with it (writing and reading), or just call it for its side effect.
* switch from System.Cmd.Utils to System.ProcessGravatar Joey Hess2012-07-18
| | | | | | | | | | | | | | | | | | Test suite now passes with -threaded! I traced back all the hangs with -threaded to System.Cmd.Utils. It seems it's just crappy/unsafe/outdated, and should not be used. System.Process seems to be the cool new thing, so converted all the code to use it instead. In the process, --debug stopped printing commands it runs. I may try to bring that back later. Note that even SafeSystem was switched to use System.Process. Since that was a modified version of code from System.Cmd.Utils, it needed to be converted too. I also got rid of nearly all calls to forkProcess, and all calls to executeFile, which I'm also doubtful about working well with -threaded.
* record transfer information on local git remotesGravatar Joey Hess2012-07-01
| | | | | | | | | | | | | | | In order to record a semi-useful filename associated with the key, this required plumbing the filename all the way through to the remotes' storeKey and retrieveKeyFile. Note that there is potential for deadlock here, narrowly avoided. Suppose the repos are A and B. A sends file foo to B, and at the same time, B gets file foo from A. So, A locks its upload transfer info file, and then locks B's download transfer info file. At the same time, B is taking the two locks in the opposite order. This is only not a deadlock because the lock code does not wait, and aborts. So one of A or B's transfers will be aborted and the other transfer will continue. Whew!
* avoid ByteString.Char8 where not neededGravatar Joey Hess2012-06-20
| | | | | Its truncation behavior is a red flag, so avoid using it in these places where only raw ByteStrings are used, without looking at the data inside.
* Clean up handling of git directory and git worktree.Gravatar Joey Hess2012-05-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Baked into the code was an assumption that a repository's git directory could be determined by adding ".git" to its work tree (or nothing for bare repos). That fails when core.worktree, or GIT_DIR and GIT_WORK_TREE are used to separate the two. This was attacked at the type level, by storing the gitdir and worktree separately, so Nothing for the worktree means a bare repo. A complication arose because we don't learn where a repository is bare until its configuration is read. So another Location type handles repositories that have not had their config read yet. I am not entirely happy with this being a Location type, rather than representing them entirely separate from the Git type. The new code is not worse than the old, but better types could enforce more safety. Added support for core.worktree. Overriding it with -c isn't supported because it's not really clear what to do if a git repo's config is read, is not bare, and is then overridden to bare. What is the right git directory in this case? I will worry about this if/when someone has a use case for overriding core.worktree with -c. (See Git.Config.updateLocation) Also removed and renamed some functions like gitDir and workTree that misused git's terminology. One minor regression is known: git annex add in a bare repository does not print a nice error message, but runs git ls-files in a way that fails earlier with a less nice error message. This is because before --work-tree was always passed to git commands, even in a bare repo, while now it's not.