path: root/Remote/Helper/Chunked.hs
Commit message (author, date)
* update my email address and homepage url (Joey Hess, 2015-01-21)
* add getFileSize, which can get the real size of a large file on Windows (Joey Hess, 2015-01-20)
  Avoid using fileSize which maxes out at just 2 gb on Windows. Instead, use hFileSize, which doesn't have a bounded size. Fixes support for files > 2 gb on Windows.

  Note that the InodeCache code only needs to compare a file size, so it doesn't matter if the file size wraps. So it has been left as-is. This was necessary both to avoid invalidating existing inode caches, and because the code passed FileStatus around and would have become more expensive if it called getFileSize.

  This commit was sponsored by Christian Dietrich.
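  A minimal sketch of the approach this commit describes, assuming a helper that opens the file and asks the handle for its size. The name getFileSize comes from the commit title; the body below is illustrative and may differ from the actual git-annex code.

```haskell
import System.IO (IOMode (ReadMode), hFileSize, withFile)

-- Sketch only: get a file's size through its handle. hFileSize returns
-- an Integer, so the result is not capped by a fixed-width type.
getFileSize :: FilePath -> IO Integer
getFileSize f = withFile f ReadMode hFileSize
```

  The trade-off the commit mentions is visible here: every call opens the file, which costs more than reading a size out of a FileStatus that is already in hand.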
* add per-remote-type info (Joey Hess, 2014-10-21)
  Now `git annex info $remote` shows info specific to the type of the remote, for example, it shows the rsync url.

  Remote types that support encryption or chunking also include that in their info.

  This commit was sponsored by Ævar Arnfjörð Bjarmason.
* fix some mixed space+tab indentation (Joey Hess, 2014-10-09)
  This fixes all instances of " \t" in the code base. Most common case seems to be after a "where" line; probably vim copied the two space layout of that line.

  Done as a background task while listening to episode 2 of the Type Theory podcast.
* testremote: Add testing of behavior when remote is not available (Joey Hess, 2014-08-10)
  Added a mkUnavailable method, which a Remote can use to generate a version of itself that is not available. Implemented for several, but not yet all remotes.

  This allows testing that checkPresent properly throws an exception when it cannot check if a key is present or not. It also allows testing that the other methods don't throw exceptions in these circumstances.

  This immediately found several bugs, which this commit also fixes!

  * git remotes using ssh accidentally had checkPresent return an exception, rather than throwing it
  * The chunking code accidentally returned False rather than propagating an exception when there were no chunks and checkPresent threw an exception for the non-chunked key.

  This commit was sponsored by Carlo Matteo Capocasa.
* unify exception handling into Utility.Exception (Joey Hess, 2014-08-07)
  Removed old extensible-exceptions, only needed for very old ghc.

  Made webdav use Utility.Exception, to work after some changes in DAV's exception handling.

  Removed Annex.Exception. Mostly this was trivial, but note that tryAnnex is replaced with tryNonAsync and catchAnnex replaced with catchNonAsync. In theory that could be a behavior change, since the former caught all exceptions, and the latter don't catch async exceptions.

  However, in practice, nothing in the Annex monad uses async exceptions. Grepping for throwTo and killThread only finds stuff in the assistant, which does not seem related.

  Command.Add.undo is changed to accept a SomeException, and things that use it for rollback now catch non-async exceptions, rather than only IOExceptions.
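  To illustrate the difference between catching all exceptions and catching only non-async ones, here is a rough sketch written against base's Control.Exception. It borrows the name tryNonAsync from the commit message, but it is an assumption about the behavior, specialised to IO, and not the code in Utility.Exception.

```haskell
import Control.Exception

-- Sketch only: like try, but asynchronous exceptions are rethrown
-- instead of being returned as Left.
tryNonAsync :: IO a -> IO (Either SomeException a)
tryNonAsync action = (Right <$> action) `catches`
    [ Handler (\e -> throwIO (e :: SomeAsyncException))
    , Handler (\e -> return (Left (e :: SomeException)))
    ]
```

  Handlers are tried in order, so an exception such as ThreadKilled matches the first handler and keeps propagating, while an IOException falls through to the second and is returned as Left.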
* pushed checkPresent exception handling out of Remote implementations (Joey Hess, 2014-08-06)
  I tend to prefer moving toward explicit exception handling, not away from it, but in this case, I think there are good reasons to let checkPresent throw exceptions:

  1. They can all be caught in one place (Remote.hasKey), and we know every possible exception is caught there now, which we didn't before.
  2. It simplified the code of the Remotes. I think it makes sense for Remotes to be able to be implemented without needing to worry about catching exceptions inside them. (Mostly.)
  3. Types.StoreRetrieve.Preparer can only work on things that return a Bool, which all the other relevant remote methods already did. I do not see a good way to generalize that type; my previous attempts failed miserably.
* remove redundant progress meter display code (Joey Hess, 2014-08-03)
  specialRemote handles all meter display, so this is redundant.
* roll ChunkedEncryptable into Special and improve interface (Joey Hess, 2014-08-03)
  Allow disabling progress displays, for eg, rsync.
* whitespace (Joey Hess, 2014-08-03)
* minor optimisation (Joey Hess, 2014-08-01)
* testremote: Test retrieveKeyFile resume (Joey Hess, 2014-08-01)
  And fixed a bug found by these tests; retrieveKeyFile would fail when the dest file was already complete.

  This commit was sponsored by Bradley Unterrheiner.
* fix a fencepost bug when resuming chunked store at end (Joey Hess, 2014-08-01)
  Discovered thanks to testremote command!
* fix chunk=0 (Joey Hess, 2014-08-01)
  Found by testremote
* only chunk stable keys (Joey Hess, 2014-07-30)
  The content of unstable keys can potentially be different in different repos, so eg, resuming a chunked upload started by another repo would corrupt data.
* update progress after each chunk, at least (Joey Hess, 2014-07-29)
  This way, when the remote implementation neglects to update progress, there will still be a somewhat useful progress display, as long as chunks are used.
* optimise case of remote that retrieves FileContent, when chunks and encryption are not being used (Joey Hess, 2014-07-29)
  No need to read whole FileContent only to write it back out to a file in this case. Can just rename! Yay.

  Also incidentally, fixed an attempt to open a file for write that was already opened for write, which caused a crash and deadlock.
* support chunking for all external special remotes! (Joey Hess, 2014-07-29)
  Removing code and at the same time adding great features, including upload/download resuming.

  This commit was sponsored by Romain Lenglet.
* better type for Retriever (Joey Hess, 2014-07-29)
  Putting a callback in the Retriever type allows for the callback to remove the retrieved file when it's done with it.

  I did not really want to make Retriever be fixed to Annex Bool, but when I tried to use Annex a, I got into some type of type mess.
* allow Retriever action to update the progress meter (Joey Hess, 2014-07-29)
  Needed for eg, Remote.External.

  Generally, any Retriever that stores content in a file is responsible for updating the meter, while ones that produce a lazy bytestring cannot update the meter, so are not asked to.
* lift types from IO to Annex (Joey Hess, 2014-07-29)
  Some remotes like External need to run store and retrieve actions in Annex, not IO. In order to do that lift, I had to dive pretty deep into the utilities, making Utility.Gpg and Utility.Tmp be partly converted to using MonadIO, and Control.Monad.Catch for exception handling.

  There should be no behavior changes in this commit.

  This commit was sponsored by Michael Barabanov.
* add ContentSource type, for remotes that act on files rather than ByteStrings (Joey Hess, 2014-07-29)
  Note that currently nothing cleans up a ContentSource's file, when eg, retrieving chunks.
* fix non-checked hasKeyChunks (Joey Hess, 2014-07-29)
* resume interrupted chunked uploads (Joey Hess, 2014-07-28)
  Leverage the new chunked remotes to automatically resume uploads. Sort of like rsync, although of course not as efficient since this needs to start at a chunk boundary.

  But, unlike rsync, this method will work for S3, WebDAV, external special remotes, etc, etc. Only directory special remotes so far, but many more soon!

  This implementation will also allow starting an upload from one repository, interrupting it, and then resuming the upload to the same remote from an entirely different repository.

  Note that I added a comment that storeKey should atomically move the content into place once it's all received. This was already an undocumented requirement -- it's necessary for hasKey to work reliably. This resume code just uses hasKey to find the first chunk that's missing.

  Note that if there are two uploads of the same key to the same chunked remote, one might resume at the point the other had gotten to, but both will then redundantly upload. As before.

  In the non-resume case, this adds one hasKey call per storeKey, and only if the remote is configured to use chunks. Future work: Try to eliminate that hasKey. Notice that eg, `git annex copy --to` checks if the key is present before sending it, so is already running hasKey.. which could perhaps be cached and reused.

  However, this additional overhead is not very large compared with transferring an entire large file, and the ability to resume is certainly worth it. There is an optimisation in place for small files, that avoids trying to resume if the whole file fits within one chunk.

  This commit was sponsored by Georg Bauer.
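  The resume step described above, finding the first missing chunk with a presence check, can be sketched roughly as follows. ChunkKey and chunksToUpload are hypothetical names used only for illustration, and the presence check stands in for hasKey; none of this is the code in Remote.Helper.Chunked.

```haskell
-- Sketch only: ChunkKey is a hypothetical stand-in for a chunk's key.
type ChunkKey = String

-- Skip the leading chunks the remote already has, and return the chunks
-- that still need to be uploaded, in order. Checking stops at the first
-- missing chunk; later chunks are not probed.
chunksToUpload :: Monad m => (ChunkKey -> m Bool) -> [ChunkKey] -> m [ChunkKey]
chunksToUpload _ [] = return []
chunksToUpload present (k : ks) = do
    have <- present k
    if have then chunksToUpload present ks else return (k : ks)
```

  In the non-resume case the first chunk is missing, so only one presence check runs, which matches the single extra hasKey call the commit message mentions. The small-file optimisation is not modelled here.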
* add ChunkMethod type and make Logs.Chunk use it, rather than assuming fixed size chunks (so eg, rolling hash chunks can be supported later) (Joey Hess, 2014-07-28)
  If a newer git-annex starts logging something else in the chunk log, it won't be used by this version, but it will be preserved when updating the log.
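  One way to get the "not used, but preserved" behavior is to keep unrecognised log values verbatim in their own constructor. The sketch below assumes constructor names and a plain-number log format purely for illustration; the real Logs.Chunk format is not shown in this log.

```haskell
-- Sketch only: constructor names and the textual format are assumptions.
data ChunkMethod
    = FixedSizeChunks Integer
    | UnknownChunks String -- kept verbatim so it can be written back
    deriving (Eq, Show)

-- Anything that is not a positive number is treated as an unknown method.
parseChunkMethod :: String -> ChunkMethod
parseChunkMethod s = case reads s of
    [(n, "")] | n > 0 -> FixedSizeChunks n
    _ -> UnknownChunks s

formatChunkMethod :: ChunkMethod -> String
formatChunkMethod (FixedSizeChunks n) = show n
formatChunkMethod (UnknownChunks s) = s
```

  Code that only understands fixed size chunks can ignore UnknownChunks values when reading, while a log rewrite still emits them unchanged through formatChunkMethod.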
* resume interrupted chunked downloads (Joey Hess, 2014-07-27)
  Leverage the new chunked remotes to automatically resume downloads. Sort of like rsync, although of course not as efficient since this needs to start at a chunk boundary.

  But, unlike rsync, this method will work for S3, WebDAV, external special remotes, etc, etc. Only directory special remotes so far, but many more soon!

  This implementation will also properly handle starting a download from one remote, interrupting, and resuming from another one, and so on.

  (Resuming interrupted chunked uploads is similarly doable, although slightly more expensive.)

  This commit was sponsored by Thomas Djärv.
* use existing chunks even when chunk=0 (Joey Hess, 2014-07-27)
  When chunk=0, always try the unchunked key first. This avoids the overhead of needing to read the git-annex branch to find the chunkcount. However, if the unchunked key is not present, go on and try the chunks.

  Also, when removing a chunked key, update the chunkcounts even when chunk=0.
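  The lookup order for chunk=0 can be sketched as a small combinator. Both arguments are hypothetical actions supplied by the caller: one checks the unchunked key, the other consults the chunk log and checks the chunks.

```haskell
-- Sketch only: the chunked check runs only when the unchunked key is
-- absent, so the cost of reading the chunk log is avoided in the
-- common case.
checkPresentUnchunkedFirst :: Monad m => m Bool -> m Bool -> m Bool
checkPresentUnchunkedFirst unchunked chunked = do
    found <- unchunked
    if found then return True else chunked
```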
* reorg (Joey Hess, 2014-07-27)
* faster storeChunks (Joey Hess, 2014-07-27)
  No need to process each L.ByteString chunk, instead ask it to split.

  Doesn't seem to have really sped things up much, but it also made the code simpler.

  Note that this does (and already did) buffer in memory. It seems that only the directory special remote could take advantage of streaming chunks to files w/o buffering, so probably won't add an interface to allow for that.
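  "Ask it to split" presumably refers to splitting the lazy ByteString directly. A minimal sketch of that idea using L.splitAt from the bytestring package follows; splitChunks is a made-up name, and the real storeChunks also stores each piece and updates the meter, which is left out here.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- Sketch only: split a lazy ByteString into pieces of at most chunkSize
-- bytes. A non-positive chunkSize yields the input as a single piece.
splitChunks :: Int64 -> L.ByteString -> [L.ByteString]
splitChunks chunkSize b
    | L.null b = []
    | chunkSize <= 0 = [b]
    | otherwise =
        let (chunk, rest) = L.splitAt chunkSize b
        in chunk : splitChunks chunkSize rest
```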
* improve exception handling (Joey Hess, 2014-07-26)
  Push it down from needing to be done in every Storer, to being checked once inside ChunkedEncryptable.

  Also, catch exceptions from PrepareStorer and PrepareRetriever, just in case..
* better exception display (Joey Hess, 2014-07-26)
* fix another fallback bug (Joey Hess, 2014-07-26)
* allM has slightly better memory use (Joey Hess, 2014-07-26)
* fix fallback to other chunk size when first does not have it (Joey Hess, 2014-07-26)
* finish up basic chunked remote groundwork (Joey Hess, 2014-07-26)
  Chunk retrieval and reassembly, removal, and checking if all necessary chunks are present.

  This commit was sponsored by Damien Raude-Morvan.
* reorg (Joey Hess, 2014-07-26)
* core implementation of new style chunking (Joey Hess, 2014-07-25)
  Not yet used by any special remotes, but should not be too hard to add it to most of them.

  storeChunks is the hairy bit! It's loosely based on Remote.Directory.storeLegacyChunked. The object is read in using a lazy bytestring, which is streamed through, creating chunks as needed, without ever buffering more than 1 chunk in memory.

  Getting the progress meter update to work right was also fun, since progress meter values are absolute. Finessed by constructing an offset meter.

  This commit was sponsored by Richard Collins.
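  The offset meter idea can be sketched in a few lines. The type and function names below are illustrative assumptions: the meter callback is taken to receive the absolute number of bytes processed so far.

```haskell
-- Sketch only: a meter callback that receives an absolute byte count.
type MeterUpdate = Integer -> IO ()

-- Wrap a meter so that storing one chunk can report progress counted
-- from zero, while the underlying meter still sees absolute values for
-- the whole object.
offsetMeterUpdate :: MeterUpdate -> Integer -> MeterUpdate
offsetMeterUpdate base offset n = base (offset + n)
```

  When storing the chunk that begins at byte 1000, the storer would be handed offsetMeterUpdate meter 1000, so its report of 500 bytes reaches the real meter as 1500.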
* move meteredWriteFileChunks out of legacy (Joey Hess, 2014-07-24)
* implement chunk logs (Joey Hess, 2014-07-24)
  Slightly tricky as they are not normal UUIDBased logs, but are instead maps from (uuid, chunksize) to chunkcount.

  This commit was sponsored by Frank Thomas.
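  The shape of such a log can be sketched with Data.Map from the containers package. The type synonyms and the chunkCountsFor helper are simplified, hypothetical stand-ins and not the definitions in Logs.Chunk.

```haskell
import qualified Data.Map as M

-- Sketch only: simplified stand-ins for the real types.
type UUID = String
type ChunkSize = Integer
type ChunkCount = Integer

-- Unlike a log keyed by UUID alone, the key is a (uuid, chunksize) pair,
-- so one remote can hold counts for several chunk sizes.
type ChunkLog = M.Map (UUID, ChunkSize) ChunkCount

-- Chunk counts recorded for one remote, one entry per chunk size.
chunkCountsFor :: UUID -> ChunkLog -> [(ChunkSize, ChunkCount)]
chunkCountsFor u l = [(sz, n) | ((u', sz), n) <- M.toList l, u' == u]
```

  Keying on the pair is what lets a reader try a different chunk size when the first one it tries does not have the content.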
* improve chunk data types (Joey Hess, 2014-07-24)
* prepare for new style chunking (Joey Hess, 2014-07-24)
  Moved old legacy chunking code, and cleaned up the directory and webdav remotes' use of it, so when no chunking is configured, that code is not used.

  The config for new style chunking will be chunk=1M instead of chunksize=1M.

  There should be no behavior changes from this commit.

  This commit was sponsored by Andreas Laas.
* directory, webdav: Fix bug introduced in version 4.20131002 that caused the chunkcount file to not be written. Work around repositories without such a file, so files can still be retrieved from them. (Joey Hess, 2013-10-26)
* fix inverted logic when determining whether to write a chunkcount file (Joey Hess, 2013-10-26)
  late-night hlint bit me on this one..

  Reviewed f32cb2cf1576db1395f77bd5f7f0c0a3e86c1334 and the rest of it seems ok
* hlint (Joey Hess, 2013-09-25)
* webapp: Progress bar fixes for many types of special remotes. (Joey Hess, 2013-03-28)
  There was confusion in different parts of the progress bar code about whether an update contained the total number of bytes transferred, or the number of bytes transferred since the last update. One way this bug showed up was progress bars that seemed to stick at zero for a long time.

  In order to fix it comprehensively, I add a new BytesProcessed data type, that is explicitly a total quantity of bytes, not a delta.

  Note that this doesn't necessarily fix every problem with progress bars. Particularly, buffering can now cause progress bars to seem to run ahead of transfers, reaching 100% when data is still being uploaded.
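  The total versus delta distinction can be made explicit in the types. The sketch below uses the BytesProcessed name from the commit message, but the definition and the totalsFromDeltas helper are assumptions for illustration.

```haskell
-- Sketch only: a newtype marking a value as a running total of bytes,
-- never a delta since the last update.
newtype BytesProcessed = BytesProcessed Integer
    deriving (Eq, Ord, Show)

-- Convert a series of per-update deltas into running totals.
totalsFromDeltas :: [Integer] -> [BytesProcessed]
totalsFromDeltas = map BytesProcessed . drop 1 . scanl (+) 0
```

  With a distinct type, code that only has a delta cannot hand it to a meter expecting a total without an explicit conversion like this one.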
* show errors (Joey Hess, 2013-01-02)
* avoid unnecessary Maybe (Joey Hess, 2012-11-30)
* directory special remote: Made more efficient and robust. (Joey Hess, 2012-11-19)
  Files are now written to a tmp directory in the remote, and once all chunks are written, etc, it's moved into the final place atomically.

  For now, checkpresent still checks every single chunk of a file, because the old method could leave partially transferred files with some chunks present and others not.
* S3: Added progress display for uploading and downloading. (Joey Hess, 2012-11-18)
* simplify (Joey Hess, 2012-11-18)