author: Joey Hess <joeyh@joeyh.name> 2017-10-17 14:50:48 -0400
committer: Joey Hess <joeyh@joeyh.name> 2017-10-17 17:10:50 -0400
commit: f31dbb13cad2e8e1b29180fff755026256eabd57
tree: 24fb7615a56bd81b56facbb14f2faa98e0e687b3 /Command
parent: 461d559ee32033266c253f9d8e004664258efed1
Improve behavior when -J transfers multiple files that point to the same key

After a false start, I found a fairly non-intrusive way to deal with it. Although it only handles transfers -- there may be issues with e.g. concurrent dropping of the same key, or other operations.

There is no added overhead when -J is not used, other than an added inAnnex check. When -J is used, it has to maintain and check a small Set, which should be negligible overhead.

It could output some message saying that the transfer is being done by another thread. Or it could even display the same progress info for both files that are being downloaded since they have the same content. But I opted to keep it simple, since this is rather an edge case, so it just doesn't say anything about the transfer of the file until the other thread finishes.

Since the deferred transfer action still runs, actions that do more than transfer content will still get a chance to do their other work. (An example of something that needs to do such other work is P2P.Annex, where the download always needs to receive the content from the peer.) And, if the first thread fails to complete a transfer, the second thread can resume it.

But, this unfortunately means that there's a risk of redundant work being done to transfer a key that just got transferred. That's not ideal, but should never cause breakage; the same thing can occur when running two separate git-annex processes.

The get/move/copy/mirror --from commands had extra inAnnex checks added, inside the download actions. Without those checks, the first thread downloaded the content, and then the second thread woke up and downloaded the same content redundantly.

move/copy/mirror --to is left doing redundant uploads for now. It would need a second checkPresent of the remote inside the upload to avoid them, which would be expensive. A better way to avoid redundant work needs to be found.

This commit was supported by the NSF-funded DataLad project.
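
The "small Set" idea described above can be illustrated with a minimal sketch. This is not the code git-annex uses; Key, InFlight and withKeyTransfer are hypothetical names chosen for illustration, and STM is simply one convenient way to make a thread wait until another thread has finished with the same key.

```haskell
import Control.Concurrent.STM
import Control.Exception (bracket_)
import qualified Data.Set as S

-- Hypothetical stand-in for a git-annex key.
type Key = String

-- Keys whose transfer is currently running in some thread.
type InFlight = TVar (S.Set Key)

-- Run a transfer action for a key, first waiting until no other
-- thread is transferring the same key. The action is deferred, not
-- skipped, so it still gets to do any non-transfer work it needs.
withKeyTransfer :: InFlight -> Key -> IO a -> IO a
withKeyTransfer inflight key = bracket_ claim release
  where
    -- Block while the key is in the set, then add it.
    claim = atomically $ do
        s <- readTVar inflight
        if S.member key s
            then retry
            else writeTVar inflight (S.insert key s)
    -- Always remove the key, even if the action throws, so a
    -- waiting thread can wake up and resume a failed transfer.
    release = atomically $ modifyTVar' inflight (S.delete key)
```

A caller would create the set once with newTVarIO S.empty and wrap each per-file transfer in withKeyTransfer. Because the deferred action still runs after waking, it needs its own check for already-present content to avoid repeating the transfer.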
Diffstat (limited to 'Command')
-rw-r--r-- Command/Get.hs  9
-rw-r--r-- Command/Move.hs 7
2 files changed, 10 insertions, 6 deletions
diff --git a/Command/Get.hs b/Command/Get.hs
index 5cb0245d9..e91798eba 100644
--- a/Command/Get.hs
+++ b/Command/Get.hs
@@ -109,9 +109,10 @@ getKey' key afile = dispatch
 		| Remote.hasKeyCheap r =
 			either (const False) id <$> Remote.hasKey r key
 		| otherwise = return True
-	docopy r witness = getViaTmp (RemoteVerify r) key $ \dest ->
-		download (Remote.uuid r) key afile forwardRetry
-			(\p -> do
+	docopy r = download (Remote.uuid r) key afile forwardRetry $ \p ->
+		ifM (inAnnex key)
+			( return True
+			, getViaTmp (RemoteVerify r) key $ \dest -> do
 				showAction $ "from " ++ Remote.name r
 				Remote.retrieveKeyFile r key afile dest p
-			) witness
+			)
diff --git a/Command/Move.hs b/Command/Move.hs
index b9e0b6548..9e6c03e3b 100644
--- a/Command/Move.hs
+++ b/Command/Move.hs
@@ -200,8 +200,11 @@ fromPerform src move key afile = do
   where
 	go = notifyTransfer Download afile $
 		download (Remote.uuid src) key afile forwardRetry $ \p ->
-			getViaTmp (RemoteVerify src) key $ \t ->
-				Remote.retrieveKeyFile src key afile t p
+			ifM (inAnnex key)
+				( return True
+				, getViaTmp (RemoteVerify src) key $ \t ->
+					Remote.retrieveKeyFile src key afile t p
+				)
 	dispatch _ False = stop -- failed
 	dispatch False True = next $ return True -- copy complete
 	-- Finish by dropping from remote, taking care to verify that
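
Both hunks add the same guard: check inAnnex inside the download action, and report success without retrieving anything if the content is already there. The following self-contained sketch shows that pattern in isolation. It is only an illustration: Store, present and guardedDownload are hypothetical stand-ins (an in-memory set replaces the annex), and ifM is defined locally to mirror the tuple-style helper used in the diff.

```haskell
import Control.Monad (when)
import Data.IORef
import qualified Data.Set as S

-- Hypothetical stand-in for a git-annex key.
type Key = String

-- Local definition mirroring the tuple-style ifM used in the diff.
ifM :: Monad m => m Bool -> (m a, m a) -> m a
ifM cond (thenA, elseA) = do
    c <- cond
    if c then thenA else elseA

-- Hypothetical in-memory stand-in for the local annex.
type Store = IORef (S.Set Key)

-- Plays the role of inAnnex in this sketch.
present :: Store -> Key -> IO Bool
present store key = S.member key <$> readIORef store

-- Retrieve a key unless it is already present. Returns True on
-- success, including the case where nothing needed to be done.
guardedDownload :: Store -> IO Bool -> Key -> IO Bool
guardedDownload store retrieve key =
    ifM (present store key)
        ( return True
        , do
            ok <- retrieve
            when ok $ modifyIORef' store (S.insert key)
            return ok
        )
```

Calling guardedDownload twice for the same key runs the retrieve action only once, provided the first call succeeded. The check and the retrieval are not atomic on their own; in the commit, the guard is effective because the second thread's action is deferred until the first thread's transfer has finished.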