author Joey Hess <joey@kitenet.net> 2011-03-27 18:34:30 -0400
committer Joey Hess <joey@kitenet.net> 2011-03-27 18:34:30 -0400
commit 4868b64868747455a9c5d512650f9e7074e6009e
tree 3b4cf46d14b1c04a34a106ca55cca7b0cc00bd87
parent 9a4127f0fed9fa9cd684fff78b3a2af9da3c62ad
Provide a less expensive version of `git annex copy --to`, enabled via --fast. This assumes that location tracking information is correct, rather than contacting the remote for every file.
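The rule this commit introduces can be sketched as a small truth table. This is a minimal, standalone illustration, not code from git-annex: `needsRemoteCheck` is a hypothetical name, and the sketch models only whether the remote is queried, not what happens afterwards.

```haskell
module Main where

-- | Hypothetical helper (not part of git-annex): does the perform step
-- still have to ask the remote whether it holds the key?
-- The query is skipped only for a copy (not a move) run with --fast.
needsRemoteCheck :: Bool  -- ^ was --fast given?
                 -> Bool  -- ^ is this a move rather than a copy?
                 -> Bool
needsRemoteCheck fast move = not (fast && not move)

-- Print the decision for every combination of the two flags.
main :: IO ()
main = mapM_ report [(f, m) | f <- [False, True], m <- [False, True]]
  where
    report (f, m) = putStrLn $
        "fast=" ++ show f ++ " move=" ++ show m
            ++ " -> query remote: " ++ show (needsRemoteCheck f m)
```

Only the combination of `--fast` with a copy skips the query; a move still checks the remote, since trusting stale location tracking there could lose data.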
-rw-r--r-- Command/Move.hs 10
-rw-r--r-- debian/changelog 3
-rw-r--r-- doc/forum/batch_check_on_remote_when_using_copy.mdwn 19
-rw-r--r-- doc/git-annex.mdwn 19
4 files changed, 42 insertions, 9 deletions
diff --git a/Command/Move.hs b/Command/Move.hs
index 907bbf00e..3ac5a7ab2 100644
--- a/Command/Move.hs
+++ b/Command/Move.hs
@@ -84,8 +84,14 @@ toStart dest move file = isAnnexed file $ \(key, _) -> do
return $ Just $ toPerform dest move key
toPerform :: Remote.Remote Annex -> Bool -> Key -> CommandPerform
toPerform dest move key = do
- -- checking the remote is expensive, so not done in the start step
- isthere <- Remote.hasKey dest key
+ -- Checking the remote is expensive, so not done in the start step.
+ -- In fast mode, location tracking is assumed to be correct,
+ -- and an explicit check is not done, when copying. When moving,
+ -- it has to be done, to avoid inadvertent data loss.
+ fast <- Annex.getState Annex.fast
+ isthere <- if fast && not move
+ then return $ Right True
+ else Remote.hasKey dest key
case isthere of
Left err -> do
showNote $ show err
diff --git a/debian/changelog b/debian/changelog
index e995009db..2f532784d 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -3,6 +3,9 @@ git-annex (0.20110326) UNRELEASED; urgency=low
* annex.diskreserve can be given in arbitrary units (ie "0.5 gigabytes")
* Generalized remotes handling, laying groundwork for remotes that are
not regular git remotes.
+ * Provide a less expensive version of `git annex copy --to`, enabled
+ via --fast. This assumes that location tracking information is correct,
+ rather than contacting the remote for every file.
-- Joey Hess <joeyh@debian.org> Sat, 26 Mar 2011 14:36:16 -0400
diff --git a/doc/forum/batch_check_on_remote_when_using_copy.mdwn b/doc/forum/batch_check_on_remote_when_using_copy.mdwn
index 0f20ab645..b08c33b8b 100644
--- a/doc/forum/batch_check_on_remote_when_using_copy.mdwn
+++ b/doc/forum/batch_check_on_remote_when_using_copy.mdwn
@@ -6,3 +6,22 @@ Once all checks are done, one single transfer session should be started. Creatin
-- RichiH
+
+> (Use of SHA is irrelevant here, copy does not checksum anything.)
+>
+> I think what you're seeing is
+> that `git annex copy --to remote` is slow, going to the remote repository
+> every time to see if it has the file, while `git annex copy --from remote`
+> is fast, since it looks at what files are locally present.
+>
+> That is something I mean to improve. At least `git annex copy --fast --to remote`
+> could easily do a fast copy of all files that are known to be missing from
+> the remote repository. When local and remote git repos are not 100% in sync,
+> relying on that data could miss some files that the remote doesn't have anymore,
+> but local doesn't know it dropped. That's why it's a candidate for `--fast`.
+>
+> I've just implemented that.
+>
+> While I do hope to improve ssh usage so that it sshs once, and feeds
+> `git-annex-shell` a series of commands to run, that is a much longer-term
+> thing. --[[Joey]]
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 32f190e75..8afe93c10 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -84,20 +84,22 @@ Many git-annex commands will stage changes for later `git commit` by you.
it is safe to do so, typically because of the setting of annex.numcopies.
* move [path ...]
+
+ When used with the --from option, moves the content of annexed files
+ from the specified repository to the current one.
When used with the --to option, moves the content of annexed files from
the current repository to the specified one.
- When used with the --from option, moves the content of annexed files
- from the specified repository to the current one.
-
* copy [path ...]
+ When used with the --from option, copies the content of annexed files
+ from the specified repository to the current one.
+
When used with the --to option, copies the content of annexed files from
the current repository to the specified one.
- When used with the --from option, copies the content of annexed files
- from the specified repository to the current one.
+ To avoid contacting the remote to check if it has every file, specify --fast
* unlock [path ...]
@@ -137,11 +139,15 @@ Many git-annex commands will stage changes for later `git commit` by you.
With parameters, only the specified files are checked.
+ To avoid expensive checksum calculations, specify --fast
+
* unused
Checks the annex for data that is not used by any files currently
in the annex, and prints a numbered list of the data.
+ To only show unused temp files, specify --fast
+
* dropunused [number ...]
Drops the data corresponding to the numbers, as listed by the last
@@ -286,8 +292,7 @@ Many git-annex commands will stage changes for later `git commit` by you.
* --fast
Enables less expensive, but also less thorough versions of some commands.
- What is avoided depends on the command. A fast fsck avoids calculating
- checksums; a fast unused only shows temp files and not other unused files.
+ What is avoided depends on the command.
* --quiet