author | Joey Hess <joey@kitenet.net> | 2014-01-20 17:34:58 -0400
---|---|---
committer | Joey Hess <joey@kitenet.net> | 2014-01-20 17:35:29 -0400
commit | 049bea4659e108967977852b5ca0cc00b74a8831 |
tree | 87b5cb9f48f4692eee93ce2565589441509857c8 |
parent | be851d42f4a78fc05494e1204c5914a19b305893 |
Add and use numcopiesneeded preferred content expression.
* Add numcopiesneeded preferred content expression.
* Client, transfer, incremental backup, and archive repositories
now want to get content that does not yet have enough copies.
This means the assistant will make copies of files that don't yet
meet the configured numcopies, even to places that would not normally want
the file.
For example, if numcopies is 4, and there are 2 client repos and
2 transfer repos, and 2 removable backup drives, the file will be sent
to both transfer repos in order to make 4 copies. Once a removable drive
gets a copy of the file, it will be dropped from one transfer repo or the
other (but not both).
Another example: numcopies is 3, and there is a client that has a backup
removable drive and two small archive repos. Normally once one of the small
archives has a file, it will not be put into the other one. But, to satisfy
numcopies, the assistant will duplicate it into the other small archive
too, if the backup repo is not available to receive the file.
I notice that these examples are fairly unlikely setups .. the old behavior
was not too bad, but it's nice to finally have it really correct.
.. Almost. I have skipped checking the annex.numcopies .gitattributes
out of fear it will be too slow.
This commit was sponsored by Florian Schlegel.
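
The first example in the commit message can be traced with a small pure model of the check. This is only a sketch, not git-annex's code: `Repo`, `copiesNeeded`, and the repository names are hypothetical stand-ins, trust levels are ignored, and it assumes that `notPresent` holds repositories whose copy should be treated as absent (for instance, a repository that is deciding whether it may drop the file).

```haskell
import qualified Data.Set as S

-- Hypothetical stand-in for git-annex's repository UUIDs.
type Repo = String

-- How many more copies are needed to reach numcopies, not counting
-- any repository listed in notPresent.
copiesNeeded :: Int -> S.Set Repo -> [Repo] -> Int
copiesNeeded numcopies notPresent locations =
    numcopies - length (filter (`S.notMember` notPresent) locations)

-- Model of numcopiesneeded=N: true when at least N more copies are needed.
numCopiesNeeded :: Int -> Int -> S.Set Repo -> [Repo] -> Bool
numCopiesNeeded needed numcopies notPresent locations =
    copiesNeeded numcopies notPresent locations >= needed

main :: IO ()
main = do
    let clientsAndOneTransfer = ["client1", "client2", "transfer1"]
    -- numcopies is 4 and only 3 copies exist, so the other transfer
    -- repo wants the file.
    print (numCopiesNeeded 1 4 S.empty clientsAndOneTransfer)  -- True
    -- With both transfer repos holding it, nothing more is needed.
    print (numCopiesNeeded 1 4 S.empty
        (clientsAndOneTransfer ++ ["transfer2"]))  -- False
    -- A removable drive now has a copy. transfer1 asks whether it still
    -- wants the file, treating its own copy as absent: 4 others remain.
    print (numCopiesNeeded 1 4 (S.singleton "transfer1")
        ["client1", "client2", "transfer1", "transfer2", "drive1"])  -- False
    -- After transfer1 drops, transfer2 asks the same question: only 3
    -- other copies remain, so it keeps the file.
    print (numCopiesNeeded 1 4 (S.singleton "transfer2")
        ["client1", "client2", "transfer2", "drive1"])  -- True
```

Under these assumptions the last two checks reproduce the "one transfer repo or the other (but not both)" behavior described above.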
-rw-r--r-- | Annex/FileMatcher.hs | 1
-rw-r--r-- | GitAnnex/Options.hs | 2
-rw-r--r-- | Limit.hs | 27
-rw-r--r-- | Types/StandardGroups.hs | 4
-rw-r--r-- | debian/changelog | 3
-rw-r--r-- | doc/git-annex.mdwn | 9
-rw-r--r-- | doc/preferred_content.mdwn | 8
-rw-r--r-- | doc/todo/preferred_content_numcopies_check.mdwn | 7
8 files changed, 52 insertions, 9 deletions
diff --git a/Annex/FileMatcher.hs b/Annex/FileMatcher.hs
index 96cb8fd6f..6ec0bace9 100644
--- a/Annex/FileMatcher.hs
+++ b/Annex/FileMatcher.hs
@@ -70,6 +70,7 @@ parseToken checkpresent checkpreferreddir groupmap t
 	[ ("include", limitInclude)
 	, ("exclude", limitExclude)
 	, ("copies", limitCopies)
+	, ("numcopiesneeded", limitNumCopiesNeeded)
 	, ("inbackend", limitInBackend)
 	, ("largerthan", limitSize (>))
 	, ("smallerthan", limitSize (<))
diff --git a/GitAnnex/Options.hs b/GitAnnex/Options.hs
index fbb34470b..ad1e0c93b 100644
--- a/GitAnnex/Options.hs
+++ b/GitAnnex/Options.hs
@@ -41,6 +41,8 @@ options = Option.common ++
 		"match files present in a remote"
 	, Option ['C'] ["copies"] (ReqArg Limit.addCopies paramNumber)
 		"skip files with fewer copies"
+	, Option [] ["numcopiesneeded"] (ReqArg Limit.addNumCopiesNeeded paramNumber)
+		"match files that need more copies"
 	, Option ['B'] ["inbackend"] (ReqArg Limit.addInBackend paramName)
 		"match files using a key-value backend"
 	, Option [] ["inallgroup"] (ReqArg Limit.addInAllGroup paramGroup)
diff --git a/Limit.hs b/Limit.hs
--- a/Limit.hs
+++ b/Limit.hs
@@ -1,6 +1,6 @@
 {- user-specified limits on files to act on
  -
- - Copyright 2011-2013 Joey Hess <joey@kitenet.net>
+ - Copyright 2011-2014 Joey Hess <joey@kitenet.net>
  -
  - Licensed under the GNU GPL version 3 or higher.
  -}
@@ -23,6 +23,7 @@ import qualified Backend
 import Annex.Content
 import Annex.UUID
 import Logs.Trust
+import Logs.NumCopies
 import Types.TrustLevel
 import Types.Key
 import Types.Group
@@ -177,6 +178,30 @@ limitCopies want = case split ":" want of
 		| "+" `isSuffixOf` s = (>=) <$> readTrustLevel (beginning s)
 		| otherwise = (==) <$> readTrustLevel s
 
+{- Adds a limit to match files that need more copies made.
+ -
+ - Does not look at annex.numcopies .gitattributes, because that
+ - would require querying git check-attr every time a preferred content
+ - expression is checked, which would probably be quite slow.
+ -}
+addNumCopiesNeeded :: String -> Annex ()
+addNumCopiesNeeded = addLimit . limitNumCopiesNeeded
+
+limitNumCopiesNeeded :: MkLimit
+limitNumCopiesNeeded want = case readish want of
+	Just needed -> Right $ \notpresent -> checkKey $
+		handle needed notpresent
+	Nothing -> Left "bad value for numcopiesneeded"
+  where
+	handle needed notpresent key = do
+		gv <- getGlobalNumCopies
+		case gv of
+			Nothing -> return False
+			Just numcopies -> do
+				us <- filter (`S.notMember` notpresent)
+					<$> (trustExclude UnTrusted =<< Remote.keyLocations key)
+				return $ numcopies - length us >= needed
+
 {- Adds a limit to skip files not believed to be present in all
  - repositories in the specified group. -}
 addInAllGroup :: String -> Annex ()
diff --git a/Types/StandardGroups.hs b/Types/StandardGroups.hs
index 51788ec4e..c4c3ba9f3 100644
--- a/Types/StandardGroups.hs
+++ b/Types/StandardGroups.hs
@@ -93,6 +93,6 @@ notArchived :: String
 notArchived = "not (copies=archive:1 or copies=smallarchive:1)"
 
 {- Most repositories want any content that is only on untrusted
- - or dead repositories. -}
+ - or dead repositories, or that otherwise does not have enough copies. -}
 lastResort :: String -> PreferredContentExpression
-lastResort s = "(" ++ s ++ ") or (not copies=semitrusted+:1)"
+lastResort s = "(" ++ s ++ ") or numcopiesneeded=1"
diff --git a/debian/changelog b/debian/changelog
index 9ecb4d2b6..923fb1692 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -14,6 +14,9 @@ git-annex (5.20140118) UNRELEASED; urgency=medium
     command is used to set the global number of copies, any annex.numcopies
     git configs will be ignored.
   * assistant: Make the prefs page set the global numcopies.
+  * Add numcopiesneeded preferred content expression.
+  * Client, transfer, incremental backup, and archive repositories
+    now want to get content that does not yet have enough copies.
 
  -- Joey Hess <joeyh@debian.org>  Sat, 18 Jan 2014 11:54:17 -0400
 
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 4e7bd2395..8a948d303 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -1020,6 +1020,15 @@ file contents are present at either of two repositories.
   copies, on remotes in the specified group. For example,
   `--copies=archive:2`
 
+* `--numcopiesneeded=number`
+
+  Matches only files that git-annex believes need the specified number or
+  more additional copies to be made in order to satisfy their numcopies
+  setting, as configured by the global numcopies setting of the repository.
+
+  Note that for various reasons, including speed, this does not look
+  at the annex.numcopies .gitattributes settings of files.
+
 * `--inbackend=name`
 
   Matches only files whose content is stored using the specified key-value
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
index 9c698c8ba..b18f46c33 100644
--- a/doc/preferred_content.mdwn
+++ b/doc/preferred_content.mdwn
@@ -113,7 +113,7 @@ any repository that can will back it up.)
 All content is preferred, unless it's for a file in a "archive" directory,
 which has reached an archive repository.
 
-`((exclude=*/archive/* and exclude=archive/*) or (not (copies=archive:1 or copies=smallarchive:1))) or (not copies=semitrusted+:1)`
+`((exclude=*/archive/* and exclude=archive/*) or (not (copies=archive:1 or copies=smallarchive:1))) or numcopiesneeded=1`
 
 ### transfer
 
@@ -147,20 +147,20 @@ All content is preferred.
 Only prefers content that's not already backed up to another backup
 or incremental backup repository.
 
-`(include=* and (not copies=backup:1) and (not copies=incrementalbackup:1)) or (not copies=semitrusted+:1)`
+`(include=* and (not copies=backup:1) and (not copies=incrementalbackup:1)) or numcopiesneeded=1`
 
 ### small archive
 
 Only prefers content that's located in an "archive" directory, and
 only if it's not already been archived somewhere else.
 
-`((include=*/archive/* or include=archive/*) and not (copies=archive:1 or copies=smallarchive:1)) or (not copies=semitrusted+:1)`
+`((include=*/archive/* or include=archive/*) and not (copies=archive:1 or copies=smallarchive:1)) or numcopiesneeded=1`
 
 ### full archive
 
 All content is preferred, unless it's already been archived somewhere else.
 
-`(not (copies=archive:1 or copies=smallarchive:1)) or (not copies=semitrusted+:1)`
+`(not (copies=archive:1 or copies=smallarchive:1)) or numcopiesneeded=1`
 
 Note that if you want to archive multiple copies (not a bad idea!),
 you should instead configure all your archive repositories with a
diff --git a/doc/todo/preferred_content_numcopies_check.mdwn b/doc/todo/preferred_content_numcopies_check.mdwn
index 152afe08c..8aa736a04 100644
--- a/doc/todo/preferred_content_numcopies_check.mdwn
+++ b/doc/todo/preferred_content_numcopies_check.mdwn
@@ -54,9 +54,12 @@ Conclusion:
 * Add "numcopiesneeded=N" preferred content expression using the git-annex
   branch numcopies setting, overridden by any .gitattributes numcopies setting
   for a particular file. It should ignore the other ways to specify
-  numcopies.
+  numcopies, particularly git config annex.numcopies. **done**
 * Make the repo groups that currently end with "or (not copies=semitrusted+:1)"
-  to instead end with "or numcopiesneeded=1"
+  to instead end with "or numcopiesneeded=1" **done**
+* See if "numcopiesneeded=N" can check .gitattributes without getting
+  a lot slower. If now, perhaps add a "numcopiesneededaccurate=N" that
+  checks it.
 
 ## Stability analysis
 
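
The effect of the `lastResort` change on the generated expressions can be seen by copying the two definitions from Types/StandardGroups.hs into a standalone file. This is a sketch for illustration: `lastResortOld` is a hypothetical name for the pre-commit definition, and the `PreferredContentExpression` alias is assumed to be a plain `String`. How each standard group actually builds its expression is not shown in the diff.

```haskell
-- Assumed to be a String alias; the real type lives elsewhere in git-annex.
type PreferredContentExpression = String

-- Copied from the context lines of the Types/StandardGroups.hs hunk.
notArchived :: String
notArchived = "not (copies=archive:1 or copies=smallarchive:1)"

-- The definition removed by this commit (renamed here for comparison).
lastResortOld :: String -> PreferredContentExpression
lastResortOld s = "(" ++ s ++ ") or (not copies=semitrusted+:1)"

-- The definition added by this commit.
lastResort :: String -> PreferredContentExpression
lastResort s = "(" ++ s ++ ") or numcopiesneeded=1"

main :: IO ()
main = do
    -- Old suffix: only wanted as a last resort when no semitrusted
    -- (or better) repository has the content.
    putStrLn (lastResortOld notArchived)
    -- New suffix: wanted whenever at least one more copy is needed.
    putStrLn (lastResort notArchived)
```

The second line printed is the same string that doc/preferred_content.mdwn now lists for the full archive group.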