author | Joey Hess <joey@kitenet.net> | 2012-10-19 16:09:21 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2012-10-19 16:09:21 -0400 |
commit | 5237b17193befdac87dfbeac9391e5ccad3049bb | |
tree | a107b4120c993103a0d37668ec9e6c9f3b28041e | |
parent | 5f835c769eb673b28d2d0211adfd6cbdf420b4bc |
Replace "in=" with "present" in preferred content expressions
in= was problematic in two ways. First, it referred to a remote by name,
but preferred content expressions can be evaluated elsewhere, where that
remote doesn't exist, or a different remote has the same name. This name
lookup code could error out at runtime. Secondly, in= seemed pretty useless.
in=here did not cause content to be gotten, but it did let present content
be dropped.
present is more useful, although "not present" is unstable and should be
avoided.
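
To illustrate why "not present" is unstable, here is a toy model. It is not git-annex code: `Preferred`, `step`, and `passes` are hypothetical names, and the expression is reduced to a predicate on whether the content is currently here. It assumes an automatic pass gets content that is preferred but missing and drops content that is here but not preferred.

```haskell
-- Toy model, not git-annex code: every name below is hypothetical.
module Main where

-- | A preferred content expression, reduced to a predicate on whether
-- the content is currently present in this repository.
type Preferred = Bool -> Bool

-- | The "present" keyword: content is preferred exactly when it is here.
present :: Preferred
present = id

-- | "not present": content is preferred exactly when it is not here.
notPresent :: Preferred
notPresent = not

-- | One automatic pass: get preferred content that is missing, drop
-- content that is here but not preferred, otherwise change nothing.
step :: Preferred -> Bool -> Bool
step preferred here
    | preferred here && not here = True   -- get
    | not (preferred here) && here = False -- drop
    | otherwise = here

-- | Presence state before each of the first n passes.
passes :: Preferred -> Bool -> Int -> [Bool]
passes preferred start n = take n (iterate (step preferred) start)

main :: IO ()
main = do
    print (passes present True 4)     -- stays present
    print (passes present False 4)    -- stays absent
    print (passes notPresent False 4) -- flips on every pass
```

In this model "present" is a fixed point in both states, so nothing moves unless you move it by hand, while "not present" never settles: the content is fetched, stops being preferred, is dropped, and becomes preferred again.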
-rw-r--r-- | Limit.hs | 14 |
-rw-r--r-- | Logs/PreferredContent.hs | 18 |
-rw-r--r-- | debian/changelog | 2 |
-rw-r--r-- | doc/preferred_content.mdwn | 53 |
4 files changed, 73 insertions, 14 deletions
@@ -113,6 +113,20 @@ limitIn name = Right $ \notpresent -> check $
 		then return False
 		else inAnnex key
 
+{- Limit to content that is currently present on a uuid. -}
+limitPresent :: Maybe UUID -> MkLimit
+limitPresent u name = Right $ const $ check $ \key -> do
+	hereu <- getUUID
+	if u == Just hereu || u == Nothing
+		then inAnnex key
+		else do
+			us <- Remote.keyLocations key
+			return $ maybe False (`elem` us) u
+	where
+		check a = lookupFile >=> handle a
+		handle _ Nothing = return False
+		handle a (Just (key, _)) = a key
+
 {- Adds a limit to skip files not believed to have the specified number
  - of copies. -}
 addCopies :: String -> Annex ()
diff --git a/Logs/PreferredContent.hs b/Logs/PreferredContent.hs
index d3c120b70..f3454cc7d 100644
--- a/Logs/PreferredContent.hs
+++ b/Logs/PreferredContent.hs
@@ -88,7 +88,7 @@ makeMatcher groupmap u s
 	| null (lefts tokens) = Utility.Matcher.generate $ rights tokens
 	| otherwise = matchAll
 	where
-		tokens = map (parseToken groupmap) (tokenizeMatcher s)
+		tokens = map (parseToken (Just u) groupmap) (tokenizeMatcher s)
 
 {- Standard matchers are pre-defined for some groups. If none is defined,
  - or a repository is in multiple groups with standard matchers, match all. -}
@@ -103,26 +103,26 @@ matchAll = Utility.Matcher.generate []
 checkPreferredContentExpression :: String -> Maybe String
 checkPreferredContentExpression s
 	| s == "standard" = Nothing
-	| otherwise = case lefts $ map (parseToken emptyGroupMap) (tokenizeMatcher s) of
+	| otherwise = case lefts $ map (parseToken Nothing emptyGroupMap) (tokenizeMatcher s) of
 		[] -> Nothing
 		l -> Just $ unwords $ map ("Parse failure: " ++) l
 
-parseToken :: GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
-parseToken groupmap t
+parseToken :: (Maybe UUID) -> GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
+parseToken mu groupmap t
 	| any (== t) Utility.Matcher.tokens = Right $ Utility.Matcher.token t
-	| otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k m
-	where
-		(k, v) = separate (== '=') t
-		m = M.fromList
+	| t == "present" = use $ limitPresent mu
+	| otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k $
+		M.fromList
 			[ ("include", limitInclude)
 			, ("exclude", limitExclude)
-			, ("in", limitIn)
 			, ("copies", limitCopies)
 			, ("inbackend", limitInBackend)
 			, ("largerthan", limitSize (>))
 			, ("smallerthan", limitSize (<))
 			, ("inallgroup", limitInAllGroup groupmap)
 			]
+	where
+		(k, v) = separate (== '=') t
 		use a = Utility.Matcher.Operation <$> a v
 
 {- This is really dumb tokenization; there's no support for quoted values.
diff --git a/debian/changelog b/debian/changelog
index 7371a351b..314b500d5 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,6 +2,8 @@ git-annex (3.20121018) UNRELEASED; urgency=low
 
   * Fix handling of GIT_DIR when it refers to a git submodule.
   * Preferred content path matching bugfix.
+  * Preferred content expressions cannot use "in=".
+  * Preferred content expressions can use "present".
 
  -- Joey Hess <joeyh@debian.org>  Wed, 17 Oct 2012 14:24:10 -0400
 
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
index d74986503..ac2cd1ecf 100644
--- a/doc/preferred_content.mdwn
+++ b/doc/preferred_content.mdwn
@@ -20,17 +20,18 @@ The expressions are very similar to the file matching options documented
 on the [[git-annex]] man page. At the command line, you can use those
 options in commands like this:
 
-	git annex get --include='*.mp3' --and -'(' --not --in=archive -')'
+	git annex get --include='*.mp3' --and -'(' --not --largerthan=100mb -')'
 
 The equivilant preferred content expression looks like this:
 
-	include=*.mp3 and (not in=archive)
+	include=*.mp3 and (not largerthan=100mb)
 
-So, just remove the dashes, basically.
+So, just remove the dashes, basically. However, there are some differences
+from the command line options to keep in mind:
 
-## file matching
+### difference: file matching
 
-Note that while --include and --exclude match files relative to the current
+While --include and --exclude match files relative to the current
 directory, preferred content expressions always match files relative to the
 top of the git repository. Perhaps you put files into `archive` directories
 when you're done with them. Then you could configure your laptop to prefer
@@ -38,6 +39,48 @@ to not retain those files, like this:
 
 	exclude=*/archive/*
 
+### difference: no "in="
+
+Preferred content expressions have no direct equivilant to `--in`.
+
+Often, it's best to add repositories to groups, and match against
+the groups in a preferred content expression. So rather than
+`--in=usbdrive`, put all the USB drives into a "transfer" group,
+and use "copies=transfer:1"
+
+### difference: dropping
+
+To decide if content should be dropped, git-annex evaluates the preferred
+content expression under the assumption that the content has *already* been
+dropped. If the content would not be preferred then, the drop can be done.
+So, for example, `copies=2` in a preferred content expression lets
+content be dropped only when there are currently 3 copies of it, including
+the repo it's being dropped from. This is different than running `git annex
+drop --copies=2`, which will drop files that current have 2 copies.
+
+A wrinkle of this approach is how `in=` is handled. When deciding if
+content should be dropped, git-annex looks at the current status, not
+the status if the content would be dropped. So `in=here` means that
+any currently present content is preferred, which can be useful if you
+want manual control over content. Meanwhile `not (in=here)` should be
+avoided -- it will cause content that's not here to be preferred,
+but once the content arrives, it'll stop being preferred and will be
+dropped again!
+
+## difference: "present"
+
+There's a special "present" keyword you can use in a preferred content
+expression. This means that content is preferred if it's present,
+and not otherwise. This leaves it up to you to use git-annex manually
+to move content around. You can use this to avoid preferred content
+settings from affecting a subdirectory. For example:
+
+	auto/* or (include=ad-hoc/* and present)
+
+Note that `not present` is a very bad thing to put in a preferred content
+expression. It'll make it prefer to get content that's not present, and
+drop content that is present! Don't go there..
+
 ## standard expressions
 
 git-annex comes with some standard preferred content expressions, that can