author Joey Hess <joey@kitenet.net> 2012-10-19 16:09:21 -0400
committer Joey Hess <joey@kitenet.net> 2012-10-19 16:09:21 -0400
commit 5237b17193befdac87dfbeac9391e5ccad3049bb
tree a107b4120c993103a0d37668ec9e6c9f3b28041e
parent 5f835c769eb673b28d2d0211adfd6cbdf420b4bc
Replace "in=" with "present" in preferred content expressions
in= was problematic in two ways. First, it referred to a remote by name, but preferred content expressions can be evaluated elsewhere, where that remote doesn't exist, or a different remote has the same name. This name lookup code could error out at runtime. Secondly, in= seemed pretty useless. in=here did not cause content to be gotten, but it did let present content be dropped. present is more useful, although "not present" is unstable and should be avoided.
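To illustrate why "present" is stable while "not present" is not, here is a minimal toy model. It is only a sketch and not git-annex code: the names `Expr`, `Action`, `preferred`, `decide`, `apply` and `simulate` are all hypothetical, and the model assumes that content is fetched when it is preferred but absent, and dropped when it is present but not preferred.

```haskell
-- Toy model of the "present" keyword. Every name here is hypothetical;
-- none of this is taken from the git-annex source.
data Expr
  = Present   -- the "present" keyword
  | Not Expr  -- "not <expr>"
  deriving Show

data Action = Get | Drop | Keep
  deriving (Eq, Show)

-- "present" is evaluated against the current status of the content.
-- The Bool says whether the content is in the local repository.
preferred :: Bool -> Expr -> Bool
preferred here Present = here
preferred here (Not e) = not (preferred here e)

-- Assumption of this sketch: get content that is preferred but absent,
-- drop content that is present but not preferred, otherwise do nothing.
decide :: Bool -> Expr -> Action
decide here e
  | not here && preferred here e = Get
  | here && not (preferred here e) = Drop
  | otherwise = Keep

-- Effect of an action on the "is the content here?" flag.
apply :: Action -> Bool -> Bool
apply Get _ = True
apply Drop _ = False
apply Keep here = here

-- Actions chosen over n successive evaluations from a starting state.
simulate :: Int -> Expr -> Bool -> [Action]
simulate n e start = take n (go start)
  where
    go here = let a = decide here e in a : go (apply a here)
```

In this model, `simulate 3 Present False` and `simulate 3 Present True` both yield `[Keep,Keep,Keep]`, so content stays wherever it was put by hand. By contrast, `simulate 4 (Not Present) False` yields `[Get,Drop,Get,Drop]`: the content is fetched, stops being preferred once it arrives, is dropped, and the cycle repeats.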
-rw-r--r-- Limit.hs 14
-rw-r--r-- Logs/PreferredContent.hs 18
-rw-r--r-- debian/changelog 2
-rw-r--r-- doc/preferred_content.mdwn 53
4 files changed, 73 insertions, 14 deletions
diff --git a/Limit.hs b/Limit.hs
index cdaadfe2d..ae77d8a5a 100644
--- a/Limit.hs
+++ b/Limit.hs
@@ -113,6 +113,20 @@ limitIn name = Right $ \notpresent -> check $
then return False
else inAnnex key
+{- Limit to content that is currently present on a uuid. -}
+limitPresent :: Maybe UUID -> MkLimit
+limitPresent u name = Right $ const $ check $ \key -> do
+ hereu <- getUUID
+ if u == Just hereu || u == Nothing
+ then inAnnex key
+ else do
+ us <- Remote.keyLocations key
+ return $ maybe False (`elem` us) u
+ where
+ check a = lookupFile >=> handle a
+ handle _ Nothing = return False
+ handle a (Just (key, _)) = a key
+
{- Adds a limit to skip files not believed to have the specified number
- of copies. -}
addCopies :: String -> Annex ()
diff --git a/Logs/PreferredContent.hs b/Logs/PreferredContent.hs
index d3c120b70..f3454cc7d 100644
--- a/Logs/PreferredContent.hs
+++ b/Logs/PreferredContent.hs
@@ -88,7 +88,7 @@ makeMatcher groupmap u s
| null (lefts tokens) = Utility.Matcher.generate $ rights tokens
| otherwise = matchAll
where
- tokens = map (parseToken groupmap) (tokenizeMatcher s)
+ tokens = map (parseToken (Just u) groupmap) (tokenizeMatcher s)
{- Standard matchers are pre-defined for some groups. If none is defined,
- or a repository is in multiple groups with standard matchers, match all. -}
@@ -103,26 +103,26 @@ matchAll = Utility.Matcher.generate []
checkPreferredContentExpression :: String -> Maybe String
checkPreferredContentExpression s
| s == "standard" = Nothing
- | otherwise = case lefts $ map (parseToken emptyGroupMap) (tokenizeMatcher s) of
+ | otherwise = case lefts $ map (parseToken Nothing emptyGroupMap) (tokenizeMatcher s) of
[] -> Nothing
l -> Just $ unwords $ map ("Parse failure: " ++) l
-parseToken :: GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
-parseToken groupmap t
+parseToken :: (Maybe UUID) -> GroupMap -> String -> Either String (Utility.Matcher.Token MatchFiles)
+parseToken mu groupmap t
| any (== t) Utility.Matcher.tokens = Right $ Utility.Matcher.token t
- | otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k m
- where
- (k, v) = separate (== '=') t
- m = M.fromList
+ | t == "present" = use $ limitPresent mu
+ | otherwise = maybe (Left $ "near " ++ show t) use $ M.lookup k $
+ M.fromList
[ ("include", limitInclude)
, ("exclude", limitExclude)
- , ("in", limitIn)
, ("copies", limitCopies)
, ("inbackend", limitInBackend)
, ("largerthan", limitSize (>))
, ("smallerthan", limitSize (<))
, ("inallgroup", limitInAllGroup groupmap)
]
+ where
+ (k, v) = separate (== '=') t
use a = Utility.Matcher.Operation <$> a v
{- This is really dumb tokenization; there's no support for quoted values.
diff --git a/debian/changelog b/debian/changelog
index 7371a351b..314b500d5 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -2,6 +2,8 @@ git-annex (3.20121018) UNRELEASED; urgency=low
* Fix handling of GIT_DIR when it refers to a git submodule.
* Preferred content path matching bugfix.
+ * Preferred content expressions cannot use "in=".
+ * Preferred content expressions can use "present".
-- Joey Hess <joeyh@debian.org> Wed, 17 Oct 2012 14:24:10 -0400
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
index d74986503..ac2cd1ecf 100644
--- a/doc/preferred_content.mdwn
+++ b/doc/preferred_content.mdwn
@@ -20,17 +20,18 @@ The expressions are very similar to the file matching options documented
on the [[git-annex]] man page. At the command line, you can use those
options in commands like this:
- git annex get --include='*.mp3' --and -'(' --not --in=archive -')'
+ git annex get --include='*.mp3' --and -'(' --not --largerthan=100mb -')'
The equivalent preferred content expression looks like this:
- include=*.mp3 and (not in=archive)
+ include=*.mp3 and (not largerthan=100mb)
-So, just remove the dashes, basically.
+So, just remove the dashes, basically. However, there are some differences
+from the command line options to keep in mind:
-## file matching
+### difference: file matching
-Note that while --include and --exclude match files relative to the current
+While --include and --exclude match files relative to the current
directory, preferred content expressions always match files relative to the
top of the git repository. Perhaps you put files into `archive` directories
when you're done with them. Then you could configure your laptop to prefer
@@ -38,6 +39,48 @@ to not retain those files, like this:
exclude=*/archive/*
+### difference: no "in="
+
+Preferred content expressions have no direct equivalent to `--in`.
+
+Often, it's best to add repositories to groups, and match against
+the groups in a preferred content expression. So rather than
+`--in=usbdrive`, put all the USB drives into a "transfer" group,
+and use "copies=transfer:1"
+
+### difference: dropping
+
+To decide if content should be dropped, git-annex evaluates the preferred
+content expression under the assumption that the content has *already* been
+dropped. If the content would not be preferred then, the drop can be done.
+So, for example, `copies=2` in a preferred content expression lets
+content be dropped only when there are currently 3 copies of it, including
+the repo it's being dropped from. This is different than running `git annex
+drop --copies=2`, which will drop files that currently have 2 copies.
+
+A wrinkle of this approach is how `in=` is handled. When deciding if
+content should be dropped, git-annex looks at the current status, not
+the status if the content would be dropped. So `in=here` means that
+any currently present content is preferred, which can be useful if you
+want manual control over content. Meanwhile `not (in=here)` should be
+avoided -- it will cause content that's not here to be preferred,
+but once the content arrives, it'll stop being preferred and will be
+dropped again!
+
+## difference: "present"
+
+There's a special "present" keyword you can use in a preferred content
+expression. This means that content is preferred if it's present,
+and not otherwise. This leaves it up to you to use git-annex manually
+to move content around. You can use this to avoid preferred content
+settings from affecting a subdirectory. For example:
+
+ auto/* or (include=ad-hoc/* and present)
+
+Note that `not present` is a very bad thing to put in a preferred content
+expression. It'll make it prefer to get content that's not present, and
+drop content that is present! Don't go there..
+
## standard expressions
git-annex comes with some standard preferred content expressions, that can