Diffstat (limited to 'doc')
-rw-r--r-- doc/bugs/S3_memory_leaks.mdwn 4
-rw-r--r-- doc/design/assistant/blog/day_100__cursed_clouds.mdwn 19
-rw-r--r-- doc/design/assistant/transfer_control.mdwn 18
-rw-r--r-- doc/git-annex.mdwn 17
-rw-r--r-- doc/preferred_content.mdwn 37
-rw-r--r-- doc/walkthrough/automatically_managing_content.mdwn 5
6 files changed, 89 insertions, 11 deletions
diff --git a/doc/bugs/S3_memory_leaks.mdwn b/doc/bugs/S3_memory_leaks.mdwn
index f612de396..2f72b09ac 100644
--- a/doc/bugs/S3_memory_leaks.mdwn
+++ b/doc/bugs/S3_memory_leaks.mdwn
@@ -8,3 +8,7 @@ file size.
The author of hS3 is aware of the problem, and working on it. I think I
have identified the root cause of the buffering; it's done by hS3 so it can
resend the data if S3 sends it a 307 redirect. --[[Joey]]
+
+At least the send leak should be fixed by the patch in the s3-memory-leak
+branch in git. That needs a patch to hS3, which I have sent to its author.
+--[[Joey]]
diff --git a/doc/design/assistant/blog/day_100__cursed_clouds.mdwn b/doc/design/assistant/blog/day_100__cursed_clouds.mdwn
new file mode 100644
index 000000000..7ac38f463
--- /dev/null
+++ b/doc/design/assistant/blog/day_100__cursed_clouds.mdwn
@@ -0,0 +1,19 @@
+Preferred content control is wired up to `--auto` and working for `get`,
+`copy`, and `drop`. Note that `drop --from remote --auto` drops files that
+the remote's preferred content settings indicate it doesn't want;
+likewise `copy --to remote --auto` sends content that the remote does want.
+
+Also implemented `smallerthan`, `largerthan`, and `ingroup` limits,
+which should be everything needed for the scenarios described in
+[[transfer_control]].
+
+Dying to hook this up to the assistant, but a cloudy day is forcing me to
+curtail further computer use.
+
+----
+
+Also, last night I developed a patch for the hS3 library, that should let
+git-annex upload large files to S3 without buffering their whole content in
+memory. I have a `s3-memory-leak` branch in git-annex that uses the new API I
+developed. Hopefully hS3's maintainer will release a new version with that
+soon.
diff --git a/doc/design/assistant/transfer_control.mdwn b/doc/design/assistant/transfer_control.mdwn
index 204f5d090..1f53a5603 100644
--- a/doc/design/assistant/transfer_control.mdwn
+++ b/doc/design/assistant/transfer_control.mdwn
@@ -8,9 +8,9 @@ But often the remote is just a removable drive or a cloud remote,
that has a limited size. This page is about making the assistant do
something smart with such remotes.
-## specifying what data belongs on a remote
+## specifying what data a remote prefers to contain **done**
-Imagine a per-remote `annex-accept` setting, that matches things that
+Imagine a per-remote preferred content setting, that matches things that
should be stored on the remote.
For example, a MP3 player might use:
@@ -23,15 +23,11 @@ A USB drive that is carried between three laptops and used to sync data
between them might use: `not (in=laptop1 and in=laptop2 and in=laptop3)`
In this case, transferring data from the usb repo should
-check if `annex-accept` then rejects the data, and if so, drop it
+check if preferred content settings reject the data, and if so, drop it
from the repo. So once all three laptops have the data, it is
pruned from the transfer drive.
-It may make sense to have annex-accept info also be stored in the git-annex
-branch, for settings that should apply to a repo globally. Does it make
-sense to have local configuration too?
-
-## repo groups
+## repo groups **done**
Seems like git-annex needs a way to know the groups of repos. Some
groups:
@@ -57,14 +53,14 @@ Some examples of using groups:
* Make a cloud repo only hold data until all known clients have a copy:
- `not inall(enduser)`
+ `not ingroup(enduser)`
## configuration
The above is all well and good for those who enjoy boolean algebra, but
how to configure these sorts of expressions in the webapp?
-## the state change problem
+## the state change problem **done**
Imagine that a trusted repo has setting like `not copies=trusted:2`
This means that `git annex get --auto` should get files not in 2 trusted
@@ -81,3 +77,5 @@ Or, perhaps simulation could be used to detect the problem. Before
dropping, check the expression. Then simulate that the drop has happened.
Does the expression now make it want to add it? Then don't drop it!
How to implement this simulation?
+
+> Solved, fwiw..
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index abda54f76..bfdbbf737 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -502,7 +502,8 @@ subdirectories).
* --auto
Enables automatic mode. Commands that get, drop, or move file contents
- will only do so when needed to help satisfy the setting of annex.numcopies.
+ will only do so when needed to help satisfy the setting of annex.numcopies,
+ and preferred content configuration.
* --quiet
@@ -642,6 +643,20 @@ file contents are present at either of two repositories.
Matches only files whose content is stored using the specified key-value
backend.
+* --ingroup=groupname
+
+ Matches only files that git-annex believes are present in all repositories
+ in the specified group.
+
+* --smallerthan=size
+* --largerthan=size
+
+ Matches only files whose content is smaller than, or larger than the
+ specified size.
+
+ The size can be specified with any commonly used units, for example,
+ "0.5 gb" or "100 KiloBytes"
+
* --not
Inverts the next file matching option. For example, to only act on
diff --git a/doc/preferred_content.mdwn b/doc/preferred_content.mdwn
new file mode 100644
index 000000000..7c7d11267
--- /dev/null
+++ b/doc/preferred_content.mdwn
@@ -0,0 +1,37 @@
+git-annex tries to ensure that the configured number of [[copies]] of your
+data always exist, and leaves it up to you to use commands like `git annex
+get` and `git annex drop` to move the content to the repositories you want
+to contain it. But sometimes, it can be good to have more fine-grained
+control over which repositories prefer to have which content. Configuring
+this allows `git annex get --auto`, `git annex drop --auto`, etc to do
+smarter things.
+
+Currently, preferred content settings can only be edited using `git
+annex vicfg`. Each repository can have its own settings, and other
+repositories may also try to honor those settings. So there's no local
+`.git/config` setting for it.
+
+The idea is that you write an expression that files are matched against.
+If a file matches, it's preferred to have its content stored in the
+repository. If it doesn't, it's preferred to drop its content from
+the repository (if there are enough copies elsewhere).
+
+The expressions are very similar to the file matching options documented
+on the [[git-annex]] man page. At the command line, you can use those
+options in commands like this:
+
+ git annex get --include='*.mp3' --and -'(' --not --in=archive -')'
+
+The equivalent preferred content expression looks like this:
+
+ include=*.mp3 and (not in=archive)
+
+So, just remove the dashes, basically.
+
+Note that while --include and --exclude match files relative to the current
+directory, preferred content expressions always match files relative to the
+top of the git repository. Perhaps you put files into `out/` directories
+when you're done with them. Then you could configure your laptop to prefer
+to not retain those files, like this:
+
+ exclude=*/out/*
diff --git a/doc/walkthrough/automatically_managing_content.mdwn b/doc/walkthrough/automatically_managing_content.mdwn
index ef883efef..0080ebcb5 100644
--- a/doc/walkthrough/automatically_managing_content.mdwn
+++ b/doc/walkthrough/automatically_managing_content.mdwn
@@ -38,3 +38,8 @@ work toward having two copies of your files.
The --auto option can also be used with the copy command,
again this lets git-annex decide whether to actually copy content.
+
+The above shows how to use --auto to manage content based on the number
+of copies. It's also possible to configure, on a per-repository basis,
+which content is desired. Then --auto also takes that into account;
+see [[preferred_content]] for details.
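
The `doc/preferred_content.mdwn` hunk above says a preferred content expression can be derived from the command-line matching options by removing the dashes. The following is a minimal Python sketch of that rule, not part of git-annex: the function name `to_preferred_content` is hypothetical, and it assumes the arguments have already been split by the shell, so the quotes around `'*.mp3'`, `'('`, and `')'` are gone.

```python
def to_preferred_content(args):
    """Translate command-line matching options into an expression string.

    Hypothetical helper for illustration only; it is not a git-annex API.
    """
    tokens = []
    for arg in args:
        if arg in ("-(", "-)"):
            # Grouping options: "-(" becomes "(" and "-)" becomes ")".
            tokens.append(arg[1:])
        elif arg.startswith("--"):
            # Long options: "--include=*.mp3" becomes "include=*.mp3".
            tokens.append(arg[2:])
        else:
            # Anything else is passed through unchanged.
            tokens.append(arg)
    expr = " ".join(tokens)
    # Tidy the spacing just inside the parentheses.
    return expr.replace("( ", "(").replace(" )", ")")


# The options from the example command in preferred_content.mdwn.
example = ["--include=*.mp3", "--and", "-(", "--not", "--in=archive", "-)"]
print(to_preferred_content(example))
```

For the example command in that hunk, this prints the same expression the documentation gives. The sketch only rewrites syntax; it does not account for the difference the documentation notes, that preferred content expressions match files relative to the top of the repository rather than the current directory.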