summaryrefslogtreecommitdiff
path: root/doc/todo/preferred_content_numcopies_check.mdwn
diff options
context:
space:
mode:
Diffstat (limited to 'doc/todo/preferred_content_numcopies_check.mdwn')
-rw-r--r--doc/todo/preferred_content_numcopies_check.mdwn61
1 files changed, 61 insertions, 0 deletions
diff --git a/doc/todo/preferred_content_numcopies_check.mdwn b/doc/todo/preferred_content_numcopies_check.mdwn
new file mode 100644
index 000000000..956888cca
--- /dev/null
+++ b/doc/todo/preferred_content_numcopies_check.mdwn
@@ -0,0 +1,61 @@
+The assistant and git annex sync --content do not try to proactively
+download content that is not otherwise wanted in order to get numcopies
+satisfied. (Unlike get --auto, which does take numcopies into account.)
+
+Should these automated systems try to proactively satisfy numcopies? I
+don't feel they should. It could result in surprising results. For example,
+a transfer repository, which is of limited size, could start being filled
+up with lots of content that all clients have, just because numcopies was
+set to a larger number than the total number of clients. Another example,
+a source repository on eg an Android phone, should never have content in it
+that was not created on that device.
+
+However, it would make sense for some specific
+types of repositories to proactively get content to satisfy numcopies.
+Currently some types of repositories use "or (not copies=semitrusted+:1)",
+to ensure that if the only copy of a file is on a dead repository, they
+will try to get that file before the repo goes away. This is done
+by client repositories, and backup, and archive. Probably the same set
+would make sense to proactively satisfy numcopies.
+
+So, a new type of preferred content expression is called for. Such as, for
+example, "numcopiesneeded=1". Which indicates that at least 1 more copy
+is needed to satifsy numcopies.
+
+(Note that it should only count semittrusted and higher trust
+level repos as satisfying numcopies.)
+
+But, preferred content expressions can only operate on info stored in the
+git repo, or they will fail to be stable. Ie, repo A needs to be able to
+calculate whether a file is preferred content by repo B and get the same
+result as when repo B calculates that.
+
+numcopies is currently configured in 3 places:
+
+* .git/config `annex.numcopies` (global, stored only locally)
+* .gitattributes `annex.numcopies` (per file, stored in git repo)
+* --numcopies (not relevant)
+
+So, need to add a global numcopies setting that is stored in the git repo.
+That could either be a file in the git-annex branch, or just
+`* annex.numcopies=2` in the toplevel .gitattributes. Note that the
+assistant needs to be able to query and set it, which I think argues
+against using .gitattributes for it. Also arguing against that is that the
+.git/config numcopies valie applies even to objects with no file in the
+work tree, which gitattributes settings do not.
+
+Conclusion:
+
+* Add to the git-annex branch a numcopies file that holds the global
+ numcopies default if present.
+* Modify the assistant to use it when configuring numcopies.
+* To deprecate .git/config's annex.numcopies, only make it take effect
+ when there is no numcopies file in the git-annex branch.
+* Add "numcopiesneeded=N" preferred content expression using the git-annex
+ branch numcopies setting, overridden by any .gitattributes numcopies setting
+ for a particular file. It should ignore the other ways to specify
+ numcopies.
+* Make the repo groups that currently end with "or (not copies=semitrusted+:1)"
+ to instead end with "or (not numcopiesneeded=1)"
+
+--[[Joey]]