author | Joey Hess <joey@kitenet.net> | 2014-01-20 14:28:33 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2014-01-20 14:28:33 -0400 |
commit | 5c4a88a575449cdce510540cd36b28699a907011 (patch) | |
tree | cb32161cf12d745f2f0e620d6032ff39ad36da8e /doc/todo/preferred_content_numcopies_check.mdwn | |
parent | 2834e9f5813c320c9374a4d928c4ccf09a6ee33b (diff) |
design for preferred content numcopies check
Diffstat (limited to 'doc/todo/preferred_content_numcopies_check.mdwn')
-rw-r--r-- | doc/todo/preferred_content_numcopies_check.mdwn | 61 |
1 files changed, 61 insertions, 0 deletions
diff --git a/doc/todo/preferred_content_numcopies_check.mdwn b/doc/todo/preferred_content_numcopies_check.mdwn
new file mode 100644
index 000000000..956888cca
--- /dev/null
+++ b/doc/todo/preferred_content_numcopies_check.mdwn
@@ -0,0 +1,61 @@
+The assistant and git annex sync --content do not try to proactively
+download content that is not otherwise wanted in order to get numcopies
+satisfied. (Unlike get --auto, which does take numcopies into account.)
+
+Should these automated systems try to proactively satisfy numcopies? I
+don't feel they should. It could result in surprising results. For example,
+a transfer repository, which is of limited size, could start being filled
+up with lots of content that all clients have, just because numcopies was
+set to a larger number than the total number of clients. Another example,
+a source repository on eg an Android phone, should never have content in it
+that was not created on that device.
+
+However, it would make sense for some specific
+types of repositories to proactively get content to satisfy numcopies.
+Currently some types of repositories use "or (not copies=semitrusted+:1)",
+to ensure that if the only copy of a file is on a dead repository, they
+will try to get that file before the repo goes away. This is done
+by client repositories, and backup, and archive. Probably the same set
+would make sense to proactively satisfy numcopies.
+
+So, a new type of preferred content expression is called for. Such as, for
+example, "numcopiesneeded=1". Which indicates that at least 1 more copy
+is needed to satisfy numcopies.
+
+(Note that it should only count semitrusted and higher trust
+level repos as satisfying numcopies.)
+
+But, preferred content expressions can only operate on info stored in the
+git repo, or they will fail to be stable. Ie, repo A needs to be able to
+calculate whether a file is preferred content by repo B and get the same
+result as when repo B calculates that.
+
+numcopies is currently configured in 3 places:
+
+* .git/config `annex.numcopies` (global, stored only locally)
+* .gitattributes `annex.numcopies` (per file, stored in git repo)
+* --numcopies (not relevant)
+
+So, need to add a global numcopies setting that is stored in the git repo.
+That could either be a file in the git-annex branch, or just
+`* annex.numcopies=2` in the toplevel .gitattributes. Note that the
+assistant needs to be able to query and set it, which I think argues
+against using .gitattributes for it. Also arguing against that is that the
+.git/config numcopies value applies even to objects with no file in the
+work tree, which gitattributes settings do not.
+
+Conclusion:
+
+* Add to the git-annex branch a numcopies file that holds the global
+  numcopies default if present.
+* Modify the assistant to use it when configuring numcopies.
+* To deprecate .git/config's annex.numcopies, only make it take effect
+  when there is no numcopies file in the git-annex branch.
+* Add "numcopiesneeded=N" preferred content expression using the git-annex
+  branch numcopies setting, overridden by any .gitattributes numcopies setting
+  for a particular file. It should ignore the other ways to specify
+  numcopies.
+* Make the repo groups that currently end with "or (not copies=semitrusted+:1)"
+  to instead end with "or (not numcopiesneeded=1)"
+
+--[[Joey]]
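
To make the proposed "numcopiesneeded=N" check concrete, here is a rough Python sketch of how it could be evaluated using only information that is stored in the git repo. This is not git-annex code (git-annex is written in Haskell). The names `Trust`, `effective_numcopies`, `numcopies_needed` and `matches_numcopiesneeded` are hypothetical, and the fallback of 1 when nothing is configured is an assumption. It follows the rules in the design note: a per-file .gitattributes value overrides the global value from the git-annex branch, the .git/config and --numcopies settings are ignored, and only semitrusted or higher repositories count as copies.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Trust(IntEnum):
    """Trust levels, ordered so that comparisons express 'semitrusted+'."""
    DEAD = 0
    UNTRUSTED = 1
    SEMITRUSTED = 2
    TRUSTED = 3


def effective_numcopies(branch_default: Optional[int],
                        gitattributes_value: Optional[int],
                        fallback: int = 1) -> int:
    """Pick the numcopies value that every repo can compute identically.

    Only values stored in the git repo are consulted, so .git/config and
    --numcopies are deliberately left out. The fallback is an assumption.
    """
    if gitattributes_value is not None:
        return gitattributes_value  # per-file setting wins
    if branch_default is not None:
        return branch_default  # global default from the git-annex branch
    return fallback


def numcopies_needed(copy_trust_levels: Iterable[Trust], numcopies: int) -> int:
    """How many more copies are needed, counting only semitrusted+ repos."""
    counted = sum(1 for t in copy_trust_levels if t >= Trust.SEMITRUSTED)
    return max(0, numcopies - counted)


def matches_numcopiesneeded(n: int,
                            copy_trust_levels: Iterable[Trust],
                            numcopies: int) -> bool:
    """True when at least n more copies are needed to satisfy numcopies."""
    return numcopies_needed(copy_trust_levels, numcopies) >= n


if __name__ == "__main__":
    # One semitrusted copy plus copies on an untrusted and a dead repo,
    # with a global default of 2 and no per-file override.
    wanted = effective_numcopies(branch_default=2, gitattributes_value=None)
    copies = [Trust.SEMITRUSTED, Trust.UNTRUSTED, Trust.DEAD]
    print(matches_numcopiesneeded(1, copies, wanted))
```

Because every input here comes from the git repo (location and trust information plus the two numcopies settings), repo A and repo B would reach the same answer for the same file, which is the stability property the note asks for.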