author | Joey Hess <joey@kitenet.net> | 2014-01-22 14:35:38 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2014-01-22 14:35:38 -0400 |
commit | b09828d1b7bbd7f18a4186b47fbd70b66e682af4 (patch) | |
tree | a61607eeb63e0721442c793709172bd537deefb2 /doc/todo/Limit_file_revision_history.mdwn | |
parent | d3f76d9d0f0811f71be39d09a9c1703eb5d4cb4c (diff) |
promote forum post to feature request; add design
Diffstat (limited to 'doc/todo/Limit_file_revision_history.mdwn')
-rw-r--r-- | doc/todo/Limit_file_revision_history.mdwn | 89 |
1 files changed, 89 insertions, 0 deletions
diff --git a/doc/todo/Limit_file_revision_history.mdwn b/doc/todo/Limit_file_revision_history.mdwn
new file mode 100644
index 000000000..593e93013
--- /dev/null
+++ b/doc/todo/Limit_file_revision_history.mdwn
@@ -0,0 +1,89 @@
+Hi, I am planning to use git-annex-assistant for two use cases, but I would like to ask about the options or planned roadmap for dropped/removed files from the repository.
+
+Use cases:
+
+1. sync working directory between laptop, home computer, work computer
+2. archive functionality for my photographs
+
+Both use cases have one common factor. Some files might become obsolete and
+in the long run nobody is interested in keeping their revisions. Let's
+take photographs. My usual workflow is to import all photographs to the
+filesystem, then assess (select) the good ones I want to keep and then
+process them in whatever way.
+
+The problem I have with git-annex(-assistant) is that it starts to version all
+of the files at the time they are added to the directory. This is welcome at
+first but might be an issue if you tend to put 80% of the size of your
+imported files in the trash.
+
+I am aware of what git-annex is not. I have been reading the documentation for
+"git-annex drop" and the "unused" options, including the forums. I do understand
+that I am actually able to delete all revisions of a file if I drop
+it, remove it and then run git annex dropunused 1..### (on all synced
+repositories).
+
+What I actually miss is an option to have the above process automated/replicated to the other synced repositories.
+
+I would formulate the 'use case' requirements for git-annex as:
+
+* command to drop a file including revisions from all annex repositories?
+  (for example, like moving a file to a /trash folder that will schedule
+  its deletion)
+* option to keep like max. 10 last revisions of the file?
+* option to keep only previous revisions if younger than 6 months from now?
+
+Finally, how do I file a feature request for git-annex?
+
+> By moving it here ;-) --[[Joey]]
+
+> So, let's spec out a design.
+>
+> * Add a preferred content terminal to configure whether a repository wants
+>   to hang on to unused content.
+>   Something like "unused=true" I suppose, because not having a parameter
+>   would complicate preferred content parsing, and I cannot think
+>   of a useful parameter.
+> * In order to quickly match that terminal, the Annex monad will need
+>   to keep a Set of unused Keys. This should only be loaded on demand.
+>   NB: There is some potential for a great many unused Keys to cause
+>   memory usage to balloon.
+> * Client repositories will end their preferred content with
+>   `and unused=false`. Transfer repositories too, because typically
+>   only client repos connect to them, and so otherwise unused files
+>   would build up there. Backup repos would want unused files. I
+>   think that archive repos would too.
+> * Make the assistant check for unused files periodically. Exactly
+>   how often may need to be tuned, but once per day seems reasonable
+>   for most repos. Note that the assistant could also notice on the
+>   fly when files are removed and mark their keys as unused if that was
+>   the last associated file. (Only currently possible in direct mode.)
+> * It makes sense for the
+>   assistant to queue transfers of unused files to any remotes that
+>   do want them (eg, backup remotes). If the files can successfully be
+>   sent to a remote, that will lead to them being dropped locally as
+>   they're not wanted.
+> * Add a git config setting like annex.expireunused=7d. This causes
+>   *deletion* of unused files after the specified time period if they are
+>   not able to be moved to a repo that wants them.
+>   (The default should be annex.expireunused=false.)
+> * How to detect how long a file has been unused? We can't look at the
+>   time stamp of the object; we could use the mtime of the .map file,
+>   but that's direct mode only and may be replaced with a database
+>   later. Seems best to just keep an unused log file with timestamps.
+> * After the assistant scans for unused files, if annex.expireunused
+>   is not set, and there is some significant quantity of unused files
+>   (eg, more than 1000, or more than 1 GB, or more than the amount of
+>   remaining free disk space),
+>   it can pop up a webapp alert asking to configure it.
+>
+> This does not cover every use case that was requested.
+> But I don't see a cheap way to ensure it keeps eg the past 10 versions of
+> a file. I guess that if you care about that, you leave
+> annex.expireunused=false, and set up a backup repository where the unused
+> files will be moved to.
+>
+> Note that since the assistant uses direct mode by default, old versions
+> of modified files are not guaranteed to be retained. But they very well
+> might be. For example, if a file is replicated to 2 clients, and one
+> client directly edits it, or deletes it, it loses the old version,
+> but the other client will still be storing that old version.
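
The expiry part of the design (an unused log with timestamps, plus a setting like annex.expireunused=7d) can be sketched roughly as follows. This is only an illustration in Python, not git-annex code: the names `parse_expiry`, `UnusedLog` and `plan`, the accepted duration units, and the use of plain strings as keys are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple

# Assumed duration units for a setting like annex.expireunused=7d.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_expiry(value: str) -> Optional[int]:
    """Return the expiry period in seconds, or None when expiry is disabled."""
    if value == "false":
        return None
    number, unit = value[:-1], value[-1]
    return int(number) * _UNIT_SECONDS[unit]


@dataclass
class UnusedLog:
    """Hypothetical log: unused key -> time (epoch seconds) it was first seen unused."""
    first_seen: Dict[str, float] = field(default_factory=dict)

    def update(self, unused_now: Set[str], now: float) -> None:
        # Record newly unused keys; keep the original timestamp for known ones.
        for key in unused_now:
            self.first_seen.setdefault(key, now)
        # Forget keys that are no longer reported as unused.
        for key in list(self.first_seen):
            if key not in unused_now:
                del self.first_seen[key]

    def expired(self, expiry_seconds: Optional[int], now: float) -> Set[str]:
        # With expiry disabled (None), nothing is ever deleted.
        if expiry_seconds is None:
            return set()
        return {k for k, t in self.first_seen.items() if now - t >= expiry_seconds}


def plan(log: UnusedLog, expiry_seconds: Optional[int], now: float,
         wanted_elsewhere: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Split unused keys into (move, delete).

    Keys that some reachable remote wants are moved; only expired keys
    that cannot be moved anywhere are deleted.
    """
    move = {k for k in log.first_seen if k in wanted_elsewhere}
    delete = log.expired(expiry_seconds, now) - move
    return move, delete
```

Each periodic scan would call `update` with the current set of unused keys and then `plan` to decide what to transfer and what to delete. Keeping the first-seen timestamp in the log avoids relying on the object's time stamp or the .map file's mtime.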