author | 2014-05-29 15:23:05 -0400 | |
---|---|---|
committer | 2014-05-29 15:23:05 -0400 | |
commit | 1f6cfecc972b121fa42ea80383183bbaccc2195a (patch) | |
tree | 0a450c4226f5e05c2a3597a9f520376de281fffe /doc/todo/Limit_file_revision_history.mdwn | |
parent | a95fb731cd117f35a6e0fce90d9eb35d0941e26e (diff) |
remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were last edited before 2014.
Command line used:
for f in $(grep -l '\[\[done\]\]' *.mdwn); do if [ -z $(git log --since=2014 --pretty=oneline "$f") ]; then git rm $f; git rm -rf $(echo "$f" | sed 's/.mdwn$//'); fi; done
Diffstat (limited to 'doc/todo/Limit_file_revision_history.mdwn')
-rw-r--r-- | doc/todo/Limit_file_revision_history.mdwn | 117 |
1 files changed, 0 insertions, 117 deletions
diff --git a/doc/todo/Limit_file_revision_history.mdwn b/doc/todo/Limit_file_revision_history.mdwn
deleted file mode 100644
index 48b44dea2..000000000
--- a/doc/todo/Limit_file_revision_history.mdwn
+++ /dev/null
@@ -1,117 +0,0 @@
-Hi, I am assuming to use git-annex-assistant for two usecases, but I would like to ask about the options or planed roadmap for dropped/removed files from the repository.
-
-Usecases:
-
-1. sync working directory between laptop, home computer, work komputer
-2. archive functionality for my photograps
-
-Both usecases have one common factor. Some files might become obsolate and
-in long time frame nobody is interested to keep their revisions. Let's
-assume photographs. Usuall workflow I take is to import all photograps to
-filesystem, then assess (select) the good ones I want to keep and then
-process them what ever way.
-
-Problem with git-annex(-assistant) I have is that it start to revision all
-of the files at the time they are added to directory. This is welcome at
-first but might be an issue if you are used to put 80% of the size of your
-imported files to trash.
-
-I am aware of what git-annex is not. I have been reading documentation for
-"git-annex drop" and "unused" options including forums. I do understand
-that I am actually able to delete all revisions of the file if I will drop
-it, remove it and if I will run git annex unused 1..###. (on all synced
-repositories).
-
-I actually miss the option to have above process automated/replicated to the other synced repositories.
-
-I would formulate the 'use case' requirements for git-annex as:
-
-* command to drop an file including revisions from all annex repositories?
- (for example like moving a file to /trash folder) that will schedulle
- it's deletition)
-* option to keep like max. 10 last revisions of the file?
-* option to keep only previous revisions if younger than 6 months from now?
-
-Finally, how to specify a feature request for git-annex?
-
-> By moving it here ;-) --[[Joey]]
-
-> So, let's spec out a design.
->
-> * Add preferred content terminal to configure whether a repository wants
-> to hang on to unused content. Simply `unused`.
-> (It cannot include a timestamp, because there's
-> no way repos can agree on about when a key became unused.) **done**
-> * In order to quickly match that terminal, the Annex monad will need
-> to keep a Set of unused Keys. This should only be loaded on demand.
-> **done**
-> NB: There is some potential for a great many unused Keys to cause
-> memory usage to balloon.
-> * Client repositories will end their preferred content with
-> `and (not unused)`. Transfer repositories too, because typically
-> only client repos connect to them, and so otherwise unused files
-> would build up there. Backup repos would want unused files. I
-> think that archive repos would too. **done**
-> * Make the assistant check for unused files periodically. Exactly
-> how often may need to be tuned, but once per day seems reasonable
-> for most repos. Note that the assistant could also notice on the
-> fly when files are removed and mark their keys as unused if that was
-> the last associated file. (Only currently possible in direct mode.)
-> **done**
-> * After scanning for unused files, it makes sense for the
-> assistant to queue transfers of unused files to any remotes that
-> do want them (eg, backup remotes). If the files can successfully be
-> sent to a remote, that will lead to them being dropped locally as
-> they're not wanted.
-> * Add a git config setting like annex.expireunused=7d. This causes
-> *deletion* of unused files after the specified time period if they are
-> not able to be moved to a repo that wants them.
-> (The default should be annex.expireunused=false.)
-> * How to detect how long a file has been unused? We can't look at the
-> time stamp of the object; we could use the mtime of the .map file,
-> that that's direct mode only and may be replaced with a database
-> later. Seems best to just keep a unused log file with timestamps.
-> **done**
-> * After the assistant scans for unused files, if annex.expireunused
-> is not set, and there is some significant quantity of unused files
-> (eg, more than 1000, or more than 1 gb, or more than the amount of
-> remaining free disk space),
-> it can pop up a webapp alert asking to configure it. **done**
-> * Webapp interface to configure annex.expireunused. Reasonable values
-> are no expiring, or any number of days. **done**
->
-> [[done]] This does not cover every use case that was requested.
-> But I don't see a cheap way to ensure it keeps eg the past 10 versions of
-> a file. I guess that if you care about that, you leave
-> annex.expireunused=false, and set up a backup repository where the unused
-> files will be moved to.
->
-> Note that since the assistant uses direct mode by default, old versions
-> of modififed files are not guaranteed to be retained. But they very well
-> might be. For example, if a file is replicated to 2 clients, and one
-> client directly edits it, or deletes it, it loses the old version,
-> but the other client will still be storing that old version.
->
-> ## Stability analysis for unused in preferred content expressions
->
-> This is tricky, because two repos that are otherwise entirely
-> in sync may have differing opinons about whether a key is unused,
-> depending on when each last scanned for unused keys.
->
-> So, this preferred content terminal is *not stable*.
-> It may be possible to write preferred content expressions
-> that constantly moved such keys around without reaching a steady state.
->
-> Example:
->
-> A and B are clients directly connected, and both also connected
-> to BACKUP.
->
-> A deletes F. B syncs with A, and runs unused check; decides F
-> is unused. B sends F to BACKUP. B will then think A doesn't want F,
-> and will drop F from A. Next time A runs a full transfer scan, it will
-> *not* find F (because the file was deleted!). So it won't get F back from
-> BACKUP.
->
-> So, it looks like the fact that unused files are not going to be
-> looked for on the full transfer scan seems to make this work out ok.
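
Illustrative note (not part of the commit): the expiry rule described in the deleted todo item, an unused log with timestamps combined with a setting such as annex.expireunused=7d, can be sketched roughly as below. This is a minimal sketch under stated assumptions, not git-annex's actual implementation. The function names `parse_expiry` and `keys_to_expire`, the dictionary layout of the log, and the restriction to the two value forms seen in the text (`false` and `<N>d`) are all hypothetical.

```python
DAY = 24 * 60 * 60  # seconds in one day


def parse_expiry(value):
    """Return the expiry period in seconds, or None when expiry is disabled.

    Assumption: only the two forms that appear in the todo item are handled,
    'false' (never expire) and '<N>d' (N days). Anything else is rejected.
    """
    if value == "false":
        return None
    if value.endswith("d") and value[:-1].isdigit():
        return int(value[:-1]) * DAY
    raise ValueError(f"unsupported expiry value: {value!r}")


def keys_to_expire(unused_log, expiry, now):
    """List keys that have been unused for at least `expiry` seconds.

    unused_log is a hypothetical mapping of key -> timestamp (seconds) at
    which the key was first recorded as unused. With expiry=None nothing is
    ever selected, matching the proposed default of 'false'.
    """
    if expiry is None:
        return []
    return sorted(key for key, since in unused_log.items()
                  if now - since >= expiry)


if __name__ == "__main__":
    log = {"KEY-a": 0, "KEY-b": 6 * DAY}
    # At day 7, KEY-a has been unused for 7 days and KEY-b for only 1 day.
    print(keys_to_expire(log, parse_expiry("7d"), now=7 * DAY))
```

In the design above, deletion would apply only to keys that could not first be moved to a repository that wants them; the sketch leaves that transfer step out and covers only the age check.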