author Joey Hess <joey@kitenet.net> 2014-01-22 14:35:38 -0400
committer Joey Hess <joey@kitenet.net> 2014-01-22 14:35:38 -0400
commit b09828d1b7bbd7f18a4186b47fbd70b66e682af4
tree a61607eeb63e0721442c793709172bd537deefb2 /doc
parent d3f76d9d0f0811f71be39d09a9c1703eb5d4cb4c
promote forum post to feature request; add design
Diffstat (limited to 'doc')
-rw-r--r-- doc/forum/Limit_file_revision_history.mdwn | 22
-rw-r--r-- doc/todo/Limit_file_revision_history.mdwn | 89
2 files changed, 89 insertions, 22 deletions
diff --git a/doc/forum/Limit_file_revision_history.mdwn b/doc/forum/Limit_file_revision_history.mdwn
deleted file mode 100644
index 0e68ebb6d..000000000
--- a/doc/forum/Limit_file_revision_history.mdwn
+++ /dev/null
@@ -1,22 +0,0 @@
-Hi, I am planning to use git-annex-assistant for two use cases, and I would like to ask about the options or planned roadmap for dropped/removed files from the repository.
-
-Use cases:
-
-1. sync working directory between laptop, home computer, work computer
-2. archive functionality for my photographs
-
-Both use cases have one factor in common. Some files might become obsolete, and in the long run nobody is interested in keeping their revisions. Take photographs as an example. My usual workflow is to import all photographs to the filesystem, then assess (select) the good ones I want to keep, and then process them in whatever way.
-
-The problem I have with git-annex(-assistant) is that it starts to version all of the files at the time they are added to the directory. This is welcome at first, but might be an issue if you are used to putting 80% of the size of your imported files in the trash.
-
-I am aware of what git-annex is not. I have been reading the documentation for "git-annex drop" and the "unused" options, including the forums. I understand that I can delete all revisions of a file if I drop it, remove it, and then run git annex unused 1..### (on all synced repositories).
-
-What I actually miss is an option to have the above process automated/replicated to the other synced repositories.
-
-I would formulate the 'use case' requirements for git-annex as:
-
-* command to drop a file, including its revisions, from all annex repositories? (for example, moving a file to a /trash folder that will schedule its deletion)
-* option to keep at most the last 10 revisions of a file?
-* option to keep previous revisions only if they are younger than 6 months?
-
-Finally, how do I submit a feature request for git-annex?
diff --git a/doc/todo/Limit_file_revision_history.mdwn b/doc/todo/Limit_file_revision_history.mdwn
new file mode 100644
index 000000000..593e93013
--- /dev/null
+++ b/doc/todo/Limit_file_revision_history.mdwn
@@ -0,0 +1,89 @@
+Hi, I am planning to use git-annex-assistant for two use cases, and I would like to ask about the options or planned roadmap for dropped/removed files from the repository.
+
+Use cases:
+
+1. sync working directory between laptop, home computer, work computer
+2. archive functionality for my photographs
+
+Both use cases have one factor in common. Some files might become obsolete,
+and in the long run nobody is interested in keeping their revisions. Take
+photographs as an example. My usual workflow is to import all photographs to
+the filesystem, then assess (select) the good ones I want to keep, and then
+process them in whatever way.
+
+The problem I have with git-annex(-assistant) is that it starts to version all
+of the files at the time they are added to the directory. This is welcome at
+first, but might be an issue if you are used to putting 80% of the size of your
+imported files in the trash.
+
+I am aware of what git-annex is not. I have been reading the documentation for
+"git-annex drop" and the "unused" options, including the forums. I understand
+that I can delete all revisions of a file if I drop it, remove it, and
+then run git annex unused 1..### (on all synced
+repositories).
+
+What I actually miss is an option to have the above process automated/replicated to the other synced repositories.
+
+I would formulate the 'use case' requirements for git-annex as:
+
+* command to drop a file, including its revisions, from all annex repositories?
+ (for example, moving a file to a /trash folder that will schedule
+ its deletion)
+* option to keep at most the last 10 revisions of a file?
+* option to keep previous revisions only if they are younger than 6 months?
+
+Finally, how do I submit a feature request for git-annex?
+
+> By moving it here ;-) --[[Joey]]
+
+> So, let's spec out a design.
+>
+> * Add preferred content terminal to configure whether a repository wants
+> to hang on to unused content.
+> Something like "unused=true" I suppose, because not having a parameter
+> would complicate preferred content parsing, and I cannot think
+> of a useful parameter.
+> * In order to quickly match that terminal, the Annex monad will need
+> to keep a Set of unused Keys. This should only be loaded on demand.
+> NB: There is some potential for a great many unused Keys to cause
+> memory usage to balloon.
+> * Client repositories will end their preferred content with
+> `and unused=false`. Transfer repositories too, because typically
+> only client repos connect to them, and so otherwise unused files
+> would build up there. Backup repos would want unused files. I
+> think that archive repos would too.
+> * Make the assistant check for unused files periodically. Exactly
+> how often may need to be tuned, but once per day seems reasonable
+> for most repos. Note that the assistant could also notice on the
+> fly when files are removed and mark their keys as unused if that was
+> the last associated file. (Only currently possible in direct mode.)
+> * It makes sense for the
+> assistant to queue transfers of unused files to any remotes that
+> do want them (eg, backup remotes). If the files can successfully be
+> sent to a remote, that will lead to them being dropped locally as
+> they're not wanted.
+> * Add a git config setting like annex.expireunused=7d. This causes
+> *deletion* of unused files after the specified time period if they are
+> not able to be moved to a repo that wants them.
+> (The default should be annex.expireunused=false.)
+> * How to detect how long a file has been unused? We can't look at the
+> time stamp of the object; we could use the mtime of the .map file,
+> but that's direct mode only and may be replaced with a database
+> later. Seems best to just keep an unused log file with timestamps.
+> * After the assistant scans for unused files, if annex.expireunused
+> is not set, and there is some significant quantity of unused files
+> (eg, more than 1000, or more than 1 gb, or more than the amount of
+> remaining free disk space),
+> it can pop up a webapp alert asking to configure it.
+>
+> This does not cover every use case that was requested.
+> But I don't see a cheap way to ensure it keeps eg the past 10 versions of
+> a file. I guess that if you care about that, you leave
+> annex.expireunused=false, and set up a backup repository where the unused
+> files will be moved to.
+>
+> Note that since the assistant uses direct mode by default, old versions
+> of modified files are not guaranteed to be retained. But they very well
+> might be. For example, if a file is replicated to 2 clients, and one
+> client directly edits it, or deletes it, it loses the old version,
+> but the other client will still be storing that old version.
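
The expiry and alert rules proposed in the design above could be sketched roughly as follows. This is an illustration in Python only, not git-annex code: the function names, the dict standing in for the unused log, the simplified duration parser, the `movable` set, and the reading of "1 gb" as 10^9 bytes are all assumptions made for the example.

```python
import re
from typing import Dict, List, Optional, Set

# Seconds per unit for the simplified duration syntax assumed here ("7d", "12h", ...).
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 7 * 86400}


def parse_expire_setting(value: str) -> Optional[int]:
    """Return the expiry period in seconds, or None when expiry is disabled."""
    value = value.strip().lower()
    if value == "false":
        return None
    match = re.fullmatch(r"(\d+)([smhdw])", value)
    if match is None:
        raise ValueError(f"unrecognised duration: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def keys_to_expire(
    unused_log: Dict[str, float],
    period: Optional[int],
    now: float,
    movable: Optional[Set[str]] = None,
) -> List[str]:
    """Pick keys that have been unused for at least `period` seconds.

    `unused_log` maps a key to the timestamp at which it was first seen
    unused (hypothetical stand-in for the proposed unused log file).
    Keys in `movable` can still be sent to a repo that wants them, so
    they are not deleted.
    """
    if period is None:
        return []
    movable = movable or set()
    return sorted(
        key
        for key, since in unused_log.items()
        if now - since >= period and key not in movable
    )


def should_alert(count: int, total_bytes: int, free_bytes: int,
                 period: Optional[int]) -> bool:
    """Decide whether to ask the user to configure expiry.

    Only alerts when expiry is unset and the unused content is
    "significant": more than 1000 files, more than 1 GB (assumed to be
    10**9 bytes), or more than the remaining free disk space.
    """
    significant = (
        count > 1000 or total_bytes > 10**9 or total_bytes > free_bytes
    )
    return period is None and significant
```

With a setting of `7d`, a key first logged as unused eight days ago would be selected for deletion unless it appears in `movable`, and with the setting left at `false` nothing is ever selected, which matches the proposed default.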