From 00cbfca37e28170dd28df1f921b1bfc93ced16ff Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Fri, 3 Apr 2015 18:19:56 -0400
Subject: oh wow, why didn't I think of this before?

---
 doc/todo/cheaper_global_fsck.mdwn | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
 create mode 100644 doc/todo/cheaper_global_fsck.mdwn

diff --git a/doc/todo/cheaper_global_fsck.mdwn b/doc/todo/cheaper_global_fsck.mdwn
new file mode 100644
index 000000000..3ca73e7da
--- /dev/null
+++ b/doc/todo/cheaper_global_fsck.mdwn
@@ -0,0 +1,36 @@
+Global fsck updates all location log entries for a repo. This wastes disk
+space.
+
+I realized now that it can be implemented w/o such waste. Probably cheaply
+enough to be the default!
+
+What we need is a new log file, call it fscktimes.log.
+This records the time of the last fsck of each repo.
+
+`git annex fsck --expire` no longer needs to look at the location log at
+all. It can just check the repo's fscktimes.log entry. If the entry is
+recent enough, we know that the repo has fscked recently, and its location
+log is good, and nothing needs to be done. Otherwise, we know that the repo
+has stopped fscking, and we simply expire *all* its location logs.
+
+Note that fscktime.log is only used by fsck; it does not impact git-annex
+generally or make it slower. And, it's very low overhead to update the one
+file. Repos could do a fsck --fast on a daily basis and not grow the
+git-annex branch much. Maybe on an hourly basis even.
+
+(BTW, there is some overlap with the fsck.log file that is currently used to
+hold the timestamp of the last local fsck. May be able to eliminate that
+file too.)
+
+----
+
+It might be worth making the fsck.log record --fast and full fscks
+separately so we know the last of each for each repo. This would let
+--expire require periodic full fscks and more frequent fast fscks.
+
+----
+
+It would even be possible to make regular fsck check for expiry of other
+repos. This would need the expiration values to be stored in the git-annex
+branch. It would not add much overhead, but I don't know if I see any
+reason to do that.
--
cgit v1.2.3
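
The expiry check proposed in the todo page can be sketched roughly as follows. This is only an illustration, not git-annex code: the page does not specify a format for fscktimes.log, so the `<uuid> <unix-timestamp>` line layout, the function names `parse_fsck_times` and `repos_to_expire`, and the `max_age` parameter are all assumptions made for the example.

```python
from typing import Dict, List


def parse_fsck_times(text: str) -> Dict[str, float]:
    """Parse a hypothetical fscktimes.log: one '<uuid> <unix-timestamp>' per line.

    The line format is an assumption; the todo page does not define it.
    """
    times: Dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        uuid, stamp = line.split()
        # Keep the newest timestamp if a repo appears more than once.
        times[uuid] = max(float(stamp), times.get(uuid, float("-inf")))
    return times


def repos_to_expire(times: Dict[str, float], now: float, max_age: float) -> List[str]:
    """Return repos whose last recorded fsck is older than max_age seconds.

    Per the proposal, *all* location logs of these repos would be expired;
    repos with a recent enough entry need no further work.
    """
    return sorted(uuid for uuid, last in times.items() if now - last > max_age)
```

For example, with entries `repo-a 1200` and `repo-b 5000`, a current time of 6000 and a `max_age` of 2000 seconds, only `repo-a` would be selected for expiry. How repos with no entry at all should be treated is left open here, since the page does not say.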