author Joey Hess <joeyh@joeyh.name> 2015-04-03 18:19:56 -0400
committer Joey Hess <joeyh@joeyh.name> 2015-04-03 18:19:56 -0400
commit 00cbfca37e28170dd28df1f921b1bfc93ced16ff
tree 5d51a8851bf980b1db3d9887ffc3d72c8d5325ef /doc/todo/cheaper_global_fsck.mdwn
parent b13309f1d5dcf5f8427d5752dc25063e396c2b84
oh wow, why didn't I think of this before?
Diffstat (limited to 'doc/todo/cheaper_global_fsck.mdwn')
-rw-r--r-- doc/todo/cheaper_global_fsck.mdwn 36
1 files changed, 36 insertions, 0 deletions
diff --git a/doc/todo/cheaper_global_fsck.mdwn b/doc/todo/cheaper_global_fsck.mdwn
new file mode 100644
index 000000000..3ca73e7da
--- /dev/null
+++ b/doc/todo/cheaper_global_fsck.mdwn
@@ -0,0 +1,36 @@
+Global fsck updates all location log entries for a repo. This wastes disk
+space.
+
+I just realized that it can be implemented without such waste, probably
+cheaply enough to be the default!
+
+What we need is a new log file, call it fscktimes.log.
+This records the time of the last fsck of each repo.
+
+`git annex fsck --expire` no longer needs to look at the location log at
+all. It can just check the repo's fscktimes.log entry. If the entry is
+recent enough, we know that the repo has fscked recently, and its location
+log is good, and nothing needs to be done. Otherwise, we know that the repo
+has stopped fscking, and we simply expire *all* its location logs.
+
+Note that fscktimes.log is only used by fsck; it does not impact git-annex
+generally or make it slower. And, it's very low overhead to update the one
+file. Repos could do a fsck --fast on a daily basis and not grow the
+git-annex branch much. Maybe on an hourly basis even.
+
+(BTW, there is some overlap with the fsck.log file that is currently used to
+hold the timestamp of the last local fsck. It may be possible to eliminate
+that file too.)
+
+----
+
+It might be worth making the fsck.log record --fast and full fscks
+separately so we know the last of each for each repo. This would let
+--expire require periodic full fscks and more frequent fast fscks.
+
+----
+
+It would even be possible to make regular fsck check for expiry of other
+repos. This would need the expiration values to be stored in the git-annex
+branch. It would not add much overhead, but I don't know if I see any
+reason to do that.
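
The `--expire` decision described in the todo item above can be sketched roughly as follows. This is only an illustration, not git-annex code: the in-memory mapping standing in for fscktimes.log, the function name `repos_to_expire`, and the `max_age` parameter are all hypothetical, and the todo item does not say how a repo with no entry should be handled (the sketch assumes it counts as stale).

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

# Hypothetical in-memory form of fscktimes.log: repo UUID -> time of last fsck.
# The todo item does not specify an on-disk format, so none is assumed here.
FsckTimes = Dict[str, datetime]


def repos_to_expire(fsck_times: FsckTimes, known_repos: List[str],
                    max_age: timedelta, now: datetime) -> List[str]:
    """Return the repos whose location logs should all be expired.

    Only the per-repo fsck time is consulted; individual location log
    entries are never examined.
    """
    expired = []
    for repo in known_repos:
        last = fsck_times.get(repo)
        # Assumption: a repo with no recorded fsck is treated as stale.
        if last is None or now - last > max_age:
            expired.append(repo)
    return expired


if __name__ == "__main__":
    now = datetime(2015, 4, 3, tzinfo=timezone.utc)
    times = {
        "repo-a": now - timedelta(days=1),   # fscked recently, logs trusted
        "repo-b": now - timedelta(days=40),  # stopped fscking
    }
    print(repos_to_expire(times, ["repo-a", "repo-b"], timedelta(days=30), now))
```

The point of the sketch is that the cost of the check scales with the number of repos rather than the number of keys, which is what would make a frequent `fsck --fast` cheap on the git-annex branch.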