author Joey Hess <joey@kitenet.net> 2011-11-08 01:27:06 -0400
committer Joey Hess <joey@kitenet.net> 2011-11-08 01:27:06 -0400
commit 05b7608113a6b9abf92064884361f3e035ef3255
tree 91fad34d9b794a457aceb18e19e5a1df0aef684b
parent b11a63a860e8446cf3a4b35a5d8ef76329d5135c
update
-rw-r--r-- doc/todo/git-annex_unused_eats_memory.mdwn 6
1 files changed, 4 insertions, 2 deletions
diff --git a/doc/todo/git-annex_unused_eats_memory.mdwn b/doc/todo/git-annex_unused_eats_memory.mdwn
index fcb09a1af..3e9942e98 100644
--- a/doc/todo/git-annex_unused_eats_memory.mdwn
+++ b/doc/todo/git-annex_unused_eats_memory.mdwn
@@ -2,12 +2,14 @@
(all keys with content present in the repository,
with all keys used by files in the repository), and so
uses more memory than git-annex typically needs; around
-60-80 mb when run in a repository with 80 thousand files.
+50 mb when run in a repository with 80 thousand files.
+
+(Used to be 80 mb, but implementation improved.)
I would like to reduce this. One idea is to use a bloom filter.
For example, construct a bloom filter of all keys used by files in
the repository. Then for each key with content present, check if it's
-in the bloom filter. Since there can be false negatives, this might
+in the bloom filter. Since there can be false positives, this might
miss finding some unused keys. The probability/size of filter
could be tunable.
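
The bloom filter idea in the todo text can be sketched as follows. This is only an illustration in Python, not git-annex's implementation: the names `BloomFilter` and `find_unused`, the use of SHA-256 with double hashing, and the default `error_rate` are all assumptions made for the example. It shows the trade-off the corrected wording describes: a key that is in use is never reported as unused, but a false positive can hide a genuinely unused key.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter sketch (hypothetical, for illustration only)."""

    def __init__(self, capacity, error_rate=0.001):
        capacity = max(1, capacity)
        # Standard sizing formulas: m = -n ln(p) / (ln 2)^2, k = (m / n) ln 2.
        self.num_bits = max(8, math.ceil(-capacity * math.log(error_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive all bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so the step is never zero
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def find_unused(used_keys, present_keys, expected_used, error_rate=0.001):
    """Return present keys that are definitely not used by any file.

    Both inputs may be iterators, so neither key set has to be held in memory.
    False positives in the filter mean some unused keys can be missed.
    """
    bloom = BloomFilter(expected_used, error_rate)
    for key in used_keys:
        bloom.add(key)
    return [key for key in present_keys if key not in bloom]
```

Lowering `error_rate` makes the filter larger and reduces the number of unused keys that are missed, which is the tunable probability/size trade-off mentioned above.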