author: Joey Hess <joey@kitenet.net> 2014-05-29 15:23:05 -0400
committer: Joey Hess <joey@kitenet.net> 2014-05-29 15:23:05 -0400
commit: 1f6cfecc972b121fa42ea80383183bbaccc2195a
tree: 0a450c4226f5e05c2a3597a9f520376de281fffe /doc/todo/git-annex_unused_eats_memory.mdwn
parent: a95fb731cd117f35a6e0fce90d9eb35d0941e26e
remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were least edited before 2014. Command line used: for f in $(grep -l '\[\[done\]\]' *.mdwn); do if [ -z $(git log --since=2014 --pretty=oneline "$f") ]; then git rm $f; git rm -rf $(echo "$f" | sed 's/.mdwn$//'); fi; done
Diffstat (limited to 'doc/todo/git-annex_unused_eats_memory.mdwn')
-rw-r--r-- doc/todo/git-annex_unused_eats_memory.mdwn | 32
1 files changed, 0 insertions, 32 deletions
diff --git a/doc/todo/git-annex_unused_eats_memory.mdwn b/doc/todo/git-annex_unused_eats_memory.mdwn
deleted file mode 100644
index 760a6ccf5..000000000
--- a/doc/todo/git-annex_unused_eats_memory.mdwn
+++ /dev/null
@@ -1,32 +0,0 @@
-`git-annex unused` has to compare large sets of data
-(all keys with content present in the repository,
-with all keys used by files in the repository), and so
-uses more memory than git-annex typically needs.
-
-It used to be a lot worse (hundreds of megabytes).
-
-Now it only needs enough memory to store a Set of all Keys that currently
-have content in the annex. On a lightly populated repository, it runs in
-quite low memory use (like 8 mb) even if the git repo has 100 thousand
-files. On a repository with lots of file contents, it will use more.
-
-Still, I would like to reduce this to a purely constant memory use,
-as running in constant memory no matter the repo size is a git-annex design
-goal.
-
-One idea is to use a bloom filter.
-For example, construct a bloom filter of all keys used by files in
-the repository. Then for each key with content present, check if it's
-in the bloom filter. Since there can be false positives, this might
-miss finding some unused keys. The probability/size of filter
-could be tunable.
-
-> Fixed in `bloom` branch in git. --[[Joey]]
->> [[done]]! --[[Joey]]
-
-Another way might be to scan the git log for files that got removed
-or changed what key they pointed to. Correlate with keys with content
-currently present in the repository (possibly using a bloom filter again),
-and that would yield a shortlist of keys that are probably not used.
-Then scan thru all files in the repo to make sure that none point to keys
-on the shortlist.
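
The bloom filter idea in the deleted page can be illustrated with a minimal sketch. This is not git-annex's implementation (git-annex is written in Haskell, and the page only says the fix landed in the `bloom` branch); it is a Python illustration under stated assumptions. The names `BloomFilter` and `probably_unused`, the SHA-256 double-hashing scheme, and the example keys are all hypothetical. The filter is built from the keys used by files, then each key with content present is checked against it. A false positive makes an unused key look used, so some unused keys may be missed, but a used key is never reported as unused.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative bloom filter over string keys (not git-annex's code)."""

    def __init__(self, expected_items, false_positive_rate=0.001):
        n = max(1, expected_items)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(
            8, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2)
        )
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest by double hashing
        # (an assumed scheme, chosen only to keep the sketch short).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


def probably_unused(used_keys, present_keys, expected_used, false_positive_rate=0.001):
    """Yield present keys that are definitely not in the used set.

    Both inputs may be streamed iterables; only the filter is held in memory.
    """
    bloom = BloomFilter(expected_used, false_positive_rate)
    for key in used_keys:
        bloom.add(key)
    for key in present_keys:
        if key not in bloom:
            yield key


# Hypothetical keys, loosely shaped like git-annex key names.
used = ["SHA256-s10--aaa", "SHA256-s20--bbb", "SHA256-s30--ccc"]
present = used + ["SHA256-s40--ddd", "SHA256-s50--eee"]
unused = list(probably_unused(iter(used), iter(present), expected_used=len(used)))
```

The memory cost here is set by `expected_items` and `false_positive_rate` rather than by the length of the keys themselves, which matches the page's remark that the probability and size of the filter could be tunable.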