author | Joey Hess <joey@kitenet.net> | 2014-05-29 15:23:05 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2014-05-29 15:23:05 -0400 |
commit | 1f6cfecc972b121fa42ea80383183bbaccc2195a (patch) | |
tree | 0a450c4226f5e05c2a3597a9f520376de281fffe /doc/todo/git-annex_unused_eats_memory.mdwn | |
parent | a95fb731cd117f35a6e0fce90d9eb35d0941e26e (diff) |
remove old closed bugs and todo items to speed up wiki updates and reduce size
Remove closed bugs and todos that were last edited before 2014.
Command line used:
for f in $(grep -l '\[\[done\]\]' *.mdwn); do if [ -z $(git log --since=2014 --pretty=oneline "$f") ]; then git rm $f; git rm -rf $(echo "$f" | sed 's/.mdwn$//'); fi; done
Diffstat (limited to 'doc/todo/git-annex_unused_eats_memory.mdwn')
-rw-r--r-- | doc/todo/git-annex_unused_eats_memory.mdwn | 32 |
1 files changed, 0 insertions, 32 deletions
diff --git a/doc/todo/git-annex_unused_eats_memory.mdwn b/doc/todo/git-annex_unused_eats_memory.mdwn
deleted file mode 100644
index 760a6ccf5..000000000
--- a/doc/todo/git-annex_unused_eats_memory.mdwn
+++ /dev/null
@@ -1,32 +0,0 @@
-`git-annex unused` has to compare large sets of data
-(all keys with content present in the repository,
-with all keys used by files in the repository), and so
-uses more memory than git-annex typically needs.
-
-It used to be a lot worse (hundreds of megabytes).
-
-Now it only needs enough memory to store a Set of all Keys that currently
-have content in the annex. On a lightly populated repository, it runs in
-quite low memory use (like 8 mb) even if the git repo has 100 thousand
-files. On a repository with lots of file contents, it will use more.
-
-Still, I would like to reduce this to a purely constant memory use,
-as running in constant memory no matter the repo size is a git-annex design
-goal.
-
-One idea is to use a bloom filter.
-For example, construct a bloom filter of all keys used by files in
-the repository. Then for each key with content present, check if it's
-in the bloom filter. Since there can be false positives, this might
-miss finding some unused keys. The probability/size of filter
-could be tunable.
-
-> Fixed in `bloom` branch in git. --[[Joey]]
->> [[done]]! --[[Joey]]
-
-Another way might be to scan the git log for files that got removed
-or changed what key they pointed to. Correlate with keys with content
-currently present in the repository (possibly using a bloom filter again),
-and that would yield a shortlist of keys that are probably not used.
-Then scan thru all files in the repo to make sure that none point to keys
-on the shortlist.
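
The bloom filter idea in the removed todo item can be sketched roughly as follows. This is only an illustration in Python, not the code from the `bloom` branch (git-annex itself is written in Haskell). The `BloomFilter` class, the `probably_unused` function, the default filter size and hash count, and the key strings are all hypothetical, chosen for the example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bloom filter over strings (illustrative, not git-annex's)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive every bit position from two 64-bit
        # slices of a single SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def probably_unused(present_keys, used_keys, num_bits=1 << 20, num_hashes=7):
    """Yield keys with content present that are definitely not used.

    Both arguments may be iterators, so memory use is bounded by the
    filter size rather than by the number of keys.
    """
    used = BloomFilter(num_bits, num_hashes)
    for key in used_keys:
        used.add(key)
    for key in present_keys:
        # A false positive makes an unused key look used, so it is
        # skipped; a key yielded here is never actually in use.
        if key not in used:
            yield key


if __name__ == "__main__":
    used = ["KEY-a", "KEY-b"]              # made-up key names
    present = ["KEY-a", "KEY-c", "KEY-d"]
    for key in probably_unused(present, used):
        print(key)
```

The trade-off matches the one the todo item describes: a bloom filter has no false negatives, so a key that is in use is never reported as unused, but a false positive can hide a key that really is unused. A larger `num_bits` lowers that probability at the cost of more, but still fixed, memory.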