path: root/Annex/BloomFilter.hs
Commit message / Author / Age
* Another redundant constraint (Gabor Greif, 2016-01-28)
|
* remove 163 lines of code without changing anything except imports (Joey Hess, 2016-01-20)
|
* use bloom filter in second pass of sync --all --content (Joey Hess, 2015-06-16)
  This is needed because, when preferred content matches on files, the second pass would otherwise want to drop all keys. Using a bloom filter avoids this. In the case of a false positive, a key is left undropped even though preferred content would allow dropping it. The chance of that happening is a mere 1 in 1 million.
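  The idea in this commit message can be illustrated with a toy Bloom filter. This is only a minimal sketch, not the code in Annex/BloomFilter.hs: it assumes the filter records the keys handled in the first pass, and every name in it (`Bloom`, `hashWith`, `insertKey`, `mayContain`, `dropCandidates`) is hypothetical. It shows why a false positive can only leave a key undropped and can never cause an extra drop.

```haskell
import Data.Bits (setBit, testBit)
import Data.Char (ord)
import Data.List (foldl')

-- Toy Bloom filter: a bit array stored in an Integer, plus its size in bits.
-- (Illustration only; a real filter would use a packed mutable array.)
data Bloom = Bloom { bloomBits :: Integer, bloomSize :: Int }

emptyBloom :: Int -> Bloom
emptyBloom = Bloom 0

-- A simple seeded string hash (hypothetical, chosen only for brevity).
hashWith :: Int -> String -> Int
hashWith seed = foldl' (\h c -> (h * seed + ord c) `mod` 1000003) 7

-- Bit positions for a key, one per hash function.
positions :: Bloom -> String -> [Int]
positions b k = [ hashWith s k `mod` bloomSize b | s <- [31, 131, 1313] ]

-- Record a key by setting all of its bit positions.
insertKey :: String -> Bloom -> Bloom
insertKey k b = b { bloomBits = foldl' setBit (bloomBits b) (positions b k) }

-- True for every inserted key; may also be True for a key that was
-- never inserted (a false positive), but is never False for an inserted one.
mayContain :: Bloom -> String -> Bool
mayContain b k = all (testBit (bloomBits b)) (positions b k)

-- Second pass: only keys that the filter has definitely not seen are
-- considered for dropping. A false positive just keeps a key around.
dropCandidates :: Bloom -> [String] -> [String]
dropCandidates seen = filter (not . mayContain seen)
```

  For example, after `let seen = foldr insertKey (emptyBloom 4096) firstPassKeys`, the expression `dropCandidates seen allKeys` never returns a key from `firstPassKeys`, whatever the false positive rate is.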
* instance Hashable Key for bloomfilter (Joey Hess, 2015-06-16)
|
* Increased the default annex.bloomaccuracy from 1000 to 10000000 (Joey Hess, 2015-06-16)
  This makes git annex unused use around 48 MB more memory than it did before, but the massive increase in accuracy makes this worthwhile for all but the smallest systems.

  Also, I want to use the bloom filter for sync --all --content, to avoid dropping files that the preferred content doesn't want, and 1/1000 false positives would be far too many in that use case, even if it were acceptable for unused.

  Actual memory use numbers:

    1000:     21.06user 3.42system 0:26.40elapsed 92%CPU (0avgtext+0avgdata 501552maxresident)k
    1000000:  21.41user 3.55system 0:26.84elapsed 93%CPU (0avgtext+0avgdata 549496maxresident)k
    10000000: 21.84user 3.52system 0:27.89elapsed 90%CPU (0avgtext+0avgdata 549920maxresident)k

  Based on these numbers, 10 million seemed a better pick than 1 million.
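
  The arithmetic behind "around 48 MB" can be checked from the maxresident figures above. The sketch below also includes the textbook Bloom filter sizing formula, m = -n ln p / (ln 2)^2, for comparison. The names `residentKB`, `extraKB`, and `bloomBitsNeeded` are hypothetical, and the formula is the standard estimate, not necessarily the sizing that git-annex or its bloom filter library applies, so it is not expected to reproduce the measured numbers.

```haskell
-- Measured maxresident values in kB, as quoted in the commit message,
-- keyed by the annex.bloomaccuracy setting used for the run.
residentKB :: [(Int, Int)]
residentKB = [(1000, 501552), (1000000, 549496), (10000000, 549920)]

-- Extra resident memory in kB relative to the accuracy=1000 run.
extraKB :: [(Int, Int)]
extraKB = [ (acc, kb - base) | (acc, kb) <- residentKB ]
  where
    base = 501552

-- Textbook estimate of the bits needed to hold n items at
-- false-positive rate p (an assumption; not taken from git-annex).
bloomBitsNeeded :: Double -> Double -> Double
bloomBitsNeeded n p = negate (n * log p) / (log 2 ^ (2 :: Int))
```

  Here `extraKB` evaluates to `[(1000,0),(1000000,47944),(10000000,48368)]`: going from 1 million to 10 million costs only about 0.4 MB more than the step from 1000 to 1 million, which is consistent with the conclusion that 10 million is the better pick.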