author http://joeyh.name/ <http://joeyh.name/@web> 2014-07-30 15:03:46 +0000
committer admin <admin@branchable.com> 2014-07-30 15:03:46 +0000
commit 64c9d697568cf84b275cc72af89917d64ee4a7b3
tree c71572256c7af22e3a260a6cd19abb2c2ef33ef5 /doc/todo
parent 054b15bb5bedc0abaf84adf4d0638ecf067854be
Added a comment: interesting idea
Diffstat (limited to 'doc/todo')
-rw-r--r-- doc/todo/Speed_up___39__import_--clean-duplicates__39__/comment_1_9268c639d3d21cce4ca7b60d08e9cb65._comment | 10
1 files changed, 10 insertions, 0 deletions
diff --git a/doc/todo/Speed_up___39__import_--clean-duplicates__39__/comment_1_9268c639d3d21cce4ca7b60d08e9cb65._comment b/doc/todo/Speed_up___39__import_--clean-duplicates__39__/comment_1_9268c639d3d21cce4ca7b60d08e9cb65._comment
new file mode 100644
index 000000000..8584d5ae8
--- /dev/null
+++ b/doc/todo/Speed_up___39__import_--clean-duplicates__39__/comment_1_9268c639d3d21cce4ca7b60d08e9cb65._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="24.159.78.125"
+ subject="interesting idea"
+ date="2014-07-30T15:03:46Z"
+ content="""
+This could be done in constant space using a bloom filter of known file sizes. Files whose sizes are not actually known would sometimes match (false positives), but that's no problem: it would then just do the work it does now.
+
+However, to build such a filter, git-annex would need to do a scan of all keys it knows about. This would take approximately as long to run as `git annex unused` does. It might make sense to only build the filter if it runs into a fairly large file. Alternatively, a bloom filter of file sizes could be cached and updated on the fly as things change (but this gets pretty complex).
+"""]]
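
The idea in the comment above can be sketched roughly as follows. This is only an illustration in Python, not git-annex code: `SizeBloomFilter`, `needs_full_check`, the filter parameters, and the sample sizes are all hypothetical, and it assumes the sizes of known keys are already available as a list of integers (obtaining them is the scan the comment describes). A size that is not in the filter is definitely unknown, so the expensive duplicate check can be skipped; a size that is in the filter may be a false positive, in which case the existing check runs as it does now.

```python
import hashlib


class SizeBloomFilter:
    """Fixed-size Bloom filter over file sizes (hypothetical helper)."""

    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # Memory is fixed at construction and does not grow with the number of keys.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, size):
        data = str(size).encode("ascii")
        for i in range(self.num_hashes):
            # A different salt per round yields a different hash of the same size.
            digest = hashlib.blake2b(data, digest_size=8, salt=bytes([i])).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, size):
        for pos in self._positions(size):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, size):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(size)
        )


def needs_full_check(size_filter, file_size):
    """True if the file might duplicate a known key and needs the existing check."""
    return file_size in size_filter


# Assumed input: sizes of all keys the repository knows about.
known_sizes = [10, 2048, 1_000_000]
size_filter = SizeBloomFilter()
for s in known_sizes:
    size_filter.add(s)
```

With these assumed defaults the filter occupies 128 KiB regardless of how many sizes are added; adding more sizes only raises the false-positive rate, which costs extra full checks but never misses a real duplicate.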