author http://johan.kiviniemi.name/ <Johan@web> 2014-04-04 14:16:25 +0000
committer admin <admin@branchable.com> 2014-04-04 14:16:25 +0000
commit e4bef1cddc57affbf6a810c830728341c2a9f844
tree 895e0cfc6fd04e08f26b344b7c59e9e5abcfa5a3
parent 2e4c82aa1fa427ff9c72bf1d92a64e678dcac0c7
Added a comment: Rolling hash chunking
-rw-r--r-- doc/design/git-remote-daemon/comment_1_bfa8f33a3fdb6e271dfbdd0378b5d364._comment 16
1 files changed, 16 insertions, 0 deletions
diff --git a/doc/design/git-remote-daemon/comment_1_bfa8f33a3fdb6e271dfbdd0378b5d364._comment b/doc/design/git-remote-daemon/comment_1_bfa8f33a3fdb6e271dfbdd0378b5d364._comment
new file mode 100644
index 000000000..d93bab090
--- /dev/null
+++ b/doc/design/git-remote-daemon/comment_1_bfa8f33a3fdb6e271dfbdd0378b5d364._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="http://johan.kiviniemi.name/"
+ nickname="Johan"
+ subject="Rolling hash chunking"
+ date="2014-04-04T14:16:25Z"
+ content="""
+I am not sure which page is the best for this comment, but this one seems somewhat relevant.
+
+Given that a future telehash implementation may download files from multiple peers, it might be a good idea to download files in chunks, possibly in parallel. In that case, a rolling hash could be used for chunking (like rsync et al.). [There is a package for that on Hackage](http://hackage.haskell.org/package/hash-0.2.0.1/docs/Data-Hash-Rolling.html).
+
+git-annex could store a list of chunk checksums in `.git/annex/objects/…/SHA….chunks` whenever the repository holds a copy of the file. The checksum list would be a small fraction of the file in size, but all the checksum lists for all the files in a repository might take up too much space to store in the `git-annex` branch.
+
+When getting an object, git-annex could first download the `.chunks` file from a remote/peer and then proceed to download missing chunks in a BitTorrent-like fashion.
+
+If git-annex has an idea about what locally present object might be an earlier version of the file, it could compare the checksum lists and only download the parts that have changed (à la rsync).
+"""]]
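
The idea in the comment above can be sketched roughly as follows. This is a minimal illustration in Python, not git-annex code and not the API of the Hackage package linked in the comment. The names `chunk_boundaries`, `chunk_checksums` and `missing_chunks`, the window size, the boundary mask and the use of SHA-256 for per-chunk checksums are all assumptions made for the example.

```python
import hashlib

# Illustrative parameters (assumptions, not taken from git-annex or rsync).
WINDOW = 16            # number of trailing bytes the rolling hash covers
MASK = (1 << 6) - 1    # cut when the low 6 bits are zero: about 64-byte chunks
BASE = 257
MOD = (1 << 31) - 1


def chunk_boundaries(data: bytes):
    """Yield (start, end) pairs chosen by a polynomial rolling hash.

    Once WINDOW bytes have been read, the hash depends only on the last
    WINDOW bytes, so a cut point is a property of the local content rather
    than of the byte offset in the file.
    """
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    h = 0
    start = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            # Drop the byte that just slid out of the window.
            h = (h - data[i - WINDOW] * top) % MOD
        h = (h * BASE + byte) % MOD
        if (h & MASK) == 0:
            yield start, i + 1
            start = i + 1
    if start < len(data):
        yield start, len(data)  # trailing chunk without a hash-chosen cut


def chunk_checksums(data: bytes):
    """Return the list of per-chunk checksums (a hypothetical `.chunks` list)."""
    return [hashlib.sha256(data[s:e]).hexdigest()
            for s, e in chunk_boundaries(data)]


def missing_chunks(remote_checksums, local_data: bytes):
    """Return indexes into remote_checksums that local_data cannot supply."""
    have = set(chunk_checksums(local_data))
    return [i for i, c in enumerate(remote_checksums) if c not in have]
```

Given the checksum list of the wanted file and an older local version, `missing_chunks(chunk_checksums(new), old)` lists the chunks that would still have to be fetched from peers. Because cut points follow the content, an insertion near the start of a file should only invalidate the chunks around it. A real implementation would probably also enforce minimum and maximum chunk sizes, which this sketch leaves out.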