author    Joey Hess <joeyh@joeyh.name> 2017-05-09 18:16:49 -0400
committer Joey Hess <joeyh@joeyh.name> 2017-05-09 18:16:49 -0400
commit    8bed5c07ecc1331b0fc85a1fbd0766fe244136c3
tree      bb326d97bbc2a8634228e7e5d1697feab11946f9
parent    98038d7371951fc31f2e9bf737cc7e2b35f90e49

update

-rw-r--r-- doc/forum/Lots_of_4k_symlinks/comment_2_be9617e8cbc231069c44bc9f077ce673._comment 32
1 files changed, 32 insertions, 0 deletions
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_2_be9617e8cbc231069c44bc9f077ce673._comment b/doc/forum/Lots_of_4k_symlinks/comment_2_be9617e8cbc231069c44bc9f077ce673._comment
new file mode 100644
index 000000000..1b2d91eed
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_2_be9617e8cbc231069c44bc9f077ce673._comment
@@ -0,0 +1,32 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 2"""
+ date="2017-05-09T21:07:11Z"
+ content="""
+You also get better seek speed with packed inodes.
+
+With default 256 byte inodes, there seem to be 59 bytes to play with.
+(Determined experimentally.)
+
+Note that disks over 4 tb default to 32 kilobyte inodes, so probably most
+spinning hard disks these days *do* pack regular git-annex symlinks
+efficiently. (I don't have a 4 tb disk online to check this.. And I doubt
+CandyAngel was counting only the sizes of symlinks and not git repos
+or at least directory inodes to hold all the symlinks.)
+
+With a prefix like ".git/annex/objects/zX/Wx/S-s1000000000-"
+that leaves 20 bytes out of the 59 for the hash.
+
+That's not enough data to be cryptographically secure, but if
+we use SHA1 or MD5 as the base hash, it wouldn't be anyway. 15 bytes
+of hash state will base64 encode to 20 bytes. SHA1 is a 20 byte hash;
+MD5 is a 16 byte hash. So even MD5 would need to be truncated a little bit.
+Chances of (non-malicious) collision would still be small, only 256
+times as likely as a (non-malicious) MD5 collision. It could easily be made
+harder than MD5/SHA1 to maliciously collide by using truncated SHA2.
+
+(Files larger than 9.3 gb would still have too long symlinks due to the size
+field. The size field could also be omitted or encoded more efficiently,
+but omitting it would reduce git-annex's ability to not overfill disk
+and I don't think re-encoding buys enough to bother.)
+"""]]