[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="4.154.6.49"
 subject="comment 1"
 date="2012-11-28T18:16:10Z"
 content="""
`git-annex unused` needs to scan the entire repository. But it uses a bloom filter, so its complexity is O(n) in the number of keys.
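
To give a rough idea of why a bloom filter keeps this linear, here is a minimal Python sketch. It is not git-annex's actual code (git-annex is written in Haskell), and the names `BloomFilter` and `find_unused`, the filter size and the hashing scheme are all made up for illustration. The point is only that each key is handled once, in a fixed amount of memory.

```python
import hashlib


class BloomFilter:
    # Fixed-size bit array. Membership tests can give false positives,
    # but never false negatives.
    def __init__(self, num_bits=1 << 20, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def find_unused(referenced_keys, present_keys):
    # One pass over the referenced keys, one pass over the keys whose
    # content is present: O(n) work overall.
    seen = BloomFilter()
    for key in referenced_keys:
        seen.add(key)
    return [key for key in present_keys if key not in seen]
```

In this sketch a false positive can only make an unused key look used, so it may miss some unused keys but never reports a key that is still referenced.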

`git annex fsck` scans the entire repository and also reads all available file content. But we have incremental fsck support now.

The rest of git-annex is designed to have good locality.

The main problem you are likely to run into is inefficiencies with git's index file. This file records the status of every file in the repository, and commands like `git add` rewrite the whole file. git-annex uses a journal to minimise operations that need to rewrite the git index file, but this won't help you when you're using raw git commands in the repository.
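
As a back-of-the-envelope illustration of why rewriting the whole index hurts in a large repository, here is a toy cost model in Python. It does not reflect how git or the git-annex journal are actually implemented; `entries_written` and `batch_size` are hypothetical names, and the model just counts index entries written when every flush rewrites the whole file.

```python
def entries_written(index_size, num_updates, batch_size=1):
    # Count index entries written while adding num_updates files to an
    # index that already holds index_size entries. Every flush rewrites
    # the whole index; batch_size updates are grouped into one flush.
    written = 0
    pending = 0
    for _ in range(num_updates):
        index_size += 1  # one more file is tracked
        pending += 1
        if pending == batch_size:
            written += index_size  # whole-file rewrite
            pending = 0
    if pending:
        written += index_size  # flush whatever is left over
    return written


# 1000 adds into a 100000-entry index, one rewrite per add:
print(entries_written(100000, 1000))        # 100500500
# The same adds grouped into a single rewrite:
print(entries_written(100000, 1000, 1000))  # 101000
```

Under these assumptions, one rewrite per add costs roughly a thousand times more writes than a single batched rewrite, and the gap widens as the repository grows.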

"""]]