summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2015-09-03 13:52:43 -0700
committerGravatar Joey Hess <joeyh@joeyh.name>2015-09-03 13:52:43 -0700
commit7346484fcd91d668cbffc0f4a3bcea3f6d1b1823 (patch)
tree5218ab628fcd4087b00928e7613e69e603a44f44
parent3f52a8b777c2888aafa240d23834a1d6587cc536 (diff)
possible optimisation idea
-rw-r--r--doc/todo/use_git-mktree_rather_than_index_file.mdwn29
1 files changed, 29 insertions, 0 deletions
diff --git a/doc/todo/use_git-mktree_rather_than_index_file.mdwn b/doc/todo/use_git-mktree_rather_than_index_file.mdwn
new file mode 100644
index 000000000..f02fafb08
--- /dev/null
+++ b/doc/todo/use_git-mktree_rather_than_index_file.mdwn
@@ -0,0 +1,29 @@
+When git-annex is updating the git-annex branch, it currently
+uses a separate index file. This adds overhead and complexity to the code.
+Especially when there are many files, the index file gets large and writing
+it gets slow.
+
+It might be an improvement to use `git mktree --batch` to inject a
+tree object into git, without using the index file. `git hash-object`
+is already used to add the files to git. All that would be needed is to
+generate an updated tree containing the new file(s), and then update each
+parent tree up to the root tree. This new tree can then be committed using
+`git commit-tree`
+
+The only thing I can see that might make this slow at all is reading the old
+tree contents, in order to update it. This would need a `git ls-tree` for
+each tree; it does not have a batch mode, so 4 processes would need to be
+spawned when generating a tree that changes 1 file. For any repo that's not
+very small, that's probably still faster than rewriting the index file.
+
+Notes:
+
+* The union merge code currently uses the index. No particular reason
+ it needs to; that's just how the code is written, and it might be a large
+ rewrite to change it.
+* A new git-annex branch can be pushed into the repository at any time.
+ The current code uses the index to detect when this happens, and
+ union merges the new branch head into the index. Would need something
+ like a `GIT_ANNEX_HEAD` ref to do the same if the index is removed.
+
+Thanks to sm for indirectly suggesting this. --[[Joey]]