author | Joey Hess <joey@kitenet.net> | 2012-02-14 14:35:52 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2012-02-14 14:37:59 -0400 |
commit | 7ebd98d8d829005c7dae38b789146d98e6800e5b | |
tree | e20bfa80563081153d712633eb25b8c6a0f5fc78 /doc/bugs/git_annex_add_memory_leak.mdwn | |
parent | cdd6cdbb67ea87a4a03357fb75c22d432fa2e15d | |
fix memory leak when staging the journal
The list of files had to be retained until the end so it could be deleted.
Also, a list of update-index lines was generated and only then fed into it.
Now everything streams in constant space.
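
As a rough illustration of the difference between retaining a list and streaming, here is a minimal Python sketch. It is not git-annex's code (git-annex is written in Haskell), and `journal_files`, `stage_journal`, and the one-path-per-line output format are hypothetical names and choices made up for this example. The point it tries to show is that each journal file is handed to the consumer and deleted as soon as it is seen, so no list of files or of generated lines is ever held in memory.

```python
import os
from typing import Iterator, TextIO


def journal_files(journal_dir: str) -> Iterator[str]:
    """Yield journal file paths one at a time (hypothetical helper).

    A generator is used so the full directory listing is never
    materialised as a list.
    """
    with os.scandir(journal_dir) as entries:
        for entry in entries:
            if entry.is_file():
                yield entry.path


def stage_journal(journal_dir: str, sink: TextIO) -> int:
    """Write one line per journal file to `sink`, then delete that file.

    `sink` stands in for the input pipe of whatever process consumes
    the lines; the line format here is a placeholder, not a real
    git input format. Returns the number of files processed.
    """
    count = 0
    for path in journal_files(journal_dir):
        sink.write(path + "\n")  # fed immediately, not collected first
        os.remove(path)          # deleted now, so no list is retained
        count += 1
    return count
```

Memory use in this sketch depends on the size of a single path rather than on the number of journal files. A real implementation would also need to decide what happens if the consumer fails after some files have already been deleted; the sketch ignores that.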
Diffstat (limited to 'doc/bugs/git_annex_add_memory_leak.mdwn')
-rw-r--r-- | doc/bugs/git_annex_add_memory_leak.mdwn | 21 |
1 files changed, 11 insertions, 10 deletions
diff --git a/doc/bugs/git_annex_add_memory_leak.mdwn b/doc/bugs/git_annex_add_memory_leak.mdwn
index b6ae60f7b..891ba318f 100644
--- a/doc/bugs/git_annex_add_memory_leak.mdwn
+++ b/doc/bugs/git_annex_add_memory_leak.mdwn
@@ -12,26 +12,27 @@ A history of the leaks:
 * Originally, `git annex add` remembered all the files
   it had added, and fed them to git at the end. Of course
   that made its memory use grow, so it was fixed to periodically
-  flush its buffer. Affected versions: before 0.20110417
+  flush its buffer. Fixed in version 0.20110417.
 
 * Something called a "lazy state monad" caused "thunks" to build
   up and memory to leak. Also affected other git annex commands
   than `add`. Adding files using a SHA* backend hit the worst.
   Fixed in versions afer 3.20120123.
 
-* A strange GHC bug seemed to be responsible for another leak.
-  (In particular, a child process was forked. All the child did
-  was read filenames from one pipe and shove them reformatted out
-  another pipe. For some reason, it steadily grew in size.)
-  Code was rewritten in a way that happens to avoid that leak.
-  Apparently fixed in versions afer 3.20120123, but this one is not
-  well understood.
-
 * Committing journal files turned out to have another memory leak.
   After adding a lot of files ran out of memory, this left the journal
-  behind and could affect other git-anne commands. Fixed in versions afer
+  behind and could affect other git-annex commands. Fixed in versions afer
   3.20120123.
 
+* Something is still causing a slow leak when adding files.
+  I tested by adding many copies of the whole linux kernel
+  tree into the annex using the WORM backend, and once
+  it had added 1 million files, git-annex used ~100 mb of ram.
+  That's 100 bytes leaked per file on average .. roughly the
+  size of a filename? It's worth noting that `git add` uses more memory
+  than that in such a large tree.
+  **not fixed yet**
+
 * (Note that `git ls-files --others`, which is used to find files to add,
   also uses surpsisingly large amounts of memory when you have a lot of
   files. It buffers