From 2b28c70f5fd9c03cadd39615a4abd4d12f4a9c35 Mon Sep 17 00:00:00 2001
From: Joey Hess
Date: Tue, 14 Feb 2012 01:01:38 -0400
Subject: add, and immediately close bug. useful documentation though

---
 doc/bugs/git_annex_add_memory_leak.mdwn | 38 +++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 doc/bugs/git_annex_add_memory_leak.mdwn

diff --git a/doc/bugs/git_annex_add_memory_leak.mdwn b/doc/bugs/git_annex_add_memory_leak.mdwn
new file mode 100644
index 000000000..846e76436
--- /dev/null
+++ b/doc/bugs/git_annex_add_memory_leak.mdwn
@@ -0,0 +1,38 @@
+For the record, `git annex add` has had a series of memory leaks.
+Mostly these are minor -- until you need to check in a few
+million files in a single operation.
+
+If this happens to you, git-annex will run out of memory and stop.
+(Generally well before your system runs out of memory, since it has some
+built-in ulimits.) You can recover by just re-running the `git annex add`
+-- it will automatically pick up where it left off.
+
+A history of the leaks:
+
+* Originally, `git annex add` remembered all the files
+  it had added, and fed them to git at the end. Of course
+  that made its memory use grow, so it was fixed to periodically
+  flush its buffer. Affected versions: before 0.20110417
+
+* Something called a "lazy state monad" caused "thunks" to build
+  up and memory to leak. Also affected git annex commands other
+  than `add`. Adding files using a SHA* backend was hit the worst.
+  Fixed in versions after 3.20120123.
+
+* A strange GHC bug seemed to be responsible for another leak.
+  (In particular, a child process was forked. All the child did
+  was read filenames from one pipe and shove them reformatted out
+  another pipe. For some reason, it steadily grew in size.)
+  Code was rewritten in a way that happens to avoid that leak.
+  Apparently fixed in versions after 3.20120123, but this one is not
+  well understood.
+
+* (Note that `git ls-files --others`, which is used to find files to add,
+  also uses surprisingly large amounts
+  of memory when you have a lot of files. It buffers
+  the entire list, so it can compare it with the files in the index,
+  before outputting anything.
+  This is Not Our Problem, but I'm sure the git developers
+  would appreciate a patch that fixes it.)
+
+[[done]] ... for now --[[Joey]]
-- 
cgit v1.2.3
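
The first leak in the history above (remembering every added file and feeding the whole list to git only at the end) and its fix (periodically flushing the buffer) can be illustrated with a small sketch. This is Python written purely for illustration, not git-annex's actual Haskell code; the names `add_in_batches`, `stage`, and `batch_size` are hypothetical, and the batch size of 1000 is an arbitrary assumption.

```python
from typing import Callable, Iterable, List


def add_in_batches(
    filenames: Iterable[str],
    stage: Callable[[List[str]], None],
    batch_size: int = 1000,  # arbitrary illustrative value
) -> int:
    """Pass filenames to `stage` in chunks of at most `batch_size`.

    `stage` is a hypothetical stand-in for "feed these files to git".
    Returns the total number of filenames handled.
    """
    buffer: List[str] = []
    total = 0
    for name in filenames:
        buffer.append(name)
        if len(buffer) >= batch_size:
            stage(buffer)          # hand the full batch over...
            total += len(buffer)
            buffer = []            # ...and start a fresh, empty buffer
    if buffer:                     # flush whatever is left at the end
        stage(buffer)
        total += len(buffer)
    return total


if __name__ == "__main__":
    # Print the size of each batch instead of talking to git.
    add_in_batches(
        (f"file{i}" for i in range(2500)),
        lambda batch: print(len(batch)),
    )
```

Because the buffer never holds more than `batch_size` names, memory use stays roughly constant as long as `stage` does not keep the batches it receives, whereas collecting everything first grows linearly with the number of files.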