[[!comment format=mdwn
username="http://joeyh.name/"
ip="209.250.56.55"
subject="comment 3"
date="2014-07-04T19:26:00Z"
content="""
Looking at the code, it's pretty clear why this is using a lot of memory:
<pre>
fs <- getJournalFiles jl
liftIO $ do
h <- hashObjectStart g
Git.UpdateIndex.streamUpdateIndex g
[genstream dir h fs]
hashObjectStop h
return $ liftIO $ mapM_ (removeFile . (dir </>)) fs
</pre>
So the big list in `fs` has to be retained in memory after the files are streamed to update-index, in order for them to be deleted!
Fixing this is a bit tricky: new journal files can appear while this is going on, so it can't just run `getJournalFiles` a second time to get the list of files to clean up.
Maybe delete each file right after it's been sent to git-update-index? But git-update-index is going to want to read the file, and we don't really know when it will choose to do so; it could potentially wait a while after we've sent it the filename.
Also, per [[!commit 750c4ac6c282d14d19f79e0711f858367da145e4]], we cannot delete the journal files until *after* the commit, or another git-annex process would see inconsistent data!
I actually think I am going to need to use a temp file to hold the list of files.
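The temp-file idea could look something like this (a minimal standalone sketch, not git-annex's actual code; the example filenames are made up for illustration). As each journal file's name is streamed out, it is also appended to a temp file; after the commit, `readFile` reads the names back lazily, so the full list never has to be retained in memory:

```haskell
import System.IO (openTempFile, hPutStrLn, hClose)
import System.Directory (getTemporaryDirectory, removeFile)
import Control.Monad (forM_)

main :: IO ()
main = do
  tmpdir <- getTemporaryDirectory
  (listfile, h) <- openTempFile tmpdir "journal-list"
  -- While streaming each journal file to update-index, also append
  -- its name to the temp file instead of keeping it in a list.
  forM_ ["0001.log", "0002.log"] $ \f ->  -- stand-ins for streamed names
    hPutStrLn h f
  hClose h
  -- After the commit: readFile is lazy, so the names are consumed one
  -- at a time rather than the whole list being held at once.
  names <- lines <$> readFile listfile
  forM_ names $ \f ->
    putStrLn ("remove journal/" ++ f)  -- would be removeFile (dir </> f)
  removeFile listfile
```

That keeps the memory use constant regardless of how many journal files there are, at the cost of an extra temp file on disk.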
"""]]