summaryrefslogtreecommitdiff
path: root/doc/bugs/git-annex_opens_too_many_files/comment_4_c03bde64be8fdd962826bc7afa07d2a9._comment
blob: b4c4587063f16368953c165bf78fd1dcc9195793 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
[[!comment format=mdwn
 username="https://www.google.com/accounts/o8/id?id=AItOawlmRpGORNKWimtzqItvwm4I6cn16vx8OvU"
 nickname="hayden"
 subject="Many network sockets with associated fds hanging around"
 date="2013-08-15T18:00:42Z"
 content="""
I see something similar in logs and after roughly 10 mins the web-apps dies. 
So I think I hit the same as the above user. 
Sometimes I get thread deaths and restart requests but the root cause appears to match the scenario mentioned over.  
Often the webapp just hangs.  But always when it hits the fd ulimit... 1024 on this system.

git-annex version is 4.20130802-g1452ac3 and I used the static-linked linux tar.gz linux-binary download.

Test setup is a from scratch assistant startup on Ubuntu 12.04.  
Not exactly a clean ubuntu though, so maybe difficult to duplicate troubles at your end.

I fired up the web-app with a cleaned out config.  No signs of leaks until an annex is created. 
On creation of an empty annex I get fd leaks a about 1 per second after a repository is created.  
Strace'ing the main process only shows 8-bytes writes (see below) at the same rate as the leak.
Sometimes the fd-leak stops before the resource limit, sometimes not.  
Creating a new annex on top of an existing directory tree with many files is pretty reliable trigger though.
Startup scan finishes and fds leak away until the ulimit is hit.

    hayden@orca:~/gamma$ ls /proc/26319/fd
    0  10  12  14  16  18  2   21  23  25  27  29  30  32  34  36  38  4   41  43  45  6  8
    1  11  13  15  17  19  20  22  24  26  28  3   31  33  35  37  39  40  42  44  5   7  9
    hayden@orca:~/gamma$ ls /proc/26319/fd
    0  10  12  14  16  18  2   21  23  25  27  29  30  32  34  36  38  4   41  43  45  6  8
    1  11  13  15  17  19  20  22  24  26  28  3   31  33  35  37  39  40  42  44  5   7  9
    hayden@orca:~/gamma$ ls /proc/26319/fd
    0  10  12  14  16  18  2   21  23  25  27  29  30  32  34  36  38  4   41  43  45  5  7  9
    1  11  13  15  17  19  20  22  24  26  28  3   31  33  35  37  39  40  42  44  46  6  8

    hayden@orca:~/gamma$ ls /proc/26319/fd/43
    /proc/26319/fd/43

    hayden@orca:~/gamma$ ls -l  /proc/26319/fd/43
    ls -l  /proc/26319/fd/43
    lrwx------ 1 hayden hayden 64 Aug 14 21:10 /proc/26319/fd/43 -> socket:[568994]

    hayden@orca:~/gamma$ lsof | grep 568994
    git-annex 26319     hayden   43u     IPv4             568994      0t0      UDP 224.0.0.251:55556 

    hayden@orca:~/gamma$ uname -a
    Linux orca 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

    hayden@orca:~/gamma$ cat /etc/issue
    Ubuntu 12.04 LTS \n \l

    hayden@orca:~/gamma$ fuser 55556/udp -a
    55556/udp:            4415 26319 27080 27083

    hayden@orca:~/gamma$ ps aux | grep 4415
    hayden    4415  0.0  0.0  66716  3036 ?        S    Aug12   0:03 curl -s --head -L http://127.0.0.1:38464/?auth=da9c4aba4cc2db9cf78574753f6e94d8031c6a7bdf8bfe100bde868f57b81fd751965cc9d68a9afac79f826d257a256a04ce62615a7f23cd7c925969dda1c7b8 -w %{http_code}
    hayden   27458  0.0  0.0  10612   924 pts/0    S+   21:50   0:00 grep --color=auto 4415

    hayden@orca:~/gamma$ ps aux | grep 26319
    hayden   26319  0.1  0.7 497188 28560 pts/5    Sl   21:09   0:04 git-annex webapp
    hayden   26338  3.4  3.4 1035572 137864 pts/5  Sl   21:09   1:25 /usr/lib/firefox/firefox /tmp/webapp26319.html
    hayden   27460  0.0  0.0  10612   920 pts/0    S+   21:50   0:00 grep --color=auto 26319

    hayden@orca:~/gamma$ ps aux | grep 27080
    hayden   27080  0.0  0.0  16476  1288 pts/5    S    21:33   0:00 git --git-dir=/home/hayden/boo/.git --work-tree=/home/hayden/boo cat-file --batch
    hayden   27462  0.0  0.0  10612   924 pts/0    S+   21:50   0:00 grep --color=auto 27080

    hayden@orca:~/gamma$ ps aux | grep 27083
    hayden   27083  0.0  0.0  16476  1060 pts/5    S    21:33   0:00 git --git-dir=/home/hayden/boo/.git --work-tree=/home/hayden/boo check-attr -z --stdin annex.backend annex.numcopies --
    hayden   27464  0.0  0.0  10612   920 pts/0    S+   21:51   0:00 grep --color=auto 27083

----> has 579 open fds at this point but this number holds stable over 10 min
(copy in new tree to provoke)
(no change)
----> restart daemon in gui to provoke
(new process has open fds slowly climbing after startup scan) 

straces look like this repeating every second. (clipped down)
 
    futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 202
    futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 202
    futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    rt_sigreturn(0x1a)                      = 202
    futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted)
    --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---
    write(6, \"\377\0\0\0\0\0\0\0\", 8)       = 8
    rt_sigreturn(0x2)                       = 202
    futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL^C <unfinished ...>
    Process 29803 detached

Until after roughly 10 mins...

    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1017
    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1023
    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1024
    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1024
    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1024
    hayden@orca:~/boo$ ls /proc/29803/fd | wc -l
    1024

    hayden@orca:~/boo$ ulimit -a
    core file size          (blocks, -c) 0
    data seg size           (kbytes, -d) unlimited
    scheduling priority             (-e) 0
    file size               (blocks, -f) unlimited
    pending signals                 (-i) 31164
    max locked memory       (kbytes, -l) 64
    max memory size         (kbytes, -m) unlimited
    open files                      (-n) 1024
    pipe size            (512 bytes, -p) 8
    POSIX message queues     (bytes, -q) 819200
    real-time priority              (-r) 0
    stack size              (kbytes, -s) 8192
    cpu time               (seconds, -t) unlimited
    max user processes              (-u) 31164
    virtual memory          (kbytes, -v) unlimited
    file locks                      (-x) unlimited
    hayden@orca:~/boo$ 

At this point the webapp hangs but a number of interesting crashes may occur. I've also seen the particular error in the previous users log (on a big tree). 

    [2013-08-14 21:56:12 CEST] main: starting assistant version 4.20130802-g1452ac3
    (scanning...) [2013-08-14 21:56:12 CEST] Watcher: Performing startup scan
    (started...) DaemonStatus crashed: /home/hayden/boo/.git/annex/: openTempFile: resource exhausted (Too many open files)
    [2013-08-14 22:06:12 CEST] DaemonStatus: warning DaemonStatus crashed: /home/hayden/boo/.git/annex/: openTempFile: resource exhausted (Too many open files)

Is any of the above helpful?  Anything else useful to kick for testing that you'd like done?
I'd guess this is something weird with my ubuntu setup that provokes this as more users would see it otherwise. 
"""]]