[[!comment format=mdwn username="https://www.google.com/accounts/o8/id?id=AItOawlmRpGORNKWimtzqItvwm4I6cn16vx8OvU" nickname="hayden" subject="Many network sockets with associated fds hanging around" date="2013-08-15T18:00:42Z" content=""" I see something similar in logs and after roughly 10 mins the web-apps dies. So I think I hit the same as the above user. Sometimes I get thread deaths and restart requests but the root cause appears to match the scenario mentioned over. Often the webapp just hangs. But always when it hits the fd ulimit... 1024 on this system. git-annex version is 4.20130802-g1452ac3 and I used the static-linked linux tar.gz linux-binary download. Test setup is a from scratch assistant startup on Ubuntu 12.04. Not exactly a clean ubuntu though, so maybe difficult to duplicate troubles at your end. I fired up the web-app with a cleaned out config. No signs of leaks until an annex is created. On creation of an empty annex I get fd leaks a about 1 per second after a repository is created. Strace'ing the main process only shows 8-bytes writes (see below) at the same rate as the leak. Sometimes the fd-leak stops before the resource limit, sometimes not. Creating a new annex on top of an existing directory tree with many files is pretty reliable trigger though. Startup scan finishes and fds leak away until the ulimit is hit. hayden@orca:~/gamma$ ls /proc/26319/fd 0 10 12 14 16 18 2 21 23 25 27 29 30 32 34 36 38 4 41 43 45 6 8 1 11 13 15 17 19 20 22 24 26 28 3 31 33 35 37 39 40 42 44 5 7 9 hayden@orca:~/gamma$ ls /proc/26319/fd 0 10 12 14 16 18 2 21 23 25 27 29 30 32 34 36 38 4 41 43 45 6 8 1 11 13 15 17 19 20 22 24 26 28 3 31 33 35 37 39 40 42 44 5 7 9 hayden@orca:~/gamma$ ls /proc/26319/fd 0 10 12 14 16 18 2 21 23 25 27 29 30 32 34 36 38 4 41 43 45 5 7 9 1 11 13 15 17 19 20 22 24 26 28 3 31 33 35 37 39 40 42 44 46 6 8 hayden@orca:~/gamma$ ls /proc/26319/fd/43 /proc/26319/fd/43 hayden@orca:~/gamma$ ls -l /proc/26319/fd/43 ls -l /proc/26319/fd/43 lrwx------ 1 hayden hayden 64 Aug 14 21:10 /proc/26319/fd/43 -> socket:[568994] hayden@orca:~/gamma$ lsof | grep 568994 git-annex 26319 hayden 43u IPv4 568994 0t0 UDP 224.0.0.251:55556 hayden@orca:~/gamma$ uname -a Linux orca 3.2.0-25-generic #40-Ubuntu SMP Wed May 23 20:30:51 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux hayden@orca:~/gamma$ cat /etc/issue Ubuntu 12.04 LTS \n \l hayden@orca:~/gamma$ fuser 55556/udp -a 55556/udp: 4415 26319 27080 27083 hayden@orca:~/gamma$ ps aux | grep 4415 hayden 4415 0.0 0.0 66716 3036 ? S Aug12 0:03 curl -s --head -L http://127.0.0.1:38464/?auth=da9c4aba4cc2db9cf78574753f6e94d8031c6a7bdf8bfe100bde868f57b81fd751965cc9d68a9afac79f826d257a256a04ce62615a7f23cd7c925969dda1c7b8 -w %{http_code} hayden 27458 0.0 0.0 10612 924 pts/0 S+ 21:50 0:00 grep --color=auto 4415 hayden@orca:~/gamma$ ps aux | grep 26319 hayden 26319 0.1 0.7 497188 28560 pts/5 Sl 21:09 0:04 git-annex webapp hayden 26338 3.4 3.4 1035572 137864 pts/5 Sl 21:09 1:25 /usr/lib/firefox/firefox /tmp/webapp26319.html hayden 27460 0.0 0.0 10612 920 pts/0 S+ 21:50 0:00 grep --color=auto 26319 hayden@orca:~/gamma$ ps aux | grep 27080 hayden 27080 0.0 0.0 16476 1288 pts/5 S 21:33 0:00 git --git-dir=/home/hayden/boo/.git --work-tree=/home/hayden/boo cat-file --batch hayden 27462 0.0 0.0 10612 924 pts/0 S+ 21:50 0:00 grep --color=auto 27080 hayden@orca:~/gamma$ ps aux | grep 27083 hayden 27083 0.0 0.0 16476 1060 pts/5 S 21:33 0:00 git --git-dir=/home/hayden/boo/.git --work-tree=/home/hayden/boo check-attr -z --stdin annex.backend annex.numcopies -- hayden 27464 0.0 0.0 10612 920 pts/0 S+ 21:51 0:00 grep --color=auto 27083 ----> has 579 open fds at this point but this number holds stable over 10 min (copy in new tree to provoke) (no change) ----> restart daemon in gui to provoke (new process has open fds slowly climbing after startup scan) straces look like this repeating every second. (clipped down) futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted) --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 202 futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted) --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 202 futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted) --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 202 futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL) = ? ERESTARTSYS (To be restarted) --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- write(6, \"\377\0\0\0\0\0\0\0\", 8) = 8 rt_sigreturn(0x2) = 202 futex(0x34f001c, FUTEX_WAIT_PRIVATE, 51, NULL^C Process 29803 detached Until after roughly 10 mins... hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1017 hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1023 hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1024 hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1024 hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1024 hayden@orca:~/boo$ ls /proc/29803/fd | wc -l 1024 hayden@orca:~/boo$ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 31164 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 31164 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited hayden@orca:~/boo$ At this point the webapp hangs but a number of interesting crashes may occur. I've also seen the particular error in the previous users log (on a big tree). [2013-08-14 21:56:12 CEST] main: starting assistant version 4.20130802-g1452ac3 (scanning...) [2013-08-14 21:56:12 CEST] Watcher: Performing startup scan (started...) DaemonStatus crashed: /home/hayden/boo/.git/annex/: openTempFile: resource exhausted (Too many open files) [2013-08-14 22:06:12 CEST] DaemonStatus: warning DaemonStatus crashed: /home/hayden/boo/.git/annex/: openTempFile: resource exhausted (Too many open files) Is any of the above helpful? Anything else useful to kick for testing that you'd like done? I'd guess this is something weird with my ubuntu setup that provokes this as more users would see it otherwise. """]]