Subject: Re: Question about http_stub_status
From: Marcus Bianchy (marc@public.gmane.org)
Date: Aug 23, 2008 4:17:44 am
List: ru.sysoev.nginx

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Igor,

Thank you for the quick answer.

Igor Sysoev wrote:

> nginx has special rotation signal: SIGUSR1.

OK, I will fix that. So far I just called our /etc/init.d script (which sends a SIGHUP) after logrotate. It's a better idea to send a USR1 from logrotate directly to the master process. Or, even better: change the init script to do that.
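A minimal sketch of what the rotation hook could do, assuming the master process writes its PID to a pid file. The function name `reopen_nginx_logs` and the default path `/var/run/nginx.pid` are my assumptions for illustration; the real path is whatever the `pid` directive in nginx.conf says.

```python
import os
import signal


def reopen_nginx_logs(pid_file="/var/run/nginx.pid"):
    """Send SIGUSR1 to the nginx master process named in pid_file.

    The default pid_file path is an assumption; use the value of the
    `pid` directive from your own nginx.conf.
    """
    with open(pid_file) as f:
        master_pid = int(f.read().strip())
    # SIGUSR1 asks the master process to reopen its log files,
    # unlike SIGHUP, which reloads the configuration and starts new workers.
    os.kill(master_pid, signal.SIGUSR1)
    return master_pid
```

Something like this would be called once after the log files have been moved away, for example from the logrotate postrotate step, and it needs enough privileges to signal the master process.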

> This means that either someone killed nginx workers using SIGTERM/INT/KILL or workers exited abnormally. Could you run
>
> grep alert error_log

Well, I can say that no one on our team sends such signals around... We have been observing strange signal 8 (SIGFPE) errors lately. A typical "grep/zgrep signal" over our error.logs shows something like this:

############ snip ############
2008/08/22 10:09:42 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 10:09:42 [alert] 28631#0: worker process 27809 exited on signal 8
2008/08/22 10:09:42 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 12:58:06 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 12:58:06 [alert] 28631#0: worker process 27810 exited on signal 8
2008/08/22 12:58:06 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 12:58:06 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 12:58:06 [alert] 28631#0: worker process 32013 exited on signal 8
2008/08/22 12:58:06 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 12:58:11 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 12:58:11 [alert] 28631#0: worker process 27811 exited on signal 8
2008/08/22 12:58:11 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 12:58:20 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 12:58:20 [alert] 28631#0: worker process 785 exited on signal 8
2008/08/22 12:58:20 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 12:59:36 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 12:59:36 [alert] 28631#0: worker process 1342 exited on signal 8
2008/08/22 12:59:36 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/22 13:00:06 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/22 13:00:06 [alert] 28631#0: worker process 1343 exited on signal 8
2008/08/22 13:00:06 [notice] 28631#0: signal 29 (SIGIO) received
2008/08/23 04:02:18 [notice] 28631#0: signal 17 (SIGCHLD) received
2008/08/23 04:02:18 [alert] 28631#0: worker process 1344 exited on signal 8
2008/08/23 04:02:18 [notice] 28631#0: signal 29 (SIGIO) received
################## snip #############

Logrotate runs at 04:00 in the morning, which would explain the SIGCHLD/SIGFPE at 04:02:18. But the real problem is the signals at around 1pm; neither the access.log nor the error.log gives any hint about what causes this behaviour. And guess what: yesterday at 1pm the values for active/waiting connections increased to ~30000/35000.
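To make the timing easier to see than with a plain grep, the alert lines could be bucketed by hour. This is only a rough sketch: the function name `count_worker_exits` and the regular expression are mine, written against the alert line format shown in the excerpt above, and they assume one log entry per line as in the real error.log.

```python
import re
from collections import Counter

# Matches nginx alert lines such as:
# 2008/08/22 10:09:42 [alert] 28631#0: worker process 27809 exited on signal 8
EXIT_RE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<hour>\d{2}):\d{2}:\d{2} \[alert\] "
    r"\d+#\d+: worker process (?P<pid>\d+) exited on signal (?P<sig>\d+)"
)


def count_worker_exits(lines):
    """Count abnormal worker exits per (date, hour, signal number)."""
    counts = Counter()
    for line in lines:
        m = EXIT_RE.match(line)
        if m:
            counts[(m["date"], m["hour"], int(m["sig"]))] += 1
    return counts
```

Feeding it an open error.log file object (or the lines of a decompressed rotated log) gives one counter entry per date, hour, and signal, so a cluster like the five signal 8 exits between 12:58 and 12:59 on 2008/08/22 stands out from the single exit after logrotate.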

Would it be a good idea to enable core dumps, to find out exactly what causes these signals?

Marcus Bianchy

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFIr/HvbraE9AYT0pERAqDmAKCRI/WfiZxMGwzyHZlrBGNHxMXa8ACeMHA1
qILe8PxrIDVNY16Ihwg9wyk=
=wXFe
-----END PGP SIGNATURE-----