Subject:Re: Is it possible to monitor the fair proxy balancer?
From:Grzegorz Nosek (grze@public.gmane.org)
Date:Jun 28, 2008 4:50:39 am
List:ru.sysoev.nginx

Hi,

On Sat, Jun 28, 2008 at 02:23:02 +0200, Robbie Allen wrote:

> Periodically one or more of my mongrel instances will stop getting requests from nginx (via upstream fair). The mongrel process is still running, but not getting any requests.

The standard question: are you running the latest snapshot? There was an update about a week ago, so if not, please give it a try.

I guess I should start versioning the module ;)

> How can I verify if nginx has taken it out of service? Is it possible to get details on the current status of the fair proxy?

No, not really. This is something I'd like to do but currently there's no support for pluggable status reports and I think that writing what would effectively become another module for monitoring upstream_fair is slightly overkill.

Building nginx with --with-debug and setting the error_log level to debug_http may help, but you'd probably drown in the massive amount of logs (not only from upstream_fair; nginx itself is very chatty too).
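One way to cope with the volume is to filter the debug log after the fact. Below is a minimal sketch in Python, assuming the usual nginx error_log line shape ("date time [level] pid#tid: message"). The default marker string "upstream_fair" is an assumption about what the module writes, so adjust it to whatever your build actually logs.

```python
import re

# Shape of an nginx error_log line: "YYYY/MM/DD HH:MM:SS [level] pid#tid: message"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"\[(?P<level>\w+)\] (?P<pid>\d+)#(?P<tid>\d+): (?P<msg>.*)$"
)


def filter_debug_log(lines, marker="upstream_fair"):
    """Yield (time, pid, message) for debug lines whose message contains marker.

    The default marker is an assumption, not a documented log prefix.
    """
    for line in lines:
        m = LOG_LINE.match(line.rstrip("\n"))
        if m and m.group("level") == "debug" and marker in m.group("msg"):
            yield m.group("time"), int(m.group("pid")), m.group("msg")
```

To use it, open your error log and iterate over `filter_debug_log(log_file)`. Lines that do not match the expected shape are skipped silently.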

> I also see the following error in syslog, but I'm unsure if it is related....
>
> nginx[17280]: segfault at 00007fffa0869fd0 rip 00002ac509ea61e3 rsp 00007fffa0869ed0 error 6

That one looks strange. It's a segfault while accessing the stack *above* the stack pointer (the faulting address is rsp + 0x100, i.e. 256 bytes above it), which should be legal, unless something has just allocated more than 256 bytes of stack space and overflowed it.

I can't see any large stack allocations in upstream_fair (though I may have overlooked something), so it may come from nginx itself as well.
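For anyone who wants to check this kind of line themselves, here is a rough sketch that pulls the fields out of the kernel message and decodes the x86 page-fault error bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The function name and the returned dictionary keys are made up for illustration, and the pattern only covers the message format shown in the syslog line above.

```python
import re

# Matches the x86-64 kernel message format quoted above; all numbers are hex.
SEGFAULT = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) rip (?P<rip>[0-9a-f]+) "
    r"rsp (?P<rsp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse a kernel segfault line into addresses and page-fault flags."""
    m = SEGFAULT.search(line)
    if m is None:
        raise ValueError("not a segfault line in the expected format")
    addr = int(m.group("addr"), 16)
    rsp = int(m.group("rsp"), 16)
    err = int(m.group("err"), 16)
    return {
        "addr": addr,
        "rip": int(m.group("rip"), 16),
        "rsp": rsp,
        # Positive: the fault is above the stack pointer; negative: below it.
        "offset_from_rsp": addr - rsp,
        "page_present": bool(err & 1),  # 0 means the page was not mapped
        "write": bool(err & 2),         # 1 means a write, 0 a read
        "user_mode": bool(err & 4),     # 1 means the fault came from user space
    }
```

Calling `decode_segfault()` on the log line gives the distance between the faulting address and rsp, and tells you whether the access was a read or a write.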

Please try increasing the stack size (ulimit -s <something> in your nginx startup script).
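To see which limit is actually in effect, something like the sketch below can be run from the same environment as the startup script. It reads the stack limit of the current process (which child processes inherit) and converts it to KiB, the unit that `ulimit -s` uses in bash. The helper name is hypothetical, and it does not inspect an already running nginx worker.

```python
import resource


def stack_limit_kib():
    """Return the (soft, hard) stack limits in KiB; None means unlimited."""
    def to_kib(value):
        return None if value == resource.RLIM_INFINITY else value // 1024

    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return to_kib(soft), to_kib(hard)


if __name__ == "__main__":
    soft, hard = stack_limit_kib()
    print("soft stack limit (KiB):", soft)
    print("hard stack limit (KiB):", hard)
```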

If such a segfault repeats (or if you haven't restarted nginx since then; reloads are fine), please collect:
- result of pmap pid-of-any-nginx-worker (they should have the same memory map), alternatively cat /proc/<pid>/maps
- the log line with the faulting address, rip and rsp (like above)
- your nginx binary

The above information should prove helpful while tracing the cause of the crash.
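As an illustration of how the memory map and the logged addresses fit together, here is a small sketch that looks up which mapping in the /proc/<pid>/maps output contains a given address, for example the rip from the segfault line. It assumes the standard maps line layout (address range, permissions, offset, device, inode, optional path); the function name is made up for this example.

```python
def find_mapping(maps_text, address):
    """Return (start, end, perms, path) of the mapping containing address.

    maps_text is the content of /proc/<pid>/maps. Returns None when no
    mapping covers the address.
    """
    for line in maps_text.splitlines():
        # Fields: "start-end perms offset dev inode [path]"
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        start_s, _, end_s = fields[0].partition("-")
        start, end = int(start_s, 16), int(end_s, 16)
        if start <= address < end:
            path = fields[5].strip() if len(fields) > 5 else ""
            return start, end, fields[1], path
    return None
```

Feeding it the saved maps output and the rip value shows whether the faulting instruction lives in the nginx binary itself or in a shared library. A result of None for the faulting address means the address was not mapped at the time the map was taken.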