Subject: RE: [courier-users] Re: Courier does not deliver mail - courieresmtp spins on each outbound message
From: David Humphrey (knig@attbi.com)
Date: Oct 4, 2002 7:29:03 am
List: net.sourceforge.lists.courier-users

Brian,

Excellent suggestion. I have gotten so far away from code development that I clean forgot to even try the system debuggers, largely because I am no longer familiar with their use or with the internals of Linux. However, with your suggestion below I *did* give it a shot.

I 'straced' the courieresmtp process that was spinning on the message to be sent externally. Surprisingly it showed no activity in the trace. (?!? Uh-oh, a loop w/out a wait?) So I tried to attach to the parent courieresmtp process to see what was up there (and to see if the -f option would say anything about child process communications). In this case the process looked normal. It was sleeping on a select call:

select(5, [0 4], NULL, NULL, NULL

which made sense. The binaries are stripped, but I am not sure that that makes a difference to strace. Have I missed something? The proc table shows that the process has two pipes open and two sockets open, but I'm not skilled enough to dig much deeper there. One socket is connected to the authdaemon at /var/spool/courier/authdaemon, but I have no information on the other.
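For anyone wanting to dig into the proc table in the same way, here is a minimal sketch that lists a process's open descriptors by reading the symlinks under /proc/<pid>/fd. It assumes a Linux /proc filesystem; the function name `list_open_fds` is made up for illustration, and inspecting a process owned by another user generally requires root.

```python
import os

def list_open_fds(pid):
    """Return {fd_number: link_target} for a Linux process, read from /proc/<pid>/fd."""
    fd_dir = f"/proc/{pid}/fd"
    result = {}
    for name in os.listdir(fd_dir):
        try:
            result[int(name)] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
    return result
```

Calling `list_open_fds(pid)` with the PID of the courieresmtp parent should show pipes as `pipe:[inode]` and sockets as `socket:[inode]`. The socket inode can then be looked up in /proc/net/unix or /proc/net/tcp to see what the unidentified socket is attached to.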

So, moving forward with Benjamin's advice, I decided to proceed with the recompile using "--without-ipv6". Sadly, however, that produced no change. Though moving through the thread to where I am now, it makes a lot of sense that this may in fact be the problem. I just can't figure out how.

Guys I *really* appreciate your help on this. I can't seem to figure it out on my own, and I'm really stumped. Thanks.

-Ace

P.S. I may not have mentioned: this is a Red Hat 7.3 system w/ a 2.4.18-3 kernel, though I know of no problems with this release...

-----Original Message-----
From: Brian Candler [mailto:B.Ca@pobox.com]
Sent: Friday, October 04, 2002 5:32 AM
To: David Humphrey
Cc: cour@lists.sourceforge.net
Subject: Re: [courier-users] Re: Courier does not deliver mail - courieresmtp spins on each outbound message

On Thu, Oct 03, 2002 at 08:44:29PM -0400, David Humphrey wrote:

Everything looks pretty OK. The message is 'submit'ted, and is seen in the msgs and msgq directories. But it never moves from there, and any ESMTP process started up that finds it (or *any* outbound messages) just sucks down the CPU.

Well there are a couple of things you can try to localise the problem when there is a runaway process. The easiest is to use strace (Linux), ktrace (FreeBSD) or truss (Solaris), which will show you what system calls the process is making.
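A system-call tracer only reports system calls, so a process that is looping purely in user space produces an empty trace while still burning CPU. One rough way to confirm that on Linux is to compare the user and system CPU counters in /proc/<pid>/stat. The sketch below is illustrative only; the helper name `cpu_ticks` is hypothetical and it assumes a Linux /proc filesystem.

```python
import os

def cpu_ticks(pid):
    """Return (utime, stime) in clock ticks from /proc/<pid>/stat on Linux."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name (field 2) is wrapped in parentheses and may contain
    # spaces, so split after the last ')' before counting fields.
    fields = data.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]), int(fields[12])
```

Sampling `cpu_ticks(pid)` twice a few seconds apart gives the picture: if utime climbs while stime barely moves, the process is spinning in its own code rather than in the kernel. Dividing by `os.sysconf("SC_CLK_TCK")` converts ticks to seconds.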

Otherwise you can get the process to dump core (e.g. send it a SIGABRT), cd to the source directory, and run gdb -c </path/to/corefile> <program>. Then 'bt' will give a backtrace showing where it was running at the instant it died and the function call sequence that led there, which you can post to Sam.

Actually getting it to create a core file is usually the hard part, especially if it is a non-root process and/or is setuid.

I'm not a Linux user; the tricks for FreeBSD are

    sysctl -a | grep core
    # set these sysctl variables appropriately

and Solaris:

    mkdir /var/core
    chmod 1777 /var/core
    coreadm -p /var/core/%f.%p <process-id>
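The message does not cover Linux. As a hedged sketch, on Linux kernels that expose /proc/<pid>/limits (an assumption; older kernels such as the 2.4 series may not provide this file) the core file size limit of a running process can be read as shown below. The helper name `core_limit` is hypothetical.

```python
import os

def core_limit(pid):
    """Return the (soft, hard) 'Max core file size' values from /proc/<pid>/limits."""
    label = "Max core file size"
    with open(f"/proc/{pid}/limits") as f:
        for line in f:
            if line.startswith(label):
                # The remainder of the line holds: soft limit, hard limit, units.
                soft, hard = line[len(label):].split()[:2]
                return soft, hard
    raise LookupError("core file size limit not found")
```

A soft limit of `0` means no core file will be written for that process, which is one common reason a SIGABRT leaves nothing behind.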