6 messages in net.sourceforge.lists.courier-users[courier-users] Courier perlfilter pr...
FromSent OnAttachments
Michael CarmackApr 27, 2007 3:01 pm 
Sam VarshavchikApr 27, 2007 4:25 pm 
Michael CarmackApr 27, 2007 10:02 pm 
Sam VarshavchikApr 28, 2007 6:50 am 
Michael CarmackApr 28, 2007 9:39 am 
Sam VarshavchikApr 28, 2007 10:10 am.txt
Actions with this message:
Paste this link in email or IM:
Paste this link in email or IM:
Atom feed for this thread
Paste this URL into your reader:
Subject:[courier-users] Courier perlfilter problem on 64-bit LinuxActions...
From:Michael Carmack (kar@karmak.org)
Date:Apr 27, 2007 3:01:49 pm
List:net.sourceforge.lists.courier-users

(Sorry if this goes through twice. The first time I sent it there was a big attachment and I tried to cancel it.)

I've run into a perlfilter problem on a custom 64-bit Linux system. I do not see this problem on an identically configured 32-bit courier install.

The perlfilter will start and seemingly run ok, but almost always when I try to stop it, it goes into some kind of loop. It endlessly respawns all of the perlfitler processes except one, which I assume must be some kind of "root" perlfilter process, as when I manually kill that one, the rest of them finally die for good and the courierfilter stops.

Oddly enough, the act of running an strace on the root perlfilter process seems to fix the problem. If I run strace on the root perlfilter _before_ trying to stop courierfilter, I never see any looping when I shut it down. And if I run strace on the root perlfilter process _after_ trying to stop courierfilter, while all of the rest of the perlfilter processes are respawning, then the respawning immediately stops and the courierfitler stops.

The results of the tests I ran are shown below. Timestamps are included so you can see how fast the processes are respawning.

=========================================== Start testing... ===========================================

root@b1$ ./sbin/courierfilter start

root@b1$ ps -efww | grep courier courier 24720 1 0 07:10 ? 00:00:00 ./sbin/courierfilter start courier 24722 1 0 07:10 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 24723 24720 0 07:10 ? 00:00:00 perlfilter courier 24724 24723 0 07:10 ? 00:00:00 perlfilter courier 24725 24723 0 07:10 ? 00:00:00 perlfilter courier 24726 24723 0 07:10 ? 00:00:00 perlfilter courier 24727 24723 0 07:10 ? 00:00:00 perlfilter courier 24728 24723 0 07:10 ? 00:00:00 perlfilter root 24731 8432 0 07:10 pts/1 00:00:00 grep courier

===================================================== The filter starts up fine, everything looks normal =====================================================

root@b1$ date; ./sbin/courierfilter stop Fri Apr 27 07:11:06 GMT 2007

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:11:12 GMT 2007 courier 24720 1 0 07:10 ? 00:00:00 ./sbin/courierfilter start courier 24722 1 0 07:10 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 24723 24720 0 07:10 ? 00:00:00 perlfilter courier 26156 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> courier 26157 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> courier 26158 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> root 26161 8432 0 07:11 pts/1 00:00:00 grep courier courier 26162 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> courier 26163 24723 0 07:11 ? 00:00:00 perlfilter

================================================================ But notice when it is stopped, a bunch of defunct processes ================================================================

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:11:18 GMT 2007 courier 24720 1 0 07:10 ? 00:00:00 ./sbin/courierfilter start courier 24722 1 0 07:10 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 24723 24720 0 07:10 ? 00:00:00 perlfilter courier 27599 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> courier 27600 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> courier 27601 24723 0 07:11 ? 00:00:00 [perlfilter] <defunct> root 27604 8432 0 07:11 pts/1 00:00:00 grep courier courier 27605 24723 0 07:11 ? 00:00:00 perlfilter courier 27606 24723 0 07:11 ? 00:00:00 perlfilter

============================================================= See how fast they are respawning: ~1500 in 6 seconds Except that first one (pid 24723) -- it never dies. =============================================================

root@b1$ strace -vfp 24723 &>strace.txt

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:11:49 GMT 2007 root 1773 8432 0 07:11 pts/1 00:00:00 grep courier

=============================================================== But here I ran strace on the "root" perlfilter process, and all the rest of them died off for good. Link to strace output: http://karmak.org/2007/strace.txt ===============================================================

============================================================ Now for test #2... ============================================================

root@b1$ date; ./sbin/courierfilter start Fri Apr 27 07:12:08 GMT 2007

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:12:11 GMT 2007 courier 1778 1 0 07:12 ? 00:00:00 ./sbin/courierfilter start courier 1780 1 0 07:12 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 1781 1778 0 07:12 ? 00:00:00 perlfilter courier 1782 1781 0 07:12 ? 00:00:00 perlfilter courier 1783 1781 0 07:12 ? 00:00:00 perlfilter courier 1784 1781 0 07:12 ? 00:00:00 perlfilter courier 1785 1781 0 07:12 ? 00:00:00 perlfilter courier 1786 1781 0 07:12 ? 00:00:00 perlfilter root 1789 8432 0 07:12 pts/1 00:00:00 grep courier

root@b1$ strace -vfp 1781 &>/dev/null

============================================================== Here I've started back up again, and I now I strace the root process *before* trying to stop courierfilter. (There is nothing interesting in this strace, so it is just sent to /dev/null) ==============================================================

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:12:49 GMT 2007 courier 1778 1 0 07:12 ? 00:00:00 ./sbin/courierfilter start courier 1780 1 0 07:12 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 1781 1778 0 07:12 ? 00:00:00 perlfilter courier 1782 1781 0 07:12 ? 00:00:00 perlfilter courier 1783 1781 0 07:12 ? 00:00:00 perlfilter courier 1784 1781 0 07:12 ? 00:00:00 perlfilter courier 1785 1781 0 07:12 ? 00:00:00 perlfilter courier 1786 1781 0 07:12 ? 00:00:00 perlfilter root 1793 8432 0 07:12 pts/1 00:00:00 grep courier

root@b1$ date; ./sbin/courierfilter stop Fri Apr 27 07:12:52 GMT 2007

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:12:53 GMT 2007 root 1798 8432 0 07:12 pts/1 00:00:00 grep courier

========================================================== As you can see, now when I stop courierfilter it shuts down like it should. The act of running strace on the perlfilter process in advance "fixed" it. ==========================================================

========================================================== Ok, one last test... ==========================================================

root@b1$ date; ./sbin/courierfilter start Fri Apr 27 07:12:59 GMT 2007

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:13:09 GMT 2007 courier 1803 1 0 07:13 ? 00:00:00 ./sbin/courierfilter start courier 1805 1 0 07:13 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 1806 1803 0 07:13 ? 00:00:00 perlfilter courier 1807 1806 0 07:13 ? 00:00:00 perlfilter courier 1808 1806 0 07:13 ? 00:00:00 perlfilter courier 1809 1806 0 07:13 ? 00:00:00 perlfilter courier 1810 1806 0 07:13 ? 00:00:00 perlfilter courier 1811 1806 0 07:13 ? 00:00:00 perlfilter root 1814 8432 0 07:13 pts/1 00:00:00 grep courier

========================================================= Start courierfilter. Everything is ok. =========================================================

root@b1$ date; ./sbin/courierfilter stop Fri Apr 27 07:13:11 GMT 2007

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:13:12 GMT 2007 courier 1803 1 0 07:12 ? 00:00:00 ./sbin/courierfilter start courier 1805 1 0 07:12 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 1806 1803 0 07:12 ? 00:00:00 perlfilter courier 2081 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> courier 2082 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> courier 2083 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> root 2086 8432 0 07:13 pts/1 00:00:00 grep courier courier 2087 1806 0 07:13 ? 00:00:00 perlfilter courier 2088 1806 0 07:13 ? 00:00:00 perlfilter

======================================================== Try to stop it though, and the respawning starts. ========================================================

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:13:18 GMT 2007 courier 1803 1 0 07:13 ? 00:00:00 ./sbin/courierfilter start courier 1805 1 0 07:13 ? 00:00:00 /pkg/courier-authlib/0.59.3/.001/sbin/courierlogger courierfilter courier 1806 1803 0 07:13 ? 00:00:00 perlfilter courier 3320 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> courier 3321 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> courier 3322 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> courier 3323 1806 0 07:13 ? 00:00:00 [perlfilter] <defunct> root 3326 8432 0 07:13 pts/1 00:00:00 grep courier

root@b1$ kill 1806

root@b1$ date; ps -efww | grep courier Fri Apr 27 07:13:29 GMT 2007 root 5580 8432 0 07:13 pts/1 00:00:00 grep courier

=============================================================== But when I killed the "root" perlfitler, all the rest died and courierfitler stopped. ===============================================================