Subject: Re: [courier-users] LDAP Tempfails, "Enter username and password" boxes, and courier-authdaemon
From: Sam Varshavchik (mrs@courier-mta.com)
Date: Jun 19, 2007 3:37:38 pm
List: net.sourceforge.lists.courier-users

Adam Bultman writes:

For about 6 months, it was only the one "large" server (the ~22k-user one) and things ran fine. We have a Foundry SI4G load balancer which we were using to load balance LDAP requests over 3 servers (one Linux, two Solaris).

What we found is that Courier-authdaemon will make a heck of a lot of connections to LDAP, and never close them down.

No, it doesn't. Each authdaemon will open exactly one LDAP connection, which will remain open as long as authdaemon is running, or until there's 1-2 minutes of inactivity. There may be a temporary second connection, if you use authenticated binds, but there will never be more than two persistent LDAP connections from a single authdaemon process.

I have also tried changing the number of daemons that the authdaemon fires up, but that doesn't seem to make much of a difference. It has ranged from 500 to 55 (I have it at 55 currently) but no matter what, courier tempfails and irritates the users.

55 connections are required for, maybe, Yahoo or AOL. For you, it's overkill, and the default of five authdaemon processes should be sufficient.

Assuming a very generous 100 milliseconds per LDAP lookup, a single authdaemon process will handle ten lookups per second. Five of them will handle fifty a second, three thousand a minute, and 180,000 per hour.

So, unless every one of your 22K users logs on more than about eight times an hour, five connections will be more than enough.
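
The estimate is easy to check with a few lines of Python. This is only a back-of-the-envelope sketch: the 100 ms lookup time, the five processes and the 22K users come from the message, while the variable names are made up for illustration.

```python
# Back-of-the-envelope authdaemon capacity, using the figures quoted in the message.
LOOKUP_SECONDS = 0.1   # assumed (generous) time per LDAP lookup
DAEMONS = 5            # default number of authdaemon processes
USERS = 22_000         # approximate user count from the original report

per_daemon_per_second = round(1 / LOOKUP_SECONDS)
per_second = per_daemon_per_second * DAEMONS
per_minute = per_second * 60
per_hour = per_minute * 60
logins_per_user_per_hour = per_hour / USERS

print(per_second, per_minute, per_hour)       # 50 3000 180000
print(round(logins_per_user_per_hour, 1))     # 8.2
```

Even if a real lookup took several times longer than 100 ms, five processes would still leave room for every user to log on more than once an hour.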

I'm at the end of my rope, and I can't figure out what to do next. Any help would be appreciated.

More than likely your load balancer is broken, and must be fixed. My guess is that it assumes that LDAP connections are short-lived connections. And they are, with most simple-minded LDAP clients, which bind, query, and disconnect. Authdaemon is more efficient than that. It opens a connection and holds it open, avoiding the utterly useless waste of time of connecting to and disconnecting from the LDAP server for each authentication.
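
To make the difference between the two client styles concrete, here is a small sketch that only counts connection setups. No real LDAP library is involved. The function names are hypothetical, and the 120-second idle timeout is an assumption standing in for the "1-2 minutes of inactivity" mentioned earlier.

```python
IDLE_TIMEOUT = 120  # seconds; assumed value for "1-2 minutes of inactivity"

def connects_per_lookup(lookup_times):
    """Simple-minded client: bind, query, disconnect for every lookup."""
    return len(lookup_times)

def connects_persistent(lookup_times, idle_timeout=IDLE_TIMEOUT):
    """Persistent client: reconnect only after the connection sat idle too long."""
    connects = 0
    last_used = None
    for t in lookup_times:
        if last_used is None or t - last_used > idle_timeout:
            connects += 1
        last_used = t
    return connects

# One lookup every 10 seconds for an hour, then one more after a 10-minute gap.
times = list(range(0, 3600, 10)) + [3590 + 600]
print(connects_per_lookup(times))   # 361
print(connects_persistent(times))   # 2
```

The persistent style sets up a connection only at startup and after a long quiet spell, which is why a middlebox that expects one connection per query would be a poor fit for it.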

Each authdaemon is probably connecting to your LDAP server, through your load balancer, with the first authentication attempt. After the response is received, the load balancer assumes that the connection is no longer needed, and drops it from its memory, but as far as the authdaemon and the LDAP server are concerned, it's still a valid connection.

With the next authentication attempt, authdaemon gets a broken socket indication, since the load balancer has dropped the first connection from memory, and automatically reconnects to the LDAP server, which still thinks the first connection exists.

It won't take long before everything comes crashing down.
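
The suspected failure mode can be modelled in a few lines of Python. This is a toy simulation of the guess described above, not a reproduction of the Foundry balancer or of authdaemon's code: all class names are hypothetical, and the assumption that the balancer forgets a connection after a single response is exactly the guess being illustrated.

```python
class LdapServer:
    """Toy LDAP server that only tracks the connections it believes are open."""
    def __init__(self):
        self.open_connections = set()

    def accept(self, conn_id):
        self.open_connections.add(conn_id)


class ForgetfulBalancer:
    """Toy load balancer that forgets a connection right after one response."""
    def __init__(self, server):
        self.server = server
        self.known = set()
        self.next_id = 0

    def connect(self):
        conn_id = self.next_id
        self.next_id += 1
        self.known.add(conn_id)
        self.server.accept(conn_id)
        return conn_id

    def query(self, conn_id):
        if conn_id not in self.known:
            raise ConnectionResetError("balancer no longer tracks this connection")
        # Response delivered; the balancer treats the exchange as finished.
        self.known.discard(conn_id)
        return "ok"


class AuthDaemon:
    """Toy client that holds one connection open and reconnects on a broken socket."""
    def __init__(self, balancer):
        self.balancer = balancer
        self.conn_id = None

    def authenticate(self):
        if self.conn_id is None:
            self.conn_id = self.balancer.connect()
        try:
            return self.balancer.query(self.conn_id)
        except ConnectionResetError:
            # Broken socket: reconnect. The server never saw the old connection close.
            self.conn_id = self.balancer.connect()
            return self.balancer.query(self.conn_id)


server = LdapServer()
balancer = ForgetfulBalancer(server)
daemons = [AuthDaemon(balancer) for _ in range(5)]
for _ in range(100):
    for daemon in daemons:
        daemon.authenticate()

print(len(server.open_connections))  # 500
```

In this model five daemons leave 500 connections open on the server after only 100 authentications each, although each daemon believes it holds exactly one. A real server would eventually time out some of the stale connections, which the sketch does not model.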