Peter Ford writes:
Hi,
I'm trying to work out how to get more info into my logs, in order to
track a problem one of our customers is having when sending mail to us.
I have his log output (which is more than I usually get :-) ), and a
fleeting glimpse of it in my log.
His log says:
(snip)
I found the MX for his domain and got the IP address of his mail server,
then grepped that in my maillog, and all I have is lots of instances of
Jul 27 11:00:05 alien courieresmtpd: started,ip=[::ffff:203.59.129.54]
(His server is in Australia, mine in the UK, so this particular log entry
corresponds to the log he sent...)
There's no follow-up to this "started" entry - which I suppose matches the
timeout in his log.
Is there any way to have more information logged about connections? I can't
see any configurations which affect logging levels.
Start a network sniffer and let it run until some data is captured (if the
other end flushes its queues, this should be easily accomplished) and see
what really goes on.
Yes I know that I give this advice very often, maybe two out of three mails,
but it really is the basis to isolate problems. Every sysadmin should be
(basically) proficient with a network sniffer. Every network should have
(carefully controlled) facilities to sniff the network in case of problems.
Luckily, most unixes come with a basic sniffer preinstalled.
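To make that concrete, here is a minimal sketch of how such a capture could be
set up with tcpdump, the sniffer most Unixes ship with. It only builds the
command line; the helper name build_capture_command, the interface name eth0
and the output file name are assumptions for illustration, and the IP address
is the one from the log entry above. Running the resulting command normally
requires root.

```python
import shlex


def build_capture_command(remote_ip, port=25, interface="eth0",
                          outfile="smtp-capture.pcap"):
    """Build a tcpdump argument list for one remote host and one TCP port.

    The interface and output file defaults are assumptions; adjust them
    to match the mail server being debugged.
    """
    # Only keep packets exchanged with the remote mail server on the SMTP port.
    capture_filter = f"host {remote_ip} and tcp port {port}"
    return [
        "tcpdump",
        "-i", interface,  # interface to listen on (assumed name)
        "-n",             # do not resolve addresses to host names
        "-s", "0",        # capture whole packets, not just the headers
        "-w", outfile,    # write raw packets to a file for later inspection
        capture_filter,
    ]


if __name__ == "__main__":
    # Print the command so it can be reviewed before it is run as root.
    print(shlex.join(build_capture_command("203.59.129.54")))
```

The saved file can be read back later with tcpdump -r, or opened in a
graphical analyzer, to see how far the SMTP conversation actually gets.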
I really want to be able to demonstrate that this problem is out of my
control - I suspect a misconfiguration on the remote end, since most
people manage to send us mail perfectly well.
The nice thing about network sniffing is that it operates (more or less) at
the boundary between the systems. More often than not, you can say what the
problem is and more importantly who's causing the problem.
[ As an example, we just debugged intermittent failures from a messaging
system. Completely contrary to our expectations, we found that our end did
strange things at the network level. It turned out to be a bug in .NET. Only
seeing what goes over the wire could reveal this. ]
HTH,
M4