Subject: [maildropl] Re: OpenBSD 3.2 breaks Courier, Qmail.
From: Sam Varshavchik (mrs@courier-mta.com)
Date: Jan 15, 2003 4:47:43 pm
List: net.sourceforge.lists.courier-maildrop

Matthias Andree writes:

In case you've missed it, I'll repeat my suggestion: I suggested keeping the tmp/timestamp.pid.hostname files for a finite amount of time after linking them to new/ (so that for 20 seconds or 2 seconds or what timeout you choose, both file names exist) -- the tmp/ is then a kind of

Who's gonna remove them?

You want qmail-local, or whatever delivers the message, to sleep around, with its thumb up its ass for 2-20 seconds, before unlinking the tmp file, and exiting?

Some 16-processor machine with huge caches might have a real problem sooner than we want...

It doesn't matter how many CPUs the machine has. If microseconds are added to the filename, you have to turn things around in less than a microsecond for the race to remain (I might've messed up my units a few times in the last couple of days, but I'm always referring to tv_usec, one millionth of a second). It takes a finite number of CPU instructions to clean up an existing process, then set up another one, and start running it. That number of instructions does not change no matter how many CPUs are in there. They have to be executed in turn. At different times, different CPUs might be running the code (which adds the overhead of context switches to the whole mix), but there's a definite starting point and a definite ending point here, and the whole show must occur in less than a microsecond in order for the race to still exist.

I've run the RAM bandwidth numbers before. It's out of the second. You say something about large CPU caches. Still, the CPU has to run the instructions. Assume everything is in the L1 cache, with zero latency. Assume the CPU requires 5 cycles for an average instruction. My abacus says that a 1 GHz CPU (1000 cycles per microsecond) will execute 200 instructions per microsecond.
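For anyone who wants to redo the abacus work with other figures, here is a minimal sketch of the same arithmetic. The function name is made up for illustration, and the 5 cycles per instruction is the assumption from the paragraph above, not a measurement.

```c
#include <stdint.h>

/* Instructions that fit into one microsecond, given a clock rate in Hz and
 * an assumed average cycles-per-instruction figure. Both are inputs, not
 * measurements. Integer division rounds down; returns 0 for a zero CPI. */
uint64_t instructions_per_usec(uint64_t clock_hz, uint64_t cycles_per_insn)
{
    uint64_t cycles_per_usec = clock_hz / 1000000u;  /* 1 GHz -> 1000 cycles */

    if (cycles_per_insn == 0)
        return 0;
    return cycles_per_usec / cycles_per_insn;
}
```

With clock_hz = 1000000000 and cycles_per_insn = 5 this gives 200, matching the figure above.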

Let's measure CPU latency. Let's try to have everything cached:

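#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval tv_array[100];
    size_t i, n = sizeof(tv_array)/sizeof(tv_array[0]);

    /* Sample the clock back to back; print later so printf doesn't skew the timing. */
    for (i=0; i<n; i++) gettimeofday(&tv_array[i], NULL);

    /* tv_usec is not an int, so cast it for printf. */
    for (i=0; i<n; i++) printf("%06ld\n", (long)tv_array[i].tv_usec);
    return 0;
}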

The output, on a box with a pair of 1.13 GHz CPUs and no other load, shows that the first loop executes at most three times per microsecond.

Divide that into the ratio of the complexity between this loop, and the code to tear down one process, and build another process, and draw your own conclusions.
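If you'd rather not eyeball the printed column, the "times per microsecond" figure can be computed from the sampled array. This is only a sketch: the helper name is hypothetical, and it simply reports the longest run of consecutive samples that carry the same timestamp.

```c
#include <stddef.h>
#include <sys/time.h>

/* Longest run of consecutive samples with the same (tv_sec, tv_usec) pair,
 * i.e. the most gettimeofday() calls that completed within one microsecond
 * as seen by the clock. Returns 0 for an empty array. */
size_t max_calls_per_usec(const struct timeval *tv, size_t n)
{
    size_t best = 0, run = 0, i;

    for (i = 0; i < n; i++) {
        if (i > 0 && tv[i].tv_sec == tv[i-1].tv_sec &&
                     tv[i].tv_usec == tv[i-1].tv_usec)
            run++;          /* same microsecond as the previous sample */
        else
            run = 1;        /* a new microsecond starts a new run */
        if (run > best)
            best = run;
    }
    return best;
}
```

Call it on tv_array after the first loop, for example max_calls_per_usec(tv_array, 100). The result depends on the resolution of the system clock as well as on CPU speed, so treat it as an upper bound on what the clock can distinguish.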

I actually had another brainstorm today. Add the file's inode, when moving the message from tmp to new:

tmp/timestamp.unique_pid.hostname -> new/timestamp.inode.unique_pid.hostname

How does that grab you? It does add an extra stat() call, to obtain the message's inode, after the file is created in tmp.
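Roughly, the move from tmp to new could look like the sketch below. This is not code from Courier, maildrop, or qmail. The function names, the fixed buffer size, and passing "unique_pid" as an opaque string are all assumptions made for illustration, and paths are taken relative to the maildir.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Format "new/<timestamp>.<inode>.<unique>.<hostname>" into buf.
 * Returns 0 on success, -1 if the name does not fit. */
int format_new_name(char *buf, size_t len, long timestamp,
                    unsigned long long inode,
                    const char *unique, const char *hostname)
{
    int n = snprintf(buf, len, "new/%ld.%llu.%s.%s",
                     timestamp, inode, unique, hostname);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical delivery step: look up the inode of the finished tmp file,
 * link it into new/ under the inode-bearing name, then drop the tmp name.
 * Returns 0 on success, -1 on any failure. */
int move_tmp_to_new(const char *tmp_path, long timestamp,
                    const char *unique, const char *hostname)
{
    struct stat st;
    char new_path[1024];

    if (stat(tmp_path, &st) < 0)            /* the extra stat() call */
        return -1;
    if (format_new_name(new_path, sizeof(new_path), timestamp,
                        (unsigned long long)st.st_ino,
                        unique, hostname) < 0)
        return -1;
    if (link(tmp_path, new_path) < 0)       /* fails if the name exists */
        return -1;
    return unlink(tmp_path);
}
```

The point of the inode is that two files that exist at the same time on one filesystem cannot share an inode number, so the name in new/ stays distinct even if the timestamp and pid collide.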