Subject: Re: Pentium optimizations
From: Doug Rabson (df...@nlsystems.com)
Date: Dec 18, 1997 12:35:10 pm
On Wed, 17 Dec 1997, Russell L. Carter wrote:
}Alex said:
}>
}> The response(s) I got to my "I'm a newbie, anyone know about this problem"
}> was basically met with "well no FreeBSD developers have contacted us, and
}> if they did we'd accept/commit/whatever some changes..".
}>
}I expected that they would be cooperative (the EGCS group appears to be
}culturally similar to us (modulo-GPL).) John Polstra is really our
}most active ELF/Compiler person, and so he would likely be a better
}"official FreeBSD" interface. He is also less politically likely to
}insert his foot into his eating orifice. I do have some PPro mods,
}and they appear to help performance on average. The PPro is a
}really weird creature (like the K6.) The darned processor does so
}much optimization, it appears to be insensitive to code mods. There are
                                     ^^^^^^^^^^^^^^^^^^^^^^^^
Noticed that too, eh? Recently I hacked up some of the SSLeay asm code
and while I could improve P5 performance about 30%, the best that I
could do, with a lot of effort, was maybe 4% for PII and PPro. The
out-of-order execution seems to help a lot. Oh, and the P5 specific
asm actually makes the PPro slow down over the C source; not good
tidings for ye merry old tuners.
}areas of reasonable payoffs, and lots of "obvious" optimizations that }end up being neutral.
Yep. I wouldn't worry too much about other people's claims about code optimized for Pentium Pro.
There are some odd things with PPro memory accesses. If you do a write to a location which isn't cached, the write is queued in a write buffer (assuming there is one free, otherwise you stall). If you then try to read any memory location, you stall till the write is completed. This is something to do with enforcing read/write ordering in SMP systems.
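A rough way to look for this effect on a particular machine is to time a loop that stores to a cache line it has probably not touched before and then immediately loads from an unrelated, already cached location, and to compare it with the same loop without the store. The sketch below is only an illustration under several assumptions: the names `walk` and `time_walk` are made up for this example, the 64-byte stride is a guess at the cache-line size (the PPro uses 32-byte lines), and `clock()` is a coarse timer. Other processors, especially later ones, may show no difference at all.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

enum { LINE = 64 };  /* assumed stride between touched bytes, in bytes */

/* Step through buf one LINE at a time. With do_store set, each step writes
   to a line that is probably not in the cache; every step then reads a
   small "hot" variable that should stay cached. Returns the step count. */
static uint64_t walk(volatile unsigned char *buf, size_t len, int do_store)
{
    volatile uint64_t hot = 1;   /* volatile so the load is not optimized out */
    uint64_t sum = 0;

    for (size_t i = 0; i < len; i += LINE) {
        if (do_store)
            buf[i] = (unsigned char)i;  /* store that likely misses the cache */
        sum += hot;                     /* unrelated load right behind it */
    }
    return sum;
}

/* CPU time in seconds for one pass over the buffer. */
static double time_walk(volatile unsigned char *buf, size_t len, int do_store)
{
    clock_t start = clock();
    uint64_t steps = walk(buf, len, do_store);
    clock_t stop = clock();

    (void)steps;  /* only the elapsed time matters here */
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

To try it, allocate a buffer much larger than the cache, call `time_walk(buf, len, 1)` and `time_walk(buf, len, 0)` a few times each, and compare the numbers. A large gap is consistent with loads waiting behind the queued writes, but it does not prove that cause, since the stores cost memory bandwidth by themselves.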
--
Doug Rabson                 Mail:  df...@nlsystems.com
Nonlinear Systems Ltd.      Phone: +44 181 951 1891
                            Fax:   +44 181 381 1039