Subject: Re: Pentium optimizations
From: Doug Rabson (df@nlsystems.com)
Date: Dec 18, 1997 12:35:10 pm
List: org.freebsd.freebsd-current

On Wed, 17 Dec 1997, Russell L. Carter wrote:

}Alex said:
}>
}> The response(s) I got to my "I'm a newbie, anyone know about this problem"
}> was basically met with "well no FreeBSD developers have contacted us, and
}> if they did we'd accept/commit/whatever some changes..".
}>
}I expected that they would be cooperative (the EGCS group appears to be
}culturally similar to us (modulo-GPL).) John Polstra is really our
}most active ELF/Compiler person, and so he would likely be a better
}"official FreeBSD" interface. He is also less politically likely to
}insert his foot into his eating orifice. I do have some PPro mods,
}and they appear to help performance on average. The PPro is a
}really weird creature (like the K6.) The darned processor does so
}much optimization, it appears to be insensitive to code mods. There are
                                     ^^^^^^^^^^^^^^^^^^^^^^^^
Noticed that too, eh? Recently I hacked up some of the SSLeay asm
code and while I could improve P5 performance about 30%, the best
that I could do, with a lot of effort, was maybe 4% for PII and PPro.
The out-of-order execution seems to help a lot. Oh, and the P5-specific
asm actually makes the PPro slow down over the C source; not good
tidings for ye merry old tuners.

}areas of reasonable payoffs, and lots of "obvious" optimizations that
}end up being neutral.

Yep. I wouldn't worry too much about other people's claims about code optimized for Pentium Pro.

There are some odd things with PPro memory accesses. If you do a write to a location which isn't cached, the write is queued in a write buffer (assuming there is one free, otherwise you stall). If you then try to read any memory location, you stall till the write is completed. This is something to do with enforcing read/write ordering in SMP systems.