Subject: [Python-Dev] String hash function multiplier
From: Raymond Hettinger (pyt@rcn.com)
Date: Apr 14, 2004 12:06:49 pm
List: org.python.python-dev

[Raymond]

Does anyone have any issues with changing the hash multiplier for the string and Unicode hash functions?

[Tim]

Don't touch it unless you can prove major benefits -- it's a remarkable fact of life that the current multiplier hasn't resulted in any real-life (but non-contrived) pathological cases.

Will leave it alone.

Perhaps you think shifts and adds are faster? I wouldn't -- the imul instruction on modern Pentiums is very fast.

On the P4, the documented imul latency went up from 4 cycles to 14 cycles, while shifts and adds went down to 0.5 cycles and 1 cycle respectively. Timings confirm the result.
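
For readers unfamiliar with the shift-and-add idea, here is a minimal sketch of what such a replacement looks like. The multiplier 65599 (2^16 + 2^6 - 1) is an assumption chosen only because it decomposes neatly; it is not taken from this message, and the function names are hypothetical.

```c
#include <stdint.h>

/* Illustrative multiplier only: 65599 == (1 << 16) + (1 << 6) - 1. */

/* Plain multiply: one imul instruction on x86. */
uint32_t mul_65599_imul(uint32_t x)
{
    return x * 65599u;
}

/* Same product built from two shifts, one add and one subtract.
 * Unsigned arithmetic wraps modulo 2^32, so both forms agree for every x. */
uint32_t mul_65599_shift_add(uint32_t x)
{
    return (x << 16) + (x << 6) - x;
}
```

Whether the second form is actually faster depends on the CPU and on what the compiler emits, so it would need to be timed rather than assumed.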

It looks like the best bet is to try to speed up the code without changing the multiplier. Intel's software optimization cookbook recommends a partial unrolling and elimination of data dependencies so that a second multiply can start 4 cycles after the previous one started. If practice bears out the theory, the timings could show a three- or fourfold speedup without changing the multiplier.
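
A rough sketch of the unrolling idea follows. It assumes the multiplier is 1000003 (the value CPython's string hash used at the time) and, importantly, it uses an additive variant of the loop (x = x*M + c). That variant is chosen because multiplication distributes over addition; the real loop combines each character with XOR, so this rewrite is not a drop-in replacement and would not reproduce the existing hash values. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_MULT 1000003u   /* assumed multiplier, see note above */

/* Baseline: every multiply has to wait for the previous result. */
uint32_t poly_hash_serial(const unsigned char *p, size_t len)
{
    uint32_t x = 0;
    for (size_t i = 0; i < len; i++)
        x = x * HASH_MULT + p[i];
    return x;
}

/* Unrolled by four. Only x * m4 is on the dependency chain; the three
 * character multiplies are independent of x and can overlap with it. */
uint32_t poly_hash_unrolled(const unsigned char *p, size_t len)
{
    const uint32_t m1 = HASH_MULT;
    const uint32_t m2 = m1 * m1;   /* powers wrap modulo 2^32 */
    const uint32_t m3 = m2 * m1;
    const uint32_t m4 = m3 * m1;
    uint32_t x = 0;
    size_t i = 0;

    for (; i + 4 <= len; i += 4) {
        x = x * m4
          + (uint32_t)p[i]     * m3
          + (uint32_t)p[i + 1] * m2
          + (uint32_t)p[i + 2] * m1
          + (uint32_t)p[i + 3];
    }
    for (; i < len; i++)           /* leftover 0..3 characters */
        x = x * m1 + p[i];
    return x;
}
```

Both functions return the same value for any input, since the unrolled expression is just the serial recurrence expanded four steps. Any real speedup would have to be confirmed by timing.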

(read Knuth).

Of course, I already have :-)

The right thing to compare Python's string hash to is "the standard" Fowler-Noll-Vo string hash

Ditto.
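
For reference, a minimal sketch of the 32-bit Fowler-Noll-Vo hash mentioned above, in both its FNV-1 and FNV-1a forms. The offset basis (2166136261) and prime (16777619) are the published 32-bit FNV parameters; the function names are my own.

```c
#include <stddef.h>
#include <stdint.h>

#define FNV32_OFFSET_BASIS 2166136261u
#define FNV32_PRIME        16777619u

/* FNV-1: multiply by the prime, then XOR in the next byte. */
uint32_t fnv1_32(const unsigned char *p, size_t len)
{
    uint32_t h = FNV32_OFFSET_BASIS;
    for (size_t i = 0; i < len; i++) {
        h *= FNV32_PRIME;
        h ^= p[i];
    }
    return h;
}

/* FNV-1a: same constants, but XOR first and multiply second. */
uint32_t fnv1a_32(const unsigned char *p, size_t len)
{
    uint32_t h = FNV32_OFFSET_BASIS;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= FNV32_PRIME;
    }
    return h;
}
```

The FNV-1 loop has the same multiply-then-XOR shape as a per-character multiplier hash, which is what makes it a natural point of comparison.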

Raymond
