atom feed30 messages in org.python.python-dev[Python-Dev] String hash function mul...
FromSent OnAttachments
Raymond HettingerApr 13, 2004 8:00 pm 
Jeff EplerApr 13, 2004 9:10 pm 
Bob IppolitoApr 13, 2004 9:26 pm 
Jeff EplerApr 13, 2004 10:04 pm 
Raymond HettingerApr 13, 2004 10:17 pm 
Jeff EplerApr 13, 2004 11:10 pm 
Guido van RossumApr 13, 2004 11:26 pm 
Tim PetersApr 13, 2004 11:56 pm 
Jeff EplerApr 14, 2004 9:08 am 
Raymond HettingerApr 14, 2004 12:06 pm 
Andrew MacIntyreApr 14, 2004 3:23 pm 
Jeff EplerApr 14, 2004 3:35 pm 
Mike PallApr 14, 2004 5:50 pm 
Tim PetersApr 14, 2004 11:14 pm 
Michael HudsonApr 15, 2004 7:05 am 
Mike PallApr 15, 2004 9:36 am 
Guido van RossumApr 15, 2004 10:27 am 
Jeremy HyltonApr 15, 2004 10:38 am 
Guido van RossumApr 15, 2004 10:42 am 
Mike PallApr 15, 2004 11:56 am 
Mike PallApr 15, 2004 11:56 am 
Skip MontanaroApr 15, 2004 11:59 am 
Michael HudsonApr 15, 2004 1:27 pm 
Raymond HettingerApr 15, 2004 2:22 pm 
Thomas HellerApr 15, 2004 2:31 pm 
"Martin v. Löwis"Apr 15, 2004 3:07 pm 
Jeremy HyltonApr 15, 2004 11:26 pm 
Tim PetersApr 16, 2004 12:18 am 
"Martin v. Löwis"Apr 16, 2004 2:00 am 
Andrew MacIntyreApr 16, 2004 9:14 pm 
Subject:[Python-Dev] String hash function multiplier
From:Jeff Epler (jep@unpythonic.net)
Date:Apr 14, 2004 9:08:41 am
List:org.python.python-dev

I benchmarked this on a Pentium 4 and Raymond's hash does indeed come out ahead, regardless of compiler flags.

Pentium IV, 2.4GHz: -O2 -mcpu=i386 -DMUL=100003 1.56 -O2 -mcpu=i386 -DMUL=65599 0.78 -O2 -mcpu=i386 -DCLEVER 0.54 *

-O2 -mcpu=pentium3 -DMUL=100003 0.78 -O2 -mcpu=pentium3 -DMUL=65599 0.79 -O2 -mcpu=pentium3 -DCLEVER 0.53 *

-O2 -mcpu=pentium4 -DMUL=100003 0.63 -O2 -mcpu=pentium4 -DMUL=65599 0.65 -O2 -mcpu=pentium4 -DCLEVER 0.50 *

With AMD CPUs, the current multiplier beats both the new multipler and the version expressed as shifts and adds/subtracts:

On a Duron, 1GHz: -O2 -mcpu=i386 -DMUL=100003 2.03 -O2 -mcpu=i386 -DMUL=65599 1.04 -O2 -mcpu=i386 -DCLEVER 0.95 *

-O2 -mcpu=athlon -DMUL=100003 0.92 * -O2 -mcpu=athlon -DMUL=65599 1.03 -O2 -mcpu=athlon -DCLEVER 0.94

On an Athlon XP 2600+: -O2 -mcpu=i386 -DMUL=100003 0.95 -O2 -mcpu=i386 -DMUL=65599 0.49 -O2 -mcpu=i386 -DCLEVER 0.44 *

-O2 -mcpu=athlon-xp -DMUL=100003 0.43 * -O2 -mcpu=athlon-xp -DMUL=65599 0.48 -O2 -mcpu=athlon-xp -DCLEVER 0.45

If we want a fast hash function, then we should write one that works a "long" at a time. This probably can't be done if hashes must be equal on different machines, but aren't hashes already different on LP64 machines (because they're 64 bits instead of 32)? (I guess the low-order bits would be identical)

Long-at-a-time hash, Duron, 1GHz: -O2 -march=athlon-tbird -DMUL=100003 0.35 -O2 -march=athlon-tbird -DMUL=65599 0.41 -O2 -march=athlon-tbird -DCLEVER 0.42

With this code, it would be necessary to allocate strings "rounded up" to 4 bytes, and zero the unused bytes.

Or we could do as Tim and Guido suggest: nothing.

Jeff

/* long-at-a-time hashing */ typedef struct { long ob_shash; unsigned long ob_size; union { unsigned char ob_sval[16]; long ob_svall[4]; }; } PyStringObject;

static long string_hash(PyStringObject *a) { register int len; register long *p; register long x;

if (a->ob_shash != -1) return a->ob_shash; len = (a->ob_size+3) / 4; p = a->ob_svall; x = *p << 7; while (--len >= 0) #ifdef CLEVER x = (x << 6) + (x << 16) - x + *p++; #else x = (MUL*x) ^ *p++; #endif x ^= a->ob_size; if (x == -1) x = -2; a->ob_shash = x; return x; }