atom feed30 messages in org.python.python-dev[Python-Dev] String hash function mul...
FromSent OnAttachments
Raymond HettingerApr 13, 2004 8:00 pm 
Jeff EplerApr 13, 2004 9:10 pm 
Bob IppolitoApr 13, 2004 9:26 pm 
Jeff EplerApr 13, 2004 10:04 pm 
Raymond HettingerApr 13, 2004 10:17 pm 
Jeff EplerApr 13, 2004 11:10 pm 
Guido van RossumApr 13, 2004 11:26 pm 
Tim PetersApr 13, 2004 11:56 pm 
Jeff EplerApr 14, 2004 9:08 am 
Raymond HettingerApr 14, 2004 12:06 pm 
Andrew MacIntyreApr 14, 2004 3:23 pm 
Jeff EplerApr 14, 2004 3:35 pm 
Mike PallApr 14, 2004 5:50 pm 
Tim PetersApr 14, 2004 11:14 pm 
Michael HudsonApr 15, 2004 7:05 am 
Mike PallApr 15, 2004 9:36 am 
Guido van RossumApr 15, 2004 10:27 am 
Jeremy HyltonApr 15, 2004 10:38 am 
Guido van RossumApr 15, 2004 10:42 am 
Mike PallApr 15, 2004 11:56 am 
Mike PallApr 15, 2004 11:56 am 
Skip MontanaroApr 15, 2004 11:59 am 
Michael HudsonApr 15, 2004 1:27 pm 
Raymond HettingerApr 15, 2004 2:22 pm 
Thomas HellerApr 15, 2004 2:31 pm 
"Martin v. Löwis"Apr 15, 2004 3:07 pm 
Jeremy HyltonApr 15, 2004 11:26 pm 
Tim PetersApr 16, 2004 12:18 am 
"Martin v. Löwis"Apr 16, 2004 2:00 am 
Andrew MacIntyreApr 16, 2004 9:14 pm 
Subject:[Python-Dev] String hash function multiplier
From:Jeff Epler (jep@unpythonic.net)
Date:Apr 13, 2004 11:10:37 pm
List:org.python.python-dev

On Tue, Apr 13, 2004 at 10:16:16PM -0400, Raymond Hettinger wrote:

Yes, the loop runs 20% faster on my Pentium III. The speedup ought to be much more dramatic on the Pentium IV (where the integer ALU instructions speedup from 1 clock to 0.5 clocks while the latency on integer multiplication slows down from 4 clocks to 14-18 clocks).

I got different results here, also on a Pentium III (but in a laptop) I'm using gcc, I don't know what you're using.

I get the best results from -O3 -mcpu=pentium3 for both MUL values and do worse with CLEVER in most cases.

In my test, PyStringObject is not quite the same as in Python, so this could explain different results. Use of a different compiler, or gcc with the implied default -mcpu=i386 (?), might explain why you benchmarked the shifts as better, too. (-O{2,3} -mcpu=i386)

[* = best alternative for these optimizer flags] -O -mcpu=i386 -DMUL=100003 1.78 -O -mcpu=i386 -DMUL=65599 1.37 * -O -mcpu=i386 -DCLEVER 1.40

-O -mcpu=pentium3 -DMUL=100003 1.27 * -O -mcpu=pentium3 -DMUL=65599 1.27 * -O -mcpu=pentium3 -DCLEVER 1.35

-O2 -mcpu=i386 -DMUL=100003 1.93 -O2 -mcpu=i386 -DMUL=65599 1.54 -O2 -mcpu=i386 -DCLEVER 1.28 *

-O2 -mcpu=pentium3 -DMUL=100003 1.11 * -O2 -mcpu=pentium3 -DMUL=65599 1.12 -O2 -mcpu=pentium3 -DCLEVER 1.29

-O3 -mcpu=i386 -DMUL=100003 1.69 -O3 -mcpu=i386 -DMUL=65599 1.28 -O3 -mcpu=i386 -DCLEVER 1.10 *

-O3 -mcpu=pentium3 -DMUL=100003 0.90 * -O3 -mcpu=pentium3 -DMUL=65599 0.90 * -O3 -mcpu=pentium3 -DCLEVER 1.05

-Os -mcpu=i386 -DMUL=100003 1.16 * -Os -mcpu=i386 -DMUL=65599 1.16 * -Os -mcpu=i386 -DCLEVER 1.45

-Os -mcpu=pentium3 -DMUL=100003 1.05 * -Os -mcpu=pentium3 -DMUL=65599 1.05 * -Os -mcpu=pentium3 -DCLEVER 1.30

# alternatives.py OPT = [ '-O -mcpu=i386', '-O -mcpu=pentium3', '-O2 -mcpu=i386', '-O2 -mcpu=pentium3', '-O3 -mcpu=i386', '-O3 -mcpu=pentium3', '-Os -mcpu=i386', '-Os -mcpu=pentium3', ]

HASH = ['-DMUL=100003', '-DMUL=65599', '-DCLEVER']

import sys, os for o in OPT: for h in HASH: sys.stdout.write("%-20s %-20s " % (o, h)) sys.stdout.flush() os.system("gcc %s %s hashtest.c && TIME=%%U /usr/bin/time ./a.out" % (o, h)) print

// hashtest.c #ifndef MUL #define MUL 100003 #endif

typedef struct { long ob_shash; unsigned long ob_size; unsigned char ob_sval[16]; } PyStringObject;

static long string_hash(PyStringObject *a) { register int len; register unsigned char *p; register long x;

if (a->ob_shash != -1) return a->ob_shash; len = a->ob_size; p = (unsigned char *) a->ob_sval; x = *p << 7; while (--len >= 0) #ifdef CLEVER x = (x << 6) + (x << 16) - x + (long)*p++; #else x = (MUL*x) ^ *p++; #endif x ^= a->ob_size; if (x == -1) x = -2; a->ob_shash = x; return x; }

int main(void) { PyStringObject s = {-1, 13, "hello raymond"}; int i; for(i=0; i < 1<<23; i++) { string_hash(&s); s.ob_shash = -1; } return 0; }