From | Sent On | Attachments |
---|---|---|
Raymond Hettinger | Apr 13, 2004 8:00 pm | |
Jeff Epler | Apr 13, 2004 9:10 pm | |
Bob Ippolito | Apr 13, 2004 9:26 pm | |
Jeff Epler | Apr 13, 2004 10:04 pm | |
Raymond Hettinger | Apr 13, 2004 10:17 pm | |
Jeff Epler | Apr 13, 2004 11:10 pm | |
Guido van Rossum | Apr 13, 2004 11:26 pm | |
Tim Peters | Apr 13, 2004 11:56 pm | |
Jeff Epler | Apr 14, 2004 9:08 am | |
Raymond Hettinger | Apr 14, 2004 12:06 pm | |
Andrew MacIntyre | Apr 14, 2004 3:23 pm | |
Jeff Epler | Apr 14, 2004 3:35 pm | |
Mike Pall | Apr 14, 2004 5:50 pm | |
Tim Peters | Apr 14, 2004 11:14 pm | |
Michael Hudson | Apr 15, 2004 7:05 am | |
Mike Pall | Apr 15, 2004 9:36 am | |
Guido van Rossum | Apr 15, 2004 10:27 am | |
Jeremy Hylton | Apr 15, 2004 10:38 am | |
Guido van Rossum | Apr 15, 2004 10:42 am | |
Mike Pall | Apr 15, 2004 11:56 am | |
Mike Pall | Apr 15, 2004 11:56 am | |
Skip Montanaro | Apr 15, 2004 11:59 am | |
Michael Hudson | Apr 15, 2004 1:27 pm | |
Raymond Hettinger | Apr 15, 2004 2:22 pm | |
Thomas Heller | Apr 15, 2004 2:31 pm | |
"Martin v. Löwis" | Apr 15, 2004 3:07 pm | |
Jeremy Hylton | Apr 15, 2004 11:26 pm | |
Tim Peters | Apr 16, 2004 12:18 am | |
"Martin v. Löwis" | Apr 16, 2004 2:00 am | |
Andrew MacIntyre | Apr 16, 2004 9:14 pm |
Subject: | [Python-Dev] String hash function multiplier | |
---|---|---|
From: | Bob Ippolito (bo...@redivi.com) | |
Date: | Apr 13, 2004 9:26:16 pm | |
List: | org.python.python-dev |
On Apr 13, 2004, at 9:09 PM, Jeff Epler wrote:
> With -O2 -mcpu=i686 or newer, gcc uses "imul" for both 100003 and 65599, rather than shifts and adds.
> There may be a few people who care about some other processor, but I wouldn't listen to them. (the only non-x86 CPU I program for on a weekly basis doesn't have hardware multiply, but it's also much too small for Python)
> The current value goes back a long way: http://cvs.sourceforge.net/viewcvs.py/python/python/dist/src/Objects/stringobject.c#rev2.31 ... all the way back to when Python did string haching instead of hashing.
> Other than some abstract beauty to 65599, are there some other practical advantages I'm missing?
GCC isn't expected to turn a multiplication by an integer constant into shifts on its own. Anyways, a practical advantage is that with a sane instruction set, like PPC, spelling out the shifts saves you a memory access or some instructions (depending on the compiler, I guess). Both 100003 and 65599 are too big to be immediate values in a PPC instruction, but the shift constants are not.
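As a rough illustration of the shift-and-add form being discussed, here is a minimal Python sketch. It assumes 32-bit unsigned wraparound and a 16-bit signed immediate field (as in PPC's `mulli`/`addi`); the names `mul_65599`, `shift_add_65599`, and `fits_simm16` are made up for this example and are not taken from stringobject.c.

```python
MASK32 = 0xFFFFFFFF  # assumption: emulate 32-bit unsigned wraparound


def mul_65599(x):
    """Multiply by 65599 directly, truncated to 32 bits."""
    return (x * 65599) & MASK32


def shift_add_65599(x):
    """Same product using shifts and adds: 65599 == (1 << 16) + (1 << 6) - 1."""
    return ((x << 16) + (x << 6) - x) & MASK32


def fits_simm16(n):
    """True if n fits a 16-bit signed immediate field (assumed PPC-style)."""
    return -0x8000 <= n <= 0x7FFF


if __name__ == "__main__":
    for x in (0, 1, 12345, 0x80000000, 0xFFFFFFFF):
        assert mul_65599(x) == shift_add_65599(x)
    # The multipliers do not fit an immediate field, the shift counts do.
    print([fits_simm16(n) for n in (100003, 65599, 16, 6)])
```

The two functions agree modulo 2**32, which is why a compiler (or a hand-written loop) can swap one for the other. Whether that actually wins anything on a given CPU is exactly the open question in this thread.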
I guess the real question for Raymond is, does it really make a measurable difference? And what effect does it have on pickled dicts (or other such hash-using data structures), if any?
-bob