I have a bunch of ideas to speed up spin and mutex locks somewhat. For
this I need benchmarks to test different modifications.
While the micro-benchmark from rwatson@ is a good way to quickly test
modifications and weed out unlikely candidates, jhb@'s tests have shown
that micro- and macro-benchmarks do not always show the same result.
Running benchmarks and booting takes a lot of time. Since this is NOT
one of my favorite tasks I want to run generally accepted benchmarks so I
can test (boot) each modification exactly once for each test machine.
If you think I should run certain benchmarks with certain parameters
please tell me BEFORE I start testing!
I like to use netblast from src/tools/tools/netrate/netblast. It attempts
to send packets as quickly as possible on a network interface, which is a
CPU-intensive operation that is very sensitive to the cost of
synchronization. On an SMP system, it also generates a moderate ithread
load as the gig-e interface transmits, and that ithread will often contend
on the network interface driver lock with the running netblast thread. As
such, changes that affect the cost and handling of contention are also
visible in this benchmark. With the synchronization micro-benchmark, I
see spin locks on SMP being faster with the atomic release removed, but in
the netblast test, I see those spinlocks as slower on SMP, since they
behave less well under contention.
(The above was with 64-bit if_em cards on a dual Xeon.) Note that you'll
want to make sure netreceive is running on a second box, or that you're
sending to the broadcast address, or the ICMP errors will substantially
quench your send ability due to the asynchronous report of the port
being closed.
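
For anyone who has not looked at the tool, the shape of such a test can be
sketched roughly as below. This is a minimal illustration in Python, not the
netblast source: the names blast() and run_demo() are made up for the
example, the loopback receiver is only a stand-in for netreceive on a second
box, and the duration and payload size are arbitrary. It assumes a connected
UDP socket, on which asynchronously reported errors show up as failed sends.

```python
import socket
import time


def blast(sock, payload, duration):
    """Send payload on a connected UDP socket as fast as possible.

    Returns (sent, errors): sends the kernel accepted, and sends that failed.
    """
    sent = 0
    errors = 0
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        try:
            sock.send(payload)
            sent += 1
        except OSError:
            # A full send queue or an asynchronously reported error from an
            # earlier packet (e.g. connection refused) both land here.
            errors += 1
    return sent, errors


def run_demo(duration=0.2, payload_size=64):
    # Hypothetical stand-in for netreceive: a bound UDP socket on loopback,
    # so the destination port is open while the sender runs.
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.connect(receiver.getsockname())
    try:
        return blast(sender, b"\0" * payload_size, duration)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    sent, errors = run_demo()
    print(f"sent={sent} errors={errors}")
```

Comparing the sent count for a fixed duration across kernels is the idea; a
high error count usually means the receiver side is not set up as described
above. Loopback does not exercise the interface driver lock or the ithread,
so a real run still needs two machines.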
Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
rob...@fledge.watson.org Principal Research Scientist, McAfee Research