By 'network scaling', do you mean the aggregate throughput (bandwidth, packets/sec) of the entire cloud (or part of it)? I think picking 'netperf' as a micro-benchmark is just the first step; there's more work that needs to be done.

Indeed. A great deal more.

For OpenStack networking, there's 'inter-cloud' and 'cloud-to-external-world' throughput.  If we care about performance for the end user, then reasonable numbers (for network scaling) should be captured inside VM instances.  For example, spawn 1,000 VM instances across the cloud, then pair them up to run 'netperf' tests in order to measure 'inter-cloud' network throughput.
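
A rough sketch of the pairing step described above, in Python. Everything here is an assumption for illustration: the address range is made up, the helper names are hypothetical, and it presumes netserver is already running on every instance. It only builds the pairs and the netperf command lines; actually running them inside the instances (ssh, a config-management tool, etc.) is left out.

```python
import random

def pair_instances(addresses, seed=0):
    """Shuffle instance addresses and split them into (client, server) pairs."""
    shuffled = list(addresses)
    random.Random(seed).shuffle(shuffled)
    # With an odd count, the last instance is left unpaired.
    half = len(shuffled) // 2
    return list(zip(shuffled[:half], shuffled[half:2 * half]))

def netperf_command(server, seconds=30):
    """Build a netperf TCP_STREAM invocation to run on the client instance."""
    return ["netperf", "-H", server, "-t", "TCP_STREAM", "-l", str(seconds)]

def aggregate_mbps(per_pair_mbps):
    """Sum per-pair throughput to get one cloud-wide figure."""
    return sum(per_pair_mbps)

# Hypothetical fixed-IP range standing in for 1,000 instances.
addresses = [f"10.0.{i // 250}.{i % 250 + 1}" for i in range(1000)]
pairs = pair_instances(addresses)
commands = [(client, netperf_command(server)) for client, server in pairs]
```

Shuffling before pairing is a deliberate choice: it spreads pairs across hosts so that most streams cross the physical network rather than staying on one hypervisor. Each client's reported throughput would then be collected and passed to aggregate_mbps.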

That would certainly be an interesting test, yes.

We did a bunch of similar tests to determine the overhead caused by KVM and the limitations of the nova network architecture. We found that the VMs themselves were able to consistently saturate the network link available to the host system, whether it was 1GE or 10GE, with relatively modern node and network hardware. With the default VLANManager network setup, there isn't much you can do to scale your outbound connectivity beyond what a single node's hardware can reasonably drive, but using multi-host nova-network, we were able to run a bunch of nodes in parallel, scaling our outbound bandwidth linearly. We managed to get 10 nodes, with a single VM per node, each running 4 TCP streams, up to 99 gigabits per second on a dedicated cross-country link. That took a fair amount of tuning, but nothing particularly outlandish compared with the tuning needed to do this on bare metal. We've been meaning to do a full writeup, but haven't had time yet. -nld
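
For a sense of scale, the figures above can be broken down per node and per stream. This is only back-of-the-envelope arithmetic, assuming the 99 gigabits is an aggregate rate in Gbit/s and that the load was spread evenly across nodes and streams (the message does not say either way).

```python
nodes = 10
streams_per_node = 4
aggregate_gbps = 99.0  # assumption: aggregate rate in Gbit/s

# Assumption: load is spread evenly across nodes and streams.
per_node_gbps = aggregate_gbps / nodes
per_stream_gbps = per_node_gbps / streams_per_node

print(f"per node:   {per_node_gbps:.2f} Gbit/s")
print(f"per stream: {per_stream_gbps:.3f} Gbit/s")
```

Under those assumptions each node carries about 9.9 Gbit/s, i.e. close to line rate on a 10GE link, and each TCP stream roughly 2.5 Gbit/s.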