4 messages in com.mysql.lists.clusterRe: mysqld / api node performance drop
FromSent OnAttachments
Mike Hedlund22 Nov 2004 22:03 
Olivier Kaloudoff22 Nov 2004 23:57 
Mikael Ronström23 Nov 2004 02:10 
Joseph E. Sacco, Ph.D.23 Nov 2004 05:31 
Subject:Re: mysqld / api node performance drop
From:Joseph E. Sacco, Ph.D. (jsa@earthlink.net)
Date:11/23/2004 05:31:25 AM
List:com.mysql.lists.cluster

Mikael,

These sort of explanations are invaluable to folks thinking about using an NDB cluster. Is documentation on "how to tune an NDB cluster" being written?

-Joseph

=======================================================================

On Tue, 2004-11-23 at 05:10, Mikael Ronström wrote:

Hi Mike, This email tries to explain the reasons behind what you are seeing. It doesn't provide the recipy for your particular environment. It also certainly doesn't explain all effects.

Performance of distributed systems is an interesting area with lots of surprises. Understanding performance can be understood to some extent by thinking about the formula for cost of sending messages. In response to an email I am not able to fully explain the matter but I'll provide some basic things that can explain at least partly what's going on in the system.

Cost of TCP/IP message = Fixed Cost + Byte Cost * No of bytes

Last time I measured these parameters was on a UltraSparc machine with a 500 MHz processor. For that computer the formula was

Message cost = 30 microseconds + 100 nanoseconds * No of bytes

Thus sending a 60 byte message had the cost 30 microseconds + 0.1 * 60 = 36 microseconds

The way MySQL Cluster works is by trying to collect a few messages together before calling send, thus making it possible to share the fixed costs of several messages in one TCP/IP message. Actually it does so in two levels since the cost of communication is the key to performance in a distributed system.

Assume now the following simplified model (not exactly the way MySQL Cluster works but better from a teaching perspective): We are sending one TCP/IP message on each link every 1 millisecond. Thus if 10 messages of 100 bytes have been gathered during this millisecond we will get the total cost per TCP/IP message to be (using the UltraSparc model) 30 microseconds + 0.1 * (10 * 100) = 130 microseconds

Cost per message = 130 / 10 = 13 microseconds

Now assume we double the number of links in the system and everything else is constant. Then only 10 messages will be sent in one millisecond per link and the cost will be:

30 + 0.1 * (5 * 100) = 80 microseconds Cost per messsage = 80 / 5 = 16 microseconds

So the cost per message actually increased when we increased the size of the cluster.

Now another interesting property is what happens when CPU's gets faster. Assume our CPU now doubled in performance and so the basic model changes in two ways 1) TCP/IP Cost = 15 + 0.05 * #Bytes 2) We can accumulate twice the number of messages per time unit

So the 1 link case we get 20 messages per millisecond 15 + 0.05 (20 * 100) = 115 Cost per message is 115 / 20 = 5.75 microseconds

So faster CPU's actually has superlinear effect on performance.

Another interesting fact is that as CPU's gets more and more loaded they will actually do more work per percentage of CPU load. I won't go into details about proving this but I am sure somebody has seen this effect as well and been puzzled about it.

Now when using MySQL Cluster there are many things that affect those parameters which are unique per application and there are certain things we can do in the MySQL software to improve matters but the generalized conclusions are usually the same:

1) Bigger clusters increase the cost of sending one message 2) Faster CPU's have superlinear effect on performance 3) Increasing the load means that cost per message goes down 4) Cluster interconnects with a lower fixed cost have a very significant effect on cluster performance

The final thing is that you saw an increase of network traffic when going from 2-node to 4-node. The reason is that a lot of the messages in the 2-node case became internal messages within one node and in the 4-node case those messages were not internal anymore. This is mostly seen when scaling from 2 to 4 nodes, scaling from 4 to 8 doesn't see much of that effect since almost all messages which are distributed are really distributed in the 4-node case whereas in the 2-node case 50% of them will be internal.

So in essence the behaviour you are seeing is "strange" but normal for distributed systems and their behaviour. I am sure that by sharing these experiences and solutions on the list a number of "rules of thumb" will transpire for various application types.

Rgrds Mikael

2004-11-23 kl. 07.03 skrev Mike Hedlund:

Hi all,

I'm getting ready to roll-out a mysql cluster into production, and have been tuning it trying to get better performance.

My 'stable' environment is:

2 Data Nodes (2 single proc P4 2.4Ghz 1GB boxes, 1 ndbd each, and NoOfReplicas: 2) 2 MySQL API Nodes (1 old dual proc p450 with 512MB, and 1 old mini-itx with a VIA Nehemiah 1Ghz cpu, 512MB)

Load balancing requests 50/50 between the MySQL API nodes nets me around 1000 to 1100 queries per second (my application/engine uses the mysql api nodes basically as a simple storage/sorting engine) which does distinct()'s on tables which have around 1 million rows of data (scaling to 250-500million rows of data in production). Explain output shows the queries all using the index only, no filesorting or temp tables etc. Mysqld cpu utilization on each api node is at 60-70% and ndbd cpu utilization on each data node is around 5-10%. (network traffic between nodes around 5-6Mbs).

If i change the environment to use 4 Data Nodes (same 2 machines as described above, except 2 ndbd processes on each machine), my queries per second drops to around 600-700. mysqld CPU utilization on the api nodes is 99%, each ndbd process on the 2 data node machines are at 10-15% utilization. Network traffic is around 15-20Mbs between the data nodes.

The performance of my 'stable environment' is actually ok for me to roll out the application, but the behavior of the mysqlds with more data nodes seemed strange.

If i used 4 different machines, each with 1 ndbd process and NoOfReplicas: 2, would I see the same performance hit? I would have expected a performance increase... or am I tired and missing something? :)

Any thoughts?

btw, this is 4.1.7-max.

Mikael Ronström, Senior Software Architect MySQL AB, www.mysql.com

Clustering: http://www.infoworld.com/article/04/04/14/HNmysqlcluster_1.html

http://www.eweek.com/article2/0,1759,1567546,00.asp