Subject: Re: Data persistency
From: Mikael Ronström (mik@mysql.com)
Date: 08/21/2004 12:46:07 AM
List: com.mysql.lists.cluster

Hi Clint,

On 2004-08-21 at 02:22, Clint Byrum wrote:

On Friday, August 20, 2004, at 04:03 PM, Mikael Ronström wrote: <snip>

64 bytes for the VARCHAR(20) seems correct. Hash Index for 64 byte PK actually is 33 + 64 = 107 bytes per key (there is a long key feature that kicks in above 32 bytes that has an 8 byte overhead) => This is obviously what causes the 3-4M limit due to the IndexMemory.

3 * 64 + 4* 4 + (3 * 100 + 4) + 16 = 524 bytes per record in DataMemory => 61 records per page => 75M records ~40GByte (Thus ~10 GByte DataMemory per node)

107 * 75 M * 2 replicas = 16 GB => 4 GByte IndexMemory per node

=> 14 GByte memory per machine.

Thus 16 GByte machines should do the trick hopefully.
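The totals above can be re-checked with a few lines of Python. This is only a back-of-the-envelope sketch: the per-record size (524 bytes) and per-key size (107 bytes) are taken as stated in the message rather than re-derived, the function names are made up for illustration, and page overhead and the split across nodes are deliberately left out.

```python
# Rough NDB memory sizing using the figures quoted in the message.
# The constants are the message's stated totals, taken as given.

BYTES_PER_RECORD = 524   # DataMemory bytes per record, as quoted
BYTES_PER_KEY = 107      # hash index bytes per 64-byte PK, as quoted
RECORDS = 75_000_000
REPLICAS = 2


def data_bytes(records, bytes_per_record):
    """Row storage for one copy of the data, ignoring page overhead."""
    return records * bytes_per_record


def index_bytes(records, bytes_per_key, replicas):
    """Hash index storage summed over all replicas."""
    return records * bytes_per_key * replicas


def to_gb(n_bytes):
    """Convert bytes to decimal gigabytes."""
    return n_bytes / 10**9


print(f"data  : {to_gb(data_bytes(RECORDS, BYTES_PER_RECORD)):.1f} GB")
print(f"index : {to_gb(index_bytes(RECORDS, BYTES_PER_KEY, REPLICAS)):.1f} GB")
```

The results land close to the ~40 GByte and 16 GB figures quoted above; the small gap on the data side is expected, since the message counts whole pages at 61 records per page.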

Does this mean machines with 16GB of *virtual* memory free to devote to MySQL cluster? How bad is performance going to be on a database like this if the nodes have 8GB of RAM and 10GB of swap (on fast disks.. maybe RAID10)? <snip>

It is not really advisable to run with less physical memory than the virtual memory allocated by the ndbd process. We don't test such scenarios. It would probably work with good performance most of the time and with lousy performance some of the time, because swapping stops the process on every page-in, and that will cause problems with the heartbeat mechanisms.

A full table scan over the table would be particularly nasty, since it touches all of the table's memory pages and does so one at a time.

There are work items in the pipeline to handle these things for 5.0: disk data for non-indexed attributes (which will use caching mechanisms, as most disk databases do) and variable-sized records (which will save space for VARCHARs that don't use the maximum space).

Currently the ordered index would not work with UTF8, since it requires the NDB kernel to use the UTF8 compare routines, and that is still on the TODO list.

Uggggh, that's kind of a bummer for my intended use. My databases are all UTF8. Is that on the TODO list for 4.1, or 5.0?

We have noted that character set support is an important feature for many users. It has been on the TODO list for a long time, but this request now comes from at least five directions, so that will help in prioritising it. We are currently considering this.

Rgrds Mikael

Using the hash index is ok as long as there are exact matches. The NDB kernel doesn't handle the data, it only puts it into memory and later retrieves it from memory so the rest should be ok.
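The difference between exact-match lookups and ordering can be illustrated in plain Python. This is not NDB code, only a sketch under the assumption that the keys are stored as raw UTF-8 bytes: a hash lookup needs nothing more than byte equality, while sorting the same bytes gives an order that no character-set-aware comparison would produce. The names `KEYS`, `rows` and `exact_lookup` are made up for the example.

```python
# Keys stored as opaque UTF-8 bytes, as a store that does not
# interpret the character data would see them.
KEYS = ["apple", "Äpfel", "zebra"]
rows = {key.encode("utf-8"): i for i, key in enumerate(KEYS, start=1)}


def exact_lookup(key):
    """Hash lookup: succeeds only when the bytes match exactly."""
    return rows.get(key.encode("utf-8"))


# Exact matches work without any knowledge of the character set.
print(exact_lookup(KEYS[1]))          # finds the row stored for "Äpfel"
print(exact_lookup(KEYS[1].lower()))  # different bytes, so no match

# Ordering by raw bytes puts the multi-byte "Ä" after "z", which is
# why an ordered index needs real UTF-8 compare routines.
byte_order = [k.decode("utf-8") for k in sorted(rows)]
print(byte_order)
```

Note that a byte-exact lookup also treats differently cased or differently accented spellings as distinct keys, which may or may not be what the application expects.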

Mikael Ronström, Senior Software Architect MySQL AB, www.mysql.com

Clustering: http://www.infoworld.com/article/04/04/14/HNmysqlcluster_1.html

http://www.eweek.com/article2/0,1759,1567546,00.asp