15 messages in com.mysql.lists.clusterRe: Using Disk Data
FromSent OnAttachments
Joe Warren-Meeks17 Jul 2007 07:48 
Brian Moon17 Jul 2007 08:01 
Joe Warren-Meeks17 Jul 2007 08:07 
Stewart Smith17 Jul 2007 18:05 
Brian Moon17 Jul 2007 19:03 
Stewart Smith17 Jul 2007 20:50 
Brian Moon18 Jul 2007 08:34 
pek...@mysql.com18 Jul 2007 08:50 
Brian Moon18 Jul 2007 08:54 
Brian Moon18 Jul 2007 08:58 
pek...@mysql.com18 Jul 2007 09:02 
Robin Bowes18 Jul 2007 09:09 
Brian Moon18 Jul 2007 09:31 
Stewart Smith18 Jul 2007 21:28 
Mikael Ronström15 May 2008 02:39 
Subject:Re: Using Disk Data
From:Stewart Smith (stew@mysql.com)
Date:07/18/2007 09:28:20 PM
List:com.mysql.lists.cluster

On Wed, 2007-07-18 at 11:32 -0500, Brian Moon wrote:

Ok, lets start clean maybe with an exercise.

CREATE TABLE `pub_articles` ( `article_id` mediumint(8) unsigned NOT NULL default '0',

in fixed size memory

`category_id` mediumint(8) unsigned NOT NULL default '0',

on disk

`first_publish_time` datetime NOT NULL default '0000-00-00 00:00:00',

in fixed size memory

`expiration_time` datetime NOT NULL default '0000-00-00 00:00:00',

on disk

`headline` varchar(150) NOT NULL default '',

on disk

`price` double(8,2) default NULL,

on disk

`category_name` varchar(50) NOT NULL default '',

on disk

`publication_id` tinyint(3) unsigned NOT NULL default '0',

on disk

`hotness` double(6,4) NOT NULL default '0.0000',

on disk

`rank` tinyint(3) unsigned NOT NULL default '0',

on disk

`body` text NOT NULL,

first 256 in fixed size memory in 5.1 first 256 in variable sized memory in 5.1-telco (i.e. if body is only 10 bytes... only use 10bytes+length... not the full 256)

helper table data is on disk (i.e. everything over 256 bytes)

this helper table has an index.. so it's likely 20bytes for each 2k or something....

PRIMARY KEY (`article_id`,`publication_id`),

means article_id and publication_id are in MEMORY.

KEY `pub_time` (`publication_id`,`first_publish_time`),

means publication_id and first_publish_time in MEMORY.

) TABLESPACE ts_1 STORAGE DISK NGINE=ndbcluster DEFAULT CHARSET=utf8;

As I understand that page, every column and indexes in this table is completely in memory except for `body`. For `body`, the first 256 bytes are in memory and the rest is on disk.

see above... hopefully it makes it clearer.

Now, I know there is extra stuff in memory for different reasons. And I know you brilliant engineers (really, you are brilliant, I am not being flippant) know every bit that is stored and where it is stored. But, what I think we (as humble users) want to know is where are the bytes that I send in a query being stored. Because, in reality, that is what will affect my storage needs more than where x bits of meta data are stored. I know in a "telco" type of environment where you are running very tight it is different. But, I need to know if I need 4GB of DataMemory or 8GB or DataMemory. If I need 8GB, I will buy more and increase it. I am not concerned with 200 bytes vs. 210 bytes.

hrrm... certainly seems like ndb_size.pl with support for estimating usage for disk tables would be useful.... eep!

Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html