| Subject: | Re: Is MySQL Cluster stable and mature enough to run a social network? |
|---|---|
| From: | Mark Callaghan (mcal...@google.com) |
| Date: | 07/24/2008 12:31:25 AM |
| List: | com.mysql.lists.cluster |
On Thu, Jul 24, 2008 at 12:02 AM, Konstantin Rozinov <kroz...@gmail.com> wrote:
Hi folks,
I know this is a long email, but if you could find the time to reply, it would be greatly appreciated.
I am in the design phase of a new project and am trying to determine if MySQL Cluster is a viable option for the project I'm working on. It's a social network with many of the typical social network site features (profiles, photos, comments, videos, etc). We are using a LAMP environment. I'm trying to research MySQL Cluster, MySQL Replication, and how other large sites cope with large amounts of traffic and how they solve their scalability and HA issues.
We anticipate the database to grow very quickly over the next 12 months. We expect the database to reach around 800GB by the end of the 12 months. As the database size grows the number of transactions per second will also grow to the point that a single master MySQL server will probably not be able to keep up. So we are looking at ways to distribute the load of the database requests. We anticipate the ratio of selects to insert/updates to be 1 to 1 in the first 3-4 months because many people will be creating profiles, adding photos, posting comments, etc. By the end of the 12 month period, we expect the ratio to be 5 to 1. That is 5 selects to every insert/update.
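To put the expected ratios in terms of write share, here is a quick calculation. It is only a sketch, and the function name `write_fraction` is made up for illustration:

```python
def write_fraction(selects: int, writes: int) -> float:
    """Fraction of all statements that are inserts/updates."""
    return writes / (selects + writes)


early = write_fraction(1, 1)  # 1 select per insert/update (first 3-4 months)
late = write_fraction(5, 1)   # 5 selects per insert/update (after 12 months)
print(f"early: {early:.1%}, late: {late:.1%}")
```

Even at 5 to 1, roughly one statement in six is a write, and with plain master/slave replication every one of those still goes through the single master.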
What I've found is this:
- Replication is widely used by many of the largest sites
- memcached is widely used by many of the largest sites
- MySQL Cluster is not widely used at all (not sure why - seems like a great product)
As far as I can tell, Replication has some problems:
- the delay in syncing current data to the slaves.
- replication is ideal for read-intensive applications.
- need to modify web application to read from slaves and write to master.
- MOST IMPORTANTLY: the single point of failure and bottleneck point with 1 master server.
Problems with master/slave replication include:
* no write scale out unless you partition your dataset
* failover can be messy - it is difficult to automate when you have many slaves; if you use something like DRBD then this is less of an issue
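The application change mentioned above (read from slaves, write to master) might look roughly like the following. This is a minimal sketch, not how any particular site does it: the class name, the host names, and the `needs_fresh_data` flag are all hypothetical, and the statement check is deliberately naive (for example, it would send `SELECT ... FOR UPDATE` to a slave).

```python
import itertools


class ReadWriteRouter:
    """Send writes to the master and spread reads across slaves."""

    READ_PREFIXES = ("SELECT", "SHOW")

    def __init__(self, master, slaves):
        self.master = master
        # With no slaves configured, reads fall back to the master.
        self._slaves = itertools.cycle(slaves or [master])

    def route(self, sql, needs_fresh_data=False):
        """Return the host that should run this statement."""
        is_read = sql.lstrip().upper().startswith(self.READ_PREFIXES)
        # Slaves can lag behind the master, so a read that must see the
        # latest write is sent to the master as well.
        if is_read and not needs_fresh_data:
            return next(self._slaves)
        return self.master


router = ReadWriteRouter("master-db", ["slave-1", "slave-2"])  # hypothetical hosts
print(router.route("SELECT * FROM profiles WHERE id = 7"))
print(router.route("INSERT INTO comments (body) VALUES ('hi')"))
```

Adding slaves increases read capacity only; every write still lands on the one master, which is the "no write scale out" point above.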
Cluster is great, but you still need something like replication. Cluster doesn't protect against the loss (fire/power/network) of a datacenter. You still need to get data from one cluster to another.
From what I read and understand, MySQL Cluster is ideal for both read and write intensive applications and is built for high availability and scalability. Seems like a great solution. But why is no one using it for web applications or social networks?
In this scenario, would it be better to use replication with 1 Master database to handle inserts/updates and several slaves to handle selects? As load increases we would add more slaves to handle the additional select load until the insert/update load becomes too much for the single master. Then we could do a horizontal partition of the master database to split the insert/update load among multiple servers.
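One simple way to picture the horizontal partition described above is to map each user to one of several masters by hashing the user id. The sketch below assumes partitioning by user id and uses made-up shard names; neither comes from the original message.

```python
import zlib

# Hypothetical master servers, one per partition.
SHARDS = ["master-a", "master-b", "master-c", "master-d"]


def shard_for_user(user_id: int, shards=SHARDS) -> str:
    """Pick the master that holds all rows belonging to this user."""
    # crc32 gives the same answer in every process and on every host.
    key = str(user_id).encode("ascii")
    return shards[zlib.crc32(key) % len(shards)]


print(shard_for_user(42))
```

Modulo placement like this moves most users to a different shard when the number of shards changes, so a lookup table from user to shard is a common alternative when the data has to be rebalanced over time.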
Can the current versions of MySQL Cluster (6.2.15 or 6.3.16) be used in this scenario to give a more streamlined method of growing this database over time? I understand there are limitations with Cluster, but what are they and how do they limit a social network type of site from growth? I read in the documentation that there's a practical maximum data file size limit of 32GB. Is this why more social networks don't use Cluster, where their data files are 100s of GBs each?
Is the size of the DB (800GB) too much for Cluster to hold in memory? As far as I understand, only the indexed data needs to be in memory, so it would require much less than 800GB of RAM to hold the data. What would happen if it can't all fit in memory? Would it be paged out to disk or would that limit the size of the DB?
It seems everyone is recommending replication, but I have concerns about it:
1. Initially, we expect that there will be as many writes as reads as more and more users create profiles, post photos, comments, etc. Considering that replication is great for read-intensive applications, would replication be of any help here? How would I alleviate the large number of writes to a single master server?
2. The SPOF with 1 Master MySQL server really scares me. I've read about Master-Master Replication but again, the bottleneck would be the writes. Am I wrong?
Deal with the SPOF by detecting failures quickly and promoting a slave to a master. It isn't trivial but it can be done.
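One part of that promotion step is choosing which slave to promote. A minimal sketch, assuming each slave reports how far into the master's binary log it has executed (for example, the `Relay_Master_Log_File` and `Exec_Master_Log_Pos` columns of `SHOW SLAVE STATUS`). The `Slave` class and the function name are hypothetical, and comparing file names as strings assumes they share a base name and zero-padded numbering:

```python
from dataclasses import dataclass


@dataclass
class Slave:
    host: str
    reachable: bool
    log_file: str  # master binlog file the slave has executed up to
    log_pos: int   # position within that file


def pick_promotion_candidate(slaves):
    """Return the reachable slave that has applied the most of the master's log."""
    candidates = [s for s in slaves if s.reachable]
    if not candidates:
        raise RuntimeError("no reachable slave to promote")
    # Later file wins; within the same file, the higher position wins.
    return max(candidates, key=lambda s: (s.log_file, s.log_pos))
```

After promotion, the remaining slaves still have to be repointed at the new master and the application has to send its writes there, which is where most of the "isn't trivial" work lies.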
3. Even if I partition my data across different master databases, if one of them fails then part of the site (and potentially the entire site) might go offline. Or am I wrong?
4. How do these big sites use replication without running into write performance issues?
5. Is MySQL Cluster 6.2 or 6.3 mature and stable enough to run a large social network?
Any thoughts on this scenario would be greatly appreciated.
Konstantin
-- Mark Callaghan mcal...@google.com




