14 messages in com.mysql.lists.cluster
Re: API loses data during node restarts

| From | Sent On | Attachments |
|---|---|---|
| Jim Hoadley | 19 Jul 2004 12:52 | |
| Devananda | 19 Jul 2004 14:20 | |
| Johan Andersson | 19 Jul 2004 14:31 | |
| Jim Hoadley | 19 Jul 2004 15:36 | |
| Justin Swanhart | 19 Jul 2004 16:08 | |
| Mikael Ronström | 19 Jul 2004 16:12 | |
| Mikael Ronström | 19 Jul 2004 16:29 | |
| Johan Andersson | 19 Jul 2004 16:30 | |
| Devananda | 19 Jul 2004 16:54 | |
| Johan Andersson | 19 Jul 2004 17:08 | |
| Jim Hoadley | 19 Jul 2004 17:21 | |
| Jim Hoadley | 19 Jul 2004 17:34 | |
| Mikael Ronström | 20 Jul 2004 02:45 | |
| Jim Hoadley | 21 Jul 2004 17:12 | |

| Subject: | Re: API loses data during node restarts |
|---|---|
| From: | Mikael Ronström (mik...@mysql.com) |
| Date: | 07/19/2004 04:29:39 PM |
| List: | com.mysql.lists.cluster |
Hi,
On 2004-07-20 at 01:08, Justin Swanhart wrote:
Do you have a [TCP] section for connections between the second API server and the first DB server?
The default if you don't specify any connections is that you get one between all storage nodes and between each API server and all storage nodes. It might however be worthwhile to test specifying each connection to see if there is a bug in the code generating the connections.
Rgrds Mikael
(assuming your app ids are 4 and 5 and your db ids are 2 and 3)
#DB1 / APP1
[TCP]
node1=2
node2=4

#DB2 / APP1
[TCP]
node1=3
node2=4

#DB1 / APP2
[TCP]
node1=2
node2=5

#DB2 / APP2
[TCP]
node1=3
node2=5
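To make the default topology Mikael describes concrete (one connection between every pair of storage nodes, plus one between each API node and every storage node), here is a minimal Python sketch that enumerates those pairs and prints explicit [TCP] sections in the style of the listing above. The function names are made up for illustration, the node ids are the ones assumed in this thread (DB 2 and 3, API 4 and 5), and management-node connections are deliberately left out because the thread does not discuss them.

```python
from itertools import combinations


def default_tcp_connections(db_ids, api_ids):
    """Enumerate (node1, node2) pairs for the default topology described above.

    Hypothetical helper: storage nodes are fully connected to each other,
    and every API node is connected to every storage node.
    """
    dbs = sorted(db_ids)
    apis = sorted(api_ids)
    pairs = list(combinations(dbs, 2))                   # DB <-> DB
    pairs += [(db, api) for api in apis for db in dbs]   # DB <-> API
    return pairs


def render_tcp_sections(pairs):
    """Render the pairs as [TCP] sections using the node1/node2 keys from the thread."""
    return "\n\n".join(f"[TCP]\nnode1={a}\nnode2={b}" for a, b in pairs)


if __name__ == "__main__":
    # Node ids as assumed in the message above.
    print(render_tcp_sections(default_tcp_connections([2, 3], [4, 5])))
```

Comparing this generated list against what the cluster actually sets up is one way to check whether a connection is missing for the second API server.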
--- Jim Hoadley <j_ho...@yahoo.com> wrote:
Johan --
Thanks for the fast response! I read bug report 4585. It says:
Description:
If entire DB cluster goes down, then the mysqld servers should retry connecting to the DB. The mysql servers must not give up trying to reconnect to DB nodes.

If the mysqld is not restarted after a cluster restart and a query is executed on that mysqld, then the mysqld will crash. Not so nice.

How to repeat:
1. restart cluster
2. issue a query on one mysqld server

Suggested fix:
Let there be a configurable option (--ndbcluster_timeout) for how long the mysqld should try to reconnect to the db nodes. --ndbcluster_timeout={0,0x7fffffff} and let 0 be retry forever.
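The reconnect rule suggested in the bug report (keep retrying for a configurable time, with 0 meaning retry forever) can be sketched in Python as below. This is only an illustration of the semantics, not mysqld code: `connect_with_retry`, its parameters, and the use of `ConnectionError` are all assumptions, and `--ndbcluster_timeout` itself is just a proposal in the report.

```python
import time


def connect_with_retry(connect, timeout, interval=1.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Call connect() until it succeeds or `timeout` seconds have passed.

    Hypothetical sketch of the proposed rule: timeout == 0 means
    "retry forever"; any other value is a limit in seconds.
    """
    deadline = None if timeout == 0 else clock() + timeout
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up only when a deadline exists and has been reached.
            if deadline is not None and clock() >= deadline:
                raise
            sleep(interval)
```

The `clock` and `sleep` parameters are injectable only so the behaviour can be exercised without real waiting.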
Not sure we're talking about the same issue. I'm not taking the entire cluster down, just one of the nodes. In that case, shouldn't the API seamlessly and instantly read from another node?
1) I have a 2-node cluster with 2 replicas, with an API running on each node.
2) I run a shell script that connects to the first API and executes one SELECT query per second. I can stop either DB node and everything still works.
3) I run the same script against the second API. I can stop the DB node on the *other* computer, but if I stop the DB node on the same computer that the API is running on, mysqld reports it can't get a lock on the data file until the node comes back up.
4) When the node is started again the API begins answering queries again.
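For anyone who wants to repeat the test in the steps above, a probe along these lines would do. It is a rough Python sketch, not Jim's actual shell script: the helper names are invented, the host name passed to `mysql_select` is whatever your API server is called, and it assumes the `mysql` command-line client is on the PATH.

```python
import subprocess
import time


def mysql_select(host, query="SELECT * FROM test.simpsons"):
    """Run one query through the mysql command-line client; True if it succeeded."""
    result = subprocess.run(["mysql", "-h", host, "-e", query],
                            capture_output=True, text=True)
    return result.returncode == 0


def probe(run_query, attempts, interval=1.0, sleep=time.sleep):
    """Call run_query() `attempts` times, `interval` seconds apart.

    Returns a list of booleans, one per attempt, so a gap in service
    shows up as a run of False values while a DB node is stopped.
    """
    results = []
    for i in range(attempts):
        try:
            ok = bool(run_query())
        except Exception:
            ok = False  # treat any client-side failure as a failed query
        results.append(ok)
        if i < attempts - 1:
            sleep(interval)
    return results
```

For example, `probe(lambda: mysql_select("box2"), 60)` would issue one SELECT per second for a minute against a host named box2 (a placeholder name); stop and restart the local DB node during the run and look at where the failures fall.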
Comments? Thanks again for taking the time to look at my problem.
-- Jim
--- Johan Andersson <joh...@mysql.com> wrote:
Hi, A bug report (4585) relating to this has been filed. Sorry for your inconvenience,
b.r, Johan Andersson
Devananda wrote:
I've been experiencing this same general problem, but haven't tried to narrow it down to a reproducible pattern. Seems to happen in relation to restarting a DB node, like Jim said.
Jim Hoadley wrote:
When I stop/start or restart a database node, the API (MySQL server) loses connection with the data until the node comes back online. This only happens on one of my 2 nodes (BOX2). The other (BOX1) is fine. Been puzzling over this for a week or so. Something I missed? Please forward any suggestions. Details below.
BOX1 = Pentium III/1000MHz/512MB RAM
BOX2 = Pentium III/600MHz/512MB RAM

Both running mysql-4.1.3-beta-nightly-20040628.tar.gz. Not a lot of RAM but only using a tiny test database at this point. Running the MGM on a separate computer (BOX4) to help isolate problem.
Connected to BOX1, issue SELECT against test.simpsons and get proper response:
----------------------------------------
mysql> select * from simpsons ;
+----+------------+
| id | first_name |
+----+------------+
|  2 | Lisa       |
|  4 | Homer      |
|  5 | Maggie     |
|  3 | Marge      |
|  1 | Bart       |
+----+------------+
5 rows in set (0.03 sec)
----------------------------------------
Stop node 3 on BOX1. SELECT now fails:
----------------------------------------
mysql> select * from simpsons ;
ERROR 1015: Can't lock file (errno: 4009)
----------------------------------------
Repeating SELECT fails:
----------------------------------------
mysql> select * from simpsons ;
ERROR 2013: Lost connection to MySQL server during query
----------------------------------------
Repeating SELECT fails again, then succeeds after node 3 is restarted:
----------------------------------------
mysql> select * from simpsons ;
ERROR 2006: MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 1
Current database: test

+----+------------+
| id | first_name |
+----+------------+
|  2 | Lisa       |
|  4 | Homer      |
|  5 | Maggie     |
|  3 | Marge      |
|  1 | Bart       |
+----+------------+
5 rows in set (6.55 sec)
----------------------------------------
All data is intact. BTW new records added to node 2 on BOX2 while node 3 on BOX1 is down show up (this is good).
Here's what restarting node 3 on BOX1 with mgmd looks like (looks right to me):
----------------------------------------
NDB> show
Cluster Configuration
---------------------
2 NDB Node(s)
DB node: 2 (Version: 3.5.0)
DB node: 3 (Version: 3.5.0)

4 API Node(s)
API node: 11 (not connected)
API node: 12 (Version: 3.5.0)
API node: 13 (not connected)
API node: 14 (not connected)

1 MGM Node(s)
MGM node: 1 (Version: 3.5.0)

NDB> 2 restart
Executing RESTART on node 2.
Database node 2 is being restarted.

NDB> 2 - endTakeOver
----------------------------------------
Here is the MySQL server error log output on BOX1 as node 3 is restarted:
----------------------------------------
040713 10:53:31 mysqld started
040713 10:53:32 InnoDB: Started; log sequence number 0 44112
/usr/local/mysql/libexec/mysqld: ready for connections.
Version: '4.1.3-beta-nightly-20040628-log' socket: '/tmp/mysql.sock'
=== message truncated ===
--
MySQL Cluster Mailing List
For list archives: http://lists.mysql.com/cluster
To unsubscribe: http://lists.mysql.com/cluster?unsub=mik...@mysql.com
Mikael Ronström, Senior Software Architect MySQL AB, www.mysql.com
Clustering: http://www.infoworld.com/article/04/04/14/HNmysqlcluster_1.html
http://www.eweek.com/article2/0,1759,1567546,00.asp




