9 messages in com.mysql.lists.cluster
Fwd: DB Node Stuck in phase 5

| From | Sent On | Attachments |
|---|---|---|
| Quim Calpe | 31 Aug 2005 07:50 | |
| Alex Davies | 31 Aug 2005 08:23 | |
| Alex Davies | 31 Aug 2005 08:41 | |
| Alex Davies | 31 Aug 2005 08:53 | |
| Alex Davies | 31 Aug 2005 09:13 | |
| Alex Davies | 31 Aug 2005 09:49 | |
| Alex Davies | 31 Aug 2005 10:06 | |
| Alex Davies | 31 Aug 2005 10:24 | |
| Alex Davies | 31 Aug 2005 10:56 | |

| Subject: | Fwd: DB Node Stuck in phase 5 |
|---|---|
| From: | Alex Davies (davi...@gmail.com) |
| Date: | 08/31/2005 08:53:54 AM |
| List: | com.mysql.lists.cluster |
Dear Quim,
Can I suggest you upgrade from Version 5.0.7 (beta) to Version 5.0.11 (beta)?
How big is your database? (easy way to tell: how much ram is your remaining ndbd process using)
With best wishes,
Alex
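The memory check suggested above can be scripted. This is a minimal sketch, assuming a Linux host where `ps -C ndbd -o pid=,rss=` prints one "PID RSS-in-KiB" pair per line; the helper names and the sample figures are made up for illustration, and resident memory is only a rough proxy for database size.

```python
def parse_ps_rss(ps_output):
    """Map PID -> resident set size in KiB from `ps -C ndbd -o pid=,rss=` output."""
    rss_by_pid = {}
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or unexpected lines
        pid, rss_kib = fields
        rss_by_pid[int(pid)] = int(rss_kib)
    return rss_by_pid


def total_rss_mib(rss_by_pid):
    """Sum the resident memory of all listed processes, in MiB."""
    return sum(rss_by_pid.values()) / 1024.0


# Made-up sample output; the PIDs and sizes are illustrative only.
sample = "  4001   1024\n  4002 524288\n"
usage = parse_ps_rss(sample)
print(usage, total_rss_mib(usage))
```

To run it against a live host, the sample string could be replaced with `subprocess.run(["ps", "-C", "ndbd", "-o", "pid=,rss="], capture_output=True, text=True).stdout`.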
On 31/08/05, Quim Calpe <quim...@96com.com> wrote:
I started the node for the very first time and all went fine, all nodes were
detected, up and running. Then I uploaded a big database and one of the nodes
crashed, here is the log file:
Date/Time: Wednesday 31 August 2005 - 16:10:26
Type of error: error
Message: Arbitrator shutdown
Fault ID: 2305
Problem data: Arbitrator decided to shutdown this node
Object of reference: QMGR (Line: 3796) 0x0000000e
ProgramName: ndbd
ProcessID: 2964
TraceFile: /var/lib/mysql/ndb_2_trace.log.1
Version 5.0.7 (beta)
***EOM***
I managed to finish the upload with the other node and the database is now OK.
The crashed node refused to start, so I decided to run "ndbd --initial". That was
an hour ago and it is still in progress... I can understand that the DB needs to
be replicated to the new node, but there is very little traffic...
Thanks once more...
Quim
-----Original Message-----
From: Alex Davies [mailto:davi...@gmail.com]
Sent: Wednesday, 31 August 2005 17:42
To: Quim Calpe
CC: clus...@lists.mysql.com
Subject: Re: DB Node Stuck in phase 5
Dear Quim,
I have never encountered this before, and I hope someone more experienced than me will give you a definite answer, but TIME_WAIT is the state a TCP endpoint enters after it has actively closed a connection. A large increase in TIME_WAIT sockets suggests that something is opening and closing a hell of a lot of TCP connections.
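To see how fast TIME_WAIT sockets are piling up, the connection states can be counted from `netstat -tan` output. This is a rough sketch, assuming the Linux column layout (proto, recv-q, send-q, local address, foreign address, state); the function name, addresses and ports below are hypothetical.

```python
from collections import Counter


def count_tcp_states(netstat_output, remote_host=None):
    """Count TCP connections per state, optionally only those to remote_host."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp 0 0 local:port foreign:port STATE
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        foreign, state = fields[4], fields[5]
        if remote_host is not None and foreign.rsplit(":", 1)[0] != remote_host:
            continue
        counts[state] += 1
    return counts


# Made-up sample; 10.0.0.3 stands in for the other data node.
sample = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address   Foreign Address  State
tcp        0      0 10.0.0.2:33010  10.0.0.3:2202    TIME_WAIT
tcp        0      0 10.0.0.2:33011  10.0.0.3:2202    TIME_WAIT
tcp        0      0 10.0.0.2:33012  10.0.0.3:2202    ESTABLISHED
tcp        0      0 10.0.0.2:33013  10.0.0.9:22      TIME_WAIT
"""
print(count_tcp_states(sample, remote_host="10.0.0.3"))
```

Running the count twice a few seconds apart and comparing the TIME_WAIT figures gives the growth rate per second.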
Can you explain in more detail what you have done so far? Is this the first time that the ndbd node has tried to start? What happens if you kill the ndbd node that is trying to start, delete the files in /var/lib/mysql-cluster (backup first) and then start ndbd with --initial? Leave it for a few hours to see if it works.
How big is your database?
Alex
On 31/08/05, Quim Calpe <quim...@96com.com> wrote:
There is very little traffic (approx. 100 b/s), and TIME_WAIT ports keep growing
very fast (10 new ports per second)...
Quim
-----Original Message-----
From: Alex Davies [mailto:davi...@gmail.com]
Sent: Wednesday, 31 August 2005 17:25
To: Quim Calpe
Subject: Re: DB Node Stuck in phase 5
Dear Quim,
Can you confirm how much network traffic there is?
It sounds to me like it is loading up, but it will take some time. How long has it been stuck in phase 5?
Alex
On 31/08/05, Quim Calpe <quim...@96com.com> wrote:
Alex,
There are hundreds of TCP connections to the other data node, but all except
one are in TIME_WAIT. I really don't know whether the node is actually getting
data or has stalled...
Thanks!
Quim
-----Original Message-----
From: Alex Davies [mailto:davi...@gmail.com]
Sent: Wednesday, 31 August 2005 17:07
To: Quim Calpe
Subject: Re: DB Node Stuck in phase 5
Dear Quim,
I believe that phase 5 copies the data from the alive nodes in the nodegroup (if available) or starts the recovery from local checkpoints and the REDO log.
This process could potentially take some time. What happens if you delete everything in the DataDir on the crashed node and then start ndbd with --initial?
It is possible that it is just taking a long time - what is the network traffic looking like? If there is some traffic, then it probably is actually recovering, just taking its time.
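One way to put a number on "some traffic" is to sample the interface byte counters twice and divide by the interval. This is a minimal sketch, assuming a Linux host that exposes `/proc/net/dev`; the interface name `eth0`, the helper names and the counter values are made up for illustration.

```python
def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev contents."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def byte_rates(before, after, interval_seconds, interface):
    """Return (rx, tx) in bytes per second between two counter snapshots."""
    rx0, tx0 = before[interface]
    rx1, tx1 = after[interface]
    return ((rx1 - rx0) / interval_seconds, (tx1 - tx0) / interval_seconds)


HEADER = (
    "Inter-|   Receive                            |  Transmit\n"
    " face |bytes packets errs drop fifo frame compressed multicast"
    "|bytes packets errs drop fifo colls carrier compressed\n"
)
# Two made-up snapshots taken 10 seconds apart.
first = HEADER + "  eth0: 1000 10 0 0 0 0 0 0 2000 20 0 0 0 0 0 0\n"
second = HEADER + "  eth0: 2000 20 0 0 0 0 0 0 2500 25 0 0 0 0 0 0\n"
print(byte_rates(parse_net_dev(first), parse_net_dev(second), 10, "eth0"))
```

On a live node the two snapshots would come from reading `/proc/net/dev` with a `time.sleep()` in between. A rate that stays near zero for a long stretch would suggest the node is stalled rather than copying data.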
Alex
On 31/08/05, Quim Calpe <quim...@96com.com> wrote:
The config is: 1 MGM, 2 NDB, 2 MySQL
One of the NDB nodes crashed during a heavy load of "INSERTS". The other one stayed up and I finished the job, but the crashed node now refuses to start: it gets stuck in phase 5. Here is the MGM log:
2005-08-31 16:34:19 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 reserved for ip XXX.XXX.XXX.XXX, m_reserved_nodes 0000000000000036.
2005-08-31 16:34:19 [MgmSrvr] INFO -- Node 1: Node 2 Connected
2005-08-31 16:34:20 [MgmSrvr] INFO -- Mgmt server state: nodeid 2 freed, m_reserved_nodes 0000000000000032.
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Start phase 0 completed
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Communication to Node 3 opened
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 3: Node 2 Connected
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Node 3 Connected
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: CM_REGCONF president = 3, own Node = 2, our dynamic id = 5
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 3: Node 2: API version 5.0.7
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Node 3: API version 5.0.7
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Start phase 1 completed
2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Receive arbitrator node 1 [ticket=0c2700040cefafe4]
2005-08-31 16:35:00 [MgmSrvr] INFO -- Node 2: Start phase 2 completed
2005-08-31 16:35:47 [MgmSrvr] INFO -- Node 2: Start phase 3 completed
2005-08-31 16:35:53 [MgmSrvr] INFO -- Node 2: Start phase 4 completed
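For a longer management log, the last start phase each node completed can be pulled out with a small script. This is only a sketch that assumes the "Node N: Start phase P completed" wording seen in the log above; the function name is hypothetical and the sample lines are copied from that log.

```python
import re

# Matches e.g. "2005-08-31 16:35:53 [MgmSrvr] INFO -- Node 2: Start phase 4 completed"
PHASE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[MgmSrvr\] INFO\s+--\s+"
    r"Node (\d+): Start phase (\d+) completed"
)


def last_completed_phase(log_lines):
    """Return {node_id: (phase, timestamp)} for the highest phase each node reported."""
    latest = {}
    for line in log_lines:
        match = PHASE_RE.match(line.strip())
        if not match:
            continue  # not a "Start phase ... completed" line
        timestamp = match.group(1)
        node, phase = int(match.group(2)), int(match.group(3))
        if node not in latest or phase >= latest[node][0]:
            latest[node] = (phase, timestamp)
    return latest


sample = [
    "2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Start phase 0 completed",
    "2005-08-31 16:34:59 [MgmSrvr] INFO -- Node 2: Node 3 Connected",
    "2005-08-31 16:35:47 [MgmSrvr] INFO -- Node 2: Start phase 3 completed",
    "2005-08-31 16:35:53 [MgmSrvr] INFO -- Node 2: Start phase 4 completed",
]
print(last_completed_phase(sample))
```

Comparing the reported timestamp with the current time shows how long the node has been sitting in the following phase.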
Any help would be appreciated
Quim Calpe
--
MySQL Cluster Mailing List
For list archives: http://lists.mysql.com/cluster
To unsubscribe: http://lists.mysql.com/cluster?unsub=al...@davz.net
-- Alex Davies // http://www.davz.net
This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed. If you have received this email in error please notify the sender immediately by e-mail and delete this e-mail permanently.
Contact me - MSN: a_d...@hotmail.com SKYPE: alex.davies




