19 messages in com.mysql.lists.cluster: Re: Stuck in phase 2

| From | Sent On | Attachments |
|---|---|---|
| Jerl Simpson | 06 Jul 2005 05:38 | |
| Jonathan Miller | 06 Jul 2005 06:22 | |
| Jerl Simpson | 06 Jul 2005 07:34 | |
| Jonathan Miller | 06 Jul 2005 07:48 | |
| Jerl Simpson | 06 Jul 2005 08:08 | |
| Martin Skold | 06 Jul 2005 08:12 | |
| Jerl Simpson | 06 Jul 2005 09:01 | |
| Jonathan Miller | 06 Jul 2005 09:03 | |
| Jerl Simpson | 06 Jul 2005 09:17 | |
| Jonathan Miller | 06 Jul 2005 09:24 | |
| Jerl Simpson | 06 Jul 2005 09:42 | |
| Stewart Smith | 06 Jul 2005 18:54 | |
| Jerl Simpson | 07 Jul 2005 05:46 | |
| Stewart Smith | 07 Jul 2005 08:22 | |
| Jerl Simpson | 07 Jul 2005 11:40 | |
| Stewart Smith | 07 Jul 2005 18:51 | |
| Adam Dixon | 02 Jan 2006 20:40 | |
| Alex Davies | 03 Jan 2006 00:10 | |
| Stewart Smith | 17 Jan 2006 21:01 | |

| Subject: | Re: Stuck in phase 2 |
|---|---|
| From: | Adam Dixon (adam...@gmail.com) |
| Date: | 01/02/2006 08:40:45 PM |
| List: | com.mysql.lists.cluster |
Just FYI, I did a '15 restart -n' on a node and it shut down successfully. I then waited 30 seconds, ran '15 start', and it got stuck on Phase 2. (I gave it 15 minutes, but normally it gets up to phase 4 by the time you get a chance to look at the log file.) So I killed the process and started it again, and all came good.
One strange thing in the ndb log was that it complained about ID allocation:
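For anyone who wants to spot a stalled start from the cluster log rather than by eye, here is a rough sketch. It assumes log lines of the form `Node 15: Start phase 0 completed`, as written by the management server; the function name `last_completed_phase` is made up for this example and is not part of any MySQL tool.

```python
import re

# Matches e.g. "... INFO -- Node 15: Start phase 0 completed"
PHASE_RE = re.compile(r"Node (\d+): Start phase (\d+) completed")


def last_completed_phase(log_lines, node_id):
    """Return the most recently reported completed start phase for node_id,
    or None if the log has no such line for that node."""
    last = None
    for line in log_lines:
        m = PHASE_RE.search(line)
        if m and int(m.group(1)) == node_id:
            last = int(m.group(2))
    return last


# Two lines in the cluster log format (hypothetical input for the sketch).
sample = [
    "2006-01-03 14:13:19 [MgmSrvr] INFO -- Node 15: Start initiated (version 5.0.15)",
    "2006-01-03 14:13:19 [MgmSrvr] INFO -- Node 15: Start phase 0 completed",
]
print(last_completed_phase(sample, 15))
```

If the value stops advancing for longer than a start normally takes, the node is probably stuck in the following phase.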
2006-01-03 14:13:07 [MgmSrvr] INFO -- Node 17: Communication to Node 15 closed
...
2006-01-03 14:13:07 [MgmSrvr] INFO -- Node 13: Communication to Node 15 closed
2006-01-03 14:13:07 [MgmSrvr] ALERT -- Node 11: Arbitration check won - node group majority
2006-01-03 14:13:07 [MgmSrvr] INFO -- Node 11: President restarts arbitration thread [state=6]
2006-01-03 14:13:08 [MgmSrvr] INFO -- Node 15: Node shutdown completed, restarting, no start.
2006-01-03 14:13:08 [MgmSrvr] WARNING -- Allocate nodeid (15) failed. Connection from ip 192.168.0.15. Returned error string "Id 15 already allocated by another node."
2006-01-03 14:13:08 [MgmSrvr] INFO -- Mgmt server state: node id's 11 12 13 14 15 16 17 18 connected but not reserved
2006-01-03 14:13:08 [MgmSrvr] INFO -- Mgmt server state: node id's 1 not connected but reserved
2006-01-03 14:13:11 [MgmSrvr] INFO -- Mgmt server state: nodeid 15 reserved for ip 192.168.0.15, m_reserved_nodes 0800000000008006.
2006-01-03 14:13:11 [MgmSrvr] INFO -- Node 1: Node 15 Connected
2006-01-03 14:13:12 [MgmSrvr] INFO -- Mgmt server state: nodeid 15 freed, m_reserved_nodes 0800000000000006.
2006-01-03 14:13:19 [MgmSrvr] INFO -- Node 15: Start initiated (version 5.0.15)
2006-01-03 14:13:19 [MgmSrvr] INFO -- Node 15: Start phase 0 completed
It seems that the node was disconnected OK, but perhaps its node ID allocation was not freed? If this is true, shouldn't the ndbd process fail and exit instead of hanging on Phase 2?
Further to this, if a node is under load, e.g. during a checkpoint, could that delay it in receiving the update that another node has gone away, and so cause this issue?
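To check a log for the same pattern, the node ID events can be pulled out in order. This is only a sketch: the three patterns are copied from the log wording above, and the helper name `nodeid_events` is invented for the example.

```python
import re

# Patterns taken from the management server log lines quoted above.
PATTERNS = (
    ("failed", re.compile(r"Allocate nodeid \((\d+)\) failed")),
    ("reserved", re.compile(r"nodeid (\d+) reserved for ip")),
    ("freed", re.compile(r"nodeid (\d+) freed")),
)


def nodeid_events(log_lines, node_id):
    """Return the node ID allocation events seen for node_id, in log order."""
    events = []
    for line in log_lines:
        for name, pattern in PATTERNS:
            m = pattern.search(line)
            if m and int(m.group(1)) == node_id:
                events.append(name)
    return events


# Three of the lines from the log excerpt above.
sample = [
    '2006-01-03 14:13:08 [MgmSrvr] WARNING -- Allocate nodeid (15) failed. '
    'Connection from ip 192.168.0.15. Returned error string '
    '"Id 15 already allocated by another node."',
    "2006-01-03 14:13:11 [MgmSrvr] INFO -- Mgmt server state: nodeid 15 "
    "reserved for ip 192.168.0.15, m_reserved_nodes 0800000000008006.",
    "2006-01-03 14:13:12 [MgmSrvr] INFO -- Mgmt server state: nodeid 15 "
    "freed, m_reserved_nodes 0800000000000006.",
]
print(nodeid_events(sample, 15))  # ['failed', 'reserved', 'freed']
```

A "failed" event immediately after a shutdown, as in this log, marks the restarts worth a closer look.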
Running 5.0.15.
Adam
On 7/8/05, Stewart Smith <stew...@mysql.com> wrote:
> On Thu, 2005-07-07 at 13:41 -0500, Jerl Simpson wrote:
> > Is there another solution you might recommend for someone in my situation? I'd like to be able to replicate/redundancy in both directions.
>
> Replication may be what you are looking for. It is certainly designed to be used over geographically remote sites. Best to ask in replication forums/lists (after looking of course!) for details.
>
> --
> Stewart Smith, Software Engineer
> MySQL AB, www.mysql.com
> Office: +14082136540 Ext: 6616
> VoIP: 66...@sip.us.mysql.com
> Mobile: +61 4 3 8844 332
>
> Jumpstart your cluster: http://www.mysql.com/consulting/packaged/cluster.html