17 messages in com.mysql.lists.cluster
Re: DB node hang on start

| From | Sent On | Attachments |
|---|---|---|
| Brancaleoni Matteo | 20 Jun 2004 00:54 | |
| Brancaleoni Matteo | 20 Jun 2004 14:24 | |
| Tomas Ulin | 20 Jun 2004 15:43 | |
| Tomas Ulin | 21 Jun 2004 04:37 | |
| Tomas Ulin | 21 Jun 2004 04:45 | |
| Matteo Brancaleoni | 21 Jun 2004 05:22 | |
| Matteo Brancaleoni | 21 Jun 2004 07:30 | |
| Tomas Ulin | 21 Jun 2004 07:57 | |
| Tomas Ulin | 21 Jun 2004 08:34 | |
| Brancaleoni Matteo | 21 Jun 2004 10:33 | |
| Tomas Ulin | 21 Jun 2004 11:36 | |
| Tomas Ulin | 22 Jun 2004 02:57 | |
| tul...@mysql.com | 22 Jun 2004 14:37 | |
| Matteo Brancaleoni | 23 Jun 2004 00:23 | |
| Matteo Brancaleoni | 23 Jun 2004 01:40 | |
| Matteo Brancaleoni | 23 Jun 2004 01:46 | |
| Tomas Ulin | 23 Jun 2004 03:30 | |

| Subject: | Re: DB node hang on start |
|---|---|
| From: | Tomas Ulin (tom...@mysql.com) |
| Date: | 06/21/2004 04:37:11 AM |
| List: | com.mysql.lists.cluster |
Did you try to start the second node with "ndbd -i"?
T
Brancaleoni Matteo wrote:
Hi, thanks for the fast answer :) see my comments inline.
On Mon, 2004-06-21 at 00:43, Tomas Ulin wrote:
first of all, if you download the latest source you don't have to specify the "[TCP]" connections at all
Ok, done.
1) please look where you started ndb_mgmd, you should find a cluster.log (look at the end "tail -n100 cluster.log")
ok, got it. Unfortunately there is no trace of db node #3, which is the one on the remote machine.
2) please make sure that you don't have any trailing "ndbd" processes on the failing machine (we're working on better detection of clashes). If you do, kill them and restart. If an "ndb" process hangs, this is often because multiple processes are trying to connect with the same "id".
ok. no trailing processes.
3) make sure you have your [COMPUTER] sections correct in the config file
ok, done
4) make sure that your Ndb.cfg/NDB_CONNECTSTRING points to the actual host:port that runs ndb_mgmd
Sure, done. If I deliberately write something wrong there (just for testing), the node doesn't enter the starting phase at all (it should be phase 1, I think). But when it does start, it stays stuck in that state.
and try again until you get the config right
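
As a rough aid for point 4, a node can first confirm that something is actually listening at the host:port it has been told to use. The following is only a minimal sketch in Python: the function name `can_connect` is made up for illustration, and the check says nothing about whether the listener is really ndb_mgmd, only that a TCP connection can be opened.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This only proves that some process is listening there; it does not
    verify that the process is the management server.
    """
    try:
        # create_connection resolves the host name and connects, raising
        # OSError (or a subclass) on refusal, timeout, or lookup failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (host and port are placeholders, not values from this thread):
#     can_connect("mgm-host.example", 2200)
```

Run from the machine where the db node fails to start, a `False` result would point at the connect string, name resolution, or a firewall rather than at the node itself.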
mmh... I tried to start 2 db nodes on the same machine (with different file system paths, of course). The 2nd db node starts, but it crashes after phase #4.
I have a rather long trace file for that. The error in the ndbd error.log is:

Date/Time: x 20 June 2004 - 23:15:49
Type of error: error
Message: Internal program error (failed ndbrequire)
Fault ID: 2341
Problem data: DbdihMain.cpp
Object of reference: DBDIH (Line: 1080) 0x00000002
ProgramName: NDB Kernel
ProcessID: 10904
TraceFile: NDB_TraceFile_1.trace
***EOM***

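
When comparing several such entries, it can help to split them into labelled fields. Below is a small Python sketch that does this for the entry quoted above. The field labels are taken from that entry only; whether other error.log entries use exactly the same labels is an assumption, and `parse_error_entry` is an illustrative name, not part of any NDB tool.

```python
import re

# Field labels exactly as they appear in the entry quoted in this message.
FIELDS = [
    "Date/Time", "Type of error", "Message", "Fault ID", "Problem data",
    "Object of reference", "ProgramName", "ProcessID", "TraceFile",
]

SAMPLE = (
    "Date/Time: x 20 June 2004 - 23:15:49 Type of error: error "
    "Message: Internal program error (failed ndbrequire) Fault ID: 2341 "
    "Problem data: DbdihMain.cpp Object of reference: DBDIH (Line: 1080) "
    "0x00000002 ProgramName: NDB Kernel ProcessID: 10904 "
    "TraceFile: NDB_TraceFile_1.trace ***EOM***"
)


def parse_error_entry(text):
    """Split one error.log entry into a {label: value} dictionary."""
    # Drop the end-of-message marker and anything after it.
    body = text.split("***EOM***")[0]
    labels = "|".join(re.escape(name) for name in FIELDS)
    # Splitting on a captured group keeps the labels in the result:
    # [prefix, label1, value1, label2, value2, ...]
    parts = re.split(r"(%s):\s*" % labels, body)
    return {parts[i]: parts[i + 1].strip() for i in range(1, len(parts) - 1, 2)}


entry = parse_error_entry(SAMPLE)
print(entry["Problem data"], entry["Object of reference"])
```

The same function works whether the entry is on one line or spread over several, since surrounding whitespace is stripped from each value.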
The mgm config is (for 2 db nodes on same machine):

[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia

[COMPUTER]
Id: 2
ByteOrder: Little
HostName: bestia

[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1

[DB DEFAULT]
NoOfReplicas: 2

[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1

[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /root/ndb/ndb_data2

[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1

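
To double-check which host and data directory each db node ends up with, the config above can be resolved mechanically. This is a hedged Python sketch, not how ndb_mgmd itself reads the file: it assumes the simple "[SECTION]" plus "Key: Value" layout shown above, and the helper names `parse_sections` and `db_node_hosts` are invented for the example. A hand-written parser is used because the section names repeat, which a standard INI parser would typically reject.

```python
CONFIG = """\
[COMPUTER]
Id: 1
ByteOrder: Little
HostName: bestia
[COMPUTER]
Id: 2
ByteOrder: Little
HostName: bestia
[MGM]
Id: 1
ExecuteOnComputer: 1
ArbitrationRank: 1
[DB DEFAULT]
NoOfReplicas: 2
[DB]
Id: 2
ExecuteOnComputer: 1
FileSystemPath: /root/ndb/ndb_data1
[DB]
Id: 3
ExecuteOnComputer: 2
FileSystemPath: /root/ndb/ndb_data2
[API]
Id: 4
ExecuteOnComputer: 1
ArbitrationRank: 1
"""


def parse_sections(text):
    """Return a list of (section_name, {key: value}) pairs in file order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif ":" in line and sections:
            key, value = line.split(":", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def db_node_hosts(sections):
    """Map each [DB] node id to (hostname, FileSystemPath)."""
    computers = {s["Id"]: s["HostName"] for name, s in sections if name == "COMPUTER"}
    return {
        s["Id"]: (computers[s["ExecuteOnComputer"]], s["FileSystemPath"])
        for name, s in sections
        if name == "DB"
    }


for node_id, (host, path) in sorted(db_node_hosts(parse_sections(CONFIG)).items()):
    print("db node", node_id, "->", host, path)
```

For the two-machine setup, only the HostName of the COMPUTER with id 2 would change, so the output makes it easy to see which node is expected on the remote host.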
Regarding 2 db nodes on different machines, I'm stuck with node #3 not starting (it stops at phase 1, without exiting...). The only difference in the mgm config.ini is the hostname of the COMPUTER with id #2.
any clue?




