13 messages in com.mysql.lists.cluster
Max number of open files exceeded / E...

| From | Sent On | Attachments |
|---|---|---|
| Alex Davies | 27 Feb 2005 10:54 | |
| Alex Davies | 27 Feb 2005 13:39 | |
| pek...@mysql.com | 27 Feb 2005 14:56 | |
| Alex Davies | 28 Feb 2005 00:16 | |
| Alex Davies | 28 Feb 2005 08:11 | |
| pek...@mysql.com | 28 Feb 2005 14:06 | |
| Alex Davies | 01 Mar 2005 00:28 | |
| Mikael Ronström | 01 Mar 2005 01:31 | |
| Alex Davies | 01 Mar 2005 01:34 | |
| Alex Davies | 01 Mar 2005 09:43 | |
| Mikael Ronström | 01 Mar 2005 12:12 | |
| Alex Davies | 02 Mar 2005 03:35 | |
| Jonas Oreland | 04 Mar 2005 02:14 | |

| Subject: | Max number of open files exceeded / Error while reading REDO log / I can't start my cluster! |
|---|---|
| From: | Alex Davies (davi...@gmail.com) |
| Date: | 02/27/2005 10:54:40 AM |
| List: | com.mysql.lists.cluster |
Dear All,
I have a three-server cluster that I am trying to restart. For various reasons it got SHUTDOWN (cleanly). When I attempt to restart it I run into all sorts of problems. When I run ndbd on each storage machine, the management server shows them as "starting" for about 5 minutes. It then shows one server up:
[ndbd(NDB)] 2 node(s)
id=2 @81.29.81.196 (Version: 4.1.9, Nodegroup: 0, Master)
id=3 @81.29.81.197 (Version: 4.1.9, starting, Nodegroup: 0)

But after another few minutes, both disconnect:
[ndbd(NDB)] 2 node(s)
id=2 (not connected, accepting connect from 81.29.81.196)
id=3 (not connected, accepting connect from 81.29.81.197)

The errors are:

Server ID 2:
Date/Time: x 27 February 2005 - 18:50:53
Type of error: error
Message: Max number of open files exceeded
Fault ID: 2806
Problem data:
Object of reference: Ndbfs::createAsyncFile
ProgramName: ndbd
ProcessID: 3864
TraceFile: /var/lib/mysql-cluster/ndb_2_trace.log.10
***EOM***

Server ID 3:
Date/Time: x 27 February 2005 - 18:50:57
Type of error: error
Message: Node failed during system restart
Fault ID: 2308
Problem data: Unhandled node failure during restart
Object of reference: NDBCNTR (Line: 1389) 0x0000000e
ProgramName: ndbd
ProcessID: 26476
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.1
***EOM***

Please note I have tried to increase /proc/sys/fs/file-max to avoid the number of open files problem but it has not worked.
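For reference, /proc/sys/fs/file-max is the kernel-wide ceiling on open files, while every process is also bound by its own RLIMIT_NOFILE limit (what `ulimit -n` reports). The sketch below assumes Linux and a Python interpreter on the storage node, and prints both values. The helper names are made up for illustration, and the numbers describe the process running the script (a daemon such as ndbd inherits the limits of whatever starts it). It is only a diagnostic aid and does not establish that either limit is what triggers fault 2806.

```python
import resource
from pathlib import Path

# Linux-only path, as named in the post above.
FILE_MAX = Path("/proc/sys/fs/file-max")


def system_wide_limit(path=FILE_MAX):
    """Return the kernel-wide open-file ceiling, or None if it cannot be read."""
    try:
        return int(path.read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


def per_process_limit():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = per_process_limit()
    print("system-wide file-max:", system_wide_limit())
    print("per-process soft limit:", soft)
    print("per-process hard limit:", hard)
```

If the per-process soft limit is far below file-max, raising file-max alone would not be expected to change what a single process can open.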
Any ideas? Thanks everyone for your help as usual,
Alex
PS - I am starting the server with ID 3 from an empty DataDir with the ndbd --initial command because I got a horrible error with it earlier (below), but I am using plain ndbd on server ID 2 because I need to keep the data on the cluster.
Server 2 error log:
Date/Time: x 27 February 2005 - 18:22:10
Type of error: error
Message: Error while reading the REDO log
Fault ID: 2310
Problem data: Error while reading REDO log. D=8, F=0 Mb=0 FP=1 W1=35 W2=0
Object of reference: DBLQH (Line: 14928) 0x0000000a
ProgramName: ndbd
ProcessID: 25653
TraceFile: /var/lib/mysql-cluster/ndb_3_trace.log.12
***EOM***




