Subject: RE: 1 node in cluster fails hourly: Ndb kernel is stuck in: JobHandling
From: slie...@guesswho.com (slie@guesswho.com)
Date: 10/04/2007 03:57:48 AM
List: com.mysql.lists.cluster

We had a similar issue on our cluster. Even though there is ample RAM, we had a query that took 90 minutes on the cluster but only 30 sec on the standalone SQL. By forcing the index, the query now takes 20 sec or less.

Simon
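The index hint mentioned above can be sketched as follows. This is a minimal illustration, not the query from the thread: the table `orders`, the index `idx_customer`, and the helper `with_force_index` are all hypothetical names. It only builds the statement text, using MySQL's `FORCE INDEX (index_name)` hint placed directly after the table name.

```python
import re

# Plain identifiers only, so a table or index name cannot smuggle in SQL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def with_force_index(columns, table, index, where):
    """Build a SELECT that tells MySQL to use one specific index.

    Hypothetical helper for illustration; all names passed in are assumptions.
    """
    for name in [table, index, *columns]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    cols = ", ".join(columns)
    # The hint goes between the table name and the WHERE clause.
    return f"SELECT {cols} FROM {table} FORCE INDEX ({index}) WHERE {where}"


# The value is left as a %s placeholder for the database driver to bind.
query = with_force_index(
    ["id", "total"], "orders", "idx_customer", "customer_id = %s"
)
print(query)
```

Running `EXPLAIN` on the statement with and without the hint is one way to see whether the optimizer would otherwise have picked a different index.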

-----Original Message-----
From: James Graham [mailto:jam@asperity.co.uk]
Sent: Thursday, October 04, 2007 4:32 AM
To: Stewart Smith
Cc: clus@lists.mysql.com
Subject: RE: 1 node in cluster fails hourly: Ndb kernel is stuck in: JobHandling

The machine is certainly not swapping; there is ample RAM.

Our load averages are typically:

3.76 (1 min) 3.40 (5 mins) 3.26 (15 mins)
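As a rough way to read the figures above: a load average is usually judged against the number of CPU cores, and the thread does not say how many cores these machines have. The sketch below parses the line as quoted and divides by an assumed core count; `parse_load_line`, `per_core`, and the core count of 2 are hypothetical, chosen only for illustration. It says nothing about swapping, which has to be checked separately.

```python
import re


def parse_load_line(line):
    """Turn '3.76 (1 min) 3.40 (5 mins) ...' into {minutes: load}."""
    pairs = re.findall(r"(\d+(?:\.\d+)?)\s*\((\d+)\s*mins?\)", line)
    return {int(minutes): float(load) for load, minutes in pairs}


def per_core(loads, cores):
    """Divide each load figure by the core count (cores is an assumption)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return {window: load / cores for window, load in loads.items()}


loads = parse_load_line("3.76 (1 min) 3.40 (5 mins) 3.26 (15 mins)")
ASSUMED_CORES = 2  # hypothetical; not stated anywhere in the thread
for window, value in sorted(per_core(loads, ASSUMED_CORES).items()):
    print(f"{window:>2} min: {value:.2f} per core")
```

A per-core value that stays well above 1 suggests more runnable work than the CPUs can keep up with; with a different core count the same raw figures would read very differently.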

Sure enough we can optimize queries (FORCE INDEX anyone?) but for now our priority is stability.

Thanks

-----Original Message-----
From: Stewart Smith [mailto:stew@mysql.com]
Sent: 03 October 2007 04:33
To: James Graham
Cc: clus@lists.mysql.com
Subject: Re: 1 node in cluster fails hourly: Ndb kernel is stuck in: JobHandling

On Mon, 2007-10-01 at 09:51 +0100, James Graham wrote:

Since yesterday evening, one of our data nodes has been crashing every hour or so. Our setup is a load balancer running the ndb_mgm, and 2 machines running ndbd/mysqld.

The error is below:

Time: Monday 1 October 2007 - 06:54:03
Status: Temporary error, restart node
Message: WatchDog terminate, internal error or massive overload on the machine running this node (Internal error, programming error or missing error message, please report a bug)
Error: 6050
Error data: Job Handling
Error object: WatchDog.cpp
Program: ndbd
Pid: 4218
Trace: /var/lib/mysql-cluster/ndb_3_trace.log.21
Version: Version 5.0.32
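When a node fails repeatedly, it can help to pull the fields of each error report into a structure so that successive crashes can be compared. The sketch below assumes one `Key: value` field per line, which is a guess at the report layout rather than something the thread confirms; `parse_report` is a hypothetical helper, and the sample text is copied from the report quoted above (the long Message field is left out for brevity).

```python
# Sample copied from the report above; one field per line is an assumption.
REPORT = """\
Time: Monday 1 October 2007 - 06:54:03
Status: Temporary error, restart node
Error: 6050
Error data: Job Handling
Error object: WatchDog.cpp
Program: ndbd
Pid: 4218
Trace: /var/lib/mysql-cluster/ndb_3_trace.log.21
Version: Version 5.0.32
"""


def parse_report(text):
    """Split each 'Key: value' line at the first ': ' into a dict entry."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that have no 'Key: value' shape
            fields[key.strip()] = value.strip()
    return fields


report = parse_report(REPORT)
print(report["Error"], report["Error data"], report["Time"])
```

Splitting at the first `": "` keeps values such as the timestamp and `Version 5.0.32` intact, since the colons inside `06:54:03` are not followed by a space.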

Is the machine overloaded? swapping at all?
