Subject:RE: Replication corruption and 64 bit mysql
From:Matthew Kent (ma@bravenet.com)
Date:06/30/2004 10:07:43 AM
List:com.mysql.lists.mysql

For the record/list archives,

The solution seems to have been upgrading to Fedora Core 2 kernel-smp-2.6.6-1.435.x86_64.rpm. What fix it contained that affected my case... I'm not sure :)

Been running okay for 18 hours at high volume!

- Matt

-----Original Message-----
From: Matthew Kent
Sent: Monday, June 28, 2004 4:11 PM
To: mys@lists.mysql.com
Subject: Replication corruption and 64 bit mysql

After several long days trying to fix this I'm running out of ideas.

Master: RedHat 7.3 kernel 2.4, MySQL 4.0.20 32 bit (mysql.com rpm) -> Slave: Fedora Core 2 64 bit kernel 2.6.5, MySQL-Max-4.0.20-0 64 bit (mysql.com rpm)

After a varying amount of time and a few hundred thousand queries, replication dies with:

<snippy>
040625 16:19:12 Error in Log_event::read_log_event(): 'Event too small', data_len: 0, event_type: 0
040625 16:19:12 Error reading relay log event: slave SQL thread aborted because of I/O error
</snipped>

Using instructions from Sasha Pachev (http://groups.google.ca/groups?hl=en&lr=&ie=UTF-8&selm=c400pk%245pd%241%40FreeBSD.csie.NCTU.edu.tw) I've looked at the binlog on the slave and can indeed verify that it contains a large chunk of empty space, and that the query is indeed logged on the master.
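For anyone finding this in the archives: this is not the procedure from the linked post, just a rough sketch of one way to look for a "large chunk of empty space" in a log file, assuming the empty space shows up as a long run of zero bytes. The function name find_zero_runs, the 4096-byte threshold and the file path in the usage note are all made up for illustration.

```python
def find_zero_runs(path, min_len=4096, chunk_size=1 << 20):
    """Yield (offset, length) for every run of NUL bytes of at least min_len.

    Hypothetical helper: it only looks at raw bytes and knows nothing
    about the MySQL log event format.
    """
    run_start = None  # absolute offset where the current zero run began
    offset = 0        # absolute offset of the start of the current chunk
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            i, n = 0, len(chunk)
            while i < n:
                if run_start is None:
                    # Not in a run: jump to the next NUL byte, if any.
                    j = chunk.find(b"\x00", i)
                    if j == -1:
                        break
                    run_start = offset + j
                    i = j
                else:
                    # In a run: count how many zeros follow position i.
                    rest = chunk[i:]
                    i += len(rest) - len(rest.lstrip(b"\x00"))
                    if i < n:
                        # A non-zero byte ended the run inside this chunk.
                        length = offset + i - run_start
                        if length >= min_len:
                            yield (run_start, length)
                        run_start = None
            offset += n
    # A run that reaches end of file is reported as well.
    if run_start is not None and offset - run_start >= min_len:
        yield (run_start, offset - run_start)
```

It could be pointed at a copy of the slave's relay log, for example `for off, length in find_zero_runs("/tmp/relay-bin-copy"): print(off, length)` (the path is a placeholder). An offset reported here only says where the zero bytes start; matching it to a specific query still means comparing against the master's binlog.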

Fun part is that it does work when I point our 32 bit master to a different 32 bit slave. So I know it's not a problem with our old servers, just this fancy new one.

So far I've

- Tried a different master (we have a pool of 5 similar servers to use as a master).
- Tried 32-bit server instead of 64-bit Max on the slave (couldn't get 64 bit non-Max to start at all, would just dump).
- Tried swapping nic to a different brand.
- Used tcpdump to attempt to spot any network level issues.
- Tried pointing the binlogs on the master to another local disk separate from the data.
- Examined the changelogs for the nic drivers.
- Googled this to no end.

With no luck.

I'm open for suggestions.

I suppose the next step is to install core 2 32-bit and try again.

Thanks,