Jack Howarth, May 30, 2008 9:55 am
Doug Thompson, Jun 2, 2008 11:46 am
Subject: Re: EDAC i5000 MC0: NON-FATAL ERRORS
From: Doug Thompson (nor@yahoo.com)
Date: Jun 2, 2008 11:46:14 am
List: org.kernel.vger.linux-kernel

--- Jack Howarth <howa@bromo.msbb.uc.edu> wrote:

I am seeing the following errors on a Fedora 7 x86_64 Linux box with a Tyan Tempest i5000XL motherboard, after upgrading from kernel-2.6.25-14.fc9.x86_64 to kernel-2.6.25.3-18.fc9.x86_64...

May 25 04:30:56 fourier kernel: EDAC i5000 MC0: NON-FATAL ERRORS Found!!! 1st NON-FATAL Err Reg= 0x10000
May 25 04:30:56 fourier kernel: EDAC MC0: CE row 1, channel 0, label "": (Branch=0 DRAM-Bank=3 RDWR=Read RAS=14339 CAS=672, CE Err=0x10000)

These messages occur about once an hour and are always for the same DRAM-Bank. I have not been able to find any memory errors when running memtest86, with or without ECC checking enabled. Are there any known issues with the EDAC support in the kernel that might cause false positives like this? The errors are always marked as non-fatal and are for reads. Also, does anyone know how this code numbers RAM banks? Is the first RAM bank considered 0 or 1 by this code? Thanks in advance for any clarifications.

Jack
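One way to check that the reports really are always for the same DRAM-Bank is to tally the CE detail lines from the system log. The following is only a rough sketch: the regular expression is modelled on the single log entry quoted above, it assumes each entry sits on one unwrapped line, and the names `CE_DETAIL` and `tally_ce_events` are made up for illustration rather than taken from the EDAC driver or any tool.

```python
import re
from collections import Counter

# Pattern modelled on the one "EDAC MC0: CE row ..." line quoted in this thread.
# Assumption: other kernels or drivers may format this line differently.
CE_DETAIL = re.compile(
    r"EDAC MC(?P<mc>\d+): CE row (?P<row>\d+), channel (?P<channel>\d+),"
    r".*?Branch=(?P<branch>\d+) DRAM-Bank=(?P<bank>\d+) RDWR=(?P<rdwr>\w+)"
    r" RAS=(?P<ras>\d+) CAS=(?P<cas>\d+)"
)


def tally_ce_events(lines):
    """Count corrected-error reports per (controller, row, channel, bank)."""
    counts = Counter()
    for line in lines:
        m = CE_DETAIL.search(line)
        if m:
            key = (int(m["mc"]), int(m["row"]), int(m["channel"]), int(m["bank"]))
            counts[key] += 1
    return counts


# The detail line from the report above, joined onto one line.
SAMPLE = (
    'May 25 04:30:56 fourier kernel: EDAC MC0: CE row 1, channel 0, label "": '
    "(Branch=0 DRAM-Bank=3 RDWR=Read RAS=14339 CAS=672, CE Err=0x10000)"
)

if __name__ == "__main__":
    for key, n in tally_ce_events([SAMPLE]).items():
        print("mc=%d row=%d channel=%d bank=%d count=%d" % (*key, n))
```

To use it on a real log, pass an open file object (for example the syslog file on the affected machine) instead of the one-element sample list. The bank value is reported exactly as the driver prints it, so this does not settle whether numbering starts at 0 or 1.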

Yes, it is a known false positive bug.

The hardware has some type of error which it calls NON-FATAL, and the driver is
TOO verbose in reporting that event. I am working on a patch to quiet that down.

thanks

doug t

W1DUG