368 messages in org.postgresql.pgsql-hackers: [HACKERS] Block-level CRC checks
Subject: [HACKERS] Block-level CRC checks
From: Alvaro Herrera
Date: Sep 30, 2008 11:02:09 am

A customer of ours has been having trouble with corrupted data for some time. Of course, we've almost always blamed hardware (and we've seen RAID controllers have their firmware upgraded, among other actions), but the useful thing to know is when corruption has happened, and where.

So we've been tasked with adding CRCs to data files.

The idea is that these CRCs are checked just after data is read from disk, and calculated just before it is written. They are just a protection against the storage layer going mad; they are not intended to protect against faulty RAM, CPU or kernel.

This code would be run-time or compile-time configurable; I'm not yet sure which. The problem with a run-time setting is what to do if the user restarts the server with the setting flipped. Either way, it would have almost no impact on users who don't enable it.

The implementation I'm envisioning requires the use of a new relation fork to store the per-block CRCs. Initially I'm aiming at a CRC32 sum for each block. FlushBuffer would calculate the checksum and store it in the CRC fork; ReadBuffer_common would read the page, calculate the checksum, and compare it to the one stored in the CRC fork.
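As a rough illustration of the write-then-verify cycle described above, here is a minimal, self-contained sketch. It is not PostgreSQL code: the in-memory arrays `main_fork` and `crc_fork` merely stand in for the data file and the proposed CRC fork, and `flush_block` and `read_block` are hypothetical stand-ins for the roles FlushBuffer and ReadBuffer_common would play. The post does not say which CRC32 polynomial would be used; the sketch assumes the common reflected polynomial 0xEDB88320.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLCKSZ  8192    /* PostgreSQL's default block size */
#define NBLOCKS 4       /* hypothetical tiny relation, for illustration */

/* Stand-ins for the on-disk main fork and the proposed CRC fork. */
unsigned char main_fork[NBLOCKS][BLCKSZ];
uint32_t      crc_fork[NBLOCKS];

/* Bitwise CRC-32 (assumed polynomial: reflected 0xEDB88320). */
uint32_t
page_crc32(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++)
    {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Write path: compute the checksum just before the block is written. */
void
flush_block(unsigned blkno, const unsigned char *buf)
{
    assert(blkno < NBLOCKS);
    crc_fork[blkno] = page_crc32(buf, BLCKSZ);
    memcpy(main_fork[blkno], buf, BLCKSZ);
}

/*
 * Read path: read the block, recompute the checksum, and compare it with
 * the stored one.  Returns 1 if they match, 0 if corruption is detected.
 */
int
read_block(unsigned blkno, unsigned char *buf)
{
    assert(blkno < NBLOCKS);
    memcpy(buf, main_fork[blkno], BLCKSZ);
    return page_crc32(buf, BLCKSZ) == crc_fork[blkno];
}
```

With this sketch, flipping a single bit in `main_fork[n]` after `flush_block(n, ...)` makes the next `read_block(n, ...)` return 0, which is the "when and where" signal the proposal is after.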

A buffer's io_in_progress lock protects the buffer's CRC. We read and pin the CRC page before acquiring the lock, to avoid having two buffer IO operations in flight.
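The ordering above can be sketched with a toy, single-threaded model. Everything here is an assumption made for illustration: `SketchBuffer` is a simplified, hypothetical descriptor (not PostgreSQL's real buffer descriptor), a boolean flag stands in for the io_in_progress lock, and `ios_in_flight` simply counts how many such locks are held at once. The point is only that bringing in and pinning the CRC page first means any I/O needed for it has finished before the data buffer's I/O lock is taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified buffer descriptor used only in this sketch. */
typedef struct SketchBuffer
{
    int  pin_count;         /* number of holders that pinned the buffer */
    bool valid;             /* page contents already loaded from disk? */
    bool io_in_progress;    /* stands in for the io_in_progress lock */
} SketchBuffer;

int ios_in_flight = 0;      /* I/O locks currently held */
int max_ios_in_flight = 0;  /* high-water mark, to check the ordering */

void
start_io(SketchBuffer *buf)
{
    buf->io_in_progress = true;
    ios_in_flight++;
    if (ios_in_flight > max_ios_in_flight)
        max_ios_in_flight = ios_in_flight;
}

void
finish_io(SketchBuffer *buf)
{
    buf->io_in_progress = false;
    ios_in_flight--;
}

/* Pin a buffer, reading it in first if needed (itself an I/O operation). */
void
read_and_pin(SketchBuffer *buf)
{
    if (!buf->valid)
    {
        start_io(buf);
        buf->valid = true;  /* pretend the page was read from disk */
        finish_io(buf);
    }
    buf->pin_count++;
}

void
unpin(SketchBuffer *buf)
{
    assert(buf->pin_count > 0);
    buf->pin_count--;
}

/* Flush a data buffer, following the order described above. */
void
flush_with_crc(SketchBuffer *data, SketchBuffer *crc_page)
{
    read_and_pin(crc_page); /* 1. CRC page is in memory and pinned */
    start_io(data);         /* 2. only now take the data buffer's I/O lock */
    /* 3. compute the CRC, store it in the CRC page, write the data page */
    finish_io(data);        /* 4. release the I/O lock */
    unpin(crc_page);        /* 5. drop the pin on the CRC page */
}
```

If the two steps were swapped, so that the data buffer's I/O lock were taken first and the CRC page read afterwards, the counter in this model would reach 2 whenever the CRC page was not already in memory.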

I'd like to submit this for 8.4, but I want to ensure that -hackers at large approve of this feature before starting serious coding.