Subject: OS support for fault tolerance
From: Maninya M (mani@gmail.com)
Date: Feb 14, 2012 6:23:20 am
List: org.freebsd.freebsd-hackers

On a multicore desktop computer, if one of the cores fails, FreeBSD crashes. My question is how I can make the OS tolerate this hardware fault. The strategy I have in mind is to checkpoint the state of each core to main memory at fixed intervals. When a core fails, its last saved state is retrieved from main memory, and the processes that were running on it are rescheduled on the remaining cores.
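
To make the idea concrete, here is a toy user-space simulation of the checkpoint-and-reschedule strategy in C. It is only a sketch under heavy assumptions, not FreeBSD kernel code: every type and function name in it (`struct machine`, `take_checkpoint`, `recover_core`, and so on) is hypothetical, a task's "state" is reduced to a single counter, and a failure is assumed to be detected cleanly. A real kernel would also have to deal with registers, FPU state, kernel stacks, held locks, and interrupts in flight.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NCORES    4
#define MAX_TASKS 8   /* per-core run-queue capacity (hypothetical) */

/* Hypothetical, greatly simplified task state. */
struct task {
    int  id;
    long progress;    /* stand-in for the task's saved execution state */
};

struct core {
    bool        alive;
    size_t      ntasks;
    struct task runq[MAX_TASKS];
};

struct machine {
    struct core live[NCORES];        /* state the cores are mutating */
    struct core checkpoint[NCORES];  /* last snapshot kept in "main memory" */
};

void machine_init(struct machine *m)
{
    memset(m, 0, sizeof *m);
    for (int i = 0; i < NCORES; i++)
        m->live[i].alive = m->checkpoint[i].alive = true;
}

/* Queue a new task on a core; fails if the core is dead or full. */
bool add_task(struct machine *m, int core, int id)
{
    assert(core >= 0 && core < NCORES);
    struct core *c = &m->live[core];
    if (!c->alive || c->ntasks == MAX_TASKS)
        return false;
    c->runq[c->ntasks++] = (struct task){ .id = id, .progress = 0 };
    return true;
}

/* One scheduler tick: every task on every live core makes one step. */
void run_tick(struct machine *m)
{
    for (int i = 0; i < NCORES; i++)
        for (size_t t = 0; m->live[i].alive && t < m->live[i].ntasks; t++)
            m->live[i].runq[t].progress++;
}

/* Periodic checkpoint: snapshot the state of every live core. */
void take_checkpoint(struct machine *m)
{
    for (int i = 0; i < NCORES; i++)
        if (m->live[i].alive)
            m->checkpoint[i] = m->live[i];
}

/*
 * Handle the failure of one core: discard its live state, then hand the
 * tasks from its last checkpoint to the surviving cores round-robin.
 * Returns the number of tasks migrated, or -1 if no surviving core has
 * room (tasks placed before that point stay where they were put).
 */
int recover_core(struct machine *m, int failed)
{
    assert(failed >= 0 && failed < NCORES);
    struct core *snap = &m->checkpoint[failed];
    int migrated = 0;
    int target = failed;

    m->live[failed].alive = false;
    m->live[failed].ntasks = 0;   /* work since the checkpoint is lost */

    for (size_t i = 0; i < snap->ntasks; i++) {
        int tries;
        for (tries = 0; tries < NCORES; tries++) {
            target = (target + 1) % NCORES;
            if (m->live[target].alive && m->live[target].ntasks < MAX_TASKS)
                break;
        }
        if (tries == NCORES)
            return -1;
        m->live[target].runq[m->live[target].ntasks++] = snap->runq[i];
        migrated++;
    }
    snap->ntasks = 0;
    snap->alive = false;
    return migrated;
}
```

In this sketch, tasks recovered from a failed core resume from their checkpointed `progress`, so anything they did after the last call to `take_checkpoint` is redone, while tasks on the surviving cores keep their live state. Whether that kind of partial rollback can be made consistent in a real kernel is exactly the part I do not know.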

I have read that on large servers the OS can tolerate such faults. I need to make it do the same on a desktop. I assume I would have to change the scheduler. I am using FreeBSD 9.0 on an Intel Core i5 quad-core machine. How do I go about doing this? What exactly do I need to save as the "state" of a core? What else do I need to know? I have absolutely no experience with kernel programming or with FreeBSD. Any pointers to good sources on modifying the FreeBSD source code would be greatly appreciated.