Subject: Re: [Xen-devel] Re: [PATCH] SMP dom0 boot fix
From: Ky Srinivasan (ksri@novell.com)
Date: 10/28/2005 10:26:48 AM
List: com.xensource.lists.xen-devel

Thanks, Keir. With this fix applied, I am able to boot SMP dom0. The box I am testing on is an x86_64 machine with two hardware threads. However, if I turn on SMT in the Linux configuration, the kernel takes a fault in early startup (a NULL pointer dereference at find_busiest_group+144). It appears that the sched domain hierarchy is not set up correctly here. Looking at the new smpboot.c, is turning on SMT support no longer valid?

K. Y

Keir Fraser <Keir@cl.cam.ac.uk> 10/28/05 11:45 am >>>

On 28 Oct 2005, at 16:15, Keir Fraser wrote:

On 28 Oct 2005, at 15:59, Ryan Harper wrote:

At this point send_IPI_allbutself() has been invoked and the system just sits and waits on CPU1 to run the function. But, CPU1's evtchn_upcall_mask was set (1), so I'm guessing the pending interrupt is never acknowledged.

Okay, the good news is that's the same bug I was able to repro last week. Turns out that CPU1's upcall mask is getting weirdly set under its feet. Since it's waiting on the big kernel lock, which is held by CPU0, which is waiting for acknowledgement of an interrupt in CPU1, we have a deadlock.
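
The circular wait described above can be sketched as a small wait-for graph. This is only an illustrative model, not kernel code: the node names and the helpers build_wait_graph and find_cycle are made up for this sketch, and it assumes that a CPU spinning on the lock would still service the interrupt if its upcall mask were clear.

```python
def build_wait_graph(cpu1_upcall_masked):
    """Return a wait-for graph: an edge A -> B means A cannot progress until B does.

    Node names are illustrative labels, not kernel identifiers.
    """
    return {
        "CPU0": ["IPI ack from CPU1"],    # CPU0 waits for CPU1 to run the function
        # Assumption: with the mask clear, CPU1 takes the upcall even while it
        # spins on the lock, so the ack depends on nothing further.
        "IPI ack from CPU1": ["CPU1"] if cpu1_upcall_masked else [],
        "CPU1": ["big kernel lock"],      # CPU1 is waiting on the lock
        "big kernel lock": ["CPU0"],      # the lock is held by CPU0
    }


def find_cycle(graph, start):
    """Follow the single outgoing edge of each node; return the path if it loops back to start."""
    path = [start]
    seen = {start}
    node = start
    while graph.get(node):
        node = graph[node][0]
        if node == start:
            return path + [start]
        if node in seen:
            return None  # a loop that does not pass through start
        seen.add(node)
        path.append(node)
    return None


for masked in (True, False):
    cycle = find_cycle(build_wait_graph(masked), "CPU0")
    state = "masked" if masked else "unmasked"
    print(state, "->", " -> ".join(cycle) if cycle else "no cycle")
```

In this model the cycle exists only while CPU1's mask is set, which matches the observation that the stray mask value is what turns an ordinary wait into a hang.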

Given the problem is in that one changeset, this can't be hard to track down now.

Now fixed in our staging tree. sizeof_vcpu_shift in arch/xen/x86_64/xen_entry.S should be 4, not 3.
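
To see why an off-by-one shift would only bite on the second CPU, here is a rough arithmetic sketch. It assumes the per-VCPU records are packed back to back and are 16 bytes each, which is inferred from the corrected shift of 4 rather than stated in this thread; record_offset and owner_of are hypothetical helper names.

```python
# Assumption: each per-VCPU record is 16 bytes (2**4), as the corrected shift implies.
RECORD_SIZE = 16


def record_offset(cpu, shift):
    """Byte offset produced by shift-based indexing (cpu << shift)."""
    return cpu << shift


def owner_of(offset):
    """Index of the CPU whose record actually contains this byte offset."""
    return offset // RECORD_SIZE


for shift in (3, 4):
    for cpu in (0, 1):
        off = record_offset(cpu, shift)
        print(f"shift={shift} cpu={cpu}: offset {off} lies in CPU{owner_of(off)}'s record")
```

Under these assumptions, CPU0 computes offset 0 with either shift, while CPU1 with a shift of 3 computes offset 8, which still lies inside CPU0's record. That would be consistent with a problem that only shows up on SMP boot.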

-- Keir