Subject: RE: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF
From: Magenheimer, Dan (HP Labs Fort Collins) (dan.@hp.com)
Date: 09/19/2005 04:33:15 PM
List: com.xensource.lists.xen-ia64-devel

FYI, I tried this patch and 'xend start' freezes the system, requiring a reboot.

-----Original Message-----
From: Tian, Kevin [mailto:kevi@intel.com]
Sent: Monday, September 19, 2005 8:18 AM
To: Tian, Kevin; Magenheimer, Dan (HP Labs Fort Collins)
Cc: xen-@lists.xensource.com
Subject: RE: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF

I have now traced the issue to the event injection mechanism. Under some circumstances, evtchn_upcall_pending and some evtchn_pending bits are set while the related irr bit is cleared. Once this condition occurs, later event notifications always fail to set the irr bit. The cause is that the guest may itself re-raise an event it ignored earlier, and that path never touches irr. With the following rough patch I can see events injected into the guest, but I then see nested event injection and a deadlock, so this needs a bit more investigation. If this mechanism can be made to work reliably again, we may see something interesting happen, since previous progress was driven entirely by "xm console" rather than by events.

[Xen]
diff -r 55bc6698c889 xen/arch/ia64/xen/domain.c
--- a/xen/arch/ia64/xen/domain.c Thu Sep 15 00:00:23 2005
+++ b/xen/arch/ia64/xen/domain.c Mon Sep 19 22:02:27 2005
@@ -916,7 +916,7 @@
 #endif

     /* Mask all upcalls... */
-    for ( i = 0; i < MAX_VIRT_CPUS; i++ )
+    for ( i = 1; i < MAX_VIRT_CPUS; i++ )
         d->shared_info->vcpu_data[i].evtchn_upcall_mask = 1;

 #ifdef CONFIG_VTI
diff -r 55bc6698c889 xen/arch/ia64/xen/vcpu.c
--- a/xen/arch/ia64/xen/vcpu.c Thu Sep 15 00:00:23 2005
+++ b/xen/arch/ia64/xen/vcpu.c Mon Sep 19 22:02:27 2005
@@ -21,6 +21,7 @@
 #include <asm/processor.h>
 #include <asm/delay.h>
 #include <asm/vmx_vcpu.h>
+#include <xen/event.h>

 typedef union {
     struct ia64_psr ia64_psr;
@@ -631,6 +632,16 @@
 {
     UINT64 *p, *q, *r, bits, bitnum, mask, i, vector;

+    /* Always check pending event, since guest may just ack the
+     * event injection without handle. Later guest may throw out
+     * the event itself.
+     */
+    if (event_pending(vcpu) &&
+        !test_bit(vcpu->vcpu_info->arch.evtchn_vector,
+                  &PSCBX(vcpu, insvc[0])))
+        test_and_set_bit(vcpu->vcpu_info->arch.evtchn_vector,
+                         &PSCBX(vcpu, irr[0]));
+
     p = &PSCBX(vcpu,irr[3]);
     /* q = &PSCB(vcpu,delivery_mask[3]); */
     r = &PSCBX(vcpu,insvc[3]);

[XENO]
diff -r c9522a6d03a8 drivers/xen/core/evtchn_ia64.c
--- a/drivers/xen/core/evtchn_ia64.c Wed Sep 14 23:35:51 2005
+++ b/drivers/xen/core/evtchn_ia64.c Mon Sep 19 22:04:04 2005
@@ -81,6 +81,7 @@
     shared_info_t *s = HYPERVISOR_shared_info;
     vcpu_info_t *vcpu_info = &s->vcpu_data[smp_processor_id()];

+    vcpu_info->evtchn_upcall_mask = 1;
     vcpu_info->evtchn_upcall_pending = 0;

     /* NB. No need for a barrier here -- XCHG is a barrier on x86. */
@@ -107,6 +108,7 @@
             }
         }
     }
+    vcpu_info->evtchn_upcall_mask = 0;
     return IRQ_HANDLED;
 }
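
[Editor's illustration, not part of the original message.] The stuck state described above (event pending, irr bit clear) and the recheck added in vcpu.c can be modelled with a small standalone sketch. Everything here is hypothetical: the toy_ names, the single 64-bit irr/insvc words, and the assumption that the normal notification path raises irr only when the pending flag goes from 0 to 1. It is a simplified model of the idea, not the Xen/ia64 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified per-vcpu state; not the real Xen/ia64 layout. */
struct toy_vcpu {
    bool     upcall_pending;  /* stands in for evtchn_upcall_pending */
    unsigned evtchn_vector;   /* vector used for event delivery, 0..63 */
    uint64_t irr;             /* requested interrupts, one bit per vector */
    uint64_t insvc;           /* in-service interrupts, one bit per vector */
};

/* Assumed notification path: raises irr only on a 0 -> 1 pending transition. */
static void toy_notify(struct toy_vcpu *v)
{
    assert(v->evtchn_vector < 64);
    if (!v->upcall_pending) {
        v->upcall_pending = true;
        v->irr |= UINT64_C(1) << v->evtchn_vector;
    }
}

/* Guest acknowledges the interrupt without handling the event:
 * the irr bit goes away but the pending flag stays set. */
static void toy_guest_ack_without_handling(struct toy_vcpu *v)
{
    v->irr &= ~(UINT64_C(1) << v->evtchn_vector);
}

/* Recheck in the spirit of the patch: if an event is pending and its
 * vector is not in service, make sure the irr bit is set again. */
static void toy_recheck_pending(struct toy_vcpu *v)
{
    uint64_t bit = UINT64_C(1) << v->evtchn_vector;
    if (v->upcall_pending && !(v->insvc & bit))
        v->irr |= bit;
}

/* True if the event vector is currently requested in irr. */
static bool toy_irq_requested(const struct toy_vcpu *v)
{
    return (v->irr >> v->evtchn_vector) & 1;
}
```

In this model, calling toy_notify() again after toy_guest_ack_without_handling() leaves irr clear, because the pending flag is already set; only toy_recheck_pending() restores the request.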

Thanks, Kevin

-----Original Message-----
From: xen-@lists.xensource.com [mailto:xen-@lists.xensource.com] On Behalf Of Tian, Kevin
Sent: September 16, 2005 8:45
To: Magenheimer, Dan (HP Labs Fort Collins)
Cc: xen-@lists.xensource.com
Subject: [Xen-ia64-devel] RE: Latest status about multiple domains on XEN/IPF

Yeah, it seems we're on the same page now. I suspect the console issue may also be the cause of the blkfront connection failure, since unwanted delay may cause a timeout. Still need more investigation. ;-(

Thanks, Kevin

-----Original Message-----
From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.@hp.com]
Sent: September 16, 2005 3:24
To: Tian, Kevin
Cc: xen-@lists.xensource.com
Subject: RE: Latest status about multiple domains on XEN/IPF

I got it all built with all the patches. I am now able to run xend. But when I do "xm create" I just get as far as:

xen-event-channel using irq 233
store-evtchn = 1

and then the 0+1+01 (etc) debug output.

Wait... I tried launching another domain and got further. Or I guess this is just delayed console output from the first "xm create"?

It gets as far as:

Xen virtual console successfully installed as tty0
Event-channel device installed.
xen_blk: Initialising virtual block device driver

and then nothing else.

So I tried launching some more domains (with name=xxx). Now I get as far as the kernel unable-to-mount-root panic.

It's hard to tell what is working because of the console problems (that I see you have posted a question about on xen-devel).

-----Original Message-----
From: Tian, Kevin [mailto:kevi@intel.com]
Sent: Thursday, September 15, 2005 6:32 AM
To: Magenheimer, Dan (HP Labs Fort Collins)
Cc: ipf-xen
Subject: RE: Latest status about multiple domains on XEN/IPF

Hi, Dan,

Attached is the updated xeno patch (the xen patch is unchanged); it contains no functional enhancement. Some Makefile changes are required to build the latest xenolinux.hg, though they are a bit ugly. ;-) Together with another patch I sent to the mailing list to fix the domU crash (it took most of my day), I hope you can reach the same point as I have: blkfront fails to connect to xenstore, and the kernel panics mounting the root fs.

Thanks, Kevin

-----Original Message-----
From: Magenheimer, Dan (HP Labs Fort Collins) [mailto:dan.@hp.com]
Sent: September 15, 2005 12:05
To: Tian, Kevin
Cc: ipf-xen
Subject: RE: Latest status about multiple domains on XEN/IPF

Thanks for the comments. When I sent out the patch, I didn't mean it as the final one, just something for you to continue debugging with. So the style is a bit messy, and most of your comments regarding coding style are correct. In any case, I will be more careful next time, even when sending out a temporary patch.

Oh, OK. I didn't realize it was a "continue debug" patch.

I haven't seen any machine crashes, but I am both running on a different machine and exercising it differently. If you have any test to reproduce it, please let me know. I have noticed that running "hg clone" seems to reproducibly cause a segmentation fault... I haven't had any time to try to track this down. (I think Intel has better hardware debugging capabilities... perhaps if you can reproduce this, someone on the Intel team can track it down?)

I see the crash while domU is executing. Actually, if only dom0 is up, it can run safely for several days.

OK. Yes, I have seen dom0 stay up for many days too; that's why I was concerned if it was crashing.

When I last tried, I wasn't able to get xend to run (lots of python errors). It looks like you have gotten it to run?

Could it be due to the Python version? The default Python version on EL3 is 2.2, and with it we saw many Python errors before. Now we're using 2.4.1.

I am using 2.3.5 but that has always worked before.

One more question. Did you try xend with all my patches applied? Without the change to do_memory_ops, which is explained below, xend doesn't start, since its memory reservation request will fail.

I bet that is the problem. I haven't tried it since receiving your patch and will try it again tomorrow.

3) In privcmd.c (other than the same comment about ifdef'ing every change), why did you change the direct_remap_... --> remap__... define back? Was it incorrect or just a style change? Again, I am trying to change the patches to something that will likely be more acceptable upstream and I think we will be able to move this simple define into an asm header file. If my change to your patch is broken, please let me know.

But as you may note, the two functions require different parameters: one takes an mm_struct and the other a vma. So your previous change is incorrect.

No, I missed that difference entirely! Good catch!
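
[Editor's illustration, not part of the original message.] The point about the define can be shown with a toy sketch: when one function takes an mm-like object and the other a vma-like object, a bare `#define old_name new_name` alias passes the wrong type, and a small adapter is needed instead. All types and function names below are hypothetical stand-ins; the real kernel functions are elided in the thread and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct toy_mm  { unsigned long last_addr; };
struct toy_vma { struct toy_mm *vm_mm; };

/* Toy function keyed on the address space (mm). */
static int toy_remap_by_mm(struct toy_mm *mm, unsigned long addr)
{
    mm->last_addr = addr;   /* record the request; the real work is omitted */
    return 0;
}

/* Adapter for callers that only hold a vma: it extracts the mm explicitly.
 * A plain "#define toy_remap_by_vma toy_remap_by_mm" would instead pass a
 * struct toy_vma * where a struct toy_mm * is expected. */
static int toy_remap_by_vma(struct toy_vma *vma, unsigned long addr)
{
    assert(vma != NULL && vma->vm_mm != NULL);
    return toy_remap_by_mm(vma->vm_mm, addr);
}
```

A wrapper of this kind could still live in an asm header file, which keeps the call sites free of ifdefs while respecting the different argument types.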

6) I will add your patch to hypercall.c (in the hypervisor). But the comment immediately preceding concerns me... are reservations implemented or not? (I think not, unless maybe they are only in VTI?)

No, neither handles the reservation. The issue is that nr_extents is no longer a top-level parameter that the previous code could simply retrieve from pt_regs. It is now a sub-field of a new reservation structure, and that structure is the only parameter passed in. So I had to add the above logic to get nr_extents and return the result that the caller wants.
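
[Editor's illustration, not part of the original message.] A minimal sketch of the interface change described in the reply above, with hypothetical names throughout: the old style receives the extent count directly as an argument, while the new style receives only a pointer to a reservation structure and must read nr_extents from it. Returning the full count, as if every extent had been handled, is an assumption about what "the result that the caller wants" means here; the struct is reduced to the one field the thread mentions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced reservation structure; other fields are omitted. */
struct toy_reservation {
    unsigned long nr_extents;
};

/* Old style: the count arrives as a top-level argument. */
static long toy_memory_op_old(unsigned long nr_extents)
{
    return (long)nr_extents;    /* pretend all extents were handled */
}

/* New style: the structure is the only parameter, so the count has to be
 * pulled out of it before the same result can be returned. */
static long toy_memory_op_new(const struct toy_reservation *res)
{
    assert(res != NULL);
    return toy_memory_op_old(res->nr_extents);
}
```

In a real hypervisor the structure would first have to be copied in from guest memory; that step is left out of this sketch.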

OK.

If you have an updated patch by the end of your day, please send it and I will try it out tomorrow.