Subject: RE: [Xen-ia64-devel] [PATCH] fully virtualize psr and ipsr on non-VTI domain
From: Shuji Kurihara (kuri@mxn.nes.nec.co.jp)
Date: 03/30/2006 06:34:23 AM
List: com.xensource.lists.xen-ia64-devel

Hi Dan,

Some time ago, you reported that Anthony's "fully virtualize psr and ipsr" patch caused the system to crash during Linux compilation. Applying the vcpu_translate_patch appears to fix this panic. However, the original gcc segmentation fault problem still occurs on this system and remains to be solved. Details below.

psr.patch: http://lists.xensource.com/archives/html/xen-ia64-devel/2005-11/msg00312.html

vcpu_translate.patch: http://lists.xensource.com/archives/html/xen-ia64-devel/2006-03/msg00328.html

The panic occurs while handling a TLB miss that follows an "itc" instruction. The console output is below:

(XEN) vcpu_translate: bad physical address: a00000010000a090
(XEN) translate_domain_pte: bad mpa=000000010000a090 (> 0000000018000000), vadr=a00000010000a090,pteval=001000010000a761,itir=0000000000000038
(XEN) lookup_domain_mpa: bad mpa 000000010000a090 (> 0000000018000000
(XEN) handle_op: can't handle privop at 0xa00000010000a090 (op=0x000001a7a7a7a7a7) slot 0 (type=5), ipsr=0000101208026010
(XEN) priv_emulate: priv_handle_op fails, isr=0000000000000000
(XEN) $$$$$ PANIC in domain 1 (k6=f000000007f98000): psr.dt off, trying to deliver nested dtlb!
(XEN)
(XEN) CPU 0
(XEN) psr : 0000101208026010 ifs : 800000000000040e ip : [<a00000010000a090>]
(XEN) ip is at ???
(XEN) unat: 0000000000000000 pfs : c00000000000040e rsc : 000000000000000f
(XEN) rnat: 0000000000000000 bsps: 60000fff7fffc160 pr : 000000000555a261
(XEN) ldrs: 0000000000700000 ccv : 0010000001c585a1 fpsr: 0009804c8a70033f
(XEN) csd : 0000000000000000 ssd : 0000000000000000
(XEN) b0 : a00000010000a070 b6 : 20000000001f8780 b7 : 0000000000000000
(XEN) f6 : 000000000000000000000 f7 : 000000000000000000000
(XEN) f8 : 000000000000000000000 f9 : 000000000000000000000
(XEN) f10 : 000000000000000000000 f11 : 000000000000000000000
(XEN) r1 : 60000000000021f0 r2 : 0000000000000000 r3 : 0000000000000308
(XEN) r8 : 0000000000000000 r9 : 20000000002c64a0 r10 : 0000000000000000
(XEN) r11 : c00000000000040e r12 : 60000fffffaa7610 r13 : 20000000002d06a0
(XEN) r14 : 0000000000000030 r15 : 6000000000100000 r16 : 6000000000100000
(XEN) r17 : 0000000001bf4200 r18 : 0010000001c585a1 r19 : 0001800000000040
(XEN) r20 : 000000001613c000 r21 : 0000000000000000 r22 : 5fffff0000000000
(XEN) r23 : 000000001613c000 r24 : 0000000000000038 r25 : 0010000001c585e1
(XEN) r26 : 0010000001c585a1 r27 : 0000000000000038 r28 : 0000000000000000
(XEN) r29 : 4000000000001870 r30 : a00000010000a070 r31 : 000000000555a2a1
(XEN) vcpu_translate: bad physical address: 60000fff7fffc1d0
(XEN) translate_domain_pte: bad mpa=00000fff7fffc1d0 (> 0000000018000000), vadr=60000fff7fffc1d0,pteval=00100fff7fffc761,itir=0000000000000038
(XEN) lookup_domain_mpa: bad mpa 00000fff7fffc1d0 (> 0000000018000000
(XEN) r32 : f0000000f0000000 r33 : f0000000f0000000 r34 : f0000000f0000000
(XEN) r35 : f0000000f0000000 r36 : f0000000f0000000 r37 : f4f4f4f4f4f4f4f4
(XEN) r38 : f4f4f4f4f4f4f4f4 r39 : f4f4f4f4f4f4f4f4 r40 : f4f4f4f4f4f4f4f4
(XEN) r41 : f4f4f4f4f4f4f4f4 r42 : f4f4f4f4f4f4f4f4 r43 : f4f4f4f4f4f4f4f4
(XEN) r44 : f4f4f4f4f4f4f4f4 r45 : f4f4f4f4f4f4f4f4
(XEN) BUG at domain.c:339
(XEN) bad hyperprivop; ignored
(XEN) iim=0, iip=f0000000040203d0
(XEN) bad hyperprivop; ignored
(XEN) iim=0, iip=f0000000040203d0

One of the above messages:

(XEN) vcpu_translate: bad physical address: a00000010000a090

The address "a00000010000a090" points to the instruction below.

a00000010000a090: cb 00 64 00 2e 04 [MMI] (p06) itc.d r25;;

When the VMM tries to fetch the opcode in order to call priv_handle_op(), it seems to trigger a TLB miss, which causes domU to hang. The messages suggest that the domain is in metaphysical mode after executing an "rsm psr.dt" instruction, and that the fault address is in region 5.

This situation is similar to the problem vcpu_translate_patch tries to solve. The patch fixes vcpu_translate() so that the guest OS does not operate in metaphysical mode in such a case.
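The "region 5" reading of the fault address can be checked by hand: on IA-64 the region number is the top three bits (63:61) of a virtual address. The following is a minimal sketch, not Xen code; the helper names ia64_region() and mpa_in_range() are hypothetical, and the range check only mirrors the "(> 0000000018000000)" comparison printed in the log above.

```c
#include <stdint.h>
#include <stdbool.h>

/* IA-64 splits the 64-bit virtual address space into eight regions;
 * the region number is held in address bits 63:61.
 * (Hypothetical helper name, not taken from the Xen source.) */
static inline unsigned ia64_region(uint64_t vaddr)
{
    return (unsigned)(vaddr >> 61);
}

/* Hypothetical helper mirroring the comparison shown in the log:
 * an mpa greater than the printed limit is reported as "bad mpa". */
static inline bool mpa_in_range(uint64_t mpa, uint64_t limit)
{
    return mpa <= limit;
}
```

With the values from the log, ia64_region(0xa00000010000a090) is 5 and ia64_region(0x60000fff7fffc1d0) is 3, while 0x000000010000a090 lies well above the 0x18000000 (384 MB) limit, which is why the lookup is rejected.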

We have run the same test program on Xen 3.0-unstable with CSet#9395 (which includes the vcpu_translate patch), and it ran throughout the weekend without causing any panic. By contrast, the original Xen without the patch crashes within 2 hours. However, the original gcc segmentation faults still occurred on the system, so neither CSet#8671 nor CSet#9395 seems to solve the original problem.

Thanks, Shuji

Hi Anthony --

Since things have stabilized, I decided to give this patch some testing, primarily to see if it might fix the gcc segmentation faults that Fujita and I have been seeing. Without this patch, I am able to compile Linux 20 times on domU; generally 1 or 2 of the compiles fail because of the gcc segfault. With the patch, Xen *crashed* on the sixth Linux compile (first try) and halfway through the first Linux compile (second try). This is on a Tiger4, but I am currently not able to get console output, so I don't have any information about the crash other than that the machine didn't reboot. Could you see if you can reproduce this?

As an aside, turning off the FAST_BREAK, FAST_ACCESS_REFLECT, and FAST_RFI features (which your patch turns off) slowed down the benchmark by about 4%.

-----Original Message-----
From: Xu, Anthony [mailto:anthony.xu@xxxxxxxxx]
Sent: Sunday, November 27, 2005 8:22 PM
To: Magenheimer, Dan (HP Labs Fort Collins)
Cc: xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
Subject: [Xen-ia64-devel] [PATCH] fully virtualize psr and ipsr on non-VTI domain

Dan,

This patch is intended to fully virtualize psr and ipsr on non-VTI domains. The following things are done in this patch:

1. Previously, when the guest read psr, it always got psr.dt, psr.rt and psr.it equal to 1. That is because the HV doesn't restore this information; metaphysical_mode can't represent all of it. I save this information into privregs->vpsr, so the guest can get correct information about dt, rt and it.

2. When the guest reads psr, we should return only the low 32 bits plus bits 35 and 36; previously all bits were returned.

3. When the guest executes rsm or ssm on psr, the HV applies rsm and ssm to some bits of the current psr, which is used by the HV. That is not correct: guest rsm and ssm should affect only the guest psr (that is, regs->ipsr).

4. Guest DCR is used mistakenly. Guest DCR should affect the guest psr when injecting an interruption into the guest, but not the guest ipsr. When injecting an interruption into the guest, the current implementation is:

   Guest ipsr.be = guest dcr.be
   Guest ipsr.pp = guest dcr.pp

   The correct implementation should be:

   Guest psr.be = guest dcr.be
   Guest psr.pp = guest dcr.pp

Because of above modifications, I turn off FAST_RFI, FAST_BREAK and FAST_ACCESS_REFLECT.
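Points 2 and 4 of the list above amount to simple bit manipulation, which may be easier to follow as code. This is only an illustrative sketch and is not taken from the patch: the function names are hypothetical, and the bit positions used for psr.be, psr.pp, dcr.be and dcr.pp are assumptions based on the Itanium architecture manual that should be checked against it.

```c
#include <stdint.h>

/* Point 2: a guest read of psr returns only the low 32 bits
 * plus bits 35 and 36. */
#define PSR_LOW32_MASK  0x00000000ffffffffULL
#define PSR_BITS_35_36  ((1ULL << 35) | (1ULL << 36))

/* Hypothetical helper: filter the saved virtual psr for a guest read. */
static inline uint64_t guest_readable_psr(uint64_t vpsr)
{
    return vpsr & (PSR_LOW32_MASK | PSR_BITS_35_36);
}

/* Point 4: on interruption delivery, the guest psr (not the guest ipsr)
 * takes be and pp from the guest dcr.
 * Bit positions below are assumptions, not copied from the patch. */
#define DCR_PP (1ULL << 0)
#define DCR_BE (1ULL << 1)
#define PSR_BE (1ULL << 1)
#define PSR_PP (1ULL << 21)

/* Hypothetical helper: compute the guest psr used when injecting
 * an interruption, given the guest dcr. */
static inline uint64_t psr_on_interruption(uint64_t psr, uint64_t dcr)
{
    psr &= ~(PSR_BE | PSR_PP);          /* clear the old be/pp values */
    if (dcr & DCR_BE)
        psr |= PSR_BE;                  /* psr.be = dcr.be */
    if (dcr & DCR_PP)
        psr |= PSR_PP;                  /* psr.pp = dcr.pp */
    return psr;
}
```

The saved ipsr is left untouched in this sketch, which is the distinction point 4 draws between the current and the intended behaviour.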

Signed-off-by: Anthony Xu <anthony.xu@xxxxxxxxx>

One question, why do we need to virtualize guest psr.pp and always set guest psr.pp to 1?

Thanks -Anthony