Subject: Re: [Xen-ia64-devel] [PATCH] NEW_TLBFLUSH_CLOCK_PERIOD_SOFTIRQ is not registered.
From: Isaku Yamahata (yama@valinux.co.jp)
Date: 01/29/2007 02:28:42 AM
List: com.xensource.lists.xen-ia64-devel

On Mon, Jan 29, 2007 at 05:58:44PM +0800, Xu, Anthony wrote:

Isaku Yamahata wrote on January 29, 2007 17:21:

On Mon, Jan 29, 2007 at 05:00:11PM +0800, Xu, Anthony wrote: It doesn't optimize NEED_FLUSH itself. The optimization path is executed when NEED_FLUSH returns 0. See flush_vtlb_for_context_switch() @ xen/arch/ia64/xen/domain.c.

When CONFIG_XEN_IA64_TLBFLUSH_CLOCK is not defined, NEED_FLUSH() always returns 1. No optimization. Suppose that CONFIG_XEN_IA64_TLBFLUSH_CLOCK is defined and NEED_FLUSH() returns 0. In that case, we can skip local_vhpt_flush() or local_flush_tlb_all().

Hi Isaku,

Thanks for your explanation.

Suppose that CONFIG_XEN_IA64_TLBFLUSH_CLOCK is defined and NEED_FLUSH() returns 0. In that case, we can skip local_vhpt_flush() or local_flush_tlb_all().

But the skip comes at the cost of new_tlbflush_clock_period calling
vcpu_vhpt_flush.

Anyway, vcpu_vhpt_flush must be called; the only difference is where it is called.

I don't see the benefit of new_tlbflush_clock_period.

I must be missing something.

Can you explain more?

How about the following example? For simplicity, we consider only local_flush_tlb_all(). (A similar argument applies to vcpu_vhpt_flush().)

Suppose domM has two vcpus, vcpu0 and vcpu1. domN has one vcpu, vcpu2.

- case 1
  vcpu0 and vcpu1 are running on the same pcpu.
    vcpu0 runs.
    context switch    <<<< local_flush_tlb_all() is necessary here
    vcpu1 runs.

- case 2
  vcpu0, vcpu1 and vcpu2 are running on the same pcpu.
    vcpu0 runs.
    context switch
    vcpu2 runs.
    vcpu2 issues local_tlb_flush().
    context switch    <<<< local_flush_tlb_all() can be skipped.
    vcpu1 runs.
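The two cases can be modelled with a small timestamp scheme. What follows is only a simplified, single-pcpu sketch, not the Xen code: struct pcpu, pcpu_init(), flush_all(), need_flush(), switch_out(), switch_in() and the two case functions are hypothetical names made up for the illustration, and keeping the clock inside the pcpu struct and ignoring clock wraparound are simplifying assumptions. Domain 0 stands for domM and domain 1 for domN.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative single-pcpu model; every name here is hypothetical, not Xen's. */
enum { NR_DOMAINS = 2, NO_VCPU = -1 };

struct pcpu {
    uint32_t clock;                 /* advances on every full TLB flush */
    uint32_t last_flush;            /* clock value at the last flush on this pcpu */
    int      last_vcpu[NR_DOMAINS]; /* per domain: vcpu that last ran here */
    uint32_t last_ran[NR_DOMAINS];  /* clock value when that vcpu was switched out */
};

void pcpu_init(struct pcpu *p)
{
    p->clock = 1;
    p->last_flush = 0;
    for (int d = 0; d < NR_DOMAINS; d++) {
        p->last_vcpu[d] = NO_VCPU;
        p->last_ran[d] = 0;
    }
}

/* Full flush: advance the clock and stamp the pcpu with the new value. */
void flush_all(struct pcpu *p)
{
    p->last_flush = ++p->clock;
}

/* A flush is needed unless the pcpu was flushed after the stamp was taken.
 * Clock wraparound is deliberately ignored in this sketch. */
bool need_flush(uint32_t cpu_stamp, uint32_t lastuse_stamp)
{
    return cpu_stamp <= lastuse_stamp;
}

/* Remember which vcpu of the domain ran here, and when it stopped. */
void switch_out(struct pcpu *p, int dom, int vcpu)
{
    p->last_vcpu[dom] = vcpu;
    p->last_ran[dom] = p->clock;
}

/* Returns true if switching this vcpu in had to flush. */
bool switch_in(struct pcpu *p, int dom, int vcpu)
{
    int prev = p->last_vcpu[dom];

    if (prev == NO_VCPU || prev == vcpu)
        return false;   /* no sibling vcpu left entries behind */
    if (!need_flush(p->last_flush, p->last_ran[dom]))
        return false;   /* a flush already happened since then: skip */
    flush_all(p);
    return true;
}

/* case 1: vcpu0 then vcpu1, both of domain 0, nothing in between. */
bool case1_flushes(void)
{
    struct pcpu p;

    pcpu_init(&p);
    switch_in(&p, 0, 0);
    switch_out(&p, 0, 0);
    return switch_in(&p, 0, 1);
}

/* case 2: vcpu2 of domain 1 runs in between and flushes by itself. */
bool case2_flushes(void)
{
    struct pcpu p;

    pcpu_init(&p);
    switch_in(&p, 0, 0);
    switch_out(&p, 0, 0);
    switch_in(&p, 1, 2);
    flush_all(&p);          /* the flush issued while vcpu2 is running */
    switch_out(&p, 1, 2);
    return switch_in(&p, 0, 1);
}
```

In this model case1_flushes() returns true (the flush at the context switch is necessary) and case2_flushes() returns false (the flush issued while vcpu2 ran already advanced the pcpu's stamp past vcpu0's, so the context-switch flush is skipped).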

You can confirm its effect with the perf counters tlbflush_clock_cswitch_skip, flush_vtlb_for_context_switch and tlbflush_clock_cswitch_purge. Please note that, without the tlb insert tracking optimization, local_flush_tlb_all() (or vcpu_vhpt_flush()) is called on every grant table unmapping. With the tlb insert tracking optimization they are called much less often, so the tlb flush clock optimization has become less effective than before.

--
yamahata