Subject:Re: [Xen-devel] Calculating real cpu usage of Xen domains correctly!
From:Stephan Diestelhorst (sd3@cam.ac.uk)
Date:02/25/2005 02:31:10 AM
List:com.xensource.lists.xen-devel

1. The guest OS calls HYPERVISOR_block() (thus setting the BLOCKED flag) whenever it wants to yield the processor because it's waiting for an event.
2. This blocking can happen anytime -- including after the guest OS has been running for quite some time.

Both correct!

3. All the "event_pending(prev)" check in __enter_scheduler() is for is to say "whoops, an event arrived in the time between when the guest OS blocked & right now, so I should clear the BLOCKED flag."

This is true as well! This is so the domain can be rescheduled at the scheduler's earliest discretion (possibly immediately).

There is a subtle point here: when we do that check, the domain is actually still running! It will (probably) get descheduled in the scheduler's "do_schedule" function, which is invoked via ops.do_schedule a few lines later in __enter_scheduler().

If these are true, then the original code was correct -- "prev->cpu_time" should be updated during any call to the __enter_scheduler() function, regardless of the state of the BLOCKED flag.

That's what I think too. The domain stays scheduled, regardless of what happens, until the call to do_schedule, so it should have that time accounted to it!
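
To make the accounting argument concrete, here is a toy Python model of it. This is only a sketch, not the actual Xen code: ToyDomain, enter_scheduler() and the integer nanosecond clock are made up for illustration, and only cpu_time, lastschd and the BLOCKED flag are taken from the discussion above.

```python
from dataclasses import dataclass


@dataclass
class ToyDomain:
    """Hypothetical stand-in for a domain; not the real Xen structure."""
    name: str
    cpu_time: int = 0       # accumulated run time, in ns
    lastschd: int = 0       # time the domain was last put on the CPU
    blocked: bool = False   # stands in for the BLOCKED flag


def enter_scheduler(prev: ToyDomain, nxt: ToyDomain, now: int,
                    event_pending: bool = False) -> None:
    """Charge prev for the time it ran, then hand the CPU to nxt."""
    # prev has been on the CPU since lastschd, whether or not it has
    # just blocked, so it is charged unconditionally.
    prev.cpu_time += now - prev.lastschd

    # An event that arrived after the block call clears the flag so the
    # domain can be picked again; this has no effect on the accounting.
    if prev.blocked and event_pending:
        prev.blocked = False

    nxt.lastschd = now


dom_a = ToyDomain("A")
dom_b = ToyDomain("B")

dom_a.lastschd = 0
dom_a.blocked = True                      # A ran for 30 ns, then blocked
enter_scheduler(dom_a, dom_b, now=30)
enter_scheduler(dom_b, dom_a, now=100)    # B ran for the remaining 70 ns

total = dom_a.cpu_time + dom_b.cpu_time
```

In this model the charged times always add up to the elapsed real time on the CPU, which is why skipping the update for a blocked domain would lose time rather than fix the totals.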

Which makes me wonder if something is seriously misbehaving to cause the weird CPU usage totals you're seeing -- like a yield()ed or block()ed domain improperly getting rescheduled immediately, or an improper modification of the prev->lastschd counter, or the "if (prev == next)" optimization [later in __enter_scheduler()] leaves out some crucial accounting, or...?

Indeed, those weird results should never occur. That is, the sum of the relative usage of the domains on one CPU (you don't have those two domains spread across two CPUs, do you?) should be <= 100%. What I mean by that is:

    delta(cpu_time_0)/delta(real_time) + ... + delta(cpu_time_n)/delta(real_time) <= 100%

assuming that all measurements of delta(cpu_time_i) are taken at the same two points in time, t1 and t2.
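
The check above can be sketched in a few lines of Python. This is an illustration under assumptions, not a tool from the Xen tree: relative_usage() and usage_is_plausible() are hypothetical helpers, and the sample numbers are invented. It assumes the cpu_time counters of all domains on one CPU were read at the same two instants t1 and t2.

```python
def relative_usage(cpu_t1, cpu_t2, real_t1, real_t2):
    """Return delta(cpu_time_i) / delta(real_time) for each domain.

    cpu_t1 and cpu_t2 are the per-domain cpu_time readings taken at
    real times real_t1 and real_t2 (same units throughout).
    """
    real_delta = real_t2 - real_t1
    if real_delta <= 0:
        raise ValueError("t2 must be later than t1")
    return [(after - before) / real_delta
            for before, after in zip(cpu_t1, cpu_t2)]


def usage_is_plausible(usages, tolerance=1e-9):
    """True if the usages on one CPU sum to at most 100%."""
    return sum(usages) <= 1.0 + tolerance


# Invented sample: two domains on one CPU, sampled 1000 time units apart.
usages = relative_usage([100, 400], [400, 900], real_t1=1000, real_t2=2000)
ok = usage_is_plausible(usages)

# Invented sample of the "weird" kind: more CPU time charged than elapsed.
weird = relative_usage([0, 0], [700, 600], real_t1=0, real_t2=1000)
weird_ok = usage_is_plausible(weird)
```

If the readings for different domains are taken at slightly different times, the sum can drift above 100% without anything being wrong in the scheduler, so the sampling points matter as much as the arithmetic.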

BTW: Which scheduler are you using?

Cheers, Stephan