Subject: Re: [Xen-devel] suspending a domain in the ngio world
From: Kip Macy (kma@fsmware.com)
Date: 05/15/2004 11:09:40 AM
List: com.xensource.lists.xen-devel

The dd is running in DOM1. The OOM killer is getting run in DOM0. There is clearly a memory leak in the block I/O path.

DOM0 is curly and DOM1 is xen-vm0.

A large amount of memory has already been leaked:

kmacy@curly cat /proc/meminfo
        total:     used:     free:  shared:  buffers:   cached:
Mem:  262565888 205619200  56946688       0  23339008  28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout bs=1024k count=256
==
kmacy@curly cat /proc/meminfo
        total:     used:     free:  shared:  buffers:   cached:
Mem:  262565888 214687744  47878144       0  23339008  28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout count=256 bs=1024k
256+0 records in
256+0 records out
==
kmacy@curly cat /proc/meminfo | head -3
        total:     used:     free:  shared:  buffers:   cached:
Mem:  262565888 223727616  38838272       0  23339008  28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout count=256 bs=1024k
256+0 records in
256+0 records out
==
kmacy@curly cat /proc/meminfo | head -2
        total:     used:     free:  shared:  buffers:   cached:
Mem:  262565888 232873984  29691904       0  23339008  28123136

So ~40MB is leaked for every 1GB transferred.
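
For anyone who wants to check the arithmetic, here is a rough sketch that works the rate out from the "used" column of the four DOM0 samples above. It assumes exactly one 256MB dd run (bs=1024k count=256) happened between consecutive samples, as the transcript suggests; the function and variable names are made up for illustration. With these numbers it comes out at about 9MB per run, or roughly 36MB per 1GB written, which is in line with the ~40MB estimate.

```python
# "used" column (bytes) from the four /proc/meminfo samples taken in DOM0.
used_samples = [205619200, 214687744, 223727616, 232873984]

# Assumption: one dd run of bs=1024k count=256 between consecutive samples.
BYTES_PER_DD_RUN = 256 * 1024 * 1024


def leak_per_run(samples):
    """Growth in used memory between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(samples, samples[1:])]


def leak_per_gib(samples, bytes_per_run=BYTES_PER_DD_RUN):
    """Mean growth per run, scaled up to 1 GiB of data written."""
    deltas = leak_per_run(samples)
    mean_delta = sum(deltas) / len(deltas)
    runs_per_gib = (1024 ** 3) / bytes_per_run
    return mean_delta * runs_per_gib


if __name__ == "__main__":
    for i, delta in enumerate(leak_per_run(used_samples), start=1):
        print(f"run {i}: used grew by {delta} bytes ({delta / 1e6:.2f} MB)")
    per_gib = leak_per_gib(used_samples)
    print(f"approx. {per_gib / 1e6:.1f} MB leaked per GiB written")
```

The "free" column drops by the same amounts while buffers and cached stay constant, so the growth is not page cache.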

I can give you a stack backtrace of the memory allocation failure in DOM0 if you like, but as far as I can tell the horse has long since left the barn at that point.

This is within DOM1 (i.e., not DOM0) right? If so, I guess that doing this 'dd' test within DOM0 doesn't get you similar messages?

This is rather unexpected -- if you could add a stack backtrace to the out-of-memory path in the page allocator (page_alloc.c in Xenolinux) and post me that with the kernel image (vmlinux) then I'll see what I can work out. I guess I haven't tested all that hard so there might be a memory leak.

On a side note - I don't need suspend/restore, I just need coredump and almost immediately after that PTRACE_STOP. So long as I can stop the domain long enough to write out its state I have what I need.