Subject:Re: Process stuck in vmmaps on 8.0-BETA1
From:John Marshall (john@riverwillow.com.au)
Date:Jul 9, 2009 1:52:19 am
List:org.freebsd.freebsd-current

On Thu, 09 Jul 2009, 17:30 +1000, John Marshall wrote:

On Thu, 09 Jul 2009, 10:42 +0400, pluknet wrote:

2009/7/9 John Marshall <john@riverwillow.com.au>:

After upgrading...
- boot new kernel to single-user
- make installworld
- make delete-old
- make delete-old-libs
- mergemaster
- reboot

I re-built a few of my applications. I noticed a problem with ntpd 4.2.4p7. The build was fine, it started fine, but got stuck in vmmaps and I couldn't kill it. Stopping the operating system appears to be the only remedy. I have re-built a few times (starting with 'make distclean') just to make sure.

UID PID PPID CPU PRI NI  VSZ  RSS MWCHAN STAT TT    TIME COMMAND
  0 791    1   0  44  0 4944 4920 vmmaps Ds   ?? 0:00.01 ntpd

Can you post the output of 'procstat -k 791', where 791 is the PID of ntpd? It would also help if you went through all the ddb steps described in the Debugging Deadlocks chapter of the FreeBSD Developers' Handbook.

Here is some procstat output. I'm just rebuilding the kernel with the debugging options enabled - not something I've ever done before.

rwsrv05# procstat 2788
  PID  PPID  PGID   SID  TSID THR LOGIN WCHAN  EMUL          COMM
 2788     1  2788  2788     0   1 john  vmmaps FreeBSD ELF32 ntpd
rwsrv05# procstat -k 2788
  PID    TID COMM TDNAME KSTACK
 2788 100164 ntpd -      mi_switch sleepq_switch sleepq_wait _sleep vm_map_unlock_and_wait vm_map_delete vm_map_fixed vm_mmap mmap syscall Xint0x80_syscall
rwsrv05# procstat -v 2788
  PID      START        END PRT  RES PRES REF SHD FL TP PATH
 2788  0x8048000  0x807e000 r-x   54   60   2   1 CN vn /usr/local/bin/ntpd
 2788  0x807e000  0x8080000 rw-    2    0   1   0 C- vn /usr/local/bin/ntpd
 2788  0x8080000  0x8100000 rw-  128    0   1   0 C- df
 2788 0x2807e000 0x280ab000 r-x   45    0 171  75 CN vn /libexec/ld-elf.so.1
 2788 0x280ab000 0x280ad000 rw-    2    0   1   0 C- vn /libexec/ld-elf.so.1
 2788 0x280ad000 0x280c0000 rw-   19    0   1   0 C- df
 2788 0x280c0000 0x280d7000 r-x   23    0   1   0 CN vn /lib/libm.so.5
 2788 0x280d7000 0x280d8000 r-x    1    0   1   0 CN vn /lib/libm.so.5
 2788 0x280d8000 0x280d9000 rw-    1    0   1   0 C- vn /lib/libm.so.5
 2788 0x280d9000 0x28211000 r-x  312    0   1   0 CN vn /lib/libcrypto.so.5
 2788 0x28211000 0x28212000 r-x    1    0   1   0 CN vn /lib/libcrypto.so.5
 2788 0x28212000 0x2822a000 rw-   24    0   1   0 C- vn /lib/libcrypto.so.5
 2788 0x2822a000 0x2822c000 rw-    2    0   1   0 C- df
 2788 0x2822c000 0x28232000 r-x    6    0   1   0 CN vn /lib/libkvm.so.4
 2788 0x28232000 0x28233000 r-x    1    0   1   0 CN vn /lib/libkvm.so.4
 2788 0x28233000 0x28234000 rw-    1    0   1   0 C- vn /lib/libkvm.so.4
 2788 0x28234000 0x2824c000 r-x   24    0   1   0 CN vn /usr/lib/libelf.so.1
 2788 0x2824c000 0x2824d000 r-x    1    0   1   0 CN vn /usr/lib/libelf.so.1
 2788 0x2824d000 0x2824e000 rw-    1    0   1   0 C- vn /usr/lib/libelf.so.1
 2788 0x2824e000 0x28251000 r-x    3    0  15  10 CN vn /usr/lib/librt.so.1
 2788 0x28251000 0x28252000 r-x    1    0   1   0 CN vn /usr/lib/librt.so.1
 2788 0x28252000 0x28253000 rw-    1    0   1   0 C- vn /usr/lib/librt.so.1
 2788 0x28253000 0x28260000 r-x   13    0   1   0 CN vn /lib/libmd.so.4
 2788 0x28260000 0x28261000 r-x    1    0   1   0 CN vn /lib/libmd.so.4
 2788 0x28261000 0x28262000 rw-    1    0   1   0 C- vn /lib/libmd.so.4
 2788 0x28262000 0x28351000 r-x  239    0   1   0 CN vn /lib/libc.so.7
 2788 0x28351000 0x28352000 r-x    1    0   1   0 CN vn /lib/libc.so.7
 2788 0x28352000 0x28358000 rw-    6    0   1   0 C- vn /lib/libc.so.7
 2788 0x28358000 0x2836e000 rw-   22    0   1   0 C- df
 2788 0x2836e000 0x2837a000 ---    0    0   0   0 -- --
 2788 0x28400000 0x28500000 rw-  256    0   1   0 C- df
 2788 0xbfbe0000 0xbfc00000 rwx   32    0   1   0 C- df
rwsrv05#
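
The KSTACK column from 'procstat -k' lists the innermost frame first, so the trace above shows the thread asleep in vm_map_unlock_and_wait, reached from vm_map_delete via vm_map_fixed during an mmap call. As a reading aid only, here is a minimal sketch that prints the frames one per line in call order (syscall entry first). It assumes the KSTACK text has been copied into a shell variable by hand; kstack_frames is a hypothetical helper name, not part of procstat.

```shell
#!/bin/sh
# kstack_frames: hypothetical helper, not part of procstat(1).
# Argument: the KSTACK column of `procstat -k` (innermost frame first).
# Output: one frame per line, reversed into call order (outermost first).
kstack_frames() {
    printf '%s\n' "$1" | awk '{ for (i = NF; i >= 1; i--) print $i }'
}

# KSTACK text copied from the procstat -k output for pid 2788.
kstack='mi_switch sleepq_switch sleepq_wait _sleep vm_map_unlock_and_wait vm_map_delete vm_map_fixed vm_mmap mmap syscall Xint0x80_syscall'

kstack_frames "$kstack"
```

Read top to bottom, the result runs from the syscall entry down to the point where the thread went to sleep, which is usually the easier direction when comparing against the kernel source.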

OK, now that I've rebuilt the kernel with the debugging options not commented out, I'm getting a number of 'lock order reversal' messages printed on the console: is that normal?

From the Debugging Deadlocks chapter to which I was referred by pluknet (above) it appears that I need to enter 'sysctl debug.kdb.enter=1' or 'sysctl debug.kdb.panic=1' after I get the process into the desired 'stuck' state. If I enter either of those commands, the system reboots. Now *I'm* stuck.