Subject: [mdb-discuss] Why the thread is not switch off ?
From: ma...@bruningsystems.com
Date: Aug 28, 2007 6:14:13 am
List: org.opensolaris.mdb-discuss

Hi Liujun, Ok. Here is the way I would "attack" this problem. First, load kmdb:

# mdb -K

Since this is sparc (based on addresses), you should be able to do this from your console. You may need to add "-F" to the mdb command line. This should work, as you have a second cpu that is not also hung. This will drop you into kmdb. There, put a breakpoint at ssfcp_handle_devices+44c (the location where ssfcp_outstanding_lun_cmds should return):

ssfcp_handle_devices+44c:b

Then, continue:

:c

If you hit the breakpoint, try the next function in the stack. If you don't hit the breakpoint, the code is looping in ssfcp_outstanding_lun_cmds or in something it calls. In that case, put a breakpoint in ssfcp_outstanding_lun_cmds+18 and again continue.

If you hit the breakpoint, start single stepping and, at the same time, look at the C code to see if there is a loop. Basically, if you are not switching, it implies that the stack trace you have is not returning out to user. To single step, you can use :s or :e, or, better, just type '[' or ']'. (Then you don't have to type a carriage return). '[' skips over functions, ']' steps into function calls. I suggest '[' to start with, or you will be there all day. (You may be there all day anyway, but you can save yourself a little time).
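In case it helps to see where the +44c comes from: on sparc a call is followed by one delay-slot instruction, so the callee normally returns to the call site plus 8. The stack shows the call at ssfcp_handle_devices+0x444, hence +44c. Here is a rough sketch that works out the same return-site breakpoint for each caller in the ::findstack output quoted below. The helper names are made up for illustration, and reading "try the next function in the stack" as "break where the next caller up gets control back" is my assumption, not something verified on the system.

```python
# Sketch only: derive kmdb breakpoint commands at the return site of each
# caller in a SPARC stack trace. Helper names are hypothetical.

# A SPARC call is followed by one delay-slot instruction, so the callee
# normally returns to call_site + 8 (two 4-byte instructions).
SPARC_RETURN_DELTA = 8


def return_site(func, call_offset):
    """Return 'symbol+offset' (hex, mdb's default radix) where the callee returns."""
    return f"{func}+{call_offset + SPARC_RETURN_DELTA:x}"


def breakpoint_commands(frames):
    """Build one 'addr:b' command per caller frame, innermost caller first."""
    return [return_site(func, off) + ":b" for func, off in frames]


# (function, call-site offset) copied from the ::findstack -v output,
# innermost caller first.
callers = [
    ("ssfcp_handle_devices", 0x444),
    ("ssfcp_statec_callback", 0x614),
    ("fctl_ulp_statec_cb", 0x250),
    ("taskq_thread", 0x1A4),
]

if __name__ == "__main__":
    for cmd in breakpoint_commands(callers):
        print(cmd)
```

The first command it prints is the breakpoint given above. The rest are the ones to try, one at a time, each time the previous breakpoint is actually hit.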

If you need more detail, let me know. Also, this may be a known problem, so maybe the first thing is to look in the bug listings to see if it's already known.

I hope this helps.

max

liujun wrote:

max,

The stack for that thread is :

2a100371cc0::findstack -v

stack pointer for thread 2a100371cc0: 2a100370d61
[ 000002a100370d61 ktl0+0x48() ]
  000002a100370eb1 ssfcp_outstanding_lun_cmds+0x18(60001918fd8, 0, 60003081a40, 60003590228, 1, 100000)
  000002a100370f61 ssfcp_handle_devices+0x444(12b9ff0, 60001918fd8, 2, 60005c977b8, 1, 60001918ff0)
  000002a100371071 ssfcp_statec_callback+0x614(60005c977b8, 1606d, 2, 60000205000, 600019433a8, 404)
  000002a100371141 fctl_ulp_statec_cb+0x250(1, 2, 6000015d1d8, 6000577d188, 60000106000, ff000000)
  000002a100371201 taskq_thread+0x1a4(60000010f90, 60000010f38, 50000, 5208a1333c14, 2a100371aca, 2a100371ac8)
  000002a1003712d1 thread_start+4(60000010f38, 0, 0, 0, 0, 0)

ssfcp_outstanding_lun_cmds+0x18::dis

ssfcp_handle_ipkt_errors+0x2c4:     mov %i0, %o0
ssfcp_handle_ipkt_errors+0x2c8:     sra %l0, 0, %i0
ssfcp_handle_ipkt_errors+0x2cc:     ret
ssfcp_handle_ipkt_errors+0x2d0:     restore
ssfcp_outstanding_lun_cmds:         save %sp, -0xb0, %sp
ssfcp_outstanding_lun_cmds+4:       ldx [%i0 + 0x20], %i2
ssfcp_outstanding_lun_cmds+8:       cmp %i2, 0
ssfcp_outstanding_lun_cmds+0xc:     be,pn %xcc, +0x7c <ssfcp_outstanding_lun_cmds+0x88>
ssfcp_outstanding_lun_cmds+0x10:    clr %i1
ssfcp_outstanding_lun_cmds+0x14:    mov %i2, %o0
ssfcp_outstanding_lun_cmds+0x18:    call -0x2891ac <mutex_enter>
ssfcp_outstanding_lun_cmds+0x1c:    nop
ssfcp_outstanding_lun_cmds+0x20:    ldx [%i2 + 0x10], %i3
ssfcp_outstanding_lun_cmds+0x24:    cmp %i3, %i1
ssfcp_outstanding_lun_cmds+0x28:    be,pn %xcc, +0x48 <ssfcp_outstanding_lun_cmds+0x70>
ssfcp_outstanding_lun_cmds+0x2c:    nop
ssfcp_outstanding_lun_cmds+0x30:    ld [%i3 + 0x54], %i4
ssfcp_outstanding_lun_cmds+0x34:    cmp %i4, 2
ssfcp_outstanding_lun_cmds+0x38:    be,pn %icc, +0x20 <ssfcp_outstanding_lun_cmds+0x58>
ssfcp_outstanding_lun_cmds+0x3c:    nop
ssfcp_outstanding_lun_cmds+0x40:    ldx [%

This message posted from opensolaris.org