| From | Sent On | Attachments |
|---|---|---|
| Martin Kraemer | Jul 20, 2001 1:50 am | |
| Matt Dillon | Jul 25, 2001 10:11 am | |
| Martin Kraemer | Jul 27, 2001 9:51 am | |
| Brandon D. Valentine | Jul 27, 2001 10:24 am | |
| Justin T. Gibbs | Jul 27, 2001 11:58 am | |
| Martin Kraemer | Jul 30, 2001 1:27 am | |
| Martin Kraemer | Jul 30, 2001 5:30 am | |
| Justin T. Gibbs | Jul 30, 2001 7:21 am | |
| Martin Kraemer | Jul 30, 2001 8:33 am | .dmesg, .pciconf, .messages |
| Martin Kraemer | Jul 30, 2001 8:52 am | |
| Matt Dillon | Jul 30, 2001 9:46 am | |
| Chad R. Larson | Jul 30, 2001 9:59 am | |
| Cy Schubert - ITSD Open Systems Group | Jul 30, 2001 11:10 am | |
| Martin Kraemer | Jul 30, 2001 1:28 pm | |
| Mike Harding | Jul 30, 2001 9:14 pm | |
| Martin Kraemer | Jul 30, 2001 11:44 pm | |
| Michael Sperber [Mr. Preprocessor] | Jul 31, 2001 5:00 am | .dmesg |
| Arno J. Klaassen | Jul 31, 2001 9:48 am | |
| Justin T. Gibbs | Jul 31, 2001 1:00 pm | |
| Michael Sperber [Mr. Preprocessor] | Aug 2, 2001 9:57 am | |

| Subject: | Re: Continuing ahc problems - also cause fxp failure | |
|---|---|---|
| From: | Martin Kraemer (Mart...@Fujitsu-Siemens.com) | |
| Date: | Jul 27, 2001 9:51:43 am | |
| List: | org.freebsd.freebsd-stable | |
On Wed, Jul 25, 2001 at 10:12:12AM -0700, Matt Dillon wrote:
> Hmm. Well, that last conversation seemed to come to a consensus that a known thermal problem with a chip on my DELL motherboard related to heavy use of the on-board adaptec and on-board ethernet might have been the cause. I replaced the motherboard and moved away from the on-board ethernet (threw in another PCI card), and the problem went away.
> I don't know if your problem below is the same problem or a different problem. It sounds like it may be a different problem.
IMO it is quite different, as I changed the following parameters:
* opened the PC to allow free air circulation (*iff* that does anything)
* replaced the on-board 7880UW controller with a PCI AHA-2940UW card. Both offer the same functionality and are made by the same manufacturer, and they also share the same timeout problems.
In my first mail I said I had seen 4.2-STABLE work and 4.3-STABLE fail, but that was not true: the old system was 4.2-RELEASE, and I noticed the error for the first time with 4.3-RELEASE.
So I upgraded to 4.3-STABLE afterwards, with no change. I then got the CVS source tree of dev/aic7xxx/ to see the differences between 4.2-RELEASE and 4.3-RELEASE. The greatest change seems to be in the sequencer code, about which I don't understand very much... In the source file aic7xxx_freebsd.c (that's where ahc_timeout() prints the messages) I see that only little has changed since 4.2-RELEASE: a detach routine was added, but IMO it is only invoked when the device is released completely. In aic7xxx.c, a LOT has changed.
Can the changes in the sequencer code be the reason for the still re-occurring "lost interrupts" on higher load -- or what else can be causing the timeout?
Or can the presence of a second (non-wide) 2940 which is used for my DAT cause any problems of this kind?
Puzzled,
Martin
On-board 7880:
ahc0: <Adaptec aic7880 Ultra SCSI adapter> port 0xf800-0xf8ff mem 0xfedfb000-0xfedfbfff irq 9 at device 6.0 on pci0
ahc0: Using left over BIOS settings
aic7880: Wide Channel A, SCSI Id=15, 16/255 SCBs
da0 at ahc0 bus 0 target 0 lun 0
da0: <IBM DDYS-T18350N S92A> Fixed Direct Access SCSI-3 device
da0: 40.000MB/s transfers (20.000MHz, offset 8, 16bit), Tagged Queueing Enabled
da0: 17501MB (35843670 512 byte sectors: 64H 32S/T 17501C)
Mounting root from ufs:/dev/da0s1a
da1 at ahc0 bus 0 target 1 lun 0
da1: <WDIGTL WDE9100 1.30> Fixed Direct Access SCSI-2 device
da1: 40.000MB/s transfers (20.000MHz, offset 8, 16bit)
da1: 8683MB (17783204 512 byte sectors: 64H 32S/T 8683C)
--
Errors from replacement 2940uw (same as with 7880 originally):
18:54:28 deejai2 /kernel: (da1:ahc1:0:1:0): SCB 0xe - timed out while idle, SEQADDR == 0x177
18:54:30 deejai2 /kernel: STACK == 0x17f, 0x189, 0x0, 0xe
18:54:30 deejai2 /kernel: SXFRCTL0 == 0x80
18:54:30 deejai2 /kernel: ahc1: Dumping Card State at SEQADDR 0x177
18:54:31 deejai2 /kernel: SCSISEQ = 0x12, SBLKCTL = 0x2, SSTAT0 0x5
18:54:31 deejai2 /kernel: SCB count = 140
18:54:32 deejai2 /kernel: Kernel NEXTQSCB = 111
18:54:32 deejai2 /kernel: Card NEXTQSCB = 14
18:54:32 deejai2 /kernel: QINFIFO entries: 14 125 2 22 122 83 64 98
18:54:32 deejai2 /kernel: Waiting Queue entries:
18:54:32 deejai2 /kernel: Disconnected Queue entries:
18:54:32 deejai2 /kernel: QOUTFIFO entries:
18:54:32 deejai2 /kernel: Sequencer Free SCB List: 11 3 12 6 9 4 5 0 2 13 15 14 1 8 7
18:54:32 deejai2 /kernel: Pending list: 98 64 83 122 22 2 125 14
18:54:32 deejai2 /kernel: Kernel Free SCB list: 128 115 20 38 109 11 32 27 107 76 85 108 47 95 35 58 129 60 70 101 96 87 19 66 102 112 10 81 61 59 46 23 65 114 63 50 78 82 30 62 54 86 31 43 8 15 48 25 56 127 113 21 12 105 72 121 28 100 49 103 106 51 6 90 41 84 29 119 74 68 13 17 135 94 5 52 104 123 42 9 24 75 39 73 88 77 53 55 40 97 4 92 33 79 37 18 67 126 16 44 57 0 71 26 1 110 124 36 69 93 117 7 118 34 120 45 3 91 89 80 116 136 137 138 139 99 134 133 132 131 130
18:54:32 deejai2 /kernel: Untagged Q(1): 14
18:54:32 deejai2 /kernel: sg[0] - Addr 0x34b2000 : Length 1024
18:54:32 deejai2 /kernel: (da1:ahc1:0:1:0): SCB 14: Immediate reset. Flags = 0x6040
18:54:32 deejai2 /kernel: (da1:ahc1:0:1:0): no longer in timeout, status = 34b
18:54:32 deejai2 /kernel: ahc1: Issued Channel A Bus Reset. 8 SCBs aborted
-- <Mart...@Fujitsu-Siemens.com> | Fujitsu Siemens Fon: +49-89-636-46021, FAX: +49-89-636-41143 | 81730 Munich, Germany
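
Editorial note: the card-state dump in the message above can be cross-checked mechanically. The following is a minimal Python sketch, not part of the original mail and not a FreeBSD tool. The `parse_lists` helper and the `DUMP` excerpt (syslog prefixes stripped by hand) are illustrative assumptions. It only compares the SCB numbers that the driver printed. Any conclusion about why the commands timed out is outside what the script can show.

```python
import re

# Excerpt of the card-state dump quoted in the message,
# with the "18:54:32 deejai2 /kernel:" prefixes removed by hand.
DUMP = """\
QINFIFO entries: 14 125 2 22 122 83 64 98
Waiting Queue entries:
Disconnected Queue entries:
QOUTFIFO entries:
Pending list: 98 64 83 122 22 2 125 14
Untagged Q(1): 14
"""


def parse_lists(text):
    """Map each 'label: n n n ...' line to a list of SCB numbers (hypothetical helper)."""
    lists = {}
    for line in text.splitlines():
        match = re.match(r"^(.+?):((?:\s+\d+)*)\s*$", line)
        if match:
            label = match.group(1).strip()
            lists[label] = [int(n) for n in match.group(2).split()]
    return lists


state = parse_lists(DUMP)
queued = state["QINFIFO entries"]
pending = state["Pending list"]

# Do both lists name the same SCBs, and is one the reverse of the other?
same_scbs = sorted(queued) == sorted(pending)
reversed_order = queued == pending[::-1]

# Queues that the dump printed with no entries at all.
empty_queues = [name for name, scbs in state.items() if not scbs]

print("pending SCBs:", len(pending))
print("QINFIFO and pending list hold the same SCBs:", same_scbs)
print("QINFIFO is the pending list in reverse order:", reversed_order)
print("empty queues:", ", ".join(empty_queues))
```

For this dump the script reports eight pending SCBs, which matches the "8 SCBs aborted" line, and shows that the QINFIFO holds exactly those eight SCBs while the waiting, disconnected, and QOUTFIFO queues are empty. One possible reading, offered only as an assumption, is that the queued commands were never picked up by the sequencer.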