2 messages in com.mysql.lists.bugs: MySQL 3.23.42 and OpenBSD -stable (2....

| From | Sent On | Attachments |
|---|---|---|
| Toni Mueller | 15 Oct 2001 23:54 | |
| Sasha Pachev | 16 Oct 2001 06:04 | |
| Subject: | MySQL 3.23.42 and OpenBSD -stable (2.9): crashes system! |
|---|---|
| From: | Toni Mueller (supp...@oeko.net) |
| Date: | 10/15/2001 11:54:43 PM |
| List: | com.mysql.lists.bugs |
Hi,
Yesterday I discovered a problem that I don't know how to handle.
The problem seems to involve both MySQL and OpenBSD.
At the risk of calling down the wrath of all serious developers, I must note that I'm running -stable as of 9/9 and MySQL 3.23.42-max from the ports tree. I'll be posting a similar message to the MySQL list...
The way to trigger the problem is to make the MySQL server work, and work hard. Whether this is done with a large number of small queries (I used 1 million inserts of 20 bytes each) or with a single long-running query doesn't appear to matter much. The net result is that the machine freezes and all partitions, including /, need to be fsck'ed. For the / partition, in my case, that means booting from CD, running fsck, and then booting again, so I'm not fond of the idea of going through this process of shooting the system many times. However, running this kind of query load is a requirement for me...
Now to the details of the problem:
When the machine freezes, ping and DNS answers still get through, but nothing else works: no shell, no ssh, nothing. Pressing the reset button recovers the machine; the three-finger salute (I don't know offhand whether I have it enabled) does not. Please note that the only other job on that machine that requires significant CPU time is setiathome.
load averages: 0.06, 0.42, 1.25    14:44:25
81 processes: 1 running, 80 idle
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Memory: Real: 103M/139M act/tot Free: 111M Swap: 4K/1025M used/tot

PID   USERNAME PRI NICE  SIZE   RES STATE WAIT     TIME   CPU COMMAND
4883  seti      -5   20   16M   16M idle  getnew 325.1H 0.00% setiathome
                                    ^^^^^^^^^^^^
28980 mysql     -5    4   94M   44M idle  getnew  12:35 0.00% mysqld
                                    ^^^^^^^^^^^^
4392  dnscache  -5    0 5252K 5732K idle  pipewr   4:00 0.00% dnscache
30178 root     -14    0   68K  356K idle  inode    2:49 0.00% svscan
8135  root      -5    0  356K 1204K idle  getnew   1:27 0.00% sshd
8324  root      -5  -12  440K  704K idle  getnew   1:24 0.00% xntpd
13450 toni       2    0 2868K 4324K sleep select   1:14 0.00% httpd
24674 tinydns    2    0  160K  548K sleep netio    1:11 0.00% tinydns
5166  dnslog    -5    0   48K  376K idle  getnew   0:40 0.00% multilog
18352 tinydns    2    0  160K  548K sleep netio    0:37 0.00% tinydns
6067  root      -5    0   19M   20M idle  getnew   0:29 0.00% squid
22526 root     -14    0  308K  684K idle  inode    0:21 0.00% cron
19413 dnslog    -5    0   48K  368K idle  getnew   0:12 0.00% multilog
18999 toni      28    0  228K  976K run   -        0:11 0.00% top
31429 root       2    0  392K 1508K sleep select   0:06 0.00% sshd
5391  root      -5    0  128K  500K idle  getnew   0:06 0.00% syslogd
1550  dnslog    -5    0   48K  376K idle  getnew   0:05 0.00% multilog
19101 root       2    0   48K  348K idle  netcon   0:04 0.00% tcpserver
This suggests that the machine still has resources left, i.e., it is not running out of swap.
On one shell that was already open, I could see the following:
Uptime: 5879 Threads: 3 Questions: 2138463 Slow queries: 0 Opens: 14 Flush tables: 1 Open tables: 2 Queries per second avg: 363.746
^C^C^C
^^^^^^ This is when I wanted to stop this process, but it would
not work. Before doing any kind of work, mysqladmin shutdown
also doesn't work, and the MySQL folks say if this doesn't work,
there is a problem with the operating system. The server logs
"Normal shutdown" when this is called, but mysqladmin doesn't
terminate itself except with a kill -9. The same goes for the
daemon which is well visible with ps long after it logged
"Normal shutdown". Killing the mysqladmin program also kills
the server, apparently, but I can only see this if I want
to connect to the data base like this: mysql mysql (didn't
try other data bases). Then the client says "Lost connection
to MySQL server during query", and a subsequent
ps ax |grep mysql doesn't report any more mysql processes
running.
The query that hung the server:
mysql> select count(*) from ipacct_cisco; ^C
^C^C
At this time the main (Perl) program ran a tight loop like this:
insert into ipacct_cisco (srcip, dstip, pkt, bytes, acct_time) values (?,?,?,?,?)
with appropriate messages. All fields are int, so the whole record is 20 bytes.
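The 20-byte figure can be checked with a small sketch. It assumes each of the five inserted columns is stored as a 4-byte integer (signedness taken from the table definition later in this message) and ignores the auto-increment id, the index, and any per-row storage overhead. ROW_FORMAT and payload_bytes are illustrative names, not anything from MySQL.

```python
import struct

# Assumed layout: srcip, dstip, pkt, bytes, acct_time as 4-byte integers.
# "I" = unsigned 32-bit, "i" = signed 32-bit, "<" = no padding.
ROW_FORMAT = "<IIiII"

def payload_bytes(rows: int) -> int:
    """Raw payload for `rows` records, without id, index or row overhead."""
    return struct.calcsize(ROW_FORMAT) * rows

print(struct.calcsize(ROW_FORMAT))  # 20
print(payload_bytes(1_000_000))     # 20000000
```

So one million such rows carry roughly 20 MB of raw column data, which supports the point that the data volume itself is modest.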
After this, I had to reboot and manually clean up the root file system. The next round went better in that the machine didn't crash immediately (though it did shortly afterwards, on a different query). Just after my program finished:
$ mysqladmin status
Uptime: 2176 Threads: 2 Questions: 2251847 Slow queries: 2 Opens: 10 Flush
tables: 1 Open tables: 2 Queries per second avg: 1034.856
So you see that I've inserted just over 2 million records (which is not that large a number from a practical viewpoint).
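The "Queries per second avg" figure appears to be simply Questions divided by Uptime. A rough sketch that parses a status line and recomputes it follows; the helper names parse_status and queries_per_second are made up for this illustration and are not part of any MySQL tool.

```python
import re

def parse_status(line: str) -> dict:
    """Split 'Key: value Key two: value ...' into a dict of floats."""
    pairs = re.findall(r"([A-Za-z][A-Za-z ]*?):\s*([0-9.]+)", line)
    return {key.strip(): float(value) for key, value in pairs}

def queries_per_second(line: str) -> float:
    """Recompute the average as Questions / Uptime."""
    status = parse_status(line)
    return status["Questions"] / status["Uptime"]

status_line = ("Uptime: 2176 Threads: 2 Questions: 2251847 Slow queries: 2 "
               "Opens: 10 Flush tables: 1 Open tables: 2 "
               "Queries per second avg: 1034.856")
print(round(queries_per_second(status_line), 3))  # 1034.856
```

The same calculation on the earlier status line (2138463 questions in 5879 seconds) gives 363.746, matching what the server reported.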
This time mysqladmin status was _dog_slow_, if it answered at all, and just connecting to the database from the command line like "$ mysql database" took some 38 seconds to show the 'mysql>' prompt (I already thought it had died, since the machine was otherwise fully responsive this time).
Then I added some nonsense-queries like sorting that one table just to load the system, and voila, a few minutes later when I executed the following query from the command line, the system was fully hosed again:
select distinct srcip from ipacct_cisco group by acct_time, dstip;
This was on the table I had just inserted these 2 million records into. The following is what mysqlbug produces, plus a few more remarks. The mysqladmin status only worked (again) after a fresh start, so there is no load shown in the sample.
Description:
running heavy queries, or sending many queries in a short time (at least) makes the mysql server crash the machine. So it seems.
How-To-Repeat:
don't know, other than placing a high load on the server. Crashes within some two hours...
Fix:
none yet
Submitter-Id: Toni Mueller
Originator: Toni Mueller
Organization:
Oeko.neT Mueller & Brandt
MySQL support: none
Synopsis: MySQL gets stuck in getnew() and takes operating system with it
Severity: serious
Priority: medium
Category: mysql
Class: sw-bug
Release: mysql-3.23.42 (Source distribution)
Server: /usr/local/bin/mysqladmin Ver 8.21 Distrib 3.23.42, for unknown-openbsd2.9 on i386
Copyright (C) 2000 MySQL AB & MySQL Finland AB & TCX DataKonsult AB
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL license

Server version 3.23.42-log
Protocol version 10
Connection Localhost via UNIX socket
UNIX socket /var/mysql/mysql.sock
Uptime: 3 min 31 sec

Threads: 1 Questions: 1 Slow queries: 0 Opens: 6 Flush tables: 1 Open tables: 0 Queries per second avg: 0.005
Environment:
System: OpenBSD baoab.oeko.net 2.9 OEKONET-3#1 i386
Some paths: /usr/bin/perl /usr/bin/make /usr/local/bin/gmake /usr/bin/gcc
/usr/bin/cc
GCC: Reading specs from /usr/lib/gcc-lib/i386-unknown-openbsd2.9/2.95.3/specs
gcc version 2.95.3 20010125 (prerelease)
Compilation info: CC='cc' CFLAGS='-O2 ' CXX='c++' CXXFLAGS='-O2 '
LDFLAGS='-L/usr/local/lib/pth'
LIBC:
-r--r--r-- 1 root bin 724882 Sep 10 01:22 /usr/lib/libc.a
-r--r--r-- 1 root bin 563475 May 14 2000 /usr/lib/libc.so.25.0
-r--r--r-- 1 root bin 589414 Jan 26 2001 /usr/lib/libc.so.25.4
-r--r--r-- 1 root bin 589414 Feb 28 2001 /usr/lib/libc.so.26.0
-r--r--r-- 1 root bin 594040 Sep 10 01:22 /usr/lib/libc.so.26.2
Configure command: ./configure --enable-shared --enable-static
--localstatedir=/var/mysql --with-libwrap=/usr --with-mysqld-user=mysql
--with-unix-socket-path=/var/mysql/mysql.sock --without-perl --without-debug
--without-readline --without-bench --without-mit-threads --without-gemini
--with-berkeley-db --with-innodb --prefix=/usr/local --sysconfdir=/etc
----------------------
The following is the table used in the program:
create table ipacct_cisco (
    id int not null auto_increment,
    srcip int unsigned not null,
    dstip int unsigned not null,
    pkt int not null,
    bytes int unsigned not null,
    acct_time int unsigned not null,
    primary key (id),
    index (srcip,dstip)
);
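The srcip and dstip columns are unsigned integers. Assuming they hold IPv4 addresses packed into 32 bits (the column names suggest this, but the message does not say so), a value can be converted to and from dotted-quad form as sketched below. The sample value is taken from the binary log excerpt further down; int_to_ip and ip_to_int are illustrative helper names.

```python
import ipaddress

def int_to_ip(value: int) -> str:
    """Render an unsigned 32-bit integer as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))

def ip_to_int(text: str) -> int:
    """Pack a dotted-quad IPv4 address into an unsigned 32-bit integer."""
    return int(ipaddress.IPv4Address(text))

print(int_to_ip(2156576851))          # 128.138.192.83
print(ip_to_int("128.138.192.83"))    # 2156576851
```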
----------------------
The following is the core of the script I used to populate the table:
my $stmt = qq{
    insert into $table (srcip, dstip, pkt, bytes, acct_time)
    values (?, ?, ?, ?, ?)
};
# hold a write lock on the table for the whole bulk load
$db->do ("lock tables $table write");
# prepare once, execute many times with different bind values
my $sth = $db->prepare ($stmt)
    or die "$0: can't prepare $stmt: $DBI::errstr\n";
... stuff omitted (how to calculate the variable values below)
loop over the data set, for each row:
$result = $sth->execute ($srcip, $dstip, $pkts, $bytes, $acct_time);
afterwards:
$db->do ("unlock tables");
This should be no rocket science...
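For readers who want to experiment with the same load pattern without the Perl environment, here is a rough sketch of the prepare-once, execute-many loop. It deliberately uses Python's built-in sqlite3 module as a stand-in, so it does not exercise MySQL, table locks, or the OpenBSD threading path at all; the column types are simplified and the row data is synthetic. bulk_insert is a hypothetical helper name.

```python
import sqlite3

def bulk_insert(conn, rows):
    """Insert all rows with one parameterised statement in one transaction."""
    stmt = ("insert into ipacct_cisco (srcip, dstip, pkt, bytes, acct_time) "
            "values (?, ?, ?, ?, ?)")
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(stmt, rows)
    return conn.execute("select count(*) from ipacct_cisco").fetchone()[0]

conn = sqlite3.connect(":memory:")  # stand-in database, not MySQL
conn.execute("""create table ipacct_cisco (
    id integer primary key autoincrement,
    srcip integer not null, dstip integer not null, pkt integer not null,
    bytes integer not null, acct_time integer not null)""")

# Synthetic rows; the real script derived these from accounting data.
rows = ((i, i + 1, 1, 64, 1003089602) for i in range(10_000))
inserted = bulk_insert(conn, rows)
print(inserted)  # 10000
```

Pointing the same loop at a real MySQL server would need a third-party driver, and the placeholder syntax may differ from the "?" style shown here.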
I'm using this for DBI/DBD:
$ pkg_info p5-DBI
Information for p5-DBI-1.20:
Comment: unified perl interface for database access
Required by: p5-DBD-Msql-Mysql-1.22.16 p5-ApacheDBI-0.88
Unfortunately there was little in the logs that sounded "useful", but here it is:
Warning: Got signal 14 from thread 12
I've got about 20 of these, with other thread numbers being 8 and 11.
These are the last two queries out of a > 400MB binlog (hope to delete old binlogs rsn):
# at 410298563
#011015 17:42:22 server id 1 Intvar
SET INSERT_ID = 2251660;
# at 410298585
#011015 17:42:22 server id 1 Query thread_id=9 exec_time=0
error_code=0
SET TIMESTAMP=1003160542;
insert into ipacct_cisco
(srcip, dstip, pkt, bytes, acct_time)
values (2156576851, 3563514937, '69', '14378', '1003089602')
;
# at 410298748
#011015 17:42:22 server id 1 Intvar
SET INSERT_ID = 2251661;
# at 410298770
#011015 17:42:22 server id 1 Query thread_id=9 exec_time=0
error_code=0
SET TIMESTAMP=1003160542;
insert into ipacct_cisco
(srcip, dstip, pkt, bytes, acct_time)
values (3550348820, 3563515134, '18', '1080', '1003089602')
I've cut out the last 5000 lines of it, but didn't find any specifying an error.
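The SET TIMESTAMP values in the binlog are Unix epoch seconds, and decoding them is a quick way to check that the log entries line up with the wall-clock stamps next to them. The sketch below assumes the server clock was on CEST (UTC+2), as the kernel build date in the dmesg suggests, and that acct_time is also an epoch value, which the message does not state. describe is an illustrative helper name.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the server ran on Central European Summer Time (UTC+2).
CEST = timezone(timedelta(hours=2), "CEST")

def describe(epoch: int) -> str:
    """Format epoch seconds as local wall-clock time in the assumed zone."""
    return datetime.fromtimestamp(epoch, CEST).strftime("%Y-%m-%d %H:%M:%S %Z")

print(describe(1003160542))  # 2001-10-15 17:42:22 CEST (binlog SET TIMESTAMP)
print(describe(1003089602))  # 2001-10-14 22:00:02 CEST (acct_time value)
```

The first result matches the "#011015 17:42:22" stamp in the binlog, so the last recorded insert happened at the time the log claims.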
This is my dmesg:
OpenBSD 2.9-stable (OEKONET-3) #1: Sun Sep 9 12:57:43 CEST 2001
to...@baoab.oeko.net:/usr/src/sys/arch/i386/compile/OEKONET-3
cpu0: Intel Pentium III ("GenuineIntel" 686-class, 512KB L2 cache) 551 MHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SYS,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SIMD
real mem = 268005376 (261724K)
avail mem = 245563392 (239808K)
using 3297 buffers containing 13504512 bytes (13188K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(83) BIOS, date 10/19/99, BIOS32 rev. 0 @ 0xf06b0
apm0 at bios0: Power Management spec V1.2 (BIOS mgmt disabled)
apm0: AC on, battery charge unknown
pcibios0 at bios0: rev. 2.1 @ 0xf0000/0xf12
pcibios0: PCI IRQ Routing Table rev. 1.0 @ 0xf0e70/160 (8 entries)
pcibios0: PCI Interrupt Router at 000:04:0 ("Intel 82371FB PCI-ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0x8000 0xcc000/0x3600 0xd0000/0x1000
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel 82443BX PCI-AGP" rev 0x03
ppb0 at pci0 dev 1 function 0 "Intel 82443BX AGP" rev 0x03
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "Matrox MGA G400/G450 AGP" rev 0x04
wsdisplay0 at vga1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 4 function 0 "Intel 82371AB PIIX4 ISA" rev 0x02
pciide0 at pci0 dev 4 function 1 "Intel 82371AB IDE" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
atapiscsi0 at pciide0 channel 1
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <TOSHIBA, CD-ROM XM-6602B, 1017> SCSI0 5/cdrom removable
pciide0: channel 1 interrupting at irq 15
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
"Intel 82371AB USB" rev 0x01 at pci0 dev 4 function 2 not configured
"Intel 82371AB Power Management" rev 0x02 at pci0 dev 4 function 3 not
configured
gdt0 at pci0 dev 11 function 0 "Vortex Computer Systems GDT6128RD/GDT6528RD/GDT6628RD" rev 0x05: pci_mem_find: expected mem type 00000000, found 00000002
irq 12 dpmem c8000 3-bus 1 cache device
gdt0: ver 217, cache on, strategy 2, writeback on, blksz 32
gdt0: raw feat 1 cache feat 101
scsibus1 at gdt0: 35 targets
sd0 at scsibus1 targ 0 lun 0: <ICP, Host drive #00, > SCSI2 0/direct fixed
sd0: 35000MB, 4462 cyl, 255 head, 63 sec, 512 bytes/sec, 71682030 sec total
scsibus2 at gdt0: 16 targets
scsibus3 at gdt0: 16 targets
scsibus4 at gdt0: 16 targets
fxp0 at pci0 dev 13 function 0 "Intel 82557" rev 0x08: irq 9, address 00:90:27:8f:88:23
inphy0 at fxp0 phy 1: i82555 10/100 media interface, rev. 4
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
sysbeep0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask 9040 netmask 9240 ttymask 92c2
pctr: 686-class user-level performance counters enabled
mtrr: Pentium Pro MTRR support
dkcsum: sd0 matched BIOS disk 80
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02
You might also be interested in the kernel config for this box:
# $OpenBSD: GENERIC,v 1.256 2001/04/24 22:13:00 deraadt Exp $
# $NetBSD: GENERIC,v 1.48 1996/05/20 18:17:23 mrg Exp $
#
# OEKONET-2: feeble attempt at doing a small server and ws kernel
# covers our new hardware only
#
# arrived at 2.9MB, but seems to work
#
# GENERIC -- everything that's currently supported
#
machine i386 # architecture, used by config; REQUIRED
# include "../../../conf/GENERIC"
# $OpenBSD: GENERIC,v 1.71 2001/04/09 20:40:49 deraadt Exp $
#
# Machine-independent option; used by all architectures for their
# GENERIC kernel
#option INSECURE # default to secure
option TIMEZONE=0 # time zone to adjust RTC time by
option DST=0 # daylight savings time used by RTC
option NTP # hooks supporting the Network Time Protocol
option SWAPPAGER # paging; REQUIRED
option DEVPAGER # mmap() of devices
option DDB # in-kernel debugger
option DDB_SAFE_CONSOLE # allow break into ddb during boot
#makeoptions DEBUG="-g" # compile full symbol table
#makeoptions PROF="-pg" # build profiled kernel
#option GPROF # kernel profiling, kgmon(8)
option DIAGNOSTIC # internal consistency checks
option KTRACE # system call tracing, a la ktrace(1)
option KMEMSTATS # collect malloc(9) statistics
option CRYPTO # Cryptographic framework
option SYSVMSG # System V-like message queues
option SYSVSEM # System V-like semaphores
option SYSVSHM # System V-like memory sharing
#option UVM_SWAP_ENCRYPT# support encryption of pages going to swap
option COMPAT_11 # NetBSD 1.1,
option COMPAT_43 # and 4.3BSD
option LKM # loadable kernel modules
option FFS # UFS
option FFS_SOFTUPDATES # Soft updates
option QUOTA # UFS quotas
option MFS # memory file system
option TCP_SACK # Selective Acknowledgements for TCP
option TCP_FACK # Forward Acknowledgements for TCP
option NFSCLIENT # Network File System client
option NFSSERVER # Network File System server
option CD9660 # ISO 9660 + Rock Ridge file system
option MSDOSFS # MS-DOS file system
option FDESC # /dev/fd
option FIFO # FIFOs; RECOMMENDED
option KERNFS # /kern
option PROCFS # /proc
option INET # IP + ICMP + TCP + UDP
option INET6 # IPv6 (needs INET)
option PULLDOWN_TEST # use m_pulldown for IPv6 packet parsing
option IPSEC # IPsec
option IPFILTER # IP packet filter for security
option IPFILTER_LOG # use /dev/ipl to log IPF
option PPP_BSDCOMP # PPP BSD compression
option PPP_DEFLATE
option MROUTING # Multicast router
pseudo-device loop 8 # network loopback
pseudo-device bpfilter 8 # packet filter
pseudo-device enc 8 # IPSEC needs the encapsulation interface
#pseudo-device strip 1 # Starmode Radio IP interface
pseudo-device pty 256 # pseudo-terminals
pseudo-device vnd 4 # paging to files
pseudo-device ksyms 1 # kernel symbols device
pseudo-device bridge 4 # network bridging support
pseudo-device vlan 1024 # IEEE 802.1Q VLAN
# for IPv6
pseudo-device gif 4 # IPv[46] over IPv[46] tunnel (RFC1933)
option BOOT_CONFIG # add support for boot -c
option I686_CPU
option UVM # use the UVM virtual memory system
option DUMMY_NOPS # speed hack; recommended
option COMPAT_LINUX # binary compatibility with Linux
option COMPAT_FREEBSD # binary compatibility with FreeBSD
maxusers 128 # estimated number of users
config bsd swap generic
mainbus0 at root
bios0 at mainbus0
apm0 at bios0 flags 0x0000 # flags 0x0101 to force protocol version 1.1
pcibios0 at bios0 flags 0x0000 # use 0x30 for a total verbose
isa0 at mainbus0
isa0 at pcib?
pci* at mainbus0 bus ?
option PCIVERBOSE
pchb* at pci? dev ? function ? # PCI-Host bridges
ppb* at pci? dev ? function ? # PCI-PCI bridges
pci* at ppb? bus ?
pci* at pchb? bus ?
pcib* at pci? dev ? function ? # PCI-ISA bridges (do nothing)
puc* at pci? # PCI "universal" communication device
npx0 at isa? port 0xf0 irq 13 # math coprocessor
isadma0 at isa?
isapnp0 at isa?
option WSDISPLAY_COMPAT_USL # VT handling
option WSDISPLAY_COMPAT_RAWKBD # can get raw scancodes
option WSDISPLAY_DEFAULTSCREENS=6
option WSDISPLAY_COMPAT_PCVT # emulate some ioctls
pckbc0 at isa? # PC keyboard controller
pckbd* at pckbc? # PC keyboard
pms* at pckbc? # PS/2 mouse for wsmouse
pmsi* at pckbc? # PS/2 "Intelli"mouse for wsmouse
vga0 at isa?
vga* at pci? dev ? function ?
pcdisplay0 at isa? # CGA, MDA, EGA, HGA
wsdisplay* at vga? console ?
wsdisplay* at pcdisplay? console ?
wskbd* at pckbd? console ?
wsmouse* at pms? mux 0
wsmouse* at pmsi? mux 0
pcppi0 at isa?
sysbeep0 at pcppi?
pccom0 at isa? port 0x3f8 irq 4 # standard PC serial ports
pccom1 at isa? port 0x2f8 irq 3
pccom2 at isa? port 0x3e8 irq 5
lpt0 at isa? port 0x378 irq 7 # standard PC parallel ports
lpt1 at isa? port 0x278
lpt2 at isa? port 0x3bc
ahc* at pci? dev ? function ? # Adaptec 2940 SCSI controllers
scsibus* at ahc?
gdt* at pci? dev ? function ? # ICP Vortex GDT RAID controllers
scsibus* at gdt?
sd* at scsibus? target ? lun ? # SCSI disk drives
cd* at scsibus? target ? lun ? # SCSI CD-ROM drives
uk* at scsibus? target ? lun ? # unknown SCSI
fdc0 at isa? port 0x3f0 irq 6 drq 2 # standard PC floppy controllers
fd* at fdc? drive ?
pciide* at pci ? dev ? function ? flags 0x0000
wdc0 at isa? port 0x1f0 irq 14 flags 0x00
wdc1 at isa? port 0x170 irq 15 flags 0x00
wdc* at isapnp?
wd* at wdc? channel ? drive ? flags 0x0000
wd* at pciide? channel ? drive ? flags 0x0000
# ATAPI<->SCSI
atapiscsi* at wdc? channel ?
atapiscsi* at pciide? channel ?
scsibus* at atapiscsi?
# Networking devices
de* at pci? dev ? function ? # DC21X4X-based ethernet
fxp* at pci? dev ? function ? # EtherExpress 10/100B ethernet
xl* at pci? dev ? function ? # 3C9xx ethernet
rl* at pci? dev ? function ? # RealTek 81[23]9 ethernet
vr* at pci? dev ? function ? # VIA Rhine ethernet
skc* at pci? dev ? function ? # SysKonnect GEnesis 984x
sk* at skc? # each port of above
# Media Independent Interface (mii) drivers
exphy* at mii? phy ? # 3Com internal PHYs
inphy* at mii? phy ? # Intel 82555 PHYs
iophy* at mii? phy ? # Intel 82553 PHYs
icsphy* at mii? phy ? # ICS 1890 PHYs
lxtphy* at mii? phy ? # Level1 LXT970 PHYs
qsphy* at mii? phy ? # Quality Semi QS6612 PHYs
rlphy* at mii? phy ? # RealTek 8139 internal PHYs
dcphy* at mii? phy ? # Digital Clone PHYs
amphy* at mii? phy ? # AMD 79C873 PHYs
tqphy* at mii? phy ? # TDK 78Q212x PHYs
bmtphy* at mii? phy ? # Broadcom 10/100 PHYs
brgphy* at mii? phy ? # Broadcom Gigabit PHYs
eephy* at mii? phy ? # Marvell 88E1000 series PHY
xmphy* at mii? phy ? # XaQti XMAC-II PHYs
ukphy* at mii? phy ? # "unknown" PHYs
spkr0 at pcppi? # PC speaker
pseudo-device pctr 1
pseudo-device mtrr 1 # Memory range attributes control
pseudo-device wsmux 2
Any help is very much appreciated!
Best, --Toni++




