5 messages in com.mysql.lists.javaRe: optimizing many inserts with JDBC...
FromSent OnAttachments
Will Glass-Husain27 Aug 2003 18:00 
Filip Hanik27 Aug 2003 18:08 
Marc Slemko27 Aug 2003 18:10 
Will Glass-Husain27 Aug 2003 22:33 
"Schäfer, Peter"28 Aug 2003 01:07 
Subject:Re: optimizing many inserts with JDBC - RESULTS
From:Will Glass-Husain (wgl@forio.com)
Date:08/27/2003 06:00:12 PM
List:com.mysql.lists.java

Hi,

Two days ago I asked on this mailing list about how to speed the processing of a
high volume of inserts from a Java web application. I made a test program and
got some interesting results I thought I'd share. I'd welcome any comments.

SUGGESTED OPTIMIZATION The suggested optimization was to make a PreparedStatement that executed an
insert with multiple records on one line, e,g.

INSERT (field1, field2) VALUES (val1a, val1b), (val2a, val2b), (val3a, 3b)

But what should the block size be? In other words, How many records per line?

SUMMARY OF RESULTS The results showed that the ideal size depended on the database. For MySql
3.23, about 5 was the right number (2900 ms). For MySql 4.0, between 20 and 100
(about 1800 ms). Interestingly, when I converted the tables to InnoDb; my
performance took a huge hit for small block sizes (20x longer!) but I got the
best results with a block size of 1000 records (1300 ms).

Incidentally, my current application uses an O/R mapping tool (Torque) to save
the data. Running this with my test case took a whopping 25,061 ms. In some
cases the performance hit wouldn't matter, but in this case improving 10-20
times is a big deal.

DETAILS OF TEST CASE The test case was to save 10,000 records with 4 columns (one was a foreign key).
All numbers reported are milliseconds. Obviously, specific times depend on the
hardware, load, etc. For the record, I used a 512MB Athlon 900 Mhz running RH 9
(Linux) and JDK 1.3.1 with the Connector/J 3.0.8 driver.

The table below shows the results. View in monotype font to see the columns
line up. S is a JDBC query with a Statement. PS uses a PreparedStatement, with
a block size after the dash. (i.e. how many records to include in one query).
Each datapoint is an average of two runs. Lower is better.

MYSQL 3.23.54 (MyIsam)

---- S: 5717 PS-1: 3356 PS-5: 2863 PS-20: 2768 PS-100: 2812 PS-500: 2735 PS-1000: 3141

MYSQL 4.0.14 (MyIsam)

---- S: 4785 PS-1: 4530 PS-5: 2440 PS-20: 2019 PS-100: 1829 PS-500: 1803 PS-1000: 1975

MYSQL 4.0.14 (InnoDB)

---- S: 127407 PS-1: 127961 PS-5: 33564 PS-20: 14075 PS-100: 3183 PS-500: 1571 PS-1000: 1291 (*)

(* the first run was significantly slower than the second, so I dropped the
first run)

Best regards,

Will