Subject: Corrupted Indexes Under Lucene 2.3 (and 2.3.1)
From: Tyler V (tyle@gmail.com)
Date: Feb 29, 2008 2:25:38 pm
List: org.apache.lucene.java-user

After upgrading to Lucene 2.3 (and subsequently 2.3.1), our application has experienced sporadic index corruptions on our larger (and more frequently updated) indexes. These indexes experienced no corruptions under any prior version of Lucene (which we have been using for several years).

The pattern of failure seems to be consistent. First, we receive an exception like the following:

java.lang.IndexOutOfBoundsException: Index: 4788, Size: 4762
    at java.util.ArrayList.RangeCheck(ArrayList.java:547)
    at java.util.ArrayList.get(ArrayList.java:322)
    at org.apache.lucene.index.DocumentsWriter$ThreadState.init(DocumentsWriter.java:749)
    at org.apache.lucene.index.DocumentsWriter.getThreadState(DocumentsWriter.java:2391)
    at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2434)
    at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2422)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1445)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1424)
    at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:134)
    at java.lang.Thread.run(Thread.java:619)

When we hit this error, we call writer.flush() and then writer.close().

Then, we get this exception when trying to re-open the index:

org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _c2z13: fieldsReader shows 2 but segmentInfo shows 3
    at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:313)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:262)
    at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:197)
    at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
    at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:75)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:636)
    at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:63)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:192)
    at com.myapp.indexing.IndexerRunner.run(IndexerRunner.java:107)
    at java.lang.Thread.run(Thread.java:619)
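
For readers following along, the order of calls described above (a failed addDocument, then flush, then close) can be sketched roughly as below. This is only an illustration, not our actual IndexerRunner code: WriterLike is a hypothetical stand-in for the three IndexWriter methods named in this post, documents are plain strings, and closing the writer in a finally block is an assumption.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class IndexerFlowSketch {

    /** Hypothetical stand-in for the IndexWriter calls named in the post. */
    interface WriterLike {
        void addDocument(String doc) throws IOException;
        void flush() throws IOException;
        void close() throws IOException;
    }

    /**
     * Adds documents until one fails with a runtime exception. On failure,
     * flushes the writer; the writer is closed in every case (assumption).
     * Returns the sequence of steps taken, in order.
     */
    static List<String> runIndexer(WriterLike writer, List<String> docs) throws IOException {
        List<String> calls = new ArrayList<String>();
        try {
            for (String doc : docs) {
                writer.addDocument(doc);
                calls.add("add:" + doc);
            }
        } catch (RuntimeException e) {
            // e.g. the IndexOutOfBoundsException shown in the first trace
            calls.add("failed:" + e.getClass().getSimpleName());
            writer.flush();
            calls.add("flush");
        } finally {
            writer.close();
            calls.add("close");
        }
        return calls;
    }
}
```

The sketch only records the order of operations; whether flushing after a failed add is safe is part of what this report is asking about.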

Running the CheckIndex tool included with 2.3 lets us remove the bad documents from the index, but this workaround is far from ideal. We would greatly appreciate it if anyone could shed some light on the issue.

Regards,

Tyler