2 messages in org.apache.jackrabbit.users: Binary Content Search Problem...
From, Sent On:
Patrick Wider, Oct 18, 2007 6:02 am
Ard Schrijvers, Oct 22, 2007 5:59 am
Subject: Binary Content Search Problem...
From: Patrick Wider (pat_@yahoo.fr)
Date: Oct 18, 2007 6:02:28 am
List: org.apache.jackrabbit.users

Hi all,

I'm setting up a new JackRabbit repository, which is backed by an Oracle DB for
persistence. Access to the created nodes and their properties is OK...
except when I try to execute (basic?) queries like:

"/jcr:root//element(*, nt:resource)[(jcr:contains(., 'myKeyWord'))]"

which is supposed to return all nt:resource nodes whose jcr:data binary content
contains 'myKeyWord', right? But it doesn't... and I have no clue where I made the mistake.
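
For reference, here is a minimal sketch of how such a statement could be assembled in Java before handing it to the JCR QueryManager. The class and method names (ContainsQuery, build, escape) are made up for illustration and are not part of Jackrabbit; the sketch also assumes that a single quote inside an XPath string literal is escaped by doubling it.

```java
/** Hypothetical helper (not part of Jackrabbit) that assembles a jcr:contains XPath statement. */
final class ContainsQuery {

    private ContainsQuery() {}

    /** Doubles single quotes so the keyword stays inside the XPath string literal (assumed escaping rule). */
    static String escape(String keyword) {
        return keyword.replace("'", "''");
    }

    /** Builds a statement matching all nodes of the given type whose indexed content contains the keyword. */
    static String build(String nodeType, String keyword) {
        return "/jcr:root//element(*, " + nodeType + ")"
                + "[jcr:contains(., '" + escape(keyword) + "')]";
    }

    public static void main(String[] args) {
        // In a real session the resulting string would be passed to
        // QueryManager.createQuery(statement, Query.XPATH) and then executed.
        System.out.println(build("nt:resource", "myKeyWord"));
    }
}
```

The generated statement is the same query as the one above, just without the extra pair of parentheses around jcr:contains.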

First, I checked my workspace.xml file and particularly the SearchIndex element:

<SearchIndex class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
  <param name="path" value="${wsp.home}/index"/>
  <param name="textFilterClasses"
         value="org.apache.jackrabbit.extractor.PlainTextExtractor,
                org.apache.jackrabbit.extractor.MsWordTextExtractor,
                ... many more extractors "/>
</SearchIndex>

Then, I defined a node type as follows:

[wider:file] > 'nt:file', 'mix:referenceable'

Then I created a wider:file node as parent of the jcr:content node (nt:resource
type):

Calendar cal = ...;
String mimetype=...;
File myFile = new File...;
InputStream inputstr = new FileInputStream(myFile);
Node fileNode = myRoot.addNode(myFile.getName(), "wider:file");
Node resourceNode = fileNode.addNode("jcr:content", "nt:resource");
resourceNode.setProperty("jcr:mimeType", mimetype); //--> I made sure it is a good one
resourceNode.setProperty("jcr:encoding", "");
resourceNode.setProperty("jcr:lastModified", cal);
resourceNode.setProperty("jcr:data", inputstr);
mySession.save();

I made sure the mimetypes are OK... I have actually created 2 nt:resource nodes:
one with a Word document (mimetype=application/msword) and the other one with a
text file (mimetype=text/plain)...

Of course both files contain 'myKeyWord' somewhere... the text file contains it for
sure, but in the Word document, 'myKeyWord' is wrapped in bold and italic styles. I
don't think the styles cause any problems... on the other hand, I have no idea
how the extractors work ;-) it's just a guess....
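
Conceptually, I would expect an extractor only to turn the binary stream into plain text that the indexer can then tokenize, so styling such as bold or italics should not matter. Below is a rough sketch of that idea for the text/plain case. PlainTextSketch and its methods are hypothetical stand-ins, not Jackrabbit's classes, and the UTF-8 fallback for an empty or unknown encoding is purely an assumption of this sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical stand-in for a plain-text extractor; not Jackrabbit's implementation. */
final class PlainTextSketch {

    private PlainTextSketch() {}

    /** Decodes the bytes with the given encoding; falls back to UTF-8 if it is null, empty, or unknown. */
    static String extract(byte[] data, String encoding) {
        Charset charset = StandardCharsets.UTF_8; // assumed fallback
        if (encoding != null && !encoding.isEmpty()) {
            try {
                charset = Charset.forName(encoding);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name: keep the fallback.
            }
        }
        return new String(data, charset);
    }

    /** Case-insensitive whole-word match, roughly what a tokenizing full-text index would find. */
    static boolean containsWord(String text, String word) {
        String wanted = word.toLowerCase(Locale.ROOT);
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (token.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] data = "Some text with myKeyWord inside.".getBytes(StandardCharsets.UTF_8);
        // An empty encoding, as in the jcr:encoding property set above, takes the fallback path.
        String text = extract(data, "");
        System.out.println(containsWord(text, "myKeyWord"));
    }
}
```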

And, as said before, these nodes do exist in the repository... I can query them
and their properties and the jcr:data property can be roughly displayed... only
the jcr:contains function seems not to work.

Maybe you should also know that the externalBLOBs param is declared as false...
and that I use JackRabbit 1.3.1 with Lucene 2.0.0...

I really have no idea what I did wrong... thanks for your help.

Regards,
Patrick