5 messages in org.apache.incubator.lucene-net-user: Re: Faceting in Lucene.Net

From              Sent On                 Attachments
Soormasher Singh  Dec 16, 2007 10:33 am
Jokin Cuadrado    Dec 18, 2007 1:43 am
Soormasher Singh  Dec 19, 2007 8:08 am
Jokin Cuadrado    Dec 19, 2007 9:04 am    .zip
Soormasher Singh  Dec 19, 2007 3:34 pm
Subject: Re: Faceting in Lucene.Net
From: Jokin Cuadrado (joki@gmail.com)
Date: Dec 18, 2007 1:43:49 am
List: org.apache.incubator.lucene-net-user

Could you be more explicit about your needs? Knowing how many documents your index has, how many distinct categories there are, and the average number of hits per search would be enough to suggest an approach.

In my case I wrote a custom collector that counts the hits in every category, using a FieldCache entry to look up each document's category efficiently instead of calling hit.getDocument (a performance killer).

This works better if your searches return small result sets and you have many categories.
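
To make the collector idea concrete, here is a minimal sketch that does not reference Lucene.Net at all. The class name CategoryCountingCollector and its Collect(doc, score) method are hypothetical, modeled loosely on a hit collector callback, and the categoryByDoc array is an assumed stand-in for the per-document values a field cache would supply. The point is only that each hit costs one array lookup and one counter update, with no stored document loaded.

```vbnet
Imports System
Imports System.Collections.Generic

Module FacetCollectorSketch
    ' Hypothetical collector: counts hits per category.
    ' categoryByDoc(docId) is an assumed stand-in for a field-cache array.
    Public Class CategoryCountingCollector
        Private ReadOnly _categoryByDoc As String()
        Private ReadOnly _counts As New Dictionary(Of String, Integer)()

        Public Sub New(ByVal categoryByDoc As String())
            _categoryByDoc = categoryByDoc
        End Sub

        ' Called once per matching document; the score is ignored here.
        Public Sub Collect(ByVal doc As Integer, ByVal score As Single)
            Dim category As String = _categoryByDoc(doc)
            If category Is Nothing Then Return ' document has no category
            Dim current As Integer = 0
            _counts.TryGetValue(category, current) ' leaves 0 when absent
            _counts(category) = current + 1
        End Sub

        Public ReadOnly Property Counts() As Dictionary(Of String, Integer)
            Get
                Return _counts
            End Get
        End Property
    End Class

    Sub Main()
        ' Assumed sample data: the category of documents 0 to 4.
        Dim categoryByDoc As String() = _
            New String() {"books", "music", "books", Nothing, "films"}
        Dim collector As New CategoryCountingCollector(categoryByDoc)

        ' Pretend the search matched documents 0, 2, 3 and 4.
        For Each docId As Integer In New Integer() {0, 2, 3, 4}
            collector.Collect(docId, 1.0F)
        Next

        For Each pair As KeyValuePair(Of String, Integer) In collector.Counts
            Console.WriteLine(pair.Key & ": " & pair.Value)
        Next
    End Sub
End Module
```

The work is proportional to the number of hits, not to the number of categories, which is why this fits small result sets with many categories.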

If you do not have many terms and your searches return many results, you can use QueryFilter.Bits to get the bit masks, AND them, and count the number of set bits in the result. The drawback is that the .NET implementation of BitArray has no efficient method for counting the set bits (cardinality in Java), but you could take one from the BitVector class in Lucene.Net (you must either use your own implementation of BitArray, or use reflection to access the backing Int32 array m_array and count over it).

Here is the function to get the number of bits set to one in a BitArray:

Private Shared _bitsSetArray256 As Byte() = {0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8}

''' <summary>
''' return the number of bits on bitarray set to one
''' </summary>
''' <remarks></remarks>
Private Function Cardinality(ByVal bits As BitArray) As Int32
    Dim arr As UInt32()
    arr = bits.GetType().GetField("m_array", Reflection.BindingFlags.NonPublic Or Reflection.BindingFlags.Instance).GetValue(bits)
    Dim _count As Int32 = 0
    For i As Int32 = 0 To arr.Length - 1
        _count += _bitsSetArray256(arr(i) And &HFF) + _
                  _bitsSetArray256((arr(i) >> 8) And &HFF) + _
                  _bitsSetArray256((arr(i) >> 16) And &HFF) + _
                  _bitsSetArray256(arr(i) >> 24)
    Next i
    Return _count
End Function
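
As a variation on the above, the AND-and-count step can be sketched without reflection by copying the bits out through the public BitArray.CopyTo method. This is an alternative to the function above rather than a replacement for it. The names CountSetBits and CountFacet are hypothetical, and the two masks are filled in by hand where a real index would supply one mask for the query and one per category.

```vbnet
Imports System
Imports System.Collections

Module FacetBitCountSketch
    ' Counts the bits set to one, using only public BitArray members.
    Public Function CountSetBits(ByVal bits As BitArray) As Integer
        If bits.Length = 0 Then Return 0
        Dim bytes((bits.Length + 7) \ 8 - 1) As Byte
        bits.CopyTo(bytes, 0)

        ' Clear padding bits beyond bits.Length in the last byte, in case
        ' an earlier operation such as Not left them set.
        Dim used As Integer = bits.Length Mod 8
        If used <> 0 Then
            Dim last As Integer = bytes.Length - 1
            bytes(last) = CByte(bytes(last) And ((1 << used) - 1))
        End If

        Dim count As Integer = 0
        For Each b As Byte In bytes
            Dim v As Integer = b
            While v <> 0
                v = v And (v - 1) ' clears the lowest set bit
                count += 1
            End While
        Next
        Return count
    End Function

    ' Number of documents matching both the query and the category.
    ' Both masks must have the same length.
    Public Function CountFacet(ByVal queryBits As BitArray, _
                               ByVal categoryBits As BitArray) As Integer
        ' BitArray.And modifies its instance, so work on a copy.
        Dim combined As New BitArray(queryBits)
        combined.And(categoryBits)
        Return CountSetBits(combined)
    End Function

    Sub Main()
        ' Assumed sample masks over a 10-document index.
        Dim queryBits As New BitArray(10)
        For Each d As Integer In New Integer() {1, 3, 4, 8}
            queryBits(d) = True
        Next
        Dim categoryBits As New BitArray(10)
        For Each d As Integer In New Integer() {3, 4, 5, 9}
            categoryBits(d) = True
        Next

        Console.WriteLine(CountFacet(queryBits, categoryBits)) ' prints 2
    End Sub
End Module
```

The cost of each count depends on the size of the index rather than on the number of hits, so it pays off when there are few categories and large result sets. The byte-by-byte loop is simpler than the 256-entry lookup table but will likely be slower.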

On Dec 16, 2007 7:33 PM, Soormasher Singh <soor@yahoo.com> wrote:

Hello All

I'm trying to use Lucene.Net for faceting (Category counting and search
refinement). I've not been able to find any examples of this using Lucene.Net.
I've tried to use the approach used in Solr, but the performance hasn't been the
greatest. Can anyone please help me with this? Any code/examples of anyone using
Lucene.Net for category counting/faceting?

Thanks a bunch!