Subject: Re: Whither Query Norm?
From: Grant Ingersoll (gsin@apache.org)
Date: Nov 20, 2009 1:58:38 pm
List: org.apache.lucene.java-dev

On Nov 20, 2009, at 1:24 PM, Jake Mannix wrote:

On Fri, Nov 20, 2009 at 10:08 AM, Grant Ingersoll <gsin@apache.org> wrote:

I should add in my $0.02 on whether to just get rid of queryNorm() altogether:

-1 from me, even though it's confusing, because having that call there
(somewhere, at least) allows you to actually compare scores across queries if
you do the extra work of properly normalizing the documents as well (at index
time).

Do you have some references on this? I'm interested in reading more on the
subject. I've never quite been sold on how it is meaningful to compare scores
and would like to read more opinions.

References on how people do this *with Lucene*, or just how this is done in
general?

In general. Academic references, etc.

There are lots of papers on fancy things which can be done, but I'm not sure
where to point you to start out. The technique I'm referring to is really just
the simplest possible thing beyond setting your weights "by hand": let's assume
you have a boolean OR query, Q, built up out of sub-queries q_i (hitting, for
starters, different fields, although you can overlap as well with some more
work), each with a set of weights (boosts) b_i, then if you have a training
corpus (good matches, bad matches, or ranked lists of matches in order of
relevance for the queries at hand), *and* scores (at the q_i level) which are
comparable, then you can do a simple regression (linear or logistic, depending
on whether you map your final scores to a logit or not) on the b_i to fit for
the best boosts to use. What is critical here is that scores from different
queries are comparable. If they're not, then queries where the best document
for a query scores 2.0 overly affect the training in comparison to the queries
where the best possible score is 0.5 (actually, wait, it's the reverse: you're
training to increase scores of matching documents, so the system tries to make
that 0.5 scoring document score much higher by raising boosts higher and higher,
while the good matches already scoring 2.0 don't need any more boosting, if that
makes sense).
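
To make the regression idea above concrete, here is a minimal sketch in Python. It is not Lucene code and not anyone's production method: the function name fit_boosts, the two hypothetical sub-queries (title and body), and the six training rows are all made up for illustration. It assumes each row holds per-sub-query scores that are already comparable across queries, and fits one boost b_i per sub-query with plain logistic regression.

```python
import math

def fit_boosts(features, labels, lr=0.5, epochs=2000):
    """Fit one boost per sub-query by logistic regression (batch gradient descent).

    features: one row per (query, document) pair; row[i] is the score of
              sub-query q_i for that pair, assumed comparable across queries.
    labels:   1 for a good match, 0 for a bad match.
    """
    n_sub = len(features[0])
    n = len(features)
    boosts = [0.0] * n_sub
    bias = 0.0
    for _ in range(epochs):
        grad_boosts = [0.0] * n_sub
        grad_bias = 0.0
        for row, y in zip(features, labels):
            z = bias + sum(b * s for b, s in zip(boosts, row))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of relevance
            err = p - y
            for i, s in enumerate(row):
                grad_boosts[i] += err * s
            grad_bias += err
        boosts = [b - lr * g / n for b, g in zip(boosts, grad_boosts)]
        bias -= lr * grad_bias / n
    return boosts, bias

# Hypothetical training data: [title_score, body_score] per (query, doc) pair.
features = [
    [0.9, 0.2], [0.8, 0.7], [0.7, 0.4],  # good matches
    [0.2, 0.8], [0.1, 0.3], [0.3, 0.6],  # bad matches
]
labels = [1, 1, 1, 0, 0, 0]

boosts, bias = fit_boosts(features, labels)
print("title boost:", boosts[0], "body boost:", boosts[1])
```

In this toy data the title score separates good from bad matches and the body score does not, so the fit should give the title sub-query the larger boost. If the rows came from queries whose scores live on different scales (one query topping out at 2.0, another at 0.5), the same fit would be skewed in exactly the way described above.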

This makes sense from a mathematical standpoint, assuming scores are comparable.
What I would like to get at is why anyone thinks scores are comparable across
queries to begin with. I agree it is beneficial in some cases (as you
described) if they are. Probably a question suited for an academic IR list...

There are of course far more complex "state of the art" training techniques, but
probably someone like Ted would be able to give a better list of references on
where it is easiest to read about those. But I can try to dredge up some places
where I've read about doing this, and post again later if I can find any.