Now that Otis reminded me that this thread existed (I've got a brain like a
sieve these days, I tell you)...
On Fri, Nov 20, 2009 at 10:08 AM, Grant Ingersoll <gsin...@apache.org> wrote:
-1 from me, even though it's confusing, because having that call there
(somewhere, at least) allows you to actually compare scores across
queries if you do the extra work of properly normalizing the documents as
well (at index time).
Do you have some references on this? I'm interested in reading more on the
subject. I've never quite been sold on how it is meaningful to compare
scores and would like to read more opinions.
So I couldn't find any really good papers on this specifically, but I seem
to remember seeing this stuff done a lot in Manning and Schütze's IR book -
they go over training field boosts with logistic regression and all that, but
they don't specifically look at the Lucene case (although they consider
similar scoring functions). They must talk about the necessity of
comparable scores to do this, I'm sure.
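
To make the "training field boosts with logistic regression" idea concrete,
here is a toy sketch in plain Python. It is not taken from the book and uses
no Lucene API: the function names, the two fields (title, body), and the
scores and relevance labels are all made up for illustration. The one
assumption that matters is that the per-field scores are on a comparable
scale across queries, since otherwise a single set of learned weights would
not mean much.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_field_boosts(examples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    examples: list of per-field score tuples, one per (query, doc) pair.
    labels:   1 if the doc was judged relevant for the query, else 0.
    Returns (weights, bias); the weights play the role of field boosts.
    """
    n_fields = len(examples[0])
    weights = [0.0] * n_fields
    bias = 0.0
    m = len(examples)
    for _ in range(epochs):
        grad_w = [0.0] * n_fields
        grad_b = 0.0
        for x, y in zip(examples, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the linear score
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias

def relevance_probability(x, weights, bias):
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

# Hypothetical training data: (title_score, body_score), each already
# normalized into [0, 1] so that values are comparable across queries.
examples = [(0.9, 0.2), (0.8, 0.7), (0.1, 0.9),
            (0.2, 0.1), (0.7, 0.1), (0.1, 0.3)]
labels = [1, 1, 0, 0, 1, 0]

weights, bias = train_field_boosts(examples, labels)
print("title boost:", weights[0], "body boost:", weights[1])
```

In this made-up data relevance follows the title score, so the learned title
weight comes out larger than the body weight. With unnormalized scores the
same procedure would mostly be fitting per-query scale differences instead.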