| Subject: | Re: Whither Query Norm? | |
|---|---|---|
| From: | Mark Miller (mark...@gmail.com) | |
| Date: | Nov 20, 2009 2:50:16 pm | |
| List: | org.apache.lucene.java-dev | |
Jake Mannix wrote:
> Remember: we're not really doing cosine at all here.
This, I think, is fuzzy, right? It seems common to still loosely call this cosine scoring - pretty much every practical implementation fudges things somewhat when doing the normalization (though we are on the heavy side of fudgers) - and I think it's pretty rare to do the true cosine because it's so expensive. It can be somewhat misleading, though.
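To make the "fudging" concrete, here is a rough sketch of the difference, assuming simple dense term-weight vectors. The class and method names (`CosineSketch`, `trueCosine`, `lengthNormScore`) are made up for illustration and this is not Lucene's actual scoring code; it only contrasts dividing by the exact Euclidean norm of the document vector with substituting a cheap stand-in such as the square root of the number of terms.

```java
public class CosineSketch {

    // Dot product of two equally sized term-weight vectors.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Euclidean (L2) norm of a vector.
    static double norm(double[] a) {
        return Math.sqrt(dot(a, a));
    }

    // True cosine: needs the exact norm of every document vector,
    // which is the expensive part in a real index.
    static double trueCosine(double[] q, double[] d) {
        return dot(q, d) / (norm(q) * norm(d));
    }

    // Approximation (an assumption for illustration): the document norm
    // is replaced by sqrt(numTerms), a value that can be computed once
    // at index time without knowing the final term weights.
    static double lengthNormScore(double[] q, double[] d, int numTerms) {
        return dot(q, d) / (norm(q) * Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        double[] query = {3.0, 4.0};
        double[] doc = {3.0, 4.0};
        System.out.println("true cosine:     " + trueCosine(query, doc));
        System.out.println("length-normed:   " + lengthNormScore(query, doc, 4));
    }
}
```

The approximate score is no longer guaranteed to lie in [0, 1], which is one reason calling it "cosine" can mislead.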
Have you looked at the Similarity scoring explanation page that was recently improved? Do you have any suggestions for changes to it? Doron put a fair amount of work into improving it recently, but I think it could always get better. It's currently leaning towards presenting this as cosine - that seems in line with the few textbooks I've seen, but I'm admittedly not that deep into any of this.





