

17 messages in org.apache.lucene.java-user

Re: Wikia search goes live today

| From | Sent On | Attachments |
|---|---|---|
| Lukas Vlcek | Jan 7, 2008 4:48 am | |
| Grant Ingersoll | Jan 7, 2008 5:13 am | |
| Grant Ingersoll | Jan 7, 2008 8:21 am | |
| Otis Gospodnetic | Jan 7, 2008 2:14 pm | |
| Lukas Vlcek | Jan 7, 2008 11:48 pm | |
| Lukas Vlcek | Jan 7, 2008 11:54 pm | |
| Grant Ingersoll | Jan 8, 2008 4:46 am | |
| Mike Klaas | Jan 8, 2008 11:59 am | |
| Dennis Kubes | Jan 8, 2008 12:09 pm | |
| Michael Stoppelman | Jan 8, 2008 12:11 pm | |
| Lukas Vlcek | Jan 8, 2008 12:15 pm | |
| Andrzej Bialecki | Jan 8, 2008 12:23 pm | |
| Ryan McKinley | Jan 8, 2008 12:31 pm | |
| Lukas Vlcek | Jan 8, 2008 12:36 pm | |
| Lukas Vlcek | Jan 8, 2008 12:38 pm | |
| Andrzej Bialecki | Jan 8, 2008 2:23 pm | |
| Dennis Kubes | Jan 8, 2008 2:53 pm | |

| Subject: | Re: Wikia search goes live today | |
|---|---|---|
| From: | Andrzej Bialecki (ab...@getopt.org) | |
| Date: | Jan 8, 2008 2:23:38 pm | |
| List: | org.apache.lucene.java-user | |
Ryan McKinley wrote:
> Andrzej Bialecki wrote:
>> Lukas Vlcek wrote:
>>> So starring will be accommodated only during the indexing phase. Does that mean it will be a pretty static value, not a dynamically changing variable... correct? In other words, if I add my stars to some document, it won't affect the scoring immediately but only after an indexing cycle. Correct?
>>
>> (I'm not involved in Wikia development.) There are some ways to go about it even in pure Lucene-land, so that the updates are fast without reindexing the main content. Hint: ParallelReader.
>
> In Solr (1.3-dev) you can have an external value source with a function query...
True, although function query tends to bring more overhead ...
While we're on the subject of complex scoring - I read an interesting paper (I don't have a link now) which discussed so-called bucketed scoring. The idea is that if your basic scoring is good enough to ensure that the top-N results are highly relevant, then you can split these results into buckets of k documents (let's say 10 ;) ), and within each bucket apply an arbitrary re-ranking function, which is then very inexpensive to perform because of the limited number of documents.
Example: you have a large corpus of web pages, and you want home pages to appear first, even if they score somewhat lower - and it doesn't pay off to modify the base scoring, because of overfitting, i.e. the scoring would be good for home pages but poor for other relevant documents.
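To make the home-page example concrete, here is a minimal sketch of bucketed re-ranking in plain Java. It is not taken from the paper or from Lucene: the `Hit` class, the `rerank` method and the `HOME_PAGES_FIRST` comparator are hypothetical names chosen for illustration, and the home-page flag is assumed to be known for each hit. The input is assumed to be already sorted by descending base score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical names throughout; none of this is a Lucene API.
class BucketedRerank {

    /** A search hit: document id, base relevance score, and a home-page flag. */
    static final class Hit {
        final String id;
        final double baseScore;
        final boolean homePage;

        Hit(String id, double baseScore, boolean homePage) {
            this.id = id;
            this.baseScore = baseScore;
            this.homePage = homePage;
        }
    }

    /** Home pages sort first; all other hits compare as equal. */
    static final Comparator<Hit> HOME_PAGES_FIRST =
            Comparator.comparing((Hit h) -> !h.homePage);

    /**
     * Splits hits (assumed sorted by descending base score) into buckets of
     * k documents and re-orders each bucket with the given comparator.
     * A hit never leaves its bucket, so it moves at most k - 1 positions.
     */
    static List<Hit> rerank(List<Hit> sortedHits, int k, Comparator<Hit> withinBucket) {
        if (k <= 0) {
            throw new IllegalArgumentException("k must be positive");
        }
        List<Hit> result = new ArrayList<>(sortedHits.size());
        for (int start = 0; start < sortedHits.size(); start += k) {
            int end = Math.min(start + k, sortedHits.size());
            List<Hit> bucket = new ArrayList<>(sortedHits.subList(start, end));
            bucket.sort(withinBucket); // stable sort: ties keep base-score order
            result.addAll(bucket);
        }
        return result;
    }

    /** Convenience helper: extracts the ids in order. */
    static List<String> ids(List<Hit> hits) {
        List<String> out = new ArrayList<>(hits.size());
        for (Hit h : hits) {
            out.add(h.id);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("a", 0.9, false),
                new Hit("b", 0.8, true),
                new Hit("c", 0.7, false),
                new Hit("d", 0.6, false),
                new Hit("e", 0.5, true));
        // Buckets of 3: [a, b, c] and [d, e]
        System.out.println(ids(rerank(hits, 3, HOME_PAGES_FIRST)));
    }
}
```

Because each bucket is sorted independently, the re-ranking cost depends only on k and on how many buckets are actually shown, not on the size of the corpus. The base scoring function stays untouched, which is the point of the overfitting remark above.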
--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com







