17 messages in org.apache.lucene.java-user: Re: Wikia search goes live today
From                  Sent On
Lukas Vlcek           Jan 7, 2008 4:48 am
Grant Ingersoll       Jan 7, 2008 5:13 am
Grant Ingersoll       Jan 7, 2008 8:21 am
Otis Gospodnetic      Jan 7, 2008 2:14 pm
Lukas Vlcek           Jan 7, 2008 11:48 pm
Lukas Vlcek           Jan 7, 2008 11:54 pm
Grant Ingersoll       Jan 8, 2008 4:46 am
Mike Klaas            Jan 8, 2008 11:59 am
Dennis Kubes          Jan 8, 2008 12:09 pm
Michael Stoppelman    Jan 8, 2008 12:11 pm
Lukas Vlcek           Jan 8, 2008 12:15 pm
Andrzej Bialecki      Jan 8, 2008 12:23 pm
Ryan McKinley         Jan 8, 2008 12:31 pm
Lukas Vlcek           Jan 8, 2008 12:36 pm
Lukas Vlcek           Jan 8, 2008 12:38 pm
Andrzej Bialecki      Jan 8, 2008 2:23 pm
Dennis Kubes          Jan 8, 2008 2:53 pm
Subject: Re: Wikia search goes live today
From: Michael Stoppelman (stop@gmail.com)
Date: Jan 8, 2008 12:11:47 pm
List: org.apache.lucene.java-user

I'm surprised they aren't keeping *any* logs, or so they claim. Seems foolish to me from a data-mining perspective.

"A Wikia employee told me today that people were already asking what the most popular search terms were. He said there was no way of finding out as no logs are kept." [1]

[1] http://radar.oreilly.com/archives/2008/01/why_wikia_will_change_search.html

-M

On Jan 8, 2008 12:09 PM, Dennis Kubes <kub@apache.org> wrote:

Star ratings are being stored but not accounted for in the score as of yet. The plan is to include them in future indexing scores. :)

Dennis

Mike Klaas wrote:

On 7-Jan-08, at 11:49 PM, Lukas Vlcek wrote:

This would be great!

I am particularly interested in how they are going about customized search (if they have a plan to do it). I mean whether they can reorder raw search results based on some kind of collective knowledge (which is probably kept outside of the Lucene index; at least that is what I can see from the Nutch score explanations).

I don't think that there is anything like that yet. It looks to me like a standard disjunction over title/content/host/url + a global document boost based on pagerank-y link analysis (or simply # inlinks). If they are incorporating the "star" ratings yet, it is probably folded into the global doc boost.
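
[Editor's illustration] The scoring shape described above can be sketched roughly as follows. This is a toy model only, not Wikia's or Nutch's actual code: the function names, the term-counting match, and the logarithmic inlink boost are all assumptions made for illustration.

```python
import math

# Fields named in the message; everything else below is hypothetical.
FIELDS = ("title", "content", "host", "url")

def field_score(query_terms, text):
    """Count occurrences of the query terms in one field (toy matching)."""
    tokens = text.lower().split()
    return sum(tokens.count(term.lower()) for term in query_terms)

def doc_boost(inlinks):
    """Assumed global boost: grows slowly with the number of inlinks."""
    return 1.0 + math.log1p(inlinks)

def score(query_terms, doc):
    """Disjunction over fields: a match in any field adds to the score,
    and the sum is multiplied by the query-independent document boost."""
    text_score = sum(field_score(query_terms, doc.get(f, "")) for f in FIELDS)
    return text_score * doc_boost(doc.get("inlinks", 0))

def rank(query_terms, docs):
    """Return the matching documents, highest score first."""
    hits = [(score(query_terms, d), d) for d in docs]
    hits = [(s, d) for s, d in hits if s > 0]
    return [d for s, d in sorted(hits, key=lambda h: h[0], reverse=True)]
```

In this sketch a document with many inlinks outranks one with an equally good text match but no inlinks. A "star" rating could be folded in by multiplying it into doc_boost, which is the arrangement the message speculates about.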

Regards, Lukas

On Jan 7, 2008 11:14 PM, Otis Gospodnetic <otis@yahoo.com> wrote:

See my comment (around #45-50) on Techcrunch about that from late last night. There is actually one Wikia guy helping Nutch - Dennis Kubes. He must have been hitting reload on that TC post, because he IMed me quickly after I posted my comment and clarified that he is that Wikia developer I was referring to in my comment.... so I'm looking forward to more contributions from Dennis and his coworkers! :)

Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Grant Ingersoll <gsin@apache.org>
To: java@lucene.apache.org
Sent: Monday, January 7, 2008 11:21:33 AM
Subject: Re: Wikia search goes live today

One other thing to note: you can definitely see Lucene in action (or Nutch, that is) by clicking on the score returned for a given document (try searching for Lucene), and you see, in all its glory, the Lucene explain results... It even displays the Nutch logo, which makes me wonder if they are misusing an ASF trademark (but, IANAL, so I don't know) since they don't state that Nutch is a trademark of the ASF. But that is a discussion for somewhere else...

On Jan 7, 2008, at 8:13 AM, Grant Ingersoll wrote:

On Jan 7, 2008, at 7:48 AM, Lukas Vlcek wrote:

Hi,

I noticed that Wikia search goes live today (see http://www.devxnews.com/article.php/3719906). Does anybody know where I could find more technical information about their solution? Are they going to contribute their enhancements back to Lucene/Nutch/Hadoop code? My understanding is that as long as they claim they want to build their solution on top of open source technology they should be contributing back.

Not sure what they have done, but nothing in the Apache license requires contribution back, even if it would be appreciated.

Cheers, Grant

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
