189 messages

org.apache.lucene.nutch-dev

2007 May

Page 4 (Messages 76 to 100)

[jira] Commented: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt - Sami Siren (JIRA)
How to install Nutch on Freebsd? - mr_max
Re: SIGSEGV - Dennis Kubes
RE: Document Classification - indexing question - Armel T. Nene
[jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9 - Sami Siren (JIRA)
[jira] Resolved: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - Andrzej Bialecki (JIRA)
Recrawl help - karthik085
[jira] Updated: (NUTCH-424) CLONE - Problem persists with Nutch 0.8.1 (Nekohtml 0.9.4) - NekoHTML's DOMFragmentParser hangs on certain URLs - Mike Brzozowski (JIRA)
[jira] Resolved: (NUTCH-456) parse msexcel plugin speedup - Sami Siren (JIRA)
[jira] Assigned: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt - Sami Siren (JIRA)
[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - nutch.newbie (JIRA)
[jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - Gal Nitzan (JIRA)
[jira] Created: (NUTCH-487) Neko HTML parser goes on default settings. - Marcin Okraszewski (JIRA)
[jira] Updated: (NUTCH-25) needs 'character encoding' detector - Doğacan Güney (JIRA)
[jira] Created: (NUTCH-491) dedup fails with ArrayIndexOutOfBoundsException - Nicolás Lichtmaier (JIRA)
[jira] Commented: (NUTCH-491) dedup fails with ArrayIndexOutOfBoundsException - Doğacan Güney (JIRA)
[jira] Created: (NUTCH-492) java.lang.OutOfMemoryError while indexing. - Nicolás Lichtmaier (JIRA)
[jira] Work started: (NUTCH-466) Flexible segment format - Andrzej Bialecki (JIRA)
Plugins initialized all the time! - Nicolás Lichtmaier
[jira] Commented: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters - Doğacan Güney (JIRA)
Re: Plugins initialized all the time! - Doğacan Güney
[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - Chris A. Mattmann (JIRA)
Re: OutOfMemoryError - Why should the while(1) loop stop? - Dennis Kubes
[jira] Created: (NUTCH-495) Unnecessary delays in Fetcher2 - Doğacan Güney (JIRA)
Making "Hits" work as a normal List - Nicolás Lichtmaier
