189 messages

org.apache.lucene.nutch-dev

2007 May

Page 5 (Messages 101 to 125)

[jira] Created: (NUTCH-477) Extend URLFilters to support different filtering chains - Andrzej Bialecki (JIRA)
Hudson build is back to normal: Nutch-Nightly #75 - hud...@lucene.zones.apache.org
[jira] Created: (NUTCH-478) Add function for stopping FetherThread gracefully - chee.wu (JIRA)
[jira] Commented: (NUTCH-443) allow parsers to return multiple Parse object, this will speed up the rss parser - Antonio Eggberg (JIRA)
And if nutch it would be written on With C++ worked more quickly? - mr_max
[jira] Created: (NUTCH-480) Searching multiple indexes with a single nutch instance - Ravi Chintakunta (JIRA)
[jira] Updated: (NUTCH-480) Searching multiple indexes with a single nutch instance - Ravi Chintakunta (JIRA)
[jira] Commented: (NUTCH-470) Adding optional terms to a query - Trond Andersen (JIRA)
Re: [jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9 - Mike Schwartz
[jira] Updated: (NUTCH-479) Support for OR queries - Nicolás Lichtmaier (JIRA)
Re: svn commit: r536606 - in /lucene/nutch/trunk: ./ src/java/org/apache/nutch/fetcher/ src/java/org/apache/nutch/metadata/ src/java/org/apache/nutch/parse/ src/java/org/apache/nutch/util/ src/plugin/creativecommons/src/test/org/creativecommons/nutch/ src/... - Andrzej Bialecki
[jira] Commented: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt - Doğacan Güney (JIRA)
[jira] Commented: (NUTCH-424) CLONE - Problem persists with Nutch 0.8.1 (Nekohtml 0.9.4) - NekoHTML's DOMFragmentParser hangs on certain URLs - Mike Brzozowski (JIRA)
[jira] Resolved: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt - Sami Siren (JIRA)
[jira] Created: (NUTCH-483) remove redundant commons-logging jar from ontology plugin - Sami Siren (JIRA)
[jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - Gal Nitzan (JIRA)
[jira] Updated: (NUTCH-485) Change HtmlParseFilter 's to return ParseResult object instead of Parse object - Gal Nitzan (JIRA)
[jira] Resolved: (NUTCH-482) Remove redundant plugin lib-log4j - Sami Siren (JIRA)
[jira] Updated: (NUTCH-487) Neko HTML parser goes on default settings. - Marcin Okraszewski (JIRA)
[jira] Created: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - Emmanuel Joke (JIRA)
[jira] Created: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters - Emmanuel Joke (JIRA)
[jira] Created: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) - Marcin Okraszewski (JIRA)
[jira] Updated: (NUTCH-490) Extension point with filters for Neko HTML parser (with patch) - Marcin Okraszewski (JIRA)
[jira] Commented: (NUTCH-489) URLFilter-suffix management of the url path when the url contains some query parameters - Doğacan Güney (JIRA)
[jira] Updated: (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implmentation. - Vadim Bauer (JIRA)
