189 messages

org.apache.lucene.nutch-dev

2007 May

Page 8 (Messages 176 to 189)

[jira] Updated: (NUTCH-446) RobotRulesParser should ignore Crawl-delay values of other bots in robots.txt - Doğacan Güney (JIRA)
[jira] Commented: (NUTCH-472) NullPointerException in ZipTextExtractor if no MIME type for zipped file - Sami Siren (JIRA)
[jira] Commented: (NUTCH-476) Would like to add a field to the document class for its MD5 signature - Sami Siren (JIRA)
[jira] Updated: (NUTCH-424) NekoHTML's DOMFragmentParser hangs on certain URLs (CLONE: Problem persists with Nutch 0.9 and 0.8.1 (Nekohtml 0.9.4)) - Mike Brzozowski (JIRA)
[jira] Commented: (NUTCH-444) Possibly use a different library to parse RSS feed for improved performance and compatibility - nutch.newbie (JIRA)
[jira] Updated: (NUTCH-481) http.content.limit is broken in the protocol-httpclient plugin - charlie wanek (JIRA)
[jira] Commented: (NUTCH-472) NullPointerException in ZipTextExtractor if no MIME type for zipped file - Sami Siren (JIRA)
[jira] Updated: (NUTCH-484) Nutch Nightly API link is broken in site - Gal Nitzan (JIRA)
Re: Site nightly API link is broken - Sami Siren
[jira] Resolved: (NUTCH-484) Nutch Nightly API link is broken in site - Sami Siren (JIRA)
[jira] Resolved: (NUTCH-483) remove redundant commons-logging jar from ontology plugin - Sami Siren (JIRA)
[jira] Resolved: (NUTCH-161) Change Plain text parser to use parser.character.encoding.default property for fall back encoding - Sami Siren (JIRA)
[jira] Updated: (NUTCH-488) Avoid parsing uneccessary links and get a more relevant outlink list - Emmanuel Joke (JIRA)
[jira] Resolved: (NUTCH-392) OutputFormat implementations should pass on Progressable - Andrzej Bialecki (JIRA)
