347 messages

org.apache.lucene.nutch-dev

2012 April

Page 10 (Messages 226 to 250)

[jira] [Commented] (NUTCH-1234) Upgrade to Tika 1.1 - Hudson (Commented) (JIRA)
[jira] [Created] (NUTCH-1326) HostDeduplicator for Nutch - Markus Jelsma (Created) (JIRA)
[jira] [Updated] (NUTCH-1249) Resolve all issues flagged up by adding javac -Xlint argument - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1150) http.redirect.max can lead to multiple parses of the same url - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1063) OutlinkExtractor test generates an exception but does not fail - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1060) URL filters to produce regexes to be used by OutlinkExtractor. - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1308) Unnecessary truncate content configuration, and logging in parse-zip/ZipParser - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1226) Migrate CrawlDbReader to MapReduce API - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1149) DomainStats should process numeric CrawlDB metadata - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1284) Add site fetcher.max.crawl.delay as log output by default. - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1118) JUnit test for index-basic - Markus Jelsma (Updated) (JIRA)
Build failed in Jenkins: Nutch-nutchgora #225 - Apache Jenkins Server
[jira] [Commented] (NUTCH-1330) OutlinkDB to preserve back up - Lewis John McGibbney (Commented) (JIRA)
[jira] [Closed] (NUTCH-1333) Introduce AvroStore, DataFileAvroStore and Accumulo Datastore implementations - Lewis John McGibbney (Closed) (JIRA)
[jira] [Updated] (NUTCH-1334) NPE in FetcherOutputFormat - Julien Nioche (Updated) (JIRA)
[jira] [Updated] (NUTCH-1336) Optionally not index db_notmodified pages - Markus Jelsma (Updated) (JIRA)
[jira] [Commented] (NUTCH-1314) Impose a limit on the length of outlink target urls - Ferdy Galema (Commented) (JIRA)
[jira] [Created] (NUTCH-1341) NotModified time set to now but page not modified - Markus Jelsma (Created) (JIRA)
[jira] [Created] (NUTCH-1344) BasicURLNormalizer to normalize https same as http - Sebastian Nagel (JIRA)
[jira] [Commented] (NUTCH-1317) Max content length by MIME-type - Markus Jelsma (JIRA)
[jira] [Updated] (NUTCH-1162) Write JUnit tests for parse-js - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-1168) Write JUnit tests for tld - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-882) Design a Host table in GORA - Ferdy Galema (JIRA)
Re: We just blocked Nutch - Markus Jelsma
