347 messages

org.apache.lucene.nutch-dev

2012 April

Page 12 (Messages 276 to 300)

[jira] [Created] (NUTCH-1324) DupeDB for Nutch - Markus Jelsma (Created) (JIRA)
[jira] [Updated] (NUTCH-1113) Merging segments causes URLs to vanish from crawldb/index? - Markus Jelsma (Updated) (JIRA)
[jira] [Resolved] (NUTCH-1225) Migrate CrawlDBScanner to MapReduce API - Markus Jelsma (Resolved) (JIRA)
[jira] [Updated] (NUTCH-1318) Parse time outs crash parsing fetcher - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-717) Make Nutch Solr integration easier - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1084) ReadDB url throws exception - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1194) CrawlDB lock should be released earlier - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1062) Migrate BasicURLNormalizer from Apache ORO to java.util.regex - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1300) Indexer to normalize URL's - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1130) JUnit test for Any23 RDF plugin - Markus Jelsma (Updated) (JIRA)
[jira] [Closed] (NUTCH-809) Parse-metatags plugin - Julien Nioche (Closed) (JIRA)
Re: Jenkins build is back to normal : Nutch-nutchgora #226 - Lewis John McGibbney
[jira] [Updated] (NUTCH-1333) Introduce AvroStore, DataFileAvroStore and Accumulo Datastore implementations - Lewis John McGibbney (Updated) (JIRA)
[jira] [Issue Comment Edited] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - Markus Jelsma (Issue Comment Edited) (JIRA)
[jira] [Updated] (NUTCH-1339) Default URL normalization rules to remove page anchors completely - Sebastian Nagel (Updated) (JIRA)
[jira] [Commented] (NUTCH-1314) Impose a limit on the length of outlink target urls - Julien Nioche (Commented) (JIRA)
[jira] [Created] (NUTCH-1343) Crawl sites with hashtags in url - Roberto Gardenier (Created) (JIRA)
[jira] [Updated] (NUTCH-1277) Fix [fallthrough] javac warnings - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-887) Delegate parsing of feeds to Tika - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-979) Add support for deleting Solr documents with ProtocolStatusCodes.NOTFOUND - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-902) Add all necessary files and configuration so that nutch can be used with different backends out-of-the-box - Ferdy Galema (JIRA)
[jira] [Commented] (NUTCH-879) URL-s getting lost - Ferdy Galema (JIRA)
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2 in ivy/ivy.xml - Ferdy Galema (JIRA)
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2 in ivy/ivy.xml - Ferdy Galema (JIRA)