347 messages

org.apache.lucene.nutch-dev

2012 April

Page 14 (Messages 326 to 347)

[jira] [Commented] (NUTCH-1234) Upgrade to Tika 1.1 - Hudson (Commented) (JIRA)
[jira] [Updated] (NUTCH-1245) URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1219) Upgrade all jobs to new MapReduce API - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1224) Migrate FreeGenerator to MapReduce API - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1186) FreeGenerator always normalizes - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1275) Fix [unchecked] javac warnings - Markus Jelsma (Updated) (JIRA)
[jira] [Updated] (NUTCH-1079) StringBuffer converted to StringBuilder - Markus Jelsma (Updated) (JIRA)
[jira] [Commented] (NUTCH-1306) Commit after finished writing to solr index - Lewis John McGibbney (Commented) (JIRA)
[jira] [Commented] (NUTCH-1251) Deletion of duplicates fails with org.apache.solr.client.solrj.SolrServerException - Arkadi Kosmynin (Commented) (JIRA)
[Nutch Wiki] Update of "IndexMetatags" by JulienNioche - Apache Wiki
[jira] [Commented] (NUTCH-1208) Don't include KEYS file in bin distribution - Hudson (Commented) (JIRA)
[jira] [Created] (NUTCH-1336) Optionally not index db_notmodified pages - Markus Jelsma (Created) (JIRA)
[jira] [Commented] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - Roberto Gardenier (Commented) (JIRA)
[jira] [Commented] (NUTCH-585) [PARSE-HTML plugin] Block certain parts of HTML code from being indexed - Markus Jelsma (Commented) (JIRA)
[jira] [Updated] (NUTCH-1331) limit crawler to defined depth - Julien Nioche (Updated) (JIRA)
[jira] [Commented] (NUTCH-1314) Impose a limit on the length of outlink target urls - Julien Nioche (Commented) (JIRA)
[jira] [Created] (NUTCH-1340) Increase scalability by only removing markers when they actually exist for DbUpdaterReducer - Ferdy Galema (Created) (JIRA)
[jira] [Commented] (NUTCH-1317) Max content length by MIME-type - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-1161) Write JUnit tests for microformats-reltag plugin - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-1166) Write JUnit tests for scoring-link - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-944) Increase the number of elements to look for URLs and add the ability to specify multiple attributes by elements - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-1205) Upgrade gora modules to 0.2 in ivy/ivy.xml - Lewis John McGibbney (JIRA)
