183 messages

org.apache.lucene.nutch-dev

2010 March

Page 1 of 8 (Messages 1 to 25)

[Nutch Wiki] Update of "Becoming_A_Nutch_Developer" by maqboolzee - Apache Wiki
[jira] Closed: (NUTCH-799) SOLRIndexer to commit once all reducers have finished - Julien Nioche (JIRA)
1.1 release? - Mattmann, Chris A (388J)
[jira] Commented: (NUTCH-650) Hbase Integration - Piet Schrijver (JIRA)
[jira] Updated: (NUTCH-469) changes to geoPosition plugin to make it work on nutch 0.9 - Julien Nioche (JIRA)
[jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?" - Robert Hohman (JIRA)
[jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - Julien Nioche (JIRA)
[jira] Reopened: (NUTCH-802) Problems managing outlinks with large url length - Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-802) Problems managing outlinks with large url length - Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-800) Generator builds a URL list that is not encoded - Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-795) Add ability to maintain nofollow attribute in linkdb - Andrzej Bialecki (JIRA)
Re: Crawling authenticated websites ! - Susam Pal
[jira] Commented: (NUTCH-693) Add configurable option for treating nofollow behaviour. - Andrzej Bialecki (JIRA)
[jira] Commented: (NUTCH-787) Upgrade Lucene to 3.0.0. - Andrzej Bialecki (JIRA)
[DISCUSS] Nutch as a top level project (TLP)? - Andrzej Bialecki
Re: [DISCUSS] Nutch as a top level project (TLP)? - Otis Gospodnetic
[Nutch Wiki] Update of "FAQ" by Ankit Dangi - Apache Wiki
[jira] Commented: (NUTCH-762) Alternative Generator which can generate several segments in one parse of the crawlDB - Julien Nioche (JIRA)
[jira] Created: (NUTCH-804) CrawlDatum.statNames can be modified - Mike Baranczak (JIRA)
[jira] Commented: (NUTCH-784) CrawlDBScanner - Hudson (JIRA)
[jira] Updated: (NUTCH-673) Upgrade the Carrot2 plug-in to release 3.0 - Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-763) Separate configuration files from resources to be included in the job file - Chris A. Mattmann (JIRA)
[jira] Updated: (NUTCH-540) some problem about the Nutch cache - Chris A. Mattmann (JIRA)
[jira] Commented: (NUTCH-789) Improvements to Tika parser - Chris A. Mattmann (JIRA)
