88 messages

org.apache.lucene.nutch-dev

2013 November

Page 3 (Messages 51 to 75): 1 2 3 4

[jira] [Updated] (NUTCH-1413) Record response time - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-1643) Unnecessary fetching with http.content.limit when using protocol-http - Lewis John McGibbney (JIRA)
[jira] [Resolved] (NUTCH-1650) Adaptive Fetch Scheduler interval Wrong Set - Lewis John McGibbney (JIRA)
[jira] [Created] (NUTCH-1660) Index filter for Page's latitude and longitude - Talat UYARER (JIRA)
[jira] [Commented] (NUTCH-1588) Port NUTCH-1245 URL gone with 404 after db.fetch.interval.max stays db_unfetched in CrawlDb and is generated over and over again to 2.x - Hudson (JIRA)
[jira] [Commented] (NUTCH-1360) Suport the storing of IP address connected to when web crawling - Hudson (JIRA)
Best way to avoid filtering outlinks? [PATCH] - Andy Boothe [WCG]
[jira] [Updated] (NUTCH-1663) Crawl page with specified language - İlhami KALKAN (JIRA)
[jira] [Created] (NUTCH-1667) Updatedb always ignore batchId - Nguyen Manh Tien (JIRA)
[jira] [Updated] (NUTCH-656) DeleteDuplicates based on crawlDB only - Julien Nioche (JIRA)
[jira] [Resolved] (NUTCH-1382) Adding support for EmbeddedSolrServer to SolrIndexer - Julien Nioche (JIRA)
[jira] [Work started] (NUTCH-1670) set same crawldb directory in mergedb parameter - lufeng (JIRA)
[jira] [Resolved] (NUTCH-1587) misspelled property "threshold" in conf/log4j.properties - Sebastian Nagel (JIRA)
[jira] [Updated] (NUTCH-1671) indexchecker to add digest field - Sebastian Nagel (JIRA)
[jira] [Created] (NUTCH-1673) Title isn't reset in MoreIndexingFilter - Nguyen Manh Tien (JIRA)
[jira] [Updated] (NUTCH-1674) Use batchId filter enable scan (GORA-119) for Fetch,Parse,Update,Index - Nguyen Manh Tien (JIRA)
[jira] [Commented] (NUTCH-1673) Title isn't reset in MoreIndexingFilter - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-1325) HostDB for Nutch - Otis Gospodnetic (JIRA)
[jira] [Updated] (NUTCH-1661) Language based crawling - Otis Gospodnetic (JIRA)
[jira] [Commented] (NUTCH-1297) it is better for fetchItemQueues to select items from greater queues first - Otis Gospodnetic (JIRA)
[jira] [Commented] (NUTCH-1647) protocol-http throws unzipBestEffort returned null for some pages - Luke (JIRA)
[jira] [Commented] (NUTCH-1360) Suport the storing of IP address connected to when web crawling - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-1674) Use batchId filter to enable scan (GORA-119) for Fetch,Parse,Update,Index - Alparslan Avcı (JIRA)
[jira] [Commented] (NUTCH-1325) HostDB for Nutch - Lewis John McGibbney (JIRA)
[jira] [Created] (NUTCH-1677) ORIGINAL_CHAR_ENCODING and CHAR_ENCODING_FOR_CONVERSION are not set in Parse HTML - Talat UYARER (JIRA)
