197 messages

org.apache.lucene.nutch-dev

2015 July

Page 7 (Messages 151 to 175)

[jira] [Commented] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - Chris A. Mattmann (JIRA)
[jira] [Commented] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - Markus Jelsma (JIRA)
[jira] [Commented] (NUTCH-2038) Naive Bayes classifier based html Parse filter (for filtering outlinks) - Asitang Mishra (JIRA)
[jira] [Created] (NUTCH-2057) Put all the files produced during training of the model for Naive Bayes classifier, in the Naive Bayes Parse Filter (NUTCH-2038), in a single folder - Asitang Mishra (JIRA)
[jira] [Commented] (NUTCH-2052) Enhance index-static to allow configurable delimiters - Peter Ciuffetti (JIRA)
[jira] [Commented] (NUTCH-2052) Enhance index-static to allow configurable delimiters - Peter Ciuffetti (JIRA)
[jira] [Reopened] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - Peter Ciuffetti (JIRA)
[jira] [Commented] (NUTCH-2059) protocol-httpclient, protocol-http unit test errors on Jenkins - Peter Ciuffetti (JIRA)
Re: GSOC2015- Sitemap crawler roadmap problems - Cihad Guzel
[jira] [Updated] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - Chris A. Mattmann (JIRA)
[jira] [Commented] (NUTCH-2058) Indexer plugin that allows RegEx replacements on the NutchDocument field values - Peter Ciuffetti (JIRA)
[Nutch Wiki] Update of "NutchPropertiesCompleteList" by PeterCiuffetti - Apache Wiki
[jira] [Resolved] (NUTCH-2021) Use protocol-selenium to Capture Screenshots of the Page as it is Fetched - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters - Lewis John McGibbney (JIRA)
[jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters - Sebastian Nagel (JIRA)
[jira] [Commented] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - Chris A. Mattmann (JIRA)
[jira] [Updated] (NUTCH-2062) Add Plugin for interacting with Selenium WebDriver - Lewis John McGibbney (JIRA)
[jira] [Updated] (NUTCH-2066) Allow user to specify crawldb and segment db in the Generate Job REST endpoint - Sujen Shah (JIRA)
[jira] [Commented] (NUTCH-1086) Rewrite protocol-httpclient - Nikolai Vasilev (JIRA)
[jira] [Assigned] (NUTCH-1936) GSoC 2015 - Move Nutch to Hadoop 2.X - Lewis John McGibbney (JIRA)
[jira] [Created] (NUTCH-2069) Ignore external links based on domain - Julien Nioche (JIRA)
[jira] [Created] (NUTCH-2070) Allow user to specify segment to Fetch via the REST API - Sujen Shah (JIRA)
[jira] [Updated] (NUTCH-2071) A parser failure on a single document may fail crawling job - Arkadi Kosmynin (JIRA)
[jira] [Updated] (NUTCH-1785) Ability to index raw content - Lewis John McGibbney (JIRA)
[jira] [Resolved] (NUTCH-1785) Ability to index raw content - Lewis John McGibbney (JIRA)
