24 messages in com.googlegroups.google-enterprise-developer
Re: Metadata only crawl for images

| From | Sent On | Attachments |
|---|---|---|
| Anthony Smith | 22 May 2008 14:13 | |
| Anthony Smith | 23 May 2008 05:08 | |
| Jeff Ling | 23 May 2008 07:32 | |
| Anthony Smith | 23 May 2008 11:23 | |
| Jeff Ling | 23 May 2008 11:46 | |
| Anthony Smith | 23 May 2008 11:57 | |
| Jeff Ling | 23 May 2008 13:10 | |
| Anthony Smith | 27 May 2008 06:52 | |
| Anthony Smith | 29 May 2008 12:29 | |
| Jeff Ling | 29 May 2008 12:34 | |
| Anthony Smith | 29 May 2008 12:46 | |
| Anthony Smith | 30 May 2008 13:29 | |
| Jeff Ling | 30 May 2008 13:33 | |
| Anthony Smith | 30 May 2008 13:45 | |
| Jeff Ling | 30 May 2008 13:50 | |
| Anthony Smith | 30 May 2008 13:52 | |
| Anthony Smith | 30 May 2008 15:14 | |
| Jeff Ling | 01 Jun 2008 08:24 | |
| Anthony Smith | 02 Jun 2008 13:47 | |
| John Lacey | 05 Jun 2008 01:41 | |
| Anthony Smith | 05 Jun 2008 06:39 | |
| Anthony Smith | 05 Jun 2008 06:45 | |
| John Lacey | 06 Jun 2008 11:42 | |
| Anthony Smith | 06 Jun 2008 12:23 |

| Subject: | Re: Metadata only crawl for images |
|---|---|
| From: | Anthony Smith (anth...@frontlinelogic.com) |
| Date: | 05/27/2008 06:52:17 AM |
| List: | com.googlegroups.google-enterprise-developer |
Yes, the site is protected. We're using Metadata and URL feeds. The GSA is doing the crawling. The Crawler Access has been configured and seems to be working properly. PDFs and DOC files are getting indexed just fine. The exact error we're getting is "Error: Other 4xx HTTP response code." This error is only happening on image files (PNG, JPEG, GIF, etc...).
Anthony Smith, Developer Frontline Logic, Inc. http://www.frontlinelogic.com
Office Number +1 765-854-0739 Mobile Number +1 765-461-5254
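
For readers following along, here is a minimal sketch of what a metadata-and-URL feed entry for an image might look like when built from a connector. It is not the feed used in this thread. The datasource name, host, and metadata names are hypothetical, and `build_metadata_url_feed` is a helper made up for illustration. The element names follow the GSA feeds format as I understand it.

```python
import xml.etree.ElementTree as ET

# DOCTYPE line used by GSA feed files (quoted from memory of the feeds format).
FEED_PROLOG = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE gsafeed PUBLIC "-//Google//DTD GSA Feeds//EN" "">\n'
)


def build_metadata_url_feed(datasource, records):
    """Return feed XML (as a string) for a metadata-and-url feed.

    `records` is a list of dicts with the keys "url", "mimetype" and
    "metadata" (a dict of name -> content). This shape is an assumption
    made for this sketch, not part of any GSA API.
    """
    root = ET.Element("gsafeed")
    header = ET.SubElement(root, "header")
    ET.SubElement(header, "datasource").text = datasource
    ET.SubElement(header, "feedtype").text = "metadata-and-url"

    group = ET.SubElement(root, "group")
    for rec in records:
        record = ET.SubElement(
            group, "record", {"url": rec["url"], "mimetype": rec["mimetype"]}
        )
        metadata = ET.SubElement(record, "metadata")
        for name, content in rec["metadata"].items():
            ET.SubElement(metadata, "meta", {"name": name, "content": content})

    return FEED_PROLOG + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical image record with two metadata values attached.
    print(build_metadata_url_feed(
        "example_connector",
        [{
            "url": "http://repository.example.com/images/logo.png",
            "mimetype": "image/png",
            "metadata": {"author": "Jane Doe", "department": "Marketing"},
        }],
    ))
```

With a feed of this type the appliance still fetches each URL itself, which is why the 4xx response on the image URLs matters even though the metadata arrives through the feed.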
On May 23, 2008, at 4:10 PM, Jeff Ling wrote:
I guess the site is protected? Are you using meta-url feeds or content feeds? Is GSA doing the crawling (it seems so)? Have you configured Crawler Access if that's the case?
On Fri, May 23, 2008 at 11:58 AM, Anthony Smith <anth...@frontlinelogic.com> wrote:
Yes, I have tried and yes it does work.
On May 23, 2008, at 2:47 PM, Jeff Ling wrote:
Have you tried to access the same URL from a browser? Does it work?
On Fri, May 23, 2008 at 11:24 AM, Anthony Smith <anth...@frontlinelogic.com> wrote:
Thanks for the reply, Jeff. We've commented out the lines pertaining to images (jpeg, gif, png, etc.) in the crawl exception patterns. The GSA is now trying to crawl them, but we're getting an "Error: Other 4xx HTTP Response Code" (or something along those lines). The items are still not searchable. Is there something more I need to do GSA-wise, or something I might be missing in our connector implementation by any chance? Thanks!
On May 23, 2008, at 10:33 AM, Jeff Ling wrote:
You could definitely do that - make sure the files with image extensions are not excluded from the "Crawl & Index" exclusion patterns - by default they are.
Jeff
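
A quick way to sanity-check an exclusion list before re-crawling is sketched below. It is a deliberately simplified model and not the appliance's own matcher: it assumes that lines starting with `#` are comments and that a pattern ending in `$` matches the end of a URL, and it ignores every other pattern form. The function names and sample URLs are hypothetical.

```python
def active_suffix_patterns(pattern_text):
    """Return the suffixes of all non-commented patterns that end in '$'.

    Assumption for this sketch: '#' starts a comment line and a trailing
    '$' anchors the pattern to the end of the URL. Other pattern forms
    are skipped rather than interpreted.
    """
    suffixes = []
    for line in pattern_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out pattern
        if line.endswith("$"):
            suffixes.append(line[:-1])
    return suffixes


def blocking_patterns(url, pattern_text):
    """List the active suffix patterns that would exclude `url`.

    Matching here is case-sensitive, which is also an assumption.
    """
    return [
        suffix + "$"
        for suffix in active_suffix_patterns(pattern_text)
        if url.endswith(suffix)
    ]


if __name__ == "__main__":
    patterns = ".gif$\n#.png$\n.jpg$\n"
    for url in ("http://repository.example.com/a/logo.gif",
                "http://repository.example.com/a/logo.png"):
        print(url, blocking_patterns(url, patterns))
```

An empty list for an image URL only means none of the suffix patterns in this simplified model exclude it; the crawl diagnostics on the appliance remain the authority.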
On Fri, May 23, 2008 at 5:09 AM, Anthony Smith <anth...@frontlinelogic.com> wrote:
I forgot to mention the most important part! This is for connector development. We have images coming out of our repository with metadata attached to them and we'd like them to be searchable based on those metadata values.
On May 22, 2008, at 5:14 PM, Anthony Smith wrote:
Hey Folks,
Is there a way to "crawl" images only for metadata? We understand that an image can't be full-text indexed but we're still sending metadata information on the images that we'd still like to search on. Any help would be appreciated!