5 messages in com.googlegroups.google-enterprise-developer
Re: UNC crawling with the latest updates

From          Sent On
Chris         19 Feb 2007 08:59
Sean Cooper   20 Feb 2007 04:56
Chris         20 Feb 2007 06:11
Shef2000      21 Feb 2007 17:03
Shef2000      22 Feb 2007 10:23
Subject: Re: UNC crawling with the latest updates
From: Shef2000 (ian.@gmail.com)
Date: 02/21/2007 05:03:21 PM
List: com.googlegroups.google-enterprise-developer

Hi Chris,

I just received my Mini a few days ago as well and have run into the same issue, even though the documentation clearly states:

<quote> If you are using Windows UNC path names, you do not need to specify the protocol and you need to use a backslash ("\") instead of a forward slash. UNC entries would use this format:

\\<host>[:port]\<path>

The information contained in square brackets [ ] is optional. The backslash after <host>[:port] is required.

Valid examples:

https://www.example.com/secure/
http://www.example.com:80/help/
smb://fileshare.mycompany.com/
\\fileshare.mycompany.com\shared\
</quote>
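
The quoted format can be checked against candidate entries with a short pattern. The following is only a sketch in Python of the rule as the documentation words it; the names UNC_PATTERN and parse_unc are made up for illustration, and nothing here reflects how the appliance itself validates input.

```python
import re

# \\<host>[:port]\<path>, per the quoted documentation.
# The backslash after <host>[:port] is required; the port is optional.
UNC_PATTERN = re.compile(
    r"^\\\\(?P<host>[^\\/:]+)(?::(?P<port>\d+))?\\(?P<path>.*)$"
)


def parse_unc(entry):
    """Return (host, port, path) if entry matches the documented
    UNC format, otherwise None. Hypothetical helper."""
    match = UNC_PATTERN.match(entry)
    if match is None:
        return None
    return match.group("host"), match.group("port"), match.group("path")


# Matches the documented example.
print(parse_unc("\\\\fileshare.mycompany.com\\shared\\"))
# Forward-slash variants do not match the documented format.
print(parse_unc("//myshare.company.com/myfolder/"))
```

By this reading, the forward-slash entries tried later in the thread fall outside the documented format, while the backslash form fits it.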

It seems that a rewrite of the URL is taking place. I was successful in setting up a Samba share, though:

smb://myserver/myshare/

But then the search results contained smb:// URLs, so Windows users could not browse to the actual files, only to the cached versions.

I did find an alternate solution, which involves rewriting the results with XSLT so that the smb:// is replaced by \\, but I have yet to try it.
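
The string transformation behind that idea is small. Below is a minimal sketch in Python rather than XSLT, assuming the result URLs look like the smb://myserver/myshare/ example above; the function name smb_url_to_unc is hypothetical, and this is not the untried XSLT solution itself.

```python
from urllib.parse import urlsplit, unquote


def smb_url_to_unc(url):
    """Turn smb://host/share/path into \\\\host\\share\\path.

    Hypothetical helper: URLs with any other scheme are returned
    unchanged. Note that urlsplit lowercases the host name and that
    credentials or a port in the URL are dropped.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "smb":
        return url
    host = parts.hostname or ""
    # Decode %20 and similar escapes, then switch to backslashes.
    path = unquote(parts.path).replace("/", "\\")
    return "\\\\" + host + path


print(smb_url_to_unc("smb://myserver/myshare/"))
```

Whether a browser will actually open the resulting \\ link depends on the browser and its security settings, so this covers only the rewrite step.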

I did send Google a support request to see if there is a workaround, and will keep you posted.

Regards,
Ian

On Feb 20, 6:12 am, "Chris" <itsa@gmail.com> wrote:

Thanks for the input. I just tried it out, however, I am still receiving the error:

"You have entered one or more invalid start URLs. Please check your edits."

I've tried both with and without the port:

//myshare.company.com/myfolder/
//myshare.company.com:139/myfolder/

-Chris

On Feb 20, 7:56 am, "Sean Cooper" <seco@mitre.org> wrote:

Have you tried using forward slashes?

i.e. //myshare.company.com/myfolder/

The search appliances run Linux, which I believe uses forward slashes for network shares.

On Feb 19, 11:59 am, "Chris" <itsa@gmail.com> wrote:

I was looking at the documentation for the Search Appliance today, and noticed that under "Crawl and Index > Crawl URLs" we can now specify UNC paths.

However, when I attempt to specify a UNC path, such as \\myshare.company.com\myfolder\, the appliance assumes that it is a mistyped http path and changes it to something like http://%5C%5Cmyshare.company.com%5Cmyfolder%5C/.

Additionally, under the "Follow and Crawl..." heading, I tried to enter \\myshare.company.com\; however, it will only accept this if I end the line with a forward slash: \\myshare.company.com\/
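
The %5C sequences in that rewritten URL are percent-encoded backslashes. As a rough illustration only, and not a description of the appliance's internals, the same string can be reproduced in Python by treating the UNC path as an http host name and escaping it:

```python
from urllib.parse import quote

# The UNC path as typed (backslashes escaped for the Python literal).
unc_path = "\\\\myshare.company.com\\myfolder\\"

# Percent-encode every reserved character; a backslash becomes %5C.
encoded = quote(unc_path, safe="")

# Assumption: the entry is then prefixed with http:// and given a
# trailing slash, which matches the URL reported in this message.
rewritten = "http://" + encoded + "/"
print(rewritten)
```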

Has anybody else tried anything with UNC paths? If so, were you able to successfully crawl?

Any input would be greatly appreciated.

Thanks,
-Chris
