7 messages in com.googlegroups.social-graph-api
Re: SG API not picking up rel=me link...

| From | Sent On | Attachments |
|---|---|---|
| Stuart Langridge | 03 Jul 2008 06:11 | |
| Martin Atkins | 03 Jul 2008 10:00 | |
| Stuart Langridge | 03 Jul 2008 12:19 | |
| Brad Fitzpatrick | 03 Jul 2008 12:33 | |
| Stuart Langridge | 03 Jul 2008 12:36 | |
| Brad Fitzpatrick | 03 Jul 2008 12:43 | |
| Bob Ngu | 04 Jul 2008 08:39 | |

| Subject: | Re: SG API not picking up rel=me links on my pages |
|---|---|
| From: | Brad Fitzpatrick (brad...@google.com) |
| Date: | 07/03/2008 12:33:20 PM |
| List: | com.googlegroups.social-graph-api |
On Thu, Jul 3, 2008 at 12:20 PM, Stuart Langridge <si...@kryogenix.org> wrote:
http://socialgraph.apis.google.com/lookup?q=http://www.kryogenix.org&fme=1&pretty=1
doesn't seem to be picking much up. I'd expect it to follow the /contact link, because it has rel="me" on it, and then from that page follow other rel="me" links to places like Twitter and Flickr and so on. How can I find out why the lookup code isn't picking up my links? If it's me in the wrong then I'm happy to change things around...
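For readers unfamiliar with what "following rel="me" links" involves, here is a minimal sketch in Python of extracting such links from a page and resolving them against the page URL. It is not the SGAPI parser, whose internals are not public. The sample HTML and the Twitter URL are made up for illustration, and only the standard library is used.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelMeParser(HTMLParser):
    """Collect href values of <a> and <link> tags whose rel includes "me"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated token list, e.g. rel="me nofollow".
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "me" in rel_tokens and href:
            # Resolve relative links such as "/contact" against the page URL.
            self.links.append(urljoin(self.base_url, href))


def rel_me_links(html, base_url):
    parser = RelMeParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical page content, not the real kryogenix.org markup.
sample = '''
<a rel="me" href="/contact">Contact</a>
<a rel="me nofollow" href="http://twitter.com/example">Twitter</a>
<a href="/about">About</a>
'''
print(rel_me_links(sample, "http://www.kryogenix.org/"))
```

A crawler would then fetch each returned URL and repeat the extraction there, which is the chain of hops described above.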
There's an API for running pages through the API for testing purposes: http://code.google.com/apis/socialgraph/docs/testparse.html
But having said that, your pages do seem to be being parsed as expected:
[snip test]
Yep. Hence my puzzlement :)
It is reassuring to know that it's not just that I've got it wrong, anyway. Is socialgraph.apis just running an older version of the code?
I put up the /testparse interface so people can tell the difference between the parsers sucking versus the crawl coverage sucking.
My goal's been to get the parsers as good as possible first, then I'm going to start addressing the crawl coverage issues. Googlebot doesn't necessarily care about crawling the same things that the SGAPI would like. I need to give it steering directions.
There's also a lot of data I'm not using yet. I also want to work on latency. From the time Googlebot hits your site, I want it in the SGAPI index within minutes (if not sooner), not the hours/days/more it can take now. The main data source I use now is the web index which has a bunch of stuff I don't need in it.... e.g. Pagerank/etc. So I should be using a lower-latency, lower-level data source for day-to-day stuff, and just using the web index for back-fill and to learn about gaps that I should steer Googlebot towards.
Short-term I want to build a public / open source regression test suite for the parsers (not the parsers themselves, though -- too inseparable and not that interesting) and let everybody see everything that is and should be parsed. Then others could in theory maintain that and report bugs of missing things in the parsers/canonicalization while I switch gears to working mainly on coverage issues.
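To make the idea of a parser regression suite concrete, here is a rough sketch of what such test cases and a runner might look like. Everything in it is hypothetical: the case format, the `run_cases` helper, and the deliberately naive `naive_parser` stand-in are illustrations only, not the suite Brad describes or any Google code.

```python
import re

# Hypothetical regression cases: (description, HTML snippet, expected rel="me" hrefs).
CASES = [
    ("plain anchor", '<a rel="me" href="http://example.com/a">a</a>', ["http://example.com/a"]),
    ("rel with several tokens", '<a rel="me nofollow" href="http://example.com/b">b</a>', ["http://example.com/b"]),
    ("href before rel", '<a href="http://example.com/c" rel="me">c</a>', ["http://example.com/c"]),
    ("no rel=me", '<a href="http://example.com/d">d</a>', []),
]


def naive_parser(html):
    """Deliberately naive stand-in: only matches rel="me" directly followed by href."""
    return re.findall(r'<a\s+rel="me"\s+href="([^"]+)"', html)


def run_cases(parse, cases):
    """Return (description, expected, actual) for every case the parser gets wrong."""
    failures = []
    for description, html, expected in cases:
        actual = parse(html)
        if actual != expected:
            failures.append((description, expected, actual))
    return failures


for description, expected, actual in run_cases(naive_parser, CASES):
    print(f"FAIL {description}: expected {expected}, got {actual}")
```

Because the cases are plain data, contributors could add a failing snippet without seeing the parser itself, which matches the split between a public suite and private parsers proposed above.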
I might also put up a rate-limited, google-login-required "Crawl my page and update the index for the SGAPI now" page.
- Brad