5 messages in com.googlegroups.social-graph-api: The problems with voidstar.com

| From | Sent On | Attachments |
|---|---|---|
| Brad Fitzpatrick | 13 Mar 2008 19:39 | .gz |
| Julian Bond | 14 Mar 2008 00:25 | |
| Danny Ayers | 14 Mar 2008 03:41 | |
| Julian Bond | 14 Mar 2008 03:48 | |
| Danny Ayers | 14 Mar 2008 04:21 | |
| Subject: | The problems with voidstar.com |
|---|---|
| From: | Brad Fitzpatrick (brad...@google.com) |
| Date: | 03/13/2008 07:39:48 PM |
| List: | com.googlegroups.social-graph-api |
| Attachments: | voidstar-json.txt.gz - 376k |
Julian, (and others who are curious...)
On Fri, Feb 15, 2008 at 3:51 AM, Julian Bond <juli...@voidstar.com> wrote: ....
There's something broken here. I've had a YASN-Roll block on every page in http://www.voidstar.com for months now. It contains rel="me" links. There are 25 or so entries to my profile pages on other sites. Googlebot is hitting these pages many times each day. But still
http://socialgraph-resources.googlecode.com/svn/trunk/samples/findcontacts.html?q=http%3A%2F%2Fwww.voidstar.com
and
http://socialgraph-resources.googlecode.com/svn/trunk/samples/findyours.html?q=http%3A%2F%2Fwww.voidstar.com
returns no data.
I looked into this.
Let's take the findcontacts.html example. Internally, if you look at the source to the JavaScript on that page, it requests the JSON from this URL:
http://socialgraph.apis.google.com/lookup?q=voidstar.com&fme=1&edi=1&edo=1&pretty=1
Notice the 503 error code after exactly 5 seconds. That's no coincidence... computing that response for your URL overshoots what's acceptable for HTTP response latency, taking over 5 whole seconds (!!). At that point I arbitrarily declare that ridiculous and give up, since anything over 500 ms is already ridiculously slow.
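The give-up-after-5-seconds behaviour described above can be sketched as follows. This is only an illustration, not the actual server code: `respond_with_deadline` and `slow_lookup` are hypothetical names, and running the work in a thread pool is an assumption made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def respond_with_deadline(compute, deadline_s=5.0):
    """Run compute() and return (200, result), or (503, None) if it overruns.

    Hypothetical helper: the 5-second default mirrors the cutoff mentioned
    in the message; everything else here is an assumption.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(compute)
    try:
        return 200, future.result(timeout=deadline_s)
    except FutureTimeout:
        # The computation took too long: give up and report "unavailable".
        return 503, None
    finally:
        # Do not wait for a slow worker; it finishes in the background.
        pool.shutdown(wait=False)


def slow_lookup():
    """Stand-in for an expensive graph lookup (hypothetical)."""
    time.sleep(0.3)
    return {"nodes": {}}


if __name__ == "__main__":
    # With a deadline shorter than the lookup, the caller gets a 503.
    print(respond_with_deadline(slow_lookup, deadline_s=0.05))
```

Note that the worker is not cancelled here, only abandoned; a real service would presumably also want to stop the wasted work.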
So I ran the query without a timeout and got the attached JSON response, which is 4.5 MB uncompressed. I've attached the compressed version (still 279KB).
Looks like you have a dozen subdomains on voidstar.com (www., ww., w., anything_at_all., jblaptop., ...) which are all showing up. Plus every tag/date on your pages is counting as a separate person.
Btw, this is how the whole Internet looked (a total mess) before I added both 1) a bunch of software-specific cleanup rules for software (e.g. WordPress) that runs on lots of different domains, and 2) the sgnodemapper stuff, to clean up the URLs of big, popular sites.
Because voidstar.com is basically your personal homepage (it looks like?), it doesn't qualify for sgnodemapper, and because I don't recognize your URL patterns (your own custom software?), every page on your site is getting treated like its own URL ("own person").
Hence the explosion of URLs.
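To make the explosion concrete, here is a minimal sketch that counts how many distinct URLs (each one a separate "person") every hostname contributes. The sample URLs are made up for illustration, the tag/date query patterns in particular are assumptions, and `nodes_per_host` is a hypothetical helper rather than anything from the API.

```python
from collections import Counter
from urllib.parse import urlsplit


def nodes_per_host(urls):
    """Count the distinct URLs (graph nodes) seen under each hostname."""
    return Counter(urlsplit(u).hostname for u in set(urls))


# Made-up sample: wildcard subdomains plus per-tag and per-date pages.
sample = [
    "http://www.voidstar.com/",
    "http://ww.voidstar.com/",
    "http://jblaptop.voidstar.com/",
    "http://www.voidstar.com/?tag=example",
    "http://www.voidstar.com/?date=2008-03",
]

counts = nodes_per_host(sample)

if __name__ == "__main__":
    for host, n in sorted(counts.items()):
        print(host, n)
```

Five URLs that a human would read as one person's site come out as five separate nodes across three hostnames.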
You having what appears to be a wildcard domain name all serving your site doesn't make the situation any better.
I'm tempted to do the quick hacky thing and map anything at voidstar.com to be just http://www.voidstar.com/, but that's a total one-off hack, and this is sure to bite somebody else in the future. I'd like to think of the best algorithmic, heuristic solution to this.
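For reference, the quick hack itself would amount to something like the sketch below. It is only an illustration of the one-off mapping, not a proposal for the general heuristic, and `collapse_voidstar` is a hypothetical name.

```python
from urllib.parse import urlsplit

CANONICAL = "http://www.voidstar.com/"


def collapse_voidstar(url):
    """Map any URL on voidstar.com or one of its subdomains to one node."""
    host = (urlsplit(url).hostname or "").lower()
    if host == "voidstar.com" or host.endswith(".voidstar.com"):
        return CANONICAL
    # Anything else is left alone; this rule knows about one site only.
    return url


if __name__ == "__main__":
    print(collapse_voidstar("http://jblaptop.voidstar.com/?tag=example"))
    print(collapse_voidstar("http://example.org/page"))
```

The hard-coded domain is exactly what makes this a one-off: the next personal site with a wildcard domain and unrecognized URL patterns would need its own rule.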
Anyway, that's what's up. Sorry for taking so long to get back to you on this. I'm going to think about what I want to do about this.
- Brad




