Subject:Re: Is 303 really necessary?
From:Nathan (nat@webr3.org)
Date:Nov 5, 2010 3:56:40 am
List:org.w3.public-lod

Ian Davis wrote:

> On Fri, Nov 5, 2010 at 10:05 AM, Nathan <nat@webr3.org> wrote:
>
>> Not at all, I'm saying that if big-corp makes a /web crawler/ that describes what documents are about and publishes RDF triples, then if you use 200 OK, throughout the web you'll get (statements similar to) the following asserted:
>>
>> </toucan> :primaryTopic dbpedia:Toucan ; a :Document .
>
> I don't think so. If the bigcorp is producing triples from their crawl then why wouldn't they use the triples they are sent (and/or content-location, link headers etc). The above looks like what you'd get from a third party translation of the crawl results without the context of actually having fetched the data from the URI.

I wouldn't be too sure about that. Even the major browser vendors get it completely wrong: for instance, do an XHR for a URI in Chrome and, even if there are 10 redirects in a chain, the base and the document URI are the one you requested. This is true all over the place, from using file_get_contents in PHP to most HTTP clients in any language; the pattern is simply:

requested-uri = "http://..."; doc = get(requested-uri);

The info at the end is almost always ( requested-uri, doc ). In fact, often there's not even any way to get the redirected-to URI back out of the HTTP client.
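
To make the ( requested-uri, doc ) pattern concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the local test server, the /toucan and /toucan.rdf paths, and the fetch helper are hypothetical, not anything from the crawler being discussed. It shows urllib silently following a 303, with the final URI surviving only because the caller explicitly asks for it via geturl().

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToucanHandler(BaseHTTPRequestHandler):
    """Hypothetical server: /toucan names the thing, /toucan.rdf is the document."""

    def do_GET(self):
        if self.path == "/toucan":
            # 303 See Other: point the client at a document about the thing.
            self.send_response(303)
            self.send_header("Location", "/toucan.rdf")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<> a <http://xmlns.com/foaf/0.1/Document> ."
            self.send_response(200)
            self.send_header("Content-Type", "text/turtle")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


# An opener with no proxies, so the local request is never routed elsewhere.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch(requested_uri):
    """Follow redirects and return (requested URI, final URI, body)."""
    with opener.open(requested_uri) as response:
        # geturl() is the only place the redirected-to URI is exposed.
        return requested_uri, response.geturl(), response.read()


# Port 0 lets the OS pick a free port for the throwaway server.
server = HTTPServer(("127.0.0.1", 0), ToucanHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

requested, final, doc = fetch(base + "/toucan")

server.shutdown()
server.server_close()
```

Here requested ends in /toucan while final ends in /toucan.rdf. A client that keeps only (requested, doc) ends up attaching the document to the URI that was meant to name the bird.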

As for using the triples they are sent, all you need to do is consider an HTML crawler running over RDFa documents.

> If the bigcorp is not linked data aware then today they will follow the 303 redirect as a standard HTTP redirect. rfc2616 says that the target URI is not a substitute for the original URI but just an alternate location to get a response from. The bigcorp will simply infer the statements you list above **even though there is a 303 redirect**.

Exactly, kind of semi-damning all /slash URIs... or at least requiring a load of provenance data.

> As rfc2616 itself points out, many user agents treat 302 and 303 interchangeably. Only linked data aware agents will ascribe special meaning to 303 and they're the ones that are more likely to use the data they are sent.

God knows why linked data clients are ascribing any meaning to 303. The pattern's there to ensure that a thing and the doc describing it have different URIs, and to ensure that people don't say that thing is a document, although it's not exactly worked out that way. The use of the particular status code 303 is only relevant if you're ascribing meaning to the response code of GETs; if you're not, then 3** will do the same job.

Out of interest, just who is trawling the web and going "301 that's an IR, 303 that's maybe not an IR, 302 that's an IR"?

My personal opinion on the entire thing is as simple as this: give different things different names. If there's a good chance something will take a thing to be a different kind of thing because of a particular URI scheme or style (like saying mailto:fo@bar.org is a mailbox), then avoid that scheme or style if it conflicts with the kind of thing you're describing. IMO slash URIs are often taken to mean documents, so I avoid them. You don't, so regardless of what status code you use, or how you deploy data, that conflation will be there. Thus my takeaway on the whole thing for you (even though it goes against the TAG) is: just 200 your URIs if you want to, but don't go around telling the rest of the world to do it and promoting it as a good pattern, as it's not. The tdb scheme or frag URIs address the issues, whilst introducing others, but at least the data's somewhat cleaner.
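
As a rough illustration of why frag URIs sidestep the conflation, the sketch below runs Python's urllib.parse.urldefrag over a hypothetical URI (http://example.org/toucan#this is made up for this example). The fragment is stripped before any HTTP request is made, so the thing and the document that actually gets dereferenced have different names whatever status code comes back.

```python
from urllib.parse import urldefrag

# Hypothetical hash URI naming the bird itself, not a document.
thing_uri = "http://example.org/toucan#this"

# An HTTP client only ever requests the part before the '#'.
doc_uri, fragment = urldefrag(thing_uri)

# The document URI and the thing URI are distinct names by construction.
names_differ = doc_uri != thing_uri
```

With a slash URI there is no such split: the name of the thing and the name that goes over the wire are the same string, which is where the redirect dance comes in.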

I'll roll with the "who cares" line of thinking. I certainly don't care how you or dbpedia or foaf or dc publish your data, so long as I can deref it, but for God's sake don't go telling everybody that using slash URIs and 200 is "The Right Thing TM".

Best,