Subject:Re: Is 303 really necessary?
From:David Wood (dav@3roundstones.com)
Date:Nov 4, 2010 1:14:08 pm
List:org.w3.public-lod

On Nov 4, 2010, at 15:04, Harry Halpin wrote:

On Thu, Nov 4, 2010 at 7:18 PM, Ian Davis <me@iandavis.com> wrote:

On Thursday, November 4, 2010, Nathan <nat@webr3.org> wrote:

Please, don't.

303 is a PITA, and it has detrimental effects across the board from network load
through to server admin. Likewise #frag URIs have their own set of PITA features
(although they are nicer on the network and servers).

However, and very critically (if you can get more critical than critical!), both
of these patterns / constraints are here to ensure that different things have
different names, and without that distinction our data is junk.

I agree with this and I address it in my blog post where I say we should link the thing to its description using a triple rather than a network response code.

This is key. The issue with 303 is that it uses a "network response code" to make a semantic distinction that can (and likely should) be made in the data format itself, i.e. distinguishing a name for the data from a name for the thing itself. To be precise, you can state (ex:thing isDescribedBy ex:document, ex:thing rdf:type irw:nonInformationResource) in ex:document that is full of statements about ex:thing, and a half-way intelligent RDF parser should be able to sort that out.
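As a rough sketch of what "sorting that out" from the data alone might look like, the Python below models an already-parsed document as a set of triples and separates statements about the thing from statements about the document. The sample triples, the ex:name and ex:format properties, and both helper functions are illustrative assumptions, not part of any RDF library.

```python
# Triples are (subject, predicate, object) tuples; prefixed names are plain strings.
# Every name below is an illustrative assumption.
TRIPLES = {
    ("ex:thing", "ex:isDescribedBy", "ex:document"),
    ("ex:thing", "rdf:type", "irw:nonInformationResource"),
    ("ex:thing", "ex:name", "Some thing"),
    ("ex:document", "ex:format", "application/rdf+xml"),
}


def described_things(triples, document):
    """Return the subjects that the data itself says are described by `document`."""
    return {s for (s, p, o) in triples if p == "ex:isDescribedBy" and o == document}


def statements_about(triples, subject):
    """Return the (predicate, object) pairs asserted about `subject`."""
    return {(p, o) for (s, p, o) in triples if s == subject}


things = described_things(TRIPLES, "ex:document")
thing_facts = statements_about(TRIPLES, "ex:thing")
doc_facts = statements_about(TRIPLES, "ex:document")
```

Here the thing/document distinction is carried entirely by the isDescribedBy triple, so a consumer needs no hint from the HTTP layer, at the price of parsing the document first.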

Yes, I agree this is the key point. You might note that an HTTP request to a
resource that returns an RDF document (of whatever RDF serialization syntax)
will already give you a 200 in many cases and that is "correct" in that an RDF
document is an information resource. However, what it describes may not be. In
the case where you are describing a non-information resource, using a 303
provides a benefit in that the clue to the type of resource is accessible before
parsing the document.

It seems that Ian has made an efficiency argument. Which is cheaper? Getting a
clue from a network response code or parsing a representation?
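To make the two options concrete, here is a minimal sketch that puts them side by side on simulated responses (no network access). The Response class, both helper functions, the example URIs, and the one-statement-per-line body format are assumptions made for illustration; the sketch does not measure which option is cheaper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """A simulated HTTP response; only the fields the sketch needs."""
    status: int
    location: Optional[str] = None
    body: str = ""


def clue_from_status(resp: Response) -> str:
    """Read the clue from the status code alone, without touching the body."""
    if resp.status == 303:
        return "may be a non-information resource"
    if 200 <= resp.status < 300:
        return "information resource"
    return "unknown"


def clue_from_body(uri: str, body: str) -> str:
    """Read the clue from the data: scan whitespace-separated statements for a type triple."""
    for line in body.splitlines():
        parts = line.split()
        if parts[:3] == [uri, "rdf:type", "irw:nonInformationResource"]:
            return "may be a non-information resource"
    return "unknown"


# Option 1: the server answers 303 and points at the describing document.
redirected = Response(status=303, location="http://example.org/doc/thing")

# Option 2: the server answers 200 and the distinction lives in the data.
direct = Response(
    status=200,
    body="ex:thing ex:isDescribedBy ex:document .\n"
         "ex:thing rdf:type irw:nonInformationResource .\n",
)
```

With the 303, the clue arrives before any parsing but costs a second request to fetch the description; with the 200, one request suffices but the clue is only available after the body has been read.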

Re 303 and performance, I am *sure* for DERI's Sindice it's fine to follow 303s. However, performance-wise for a server, it seems rather silly to do another HTTP redirect with little gain, and I think something Google-size would care about wasting HTTP response codes.

Lastly, the real problem is deployment. Imagine you are a regular database admin - lots of people do not care and do not want to (and can't) edit .htaccess and deal with Apache HTTP redirects just to put some data on the Web, and will not use Link headers or anything else either. They want to take a file and just put it on the Web via FTP without messing with their server. Many developers (I've had this conversation with David Recordon before the OGP launch in person) note that making the average data deployer worry about the difference between a URI for a thing and a document will naturally hurt deployment, so *not* following this practice was considered a feature, not a bug, by Facebook.

A database admin is at a bar. An RDF evangelist comes up and says "Yes, I'd like for you to release your data, but you have to set up your server to do 303 redirects and convert your data to this weird looking RDF/XML format in addition to having a human-readable format. Maybe you can google for D2RQ..." A Microsoft guy comes up and says "Hey, here's this oData format, why not just have your server produce a format we understand, Atom." Guess who will win the argument :)

Kingsley correctly pointed out that this scenario is a bit overblown. How about
this? "Yes, I'd like you to release your data, so please serve your data with
metadata descriptions in RDFa, or separately as RDF with links from the data, or
303 redirects, or...."

Regards, Dave

Think outside the box, RDF needs to lower deployment costs. You can do that and keep the name/thing distinction, by doing it as a triple in the dataformat, which is a logical thing to do rather than doing essentially semantic work as a network response code.

This goes beyond your and my personal opinions, or those of anybody here, the
constraints are there so that in X months time when "multi-corp" trawls the web,
analyses it and releases billions of statements saying like { </foo> :hasFormat
"x"; sioc:about dbpedia:Whatever } about each doc on the web, that all of those
statements are said about documents, and not about you or I, or anything else
real, that they are said about the right "thing", the correct name is used.

I don't see that as a problem. It's an error because it's not what the original publisher intended but there are many many examples where that happens in bulk, and actually the 303 redirect doesn't prevent it happening with naive crawlers.

If someone asserts something we don't have to assume it is automatically true. We can get authority about what a URI denotes by dereferencing it. We trust third party statements as much or as little as we desire.

And this is critically important, to ensure that in X years time when somebody
downloads the RDF of 2010 in a big *TB sized archive and considers the graph of
RDF triples, in order to make sense of some parts of it for something important,
that the data they have isn't just unreasonable junk.

Any decent reasoner at that scale will be able to reject triples that appear to contradict one another. Seeing properties such as "format" against a URI that everyone else claims denotes an animal is going to stick out.
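A toy version of that kind of check, far short of a real reasoner, might look like the sketch below. It assumes a hypothetical set of document-only properties (DOCUMENT_PROPERTIES), a hypothetical ex:Document class, and simple majority voting over rdf:type claims; none of these come from the thread or from any reasoner's actual behaviour.

```python
from collections import Counter, defaultdict

# Assumption: properties that only make sense when the subject is a document.
DOCUMENT_PROPERTIES = {"ex:hasFormat"}


def flag_outliers(triples):
    """Flag document-only statements about subjects that most type claims say are not documents."""
    type_votes = defaultdict(Counter)
    for s, p, o in triples:
        if p == "rdf:type":
            type_votes[s][o] += 1

    flagged = []
    for s, p, o in triples:
        if p in DOCUMENT_PROPERTIES and type_votes[s]:
            majority_type, _count = type_votes[s].most_common(1)[0]
            if majority_type != "ex:Document":
                flagged.append((s, p, o))
    return flagged


# A list rather than a set, so repeated claims from different sources each count.
CRAWLED = [
    ("ex:fido", "rdf:type", "ex:Animal"),
    ("ex:fido", "rdf:type", "ex:Animal"),
    ("ex:fido", "rdf:type", "ex:Animal"),
    ("ex:fido", "ex:hasFormat", "x"),
    ("ex:page", "rdf:type", "ex:Document"),
    ("ex:page", "ex:hasFormat", "text/html"),
]

outliers = flag_outliers(CRAWLED)
```

In this sample the format statement about ex:fido is the one that sticks out, while the same property on ex:page passes.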

It's not about what we say something is, it's about what others say the thing
is, and if you 200 OK the URIs you currently 303, then it will be said that you
are a document, as simple as that. Saying you are a document isn't the killer,
it's the hundreds of other statements said along side that which make things so
ambiguous that the info is useless.

That's only true under the httpRange-14 finding which I am proposing is part of the problem.

If 303s are killing you then use fragment URIs, if you refuse to use fragments
for whatever reason then use something new like tdb:'s, support the data you've
published in one pattern, or archive it and remove it from the web.
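For the fragment option, a small sketch shows why no redirect is needed: a client removes the fragment before making the request, so the thing and its document already have different names. The example URI is hypothetical; urldefrag is from Python's standard library.

```python
from urllib.parse import urldefrag

# Hypothetical URI naming a person (the thing), not a document.
thing_uri = "http://example.org/people#alice"

# Splitting off the fragment gives the URI of the document a client would actually request.
document_uri, fragment = urldefrag(thing_uri)
```

A plain 200 for document_uri then says nothing about thing_uri, which is why the fragment pattern is easier on the network and the server.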

These are publishing alternatives, but I'm focussed on the 303 issue here.

But, for whatever reasons, we've made our choices, each has pros and cons, and
we have to live with them - different things have different names, and the giant
global graph is usable. Please, keep it that way.

Agree, different things have different names, that's why I emphasise it in the blog post. I don't agree that the status quo is the best state of affairs.

Best,