

20 messages in org.xml.lists.xml-dev: Re: [xml-dev] MarkMail: now archiving...

| From | Sent On | Attachments |
|---|---|---|
| Jason Hunter | Nov 26, 2007 11:55 am | |
| Costello, Roger L. | Nov 26, 2007 1:32 pm | |
| Len Bullard | Nov 26, 2007 5:07 pm | |
| bryan rasmussen | Nov 27, 2007 12:59 am | |
| Elliotte Harold | Nov 27, 2007 4:51 am | |
| Elliotte Rusty Harold | Nov 27, 2007 5:00 am | |
| Len Bullard | Nov 27, 2007 5:56 am | |
| Jason Hunter | Nov 27, 2007 11:05 am | |
| Jason Hunter | Nov 27, 2007 12:46 pm | |
| Elliotte Rusty Harold | Nov 27, 2007 6:52 pm | |
| Edward C. Zimmermann | Nov 27, 2007 11:41 pm | |
| Jason Hunter | Nov 28, 2007 12:48 am | |
| Andrew Welch | Nov 28, 2007 2:21 am | |
| Edward C. Zimmermann | Nov 28, 2007 3:45 am | |
| John Snelson | Nov 28, 2007 4:51 am | |
| Jason Hunter | Nov 28, 2007 11:34 am | |
| Edward C. Zimmermann | Nov 28, 2007 1:12 pm | |
| Jason Hunter | Nov 28, 2007 3:09 pm | |
| Elliotte Rusty Harold | Dec 7, 2007 4:39 am | |
| Jason Hunter | Dec 7, 2007 9:38 am | |

| Subject: | Re: [xml-dev] MarkMail: now archiving xml-dev |
|---|---|
| From: | Edward C. Zimmermann (ed...@bsn.com) |
| Date: | Nov 27, 2007 11:41:20 pm |
| List: | org.xml.lists.xml-dev |
Quoting Elliotte Rusty Harold <elh...@metalab.unc.edu>:
Jason Hunter wrote:
I think the reason you *don't* see that is the inherent risk of letting someone else run arbitrary code on your server. What if the user starts calculating Pi to 1,000,000,000 digits?
You don't need to let outsiders run "arbitrary" code.
Perhaps we shouldn't have made XQuery Turing complete? (Side note: I'm pretty sure XQuery is Turing complete. Has anyone proved it yet?)
Let's not even talk about XQuery. Do we talk about SQL in systems that have SQL back ends? Normally the functionality is wrapped in other functions and interfaces. Heck, these days it seems most Java "programmers" could not even write a line of SQL if they had to (they'd argue, of course, that they don't). We must also be clear that XQuery is not "XQuery", especially in the context of information retrieval (e.g. Full-Text). Beyond the observation that SQL too is not "SQL", it would be foolish to promiscuously expose, despite all the user controls, one's RDBMS to every Tom, Dick and Harry. One could design an XQuery scripting extension that would be "safer" for anonymous use (keep in mind that what looks "safe" is not always "safe" from malicious users and bug exploits), but why bother? What's the benefit?
Functionality? This, I suggest, could be exposed via other means. One of my own personal interests is to explore how one can expose the information functionality (the will to retrieve "relevant" bits of information) in the most naive and transparent manner. Since we have a completely flexible unit of retrieval (not bound by "record" or any other unit defined at index time) and the user might not understand or know the details of the structural mark-up used to encode the information, we need to figure out interfaces that get users the information that's relevant to them. Since the problem is not typically "individualistic" (there are classes of common responses), one should be able to make do without user scripting. The email archive case is really much, much easier, since the user knows not only much of the structure (subject, sender, etc. in the header; lines, sentences and paragraphs, and perhaps some attachments, in the message body) but also the semantic rules for content. "Relevant" retrieval objects are nearly always the message in the context of the thread in its temporal context (other messages that appeared in the list). The only hard bit is to figure out what belongs in a thread: we have Message-ids, but not always, and we have changing subject lines.
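To make the threading problem concrete, here is a minimal Python sketch, not anything from MarkMail or from the poster. It assumes each message is a plain dict with hypothetical keys `id`, `in_reply_to` and `subject`, and that messages arrive in date order. It links a message by its In-Reply-To value when that is available and otherwise falls back to a normalized subject line, so it will still misfile a reply whose subject changed and whose Message-id chain is missing.

```python
import re

# Reply/forward prefixes ("Re:", "Fwd:") and list tags such as "[xml-dev]".
_PREFIX = re.compile(r"^\s*((re|fwd?)\s*:\s*|\[[^\]]+\]\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading reply/forward markers and list tags, then lowercase."""
    return _PREFIX.sub("", subject).strip().lower()


def group_threads(messages):
    """Group messages (assumed to be in date order) into threads.

    Each message is a dict with the hypothetical keys "id", "in_reply_to"
    and "subject". Returns {thread key: [messages in input order]}.
    """
    thread_of = {}       # Message-ID -> thread key
    subject_thread = {}  # normalized subject -> thread key
    threads = {}
    for index, msg in enumerate(messages):
        subject = normalize_subject(msg.get("subject", ""))
        parent = msg.get("in_reply_to")
        if parent in thread_of:
            # Preferred: an explicit reference to a message we have seen.
            key = thread_of[parent]
        elif subject and subject in subject_thread:
            # Fallback: same subject once prefixes are stripped.
            key = subject_thread[subject]
        else:
            # Start a new thread; invent a key if the Message-ID is missing.
            key = msg.get("id") or f"no-id-{index}"
        threads.setdefault(key, []).append(msg)
        if msg.get("id"):
            thread_of[msg["id"]] = key
        if subject:
            subject_thread.setdefault(subject, key)
    return threads
```

The subject fallback is deliberately crude: it catches replies from mail clients that drop the reference headers, at the cost of merging unrelated threads that happen to share a subject.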
What if they start consuming disk or thrashing the disk IO? When you query against hundreds of gigs of content, you don't have to be malicious to mess things up.
It's not 100s of GB. Mailing lists are not that large.
Or for a less constrained approach, try Amazon EC2. Run any code you like on their servers.
That's what virtual machines, zones and some other bits and concepts are about. It's not, however, needed, I think, for doing IR on XML. A lot of the functionality of XQuery (holding back from talking about XQuery) is not about the act of searching or retrieving information but about doing things to it. A lot of this "functionality" need not be performed by the "in-the-know" server.
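One way to picture that split, as a rough Python sketch under invented names (`Message`, `search`, `senders_by_count`, `ALLOWED_FIELDS` and `MAX_RESULTS` are all hypothetical, not any real archive's API): the server offers a single bounded search over a fixed set of fields, and anything that "does things" to the results runs on the client.

```python
from dataclasses import dataclass

# Hypothetical limits chosen for illustration only.
ALLOWED_FIELDS = {"subject", "sender", "body"}
MAX_RESULTS = 50


@dataclass(frozen=True)
class Message:
    subject: str
    sender: str
    body: str


def search(archive, field, term, limit=10):
    """Server side: one fixed, bounded operation, no user-supplied code."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unsupported field: {field!r}")
    limit = max(0, min(limit, MAX_RESULTS))  # cap the work per request
    term = term.lower()
    hits = []
    for msg in archive:
        if len(hits) >= limit:
            break
        if term in getattr(msg, field).lower():
            hits.append(msg)
    return hits


def senders_by_count(hits):
    """Client side: post-processing that the server never has to run."""
    counts = {}
    for msg in hits:
        counts[msg.sender] = counts.get(msg.sender, 0) + 1
    # Most frequent sender first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```

The server's cost per request is bounded by the field whitelist and the result cap, while the caller remains free to aggregate, reformat or transform the hits however it likes.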
Yes, it's challenging; but I suspect there's a real business model in there somewhere. :-)
-- E. Zimmermann, BSn/Munich R&D Unit Leopoldstrasse 53-55, D-80802 Munich, Federal Republic of Germany http://www.nonmonotonic.net
_______________________________________________________________________
XML-DEV is a publicly archived, unmoderated list hosted by OASIS to support XML implementation and development. To minimize spam in the archives, you must subscribe before posting.
[Un]Subscribe/change address: http://www.oasis-open.org/mlmanage/ Or unsubscribe: xml-...@lists.xml.org subscribe: xml-...@lists.xml.org List archive: http://lists.xml.org/archives/xml-dev/ List Guidelines: http://www.oasis-open.org/maillists/guidelines.php







