Subject: [search-ws] queryn: A proposal for SRU to facilitate forms processing
From: Hammond, Tony (t.ha@nature.com)
Date: Dec 7, 2010 9:30:03 am
List: org.oasis-open.lists.search-ws

Hi:

I wanted to put this (modest) proposal for SRU forward and get some feedback.

One of the differences between SRU and other general search interfaces is that
the actual query (CQL string) is contained within a single parameter and not
scattered across several parameters, as it is in, e.g., this Google search:

http://www.google.co.uk/search?q=this+-that=en=10==i=countryAU=images=qdr:w

This is a query for "this" and not "that" in Australian sites in the past week.

&q=this+-that &cr=countryAU &tbs=qdr:w

Yep, it's a bit of a mess. :) Mixes together query and control params. But still
it's straightforward to map to from a forms interface. I always think of
traditional query interfaces as being 1-D and SRU as being 2-D: one dimension
for query, and the other for control. And this separation of concerns is both a
blessing and a curse. A curse especially for implementors.

Now one of the difficulties with a forms input for SRU is that the CQL query
needs to be composed before it is added to the querystring as a single
parameter, which usually means some clever stylesheet handling of the query
fields (which we are currently using from the oclcsrw package) or some other
preprocessing method.

I was wondering whether SRU could have a new parameter, say "queryn", which
gave the integer number of search clauses across which the query was
fragmented, so that the query could be simply recomposed in a predetermined
fashion.

E.g. if one had something like:

&queryn=2
&q1.idx=index1 &q1.rel=relation1 &q1.trm=term1 &q1.bln=boolean1
&q2.idx=index2 &q2.rel=relation2 &q2.trm=term2 &q2.bln=boolean2

then the parameters could be sent directly from the form without any handling
and composed on the server side by following a simple rule, i.e. concatenation
of the (known number of) search clause components with whitespace separators,
and concatenation of the search clauses with (whitespaced) booleans. So, in the
above example with n=2 it would be straightforward for a querystring builder to
look for params "q1.*" through "q2.*" and build the CQL query as

query = '';
for (i = 1; i <= queryn; i++) {
    if (q{i}.trm) {
        query += q{i}.idx + ' ' + q{i}.rel + ' ' + q{i}.trm;
    }
    if (i < queryn) {
        query += ' ' + q{i}.bln + ' ';
    }
}

i.e.

query = q1.idx + ' ' + q1.rel + ' ' + q1.trm + ' ' + q1.bln + ' '
      + q2.idx + ' ' + q2.rel + ' ' + q2.trm

As long as a form laid out query components in a defined (numbered) fashion and
then declared the total number of search clauses then the query builder just
needs to iterate over the known number of search clauses.
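
To make the recomposition rule concrete, here is a minimal Python sketch of a
server-side query builder. It is only an illustration under stated assumptions,
not part of SRU or of the oclcsrw package: the function names, the default
boolean "and", the quoting of terms that contain whitespace, and the choice to
drop a clause (together with its boolean) when its term is empty are all
assumptions made for this example.

```python
from typing import Mapping


def quote_term(term: str) -> str:
    """Quote a term that contains whitespace or CQL-special characters (assumed policy)."""
    if not any(c.isspace() or c in '()=<>/"' for c in term):
        return term
    return '"' + term.replace('"', '\\"') + '"'


def build_cql(params: Mapping[str, str]) -> str:
    """Recompose a CQL query from queryn and the q{i}.idx/rel/trm/bln parameters."""
    n = int(params.get("queryn", "0"))
    query = ""
    pending_bool = ""  # boolean taken from the last clause that was kept
    for i in range(1, n + 1):
        term = params.get(f"q{i}.trm", "").strip()
        if not term:
            # Assumption: a row with an empty term is skipped entirely,
            # so it cannot leave a dangling boolean in the query.
            continue
        idx = params.get(f"q{i}.idx", "").strip()
        rel = params.get(f"q{i}.rel", "").strip()
        if idx and rel:
            clause = f"{idx} {rel} {quote_term(term)}"
        else:
            clause = quote_term(term)  # bare term when index or relation is missing
        if query:
            # Assumption: fall back to "and" when no boolean was supplied.
            query += f" {pending_bool or 'and'} "
        query += clause
        pending_bool = params.get(f"q{i}.bln", "").strip()
    return query


form = {
    "queryn": "2",
    "q1.idx": "dc.title", "q1.rel": "=", "q1.trm": "deep sea", "q1.bln": "and",
    "q2.idx": "dc.creator", "q2.rel": "=", "q2.trm": "smith", "q2.bln": "",
}
print(build_cql(form))
```

Skipping empty rows is one possible answer to the empty-term problem; a server
could equally reject such a request.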

Alternatively, the query could be assembled on the client using JavaScript such
as the "mungeForm" function we have on nature.com OpenSearch via the oclcsrw
package. And if a client had disabled JavaScript then the server itself could
detect the "queryn" parameter and reassemble the query. Of course this really
means that

searchRetrieve = query | queryn (=> query = q1.* + q2.* + ...)
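
The "query | queryn" alternative could be handled on the server along these
lines. This is a hedged sketch only: the function name resolve_query, the
recompose callback, and the rule that an explicit "query" takes precedence over
"queryn" are assumptions for illustration, not anything SRU defines.

```python
from typing import Callable, Mapping


def resolve_query(
    params: Mapping[str, str],
    recompose: Callable[[Mapping[str, str]], str],
) -> str:
    """Return the CQL string for a searchRetrieve request."""
    query = params.get("query", "").strip()
    if query:
        # Assumption: a complete CQL string wins over fragments.
        return query
    if "queryn" in params:
        # Fragmented query: hand the parameters to a recomposition routine.
        return recompose(params)
    raise ValueError("searchRetrieve needs either 'query' or 'queryn'")
```

A client with JavaScript enabled would then send "query" as usual, while a
plain form submission would fall through to the "queryn" branch.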

Such an extension to SRU could certainly provide ample support for simple forms
(which in practice most forms are) without requiring special JavaScript or
bespoke handling. Of course, it is very limiting in terms of query
expressivity, although it does map reasonably well to standard form inputs.

What do you think? Interested to hear your feedback on this general approach to
(re)assembling CQL queries.

Thanks,

Tony

PS: In an earlier attempt I had considered just breaking a CQL query into an
arbitrary number of string fragments which could be resequenced into a complete
CQL string, but ran into a problem concerning empty terms, which would break
the validity of the CQL. Hence this revised approach, which is more of a matrix
method with index, relation, term (and boolean) correlated and identified by
row order.