| From | Sent On | Attachments |
|---|---|---|
| Abhishek53 S | Feb 1, 2012 1:28 am | |
| Geert Josten | Feb 1, 2012 1:55 am | |
| Abhishek53 S | Feb 1, 2012 2:46 am | |
| Abhishek53 S | Feb 1, 2012 2:54 am | |
| Will Thompson | Feb 1, 2012 9:20 am | |
| Michael Blakeley | Feb 1, 2012 10:55 am | |
| Will Thompson | Feb 1, 2012 11:18 am | |
| Subject: | Re: [MarkLogic Dev General] element-query with punctuation insensitive and punctuation marks as cts:text | |
|---|---|---|
| From: | Will Thompson (wtho...@jonesmcclure.com) | |
| Date: | Feb 1, 2012 11:18:29 am | |
| List: | com.marklogic.developer.general | |
Mike - This is also what I have found. The search:parse function has to actually
return this "empty" query for the <empty> option to have any effect:
<cts:and-query qtextempty="1" xmlns:cts="http://marklogic.com/cts"/>
When it is passed punctuation text and "punctuation-insensitive" in options it
returns:
<cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
  <cts:text>,</cts:text>
  <cts:option>punctuation-insensitive</cts:option>
</cts:word-query>
The same problem occurs with "whitespace-insensitive" in options and
search:parse(" ",$options):
<cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
  <cts:text> </cts:text>
  <cts:option>whitespace-insensitive</cts:option>
</cts:word-query>
Both of these queries are unaffected by <empty apply="all-results"/> and return
no results. I don't think this is desirable for any application. Ideally, I
think the Search API would either provide an option to behave like your parser,
or search:parse would return empty queries in these scenarios.
Stripping out punctuation from the input query is a decent workaround, but we
have to be careful not to strip out characters that could be part of a
constraint, phrase, custom grammar, etc., so the regex gets uglier.
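A minimal sketch of what such a more selective regex might look like. The helper name local:strip-punctuation is hypothetical, and the set of characters it keeps (double quotes for phrases, colons for constraints, parentheses for grouping, hyphens for negation) is an assumption about the grammar in use, not something the Search API provides; adjust it to match your own options.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: replace punctuation with spaces, but keep characters
   that the search grammar may treat as syntax. The kept set below
   (double quote, colon, parentheses, hyphen) is an assumption. :)
declare function local:strip-punctuation($qtext as xs:string) as xs:string
{
  (: Collapse the runs of spaces left behind by the replacement. :)
  fn:normalize-space(fn:replace($qtext, '[^\w\s":()\-]', ' '))
};

local:strip-punctuation('"metal" , "locker"')
```

For the input above, the stray comma is removed while both quoted phrases survive, giving "metal" "locker".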
-Will
-----Original Message-----
From: gene...@developer.marklogic.com
[mailto:gene...@developer.marklogic.com] On Behalf Of Michael Blakeley
Sent: Wednesday, February 01, 2012 10:56 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] element-query with punctuation insensitive
and punctuation marks as cts:text
In cases like this it's worth looking at the query output. The search:parse
function produces this:
<cts:and-query strength="20" qtextjoin="" qtextgroup="( )"
    xmlns:cts="http://marklogic.com/cts">
  <cts:word-query qtextpre="&quot;" qtextref="cts:text" qtextpost="&quot;">
    <cts:text>metal</cts:text>
    <cts:option>case-insensitive</cts:option>
    <cts:option>unstemmed</cts:option>
    <cts:option>punctuation-insensitive</cts:option>
  </cts:word-query>
  <cts:and-query strength="20" qtextjoin="" qtextgroup="( )">
    <cts:word-query qtextref="cts:text">
      <cts:text>,</cts:text>
      <cts:option>case-insensitive</cts:option>
      <cts:option>unstemmed</cts:option>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
    <cts:word-query qtextpre="&quot;" qtextref="cts:text" qtextpost="&quot;">
      <cts:text>locker</cts:text>
      <cts:option>case-insensitive</cts:option>
      <cts:option>unstemmed</cts:option>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
  </cts:and-query>
</cts:and-query>
See the cts:text entry for ','? After some testing with 5.0-2, my guess is that
since ',' is the only character in that punctuation-insensitive word-query, that
word-query term ends up not matching anything. I think it should match
*everything*, which would also cause problems if search:parse created that
query. But whether the existing behavior is a bug or not, the workaround should
be simple: rewrite the input query so that it does not contain any punctuation.
This might be suitable:
replace($query, '[^\w\s]', ' ')
Or you might look into using https://github.com/mblakele/xqysp with
search:resolve(). XQYSP ignores unexpected punctuation unless it is part of a
quoted term.
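As a rough sketch, the replace() workaround could be applied to the sample query from earlier in this thread before it reaches search:parse. The options and the "data" element name are taken from that sample; the variable names are illustrative only. Note that this particular regex also removes the double quotes, so quoted phrases become plain terms.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <term-option>case-insensitive</term-option>
      <term-option>unstemmed</term-option>
      <term-option>punctuation-insensitive</term-option>
    </term>
  </options>
let $qtext := '"metal" , "locker"'
(: Replace every character that is neither a word character nor whitespace,
   so no punctuation-only term is left for search:parse to turn into a
   word-query. :)
let $clean := fn:replace($qtext, '[^\w\s]', ' ')
let $parsed := search:parse($clean, $options)
let $query := cts:element-query(xs:QName("data"), cts:query($parsed))
return xdmp:estimate(cts:search(fn:doc(), $query))
```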
-- Mike
On 1 Feb 2012, at 09:21 , Will Thompson wrote:
Abhishek - I recently had a very similar issue with empty searches and
punctuation, and the solution appeared to be adding <empty apply="all-results"
/> to search options. However, after further testing, I am also getting empty
results. For example,
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <term>
      <empty apply="all-results" />
      <term-option>punctuation-insensitive</term-option>
    </term>
    <searchable-expression>//doc</searchable-expression>
  </options>
let $empty :=
  <cts:word-query qtextref="cts:text" xmlns:cts="http://marklogic.com/cts">
    <cts:text>;</cts:text>
    <cts:option>punctuation-insensitive</cts:option>
  </cts:word-query>
return search:resolve($empty, $options)
This returns no results, and the value of @apply does not seem to have any
effect. I think this is probably a bug.
-Will
From: gene...@developer.marklogic.com
[mailto:gene...@developer.marklogic.com] On Behalf Of Abhishek53 S
Sent: Wednesday, February 01, 2012 2:55 AM
To: General MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] element-query with punctuation insensitive
and punctuation marks as cts:text
Hi Geert,
Here is the sample query I used
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
let $parsed-query := search:parse('"metal" , "locker"',
  <options xmlns="http://marklogic.com/appservices/search">
    <search-option>unfiltered</search-option>
    <term>
      <empty apply="all-results" />
      <term-option>case-insensitive</term-option>
      <term-option>unstemmed</term-option>
      <term-option>punctuation-insensitive</term-option>
    </term>
  </options>)
let $query := cts:element-query(xs:QName("data"), cts:query($parsed-query))
return xdmp:estimate(cts:search(fn:doc(), $query))
Thanks Abhishek Srivastav Tata Consultancy Services Cell:- +91-9883389968 Mailto: abhi...@tcs.com Website: http://www.tcs.com
____________________________________________ Experience certainty. IT Services Business Solutions Outsourcing
____________________________________________
From:
Abhishek53 S <abhi...@tcs.com>
To:
General MarkLogic Developer Discussion <gene...@developer.marklogic.com>
Date:
02/01/2012 04:17 PM
Subject:
Re: [MarkLogic Dev General] element-query with punctuation insensitive and
punctuation marks as cts:text
Sent by:
gene...@developer.marklogic.com
Hi Geert,
Thanks for your response. For now I would rather not remove the word-query
containing punctuation marks from the main query (I would do that only as a
last resort). I am using the search:parse function to parse the search term.
I tried your third option but I am still unable to get the expected result
[count without punctuation (,) = count with punctuation (,) as
punctuation-insensitive]. If I recall correctly, this term option controls
whether results are returned when the search term is empty, so I am not sure
how it would help me in this case...
Thanks for your help!
Abhishek Srivastav
From:
Geert Josten <geer...@dayon.nl>
To:
General MarkLogic Developer Discussion <gene...@developer.marklogic.com>
Date:
02/01/2012 03:26 PM
Subject:
Re: [MarkLogic Dev General] element-query with punctuation insensitive and
punctuation marks as cts:text
Sent by:
gene...@developer.marklogic.com
Hi Abhishek,
What is happening here is that you pass ',' as the search term to a word-query
with the 'punctuation-insensitive' option. That option effectively strips the
comma out of the search term, leaving an empty search term. A cts:word-query
with an empty search term returns nothing.
I think you have a few options:
1. Don't tokenize the search string yourself (at least, if that is what you
are doing), and pass in 'metal,' or ', metal' as search term with punctuation
insensitive. That is effectively the same as searching for 'metal'.
2. Strip punctuation yourself before parsing it to <cts:query> element
structure (or post-process the query element structure to filter out
punctuation-only queries)
3. Add <empty apply="all-results" /> to your search options (I'm guessing
you are using search:parse, so to the options you pass in there)
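The post-processing variant of option 2 could be sketched roughly as follows. The function name local:drop-punctuation-only is hypothetical, and treating "punctuation-only" as "matches ^[\p{P}\s]*$" is an assumption; the sketch simply copies a serialized query and leaves out any cts:word-query whose text consists solely of punctuation and/or whitespace.

```xquery
xquery version "1.0-ml";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical helper: recursively copy a serialized cts query, dropping
   word-queries whose cts:text values are all punctuation or whitespace.
   A word-query with no cts:text at all is dropped as well. :)
declare function local:drop-punctuation-only($node as node()) as node()?
{
  typeswitch ($node)
    case element(cts:word-query) return
      if (every $t in $node/cts:text
          satisfies fn:matches($t, '^[\p{P}\s]*$'))
      then ()
      else $node
    case element() return
      element { fn:node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:drop-punctuation-only($child)
      }
    default return $node
};

let $query :=
  <cts:and-query>
    <cts:word-query>
      <cts:text>,</cts:text>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
    <cts:word-query>
      <cts:text>metal</cts:text>
      <cts:option>punctuation-insensitive</cts:option>
    </cts:word-query>
  </cts:and-query>
return local:drop-punctuation-only($query)
```

In this example only the word-query for "metal" remains inside the and-query. One thing to watch: if every child of an and-query is dropped, the result is an empty and-query, which as far as I know matches all documents, so you may want to handle that case explicitly.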
Kind regards, Geert
From: gene...@developer.marklogic.com
[mailto:gene...@developer.marklogic.com] On Behalf Of Abhishek53 S
Sent: Wednesday, February 1, 2012 10:30
To: General MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] element-query with punctuation insensitive
and punctuation marks as cts:text
Hi Folks,
I am not sure whether I am doing something wrong, but I have an issue with
punctuation-insensitive search when a punctuation mark is the cts:text of a
word-query (inside an element-query). When I execute the query below I do not
get any count back, because the punctuation mark is not ignored during the
search (even with punctuation-insensitive). Our application is expected to be
punctuation-insensitive at all times. If I remove the word-query containing the
punctuation mark, the query starts returning a count based on the remaining
search criteria. On the other hand, a word-query with the punctuation-sensitive
option behaves as if it were ignored in the search criteria.
Please let me know how to make this element-query punctuation-insensitive even
when punctuation marks are present in the cts:text node of a word-query.
xdmp:estimate(cts:search(fn:doc(),
  cts:query(
    <cts:element-query>
      <cts:element xmlns="">data</cts:element>
      <cts:and-query>
        <cts:word-query>
          <cts:text xml:lang="en">,</cts:text>
          <cts:option>case-insensitive</cts:option>
          <cts:option>punctuation-insensitive</cts:option>
          <cts:option>unstemmed</cts:option>
        </cts:word-query>
        <cts:word-query>
          <cts:text xml:lang="en">metal</cts:text>
          <cts:option>case-insensitive</cts:option>
          <cts:option>punctuation-insensitive</cts:option>
          <cts:option>unstemmed</cts:option>
        </cts:word-query>
      </cts:and-query>
    </cts:element-query>
  )))
Thanks & Regards,
Abhishek Srivastav
_______________________________________________
General mailing list Gene...@developer.marklogic.com http://developer.marklogic.com/mailman/listinfo/general





