5 messages in com.marklogic.developer.general: Re: [MarkLogic Dev General] Bug in ct...

From             Sent On
Geert Josten     May 12, 2012 8:52 am
Danny Sokolsky   May 12, 2012 10:38 am
Danny Sokolsky   May 12, 2012 3:41 pm
Geert Josten     May 13, 2012 1:31 am
Geert Josten     May 13, 2012 1:32 am
Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)
From: Geert Josten (geer@dayon.nl)
Date: May 13, 2012 1:31:07 am
List: com.marklogic.developer.general

Duh.. It just had to be something that obvious..

Thanks, Danny!

-----Original message-----
From: gene@developer.marklogic.com [mailto:general- boun@developer.marklogic.com] On Behalf Of Danny Sokolsky
Sent: Sunday, May 13, 2012 0:42
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)

I hadn't had enough coffee yet when I made my last comment. The example in the doc is correct; it just passes a start value. Geert, your example uses the "collation=..." string as the start value and picks up whatever the default collation is in your environment (you probably do not have an element word lexicon on the default collation, so it probably throws an exception).
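To make the positional mix-up concrete, here is a minimal sketch of the call variants. It assumes an element word lexicon exists on the text element with the Dutch collation quoted in this thread; the start value "a" is only an illustration.

```xquery
xquery version "1.0-ml";

(: Element name and collation are the ones quoted in this thread. :)
let $text := fn:QName("http://grtjn.nl/twitter/utils", "text")
let $collation := "collation=http://marklogic.com/collation/nl/S1/AS/T00BB"
return (
  (: Two arguments: the string lands in $start, so it is treated as a
     starting word and the default collation applies.
     cts:element-words($text, $collation) :)

  (: Three arguments, empty $start: every word in the lexicon. :)
  cts:element-words($text, (), $collation),

  (: Three arguments with a start value: words from "a" onward. :)
  cts:element-words($text, "a", $collation)
)
```

Passing the empty sequence as the second argument keeps the options in the third position without restricting where the lexicon scan begins.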

-Danny

________________________________________
From: gene@developer.marklogic.com [general- boun@developer.marklogic.com] On Behalf Of Danny Sokolsky [Dann@marklogic.com]
Sent: Saturday, May 12, 2012 10:38 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)

I think your call to element-words is missing the second parameter; $options is the 3rd parameter. So I think it should be:

cts:element-words(fn:QName("http://grtjn.nl/twitter/utils", "text"), (), "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")

It looks like the example in the doc is missing that second arg too--I'll see if I can get that fixed ;)

-Danny

________________________________________
From: gene@developer.marklogic.com [general- boun@developer.marklogic.com] On Behalf Of Geert Josten [geer@dayon.nl]
Sent: Saturday, May 12, 2012 8:52 AM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)

Curious how well Danny's idea would perform, I decided to try it on one of my test databases with a fair number of tweets (roughly 400K last time I checked). I had to change cts:words to cts:element-words since I have no word lexicon. But it breaks for me. Did I hit a bug?

let $map := map:map()
let $all :=
  for $x in cts:element-words(fn:QName("http://grtjn.nl/twitter/utils", "text"), "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")
  return map:put($map, cts:stem($x), $x)
return (
  fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
  fn:concat(fn:count(cts:words()), " unique words in the database "),
  map:keys($map)
)

Note that I specify a collation explicitly, but it seems to be ignored. Can anyone confirm this behavior?

Kind regards, Geert
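With the fix from the replies applied (an empty second argument so that the collation string is read as an option), the query might look like the sketch below. Two further changes are assumptions of this sketch, not something stated in the thread: the word count reuses the element word lexicon instead of cts:words(), since no word lexicon is configured here, and the loop iterates over the result of cts:stem in case it returns more than one stem for a word.

```xquery
xquery version "1.0-ml";

let $text := fn:QName("http://grtjn.nl/twitter/utils", "text")
let $collation := "collation=http://marklogic.com/collation/nl/S1/AS/T00BB"
(: Empty sequence for $start, so the collation string lands in $options. :)
let $words := cts:element-words($text, (), $collation)
let $map := map:map()
let $all :=
  for $x in $words
  (: One map entry per stem; later words overwrite earlier ones. :)
  for $stem in cts:stem($x)
  return map:put($map, $stem, $x)
return (
  fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
  fn:concat(xs:string(fn:count($words)), " unique words in the database "),
  map:keys($map)
)
```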

From: gene@developer.marklogic.com [mailto:general- boun@developer.marklogic.com] On Behalf Of Danny Sokolsky
Sent: Saturday, May 12, 2012 0:13
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Term with same stem

If you have a word lexicon you can do something like this to get information about your words and stems:

let $map := map:map()
let $all :=
  for $x in cts:words()
  return map:put($map, cts:stem($x), $x)
return (
  fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
  fn:concat(fn:count(cts:words()), " unique words in the database "),
  map:keys($map)
)

-Danny
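The query above keeps only the last word seen for each stem, because map:put replaces the stored value. To list every word that shares a stem, which is what the original question asks for, one possible variation appends to the existing value instead. This is a sketch that assumes a word lexicon is configured; the output format is arbitrary.

```xquery
xquery version "1.0-ml";

(: Assumes a word lexicon is configured, as in the query above. :)
let $map := map:map()
let $all :=
  for $word in cts:words()
  (: cts:stem may return more than one stem for a word. :)
  for $stem in cts:stem($word)
  (: Append to whatever is already stored under this stem. :)
  return map:put($map, $stem, (map:get($map, $stem), $word))
for $stem in map:keys($map)
order by $stem
return fn:concat($stem, ": ", fn:string-join(map:get($map, $stem), ", "))
```

On a large lexicon the map holds every word in memory, so this is better suited to exploration than to production use.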

From: gene@developer.marklogic.com [mailto:general- boun@developer.marklogic.com] On Behalf Of Michael Blakeley
Sent: Friday, May 11, 2012 2:02 PM
To: MarkLogic Developer Discussion
Cc: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Term with same stem

If stemming=advanced, I think cts:stem will do that. With basic stemming, the best you can do is to pass terms to cts:stem and see whether they have the same stem.

-- Mike
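The comparison approach could be sketched as follows. The helper local:same-stem and the sample terms are hypothetical, chosen only for illustration; the sketch relies on cts:stem using the default language when none is given.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: true if the two terms share at least one stem. :)
declare function local:same-stem($a as xs:string, $b as xs:string) as xs:boolean
{
  (: General comparison "=" is true if any pair of values matches. :)
  cts:stem($a) = cts:stem($b)
};

local:same-stem("running", "runs")
```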

On May 11, 2012, at 13:39, Abhishek53 S <abhi@tcs.com> wrote:

Hi Folks,

Is it possible to get all the terms that have the same stem from a MarkLogic database? I want to get all terms that belong to the same stem.

Thanks & Regards
Abhishek Srivastav
Systems Engineer
Tata Consultancy Services
Cell:- +91-9883389968
Mailto: abhi@tcs.com
Website: http://www.tcs.com


http://developer.marklogic.com/mailman/listinfo/general