| From | Sent On | Attachments |
|---|---|---|
| Geert Josten | May 12, 2012 8:52 am | |
| Danny Sokolsky | May 12, 2012 10:38 am | |
| Danny Sokolsky | May 12, 2012 3:41 pm | |
| Geert Josten | May 13, 2012 1:31 am | |
| Geert Josten | May 13, 2012 1:32 am | |
| Subject: | Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem) | |
|---|---|---|
| From: | Geert Josten (geer...@dayon.nl) | |
| Date: | May 13, 2012 1:31:07 am | |
| List: | com.marklogic.developer.general | |
Duh.. It just had to be something that obvious..
Thnx Danny!
-----Original Message----- From: gene...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Danny Sokolsky Sent: Sunday, May 13, 2012 0:42 To: MarkLogic Developer Discussion Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)
I hadn't had enough coffee yet when I made my last comment. The example in the doc is correct; it just puts a start value in. Geert, your example would use the "collation=..." string as the start value and would pick up whatever the default collation is in your environment (you probably do not have an element word lexicon on the default collation, so it probably throws an exception).
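A minimal sketch of the difference, reusing the element name and collation from the query in this thread. It assumes an element word lexicon exists on that element with that collation; the variable names are only for illustration.

```xquery
xquery version "1.0-ml";

(: Element name and collation are taken from the query earlier in this thread. :)
let $text := fn:QName("http://grtjn.nl/twitter/utils", "text")
let $options := "collation=http://marklogic.com/collation/nl/S1/AS/T00BB"
return
  (: With two arguments the second one is $start, so the options string
     would be read as a start word:
       cts:element-words($text, $options)
     With three arguments, an empty $start comes first and the options follow. :)
  cts:element-words($text, (), $options)
```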
-Danny
________________________________________ From: gene...@developer.marklogic.com [general- boun...@developer.marklogic.com] On Behalf Of Danny Sokolsky [Dann...@marklogic.com] Sent: Saturday, May 12, 2012 10:38 AM To: MarkLogic Developer Discussion Subject: Re: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)
I think your call to element-words is missing the second parameter; $options is the 3rd parameter. So I think it should be:
    cts:element-words(
      fn:QName("http://grtjn.nl/twitter/utils", "text"),
      (),
      "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")
It looks like the example in the doc is missing that second arg too--I'll see if I can get that fixed ;)
-Danny
________________________________________ From: gene...@developer.marklogic.com [general- boun...@developer.marklogic.com] On Behalf Of Geert Josten [geer...@dayon.nl] Sent: Saturday, May 12, 2012 8:52 AM To: MarkLogic Developer Discussion Subject: [MarkLogic Dev General] Bug in cts:element-words? (was: Term with same stem)
Curious how well Danny's idea would perform, I applied it to one of my test databases with a fair number of tweets (roughly 400K last time I checked). I had to rewrite cts:words to cts:element-words since I have no word lexicon. But it breaks for me. Did I hit a bug?

    let $map := map:map()
    let $all :=
      for $x in cts:element-words(
        fn:QName("http://grtjn.nl/twitter/utils", "text"),
        "collation=http://marklogic.com/collation/nl/S1/AS/T00BB")
      return map:put($map, cts:stem($x), $x)
    return (
      fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
      fn:concat(fn:count(cts:words()), " unique words in the database "),
      map:keys($map)
    )
Note that I specify a specific collation, but that seems to get ignored. Can anyone confirm this behavior?
Kind regards, Geert
From: gene...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Danny Sokolsky Sent: Saturday, May 12, 2012 0:13 To: MarkLogic Developer Discussion Subject: Re: [MarkLogic Dev General] Term with same stem
If you have a word lexicon you can do something like this to get information about your words and stems:
    let $map := map:map()
    let $all :=
      for $x in cts:words()
      return map:put($map, cts:stem($x), $x)
    return (
      fn:concat(xs:string(fn:count(map:keys($map))), " unique stems in the database"),
      fn:concat(fn:count(cts:words()), " unique words in the database "),
      map:keys($map)
    )
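Note that map:put replaces any value already stored under a key, so the map above keeps only one word per stem. To list every word that shares a stem, the words can be accumulated instead. This is a rough sketch along the same lines, assuming a word lexicon is enabled on the database; it has not been run against real data.

```xquery
xquery version "1.0-ml";

let $map := map:map()
let $_ := (
  for $word in cts:words()
  (: cts:stem can return more than one stem for a single word :)
  for $stem in cts:stem($word)
  (: append the word to whatever is already stored under this stem :)
  return map:put($map, $stem, (map:get($map, $stem), $word))
)
for $stem in map:keys($map)
return fn:concat($stem, ": ", fn:string-join(map:get($map, $stem), ", "))
```

Each result line is a stem followed by the lexicon words that stem to it.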
-Danny
From: gene...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Michael Blakeley Sent: Friday, May 11, 2012 2:02 PM To: MarkLogic Developer Discussion Cc: MarkLogic Developer Discussion Subject: Re: [MarkLogic Dev General] Term with same stem
If stemming=advanced I think cts:stem will do that. With basic the best you can do is to pass terms to cts:stem and see if they have the same stem. -- Mike
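The pairwise check could look something like the sketch below. The two sample words and the "en" language argument are placeholders chosen for illustration, not values from this thread.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample terms; substitute words from your own content. :)
let $a := "running"
let $b := "runs"
(: cts:stem returns a sequence of stems, and the general comparison "="
   is true when any stem of $a equals any stem of $b. :)
return cts:stem($a, "en") = cts:stem($b, "en")
```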
On May 11, 2012, at 13:39, Abhishek53 S <abhi...@tcs.com> wrote: Hi Folks,
Is it possible to get all the terms that have the same stem from a MarkLogic database? I want to get all terms that belong to the same stem.
Thanks & Regards Abhishek Srivastav Systems Engineer Tata Consultancy Services Cell:- +91-9883389968 Mailto: abhi...@tcs.com Website: http://www.tcs.com
_______________________________________________ General mailing list Gene...@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general





