What a pity: I only noticed this challenge recently, much too close
to the end of this month.
Now I hope that the acceptance tests are hard enough to get the
A problem that definitely should be solved is the proper distinction
between keywords and identifiers, in the absence of reserved words. Sorry
to say, and I may be missing something, but I have not seen a solution
here addressing this satisfactorily.
Is it necessary to point out that the distinction depends heavily on
a token's context? Repeatedly applying regular expressions without
proper context detection will not be sufficient, and enumerating a
few special cases does not provide a complete solution either. In
fact, I believe the best way to tokenize XQuery is to feed
information from a full-fledged parser back into the lexer.
Of course the environment must provide appropriate interfaces for
integrating that. From a quick check, CodeMirror looks like it
would be my first choice.
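
To illustrate the context dependence described above, here is a minimal
sketch in Python. It is not a real XQuery lexer and does not use any
CodeMirror interface. The function name, the token kinds and the two
keyword sets are hypothetical and deliberately incomplete. A single
"operand expected / operator expected" flag stands in for the
information a full parser would feed back to the lexer.

```python
import re

# Simplified token shapes: variables, names, numbers, a little punctuation.
TOKEN_RE = re.compile(r"\s*(\$[A-Za-z_][\w-]*|[A-Za-z_][\w-]*|\d+|:=|[()/,])")

# Hypothetical, incomplete keyword sets (assumptions for this sketch).
CLAUSE_STARTERS = {"for", "let", "some", "every"}  # keyword only before a variable
OPERATOR_KEYWORDS = {"in", "return", "where", "and", "or", "div", "mod", "satisfies"}


def tokenize(src):
    """Classify tokens using the position the (simulated) parser is in."""
    tokens = []
    pos = 0
    expect_operand = True  # stand-in for real parser feedback
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            if src[pos:].strip() == "":
                break  # only trailing whitespace is left
            raise ValueError(f"unexpected input at offset {pos}")
        text = m.group(1)
        pos = m.end()
        if text.startswith("$"):
            kind, expect_operand = "variable", False
        elif text[0].isdigit():
            kind, expect_operand = "number", False
        elif text[0].isalpha() or text[0] == "_":
            if expect_operand:
                starts_clause = (text in CLAUSE_STARTERS
                                 and src[pos:].lstrip().startswith("$"))
                if starts_clause:
                    kind = "keyword"  # a variable must follow, still an operand slot
                else:
                    kind, expect_operand = "name", False  # plain element name
            elif text in OPERATOR_KEYWORDS:
                kind, expect_operand = "keyword", True
            else:
                raise ValueError(f"unexpected name {text!r} at offset {m.start(1)}")
        elif text == ")":
            kind, expect_operand = "punct", False
        else:  # "(", "/", "," and ":=" are all followed by an operand
            kind, expect_operand = "punct", True
        tokens.append((kind, text))
    return tokens


for kind, text in tokenize("for $for in for/return return $for div div"):
    print(f"{kind:8} {text}")
```

In this input the first "for", the second "return" and the first "div"
come out as keywords, while the other occurrences of the same words
come out as element names. A context-free regular expression cannot
make that distinction. A real solution would replace the single flag
with the state of an actual XQuery parser.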
So if there is any chance to extend the timeframe of this challenge,
I would gladly attempt a CodeMirror approach.