Thread: Re: [sqwebmail] patches: No flowed fo... (8 messages in net.sourceforge.lists.courier-sqwebmail)

From                          Sent on                   Attachments
Hatuka*nezumi - IKEDA Soji    Apr 24, 2006 3:22 am      .patch, .patch
Sam Varshavchik               May 2, 2006 7:18 pm
Hatuka*nezumi - IKEDA Soji    May 5, 2006 11:45 pm
Sam Varshavchik               May 6, 2006 7:44 am
Hatuka*nezumi - IKEDA Soji    May 7, 2006 11:34 pm
Sam Varshavchik               May 8, 2006 4:00 am
Hatuka*nezumi - IKEDA Soji    May 8, 2006 9:07 pm
Sam Varshavchik               May 9, 2006 3:52 am
Subject: Re: [sqwebmail] patches: No flowed format etc.
From: Sam Varshavchik (mrs@courier-mta.com)
Date: May 8, 2006 4:00:13 am
List: net.sourceforge.lists.courier-sqwebmail

Hatuka*nezumi - IKEDA Soji writes:

On Sat, 06 May 2006 10:44:28 -0400 Sam Varshavchik <mrs@courier-mta.com> wrote:

I think that going with RFC 3676 will solve this, using a two-pass approach. On the first pass, generate linebreaks with DelSp=no; then, if every flowed line ends with a space, remove those spaces and set DelSp=yes.

(*) I'm planning to implement the Unicode line breaking algorithm defined by UAX#14 in SqWebMail, but it will take some weeks at the very least.

Why don't you just write up a small, standalone library that implements UAX#14, using a very simple API: given a list of Unicode characters, return a list of potential linebreaking characters, and a flag indicating whether each linebreaking character should be removed?

Lately, I'm partial to using callbacks, in lieu of heap allocations. Something like:

int uc_findPotentialLineBreaks(wchar_t *wc,
                               int (*cb)(size_t index, bool_t removeFlag, void *voidptr),
                               void *voidptr);

You'd find all the linebreaks in uc_findPotentialLineBreaks(), and repeatedly invoke cb(), passing the index of each valid linebreaking character, and the removal flag. voidptr is a transparent pass-through pointer.

Thanks for the good suggestion. I'll try the following prototype. Please tell me if anything in it is inappropriate:

enum unicode_lbtype unicode_linebreak(
    const unicode_char *text, size_t textlen,
    void (*writeout_cb)(const unicode_char *line, size_t linelen,
                        enum unicode_lbtype lbtype,
                        struct unicode_lbinfo *lbinfo),
    struct unicode_lbinfo *lbinfo);

I don't think that's the right approach. First of all, you may have just a partial text fragment, and all you want to know is where the potential line breaks are.

Unless I'm misreading UAX#14, given an arbitrary text fragment, you should be able to determine where all the valid linebreaks are, and whether the linebreaking character should be removed, or not.

So that's your first step. Just identify where all the potential line breaks are, verbatim as per UAX #14.
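
To make the shape of that first step concrete, here is a rough sketch in C of a callback-based scanner. It is only an illustration under loud assumptions: it does NOT implement UAX #14. As a stand-in rule it reports every space as a removable break character and every hyphen as a break character that is kept. The name find_potential_linebreaks, the explicit length argument (so that a partial fragment can be passed), the use of C99 bool in place of bool_t, and the break_stats helper are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

/* Callback: index of the linebreaking character, whether that character
 * should be removed if the line is broken there, and the pass-through
 * pointer. A nonzero return value stops the scan. */
typedef int (*linebreak_cb)(size_t index, bool remove_flag, void *voidptr);

/* Hypothetical stand-in for the first function. NOT UAX #14: spaces are
 * reported as removable break characters, hyphens as kept ones.
 * Returns 0, or the first nonzero value returned by cb. */
int find_potential_linebreaks(const wchar_t *wc, size_t len,
                              linebreak_cb cb, void *voidptr)
{
    assert(wc != NULL && cb != NULL);

    for (size_t i = 0; i < len; i++) {
        int rc = 0;

        if (wc[i] == L' ')
            rc = cb(i, true, voidptr);   /* break here, drop the space */
        else if (wc[i] == L'-')
            rc = cb(i, false, voidptr);  /* break after, keep the hyphen */

        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example consumer: gathers statistics without any heap allocation. */
struct break_stats {
    size_t total;       /* number of potential breaks seen */
    size_t removable;   /* how many had the removal flag set */
    size_t last_index;  /* index of the most recent break */
};

int count_breaks(size_t index, bool remove_flag, void *voidptr)
{
    struct break_stats *s = voidptr;

    s->total++;
    if (remove_flag)
        s->removable++;
    s->last_index = index;
    return 0;
}
```

A caller would zero a struct break_stats, pass its address as voidptr together with count_breaks, and read the counters afterwards. Nothing is allocated on the heap, and the scanner never needs to know what the caller does with each index.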

This can be used for purposes other than computing actual line breaks. Since potential linebreaks are, essentially, word terminators, this function can also be used to do other things, such as returning the individual words from a text fragment.

The second step is to take the list of valid linebreak locations computed by the first function, together with the desired line length, and, using wcwidth(), compute the optimum locations of the actual line breaks.

There are two separate functions here: 1) Compute the list of all possible line break positions, and 2) Actually pick the right line break positions, out of the potential linebreaks computed by the first function, given the desired line length.
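
The second function could be sketched roughly as follows, again in C and again under assumptions of my own. It takes the potential breaks as a sorted array (as the first function's callback might have collected them) and picks breaks greedily, which is the simplest reading of "optimum" and not necessarily the intended one. The names choose_linebreaks, potential_break, line_log, record_line and one_column are hypothetical. The column width is obtained through a function pointer so that a caller can pass wcwidth() after calling setlocale(); one_column is only a placeholder that counts every character as one column.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

struct potential_break { size_t index; bool remove_flag; };

typedef int (*width_fn)(wchar_t c);   /* e.g. wcwidth() */
typedef int (*line_cb)(size_t start, size_t len, void *voidptr);

/* Placeholder width function: one column per character. */
int one_column(wchar_t c) { (void)c; return 1; }

/* Greedy selection: each line takes the last potential break that still
 * fits in max_width columns. brk[] must be sorted by index. A line with
 * no usable break is emitted overlong. Returns 0, or the first nonzero
 * value returned by cb. */
int choose_linebreaks(const wchar_t *text, size_t textlen,
                      const struct potential_break *brk, size_t nbrk,
                      int max_width, width_fn width_of,
                      line_cb cb, void *voidptr)
{
    size_t start = 0, b = 0;

    assert(text != NULL && width_of != NULL && cb != NULL && max_width > 0);

    while (start < textlen) {
        size_t best_end = 0, best_resume = 0, i;
        bool have = false;
        int width = 0, rc;

        for (i = start; i < textlen; i++) {
            int w = width_of(text[i]);
            if (w < 0)
                w = 0;                    /* non-printable: no columns */
            while (b < nbrk && brk[b].index < i)
                b++;                      /* skip breaks already passed */
            if (b < nbrk && brk[b].index == i) {
                bool rm = brk[b].remove_flag;
                /* A removed break character does not occupy a column. */
                int line_width = rm ? width : width + w;
                if (line_width <= max_width || !have) {
                    best_end = rm ? i : i + 1;
                    best_resume = i + 1;
                    have = true;
                }
                if (line_width > max_width)
                    break;
            }
            width += w;
            if (width > max_width && have)
                break;
        }
        if (i == textlen) {               /* ran out of text: emit the rest */
            best_end = textlen;
            best_resume = textlen;
        }
        rc = cb(start, best_end - start, voidptr);
        if (rc != 0)
            return rc;
        start = best_resume;
    }
    return 0;
}

/* Example consumer: records up to 8 lines as (start, length) pairs. */
struct line_log { size_t n; size_t start[8]; size_t len[8]; };

int record_line(size_t start, size_t len, void *voidptr)
{
    struct line_log *out = voidptr;

    if (out->n >= 8)
        return -1;                        /* log full: stop */
    out->start[out->n] = start;
    out->len[out->n] = len;
    out->n++;
    return 0;
}
```

Keeping the two functions apart means the scanner never needs to know the line length, and this function never needs to know any Unicode rules. It only sees indexes, removal flags and column widths.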