Subject: Re: [sqwebmail] patches: No flowed format etc.
From: Hatuka*nezumi - IKEDA Soji (hat@nezumi.nu)
Date: May 7, 2006 11:34:33 pm
List: net.sourceforge.lists.courier-sqwebmail

On Sat, 06 May 2006 10:44:28 -0400 Sam Varshavchik <mrs@courier-mta.com> wrote:

I think that going with RFC 3676 will solve this, using a two-pass approach. On the first pass, generate line breaks with DelSp=no; then, if every flowed line ends with a space, remove those spaces and set DelSp=yes.

(*) I'm planning to implement the Unicode line breaking algorithm defined by UAX#14 in SqWebMail, but it will need at least a few weeks.

Why don't you just write up a small, standalone library that implements UAX#14, using a very simple API: given a list of unicode characters, return a list of potential linebreaking characters, and a flag indicating if the linebreaking character should be removed.

Lately, I'm partial to using callbacks, in lieu of heap allocations. Something like:

int uc_findPotentialLineBreaks(wchar_t *wc, int (*cb)(size_t index, bool_t removeFlag, void *voidptr), void *voidptr);

You'd find all the linebreaks in uc_findPotentialLineBreaks(), and repeatedly invoke cb(), passing the index of each valid linebreaking character and the removal flag. voidptr is a transparent pass-through pointer.
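A minimal sketch of the callback style described above, under loud assumptions: find_breaks_sketch, break_cb, break_count and count_breaks are hypothetical names, int stands in for bool_t, and the break rule (a space is a removable break, a hyphen is a break that is kept) is only a placeholder, not UAX#14.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical callback type: index of the break opportunity, whether the
   character at that index should be removed, and the pass-through pointer. */
typedef int (*break_cb)(size_t index, int remove_flag, void *voidptr);

/* Placeholder rule, NOT UAX#14: a space is a removable break opportunity,
   a hyphen is a break opportunity that stays in the text.
   Stops early and returns the callback's value if it is nonzero. */
int find_breaks_sketch(const wchar_t *wc, break_cb cb, void *voidptr)
{
    size_t i;

    for (i = 0; wc[i] != L'\0'; i++) {
        int rc = 0;

        if (wc[i] == L' ')
            rc = cb(i, 1, voidptr);
        else if (wc[i] == L'-')
            rc = cb(i, 0, voidptr);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example caller state: the results are accumulated through voidptr,
   so the library side never allocates a result list on the heap. */
struct break_count {
    size_t total;
    size_t removable;
};

int count_breaks(size_t index, int remove_flag, void *voidptr)
{
    struct break_count *c = voidptr;

    (void)index;            /* the index is not needed for counting */
    c->total++;
    if (remove_flag)
        c->removable++;
    return 0;               /* zero means "keep going" in this sketch */
}
```

A caller would declare a struct break_count, pass its address as voidptr together with count_breaks, and read the totals afterwards; a caller that wants the actual positions can record index in its own buffer instead.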

Thanks for the good suggestion. I'll try the following prototype. Please tell me if anything in it is inappropriate ---

    enum unicode_lbtype
    unicode_linebreak(const unicode_char *text, size_t textlen,
                      void (*writeout_cb)(const unicode_char *line,
                                          size_t linelen,
                                          enum unicode_lbtype lbtype,
                                          struct unicode_lbinfo *lbinfo),
                      struct unicode_lbinfo *lbinfo);

text: Buffer containing Unicode text.
textlen: Length of text.
writeout_cb: Callback function to write out one broken line.
    line: Buffer containing one broken line.
    linelen: Length of the broken line.
    lbtype: Type of line break.
        UNICODE_LBTYPE_MANDATORY: "Mandatory" Break. line is terminated by an explicit newline character.
        UNICODE_LBTYPE_INDIRECT: "Indirect" Break. line is terminated by one or more spaces.
        UNICODE_LBTYPE_DIRECT: "Direct" Break. line is broken between non-space characters, according to the line breaking rules.
        UNICODE_LBTYPE_EOT: End of Text. line is the last line of text.
        * "Mandatory", "Direct" and "Indirect" are terms as described in UAX#14.
lbinfo: Working buffer and options to tailor line breaking behaviour.

Returns:
    UNICODE_LBTYPE_NOMOD: text hasn't been broken at all, or only Mandatory Break(s) occur.
    UNICODE_LBTYPE_INDIRECT: Indirect Break(s) occur but no Direct Breaks.
    UNICODE_LBTYPE_DIRECT: Direct Break(s) occur.

With the two-pass approach for RFC 3676 (yes, 3, 6, 7, 6), the code will look like:

    switch (unicode_linebreak(text, textlen, NO_OP_callback, lbinfo)) {
    case UNICODE_LBTYPE_NOMOD:
        write_to_header("Content-Type: text/plain");
        unicode_linebreak(text, textlen, FIXED_callback, lbinfo);
        break;
    case UNICODE_LBTYPE_INDIRECT:
        write_to_header("Content-Type: text/plain; format=flowed");
        unicode_linebreak(text, textlen, FLOWED_callback, lbinfo);
        break;
    case UNICODE_LBTYPE_DIRECT:
        write_to_header("Content-Type: text/plain; format=flowed; delsp=yes");
        unicode_linebreak(text, textlen, FLOWED_DELSP_callback, lbinfo);
        break;
    }
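To make the first pass concrete, here is a rough, self-contained sketch of how such a function might classify a text. Everything beyond the proposed prototype is an assumption: unicode_linebreak_sketch, writeout_fn, count_lines_callback, content_type_for, and the max_width and lines fields of struct unicode_lbinfo are hypothetical, and the break logic only looks at newlines and ASCII spaces, so it is a stand-in for UAX#14 rather than an implementation of it.

```c
#include <stddef.h>

typedef unsigned int unicode_char;  /* assumption: one code point per element */

enum unicode_lbtype {
    UNICODE_LBTYPE_NOMOD,
    UNICODE_LBTYPE_MANDATORY,
    UNICODE_LBTYPE_INDIRECT,
    UNICODE_LBTYPE_DIRECT,
    UNICODE_LBTYPE_EOT
};

struct unicode_lbinfo {
    size_t max_width;   /* hypothetical option: maximum line length */
    size_t lines;       /* hypothetical working field: lines written so far */
};

typedef void (*writeout_fn)(const unicode_char *line, size_t linelen,
                            enum unicode_lbtype lbtype,
                            struct unicode_lbinfo *lbinfo);

/* Whitespace-only stand-in for UAX#14. Returns the "strongest" kind of
   break that was needed: NOMOD, then INDIRECT, then DIRECT. */
enum unicode_lbtype unicode_linebreak_sketch(const unicode_char *text,
                                             size_t textlen,
                                             writeout_fn writeout_cb,
                                             struct unicode_lbinfo *lbinfo)
{
    enum unicode_lbtype result = UNICODE_LBTYPE_NOMOD;
    size_t width = lbinfo->max_width ? lbinfo->max_width : 1;
    size_t start = 0;

    while (start < textlen) {
        size_t remaining = textlen - start;
        size_t window = remaining < width ? remaining : width;
        size_t linelen = 0, last_space = 0, i;
        enum unicode_lbtype type;

        for (i = 0; i < window; i++) {
            if (text[start + i] == '\n') {      /* explicit newline */
                linelen = i + 1;
                break;
            }
            if (text[start + i] == ' ')
                last_space = i + 1;             /* break after this space */
        }
        if (linelen > 0) {
            type = UNICODE_LBTYPE_MANDATORY;
        } else if (remaining <= width) {
            linelen = remaining;                /* the rest fits on one line */
            type = UNICODE_LBTYPE_EOT;
        } else if (last_space > 0) {
            linelen = last_space;
            type = UNICODE_LBTYPE_INDIRECT;
            if (result == UNICODE_LBTYPE_NOMOD)
                result = type;
        } else {
            linelen = window;                   /* no space: break mid-run */
            type = UNICODE_LBTYPE_DIRECT;
            result = type;
        }
        writeout_cb(text + start, linelen, type, lbinfo);
        start += linelen;
    }
    return result;
}

/* Callback that only counts lines, e.g. for a first pass. */
void count_lines_callback(const unicode_char *line, size_t linelen,
                          enum unicode_lbtype lbtype,
                          struct unicode_lbinfo *lbinfo)
{
    (void)line; (void)linelen; (void)lbtype;
    lbinfo->lines++;
}

/* Maps the first-pass result to the Content-Type value used in the switch. */
const char *content_type_for(enum unicode_lbtype result)
{
    switch (result) {
    case UNICODE_LBTYPE_INDIRECT:
        return "text/plain; format=flowed";
    case UNICODE_LBTYPE_DIRECT:
        return "text/plain; format=flowed; delsp=yes";
    default:
        return "text/plain";
    }
}
```

With max_width set to 5, a short text that fits on one line yields UNICODE_LBTYPE_NOMOD, a text that can be broken after a space yields UNICODE_LBTYPE_INDIRECT, and a run of eight non-space characters forces UNICODE_LBTYPE_DIRECT, which is the case that needs delsp=yes.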

--- nezumi