Hatuka*nezumi - IKEDA Soji writes:
But: modern browsers that default to UTF-8 will send composed text as
UTF-8, and UTF-8 should use flowed format.
So, after thinking about it, I believe that when typing in Asian
characters you must explicitly hit enter at the end of every line, which
inserts a hard linebreak.
This is just what users of Asian characters do. Some prefer to insert hard
linebreaks manually, but most rely on the auto-wrap feature of their MUA, which
inserts hard linebreaks automatically so that every line wraps at the same
column. The "wbnoflowed=2" option emulates the latter behaviour (*).
I think that going with RFC 3676 will solve this, using a two-pass approach.
On the first pass, generate linebreaks with DelSp=no; then, if every flowed
line ends with a space, remove those spaces and set DelSp=yes.
In some scripts (most Asian ones, among others), no word separator such as
SPACE is used. A linebreak may occur at almost any character boundary, with a
few restrictions. So RFC 2646 was a poor fit for text in those scripts.
RFC 3767 extends the flowed format to handle such situations.
3676. I made the same typo.
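
For readers following along, here is a rough sketch of how a receiver might
join the lines of one format=flowed paragraph with and without DelSp. The
function name flowed_join and its signature are made up for illustration; it
is not code from SqWebMail. It also ignores quoting, space-stuffing and the
"-- " signature separator, all of which a real RFC 3676 implementation has to
handle.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: join n flowed lines (no CRLF, NUL-terminated) into out.
 * A line ending in a space is a soft break; any other line is a hard break.
 * delsp != 0 means the message was labelled DelSp=yes.
 * Returns 0 on success, -1 if out (capacity outsz) is too small.
 */
int flowed_join(const char *const *lines, size_t n, int delsp,
                char *out, size_t outsz)
{
    size_t used = 0;

    if (outsz == 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(lines[i]);
        int soft = len > 0 && lines[i][len - 1] == ' ';

        /* DelSp=yes: the trailing space only marks the soft break, drop it.
         * DelSp=no: the trailing space is part of the text, keep it. */
        if (soft && delsp)
            len--;

        /* Leave room for the text, a possible '\n', and the final NUL. */
        if (used + len + 2 > outsz)
            return -1;

        memcpy(out + used, lines[i], len);
        used += len;

        /* A hard break ends the paragraph line. */
        if (!soft)
            out[used++] = '\n';
    }
    out[used] = '\0';
    return 0;
}
```

With the lines "abc " and "def", DelSp=no gives "abc def" and DelSp=yes gives
"abcdef". The second result is what text in scripts without word separators
needs, since no space should appear at the rejoined boundary.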
(*) I'm planning to implement the Unicode line breaking algorithm defined by
UAX#14 in SqWebMail, but that will take a few weeks at the very least.
Why don't you just write up a small, standalone library that implements
UAX#14, using a very simple API: given a list of Unicode characters, return
a list of potential linebreaking characters, each with a flag indicating
whether that linebreaking character should be removed?
Lately, I'm partial to using callbacks, in lieu of heap allocations.
Something like:
    int uc_findPotentialLineBreaks(wchar_t *wc,
            int (*cb)(size_t index, bool_t removeFlag, void *voidptr),
            void *voidptr);
You'd find all the linebreaks in uc_findPotentialLineBreaks(), and repeatedly
invoke cb(), passing the index of each valid linebreaking character and its
removal flag. voidptr is a transparent pass-through pointer.
Although sqwebmail does not use wchar_t's internally, this is the right
design for an app-agnostic library.
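
To make the callback shape concrete, here is a minimal sketch. It does NOT
implement UAX#14: the break rule is a crude stand-in (a SPACE is a removable
break, and a kana or CJK ideograph may be followed by a non-removable break),
and it ignores all the prohibition rules. The names sketch_find_breaks,
break_log and log_break are invented for this example, bool from <stdbool.h>
stands in for bool_t, and the input is assumed to be NUL-terminated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <wchar.h>

typedef int (*break_cb)(size_t index, bool removeFlag, void *voidptr);

/* Crude placeholder test, not a UAX#14 line breaking class lookup. */
static bool is_cjk(wchar_t c)
{
    return (c >= 0x3040 && c <= 0x30FF) ||   /* hiragana, katakana */
           (c >= 0x4E00 && c <= 0x9FFF);     /* CJK unified ideographs */
}

/*
 * Hypothetical stand-in for uc_findPotentialLineBreaks().
 * Invokes cb() once per potential break; stops early and returns cb's
 * value if cb returns nonzero. No heap allocation is needed.
 */
int sketch_find_breaks(const wchar_t *wc, break_cb cb, void *voidptr)
{
    for (size_t i = 0; wc[i] != L'\0'; i++) {
        int rc = 0;

        if (wc[i] == L' ') {
            /* The space itself is the break character and can be removed. */
            rc = cb(i, true, voidptr);
        } else if (is_cjk(wc[i]) && wc[i + 1] != L'\0' && wc[i + 1] != L' ') {
            /* Break after this character; nothing to remove. */
            rc = cb(i, false, voidptr);
        }
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example caller state, carried through the pass-through pointer. */
struct break_log {
    size_t count;      /* number of potential breaks seen */
    size_t removable;  /* how many of them had removeFlag set */
    size_t last;       /* index of the most recent break */
};

int log_break(size_t index, bool removeFlag, void *voidptr)
{
    struct break_log *log = voidptr;

    log->count++;
    if (removeFlag)
        log->removable++;
    log->last = index;
    return 0;   /* keep going */
}
```

A caller would pass log_break and a pointer to a zeroed struct break_log.
The caller decides what to keep (here just counters), which is the point of
using a callback rather than returning an allocated list.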