On Mon, 08 May 2006 07:00:36 -0400
Sam Varshavchik <mrs...@courier-mta.com> wrote:
Hatuka*nezumi - IKEDA Soji writes:
enum unicode_lbtype
unicode_linebreak(const unicode_char *text,
                  size_t textlen,
                  void (*writeout_cb)(const unicode_char *line,
                                      size_t linelen,
                                      enum unicode_lbtype lbtype,
                                      struct unicode_lbinfo *lbinfo),
                  struct unicode_lbinfo *lbinfo);
I don't think that's the right approach. First of all, you may have
just a partial text fragment, and all you want to know is where the
potential line breaks are.
Unless I'm misreading UAX#14, given an arbitrary text fragment, you should
be able to determine where all the valid linebreaks are, and whether the
linebreaking character should be removed, or not.
My last post described the later steps. The linebreaking feature is
wrapped into unicode_linebreak(), which carries out a two-step
algorithm. On each call, the callback function receives a buffer
holding one line whose breaking points have already been resolved.
Sorry for the insufficient description.
There are two separate functions here: 1) Compute the list of all possible
line break positions, and 2) Actually pick the right line break positions,
out of the potential linebreaks computed by the first function, given the
desired line length.
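
A minimal sketch of that two-function split, under loud assumptions: the
names find_breaks(), pick_break() and enum lb_action are hypothetical and
are not part of any existing library, every character is counted as one
column, and the break rule in step 1 is a placeholder (break after a space,
forced break after a newline), not the UAX#14 pair table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t unicode_char;  /* assumption: one code point per element */

/* Hypothetical result of step 1, much coarser than the UAX#14 classes. */
enum lb_action {
    LB_NONE = 0,    /* no break allowed after this character */
    LB_ALLOWED,     /* a break may follow this character */
    LB_MANDATORY    /* a break must follow this character */
};

/* Step 1: record, for every character, whether a break may follow it.
 * Placeholder rule only; a real version would apply UAX#14 here. */
static void find_breaks(const unicode_char *text, size_t len,
                        enum lb_action *out)
{
    for (size_t i = 0; i < len; i++) {
        if (text[i] == '\n')
            out[i] = LB_MANDATORY;
        else if (text[i] == ' ')
            out[i] = LB_ALLOWED;
        else
            out[i] = LB_NONE;
    }
}

/* Step 2: given the output of step 1, return how many characters go on
 * the next line so that it does not exceed `width` columns.  Greedy: the
 * last opportunity that still fits wins.  If nothing fits, the line is
 * split at `width` as an emergency break. */
static size_t pick_break(const enum lb_action *breaks, size_t len,
                         size_t width)
{
    size_t best = 0;

    assert(width > 0);
    for (size_t i = 0; i < len && i < width; i++) {
        if (breaks[i] == LB_MANDATORY)
            return i + 1;           /* forced break ends the line here */
        if (breaks[i] == LB_ALLOWED)
            best = i + 1;           /* remember the latest opportunity */
    }
    if (len <= width)
        return len;                 /* the remaining text fits as-is */
    return best ? best : width;
}
```

A caller would run find_breaks() once over the fragment, then call
pick_break() repeatedly, advancing both pointers by the returned count.
Step 1 never needs to know the line length, so it also works on a partial
text fragment.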
I see. Anyhow I'll try it.
BTW: I hesitate to use wcwidth() for deciding character width. Some
implementations return -1 for characters outside the locale's character
set (even though modern Unicode fonts render some of those characters
as wide). Some others are simply broken (e.g. they always return 1 for
any printable character). So I'm also planning to implement UAX#11,
or something else suitable.
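
A rough sketch of what a locale-independent width lookup could look like,
assuming a small hand-written range table. The names wide_ranges,
char_columns() and text_columns() are hypothetical, and the table is
deliberately incomplete: it lists only a few well-known Wide/Fullwidth
ranges, ignores the Ambiguous class, and does not treat combining or
control characters specially. A real table would be generated from
EastAsianWidth.txt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t unicode_char;  /* assumption: one code point per element */

struct width_range {
    unicode_char first;
    unicode_char last;
};

/* Sorted, non-overlapping sample of Wide (W) and Fullwidth (F) ranges.
 * Incomplete on purpose; for illustration only. */
static const struct width_range wide_ranges[] = {
    { 0x1100, 0x115F },  /* Hangul Jamo initial consonants */
    { 0x3041, 0x3096 },  /* Hiragana letters */
    { 0x30A1, 0x30FA },  /* Katakana letters */
    { 0x4E00, 0x9FFF },  /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },  /* Hangul Syllables */
    { 0xFF01, 0xFF60 },  /* Fullwidth ASCII variants */
    { 0xFFE0, 0xFFE6 }   /* Fullwidth signs */
};

/* Binary search of the table; returns 2 for a listed wide character and
 * 1 for everything else.  Unlike wcwidth(), it never returns -1 and does
 * not depend on the current locale. */
static int char_columns(unicode_char c)
{
    size_t lo = 0;
    size_t hi = sizeof(wide_ranges) / sizeof(wide_ranges[0]);

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (c < wide_ranges[mid].first)
            hi = mid;
        else if (c > wide_ranges[mid].last)
            lo = mid + 1;
        else
            return 2;
    }
    return 1;
}

/* Sum of the column widths of a run of characters. */
static size_t text_columns(const unicode_char *text, size_t len)
{
    size_t total = 0;

    assert(text != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        total += (size_t)char_columns(text[i]);
    return total;
}
```

With a lookup like this, the line-length check measures columns instead of
counting characters, so a run of ideographs fills a line twice as fast as
the same number of Latin letters.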
--- nezumi