[xsl] Hyphenation in XSL FO

Subject: [xsl] Hyphenation in XSL FO
From: Gustaf Liljegren <gustaf.liljegren@xxxxxx>
Date: Tue, 09 Jan 2001 21:35:42 +0100

After studying hyphenation in XSL, I have the following comments and questions:

As in DSSSL, there are three properties for specifying the locale used for
hyphenation (because each language has its own rules for this). The
intention seems to be (please correct me if I'm wrong) that implementors
have to learn and implement hyphenation rules for different countries,
languages and scripts. First of all, it must be quite a large job just to
implement half a dozen of the most common cases within ISO 8859-1.
Secondly, there are common exceptions to hyphenation rules, making it
necessary to treat each case individually. So I don't believe in this
approach.

The only way to get things right is a hyphenation dictionary, as in
DSSSL, FOSI or TeX. This is what I have in mind:

  <fo:block hyphenation-exception="uri(list.txt)">

The URI refers to a file containing a list of words with hyphens in the
appropriate places. When a word needs hyphenation, the formatter checks
this file to see where the word may be hyphenated. Words not in the list
are never hyphenated. If this feature is not added, I'm afraid we'll have
to wade through a lot of FO code to correct bad hyphenation before
producing the final output.
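
To make the lookup concrete, here is a minimal sketch in Python of how a
formatter might use such a list. The file format (one word per line, with
"-" marking each permitted break) and the function names are only
assumptions for illustration; nothing like this is defined in the spec.

```python
def load_exceptions(lines):
    """Map each word (lowercased, hyphens removed) to its allowed break offsets.

    Assumed format: one word per line, "-" at every permitted break.
    """
    table = {}
    for line in lines:
        entry = line.strip()
        if not entry:
            continue
        offsets = []
        pos = 0  # number of real characters seen so far
        for ch in entry:
            if ch == "-":
                offsets.append(pos)
            else:
                pos += 1
        table[entry.replace("-", "").lower()] = offsets
    return table


def break_points(word, table):
    """Return the offsets where `word` may be broken."""
    # Words not in the list are never hyphenated.
    return table.get(word.lower(), [])


exceptions = load_exceptions(["hy-phen-ation", "for-mat-ter"])
print(break_points("hyphenation", exceptions))  # [2, 6]
print(break_points("unknown", exceptions))      # []
```

An offset of 2 means the word may be broken after its second character.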

For URIs, a special problem arises. A URI often results in large spaces on
the following line. The best way I've seen to avoid that is to break the
URI after a "/", but without a hyphen. Any ideas for how this exception
could be handled?
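
One possible approach, sketched in Python: split the URI into segments
that each end with a "/", and let the formatter break between segments
without adding a hyphen. The function name is hypothetical, and keeping
"//" together is my own choice, not something taken from the spec.

```python
import re


def uri_break_segments(uri):
    """Split a URI into segments; a break is allowed between two segments.

    Each segment ends after a run of slashes, so "http://" stays in one
    piece. No hyphen is inserted at a break.
    """
    return re.findall(r"[^/]*/+|[^/]+", uri)


segments = uri_break_segments("http://www.mulberrytech.com/xsl/xsl-list")
print(segments)
# ['http://', 'www.mulberrytech.com/', 'xsl/', 'xsl-list']
```

Joining the segments gives back the original URI, so nothing is lost or
added at the break points.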

Ideally, there would be no need for hyphenation at all. The most convenient
way to avoid it is to adjust the space between words. Most layout programs
today do this one line at a time, instead of working with the whole
paragraph, and this seems to be the intent in XSL as well. Why not a
hyphenation algorithm that checks and adjusts the whole paragraph?
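
To show what "working with the whole paragraph" could mean, here is a
simplified Python sketch. It picks the line breaks that minimise the sum
of squared leftover space over all lines except the last. This is only an
illustration under my own assumptions (fixed-width characters, no
hyphenation, a made-up cost function); it is not TeX's actual algorithm
and not anything the XSL spec prescribes.

```python
def break_paragraph(words, width):
    """Break `words` into lines of at most `width` characters.

    Minimises the sum of squared trailing space over all lines except
    the last, considering the paragraph as a whole.
    """
    n = len(words)
    best = [float("inf")] * (n + 1)  # best[i]: minimal cost for words[i:]
    nxt = [n] * (n + 1)              # nxt[i]: where the line starting at i ends
    best[n] = 0
    for i in range(n - 1, -1, -1):
        length = -1
        for j in range(i, n):
            length += len(words[j]) + 1
            # An overlong word is still allowed alone on its own line.
            if length > width and j > i:
                break
            slack = 0 if j == n - 1 else (width - length) ** 2
            if slack + best[j + 1] < best[i]:
                best[i] = slack + best[j + 1]
                nxt[i] = j + 1
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


print(break_paragraph("aaa bb cc ddddd".split(), 6))
# ['aaa', 'bb cc', 'ddddd']
```

A line-at-a-time formatter would put "aaa bb" on the first line and leave
"cc" alone on the second; looking at the whole paragraph spreads the
leftover space more evenly.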

The spec doesn't mention anything about soft hyphens (&#173;). It's
important that soft hyphens are treated as such. That is, if your XML file
is sprinkled with soft hyphens at every place a hyphen may occur, the
finished output should be exactly as you want it. This could be a cheap
solution for those whose native hyphenation rules (and exceptions!) are not
implemented, and it would be valuable for those who need to make final
corrections in the FO.
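
As a rough Python sketch of the behaviour I would expect: the formatter
breaks only at a soft hyphen, shows a visible hyphen at the break, and
drops the soft hyphens it does not use. The function name and the
fixed-width "room" measure are assumptions for illustration only.

```python
SOFT_HYPHEN = "\u00ad"  # the same character as &#173;


def split_at_soft_hyphen(word, room):
    """Split `word` at the last soft hyphen that fits in `room` characters.

    Returns (head, tail). The head ends with a visible hyphen; unused
    soft hyphens are dropped. If nothing fits, the head is empty and the
    whole word moves to the next line.
    """
    parts = word.split(SOFT_HYPHEN)
    for k in range(len(parts) - 1, 0, -1):
        head = "".join(parts[:k]) + "-"
        if len(head) <= room:
            return head, "".join(parts[k:])
    return "", "".join(parts)


word = "hy\u00adphen\u00adation"
print(split_at_soft_hyphen(word, 8))  # ('hyphen-', 'ation')
print(split_at_soft_hyphen(word, 4))  # ('hy-', 'phenation')
print(split_at_soft_hyphen(word, 2))  # ('', 'hyphenation')
```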

Finally, I think there should be a property for setting the minimum number
of characters for hyphenated words.

Thanks for your time,

Gustaf Liljegren

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
