Subject: Re: fo:bidi-override - is it necessary?
From: Frank Wegmann <wegmann@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 7 Aug 1999 23:34:14 +0200
I have to agree with Steve. Still, I'd like to clarify some points about what "real processing" of bidi involves.

1. If you have a Unicode-capable application that is able to handle mixed L2R and R2L text (and this is what Chris mentioned as an example), then your application will know the following:

- Every Unicode character knows whether it (or rather the rendering of its visual representation, the glyph) will be written from left to right or vice versa.
- The Unicode control marks (LRM, RLM, LRE, RLE, PDF, LRO, RLO) will be paid attention to. There's simply no way around that; otherwise bidi processing doesn't make much sense.

2. If you have Hebrew or Arabic encoded in your documents, how will you represent them internally? UCS-2/UCS-4 is not very likely; e.g. we use UTF-8 for processing mixed Yiddish/German/English texts (Yiddish is written with Hebrew letters, erm .. mostly). In any case, you will very probably not see the glyphs in your marked-up document. So additional markup is not only excessively verbose, it is definitely superfluous (no one but tools will be able to read the pure markup), and will thus (as Steve pointed out) make processing harder. Here is a typical line from such a document:

  ¿<f>×§×?Ö¸× ×¦×¢× ×?, ¿<f>×§×?Ö¸× ×¦×¢× ×?,

Only after rendering will humans be able to read that stuff. (Peter Flynn might object to point 2, since he did some very valuable work with TEI WSDs and he used entity references for each and every Hebrew character -- something that would be too expensive for us.)

So this will very probably lead us into a land of confusion. Unicode-based bidi processing is still not a common thing to do, so let's not put new obstacles in the way.

Frank

> >It's markup instead of what are essentially formatting codes.
>
> But it's markup that is 100% equivalent to formatting codes.
>
> >And you *could* generate the bidi-override characters in the
> >transformation into FOs, but it's clearer to use markup rather than
> >cryptic hexcodes or entities.
>
> How is
>
> <p>This is some <fo:bidi-override direction="rtl">Arabic
> </fo:bidi-override> text.</p>
>
> any less cryptic than
>
> <p>This is some &rtl-begin;Arabic&rtl-end; text.</p>
>
> My real concern is that you now have two independent "absolute"
> formatting methods, without any clearly-defined semantics regarding
> how they might interact. What happens when LRO or RLO are used in the
> text within a fo:bidi-override element? Which one "wins"? The RFC you
> reference even mentions the problem explicitly:
>
> "authors and authoring software writers should be aware that conflicts
> can arise if the DIR attribute is used on inline elements (including
> BDO) concurrently with the use of the corresponding ISO 10646
> formatting characters."
>
> It just seems to me that given the fact that Unicode already supplies
> the elements required to handle all possible cases, the addition of
> another layer in the form of a markup tag adds nothing but unnecessary
> complexity. It makes the job of bidirectional formatting harder (and
> consequently more error-prone), not easier.
>
> -Steve
>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
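
To make points 1 and 2 of the reply a little more concrete, here is a minimal Python sketch. It assumes only the standard `unicodedata` module; the names `BIDI_CONTROLS`, `bidi_class` and `as_latin1` are made up for this illustration and do not come from the thread. It looks up the directional property that each character carries, lists the control marks named above, and shows why UTF-8 encoded Hebrew looks like noise in a viewer that assumes Latin-1.

```python
import unicodedata

# The directional control marks named in the message, with their code points.
BIDI_CONTROLS = {
    "LRM": "\u200e",  # left-to-right mark
    "RLM": "\u200f",  # right-to-left mark
    "LRE": "\u202a",  # left-to-right embedding
    "RLE": "\u202b",  # right-to-left embedding
    "PDF": "\u202c",  # pop directional formatting
    "LRO": "\u202d",  # left-to-right override
    "RLO": "\u202e",  # right-to-left override
}


def bidi_class(ch: str) -> str:
    """Return the bidirectional class the Unicode database assigns to ch."""
    return unicodedata.bidirectional(ch)


def as_latin1(text: str) -> str:
    """Show the UTF-8 bytes of text as a Latin-1 viewer would display them."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    # Each control mark carries its own directional class.
    for name, ch in BIDI_CONTROLS.items():
        print(f"{name}  U+{ord(ch):04X}  class={bidi_class(ch)}")

    # Ordinary letters carry a direction as well: Latin, Hebrew, Arabic.
    for ch in ("a", "\u05d0", "\u0627"):
        print(f"U+{ord(ch):04X}  class={bidi_class(ch)}")

    # One Hebrew letter becomes two unrelated characters when misread.
    print(repr(as_latin1("\u05d0")))
```

The lookup relies on the character database rather than on any markup, which is the sense in which an application "knows" the direction of each character; resolving embeddings and overrides across a whole line is a separate step that this sketch does not attempt.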