Subject: Re: [xsl] BIDI problem in XSL-FO
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 3 May 2016 15:58:52 -0000

As it happens, I just implemented some code to generate text-level analysis based on configured character ranges. The generated template looks like this:

  <xsl:template match="text()" mode="epub:textToCharSet-ja_jp">
    <xsl:param name="doDebug" as="xs:boolean" tunnel="yes" select="false()"/>
    <!-- Handle language ja_jp-->
    <xsl:if test="$doDebug">
      <xsl:message>+ [DEBUG] epub:textToCharSet-ja_jp: text="<xsl:value-of select="."/>"</xsl:message>
    </xsl:if>
    <xsl:analyze-string select="." regex="([c-o>]+)">
      <xsl:matching-substring>
        <xsl:sequence select="."/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <span class="non-native-text">
          <xsl:sequence select="."/>
        </span>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

In this case I'm identifying text *not* in the national language in question, but the same approach can of course be applied to other business logic.

In an earlier version of this code I had multiple groups in the regular expression and used a choice group to determine which group had matched: it checked each group in turn to see whether it was empty and used the one that was not.

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com

On 5/3/16, 10:42 AM, "Michael Müller-Hillebrand mmh@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>Hi Tony,
>
>Wow, what an interesting tool this is:
>http://www.unicode.org/cldr/utility/bidi.jsp
>
>Unfortunately, in my case the parentheses are likely to be just regular
>text and I have no direct way of knowing whether they surround Arabic or
>Western text (other than trying to find some all-purpose magic XPath
>analyzing basically every text() node). But the content inside the
>parentheses is tagged as non-translatable and I can take advantage of
>that.
>
><p>ARABIC <nt>Brand name</nt> (<nt>Former name</nt>) TEXT.</p>
>
>By playing around with the tool (and without proper understanding of the
>rules) I found some options that would make the parentheses correct, but
>then the preceding or following Arabic text is ordered the wrong way.
>
>I have the impression that direction control characters do not work as
>well in this situation as <fo:bidi-override> would. Unfortunately I have
>not heard back whether the presentation as
>
>.TXET (Former name) Brand name CIBARA
>
>is accepted by the client.
>
>- Michael
>
>BTW: I hope this is still on topic enough. That's why I mentioned XPath.
>
>
>> On 03.05.2016 at 14:21, Tony Graham tgraham@xxxxxxxxxxxxx
>> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> tldr: Put &#x200E; (LEFT-TO-RIGHT MARK) after the ')'.
>>
>> As Michael notes below, some characters, such as Latin letters, have a
>> 'strong' directionality, and some have a 'weak' or 'neutral'
>> directionality. The closing ')' is a 'neutral', and because it's at the
>> end of the string, it takes the 'embedding direction' [5], which is RTL
>> in Michael's example. You can see this with the bidi utility at
>>
>> http://www.unicode.org/cldr/utility/bidi.jsp?a=Brand+name+%28Former+name%E2%80%8E%29&p=RTL
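
As a rough illustration of the tl;dr quoted above, the following XSLT sketch appends a LEFT-TO-RIGHT MARK (U+200E) to a closing parenthesis that directly follows an <nt> element. It assumes the markup from Michael's sample (<p> and <nt>) and is an identity transform with one extra rule. Whether the mark alone gives acceptable results in a particular FO formatter has not been checked here.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity copy for everything not handled by the rule below. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A text node that starts with ')' and immediately follows an <nt>
       element: emit the ')' followed by U+200E, then the rest of the text. -->
  <xsl:template
      match="text()[starts-with(., ')')][preceding-sibling::node()[1][self::nt]]">
    <xsl:text>)&#x200E;</xsl:text>
    <xsl:value-of select="substring(., 2)"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the sample paragraph, only the text node ") TEXT." is changed; the opening parenthesis sits between the two <nt> elements and is left alone.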
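
The multi-group variant mentioned at the top of this message might look roughly like the sketch below. It assumes that "choice group" refers to xsl:choose and that the groups are tested with regex-group(), which returns a zero-length string for a group that did not take part in the match. The two character ranges and the class names are placeholders chosen for illustration, not the ranges from the original code.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="text()">
    <!-- Two alternative groups; exactly one is non-empty for each match.
         [A-Za-z] and [0-9] are placeholder ranges. -->
    <xsl:analyze-string select="." regex="([A-Za-z]+)|([0-9]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- Group 1 matched: a run of letters. -->
          <xsl:when test="regex-group(1) != ''">
            <span class="letters"><xsl:value-of select="."/></span>
          </xsl:when>
          <!-- Group 2 matched: a run of digits. -->
          <xsl:when test="regex-group(2) != ''">
            <span class="digits"><xsl:value-of select="."/></span>
          </xsl:when>
        </xsl:choose>
      </xsl:matching-substring>
      <!-- Anything outside both ranges is copied through unchanged. -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The single-group form shown in the generated template is simpler when only one class of text needs marking; the multi-group form is useful when different runs need different wrappers.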