Subject: Re: [xsl] BIDI problem in XSL-FO
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 3 May 2016 15:58:52 -0000
As it happens I just implemented some code to generate text-level analysis
based on configured character ranges.
The generated template looks like this:
<xsl:template match="text()" mode="epub:textToCharSet-ja_jp">
  <xsl:param name="doDebug" as="xs:boolean" tunnel="yes"
      select="false()"/>
  <!-- Handle language ja_jp-->
  <xsl:if test="$doDebug">
    <xsl:message>+ [DEBUG] epub:textToCharSet-ja_jp: text="<xsl:value-of select="."/>"</xsl:message>
  </xsl:if>
  <xsl:analyze-string select="." regex="([c-o>]+)">
    <xsl:matching-substring>
      <xsl:sequence select="."/>
    </xsl:matching-substring>
    <xsl:non-matching-substring>
      <span class="non-native-text">
        <xsl:sequence select="."/>
      </span>
    </xsl:non-matching-substring>
  </xsl:analyze-string>
</xsl:template>
In this case I'm identifying text *not* in the national language in
question but the same approach can be applied to other business logic of
course.
In an earlier version of this code I had multiple groups in the regular
expression and used xsl:choose to determine which group had matched,
checking each group in turn to see whether it was empty and using the
first one that was not.
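A minimal sketch of that multi-group variant follows. It is not the earlier code itself: the mode name, the class values, and the two character ranges (Hiragana/Katakana and basic Latin letters) are placeholders chosen for illustration. It relies on regex-group(n) returning a zero-length string when group n did not take part in the match.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical mode name, class values, and character ranges -->
  <xsl:template match="text()" mode="classify-text">
    <xsl:analyze-string select="."
        regex="([&#x3040;-&#x30FF;]+)|([A-Za-z]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- Group 1 is non-empty: the kana alternative matched -->
          <xsl:when test="regex-group(1) ne ''">
            <span class="kana"><xsl:value-of select="."/></span>
          </xsl:when>
          <!-- Group 2 is non-empty: the Latin alternative matched -->
          <xsl:when test="regex-group(2) ne ''">
            <span class="latin"><xsl:value-of select="."/></span>
          </xsl:when>
        </xsl:choose>
      </xsl:matching-substring>
      <!-- Text matching neither range is copied through unchanged -->
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Each alternative needs its own capturing group for this to work, and the order of the xsl:when branches only matters if two groups could be non-empty at the same time, which cannot happen with a top-level alternation like this one.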
Cheers,
Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com
On 5/3/16, 10:42 AM, "Michael Müller-Hillebrand mmh@xxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>Hi Tony,
>
>Wow, what an interesting tool this is:
>http://www.unicode.org/cldr/utility/bidi.jsp
>
>Unfortunately, in my case the parentheses are likely to be just regular
>text and I have no direct way of knowing whether they surround Arabic or
>Western text (other than trying to find some all-purpose magic XPath
>analyzing basically every text() node). But the content inside the
>parentheses is tagged as non-translatable and I can take advantage of
>that.
>
><p>ARABIC <nt>Brand name</nt> (<nt>Former name</nt>) TEXT.</p>
>
>By playing around with the tool (and without proper understanding of the
>rules) I find some options that would make the parentheses correct, but
>the preceding or following Arabic text will be ordered in the wrong way.
>
>I have the impression that direction control characters do not work as
>well in this situation as <fo:bidi-override> would. Unfortunately I have
>not heard back whether the presentation as
>
>.TXET (Former name) Brand name CIBARA
>
>is accepted by the client.
>
>- Michael
>
>BTW: I hope this is still on topic enough. That's why I mentioned XPath.
>
>
>> On 03.05.2016 at 14:21, Tony Graham tgraham@xxxxxxxxxxxxx
>><xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>
>> tldr: Put &#x200E; (LEFT-TO-RIGHT MARK) after the ')'.
>
>> As Michael notes below, some characters, such as Latin letters, have a
>> 'strong' directionality, and some have a 'weak' or 'neutral'
>> directionality. The closing ')' is a 'neutral', and because it's at the
>> end of the string, it takes the 'embedding direction' [5], which is RTL
>> in Michael's example. You can see this with the bidi utility at
>>
>>http://www.unicode.org/cldr/utility/bidi.jsp?a=Brand+name+%28Former+name%E2%80%8E%29&p=RTL
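
The tip quoted above (a LEFT-TO-RIGHT MARK after the closing parenthesis) could be applied in the stylesheet roughly as follows. This is only a sketch under stated assumptions: it takes the <p> and <nt> element names from Michael's example, assumes text nodes are processed with xsl:apply-templates, and assumes the ')' is the first character of the text node directly following an <nt>. Whether the result renders correctly still depends on the formatter's bidi support.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed markup: <p> paragraphs in an RTL context containing <nt> runs.
       Matches a text node that directly follows an <nt> and starts with ')'. -->
  <xsl:template
      match="p/text()[preceding-sibling::node()[1][self::nt]][starts-with(., ')')]">
    <!-- Emit the ')' followed by U+200E LEFT-TO-RIGHT MARK -->
    <xsl:text>)&#x200E;</xsl:text>
    <!-- Copy the remainder of the text node unchanged -->
    <xsl:value-of select="substring(., 2)"/>
  </xsl:template>

</xsl:stylesheet>
```

With the mark in place, the ')' sits between two strongly left-to-right characters, so it no longer takes the RTL embedding direction.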