Re: [xsl] Implementation Advice: Grouping Strings by Character Range in XSLT 2

Subject: Re: [xsl] Implementation Advice: Grouping Strings by Character Range in XSLT 2
From: "Eliot Kimber ekimber@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 29 Apr 2016 15:43:09 -0000
Cool. My initial implementation attempt uses analyze-string just as you
show and seems to work as I wanted.
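
For the archive, here is a minimal sketch of that approach, using Ken's example regex. The mode name "ranges-example" is only a placeholder, and this is an illustration rather than my actual generated code, which would have one capturing group and one xsl:when per range for the language.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Placeholder mode name; one such template would be generated per language. -->
  <xsl:template match="text()" mode="ranges-example">
    <!-- One capturing group per range: group 1 = R1 ("cde"), group 2 = R2 ("g"). -->
    <xsl:analyze-string select="." regex="([cde]+)|([g]+)">
      <xsl:matching-substring>
        <xsl:choose>
          <!-- A group that took no part in the match is a zero-length string. -->
          <xsl:when test="regex-group(1) != ''">
            <span class="R1"><xsl:value-of select="."/></span>
          </xsl:when>
          <xsl:when test="regex-group(2) != ''">
            <span class="R2"><xsl:value-of select="."/></span>
          </xsl:when>
        </xsl:choose>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Characters outside every range pass through unmarked. -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the text "abcdefg", this should give ab<span class="R1">cde</span>f<span class="R2">g</span>. One thing to watch when generating the regex: the regex attribute is an attribute value template, so any literal curly braces in it have to be doubled.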

Cheers,

E.
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 4/29/16, 10:19 AM, "G. Ken Holman g.ken.holman@xxxxxxxxx"
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

>I think any time I am going from a string "up" to rich markup
>(remember the Omnimark triangle? Perhaps they borrowed the triangle
>from someone else), I would use analyze-string.
>
>And I think it would be the easiest to synthesize as well, something
>along the lines of:
>
>     regex="([cde]+)|([g]+)"
>
>... then using regex-group(n) for each range.
>
>One would have to use tail recursion for XSLT 1, but I don't think it
>buys anything, plus your synthesis would be a lot more complicated
>(yes, I know it is done only once).  Remember the XSLT processor is
>optimizing analyze-string rather than any stylesheet expression of
>the tail recursion.
>
>. . . . . . Ken
>
>At 2016-04-29 15:04 +0000, Eliot Kimber ekimber@xxxxxxxxxxxx wrote:
>>Using XSLT 2, I have a requirement to take text and group contiguous
>>sequences of characters in markup according to the character range the
>>characters fall in. This is to support the application of range-specific
>>fonts to text in HTML.
>>
>>I have a static definition of the character ranges for a given national
>>language and there shouldn't be any overlap between ranges. Given this
>>static definition, I'm generating XSLT code to operate on text nodes in
>>order to apply the range markup.
>>
>>For example, given the text string "abcdefg" where range "R1" is "cde"
>>and R2 is "g", the marked up result should be:
>>ab<span class="R1">cde</span>f<span class="R2">g</span>
>>
>>My initial approach is to generate a template that takes the current
>>language and the text node and then applies templates in a
>>language-specific mode.
>>
>>For each language I'm then generating a template to do the range
>>matching.
>>
>>My question: once I'm in a language-specific template for a text node,
>>what is the most efficient and/or easiest-to-code way to map the string
>>to ranges? Since I'm generating the code, it doesn't have to be concise.
>>
>>I'm thinking along the lines of using analyze-string to match on any of
>>the groups and then within the matching-substring clause have a choice
>>group to determine which range actually matched. But it feels like I'm
>>missing a more elegant way to determine the actual range.
>>
>>Or maybe there's a clearer/simpler/more efficient way using tail
>>recursion?
>>
>>Thanks,
>>
>>Eliot
>>----
>>Eliot Kimber, Owner
>>Contrext, LLC
>>http://contrext.com
>>
>>
>
>
>--
>Check our site for free XML, XSLT, XSL-FO and UBL developer resources |
>Streaming hands-on XSLT/XPath 2 training @US$45: http://goo.gl/Dd9qBK |
>Crane Softwrights Ltd. _ _ _ _ _ _ http://www.CraneSoftwrights.com/s/ |
>G Ken Holman _ _ _ _ _ _ _ _ _ _ mailto:gkholman@xxxxxxxxxxxxxxxxxxxx |
>Google+ blog _ _ _ _ _ http://plus.google.com/+GKenHolman-Crane/posts |
>Legal business disclaimers: _ _ http://www.CraneSoftwrights.com/legal |
>
>
