[xsl] Re: Generate identifier

From: "Vladimir Nesterovsky" <vladimir@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Jan 2010 19:50:12 +0200
I have missed the original post, but the effect of diacritics
is quite different from language to language.
Sometimes it is "only" an accent, in other languages it
changes the sound and sometimes meaning of a character.
What is a "Western" language? If you think of European
languages, there are some that do not use ASCII characters
at all (Cyrillic, Greek) and your method will not work.
So I would just drop them or replace them with an underscore.
Saves a lot of energy :-)

You're absolutely right. It's not possible to build a perfect name suggestion from an arbitrary string.
In my case the target is a COBOL name, which has to match:
cobol-word = [A-Za-z0-9]+ ([\-]+ [A-Za-z0-9]+)*
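
For illustration only, that grammar can be checked with an anchored regular expression. This is a sketch, not part of my stylesheet: the function name t:is-cobol-word, the namespace URI bound to t: and the "main" entry template are all made up for the example.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:t="http://example.com/cobol-names"
  exclude-result-prefixes="xs t">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: true when $name matches the cobol-word grammar. -->
  <xsl:function name="t:is-cobol-word" as="xs:boolean">
    <xsl:param name="name" as="xs:string"/>
    <!-- ^ and $ anchor the match so the whole string has to conform. -->
    <xsl:sequence select="matches($name, '^[A-Za-z0-9]+(-+[A-Za-z0-9]+)*$')"/>
  </xsl:function>

  <!-- Hypothetical entry point: prints each sample and whether it conforms. -->
  <xsl:template name="main">
    <xsl:for-each select="('CUSTOMER-NAME', 'A--B', '-LEADING', 'TRAILING-', 'CAF&#xc9;')">
      <xsl:value-of select="concat(., ': ', t:is-cobol-word(.), '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Started from the named template (with Saxon, for instance, via -it:main), this reports true for CUSTOMER-NAME and A--B, and false for the names with a leading or trailing hyphen and for the one containing an accented letter.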


My current implementation looks like this:

 <!--
   Creates a normalized name for the specified name components.
     $components - name components to generate a normalized name for.
     $default-name - a default name in case a name cannot be built.
     Returns a normalized name.
 -->
 <xsl:function name="t:create-name" as="xs:string?">
   <xsl:param name="components" as="xs:string*"/>
   <xsl:param name="default-name" as="xs:string?"/>

   <xsl:variable name="parts" as="xs:string*">
     <xsl:for-each select="$components">
       <xsl:analyze-string
         regex="[A-Z0-9]+"
         flags="imx"
         select="
           replace
           (
             replace(normalize-unicode(upper-case(.), 'NFD'), '&#xc6;', 'AE'),
             '[\p{Sk}\p{Mc}\p{Me}\p{Mn}]',
             ''
           )">
         <xsl:matching-substring>
           <xsl:sequence select="."/>
         </xsl:matching-substring>
       </xsl:analyze-string>
     </xsl:for-each>
   </xsl:variable>


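   <!--
     Joins the parts with '-'. If the first part starts with a digit
     (it compares between '0' and ':', the character after '9'),
     it is prefixed with $default-name, or with 'N' when no default is given.
   -->
   <xsl:sequence select="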
     if (empty($parts)) then
       $default-name
     else
       string-join
       (
         (
           for
             $i in 1 to count($parts),
             $part in $parts[$i]
           return
             if (($i = 1) and ($part lt ':') and ($part ge '0')) then
               (($default-name, 'N')[1], $part)
             else
               $part
         ),
         '-'
       )"/>
 </xsl:function>
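
As a rough usage sketch, the function can be exercised from a small driver stylesheet. The file name create-name.xslt, the namespace URI and the "main" template are assumptions made for the example; the URI has to be whatever t: is really bound to in the stylesheet that defines the function.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:t="http://example.com/cobol-names"
  exclude-result-prefixes="t">

  <!-- Hypothetical file that contains the t:create-name function above. -->
  <xsl:include href="create-name.xslt"/>

  <xsl:output method="text"/>

  <!-- Hypothetical entry point: prints one generated name per line. -->
  <xsl:template name="main">
    <xsl:value-of separator="&#10;" select="
      t:create-name('Cr&#xe8;me br&#xfb;l&#xe9;e', 'NAME'),
      t:create-name(('2nd', 'item'), ()),
      t:create-name('2nd item', 'FIELD'),
      t:create-name('???', 'UNNAMED')"/>
  </xsl:template>

</xsl:stylesheet>
```

Assuming the default collation is the Unicode codepoint collation, the four calls give CREME-BRULEE, N-2ND-ITEM, FIELD-2ND-ITEM and UNNAMED: accents are dropped after NFD decomposition, a leading digit gets the default name (or 'N') in front of it, and a string without any letters or digits falls back to the default name.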

--
Vladimir Nesterovsky
http://www.nesterovsky-bros.com/
