Re: [xsl] Unexpected MSXML Javascript extension results

Subject: Re: [xsl] Unexpected MSXML Javascript extension results
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 20 Jan 2017 10:11:42 -0000

On 19.01.2017 22:23, C. Edward Porter cep@xxxxxxx wrote:
Hello all,



We have an XSL transformation that runs using some form of MSXML for the actual transformation. Since MSXML supports only XSLT 1.0, I am trying to work around its lack of regular expression functions by writing a JavaScript extension function that matches Greek characters and wraps them in a span, so that they are not wrongly transformed to capital letters by the small-caps treatment applied to the surrounding text. The JavaScript function I wrote appears to work fine if I run it locally, but when run as part of the XSL, it's not recursing properly. I don't have much of a way to debug this, so I'm hoping someone can either recognize why I'm getting inconsistent recursion or suggest an alternative approach. Code/sample text below:



JavaScript Function:

<msxsl:script language="javascript" implements-prefix="uspc"><![CDATA[
//Recursive function to wrap Greek characters in nosmallcaps span
function wrapGreek(str){
  var regex = /(.*)([α-ωΑ-Ω])(.*)/g;
  var m = regex.exec(str);
  var rStr = "";
  if(m != null){
    rStr = wrapGreek(m[1]);
  } else {
    rStr = str;
  }
  if(m != null) {
    rStr += '<span class="nosmallcaps">' + m[2] + '</span>' + m[3];
  }
  return "" + rStr;
}
]]></msxsl:script>



Text template:

<xsl:template match="text()" priority="20">
  <xsl:variable name="textout">
    <xsl:value-of select="."/>
  </xsl:variable>
  <xsl:value-of disable-output-escaping="yes" select="uspc:wrapGreek($textout)"/>
</xsl:template>



Sample Content:

This head composed correctly:

<title cid="pttFA" id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title>



For this one, only the second alpha symbol was wrapped, so it did not recurse. Copying this text to a JavaScript interpreter and running the function on it in isolation does appear to recurse correctly.

<title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>

It seems you simply want to do a string replacement, so I wonder why you need the exec method and can't simply use the replace method on strings, e.g.



function wrapGreek(str) {
  return str.replace(/[α-ωΑ-Ω]+/g, function(m) { return '<span class="nosmallcaps">' + m + '<\/span>'; });
}
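
As a small variation on the function above, the callback can be swapped for the "$&" replacement pattern, which stands for the whole match. This is only a sketch: the name wrapGreekSimple is made up for illustration, the character class is written with \u escapes for the ranges α-ω and Α-Ω (an assumption about the intended ranges), and it assumes the script engine in use supports "$&" in replacement strings.

```javascript
// Sketch only: wrapGreekSimple is a hypothetical name, not part of the stylesheet.
// \u03B1-\u03C9 covers lowercase alpha to omega, \u0391-\u03A9 uppercase Alpha to Omega.
function wrapGreekSimple(str) {
  // "$&" inserts the matched run of Greek letters; no callback is needed.
  return str.replace(/[\u03B1-\u03C9\u0391-\u03A9]+/g,
                     '<span class="nosmallcaps">$&<\/span>');
}

// Both alphas in the problem title are wrapped in a single, non-recursive pass.
var wrapped = wrapGreekSimple('Infr\u03B1ared Abso\u03B1rption, ');
```

Because the g flag makes replace walk over every match itself, no recursion and no exec bookkeeping is involved.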


A full stylesheet is

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:mf="http://example.com/mf"
xmlns:ms="urn:schemas-microsoft-com:xslt"
exclude-result-prefixes="mf ms"
version="1.0">

<ms:script language="JScript" implements-prefix="mf">
<![CDATA[
function wrapGreek(str) {
  return str.replace(/[α-ωΑ-Ω]+/g, function(m) { return '<span class="nosmallcaps">' + m + '<\/span>'; });
}
]]>
</ms:script>

<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>


<xsl:template match="text()" priority="20">
<xsl:value-of disable-output-escaping="yes" select="mf:wrapGreek(string())"/>
</xsl:template>

</xsl:stylesheet>


I have tested it in oXygen with MSXML 3 and with the two .NET implementations offered there that support the ms:script element. The input

<?xml version="1.0" encoding="UTF-8"?>
<root>
<title cid="pttFA" id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DESαCRIPαTION</title>
<title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infrαared Absoαrption, <i cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>
</root>


is transformed into

<root>
<title cid="pttFA" id="GUID-04193AE6-05B6-44C1-AB2A-E8A900F93CE3">DES<span class="nosmallcaps">α</span>CRIP<span class="nosmallcaps">α</span>TION</title>
<title cid="1IMidE" id="GUID-D279E612-1350-4709-B662-8FFEF76CD0C0">Infr<span class="nosmallcaps">α</span>ared Abso<span class="nosmallcaps">α</span>rption, <i cid="2KvZ79">Spectrophotometric Identification Tests, Appendix IIIC</i></title>
</root>
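
One thing to keep in mind with disable-output-escaping="yes": the string returned by the extension function is written out as-is, so a text node that contains & or < would end up unescaped in the result. A minimal sketch of one way to guard against that, assuming the text is escaped before the spans are added; escapeMarkup and wrapGreekEscaped are hypothetical names, and the Greek ranges are again written as \u escapes:

```javascript
// Sketch only: both function names are made up for illustration.
// Escape the characters that would otherwise be emitted raw.
function escapeMarkup(str) {
  // "&" must be handled first so the entities added below are not re-escaped.
  return str.replace(/&/g, '&amp;')
            .replace(/</g, '&lt;')
            .replace(/>/g, '&gt;');
}

// Escape first, then wrap runs of Greek letters; escaping never touches them.
function wrapGreekEscaped(str) {
  return escapeMarkup(str).replace(/[\u03B1-\u03C9\u0391-\u03A9]+/g, function (m) {
    return '<span class="nosmallcaps">' + m + '<\/span>';
  });
}
```

Inside the stylesheet, such a function would be called the same way as wrapGreek, from the text() template.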

