Re: [xsl] Javascript for extension functions

Subject: Re: [xsl] Javascript for extension functions
From: "bryan rasmussen" <rasmussen.bryan@xxxxxxxxx>
Date: Mon, 10 Sep 2007 11:02:30 +0200
Well, the thing about javascript is that much of its power depends on
what hooks it has into the system it runs in. But let us suppose that
the javascript implementation in the processor has greater access to
the system external to the processor than the processor gives xslt
itself.

An example would be MSXML, which can take extension functions written
in any Active Scripting language; thus you can use ActivePerl and
ActivePython as well as javascript and vbscript as the extension
language.

By doing this one can easily build one's own XML DSLs. As an example,
I wrote the following in a document for the oioxml project. I can't
find the most current version, but this one will do (text quoted
below, tested in MSXML 4 running in MSXSL, not sure about newer
versions of the processor):

http://www.oio.dk/files/OIXML_XSLT_Guidebook.pdf

"The use of extension functions and elements is often justified on the
basis of speed. Given that in most processors the calling of such a
function would add some overhead before any speed benefit could be
realized, this argument is not a good one, as noted by Dimitre
Novatchev here in his discussion of using XSLT for specific functions
instead of a javascript extension
http://sources.redhat.com/ml/xsl-list/2002-09/msg00745.html and also
in the third page of his article on his EXSLT implementation
http://www.xml.com/pub/a/2003/08/06/exslt.html?page=3
In some processors extension functions can be handled via a script tag
in the processor's namespace. This approach was also followed in
XSLT 1.1, which was never issued as a W3C recommendation due to a
decision to go forward with XSLT 2.0 development. This possibility of
a script tag was vigorously fought against by the XSLT community,
including a petition not to allow an xsl:script element
http://www.biglist.com/lists/xsl-list/archives/200103/msg00000.html .

Given that all the languages we can think of off-hand that are used in
these script tags have an eval function or similar capabilities, it
follows that a function could simply be implemented to evaluate
arbitrary scripting code.
An example of such a usage of a script extension, given for
informational purposes only because of the obvious security issues
involved in such a scenario, is shown below for msxsl:

<?xml version='1.0' encoding='UTF-8'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:jsc="http://www.somefakeurl/jsc"
xmlns:js="http://www.somefakeurl/jscriptEvalExtension"
version="1.0"
extension-element-prefixes="msxsl xsl js jsc"
>
<xsl:output method="xml" encoding="UTF-8"/>
<msxsl:script language="JScript" implements-prefix="jsc">
function jseval(value){
valparam = eval(value);
return valparam;
}
function jsshell(param){
var WshShell = new ActiveXObject("WScript.Shell"); WshShell.Run(param);
var out = param + " done";
return out
}
function jsGetVar(value){
stempval = eval(value);
return String(stempval);
}

<![CDATA[
function isAlien(a) { return isObject(a) && typeof a.constructor != 'function';
}
]]>
</msxsl:script>
<xsl:template match="*"><xsl:copy><xsl:copy-of select="@*"/>
<xsl:apply-templates/>
</xsl:copy></xsl:template>
<xsl:template match="js:runStraight" ><xsl:value-of
select="jsc:jseval(string(.))"/>
</xsl:template>
<xsl:template match="js:getJsVariable" >
<xsl:value-of select="jsc:jsGetVar(string(@name))"/></xsl:template>
<xsl:template match="js:variable" >
<xsl:variable name="var"><xsl:apply-templates/></xsl:variable>
<xsl:value-of select="jsc:jseval(concat(string(@name),'=',string($var),';'))"/></xsl:template>
<xsl:template match="js:shell" ><cmdrun><xsl:value-of
select="jsc:jsshell(string(.))"/></cmdrun></xsl:template>
</xsl:stylesheet>

This stylesheet will copy all elements directly except for those
elements in the js namespace. Those elements it will evaluate with
script in the jsc namespace; the script in the jsc namespace is just
javascript.

... followed by some simple explanations....
Running the stylesheet above against the following example:

<?xml version="1.0"?>
<test xmlns:js="http://www.somefakeurl/jscriptEvalExtension"
>
<example>
<desc>send the string from js:runStraight to an eval function, returns
string</desc>
<js:runStraight>blah ='blahs';</js:runStraight></example>
<js:variable name="foo">'bar'</js:variable>
<p> the next tag gets the value of foo which was set by the former tag. </p>
<js:getJsVariable name="foo"/>
<p> the next tag gets the value of blah </p>
<js:getJsVariable name="blah"/>
<p> now we change the value of blah </p>
<js:runStraight>var blah; blah= &quot;new blah&quot;;</js:runStraight>
</test>

results in the following output:
<?xml version="1.0" encoding="UTF-8"?><test
xmlns:js="http://www.somefakeurl/jscriptEvalExtension">
<example>
<desc>send the string from js:runStraight to an eval function, returns
string</desc>
blahs</example>
bar
<p> the next tag gets the value of foo which was set by the former tag. </p>
bar
<p> the next tag gets the value of blah </p>
blahs
<p> now we change the value of blah </p>
new blah
</test>

Furthermore, if we make the following instance:
<js:shell xmlns:js="http://www.somefakeurl/jscriptEvalExtension">cmd</js:shell>
the xml result will be the following:
<?xml version="1.0" encoding="UTF-8"?>
<cmdrun>cmd done</cmdrun>

and when running the stylesheet from an environment that has rights to
call the command line, such as msxsl.exe, cmd.exe will be called
"
... followed by a screenshot of this outrageous security violation
taking place :)
....


"For various reasons, such as possible faults in the xslt processor,
we would not recommend using this method for script storage. However,
it does provide a theoretically interesting example of how an
extensible programming interpreter might work, following the ideas
presented by Gregory V. Wilson in his article for ACM Queue. As such
we think it provides an interesting area of experimentation..."

Of course the example above was sort of stupid because I would never
just eval from input, so I have to admit I couldn't think of a good
example.
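
For what it is worth, here is a rough sketch of how the variable part
of the example could work without eval at all. This is not from the
guidebook; the names jsVars, isSafeName and jsSetVar are made up for
illustration, and I am assuming the script engine is plain
ES3-compatible JScript. Values are stored as data in an object and
never executed.

```javascript
// Hypothetical eval-free replacement for the jseval/jsGetVar pair.
// Values are kept in a plain object and are never executed as code.
var jsVars = {};

// Accept only simple identifier-like names (an assumed, deliberately strict rule).
function isSafeName(name) {
  return /^[A-Za-z_][A-Za-z0-9_]*$/.test(String(name));
}

// Store a value under a name; returns the stored string, or "" if the name is rejected.
function jsSetVar(name, value) {
  if (!isSafeName(name)) {
    return "";
  }
  // The "v_" prefix keeps names such as __proto__ from touching built-in properties.
  jsVars["v_" + name] = String(value);
  return jsVars["v_" + name];
}

// Look a value up by name; returns "" for unknown or rejected names.
function jsGetVar(name) {
  var key = "v_" + name;
  if (!isSafeName(name) || !Object.prototype.hasOwnProperty.call(jsVars, key)) {
    return "";
  }
  return jsVars[key];
}
```

With something like this, the js:variable template would call
jsc:jsSetVar(string(@name), string($var)) instead of building a
statement for eval, and the element content would be plain text
(bar rather than 'bar'). It obviously gives up js:runStraight, which
is rather the point.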

As for XMLHTTP, that is implemented in different ways in different
browsers. In IE it exists as an ActiveXObject (this was the first
implementation), and as such you can use it as an extension in MSXML.

However I think an extension for all HTTP verbs in SAXON would be more
xsl-licious.
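
For the MSXML case, a minimal sketch of a POST helper that could sit
inside an msxsl:script block might look like the following. It is
untested against any particular MSXML version. The function name
httpPost and the content type are my own choices for illustration,
and it assumes a Windows host where the MSXML2.XMLHTTP ProgID is
registered and scripting is allowed.

```javascript
// Sketch of a POST helper intended for the body of an msxsl:script block.
// Assumes the "MSXML2.XMLHTTP" ProgID is available on the host; the name
// httpPost and the Content-Type value are illustrative, not from any spec.
function httpPost(url, body) {
  var xhr = new ActiveXObject("MSXML2.XMLHTTP");
  xhr.open("POST", String(url), false); // false = synchronous request
  xhr.setRequestHeader("Content-Type", "text/xml; charset=UTF-8");
  xhr.send(String(body));
  if (xhr.status < 200 || xhr.status >= 300) {
    // Report failures as text so the stylesheet can tell them apart.
    return "HTTP error " + xhr.status;
  }
  return xhr.responseText;
}
```

From the stylesheet it would then be called like any other function
in the jsc namespace, for instance jsc:httpPost(string(@href),
string(.)). It returns the response as a string, so turning it back
into a node-set is left to whatever the processor offers for that.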


Cheers,
Bryan Rasmussen


On 9/10/07, Abel Braaksma <abel.online@xxxxxxxxx> wrote:
> Colin Paul Adams wrote:
> >     Abel> surely it will, esp. because most PHP programmers write for the
> >     Abel> internet and this usually requires ecmascript/javascript
> >     Abel> skills: it is familiar already.
> >
> > But is it useful?
> >
> > What will the programer be able to do that (s)he cannot do with
> > xsl:function?
> >
> > If using the host language, I can envisage various answers, but Javascript?
> >
>
> One thing that I dearly miss from the XSLT spec is a way to do a POST
> request (i.e., retrieve a SOAP or JSON XML document through web
> services). With JavaScript one has the ability to use XmlHTTPRequest
> which can be used for issuing a POST to the server and returning an XML
> object.
>
> Another thing you can do is check the existence of an external resource,
> i.e., if you use unparsed-text-available() the function will fully parse
> the text and will only fail or succeed. It won't say, for instance, that
> the UTF-8 contains invalid characters. I happen to work a lot with
> external resources. Knowing the difference between the availability of a
> URL and the unparsability of a URL is of great value to me.
>
> And I'm sure there are other things that cannot be done through
> xsl:function. I.e., try to process a ZIP file will be rather hard
> (meaning: impossible) in XSLT because of the &#x0; characters.
>
> -- Abel Braaksma
