[xsl] Turning escaped mixed content back to XML

Subject: [xsl] Turning escaped mixed content back to XML
From: Martin Holmes <mholmes@xxxxxxx>
Date: Fri, 28 Mar 2014 11:12:37 -0700

Hi there,

I'm trying to process an ODS spreadsheet in which the <text:p> nodes contain embedded mixed-content markup in escaped form:

<text:p>indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent &lt;gi&gt;surface&lt;/gi&gt; element as implied by the dimensions given in the &lt;gi&gt;msDesc&lt;/gi&gt; element or by the coordinates of the &lt;gi&gt;surface&lt;/gi&gt; itself. The orientation is expressed in arc degrees.</text:p>

I need to turn this back into parsed XML for insertion into XML documents. I'm using Saxon 9.4 with XSLT 2.0 (and I can use 3.0 if necessary). I've tried a variety of approaches involving saxon:serialize with disable-output-escaping, feeding the result into saxon:parse, but the output always ends up escaped, just like the input. Does anyone have experience of doing this?

Here's the sort of thing I've tried:

[...]
<xsl:output name="outputSerializedTEI" method="xml" indent="no" encoding="UTF-8" exclude-result-prefixes="#all" omit-xml-declaration="yes" />


[...]

<xsl:template match="text:p" exclude-result-prefixes="#all">
  <xsl:variable name="unparsed">
    <xsl:copy-of select="*|text()"/>
  </xsl:variable>
  <xsl:variable name="parsed" select="saxon:parse(saxon:serialize($unparsed, 'outputSerializedTEI'))"/>
  <tei:p>
    <xsl:copy-of select="$parsed"/>
  </tei:p>
</xsl:template>
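
For comparison, here is a minimal sketch of one direction that might avoid the re-escaping; it has not been confirmed against this data. It assumes the escaping comes back because serializing a text node escapes `<` and `&` again, so it skips saxon:serialize and hands the string value of text:p straight to saxon:parse, wrapped in a root element so that the string is a well-formed document. The namespace URIs bound to text:, tei: and saxon: are assumptions (the usual ODF, TEI and Saxon ones), and the wrapper string is purely illustrative.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all">

  <xsl:template match="text:p">
    <!-- The string value of text:p already contains literal markup such as
         <gi>surface</gi>, because the XML parser unescaped it on input.
         Wrap it in a TEI p element (assumed namespace URI) so that the
         string forms a complete, well-formed document. -->
    <xsl:variable name="wrapped"
        select="concat('&lt;p xmlns=&quot;http://www.tei-c.org/ns/1.0&quot;&gt;',
                       string(.),
                       '&lt;/p&gt;')"/>
    <!-- saxon:parse returns a document node; copy its root element. -->
    <xsl:copy-of select="saxon:parse($wrapped)/*"/>
  </xsl:template>

</xsl:stylesheet>
```

This can only work if the content of every cell is well-formed once unescaped; a stray & or an unbalanced tag would make saxon:parse raise an error. With XSLT 3.0 enabled, the standard parse-xml() function could presumably take the place of saxon:parse.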


Cheers,
Martin
