Re: [xsl] Turning escaped mixed content back to XML

Subject: Re: [xsl] Turning escaped mixed content back to XML
From: Martin Honnen <Martin.Honnen@xxxxxx>
Date: Fri, 28 Mar 2014 19:35:24 +0100
Martin Holmes wrote:

I'm trying to process an ODS spreadsheet which has <text:p> nodes which
contain embedded mixed-content markup in escaped form:

<text:p>indicates the amount by which this zone has been rotated
clockwise, with respect to the normal orientation of the parent
&lt;gi&gt;surface&lt;/gi&gt; element as implied by the dimensions given
in the &lt;gi&gt;msDesc&lt;/gi&gt; element or by the coordinates of the
&lt;gi&gt;surface&lt;/gi&gt; itself. The orientation is expressed in arc
degrees.</text:p>

I need to turn this back into parsed XML for insertion into XML
documents. I'm using Saxon 9.4 with XSLT 2 (and I can use 3 if
necessary).

I tried


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:text="http://example.com"
  xmlns:tei="http://example.com/tei"
  version="3.0">

<xsl:template match="text:p">
  <tei:p>
    <xsl:copy-of select="parse-xml-fragment(.)"/>
  </tei:p>
</xsl:template>

</xsl:stylesheet>

with Saxon 9.5 PE and got


<?xml version="1.0" encoding="UTF-8"?><tei:p xmlns:text="http://example.com" xmlns:tei="http://example.com/tei">indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent <gi>surface</gi> element as implied by the dimensions given in the <gi>msDesc</gi> element or by the coordinates of the <gi>surface</gi> itself. The orientation is expressed in arc degrees.</tei:p>


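That output contains real XML elements rather than escaped markup, so it should do what you need. You will need to change the namespaces to the real ones, and you may want to use exclude-result-prefixes so that unused declarations such as xmlns:text are not copied to the result.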
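
As a rough sketch of those adjustments, the stylesheet might look like the following. It assumes the input uses the standard OpenDocument text namespace and that the embedded elements such as gi are meant to end up in the TEI namespace; the mode name "to-tei" is made up for illustration. Since parse-xml-fragment() returns the gi elements in no namespace, the second template recreates them in the TEI namespace instead of copying them as they are. If you want them left in no namespace, the plain xsl:copy-of from the first stylesheet is enough.

```xslt
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
  xmlns:tei="http://www.tei-c.org/ns/1.0"
  exclude-result-prefixes="text"
  version="3.0">

<!-- Parse the escaped markup of each paragraph and process the resulting nodes. -->
<xsl:template match="text:p">
  <tei:p>
    <xsl:apply-templates select="parse-xml-fragment(.)/node()" mode="to-tei"/>
  </tei:p>
</xsl:template>

<!-- Assumption: parsed elements (gi etc.) belong in the TEI namespace.
     They arrive in no namespace, so rebuild each one under the tei prefix. -->
<xsl:template match="*" mode="to-tei">
  <xsl:element name="tei:{local-name()}">
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates mode="to-tei"/>
  </xsl:element>
</xsl:template>

</xsl:stylesheet>
```

Text nodes inside the parsed fragment are handled by the built-in template rules, which copy the text through unchanged. Listing "text" in exclude-result-prefixes keeps the xmlns:text declaration off the tei:p result element.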
