Subject: [xsl] Re: Turning escaped mixed content back to XML
From: Martin Holmes <mholmes@xxxxxxx>
Date: Fri, 28 Mar 2014 14:06:30 -0700
Cheers, Martin
That's what I needed: parse-xml-fragment(). This seems to work:
<xsl:template match="text:p" exclude-result-prefixes="#all">
  <!-- <xsl:variable name="unparsed" select="concat('<p>', string-join(//text(), ''), '</p>')"/>
       <xsl:variable name="parsed" select="saxon:parse($unparsed)"/>
       <xsl:copy-of select="$parsed" exclude-result-prefixes="#all"/>-->
  <xsl:if test="string-length(.) gt 0">
    <tei:p>
      <xsl:value-of select="parse-xml-fragment(string-join(//text(), ''))"/>
    </tei:p>
  </xsl:if>
</xsl:template>
for most cases. I do have some horrible edge-cases though:
<text:p>a start-tag, with delimiters < and > is intended</text:p>
I should be able to pre-process the input text, though: find angle brackets that are surrounded by spaces and temporarily swap them out for something else.
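One way that pre-processing could look, as a minimal sketch rather than a tested solution: it assumes XSLT 3.0 (for parse-xml-fragment()), and the function name f:protect-stray-lt and its namespace are made up for illustration. It relies on the fact that a '<' followed by whitespace can never open a tag, so it can be re-escaped before parsing; a bare '>' is legal in character data and can be left alone.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:text="http://example.com"
    xmlns:tei="http://example.com/tei"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper: re-escape any '<' that is followed by
       whitespace or by the end of the string, since such a character
       cannot start a tag. The regex seen by replace() is <(\s|$) and
       the replacement text is &lt; followed by the captured group. -->
  <xsl:function name="f:protect-stray-lt" as="xs:string">
    <xsl:param name="s" as="xs:string"/>
    <xsl:sequence select="replace($s, '&lt;(\s|$)', '&amp;lt;$1')"/>
  </xsl:function>

  <xsl:template match="text:p">
    <tei:p>
      <!-- Parse the protected string and copy the resulting nodes. -->
      <xsl:copy-of select="parse-xml-fragment(f:protect-stray-lt(string(.)))"/>
    </tei:p>
  </xsl:template>

</xsl:stylesheet>
```

This only covers a '<' followed by whitespace or end of text; a stray '<' followed directly by, say, a digit would still make the parse fail and would need its own rule.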
Thanks, Martin
On 14-03-28 11:35 AM, Martin Honnen wrote:
Martin Holmes wrote:
I'm trying to process an ODS spreadsheet which has <text:p> nodes which contain embedded mixed-content markup in escaped form:
<text:p>indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent <gi>surface</gi> element as implied by the dimensions given in the <gi>msDesc</gi> element or by the coordinates of the <gi>surface</gi> itself. The orientation is expressed in arc degrees.</text:p>
I need to turn this back into parsed XML for insertion into XML documents. I'm using Saxon 9.4 with XSLT 2 (and I can use 3 if necessary).
I tried
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:text="http://example.com" xmlns:tei="http://example.com/tei" version="3.0">
<xsl:template match="text:p"> <tei:p> <xsl:copy-of select="parse-xml-fragment(.)"/> </tei:p> </xsl:template>
</xsl:stylesheet>
with Saxon 9.5 PE and got
<?xml version="1.0" encoding="UTF-8"?><tei:p xmlns:text="http://example.com" xmlns:tei="http://example.com/tei">indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent <gi>surface</gi> element as implied by the dimensions given in the <gi>msDesc</gi> element or by the coordinates of the <gi>surface</gi> itself. The orientation is expressed in arc degrees.</tei:p>
That has XML elements rather than escaped markup, so it should do; you will need to change the namespaces and maybe use exclude-result-prefixes.
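A sketch of what those namespace changes might look like, under some assumptions: that the spreadsheet uses the standard ODF text namespace, that the target is the TEI namespace, and that the parsed elements should end up in TEI too. Elements returned by parse-xml-fragment() are in no namespace, because the parsed string carries no declarations, so the sketch rebuilds them. The mode name to-tei is invented for illustration.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="#all">

  <xsl:template match="text:p">
    <tei:p>
      <!-- Parse the escaped markup, then push the new nodes through
           the (hypothetical) to-tei mode instead of copying them. -->
      <xsl:apply-templates select="parse-xml-fragment(.)" mode="to-tei"/>
    </tei:p>
  </xsl:template>

  <!-- Recreate each parsed element under the TEI namespace, keeping
       its local name and attributes. Text nodes are copied by the
       built-in template rules for this mode. -->
  <xsl:template match="*" mode="to-tei">
    <xsl:element name="tei:{local-name()}"
                 namespace="http://www.tei-c.org/ns/1.0">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="to-tei"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

With exclude-result-prefixes="#all" on the stylesheet element, the unused text prefix should no longer be declared on the output, while the tei declaration stays because the result elements use it.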