Re: [xsl] fault tolerant saxon:parse()

Subject: Re: [xsl] fault tolerant saxon:parse()
From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Mon, 17 Nov 2008 12:12:21 +0000
>> The former needs parsing if you want to process the escaped markup,
>> but if you do that with the latter you get an error
>
> but if you just parsed with tagsoup (or probably the others as well) it
> would work in both cases, because in super-lax html parsing modes an &
> not followed by some letters and a semicolon parses as itself rather
> than an error.
>


Given this XML:

<root>
  <title>&lt;a href="foo.html"&gt;Today&lt;/a&gt;</title>
  <title>Hammersmith &amp; City</title>
</root>

and the need to process the <title> element to strip out the markup
(or some other requirement) - how would you incorporate tagsoup?

Currently I'm calling saxon:parse on the contents of the title
element, wrapped in a root node (as there's no guarantee of a single
root element):

<xsl:variable name="parsed-content"
select="saxon:parse(concat('&lt;root&gt;', title,
'&lt;/root&gt;'))/root"/>
<xsl:value-of select="$parsed-content"/>
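
To make the failure concrete outside Saxon, here is a minimal Python sketch
(not XSLT, and not what I'm actually running) that applies the same
wrap-and-parse idea to the string values of the two titles. The helper name
strip_markup is made up for illustration, and falling back to the plain text
on a parse error is only an assumption about what "fault tolerant" could
mean here; it is not something saxon:parse does.

```python
import xml.etree.ElementTree as ET

# String values of the two <title> elements once the outer document has been
# parsed, i.e. with &lt;, &gt; and &amp; already unescaped.
titles = [
    '<a href="foo.html">Today</a>',
    'Hammersmith & City',
]


def strip_markup(text):
    """Hypothetical helper: wrap text in a root element, parse it strictly,
    and return the concatenated text content. If the wrapped string is not
    well-formed XML, return the original text unchanged."""
    try:
        wrapped = ET.fromstring('<root>' + text + '</root>')
    except ET.ParseError:
        # A bare '&' is not well-formed, so the second title lands here.
        return text
    return ''.join(wrapped.itertext())


results = [strip_markup(t) for t in titles]
print(results)
```

The first title parses and loses its markup; the second one is rejected by
the strict parser because of the bare ampersand, which is the error I'm
hitting with saxon:parse.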

Do I parse the entire XML using tagsoup?


thanks
-- 
Andrew Welch
http://andrewjwelch.com
Kernow: http://kernowforsaxon.sf.net/
