Subject: Re: [xsl] Tags in text content
From: "Pieter Masereeuw pieter@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 11 Apr 2024 14:01:37 -0000
> I have some xml where there are "tags" embedded in the text content.

As Martin suggested, parse-xml-fragment() will do the job as long as all of the escaped markup is well-formed. If what's been escaped is HTML that may have come from a process that produced not-well-formed markup, for example:
<root> <p><em>End tags? We don't need no stinking end tags![1]</p> </root>
then the problem is a little harder. What I've found successful in this case is the Validator.nu HTML parser. It will parse any input and produce a well-formed document that conforms to the parsing rules of HTML5. Some post-processing may be necessary to tidy it back up again (removing the HTML5 namespace, removing the <html> wrapper, etc.), but that's *much* easier once you have an XML fragment.
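For the well-formed case, a minimal XSLT 3.0 sketch of the parse-xml-fragment() approach might look like the following. The match pattern assumes the escaped markup sits in text nodes directly under <root>, which is only a guess at the shape of the input.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transform: copy anything not matched below. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Assumption: escaped markup appears as text directly inside <root>. -->
  <xsl:template match="root/text()[contains(., '&lt;')]">
    <!-- Parse the string as an XML fragment and emit its children.
         This raises a dynamic error if the markup is not well-formed. -->
    <xsl:sequence select="parse-xml-fragment(.)/node()"/>
  </xsl:template>

</xsl:stylesheet>
```

For the tidy-up after an HTML5 parser, something along these lines may be enough. It assumes the parser's output is a full document in the XHTML namespace (http://www.w3.org/1999/xhtml) with <html>, <head> and <body> wrappers, and that only the contents of <body> are wanted. Check that against what your parser actually produces.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="h">

  <!-- Start at the contents of <body>, dropping <html>, <head>, <body>. -->
  <xsl:template match="/">
    <xsl:apply-templates select="h:html/h:body/node()"/>
  </xsl:template>

  <!-- Rebuild each element with the same local name in no namespace. -->
  <xsl:template match="*">
    <xsl:element name="{local-name()}" namespace="">
      <xsl:apply-templates select="@* | node()"/>
    </xsl:element>
  </xsl:template>

  <!-- Copy attributes, text, comments and PIs unchanged. -->
  <xsl:template match="@* | text() | comment() | processing-instruction()">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
```

The second stylesheet produces a sequence of nodes rather than a single-rooted document, so wrap the apply-templates in whatever element the fragment is meant to live in.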
Be seeing you, norm
[1] https://en.wikipedia.org/wiki/Stinking_badges
-- Norm Tovey-Walsh <ndw@xxxxxxxxxx> https://norm.tovey-walsh.com/
There is a great difference between seeking how to raise a laugh from everything, and seeking in everything what may justly be laughed at.--Lord Shaftesbury