Re: [xsl] [XSLT 1.0] How to remove CDATA sections?

Subject: Re: [xsl] [XSLT 1.0] How to remove CDATA sections?
From: David Mitchell <david.mitchell@xxxxxxxxx>
Date: Wed, 30 Jun 2010 16:57:36 -0500

Not in XSLT.

You want a text processor that works at the text level, not the XML level.

XSLT lets you process the tree after the XML parser is done with it.
The tree of nodes doesn't have CDATA, it just has text. CDATA is just
a convenient way for authors to escape one or more less-than and
ampersand characters without having to type &lt; and &amp;
respectively.

That means your source XML is indistinguishable from this source XML (to
any XML tool, anyway):

<?xml version="1.0" encoding="US-ASCII"?>
<Weather xmlns="http://www.weather.org">

   <Source>Agent Dick Tracy</Source>
   <Location>Atlantistan</Location>
   <Date>2009-09-30T12:26:00</Date>
   <Temperature units="degrees F">91</Temperature>


       &lt;% String eid = request.getParameter("eid"); %>

       Employee ID: &lt;%= eid %>


</Weather>


In either case, once the parser is done with it (which is before we
get to the XSLT processor), the text in the Weather element looks like
this:
  <% String eid = request.getParameter("eid"); %>

       Employee ID: <%= eid %>
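
If you want to check this for yourself, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree parser (my choice for illustration; any conforming XML parser should report the same text). The two documents are cut down to a single line of the JSP text, and the helper name text_after_source is made up for the example.

```python
import xml.etree.ElementTree as ET

NS = "{http://www.weather.org}"

# Cut-down versions of the two documents: one uses a CDATA section,
# the other escapes the less-than sign by hand.
WITH_CDATA = (
    '<Weather xmlns="http://www.weather.org">'
    "<Source>Agent Dick Tracy</Source>"
    "<![CDATA[Employee ID: <%= eid %>]]>"
    "</Weather>"
)
WITH_ESCAPES = (
    '<Weather xmlns="http://www.weather.org">'
    "<Source>Agent Dick Tracy</Source>"
    "Employee ID: &lt;%= eid %>"
    "</Weather>"
)


def text_after_source(xml_text):
    """Return the text that follows the Source element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    # ElementTree stores text that follows an element in that element's .tail
    return root.find(NS + "Source").tail


if __name__ == "__main__":
    # Both documents yield the same text; no trace of the CDATA markers is left.
    print(text_after_source(WITH_CDATA) == text_after_source(WITH_ESCAPES))  # True
```

By the time a stylesheet sees the tree, there is nothing left to tell the two inputs apart.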

Take a look at sed, awk, perl, or python.
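
For instance, a rough text-level sketch in Python, working on the raw characters of the file before any XML parser touches them. The function name strip_cdata_sections is made up for this example, and the regular expression is an assumption about what is good enough: it knows nothing about XML, so it would also delete something that merely looks like a CDATA section inside a comment or processing instruction.

```python
import re

# A CDATA section starts with "<![CDATA[" and ends at the first "]]>".
# Sections cannot nest, so a non-greedy match is enough; DOTALL lets
# "." cross line breaks, since sections usually span several lines.
CDATA_SECTION = re.compile(r"<!\[CDATA\[.*?\]\]>", re.DOTALL)


def strip_cdata_sections(xml_text):
    """Delete every CDATA section, markers and content, from raw XML text."""
    return CDATA_SECTION.sub("", xml_text)


if __name__ == "__main__":
    sample = (
        '<Weather><Temperature units="degrees F">91</Temperature>'
        "<![CDATA[ Employee ID: <%= eid %> ]]>"
        "</Weather>"
    )
    print(strip_cdata_sections(sample))
```

Read the file as text, run it through the function, and write the result back out; the output can then go to the XSLT processor if further processing is needed.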

On Wed, Jun 30, 2010 at 12:56 PM, Costello, Roger L. <costello@xxxxxxxxx>
wrote:
> Hi Folks,
>
> Consider this XML document containing a CDATA section:
>
> <?xml version="1.0" encoding="US-ASCII"?>
> <Weather xmlns="http://www.weather.org">
>
>    <Source>Agent Dick Tracy</Source>
>    <Location>Atlantistan</Location>
>    <Date>2009-09-30T12:26:00</Date>
>    <Temperature units="degrees F">91</Temperature>
>
>    <![CDATA[
>        <% String eid = request.getParameter("eid"); %>
>
>        Employee ID: <%= eid %>
>    ]]>
>
> </Weather>
>
>
> Using XSLT 1.0, I would like to remove the CDATA section:
>
> <?xml version="1.0" encoding="US-ASCII"?>
> <Weather xmlns="http://www.weather.org">
>
>    <Source>Agent Dick Tracy</Source>
>    <Location>Atlantistan</Location>
>    <Date>2009-09-30T12:26:00</Date>
>    <Temperature units="degrees F">91</Temperature>
>
>
> </Weather>
>
>
> This is just one example. I would like the XSLT program to remove all CDATA
> sections in any XML document. Is there a way to do it?
>
> /Roger
