Re: [xsl] remove tags + CDATA tag out of big xml file

From: Michael Ludwig <milu71@xxxxxx>
Date: Fri, 29 Jan 2010 23:18:03 +0100

bw wrote on 29.01.2010 at 12:02:10 (+0100):
> Hello,
> 
> I have a big xml feed out of my content management system that
> includes wysiwyg html tags inside CDATA tags.
> 
> I am looking for a way to remove the CDATA and only get the text.

>          <content><![CDATA[
> <p>The <strong>keyword</strong> is nice to have but is not needed to
> include in a solr feed</p> ...

Looks like this feed is for Solr (an indexer), which won't do anything
useful with the markup anyway. Someone has defined <title> and <content>
as fields for the indexer but has forgotten to strip the markup from the
source. That source markup in CDATA has no purpose in a feed for Solr
and should not have been included in the first place.
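
If the markup cannot be removed where the feed is generated, one option is
to strip it in a small pre-processing step before the feed reaches Solr.
The following is only a minimal sketch in Python, not something from the
original poster's setup: the <doc> wrapper element, the sample data, and
the function names strip_markup and clean_feed are all assumptions made
for illustration. It relies on the fact that an XML parser hands over a
CDATA section as ordinary text, so the embedded HTML can then be run
through an HTML parser that keeps only the character data.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects character data and discards all tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_markup(fragment):
    """Return the text of an HTML fragment with tags removed."""
    parser = TextOnly()
    parser.feed(fragment or "")
    parser.close()  # flush any text still buffered by the parser
    # Collapse the whitespace left behind by removed tags and line breaks.
    return " ".join("".join(parser.chunks).split())


def clean_feed(xml_text, fields=("title", "content")):
    """Strip embedded markup from the named fields (hypothetical helper)."""
    root = ET.fromstring(xml_text)  # CDATA arrives as plain element text
    for elem in root.iter():
        if elem.tag in fields:
            elem.text = strip_markup(elem.text)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample; only <title> and <content> come from the thread.
sample = """<doc><title><![CDATA[<b>Hello</b>]]></title><content><![CDATA[
<p>The <strong>keyword</strong> is nice to have</p>]]></content></doc>"""

print(clean_feed(sample))
```

Note that ET.fromstring loads the whole document into memory, so for a
really big feed a streaming approach (for example ET.iterparse) would
probably be the better fit.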

-- 
Michael Ludwig
