Re: [xsl] RSS feeds and disable-output-escaping="yes"

Subject: Re: [xsl] RSS feeds and disable-output-escaping="yes"
From: Alex Milowski <alexml@xxxxxxxxxxxx>
Date: Tue, 10 May 2005 07:28:23 -0700
On May 6, 2005, at 1:59 AM, Don Robertson wrote:

> Greetings,
> I have set up a Drupal sub-site and would like the RSS feed from the
> site to be displayed as a 'whats new' panel on our main page.
>
> Easy little XSL script I thought. The RSS feed has all the html in the
> 'description' tag escaped. I have used disable-output-escaping="yes" to
> display the html, but I really need to be able to manipulate some of the
> tags - the img tags in particular - I'd like to either remove or reduce
> the width of the images (it is mostly user documentation for the
> WebOPAC).
>
> Is there any way I can do this or do I need to pre-process the rss feed
> before I feed it into the XSL transformer thingy.

Yes. You could use an XML pipeline. There are several different XML
pipelining languages and implementations to choose from.


...but I'll shamelessly promote my open source smallx XML Pipelining language
(see http://smallx.dev.java.net):


You need to:

1. "unescape" the descriptions.
2. Transform the descriptions (via XSLT?).
3. Then transform the whole thing to HTML.

That looks something like:

<p:pipe name="rss-convert"
        xmlns:p="urn:publicid:IDN+smallx.com:pipeline:1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <p:subtree select="description">
    <p:parse/>
    <p:xslt>
      <xsl:transform version="1.0">...whatever...</xsl:transform>
    </p:xslt>
  </p:subtree>

  <p:xslt src="final.xsl"/>

</p:pipe>

The 'subtree' step scopes the contained steps to occurrences of the
'description' element.  For each of those, the text descendants are
parsed as XML and XSLT is then run on the resulting subtree.  The
XSLT in that step is inlined.

The last step will run the stylesheet 'final.xsl'.
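
For readers without smallx to hand, the first two steps can be roughed out
in plain Python. This is only a sketch under assumptions, not the smallx
implementation: the sample feed, the MAX_WIDTH limit, the wrapping <div>
and the function name expand_descriptions are all made up for illustration,
and the description content is assumed to be well-formed once unescaped.
The third step (running 'final.xsl') is left out because the standard
library has no XSLT processor.

```python
import xml.etree.ElementTree as ET

MAX_WIDTH = 200  # hypothetical width limit for the "what's new" panel

# Hypothetical feed: the HTML inside <description> is escaped, as in the question.
RSS = """<rss version="2.0"><channel><title>Docs</title>
<item><title>Searching</title>
<description>&lt;p&gt;Use the box.&lt;/p&gt;&lt;img src="a.png" width="640"/&gt;</description>
</item></channel></rss>"""


def expand_descriptions(root):
    """Replace each description's escaped text with real child elements."""
    for desc in list(root.iter("description")):
        # Step 1: parsing the feed already undid one level of escaping, so
        # desc.text is markup. Wrap it so the fragment has a single root.
        fragment = ET.fromstring("<div>" + (desc.text or "") + "</div>")
        # Step 2: manipulate the now-real elements; here, clamp image widths.
        for img in fragment.iter("img"):
            width = img.get("width", "")
            if not width.isdigit() or int(width) > MAX_WIDTH:
                img.set("width", str(MAX_WIDTH))
        desc.text = None
        desc.append(fragment)
    return root


root = expand_descriptions(ET.fromstring(RSS))
print(ET.tostring(root, encoding="unicode"))
```

After this, the img elements are ordinary nodes, so a final stylesheet could
match them with normal templates and would not need
disable-output-escaping at all.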

Now, many RSS feeds have incorrectly escaped content (e.g. &amp; instead
of &amp;amp;).  That means that the 'parse' step may fail.  You can
fix that by trapping the errors:

<p:trap>
  <p:parse/>
  <p:on-error>
     <p:template>
        <xsl:copy-of select="/c:error-context/description"/>
     </p:template>
  </p:on-error>
</p:trap>

The above just copies the bad description back into the pipeline result. The
parse error is available as a new subtree:


   <c:error-context>
      <c:error>...</c:error>
      ...original content...
   </c:error-context>
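
The same trap-and-fall-back idea can be sketched in Python, again as an
illustration under assumptions rather than what smallx does internally. The
helper name parse_or_keep and the sample strings are hypothetical; the only
library behaviour relied on is that xml.etree.ElementTree raises ParseError
for input that is not well-formed, such as a bare ampersand.

```python
import xml.etree.ElementTree as ET


def parse_or_keep(markup):
    """Try to parse unescaped description markup.

    Returns (fragment, None) on success, or (None, error message) when the
    markup is not well-formed, so the caller can keep the original text.
    """
    try:
        return ET.fromstring("<div>" + markup + "</div>"), None
    except ET.ParseError as err:
        return None, str(err)


# Correctly escaped content parses; a bare '&' does not.
good, good_err = parse_or_keep("<p>Fish &amp; chips</p>")
bad, bad_err = parse_or_keep("<p>Fish & chips</p>")

print(good is not None, good_err)
print(bad is None, bad_err is not None)
```

When the second value is not None, the caller can leave the description as
plain text, which corresponds to copying the bad description back into the
pipeline result.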

-- Alex Milowski

"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."


Bertrand Russell in a footnote of Principles of Mathematics
