RE: [xsl] Entity Questions

Subject: RE: [xsl] Entity Questions
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 18 Jan 2005 17:41:21 -0000
You still haven't shown us your XML input, but it's beginning to sound very
much as if it contains HTML markup that is disguised as ordinary text either
by putting it in a CDATA section or by escaping the angle brackets.
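
For illustration, the two disguises look roughly like this next to real
markup. The element names (doc, content) are hypothetical, since the actual
input has not been posted:

```xml
<doc>
  <!-- Disguised by a CDATA section: the parser produces one text node,
       and there is no P element in the tree -->
  <content><![CDATA[<P>Hello <B>world</B></P>]]></content>

  <!-- Disguised by escaping: again a single text node, with exactly the
       same string value as the CDATA version -->
  <content>&lt;P&gt;Hello &lt;B&gt;world&lt;/B&gt;&lt;/P&gt;</content>

  <!-- Real markup: P and B are element nodes, so a template with
       match="P" has something to match -->
  <content><P>Hello <B>world</B></P></content>
</doc>
```

In the first two cases a match="P" template never fires, because the
stylesheet only ever sees text.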

If that's the case, you either need to extract the text-containing-HTML and
parse it into a tree, which you can do with an extension such as
saxon:parse(), or you need to get rid of the extra level of escaping by
serializing the HTML using disable-output-escaping="yes".
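
A minimal sketch of both options follows. It makes several assumptions:
the escaped markup lives in a hypothetical element called content, the
processor is a Saxon release that provides saxon:parse() in the
http://saxon.sf.net/ namespace, and the embedded markup is well-formed XML
(an unclosed <P> or an undeclared entity such as &nbsp; would make the
parse fail).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    xmlns:fo="http://www.w3.org/1999/XSL/Format"
    exclude-result-prefixes="saxon">

  <!-- Option 1: parse the text into a tree, then apply templates to it.
       "content" is a hypothetical element name. The string is wrapped in
       a single div so that it parses as one well-formed document. -->
  <xsl:template match="content">
    <xsl:apply-templates
        select="saxon:parse(concat('&lt;div&gt;', ., '&lt;/div&gt;'))/div/node()"/>
  </xsl:template>

  <!-- Once parsed, P is an element node and can be matched.
       The fo:block here is only a fragment of a full FO document. -->
  <xsl:template match="P | p">
    <fo:block><xsl:apply-templates/></fo:block>
  </xsl:template>

  <!-- Option 2: write the text out without escaping. The markup reaches
       the serialized output as-is, but templates never see it as nodes.
       The mode name "raw" is also hypothetical. -->
  <xsl:template match="content" mode="raw">
    <xsl:value-of select="." disable-output-escaping="yes"/>
  </xsl:template>

</xsl:stylesheet>
```

Only the first option lets you map HTML elements to FO elements, since the
second just passes the markup through and works only when the result is
serialized.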

It's best not to start from here, as they say, but sometimes you have no
choice.

Michael Kay
http://www.saxonica.com/


> -----Original Message-----
> From: Luke Shannon [mailto:lshannon@xxxxxxxxxxxxxxx] 
> Sent: 18 January 2005 17:30
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] Entity Questions
> 
> Hi Michael;
> 
> Where I am now: the text node of my XML document contains text, and
> some of that text contains HTML. If I display the text with output
> escaping disabled, I can see HTML tags in the output. Otherwise the
> output contains escaped HTML tags (example: &lt;P&gt;).
> 
> What I think I need to do is transform this text into a tree so I can
> create templates to handle nodes like <p>. I noticed you mentioned
> saxon:parse() to someone else as a way to transform text into a tree.
> 
> Is this what I need to do? Do I have other options?
> 
> To review, my goal is to find HTML tags in the content and replace
> them with FO tags.
> 
> Thanks,
> 
> Luke
> 
> ----- Original Message ----- 
> From: "Michael Kay" <mike@xxxxxxxxxxxx>
> To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: Monday, January 17, 2005 6:19 PM
> Subject: RE: [xsl] Entity Questions
> 
> 
> > >
> > > I tested adding the match="P" template. It didn't replace
> > > anything. I think this might be because I changed the output
> > > method to xml to deal with entities like &nbsp;
> > >
> > > I am thinking the <P> tags are now &lt;P&gt; so the template is
> > > not matching them. I am adding some logging to verify this; if
> > > this is the case I may need to rethink things.
> >
> > Try to reduce the size of the problem, and post a complete specimen
> > including source XML, desired result, and stylesheet. Then we can
> > see where you're going wrong. You're not grasping some of the
> > concepts: you're still talking about tags rather than nodes, and
> > thinking of the input and output as text rather than trees. It
> > might be a good idea to do some reading.
> >
> > Michael Kay
> > http://www.saxonica.com/
