Re: [xsl] Length of a literal string containing embedded tags

Subject: Re: [xsl] Length of a literal string containing embedded tags
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Wed, 01 Aug 2007 14:28:46 +0200
Michael Kay wrote:
> XSLT operates on the tree view of the parsed text, it has no access to the
> original tags. There's no way of producing different results for the two
> equivalent input files:
>
> <a foo="bar">xyz<br/>pqr</a>
>
> <a foo = 'bar'>xyz<br></br><![CDATA[pqr]]></a>
>
> which is what you seem to be trying to do.

I think 'no way' is relative. What follows is not how XSLT is meant to treat your content, but pure XSLT 2.0 can achieve what you want. In fact there are two ways, an easy one and a hard one:


1. If you do not care about whitespace, CDATA sections and empty-tag syntax (the differences in MK's example above), you can calculate an approximate length from the information available in the tree: use the name() function and the attribute axis (and perhaps the namespace functions), then add up the string lengths.

2. Alternatively, if you do care about these, there is a harder way: use the unparsed-text() function to read the raw content of the XML file you are processing. The hard part is locating the right node with mere string manipulation, which is far from trivial, but it 'can be done'. You can even recognize CDATA sections and entity references this way, which is not possible from the parsed tree.
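
To make option 1 concrete, here is a minimal sketch, not a finished solution. The function name f:markup-length and its namespace URI are made up for illustration. It assumes one fixed way of writing the markup: double-quoted attributes, a single space before each attribute, and empty elements written as a start tag plus an end tag. Comments, processing instructions and namespace declarations are ignored, and a character that would need escaping (such as &amp;) counts as one character.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:markup-length"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: length of a node when written out in the
       fixed form <name attr="value">...</name> described above. -->
  <xsl:function name="f:markup-length" as="xs:integer">
    <xsl:param name="node" as="node()"/>
    <xsl:choose>
      <xsl:when test="$node instance of element()">
        <xsl:variable name="name-length"
            select="string-length(name($node))"/>
        <!-- Each attribute costs: space + name + '="' + value + '"' -->
        <xsl:variable name="attr-length"
            select="sum(for $a in $node/@*
                        return string-length(name($a)) + string-length($a) + 4)"/>
        <!-- Recurse into child nodes -->
        <xsl:variable name="child-length"
            select="sum(for $c in $node/node() return f:markup-length($c))"/>
        <!-- '<name>' is name + 2 characters, '</name>' is name + 3 -->
        <xsl:sequence
            select="xs:integer(2 * $name-length + 5 + $attr-length + $child-length)"/>
      </xsl:when>
      <xsl:when test="$node instance of text()">
        <xsl:sequence select="string-length($node)"/>
      </xsl:when>
      <xsl:otherwise>
        <!-- Comments and processing instructions are not counted here -->
        <xsl:sequence select="0"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:function>

  <!-- Report the length of the document element -->
  <xsl:template match="/">
    <xsl:value-of select="f:markup-length(*)"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against either of MK's two input files, this reports 32, the length of <a foo="bar">xyz<br></br>pqr</a>. That both files give the same number is exactly MK's point: the tree cannot tell them apart.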

Though, depending on why you need it and how important it is to you, I would think twice, even thrice, before attempting such a task. But if you pay by the hour... ;)
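
For anyone who still wants to try the unparsed-text() route, a rough starting point could look like the sketch below. The parameter name source-uri, its default value 'input.xml', the UTF-8 encoding and the element name 'a' are all assumptions for illustration. The regular expression is deliberately naive: it breaks on nested <a> elements and on a literal </a> inside a comment or CDATA section, which is where the 'far from trivial' part begins.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="text"/>

  <!-- Assumed parameter: URI of the file to inspect. A relative URI is
       resolved against the base URI of this stylesheet. -->
  <xsl:param name="source-uri" as="xs:string" select="'input.xml'"/>

  <!-- Entry point: start the transformation at this named template -->
  <xsl:template name="main">
    <!-- Read the file as plain text, so no XML parsing takes place -->
    <xsl:variable name="raw" as="xs:string"
        select="unparsed-text($source-uri, 'UTF-8')"/>
    <!-- Naive match: from '<a' up to the nearest '</a>'.
         The 's' flag lets '.' match newlines as well. -->
    <xsl:analyze-string select="$raw"
        regex="&lt;a[\s&gt;].*?&lt;/a\s*&gt;" flags="s">
      <xsl:matching-substring>
        <!-- Length of the element exactly as written in the file -->
        <xsl:value-of select="string-length(.)"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

With MK's first file as input this prints 28, and with the second 46, because quoting style, the <br></br> pair and the CDATA markers are all counted as they appear in the file.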

Cheers,
-- Abel
