Re: [xsl] Dealing mixed content with invalid node-like text

From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 08 Dec 2011 22:15:12 +0000
On 08/12/2011 21:16, Karlmarx R wrote:
the <II .> and <2 .> have "space" between the dot and previous
letters. If I had texts WITHOUT SPACE, like <II.> o



Well, that's because, unlike <II .>, <II.> is a legal XML start tag (an XML name may contain "." after its first character), so it parses that way, for the same reason that <b> in your example parsed as a tag.


If you know your element names never contain ".", then you could take a local copy of htmlparse (or xsl:import it) and modify the regexp that recognises element names so that it does not accept ".".

You could change

<xsl:variable name="d:elem"
   select="'(\i\c*)'"/>


to


<xsl:variable name="d:elem"
   select="'([a-zA-Z][a-zA-Z0-9]*)'"/>


for example, if you only need ASCII letters and digits.
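
If you would rather not edit htmlparse itself, the xsl:import route might look roughly like the sketch below. It assumes the stylesheet is saved locally as htmlparse.xsl and that the d prefix is bound to the namespace data:,dpc; check both against your own copy. Because the importing stylesheet has higher import precedence, its definition of d:elem is the one that gets used.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:d="data:,dpc">

  <!-- Assumption: local file name; adjust href to wherever your copy lives. -->
  <xsl:import href="htmlparse.xsl"/>

  <!-- Overrides the imported d:elem: ASCII letters and digits only,
       so "." is no longer accepted inside an element name. -->
  <xsl:variable name="d:elem"
     select="'([a-zA-Z][a-zA-Z0-9]*)'"/>

</xsl:stylesheet>
```

With this pattern <II.> should no longer be recognised as a start tag, but neither will names containing "-", "_", ":" or non-ASCII letters, so widen the character class if your real elements need them.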


David
