Re: [xsl] Dealing mixed content with invalid node-like text

Subject: Re: [xsl] Dealing mixed content with invalid node-like text
From: Karlmarx R <karlmarxr@xxxxxxxxx>
Date: Fri, 9 Dec 2011 05:16:26 +0800 (SGT)
Thank you very much for the further recommendations; they are very useful.

Brandon - Thanks for the <xsl:value-of select="replace($unparsed,
'&lt;(\S+\s+\.)&gt;','&lt;$1&gt;')"/> suggestion. With a slight modification
(removing the \s+), I can use it for my specific case (please see the next
point). That said, if a solution is available that covers a wider scope, I
think that would be more beneficial.
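
A minimal sketch of that modification, for illustration only. It assumes XSLT
2.0 and makes the whitespace optional (\s*) rather than dropping it, so that
both the spaced and the unspaced forms match. It also narrows \S to [^<>\s] so
that a match cannot run across a real tag, and it assumes the replacement
should emit the literal text &lt; and &gt; so that a later parsing step sees
the item as plain text. These choices are assumptions of the sketch, not part
of the original suggestion.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Text output, so the escaped string is shown exactly as produced. -->
  <xsl:output method="text"/>

  <!-- Sample input (made up for this sketch): spaced and unspaced items. -->
  <xsl:variable name="unparsed"><![CDATA[like <II .> Title, <II.> Title, <3.> Title and <b>bold</b>]]></xsl:variable>

  <!-- Run against any XML document, or call the named template directly. -->
  <xsl:template match="/" name="main">
    <!-- Regex after XML unescaping: <([^<>\s]+\s*\.)>
         Replacement after XML unescaping: &lt;$1&gt; -->
    <xsl:value-of select="replace(string($unparsed),
                          '&lt;([^&lt;&gt;\s]+\s*\.)&gt;',
                          '&amp;lt;$1&amp;gt;')"/>
  </xsl:template>

</xsl:stylesheet>
```

With this input, the items ending in a dot are escaped whether or not they
contain a space, while <b>bold</b> is left alone because it has no dot before
the closing angle bracket.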

David - Regarding your /htmlparse.xsl + <xsl:sequence
select="d:htmlparse($in,'',false())"/> suggestion: yes, I want something like
that. Your solution almost worked, except for one glitch (minor or major?).
In the example text,

<xsl:variable name="in"><![CDATA[Line one text <b>within
valid node</b> and like <II .> Title etc Line two with <1a .> Title etc,
<i>within</i> <b>something</b> etc another line can be just normal
text]]></xsl:variable>

the <II .> and <1a .> items have a space between the dot and the preceding
characters. If the text has NO SPACE there, as in <II.> or <3.>, then it
fails. I have not yet gone through your htmlparse.xsl fully to see whether it
can be tweaked to cover this, but if it can, that would be the ideal
solution. For example, I think such a fix would also cover a (hypothetical)
scenario like

<xsl:variable name="in"><![CDATA[Line one text
with <bbbb> like invalid node item and like <5.> Chapter V with valid <b>bold
title</b> etc]]></xsl:variable>,

to give an output like <output>Line one text
with &lt;bbbb&gt; like invalid node item and like &lt;5.&gt; Chapter V with
valid <b>bold title</b> etc</output>

Is this possible to achieve by tweaking your code?
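
To make the wider-scope behaviour concrete, here is a rough sketch of one
possible approach. It is not a tweak to htmlparse.xsl but a separate
pre-processing step, assuming XSLT 2.0 and a whitelist of element names that
should survive as markup. The whitelist ('b', 'i'), the f: namespace and the
function name f:escape-unknown are all hypothetical names chosen for this
illustration. Anything of the form <...> or </...> whose name is not on the
whitelist is rewritten as escaped text.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local"
    exclude-result-prefixes="xs f">

  <!-- Text output, so the escaped string is shown exactly as produced. -->
  <xsl:output method="text"/>

  <!-- Hypothetical whitelist of element names to keep as real markup. -->
  <xsl:variable name="known" as="xs:string+" select="('b', 'i')"/>

  <!-- Escapes every <name> or </name> whose name is not in $known. -->
  <xsl:function name="f:escape-unknown" as="xs:string">
    <xsl:param name="text" as="xs:string"/>
    <xsl:variable name="parts" as="xs:string*">
      <!-- Regex after XML unescaping: <(/?)([^<>]*)> -->
      <xsl:analyze-string select="$text"
                          regex="&lt;(/?)([^&lt;&gt;]*)&gt;">
        <xsl:matching-substring>
          <xsl:sequence select="if (regex-group(2) = $known)
                                then .
                                else concat('&amp;lt;', regex-group(1),
                                            regex-group(2), '&amp;gt;')"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($parts, '')"/>
  </xsl:function>

  <!-- Run against any XML document, or call the named template directly. -->
  <xsl:template match="/" name="main">
    <xsl:variable name="in"><![CDATA[Line one text with <bbbb> like invalid node item and like <5.> Chapter V with valid <b>bold title</b> etc]]></xsl:variable>
    <xsl:value-of select="f:escape-unknown(string($in))"/>
  </xsl:template>

</xsl:stylesheet>
```

With this input, <bbbb> and <5.> come out as escaped text and <b>bold
title</b> is left untouched, so the resulting string could then be handed to
whatever parsing step is already in use. The sketch does not handle
attributes on whitelisted elements, a stray < with no closing >, or bare
ampersands; those would need extra rules.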

Thanks,
karl
