Subject: RE: [xsl] Handling Non Well conformed HTML content
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 3 Oct 2006 14:12:02 +0100
> I have typical issue in handling HTML content in XML document
> of the below structure and i want to replace the HTML
> template with the respective node element text.
> HTML is not well formed.

Before you can process the HTML, you will have to turn it into well-formed XML. You can do this using the JTidy utility.

> For that matter we are doing base64
> encode of the html content.

You'll have to find a Base64 decoder. Details will depend on your processing environment, e.g. whether it's Java, Microsoft, or whatever.

However, I can't relate either of those points to the example you show below.

> Please provide any resolution for the same.
> The replacement content might be in any part of the document.
> Any suggestions are welcome.
>
> Input content
> <?xml version="1.0" encoding="UTF-8"?>
> <broadcast>
>   <content_vars>
>     <content name="subject"><html>Hello [[BUYERS_NAME]]</html></content><!--encoded-->
>     <content name="text">REF Order [WEB_ORDER_NUMBER]</content><!--encoded->
>   </content_vars>
>
>   <ORDER_FEED>
>     <ORDER>
>       <ORDER_HEADER>
>         <BUYERS_NAME>Senthil</BUYERS_NAME>
>         <WEB_ORDER_NUMBER>W12345<WEB_ORDER_NUMBER>
>       </ORDER_HEADER>
>       <!--Line Items-->
>     </ORDER>
>   </ORDER_FEED>
> </broadcast>
>
> XSLT I tried for the same
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
>   <xsl:output method="html" indent="yes" />
>
>   <xsl:template match="/broadcast">
>     <xsl:apply-templates select="content_vars/content" />
>
>   </xsl:template>
>
>   <xsl:template match="content">
>
>     <xsl:variable name="temp1" select="translate(., '[]', '')" />
>     <xsl:variable name="temp2"
>       select="normalize-space(../following-sibling::*[contains($temp1, local-name())])" />
>     <xsl:variable name="temp3"
>       select="local-name(../following-sibling::*[contains($temp1, local-name())])" />
>     <xsl:value-of select="substring-before($temp1, $temp3)" /><xsl:value-of select="$temp2" /><xsl:value-of select="substring-after($temp1, $temp3)" />
>   </xsl:template>
>
> </xsl:stylesheet>
>
> Expected output
> <html>
> Hello Senthil
> REF Order W12345
> </html>
>
> And I am getting unexpected
> <html>
> Hello BUYERS_NAME
> REF Order WEB_ORDER_NUMBER
> </html>
> Let me know how do I tweak the code to work as desired.

I think it's more than a tweak. Your main mistake is using the following-sibling axis rather than following (the BUYERS_NAME element is not a sibling of the content_vars element).

But also, your code seems generally lacking in robustness. You're ignoring both the HTML tagging and the [[...]] markers (or [...] depending which of the two examples we look at); you're assuming that there will only be one insert in each element, and that its name won't clash with any other textual content in the element. This all seems pretty poor coding.

Michael Kay
http://www.saxonica.com/
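
The robustness points above can be illustrated with a minimal XSLT 1.0 sketch. It is not a definitive solution and rests on several assumptions: the embedded HTML has already been Base64-decoded and tidied into well-formed XML, the input itself is well-formed (the WEB_ORDER_NUMBER element is properly closed), and every marker name matches the name of an element somewhere under ORDER_FEED. The key name "field" and the template name "expand" are invented for illustration. The sketch looks values up with xsl:key instead of the following axis, copies any HTML tagging through unchanged, accepts both [NAME] and [[NAME]] markers, and recurses so that several markers in one text node are all replaced.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Hypothetical key: index every element under ORDER_FEED by name -->
  <xsl:key name="field" match="ORDER_FEED//*" use="local-name()"/>

  <xsl:template match="/broadcast">
    <xsl:apply-templates select="content_vars/content/node()"/>
  </xsl:template>

  <!-- Copy HTML markup found inside content elements unchanged -->
  <xsl:template match="content//*">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Expand markers in every text node inside content elements -->
  <xsl:template match="content//text()">
    <xsl:call-template name="expand">
      <xsl:with-param name="text" select="string(.)"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Recursively replace [NAME] or [[NAME]] with the matching value -->
  <xsl:template name="expand">
    <xsl:param name="text"/>
    <xsl:choose>
      <xsl:when test="contains(substring-after($text, '['), ']')">
        <xsl:variable name="rest" select="substring-after($text, '[')"/>
        <!-- 1 when the marker is doubled ([[...]]), otherwise 0 -->
        <xsl:variable name="dbl"
          select="number(starts-with($rest, '[') and
                         starts-with(substring-after($rest, ']'), ']'))"/>
        <xsl:variable name="name"
          select="substring-before(substring($rest, 1 + $dbl), ']')"/>
        <xsl:variable name="tail"
          select="substring(substring-after($rest, ']'), 1 + $dbl)"/>
        <xsl:variable name="value" select="key('field', $name)[1]"/>
        <!-- Text before the marker is emitted as-is -->
        <xsl:value-of select="substring-before($text, '[')"/>
        <xsl:choose>
          <xsl:when test="$value">
            <xsl:value-of select="normalize-space($value)"/>
            <xsl:call-template name="expand">
              <xsl:with-param name="text" select="$tail"/>
            </xsl:call-template>
          </xsl:when>
          <xsl:otherwise>
            <!-- Unknown name: keep the bracket and carry on after it -->
            <xsl:text>[</xsl:text>
            <xsl:call-template name="expand">
              <xsl:with-param name="text" select="$rest"/>
            </xsl:call-template>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Because a replacement happens only when the bracketed name matches an indexed element, ordinary text that merely contains an element name is left alone, and a marker with no matching element passes through unchanged. If two elements under ORDER_FEED share a name, this sketch takes the first in document order, which may or may not be what is wanted.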