Subject: [xsl] Splitting Excel XML tables using XSLT 1.0? From: "Scott Lynch" <slynch@xxxxxxxxxx> Date: Thu, 22 Feb 2007 17:54:38 -0500 |
I'm a newbie, so bear with me..this is a long email.. I'm working on a project which requires that the XML output of an Excel spreadsheet be transformed into another data domain using XSLT 1.0 in Java 5.0. I've transformed the Excel XML into a form which omits the various style and formatting data (easier to read and work with), and now have a document which contains many Worksheets which look something like this: <Worksheet ss:Name="Cover"> <Table> <Row> <Cell> <Data>INSERT CUSTOMER LOGO HERE</Data> </Cell> </Row> <Row> <Cell> <Data>Succession UA-IP SN09FF</Data> </Cell> </Row> <Row> <Cell> <Data>Network Specification Document</Data> </Cell> </Row> <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> <Row> <Cell> <Data>Project Name</Data> </Cell> <Cell> <Data>GNS</Data> </Cell> </Row> <Row> <Cell> <Data>Project Number</Data> </Cell> <Cell> <Data>H4R379</Data> </Cell> </Row> <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>End cover sheet</Data> </Cell> </Row> </Table> </Worksheet> What I need to do is transform this file again so that some of these rows are reformatted and some remain untouched. The rows which span <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> : : <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> are actually in a subtable on the spreadsheet (visual table only since Excel doesn't understand subtables in it's XML output) and I would like to add attributes to the Row and Cell nodes for easier second stage transformation and data queries. Something like the following: <Row subtable="Network Specification Parms"> <Cell type="parm"> <Data>Region</Data> </Cell> <Cell type="value"> <Data>North America</Data> </Cell> </Row> I've been able to transform to this format partially for the addition of the "subtable" attribute, however, I can't seem to find a way to skip rows which are handled by the recursive template logic so that they are not duplicated in the final output. Here's the XSL: <xsl:template match="ss:Workbook"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Worksheet"> <xsl:if test="@ss:Name='Cover'"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:apply-templates/> </xsl:copy> </xsl:if> </xsl:template> <xsl:template match="ss:Table"> <xsl:copy> <xsl:apply-templates select="ss:Row" mode="subtable"/> </xsl:copy> </xsl:template> <xsl:template match="ss:Row" mode="subtable"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> <xsl:if test="ss:Cell[ss:Data='Network Specification Document']"> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label"/> </xsl:if> </xsl:template> <xsl:template match="ss:Row" mode="subtable_label"> <xsl:if test="ss:Cell[ss:Data!='Region'] and ss:Cell[ss:Data!='North America']"> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label"/> </xsl:if> </xsl:template> <xsl:template match="ss:Cell"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Data"> <xsl:copy> <xsl:value-of select="."/> </xsl:copy> </xsl:template> <xsl:template match="@*"> <xsl:copy/> </xsl:template> The output looks like this: <Worksheet ss:Name="Cover"> <Table> <Row> <Cell> <Data>INSERT CUSTOMER LOGO HERE</Data> </Cell> </Row> <Row> <Cell> <Data>Succession UA-IP SN09FF</Data> </Cell> </Row> <Row> <Cell> <Data>Network Specification Document</Data> </Cell> </Row> <Row subtable="Network Specification Parms"> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> : : <Row subtable="Network Specification Parms"> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> : : <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>End cover sheet</Data> </Cell> <Cell> <Data>EOS004.A08.02.07</Data> </Cell> </Row> </Table> </Worksheet> </Workbook> The recursive insertion of the "subtable" attribute "works", but I can't seem to stop the Rows from being replicated without losing the other rows outside the subtable. Is there some way to modify my XSL to accomplish this? Is there a mechanism in XSL 2.0 which could provide better management of ranges of nodes? Unfortunately, the Excel XML schema is devoid of any meaning other than spreadsheet layout, so I'm finding it difficult to come up with a mechanism to handle modifications of subranges of nodes during my copy transform. If possible, I need a generic mechanism to handle this sort of Row range transform on other Worksheets as well, using even more column identifiers than the simple "parm" and "value" labels I'm using here. thanks and sorry for the long email, Scott Lynch
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] How to include attributes, Mukul Gandhi | Thread | RE: [xsl] Splitting Excel XML table, Scott Lynch |
Re: [xsl] How to include attributes, Spencer Tickner | Date | RE: [xsl] Books on XSLT/XPATH, Wendell Piez |
Month |