Subject: RE: [xsl] Splitting Excel XML tables using XSLT 1.0? From: "Scott Lynch" <slynch@xxxxxxxxxx> Date: Fri, 23 Feb 2007 11:05:43 -0500 |
I actually got this to work by having nested recursive templates process each "section" of the giant Excel table, invoking mode templates with following-sibling context both recursively and to the nested templates (which are also recursive). The solution starts at the top/first node in the ss:Row context of the Excel ss:Table (one per worksheet in Excel) and simply recursively pattern matches it's way through the table processing individual chunks of ss:Rows in various ways (sometimes adding attributes, sometimes not). Here's the latest XSL that I'm using (would appreciate comments and suggestions for modularizing and improving performance): <xsl:template match="ss:Workbook[1]"> <xsl:copy> <xsl:apply-templates select="ss:Worksheet[@ss:Name='Cover']"/> </xsl:copy> </xsl:template> <xsl:template match="ss:Worksheet"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:apply-templates select="ss:Table[1]"/> </xsl:copy> </xsl:template> <xsl:template match="ss:Table"> <xsl:copy> <xsl:apply-templates select="ss:Row[1]" mode="subtable_search"/> </xsl:copy> </xsl:template> <xsl:template match="ss:Row" mode="subtable_search"> <xsl:choose> <xsl:when test="ss:Cell[ss:Data='Network Specification Document']"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label1"/> </xsl:when> <xsl:otherwise> <xsl:copy> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_search"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="ss:Row" mode="subtable_label1"> <xsl:choose> <xsl:when test="ss:Cell[ss:Data='Document Status']"> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms2</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label2"/> </xsl:when> <xsl:otherwise> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms1</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label1"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="ss:Row" mode="subtable_label2"> <xsl:choose> <xsl:when test="ss:Cell[ss:Data='Specbook Template Version']"> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms3</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <!-- <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label2"/> --> </xsl:when> <xsl:otherwise> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms2</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label2"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="ss:Cell"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Comment"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Data"> <xsl:copy> <xsl:value-of select="."/> </xsl:copy> </xsl:template> <xsl:template match="@*"> <xsl:copy/> </xsl:template> <!-- Default template to ignore anything not defined --> <xsl:template match="* | text() | processing-instruction()" /> thanks, Scott Lynch -----Original Message----- From: Lynch, Scott (NCRTP:3798) Sent: Thursday, February 22, 2007 5:55 PM To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx Subject: [xsl] Splitting Excel XML tables using XSLT 1.0? I'm a newbie, so bear with me..this is a long email.. I'm working on a project which requires that the XML output of an Excel spreadsheet be transformed into another data domain using XSLT 1.0 in Java 5.0. I've transformed the Excel XML into a form which omits the various style and formatting data (easier to read and work with), and now have a document which contains many Worksheets which look something like this: <Worksheet ss:Name="Cover"> <Table> <Row> <Cell> <Data>INSERT CUSTOMER LOGO HERE</Data> </Cell> </Row> <Row> <Cell> <Data>Succession UA-IP SN09FF</Data> </Cell> </Row> <Row> <Cell> <Data>Network Specification Document</Data> </Cell> </Row> <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> <Row> <Cell> <Data>Project Name</Data> </Cell> <Cell> <Data>GNS</Data> </Cell> </Row> <Row> <Cell> <Data>Project Number</Data> </Cell> <Cell> <Data>H4R379</Data> </Cell> </Row> <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>End cover sheet</Data> </Cell> </Row> </Table> </Worksheet> What I need to do is transform this file again so that some of these rows are reformatted and some remain untouched. The rows which span <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> : : <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> are actually in a subtable on the spreadsheet (visual table only since Excel doesn't understand subtables in it's XML output) and I would like to add attributes to the Row and Cell nodes for easier second stage transformation and data queries. Something like the following: <Row subtable="Network Specification Parms"> <Cell type="parm"> <Data>Region</Data> </Cell> <Cell type="value"> <Data>North America</Data> </Cell> </Row> I've been able to transform to this format partially for the addition of the "subtable" attribute, however, I can't seem to find a way to skip rows which are handled by the recursive template logic so that they are not duplicated in the final output. Here's the XSL: <xsl:template match="ss:Workbook"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Worksheet"> <xsl:if test="@ss:Name='Cover'"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:apply-templates/> </xsl:copy> </xsl:if> </xsl:template> <xsl:template match="ss:Table"> <xsl:copy> <xsl:apply-templates select="ss:Row" mode="subtable"/> </xsl:copy> </xsl:template> <xsl:template match="ss:Row" mode="subtable"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> <xsl:if test="ss:Cell[ss:Data='Network Specification Document']"> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label"/> </xsl:if> </xsl:template> <xsl:template match="ss:Row" mode="subtable_label"> <xsl:if test="ss:Cell[ss:Data!='Region'] and ss:Cell[ss:Data!='North America']"> <xsl:copy> <xsl:attribute name="subtable">Network Specification Parms</xsl:attribute> <xsl:apply-templates/> </xsl:copy> <xsl:apply-templates select="following-sibling::*[1]" mode="subtable_label"/> </xsl:if> </xsl:template> <xsl:template match="ss:Cell"> <xsl:copy> <xsl:apply-templates/> </xsl:copy> </xsl:template> <xsl:template match="ss:Data"> <xsl:copy> <xsl:value-of select="."/> </xsl:copy> </xsl:template> <xsl:template match="@*"> <xsl:copy/> </xsl:template> The output looks like this: <Worksheet ss:Name="Cover"> <Table> <Row> <Cell> <Data>INSERT CUSTOMER LOGO HERE</Data> </Cell> </Row> <Row> <Cell> <Data>Succession UA-IP SN09FF</Data> </Cell> </Row> <Row> <Cell> <Data>Network Specification Document</Data> </Cell> </Row> <Row subtable="Network Specification Parms"> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> : : <Row subtable="Network Specification Parms"> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>Customer Name</Data> </Cell> <Cell> <Data>Verizon GNS</Data> </Cell> </Row> : : <Row> <Cell> <Data>Region</Data> </Cell> <Cell> <Data>North America</Data> </Cell> </Row> <Row> <Cell> <Data>End cover sheet</Data> </Cell> <Cell> <Data>EOS004.A08.02.07</Data> </Cell> </Row> </Table> </Worksheet> </Workbook> The recursive insertion of the "subtable" attribute "works", but I can't seem to stop the Rows from being replicated without losing the other rows outside the subtable. Is there some way to modify my XSL to accomplish this? Is there a mechanism in XSL 2.0 which could provide better management of ranges of nodes? Unfortunately, the Excel XML schema is devoid of any meaning other than spreadsheet layout, so I'm finding it difficult to come up with a mechanism to handle modifications of subranges of nodes during my copy transform. If possible, I need a generic mechanism to handle this sort of Row range transform on other Worksheets as well, using even more column identifiers than the simple "parm" and "value" labels I'm using here. thanks and sorry for the long email, Scott Lynch
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
[xsl] Splitting Excel XML tables us, Scott Lynch | Thread | [xsl] keep togethers for table-row , ms |
Re: [xsl] Filtering child text node, Mukul Gandhi | Date | [xsl] XHTML table of contents with , Daniel K. Schneider |
Month |