RE: [xsl] Normalizing/Simplify MS-Excel XML

Subject: RE: [xsl] Normalizing/Simplify MS-Excel XML
From: stephan@xxxxxxxxxx
Date: Tue, 16 Aug 2005 19:07:34 +0800
Hi Kaila,

thanks so mutch, works like a charm!
My stylesheet takes the name of a worksheet as a parameter and pulls the 
values into a easier structure for further import (I asume that the first 
row does have the field names and no "holes"). For future readers I 
include my sheet below:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="
http://www.w3.org/1999/XSL/Transform"; xmlns:e="
urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="
urn:schemas-microsoft-com:office:spreadsheet" exclude-result-prefixes="e 
ss">
        <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="
yes"/>
        <xsl:param name="sheetname">Sheet 1/xsl:param>
        <xsl:template match="e:Workbook">
                <xsl:element name="importdata">
                        <xsl:attribute name="type"><xsl:value-of select="
$sheetname"/></xsl:attribute>
                        <xsl:apply-templates select="
e:Worksheet[@ss:Name=$sheetname]"/>
                </xsl:element>
        </xsl:template>
        <xsl:template match="e:Worksheet">
                <xsl:apply-templates select="e:Table/e:Row"/>
        </xsl:template>
        <xsl:template match="ss:Row">
                <xsl:if test="not(ss:Cell[1]/@ss:Index)">
                        <!-- We do NOT process records where the first 
cell is empty -->
                        <xsl:element name="record">
                                <xsl:attribute name="position"><
xsl:value-of select="position()-1"/></xsl:attribute>
                                <xsl:apply-templates select="e:Cell"/>
                        </xsl:element>
                </xsl:if>
        </xsl:template>
        <xsl:template match="ss:Cell">
                <xsl:variable name="curCol">
                        <xsl:choose>
                                <xsl:when test="@ss:Index">
                                        <xsl:value-of select="@ss:Index"/>
                                </xsl:when>
                                <xsl:otherwise>
                                        <xsl:call-template name="
cell-index">
                                                <xsl:with-param name="idx" 
select="1"/>
                                        </xsl:call-template>
                                </xsl:otherwise>
                        </xsl:choose>
                </xsl:variable>
                <xsl:element name="field">
                        <xsl:attribute name="col"><xsl:value-of select="
$curCol"/></xsl:attribute>
                        <xsl:attribute name="name"><xsl:value-of select="
normalize-space(ancestor::e:Table/e:Row[position()=1]/e:Cell[position()=$curCol])
"/></xsl:attribute>
                        <xsl:value-of select="ss:Data"/>
                </xsl:element>
        </xsl:template>
        <xsl:template name="cell-index">
                <xsl:param name="idx"/>
                <xsl:if test="$idx &lt;= position()">
                        <xsl:choose>
                                <xsl:when test="
preceding-sibling::ss:Cell[position()=$idx]/@ss:Index">
                                        <xsl:value-of select="
preceding-sibling::ss:Cell[position()=$idx]/@ss:Index +$idx"/>
                                </xsl:when>
                                <xsl:when test="$idx = position()">
                                        <xsl:value-of select="$idx"/>
                                </xsl:when>
                                <xsl:otherwise>
                                        <xsl:call-template name="
cell-index">
                                                <xsl:with-param name="idx" 
select="$idx+1"/>
                                        </xsl:call-template>
                                </xsl:otherwise>
                        </xsl:choose>
                </xsl:if>
        </xsl:template>
</xsl:stylesheet>

To be very honest:

I do not fully understand how the recursive call to cell-index is actually 
working. If a kind soul could enlighten me on that one, that would be 
great.
:-) stw

"Kaila Kaarle" <Kaarle.Kaila@xxxxxxx> wrote on 15/08/2005 20:28:47:

> hi,
> 
> I had the same task a while ago and you can find my solution at:
> 
> http://www.kk-software.fi/kalle/xslt/fromExcel.xsl
> 
> It fills all empty cells but doesn't do anything with empty rows
> as I'm not interested in them.
> 
> regards
> Kaarle
> 
> -----Original Message-----
> From: stephan@xxxxxxxxxx [mailto:stephan@xxxxxxxxxx]
> Sent: 15. elokuuta 2005 14:36
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Normalizing/Simplify MS-Excel XML
> 
> 
> Hi there,
> 
> cracking my head on this...
> I try to convert an Excel XML Spreadsheet into a simpler format. My 
first 
> attempt works fine (my Excel sheet name is called "Sheet 1") but only 
when 
> all cells are filled (see code below). When cells in Excel are not 
filled 
> Excel XML simply adds a column reference to the next following cell like 

> this:
> 
> <Row>
> <Cell><Data ss:Type="String">Peter Pan</Data></Cell>
> <Cell ss:Index="4"><Data ss:Type="String">Toon</Data></Cell>
> <Cell><Data ss:Type="String">Disney</Data></Cell>
> <Cell><Data ss:Type="String">USA</Data></Cell>
> <Cell ss:Index="11"><Data ss:Type="String">7</Data></Cell>
> <Cell><Data ss:Type="String">GA</Data></Cell>
> </Row>
> 
> So type of movie is in column 4, the Studio in column 5 and the country 
in 
> column 6.
> (Column 2 would have the Male Star and Column 3 the Female, which is 
empty 
> for Toons or Animee, Column 7-10 some more statistical data)
> 
> I need:
> <movie>
> <title>Peter 
> 
Pan</title><genre>Toon</genre><studio>Disney</studio><country>USA</country><rating>7</rating><audience>GA</audience>
> </movie>
> 
> The field names are in the first row....
> What I wonder: how do I get an XPath expression (I guess 
> preceding-sibling:: comes into play) that can give me the column name 
> based on the last ss:Index and the "distance to it", so I get column 6 
for 
> Disney, column 7 for USA etc.
> 
> So it seems to be the basic axis question: "When within my siblings was 
> the last time a attribute of type x appeared and what was the value. The 

> what was the value question seems rather simple: select="
> preceding-sibling::e:Cell/@ss:Index"
> The "how many nodes ago" question is where I'm stuck....
> 
> Help is greatly appreciated!!!
> :-) stw
> 
> My template that works for full rows:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0" xmlns:xsl="
> http://www.w3.org/1999/XSL/Transform"; xmlns:e="
> urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="
> urn:schemas-microsoft-com:office:spreadsheet" exclude-result-prefixes="e 

> ss">
>         <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="
> yes"/>
>         <xsl:param name="sheetname">Sheet 1</xsl:param>
>         <xsl:template match="e:Workbook">
>                 <xsl:element name="importdata">
>                         <xsl:attribute name="type"><xsl:value-of 
select="
> $sheetname"/></xsl:attribute>
>                         <xsl:apply-templates select="
> e:Worksheet[@ss:Name=$sheetname]"/>
>                 </xsl:element>
>         </xsl:template>
>         <xsl:template match="e:Worksheet">
>                 <xsl:apply-templates select="e:Table"/>
>         </xsl:template>
>         <xsl:template match="e:Table">
>                 <xsl:apply-templates select="e:Row"/>
>         </xsl:template>
>         <xsl:template match="e:Row">
>                 <xsl:element name="record">
>                         <xsl:attribute name="position"><xsl:value-of 
> select="position()-1"/></xsl:attribute>
>                         <xsl:apply-templates select="e:Cell"/>
>                 </xsl:element>
>         </xsl:template>
>         <xsl:template match="e:Cell[e:Data!='']">
>                 <xsl:variable name="pos" select="position()"/>
>                 <xsl:element name="field">
>                         <xsl:attribute name="name"><xsl:value-of 
select="
> 
normalize-space(ancestor::e:Table/e:Row[position()=1]/e:Cell[position()=$pos])
> "/></xsl:attribute>
>                         <xsl:value-of select="e:Data"/>
>                 </xsl:element>
>         </xsl:template>
>         <!-- elements we do not need -->
>         <xsl:template match="e:Row[position()=1]"/>
>         <!-- First row contains field names -->
>         <xsl:template match="*"/>
> </xsl:stylesheet>

Current Thread