Re: [xsl] Selecting a particular node of an XML generated from Excel

Subject: Re: [xsl] Selecting a particular node of an XML generated from Excel
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 08 Nov 2011 21:56:34 +0000
What I have done successfully in the past with Excel data is to do a preprocessing phase that adds an ss:index attribute to every Cell. This is a standard conversion that can be applied to any Excel-exported stylesheet, so you only need to write the code once. Once you have done it, selecting the 117th Cell becomes trivial: Cell[@ss:index='117'].

Adding the index values is easier than you might think:

<xsl:for-each-group select="Cell" group-starting-with="*[@ss:Index]">
<xsl:variable name="start" select="(@ss:Index, 1)[1]"/>
<xsl:for-each select="current-group()">
<Cell>
<xsl:copy-of select="@*"/>
<xsl:attribute name="ss:Index" select="$start + position() - 1"/>
<xsl:copy-of select="node()"/>
</Cell>
</xsl:for-each>
</xsl:for-each-group>

Michael Kay
Saxonica


On 08/11/2011 20:49, Karlmarx R wrote:
Hello,

I have a requirement where I am not sure of the best possible solution. The XML is generated from Excel. The problem is with missing rows which makes it difficult to select a particular node. Better I explain using this sample XML: [Pls ignore any case mismatch or spell mistakes]

<root>
     <Row>
         <Cell><Data>u1</Data></Cell>
         <Cell><Data>13.7 billion years ago</Data></Cell>
         <Cell><Data>Hydrogen and helium nuclei form in the first three minutes...</Data></Cell>
         <Cell ss:Index="107"><Data>107</Data></Cell>
         <Cell><Data>108</Data></Cell>
         <Cell><Data>109</Data></Cell>
         <Cell ss:Index="115"><Data>115</Data></Cell>
         <Cell><Data>116</Data></Cell>
         <Cell><Data>YES</Data></Cell>  <!-- this is 117th row. But position() != 117  -->
         <Cell><Data>118</Data></Cell>
     </Row>
     <Row>
         <Cell><Data>u2</Data></Cell>
         <Cell><Data>2 million yearsgo</Data></Cell>
         <Cell><Data>blahlah</Data></Cell>
         <Cell><Data>1</Data></Cell>
         <Cell><Data>2</Data></Cell>
         <!-- SO MANY ROWS HERE -->
         <Cell><Data>YES</Data></Cell>    <!-- THIS IS 117th row (Cell) i.e. position() = 117 -->
         <!--GAIN SO MANY ROWS HERE -->
         <Cell><Data>199</Data></Cell>
         <Cell><Data>200</Data></Cell>   <!--ASSUME THIS IS 200TH ROW -->
     </Row>
     <Row>
         <Cell><Data>u3</Data></Cell>
        <Cell><Data>20th century</Data></Cell>
         <Cell><Data>blah blah</Data></Cell>
         <Cell ss:Index="117"><Data>NO</Data></Cell>    <!-- THIS IS 117th row But position() != 117  -->
         <Cell ss:Index="199"><Data>119</Data></Cell>
     </Row>
</root>


The excel to xml convertion results in empty excel columns getting stripped out in the resulting<Cell>. And so, the number of<Cell>'s vary for each<Row>. Now, if I want to select a particular<Cell>, say 117th<Cell>, I need to check any @ss:Index exist and if so, count() the number of the preceding-sibling's + the preceding @ss:Index value. So, for 117th Cell value,



(A) in Row 1, 2nd<Cell> of @ss:Indiex is 117th (B) in Row 2, position() = 117 is the required one (C) in Row 3, @ss:Indiex matching 117 (D) if none matching found, ignore


In my requirement, I need to select the N-th<Cell> value and check whether it is 'yes' or 'no' and then process. I am trying approach something like this (incomplete)




<xsl:for-each select="Cell">
   ...

<xsl:variable name="last-index " select="preceding-sibling::Cell[@ss:Index][1]"/>

<!-- the ABOVE giver the last<Cell>'s ss:Index value.

Now I need to do something like count(preceding-sibling::Cell[@ss:Index][1]) - this may be wrong, but here is what I having trouble and need suggestions and best approach to solve this, to select N-th<Cell>, where N is known and can be 1 to X (=max possible value). Thanks in advance!.



Regards,
karl

Current Thread