RE: [xsl] Interesting problem: Combing tables sections into one table

Subject: RE: [xsl] Interesting problem: Combing tables sections into one table
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 16 Aug 2006 19:10:16 +0100
I would certainly tackle this as a multi-pass transformation, just to keep
my head clear. It's not entirely clear to me what the phases are: for
example, is there always a date in the top entry on each page? But once
you've worked that out, I don't think any of the phases is really that
difficult.

For example, filling in the missing dates is simply

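<xsl:template match="date[not(normalize-space())]">
  <date>
    <!-- date is a child of tr, so look back along the rows; the
         preceding-sibling axis runs in reverse, so [1] is the nearest
         earlier row that has a non-blank date -->
    <xsl:value-of select="../preceding-sibling::tr[normalize-space(date)][1]/date"/>
  </date>
</xsl:template>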
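
A rule like this only handles the blank dates, so the phase also needs an identity rule to copy everything else through. A minimal sketch of the whole phase as a standalone XSLT 1.0 stylesheet, assuming the element names (tr, date) from the sample in the original post:

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy every node and attribute unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for blank dates: take the text of the nearest earlier row
       in the same table whose date is not blank. This rule has a higher
       default priority than the identity rule, so it wins for those
       elements. -->
  <xsl:template match="date[not(normalize-space())]">
    <date>
      <xsl:value-of
          select="normalize-space(../preceding-sibling::tr[normalize-space(date)][1]/date)"/>
    </date>
  </xsl:template>

</xsl:stylesheet>
```

Note that this only looks back within the same table. If a section can start with a blank date, that date stays blank, which is why it matters whether the top entry always carries one.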

Similarly, converting the dates to ISO format is no great hassle; and when
you've done that, the sort should be easy.
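
As a rough sketch of the conversion phase in XSLT 1.0: the sample data carries no year, so the `year` parameter below is a hypothetical addition, and the code assumes every non-blank date has the form "day, space, three-letter English month" in any letter case. It does no validation, so anything else produces a wrong or NaN result.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: the year is not present in the sample data. -->
  <xsl:param name="year" select="'2006'"/>

  <!-- Month abbreviations packed into one string; a month's offset in
       this string, divided by 3, gives its zero-based index. -->
  <xsl:variable name="months"
      select="'janfebmaraprmayjunjulaugsepoctnovdec'"/>

  <!-- Identity rule: copy everything else unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite a non-blank date such as " 26 Jan " as 2006-01-26. -->
  <xsl:template match="date[normalize-space()]">
    <xsl:variable name="d" select="normalize-space(.)"/>
    <!-- Lower-case the month name so "Jan" and "feb" both match. -->
    <xsl:variable name="mon"
        select="translate(substring-after($d, ' '),
                          'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                          'abcdefghijklmnopqrstuvwxyz')"/>
    <xsl:variable name="m"
        select="string-length(substring-before($months, $mon)) div 3 + 1"/>
    <date>
      <xsl:value-of
          select="concat($year, '-',
                         format-number($m, '00'), '-',
                         format-number(substring-before($d, ' '), '00'))"/>
    </date>
  </xsl:template>

</xsl:stylesheet>
```

Once the dates are in this form, a later phase can order the rows with plain string comparison, for instance an xsl:sort on date followed by one on normalize-space(time), assuming the times are always zero-padded hh:mm.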

Michael Kay
http://www.saxonica.com/

 

> -----Original Message-----
> From: Ed Yau [mailto:eyau@xxxxxxxxxxxxxxx] 
> Sent: 16 August 2006 18:57
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Interesting problem: Combing tables sections 
> into one table
> 
> 	Hi,
> 
> 	I wanted to float this by everyone.  I have a very 
> complicated XSLT problem (imho), and want to seek opinions as 
> to how best to proceed with it.
> 	Hope someone's up for the challenge, as it baffles my 
> relatively small XSLT knowledge.
> 
> 	I'm getting XML from a data-extraction system 
> that has been used on phone bills.  The phone bills are 
> normally folded in half and stapled in the middle in a booklet format.
> 	The XML looks something like this:
> 
> 	<page no='3'>
> 	...
> 	<t-left>
> 	  <tr>
> 		<date> 26 Jan </date>
> 		<time> 11:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>  </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>27 Jan </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tf> 
> 			... footer info ...
> 	 </tf>
> 	</t-left>
> 	...
> 
> 	<t-right>
> 	  <tr>
> 		<date> 26 feb </date>
> 		<time> 11:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>  </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>27 feb </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tf> 
> 			... footer info ...
> 	 </tf>
> 	</t-right>
> 
> 	</page>
> 
> 	<page no='4'>
> 	...
> 	<t-left>
> 	  <tr>
> 		<date> 28 Jan </date>
> 		<time> 11:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>  </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>29 Jan </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tf> 
> 			... footer info ...
> 	 </tf>
> 	</t-left>
> 	...
> 	</page>
> 
> 	What I am aiming for should look something like this:
> 
> 	<table>
> 	  <tr>
> 		<date> 28 Jan </date>
> 		<time> 11:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>28 Jan   </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tr>
> 		<date>29 Jan </date>
> 		<time> 12:45 </time>
> 		<destination> XXX </destination>
> 	 </tr>
> 	  <tf> 
> 			... footer info ...
> 	 </tf>
> 	</table>
> 
> 	The two main challenges I'm faced with are therefore:
> 		1) reordering the entries as they should appear 
> - the order they appear in the XML is not the right order due 
> to the folding of the booklet
> 		2) filling 'empty' dates with preceding dates
> 
> 	I've considered the following approaches:
> 	1)  A 'pull' approach where I recurse over the tables, 
> bringing them together and doing the sorting.
> 		- The ordering of the pages is hard to predict. 
>  There is a pattern, but trying to code this in XSLT far 
> defeats me.  I think this is much the harder approach.
> 		The main problem I've found is that the way you 
> recurse depends on the total number of sections.  So I'd have 
> to find some way of counting this first.
> 
> 	2) Trying to sort by date.  This is tricky because the 
> months are not in number format, and there are lots of 
> missing dates that I'd have to fill in at the same time.  I'm 
> assuming it's possible to do this as a 1-pass approach, but 
> it would certainly be simpler as a 2-pass.
> 
> 	Any thoughts on this welcome.  Anyone have any 
> experience of this?  Which approach do you think is likely to 
> work best?
> 
> 	Many thanks in advance,
> 		Ed
