Re: [xsl] Unions and/or temporally related groups

Subject: Re: [xsl] Unions and/or temporally related groups
From: tcn@xxxxxxxxxxxxx (Trevor Nash)
Date: Fri, 13 Jul 2001 10:11:05 GMT
Hi Tony,

You missed out a vital bit of the spec: can a given time element
contain more than one <data/> for the same value of channel and
parameterSet?  For what follows I am assuming not.  How many elements
might a given <time/> contain?

You are right that the union operator is picky regarding what counts
as the 'same' node: it has to be the same node from the original
document, and even a copy made with xsl:copy-of won't do.  As well as
working with values rather than node identity, you seem to want to
ignore the 'value' attribute, I think?
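
To make the identity-versus-value point concrete, here is a minimal
XSLT 1.0 sketch.  It assumes the time elements are children of the
document element (your sample shows no wrapper), and the variable
names $union and $common are mine, purely for illustration.  It relies
on the XPath 1.0 rule that comparing a node with a node-set using '='
is true if any node in the set has the same string value.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <!-- Assumption: all time elements sit under the document element -->
      <xsl:template match="/*">
        <xsl:apply-templates select="time"/>
      </xsl:template>

      <xsl:template match="time">
        <xsl:variable name="next" select="following-sibling::time[1]"/>
        <!-- Union works on node identity: equal values are not merged -->
        <xsl:variable name="union"
            select="data/@channel | $next/data/@channel"/>
        <!-- Value-based intersection: keep this time's channel attributes
             whose value equals that of some channel attribute in $next -->
        <xsl:variable name="common"
            select="data/@channel[. = $next/data/@channel]"/>
        <xsl:value-of select="concat(@value, ' union=', count($union),
                                     ' common=', count($common), '&#10;')"/>
      </xsl:template>
    </xsl:stylesheet>

For the 12:00:00 element of your sample this gives a union of four
nodes but only two common values.  Note that it looks at channel
alone; it does not yet pair each channel with its parameterSet.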

I think you are also right to throw out 'key' if you can do it on the
basis of this node and the next, at least for a key working in the
context of the whole input file.

In terms of node sets you are looking for something like the
following.  The current context is a time element, and $next is its
following-sibling.  I abbreviate your attribute names to the first
letter so we see the wood rather than the trees.

    data[$next/data[@c=^@c and @p=^@p]]

where ^ is an XPath operator I just invented meaning 'context of the
context', a sort of ./. for predicates.  There is no such animal as far
as I know, but maybe someone else knows of one - extension function
anyone?  Besides, this involves a pairwise comparison of the nodes, so
the performance will be bad news if you have lots of elements within
each time.
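
One way to approximate the invented operator in plain XSLT 1.0 is to
move the outer step into an xsl:for-each and use current() inside the
predicate.  This is a sketch only, with your full attribute names and
assuming again that the time elements are children of the document
element.  It is still a pairwise comparison, so the performance caveat
above applies unchanged.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <!-- Assumption: all time elements sit under the document element -->
      <xsl:template match="/*">
        <xsl:apply-templates select="time"/>
      </xsl:template>

      <xsl:template match="time">
        <xsl:variable name="next" select="following-sibling::time[1]"/>
        <xsl:for-each select="data">
          <!-- Inside the test, current() is the data element being
               iterated, which is what ^ stood for above -->
          <xsl:if test="not($next/data[@channel = current()/@channel and
                        @parameterSet = current()/@parameterSet])">
            <xsl:value-of select="concat(../@value, ': channel ', @channel,
                                         ' has no match in the next time',
                                         '&#10;')"/>
          </xsl:if>
        </xsl:for-each>
      </xsl:template>
    </xsl:stylesheet>

This only checks in one direction: a data element that appears in
$next but not in the current time would need the same test run the
other way round, or a comparison of count(data) with
count($next/data).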

Can you do a transformation on a given time element to produce a
string characteristic of the c and p values present?  Something like

  <xsl:variable name="this-id">
    <xsl:for-each select="data">
      <xsl:sort select="@c" />
      <xsl:sort select="@p" />
      <xsl:value-of select="concat(@c,@p)"/>
    </xsl:for-each>
  </xsl:variable>

You may be able to discard the sort, and you may need to put some
separators in so as not to confuse @c='1' @p='11' with @c='11' @p='1'.
Do likewise for $next/data and then the problem boils down to
comparing two strings.
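
Put together, the string comparison might look like the sketch below.
The named template 'signature' and the ':' and '|' separators are my
own choices for illustration, and they assume neither character occurs
in your attribute values.  As before, I assume the time elements are
children of the document element.

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output method="text"/>

      <!-- Assumption: all time elements sit under the document element -->
      <xsl:template match="/*">
        <xsl:apply-templates select="time"/>
      </xsl:template>

      <!-- Hypothetical helper: characteristic string of one time element -->
      <xsl:template name="signature">
        <xsl:param name="time"/>
        <xsl:for-each select="$time/data">
          <xsl:sort select="@channel"/>
          <xsl:sort select="@parameterSet"/>
          <xsl:value-of select="concat(@channel, ':', @parameterSet, '|')"/>
        </xsl:for-each>
      </xsl:template>

      <xsl:template match="time">
        <xsl:variable name="this-id">
          <xsl:call-template name="signature">
            <xsl:with-param name="time" select="."/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:variable name="next-id">
          <xsl:call-template name="signature">
            <xsl:with-param name="time" select="following-sibling::time[1]"/>
          </xsl:call-template>
        </xsl:variable>
        <xsl:value-of select="@value"/>
        <!-- string() makes the comparison explicit for the two
             result tree fragments -->
        <xsl:if test="string($this-id) != string($next-id)">
          <xsl:text> (break after this row)</xsl:text>
        </xsl:if>
        <xsl:text>&#10;</xsl:text>
      </xsl:template>
    </xsl:stylesheet>

On your sample, 12:00:00 and 12:15:00 both produce '1:1|2:2|', so no
break is flagged between them, while 12:15:00 onwards each differ from
their successor.  The last time element has no following sibling, so
its $next-id is empty and it is flagged as well.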

Of course you end up calculating the 'characteristic' of each time
element twice.  Might be worth considering doing it in two passes, but
I doubt it unless you do it as some sort of SAX filter.

Does this help?


>I've searched the sites and haven't found an answer that solves my problem
>and it may be that what I want to do cannot be done in XSL, but here's the
>situation:
>
>I have an xml document that has a tremendous amount of time ordered data.
>The temporal order is essential to the report.  Additionally the data is
>coming from multiple channels and there is a grouping based on the channels
>being sent.  To make it even more complex there is also a grouping within a
>channel as a channel's parameters change.  Here is some XML to help explain:
>
><time value="12:00:00">
>	<data value="123" channel="1" parameterSet="1"/>
>	<data value="123" channel="2" parameterSet="2"/>
></time>
><time value="12:15:00">
>	<data value="234" channel="1" parameterSet="1"/>
>	<data value="456" channel="2" parameterSet="2"/>
></time>
><time value="12:30:00">
>	<data value="234" channel="1" parameterSet="3"/>
>	<data value="456" channel="2" parameterSet="2"/>
></time>
><time value="12:45:00">
>	<data value="234" channel="1" parameterSet="3"/>
>	<data value="456" channel="2" parameterSet="4"/>
></time>
><time value="13:00:00">
>	<data value="234" channel="1" parameterSet="3"/>
></time>
>
>When generating html, I essentially need to create a 2 column table (rows by
>time, columns are channels) and break the table everywhere there is a
>discontinuity; where a discontinuity occurs between <time> elements when the
>next time element does not contain the same number of data elements with the
>same channel and parameterSet attribute values.  In the previous example
>there is a discontinuity between 12:15 and 12:30 because the parameterSet of
>channel 1 changed to 3.  There is a discontinuity between 12:30 and 12:45
>because the parameterSet of channel 2 changed.  There is discontinuity
>between 12:45 and 13:00 because there is no data for channel 1.
>
>I have a solution to the problem which involves an <xsl:if> (I know, not
>pretty!) and basically I want to do an intersection of the values of the
>current time elements data elements channel attributes  (
>$currentTime/data/@channel ) with the same attributes of the next time
>element.  Unfortunately it appears that, even though the attributes have
>the same names and even when their values are the same, a union results
>in four nodes.  So I suppose what I need to know is: how do I get
>an intersection of nodes based on value, not identity?
>
>Also, I'm open to solving this problem other ways.  I've explored using
>keys, but keys only allows basically grouping by one key where I need
>grouping by two keys (channel and parameterSet).
>
>I suppose there is one more constraint; I'm working with hundreds of
>thousands of data elements and I need to be able to process them in seconds
>not minutes.
>
>Any ideas???
>
>TIA
>
>-Tony
>
Regards,
Trevor Nash
--
Traditional training & distance learning,
Consultancy by email

Melvaig Software Engineering Limited
voice:     +44 (0) 1445 771 271 
email:     tcn@xxxxxxxxxxxxx

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list