Subject: Re: [xsl] Keys and select distinct - is that the solution ?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 06 Jun 2006 15:06:54 -0400
If I understand your requirements correctly, solution 1 is nearly there; you just have to add in the facility of de-duplicating your codes before you call the key() function for their names. This could be done either by using a predicate on your select expression (which would filter out all but the first occurrences of values assigned to ManureTypeCollection), or an explicit xsl:if test inside the for-each.
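A minimal sketch of the predicate option, assuming the namespaces and the 'ManureType' key from the stylesheet quoted below. The second key, 'CodeByValue', is a hypothetical helper that is not in the original; it exists only so each code can ask whether it is the first node carrying its value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:eih="http://rep.oio.dk/glrchr.dk/eih/xml/schemas/2005/03/01/"
  xmlns:gr="http://rep.oio.dk/glrchr.dk/goedningsregnskab/xml/schemas/2006/05/01/">

  <xsl:output method="text"/>

  <!-- Lookup key from the original stylesheet: names indexed by their sibling code -->
  <xsl:key name="ManureType" match="gr:ManureTypeName" use="../gr:ManureTypeCode"/>

  <!-- Assumed helper key (not in the original): data codes indexed by their own value -->
  <xsl:key name="CodeByValue" match="eih:ManureTypeStructure/gr:ManureTypeCode" use="."/>

  <xsl:template match="/">
    <xsl:apply-templates select="eih/eih:ManureTypeCollection"/>
  </xsl:template>

  <xsl:template match="eih:ManureTypeCollection">
    <!-- The predicate keeps a code only if it is the first node with that value -->
    <xsl:for-each select="eih:ManureTypeStructure/gr:ManureTypeCode
                          [generate-id() = generate-id(key('CodeByValue', .)[1])]">
      <!-- Each distinct code is now looked up exactly once -->
      <xsl:value-of select="key('ManureType', .)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With this approach the names come out in the order in which the codes first appear in the data. A code with no lookup entry contributes only an empty line.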
Solution 2 relies on the key() function itself to perform the de-duplication of values of ManureTypeCode. Passing a given value to the key() function multiple times is fine, since the resulting set will only have single instances of whatever nodes are returned (the ManureTypeName referred to by that value of ManureTypeCode). I don't see any reason why this shouldn't work just fine here; in fact it's a fairly elegant approach to the problem.
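For comparison, a sketch of solution 2 under the same assumptions (namespaces and key as in the stylesheet quoted below). The only deliberate change is the separator test: the posted version compares position() with the string 'last', which converts to NaN and never matches, so a comma follows every name. The sketch calls the last() function instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:eih="http://rep.oio.dk/glrchr.dk/eih/xml/schemas/2005/03/01/"
  xmlns:gr="http://rep.oio.dk/glrchr.dk/goedningsregnskab/xml/schemas/2006/05/01/">

  <xsl:output method="text"/>

  <!-- Names indexed by their sibling code, as in the original stylesheet -->
  <xsl:key name="ManureType" match="gr:ManureTypeName" use="../gr:ManureTypeCode"/>

  <xsl:template match="/">
    <xsl:apply-templates select="eih/eih:ManureTypeCollection"/>
  </xsl:template>

  <xsl:template match="eih:ManureTypeCollection">
    <!-- key() receives a node-set of codes; the result is a node-set of names,
         so a name reached through several identical codes appears only once -->
    <xsl:for-each select="key('ManureType', eih:ManureTypeStructure/gr:ManureTypeCode)">
      <xsl:value-of select="."/>
      <!-- last() is a function call, not the string 'last' -->
      <xsl:if test="position() != last()">
        <xsl:text>, </xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The names are processed in the document order of the ManureTypeName nodes, not in the order of the codes that referred to them.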
Cheers, Wendell
Hi Wendell (and others),
Thank you very much for a very thorough answer. I think it starts to fall into place.... However it would still be beneficial for me to go through - as you suggest - a simple extract from the project.
As suggested, I've included a simple XML instance and XSL stylesheet. The stylesheet consists of two template matches:
Method 1 seems to me to be the logical approach: match the ManureTypeCollection and iterate over each ManureTypeStructure/ManureTypeCode. For each code, use the key to look up the corresponding ManureTypeName. The problem here is that the same code is looked up twice and its name returned twice, when it should be returned only once.
In Method 2 (a colleague's tip) the result is actually what I want: the names are returned just once! But the approach does not seem right to me. It seems to work the other way around, by first matching the lookup names and returning them if a corresponding code is found. That is not optimal, is it?
It would be great if someone could describe to me:
1. the best way of returning the ManureTypeNames once (comment on methods 1 and 2)
2. the code line by line, especially if it uses the Muenchian method
Thanks a lot in advance!
- Christian
XSL:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:eih="http://rep.oio.dk/glrchr.dk/eih/xml/schemas/2005/03/01/"
  xmlns:gr="http://rep.oio.dk/glrchr.dk/goedningsregnskab/xml/schemas/2006/05/01/">
<xsl:key name="ManureType" match="gr:ManureTypeName" use="../gr:ManureTypeCode"/>
<xsl:template match="/">
root is matchet!
<xsl:apply-templates select="eih/eih:ManureTypeCollection"/>
</xsl:template>

<!-- METHOD 1 - writes the text twice, returns: AjleAjleFast gxdningFast gxdning -->
<xsl:template match="eih:ManureTypeCollection">
  <xsl:for-each select="eih:ManureTypeStructure/gr:ManureTypeCode">
    <xsl:value-of select="key('ManureType',node())"/>
  </xsl:for-each>
</xsl:template>

<!-- METHOD 2 - writes out the text once, as wanted, returns: Fast gxdning, Ajle, -->
<xsl:template match="eih:ManureTypeCollection">
  <xsl:for-each select="key('ManureType', eih:ManureTypeStructure/gr:ManureTypeCode)">
    <xsl:value-of select="node()"/>
    <xsl:if test="not(position()='last')"><xsl:text>, </xsl:text></xsl:if>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
XML instance:
<?xml version="1.0" encoding="UTF-8"?>
<eih xmlns:eih="http://rep.oio.dk/glrchr.dk/eih/xml/schemas/2005/03/01/"
xmlns:gr="http://rep.oio.dk/glrchr.dk/goedningsregnskab/xml/schemas/2006/05/
<!-- Codes and data -->
<eih:ManureTypeCollection>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>5</gr:ManureTypeCode>
    <gr:ElementIdentifier>N</gr:ElementIdentifier>
    <gr:ElementQuantity>17.0</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>5</gr:ManureTypeCode>
    <gr:ElementIdentifier>P</gr:ElementIdentifier>
    <gr:ElementQuantity>0.6</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>4</gr:ManureTypeCode>
    <gr:ElementIdentifier>N</gr:ElementIdentifier>
    <gr:ElementQuantity>17.5</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>4</gr:ManureTypeCode>
    <gr:ElementIdentifier>P</gr:ElementIdentifier>
    <gr:ElementQuantity> 6.3</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>3</gr:ManureTypeCode>
    <gr:ElementIdentifier>N</gr:ElementIdentifier>
    <gr:ElementQuantity> 65.3</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>3</gr:ManureTypeCode>
    <gr:ElementIdentifier>P</gr:ElementIdentifier>
    <gr:ElementQuantity> 26.3</gr:ElementQuantity>
  </eih:ManureTypeStructure>
  <eih:ManureTypeStructure>
    <gr:ManureTypeCode>3</gr:ManureTypeCode>
    <gr:ElementIdentifier>P</gr:ElementIdentifier>
    <gr:ElementQuantity> 16.3</gr:ElementQuantity>
  </eih:ManureTypeStructure>
</eih:ManureTypeCollection>
<!-- look up information for the codes -->
<eih:XImanureTypeCollection>
  <eih:XImanureTypeStructure>
    <gr:ManureTypeCode>4</gr:ManureTypeCode>
    <gr:ManureTypeName>Fast gxdning</gr:ManureTypeName>
  </eih:XImanureTypeStructure>
  <eih:XImanureTypeStructure>
    <gr:ManureTypeCode>5</gr:ManureTypeCode>
    <gr:ManureTypeName>Ajle</gr:ManureTypeName>
  </eih:XImanureTypeStructure>
</eih:XImanureTypeCollection>
</eih>
On 6/5/06, Wendell Piez <wapiez@xxxxxxxxxxxxxxxx> wrote:
Hi Christian,
At 07:30 PM 6/2/2006, you wrote:
>I have now tried the solutions, but none of them works.
Actually, I kind of doubt that. :-> What you have tried is either an attempt at solving the problem blind, posted by contributors (me) who worked with a partial data set and partial problem description, or attempts of your own at patching such code.
Believe me, "the solution" works just fine. You just haven't figured out how to write it yet, and neither have we. This doesn't mean that the solution is not known -- we've written it plenty of times before, just not fitted to your particular problem (which we nevertheless recognize as a member of the species).
>Actually I dont think I need to use the generic_id, do I?
>Because I don't need to make all the elements unique!!? As far as I
>can see, I only have to pick out all the distinct codes.
The generate-id() idiom I suggested is not for the purposes of "making an element unique". It is merely a way of checking whether one node is the same node as another node. Consider this document:
<a> <b>100</b> <b>100</b> </a>
Are /a/b[1] and /a/b[2] the same node? No.
How does a stylesheet know this? It can't tell by comparing their names: they're both named 'b'. Nor by comparing their values, which are both '100'.
It would be possible to write a template that produced for each node a unique identifier, which we could compare. For example, it could generate for the first b node the identifier "/a/b[1]" and for the second, "/a/b[2]". We could compare these strings to establish the two nodes are not the same node.
Or, since generate-id() generates, for any node, an identifier that is unique to the node, we could just use this function, and not have to write that template.
Or, there's another way to test whether these are the same. Say we have
<xsl:variable name="first-b" select="/descendant::b[1]"/>
<xsl:template match="b">
  <xsl:choose>
    <xsl:when test="count(.|$first-b)=1">This b is the first</xsl:when>
    <xsl:otherwise>This b is not the first</xsl:otherwise>
  </xsl:choose>
</xsl:template>
Using generate-id() instead, we could say
<xsl:template match="b">
  <xsl:choose>
    <xsl:when test="generate-id() = generate-id($first-b)">This b is the first</xsl:when>
    <xsl:otherwise>This b is not the first</xsl:otherwise>
  </xsl:choose>
</xsl:template>
which also works.
Either of these can be applied to solve the problem of "am I a unique representative of a given group of nodes", which is part of the grouping problem. (And David C is correct: yours is a grouping problem.)
>By doing that I do have to match on the content of the node, and not
>the element name, right!?
Actually you match on a node, not on its content or name.
We do match nodes *by* name. Indeed this is the normal way of doing it. In XSLT 1.0 it's not possible to match nodes with templates based on their content.
>If I match on the content/text of the node
>couldn't I say something like take all the elements whose content is
>not in any preeceding sibling content ???
You could match a node and test to see if its content appeared on a preceding element or preceding-sibling element, yes. And indeed, that is a solution available to us for grouping. But: it is a slow solution with poor performance; it doesn't scale well to even medium-sized data sets.
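For reference, a minimal sketch of that slower test, assuming the small a/b document shown earlier as input. The output wording is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="b">
    <!-- True only when no earlier b in document order has the same string value.
         The preceding:: axis is re-scanned for every b, which is why this is slow. -->
    <xsl:if test="not(. = preceding::b)">
      <xsl:text>First b with value </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the whitespace text nodes the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

The comparison `. = preceding::b` is true if any preceding b has the same string value, so wrapping it in not() keeps only the first occurrence of each value.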
It's much quicker to do something like
<xsl:template match="b">
  <xsl:variable name="bs-like-this" select="/descendant::b[.=current()]"/>
  <xsl:if test="generate-id()=generate-id($bs-like-this[1])">
    <xsl:text>I'm a b; my content is </xsl:text>
    <xsl:apply-templates/>
  </xsl:if>
</xsl:template>
Instead of using the painful traversal along the preceding axis, this template works like this:
1. Bind to a variable all the 'b' nodes in the document whose content is the same as the b node matched
2. Test to see whether the b node matched is the first of the nodes bound to the variable; if it is, report its content
If we can do this, then grouping all the bs by content (*not* by name) is as simple as processing, inside that test, all the bs bound to the variable. This is a trivial tweak to what I just wrote above (which I leave to you to figure out).
This is still slow, however, since for every b matched by the template we have to assemble the set /descendant::b[.=current()], which entails looking through the entire document. Accordingly, for this we usually use keys (this was Steve Muench's contribution to the method), since keys are pre-indexed and hence, fast:
<xsl:variable name="bs-like-this" select="key('bs-by-value',.)"/>
which grabs those nodes without having to traverse the entire tree.
In this case the key 'bs-by-value' would index the 'b' nodes by their content (value):
<xsl:key name="bs-by-value" match="b" use="."/>
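Put together, the pieces might look like the following sketch, again assuming the small a/b document as input. The key name and variable name are the ones used above; the report text and the use of count() are illustrative only.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every b by its string value; the index is built once per document -->
  <xsl:key name="bs-by-value" match="b" use="."/>

  <xsl:template match="b">
    <!-- All b nodes sharing this node's value, fetched from the index -->
    <xsl:variable name="bs-like-this" select="key('bs-by-value', .)"/>
    <!-- Only the first member of each group reports -->
    <xsl:if test="generate-id() = generate-id($bs-like-this[1])">
      <xsl:text>Value </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text> occurs </xsl:text>
      <xsl:value-of select="count($bs-like-this)"/>
      <xsl:text> time(s)&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

  <!-- Suppress the whitespace text nodes the built-in rules would otherwise copy -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Because the group is already bound to the variable inside the test, anything that has to happen once per distinct value can be done there.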
If you really want to pursue a solution based on checking backwards along the preceding:: axis, we can help with that. By pointing you to the grouping solutions (which build on what I just showed you above), we are trying to skip you past that point, since it's not the best solution available.
If you need more help disentangling this, please feel free to post again. But when you do, post your sample code again please, so we can point the way using examples that make sense.
Good luck, Wendell