Subject: Re: [xsl] Find inconsistencies: Perl or XSLT?
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Wed, 1 Dec 2010 19:01:22 +0100

Perhaps I am missing something here, but for this simple problem XSLT 1.0
and even XPath 1.0 seem to be good enough.

Problem:
identify duplicate source entries of unit elements

The tags in the posted input did not match; a corrected input.xml is shown below.
If the input file is of moderate size, this simple XPath expression will do it
(it selects every unit that has a later sibling unit with the same source):
$ xpath++ "/data/unit[source=following-sibling::unit/source]" input.xml
===============================================================================
<unit id="1">
<source>blabla</source>
<target>plapla</target>
</unit>
===============================================================================
<unit id="2">
<source>bleble</source>
<target>pleple</target>
</unit>
$
For bigger files, using the key() function helps, since each unit's
duplicates are then looked up by source value instead of by rescanning
its following siblings:
$ cat dupsrc.xsl
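<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>
  <!-- index every unit by the string value of its source child -->
  <xsl:key name="source" match="unit" use="source"/>

  <!-- suppress the built-in copying of text nodes -->
  <xsl:template match="text()"/>

  <!-- report each unit whose source value occurs more than once -->
  <xsl:template match="/data/unit[count(key('source',source))>1]">
    <xsl:value-of select="concat(@id,'-',source,'&#10;')"/>
  </xsl:template>
</xsl:stylesheet>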
$
$ xsltproc dupsrc.xsl input.xml
<?xml version="1.0"?>
1-blabla
2-bleble
4-blabla
5-bleble
$ cat input.xml
<data>
<unit id="1">
<source>blabla</source>
<target>plapla</target>
</unit>
<unit id="2">
<source>bleble</source>
<target>pleple</target>
</unit>
<unit id="3">
<source>bloblo</source>
<target>ploplo</target>
</unit>
<unit id="4">
<source>blabla</source>
<target>plapla</target>
</unit>
<unit id="5">
<source>bleble</source>
<target>lolailo</target>
</unit>
</data>
$
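
The stylesheet above lists every unit whose source occurs more than once, including units 1 and 4, whose targets agree. The original question asks only for units with the same source but different targets. A minimal XSLT 1.0 sketch of that narrower check follows, assuming the corrected input.xml shown above; the key name by-source and the report format are illustrative choices, not something taken from the thread. It relies on the XPath 1.0 rule that a != comparison between node-sets is true as soon as any pair of nodes differs in string value.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- plain-text report, so no XML declaration is written -->
  <xsl:output method="text"/>

  <!-- hypothetical key name: index every unit by its source text -->
  <xsl:key name="by-source" match="unit" use="source"/>

  <!-- suppress the built-in copying of text nodes -->
  <xsl:template match="text()"/>

  <!-- a unit is inconsistent if some unit with the same source
       has a target that differs from this unit's own target -->
  <xsl:template match="/data/unit[key('by-source', source)/target != target]">
    <xsl:value-of select="concat(@id, ': ', source, ' -> ', target, '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

For the input.xml above this should report units 2 and 5 only, because the targets of units 1 and 4 are identical and unit 3 has no other unit with the same source.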
Best wishes,
Hermann Stamm-Wilbrandt
Developer, XML Compiler, L3
Fixpack team lead
WebSphere DataPower SOA Appliances
From: Michael Kay <mike@xxxxxxxxxxxx>
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Date: 12/01/2010 04:06 PM
Subject: Re: [xsl] Find inconsistencies: Perl or XSLT?
On 01/12/2010 14:46, Manuel Souto Pico wrote:
> Dear all,
>
> I need to process some files and I know how to do it in Perl, but as
> has happened in the past with other stuff, perhaps
> there's an (objectively) simpler or more efficient way to do it with
> XSLT.
>
> I have a file like this
>
> <unit id="1">
> <source>blabla</source>
> <target>plapla</source>
> </unit>
> <unit id="2">
> <source>bleble</source>
> <target>pleple</source>
> </unit>
> <unit id="3">
> <source>bloblo</source>
> <target>ploplo</source>
> </unit>
> <unit id="4">
> <source>blabla</source>
> <target>plapla</source>
> </unit>
> <unit id="5">
> <source>bleble</source>
> <target>lolailo</source>
> </unit>
>
> I think the example is illustrative enough.
>
> The target element contains the translation of the source element, and
> the same source must always be translated in the same way, but
> sometimes it's not. So what I'd like to do is find two or more units with
> the same source but with different targets (like 2 and 5 in the
> example, but unlike 1 and 4).
>
> In Perl I would use an XML module (or not) and put the source elements
> in the keys of a hash and the target elements in their corresponding
> values. When assigning a new key-value pair, if the key already
> exists, I compare the values. If they are equal, they pass; otherwise they
> are flagged and included in the report.
>
> The report in this case would be something like:
>
> The following inconsistencies have been found
> 2: bleble -> pleple
> 5: bleble -> lolailo
>
> Is it possible to do this in XSLT? Is it more efficient than doing it
> in Perl as I was planning to? My knowledge of XSLT is very limited and
> I can't see beyond transforming an XML file into another XML file.
>
> Thanks a lot for your opinion.
> Manuel
>
>
Something like this:
<xsl:for-each-group select="unit" group-by="source">
  <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
    <conflicts-for source="{current-grouping-key()}">
      <xsl:value-of select="distinct-values(current-group()/target)"/>
    </conflicts-for>
  </xsl:if>
</xsl:for-each-group>
Michael Kay
Saxonica
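
The grouping fragment above needs an enclosing stylesheet and an XSLT 2.0 processor such as Saxon (xsltproc implements XSLT 1.0 only, so it cannot run it). A minimal sketch of one possible complete stylesheet follows, assuming the units are wrapped in a data element as in the corrected input.xml; the text report format mirrors the one asked for in the question and is an illustrative choice rather than part of the fragment.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- plain-text report instead of XML output -->
  <xsl:output method="text"/>

  <!-- assumes the root element is data, as in the corrected input.xml -->
  <xsl:template match="/data">
    <!-- one group per distinct source value -->
    <xsl:for-each-group select="unit" group-by="source">
      <!-- a group is inconsistent if it holds more than one distinct target -->
      <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
        <!-- list every unit of the inconsistent group -->
        <xsl:for-each select="current-group()">
          <xsl:value-of select="concat(@id, ': ', source, ' -> ', target, '&#10;')"/>
        </xsl:for-each>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With the corrected input.xml, only the bleble group has two distinct targets, so the report should contain units 2 and 5.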