Re: [xsl] Find inconsistencies: Perl or XSLT?

Subject: Re: [xsl] Find inconsistencies: Perl or XSLT?
From: Manuel Souto Pico <m.soutopico@xxxxxxxxx>
Date: Thu, 2 Dec 2010 01:23:49 +0100
Hi guys,

Thanks a lot for all your answers. It looks like XSLT can be used for
everything :)

What Michael wrote was exactly what I needed. I just tweaked the output
a bit to make it more human-readable:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
	exclude-result-prefixes="xd"
	version="2.0">
	<xsl:output method="text"/>
	<xsl:template match="file">
		<xsl:text>INCONSISTENCIES FOUND&#xA;</xsl:text>
		<xsl:for-each-group select="unit" group-by="source">
			<xsl:if test="count(distinct-values(current-group()/target)) gt 1">
				<xsl:text>&#xA;</xsl:text>
				<xsl:text>Segment [</xsl:text>
				<xsl:value-of select="current-grouping-key()"/>
				<xsl:text>]&#xA;translated as:&#xA;[</xsl:text>
				<xsl:value-of select="distinct-values(current-group()/target)"
separator="] and &#xA;["/>
				<xsl:text>].&#xA;</xsl:text>
			</xsl:if>
		</xsl:for-each-group>
	</xsl:template>
</xsl:stylesheet>

So I get

INCONSISTENCIES FOUND

Segment [bleble]
translated as:
[pleple] and
[lolailo].
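
Note that the template matches a file element, so the stylesheet assumes
the units are wrapped in a root element of that name. The sample in the
original question shows only the unit elements, so the <file> wrapper
below is an assumption about the real input. A minimal input sketch that
should give the output above:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed root element; the original sample shows only the units -->
<file>
	<unit id="1">
		<source>blabla</source>
		<target>plapla</target>
	</unit>
	<unit id="2">
		<source>bleble</source>
		<target>pleple</target>
	</unit>
	<unit id="3">
		<source>bloblo</source>
		<target>ploplo</target>
	</unit>
	<unit id="4">
		<source>blabla</source>
		<target>plapla</target>
	</unit>
	<unit id="5">
		<source>bleble</source>
		<target>lolailo</target>
	</unit>
</file>
```

Units 1 and 4 share a source and a target, so they are not reported;
units 2 and 5 share a source but differ in target, so they are.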

That function conflicts-for must be quite new, it's not in my O'Reilly book.

Once again: really, thanks a lot.

Cheers, Manuel


2010/12/1 Michael Kay <mike@xxxxxxxxxxxx>:
> On 01/12/2010 14:46, Manuel Souto Pico wrote:
>>
>> Dear all,
>>
>> I need to process some files and I know how to do it in Perl, but as
>> has been the case in the past with other stuff, perhaps there's an
>> (objectively) simpler or more efficient way to do it with
>> XSLT.
>>
>> I have a file like this
>>
>> <unit id="1">
>>    <source>blabla</source>
>>    <target>plapla</target>
>> </unit>
>> <unit id="2">
>>    <source>bleble</source>
>>    <target>pleple</target>
>> </unit>
>> <unit id="3">
>>    <source>bloblo</source>
>>    <target>ploplo</target>
>> </unit>
>> <unit id="4">
>>    <source>blabla</source>
>>    <target>plapla</target>
>> </unit>
>> <unit id="5">
>>    <source>bleble</source>
>>    <target>lolailo</target>
>> </unit>
>>
>> I think the example is illustrative enough.
>>
>> The target element contains the translation of the source element, and
>> the same source text must always be translated in the same way, but
>> sometimes it's not. So what I'd like to do is find two or more units
>> with the same source but with different targets (like 2 and 5 in the
>> example, but unlike 1 and 4).
>>
>> In Perl I would use an XML module (or not) and put the source elements
>> in the keys of a hash and the target elements in their corresponding
>> values. When assigning a new key-value pair, if the key already
>> exists, I compare the values. If they are equal, they pass, else they
>> are flagged and included in the report.
>>
>> The report in this case would be something like:
>>
>> The following inconsistencies have been found
>> 2: bleble ->  pleple
>> 5: bleble ->  lolailo
>>
>> Is it possible to do this in XSLT? Is it more efficient than doing it
>> in Perl as I was planning to? My knowledge of XSLT is very limited and
>> I can't see beyond transforming an XML file into another XML file.
>>
>> Thanks a lot for your opinion.
>> Manuel
>>
>>
> Something like this:
>
> <xsl:for-each-group select="unit" group-by="source">
> <xsl:if test="count(distinct-values(current-group()/target)) gt 1">
> <conflicts-for source="{current-grouping-key()}">
> <xsl:value-of select="distinct-values(current-group()/target)"/>
> </conflicts-for>
> </xsl:if>
> </xsl:for-each-group>
>
> Michael Kay
> Saxonica
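
For comparison, a variant that stays closer to the report format asked
for in the original question (unit id, source, target) could look like
the sketch below. It reuses the same grouping test and only changes
what is printed for each conflicting group. It assumes the units sit
under a root <file> element, as in the stylesheet at the top of this
message, and has not been tuned for large files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
	version="2.0">
	<xsl:output method="text"/>
	<!-- Assumes a root element named file that contains the unit elements -->
	<xsl:template match="file">
		<xsl:text>The following inconsistencies have been found&#xA;</xsl:text>
		<!-- One group per distinct source string -->
		<xsl:for-each-group select="unit" group-by="source">
			<!-- Keep only groups whose units disagree on the target -->
			<xsl:if test="count(distinct-values(current-group()/target)) gt 1">
				<!-- Print one line per unit in the conflicting group -->
				<xsl:for-each select="current-group()">
					<xsl:value-of select="@id"/>
					<xsl:text>: </xsl:text>
					<xsl:value-of select="source"/>
					<xsl:text> -> </xsl:text>
					<xsl:value-of select="target"/>
					<xsl:text>&#xA;</xsl:text>
				</xsl:for-each>
			</xsl:if>
		</xsl:for-each-group>
	</xsl:template>
</xsl:stylesheet>
```

With the five sample units, this should list units 2 and 5 and skip
units 1 and 4, which agree on their target. Like the stylesheet above,
it needs an XSLT 2.0 processor such as Saxon, because
xsl:for-each-group and distinct-values() are not available in XSLT 1.0.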
