Re: [xsl] Expensive XSLT2 - suggestions for improving?

Subject: Re: [xsl] Expensive XSLT2 - suggestions for improving?
From: Michael Müller-Hillebrand <mmh@xxxxxxxxxxxxx>
Date: Thu, 16 Oct 2008 21:33:10 +0200

Wendell,

Thanks for your pointers and for reminding me that key() takes a third
argument. It does not help here, though, because the structure is
totally flat: the parent is the root element.

And "key('oid-by-value',.)[1] except ." is a nice piece of Boolean
logic :-)

Thanks,

- Michael

On 16.10.2008 at 18:34, Wendell Piez wrote:

Michael,

This is an interesting problem, and you may want to try a few things.

Part of what makes it interesting is the question of how widely you
wish to scope your examination for similar values. In XSLT 2, a
third argument can be used to define the scope within which the key
works.

You could try something like this:

<xsl:key name="oid-by-value" match="@oid" use="string(..)"/>
<!-- retrieves an @oid attribute using the string value of its
parent element -->

and then

<xsl:template match="value">
 <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:for-each select="key('oid-by-value',.)[1] except .">
     <!-- traverse to the @oid of the first element with the
          same value, unless this is it -->
     <xsl:attribute name="refoid" select="string()"/>
   </xsl:for-each>
   <!-- skip content -->
 </xsl:copy>
</xsl:template>

If you wanted to scope only within the parent element, you could use
key('oid-by-value',.,..)[1] -- the '..' as the third argument
restricts the scope of retrieval to that parent's subtree.

Note: untested. (But if it won't work, surely some sharp-eyed XSLTer
will notice.)
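
A fuller sketch of the key-based approach, for anyone who wants to try it. It keeps the 'oid-by-value' key from above but differs from the template in two ways, both of which are my reading rather than something confirmed in this thread. First, the key returns @oid attribute nodes while '.' is the <value> element, so "except ." would never remove anything; the sketch uses "except @oid" instead. Second, the template is restricted to duplicates through a match predicate, so first occurrences fall through to the identity template and keep their text. Like the original, it is untested.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- retrieves an @oid attribute using the string value of its
       parent element -->
  <xsl:key name="oid-by-value" match="@oid" use="string(..)"/>

  <!-- pass-through all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Condenser: matches a <value> only when the first @oid
       indexed under its text is not its own @oid, i.e. when an
       earlier element carries the same text -->
  <xsl:template match="value[key('oid-by-value', .)[1] except @oid]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- point at the first element with the same value -->
      <xsl:attribute name="refoid"
        select="key('oid-by-value', .)[1]"/>
      <!-- skip content -->
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The point of the key is that each <value> does one indexed lookup instead of rescanning everything before it. Note that empty <value> elements would also be treated as duplicates of the first empty one, which matches what the preceding::value version does.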

Cheers,
Wendell


At 11:54 AM 10/16/2008, you wrote:
Hello experts,

The task is to remove duplicate text content before moving an XML file
into translation. After the translation, the former duplicate content
should be recreated.

Assume this input XML (I dropped a lot of attributes):

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a">Lasttrennschalter</value>
</Doc>

The third <value> should be empty because its content is identical to
the first, but we need a pointer to that first element to be able to
recreate the content after translation. Also, all original attributes
must stay unchanged. Therefore in each duplicate I insert an extra
attribute @refoid with the @oid of the source element. So I get this:

<Doc>
<value oid="40068">Lasttrennschalter</value>
<value oid="40069">Umbau von N12 auf N4</value>
<value oid="4006a" refoid="40068"/>
</Doc>

My XSL is very simple and works as intended, but it does not scale
very well, I guess because I look at preceding::value so many times:

<!-- Condenser: modify all duplicates -->
<xsl:template match="value[.=preceding::value]">
 <xsl:copy>
   <xsl:apply-templates select="@*"/>
   <xsl:attribute name="refoid"
     select="preceding::value[.=current()][last()]/@oid"/>
   <!-- skip content -->
 </xsl:copy>
</xsl:template>

<!-- pass-through all nodes and attributes -->
<xsl:template match="@*|node()">
 <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
 </xsl:copy>
</xsl:template>

I guess a cleverly constructed key could help a lot... any pointers are
very welcome!
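
The reverse step mentioned at the top (recreating the duplicate content after translation) could be sketched with a key as well. This is only an illustration under stated assumptions: the key name 'value-by-oid' is made up here, @oid values are assumed to be unique, and the translated file is assumed to still carry the @oid and @refoid attributes unchanged.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- hypothetical key: look up a <value> element by its @oid -->
  <xsl:key name="value-by-oid" match="value" use="@oid"/>

  <!-- pass-through all nodes and attributes -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Expander: refill each emptied duplicate from its source -->
  <xsl:template match="value[@refoid]">
    <xsl:copy>
      <!-- keep the original attributes, drop the helper @refoid -->
      <xsl:apply-templates select="@* except @refoid"/>
      <!-- copy the (translated) content of the referenced element -->
      <xsl:apply-templates
        select="key('value-by-oid', @refoid)/node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the condensed example above, the third <value> should get back the content of the element with oid="40068" and lose its refoid attribute.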


--
_______________________________________________________________
Michael Müller-Hillebrand: Dokumentations-Technologie
Adobe Certified Expert, FrameMaker
Solutions and training, FrameScript, XML/XSL, Unicode
<http://cap-studio.de/> -- Tel. +49 (9131) 28747
