Re: [xsl] a weird bug today, tree seems to change mid transform

Subject: Re: [xsl] a weird bug today, tree seems to change mid transform
From: "bryan rasmussen" <rasmussen.bryan@xxxxxxxxx>
Date: Sat, 8 Sep 2007 13:17:04 +0200
> I was reading through your story and as far as I can tell, there is no
> bug that causes danish text to be translated automagically into english
> text ;)
>
I don't know of one either. However, the application as a whole does
this kind of translation. I don't have access to the code that does it,
but it seems evident that it is somehow tampering with the DOM during
the transform. That does not seem like something anyone would want to
do, so I suppose it is a bug.
> Though I was wondering whether you'd already tried the following things
> to narrow your problem down a bit:
>
>   1. Use a copy template to find out what the source really looks like
> when it is processed by the XSLT

In the second post I mentioned that I did:
<xsl:copy-of select="self::*"/>

and got
<x x="Andet">...

Do you mean I should have used a mode, like apply-templates
select="self::*" mode="copy", and then copied the structure out in the
mode template? If that gave a different result from copy-of select, I
would still assume there was something messed up going on.
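
A mode-based copy of that kind could be sketched as follows. This is only a minimal XSLT 1.0 sketch: the mode name "copy" comes from the question above, and the stylesheet wrapper is assumed rather than taken from the real application.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hand the current x element over to the copying mode. -->
  <xsl:template match="x">
    <xsl:apply-templates select="self::*" mode="copy"/>
  </xsl:template>

  <!-- Rebuild each node one level at a time, recursing through
       attributes and children, instead of one atomic copy-of. -->
  <xsl:template match="@* | node()" mode="copy">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="copy"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

On a well-behaved processor this should produce the same output as xsl:copy-of select="self::*", so any difference between the two would itself be a symptom.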


>   2. If you can't do (1) because it is in the midst of a larger process,
> consider adding a node that is not recognized by the rest of the process
> and copy the input document into it for analysis (it should give 'more').

The transformation was incredibly simple. There was no node-set, there
were no includes or imports, no document function, nothing external to
the transform. It basically transformed an x element into a button with
the value of the x attribute as the button text.


>   3. You seem to get your different outputs based on changes in the
> matching template match attribute (your follow up post).
no, in the follow up post my examples:


<xsl:variable name="caption"
select="translate(@x,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')"/>
<xsl:variable name="capt2" select="translate(@x,'A','p')"/>

x=<xsl:value-of select="@x"/>
caption=<xsl:value-of select="$caption"/>
capt2=<xsl:value-of select="$capt2"/>
<xsl:copy-of select="self::*"/>

I get
x=Andet
caption=more
capt2=More <-- can things get any weirder!!?
<x x="Andet">..</x>

were inside a template matching x. so
<xsl:template match="x">

<xsl:variable name="caption"
select="translate(@x,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')"/>
<xsl:variable name="capt2" select="translate(@x,'A','p')"/>

x=<xsl:value-of select="@x"/>
caption=<xsl:value-of select="$caption"/>
capt2=<xsl:value-of select="$capt2"/>
<xsl:copy-of select="self::*"/>


</xsl:template>


I get
x=Andet
caption=more
capt2=More
<x x="Andet">..</x>

If I copy the XML out, the x element and run it on it I get

x=Andet
caption=andet
capt2=pndet
<x x="Andet">...</x>

Basically there is something messing with the transformation. Because
I don't currently have access to the higher-level code, I can't figure
out whether it is something in the code (and I have a hard time figuring
out how you could force this kind of thing to happen) or a bug in the
processor. So I was hoping someone would have had a similar experience
and be able to say, oh yeah, if there is a mismatch between object X and
blah blah blah then this can happen.


> Does the
> stylesheet use xsl:import or xsl:include? Are there other places that
> may receive a higher priority? In essence: I think your @x with
> different outputs does not come from different calls to translate(), but
> from different stuff in your input XML (which may very well be modified
> on the fly).
there are no differences in priority.

Okay, the input XML is dynamically generated, but if they modify
attributes on the fly through the DOM, will this affect the
transformation mid-transform? I've altered both input and XSL DOMs and
never experienced an effect mid-transform. I would think this was a bug
in the processor, or can one in fact structure the code in such a way
that it causes this? The application is written on the server in ASP,
in case you or anyone can think of an example that does this kind of
restructuring mid-transform.

>To find this out, change your xsl:value-of to contain the
> generate-id(@id) as well,

good idea
> plus an xsl:copy-of the attribute and place
> the translated attribute in a new one.
>
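
The diagnostic suggested above could be sketched like this. It is a minimal sketch under assumptions: it uses @x rather than @id, since x is the attribute in question, and the wrapper element name "probe" and the attribute name "x-lower" are made up for illustration.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="x">
    <!-- Identity of the attribute node the processor is looking at. -->
    id=<xsl:value-of select="generate-id(@x)"/>
    <!-- "probe" and "x-lower" are hypothetical names. The original
         attribute is copied untouched and the translated value is
         placed next to it so the two can be compared directly. -->
    <probe>
      <xsl:copy-of select="@x"/>
      <xsl:attribute name="x-lower">
        <xsl:value-of select="translate(@x,
            'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
            'abcdefghijklmnopqrstuvwxyz')"/>
      </xsl:attribute>
    </probe>
  </xsl:template>

</xsl:stylesheet>
```

If the copied x attribute says Andet while x-lower says more, the discrepancy is happening inside a single template instantiation on a single node.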
> In addition, you may want to try two more things: do a complete copy of
> the source again, in three ways:
>
> 1. use copy-of (that really should not impose extra processing, the
> operation is atomic)
> 2. use copy for shallow copy
> 3. use copy and translate(., 'ABC...', 'abc....') on *every* attribute,
> name etc:
>
> <!-- $up and $down hold the upper- and lower-case alphabets -->
> <xsl:template match="*">
>     <xsl:element name="{translate(name(), $up, $down)}">
>        <xsl:apply-templates select="node() | @*" />
>     </xsl:element>
> </xsl:template>
>
> <xsl:template match="@*">
>     <xsl:attribute name="{translate(name(), $up, $down)}">
>        <xsl:value-of select="." />
>     </xsl:attribute>
> </xsl:template>
>
> <xsl:template match="text()">
>     <xsl:value-of select="translate(., $up, $down)" />
> </xsl:template>
>

good
>
> If the only node that gets translated into English is now the Andet
> node, I'd be surprised. If everything gets translated into English, I'd
> like to buy your product: very handy haha. Hopefully it will show you
> some pointers....
>
Actually in the original post I think I mentioned that every @x got
translated from the Danish to English.


> It is not a solution, I know. It sounds as if the default extension
> prefix for functions is set to an empty string (which is not possible)
> and that translate() is also an extension function.

Yeah.
> You can test this by
> adding your own version of translate() with the proper extension in an
> msxml script block, i.e. with C# or VBScript code. It shouldn't be too
> hard and at least (even if you do not have the problem solved) you can
> continue working using your own my:translate(., $up, $down) (and if you
> are busy with it, just use the regex engine of VBScript/JScript, it will
> make your templates easier).
>
>
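
A script-block replacement along those lines might look like the sketch below. It assumes an MSXML version that supports msxsl:script, which may not hold for the processor in this application. The prefix "my", its namespace URI, and the function name "lower" are hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    xmlns:my="urn:example:my-functions"
    exclude-result-prefixes="msxsl my">

  <!-- Hypothetical helper: lower-cases a string in JScript,
       bypassing the built-in XPath translate() entirely. -->
  <msxsl:script language="JScript" implements-prefix="my"><![CDATA[
    function lower(s) {
      return String(s).toLowerCase();
    }
  ]]></msxsl:script>

  <xsl:template match="x">
    <!-- string() makes sure the script receives a plain string
         rather than a node list. -->
    builtin=<xsl:value-of select="translate(@x,
        'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
        'abcdefghijklmnopqrstuvwxyz')"/>
    script=<xsl:value-of select="my:lower(string(@x))"/>
  </xsl:template>

</xsl:stylesheet>
```

If builtin= shows the English word while script= shows the lower-cased Danish one, that would point at translate() itself rather than at the input tree.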
Well, actually my part of the project got done yesterday. I solved the
problem with a hack based on the number of preceding-sibling::x, but I
didn't like that. The system is so fragile as it is that it doesn't
need more bits of string and gum holding it together. I wrote up a
report describing the problem, providing the testing code to replicate
it, and giving reasons for thinking it might be a bug either in the
processor, the application, etc. I may be called back again later
(highly probable), and I would like to be able to make a follow-up
suggestion or report on it. Actually your suggestions here are pretty
good. I can send in another XSL-T with them to do further testing. It
may be that whatever causes this phenomenon is replicated in other
parts of the system; there is a lot of use of XSL when rendering HTML
fragments for eventual assembly into actual pages.
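
For the record, a preceding-sibling hack of the kind mentioned above could look roughly like this. It is a sketch under assumptions, not the code that went into the application: it assumes the unwanted element is the third x among its siblings, as in the simplified tree from the original post, and the button markup is only a guess at the real output.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Suppress the third x by position, not by attribute value.
       The predicate gives this rule a higher default priority
       than the plain match="x" rule below. -->
  <xsl:template match="x[count(preceding-sibling::x) = 2]"/>

  <!-- Every other x becomes a button labelled with @x
       (assumed markup). -->
  <xsl:template match="x">
    <button><xsl:value-of select="@x"/></button>
  </xsl:template>

</xsl:stylesheet>
```

It never compares against the attribute value, so it sidesteps the translation problem, but it breaks as soon as the order or number of x elements changes.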

Cheers,
Bryan Rasmussen
>
> bryan rasmussen wrote:
> > Hi,
> >
> > I had to edit some parts of a big old legacy application today. I had
> > access to its XSLs. Actually the XSLs were in wd-xsl but I had
> > permission to change it to XSL-T (1.0). So I did. One thing that might
> > be pertinent to this story (I am not certain) is that the xsl-t
> > processor  used was msxml 2 (because of the wd-xsl) I did not have
> > permission to change this.
> >
> > I had three elements one of which had to not be processed: the tree
> > was like the following
> >
> > <x x="Kend"></x>
> > <x x="Personer"></x>
> > <x x="Andet"></x>
> >
> > it was the x where the attribute x= Andet that was not to be
> > processed. I did a bunch of stuff, I did:
> >
> > <xsl:template match="x[@x='Andet']"/>
> >
> > I tried choose/when, I tried all sorts of stuff. Andet kept being
> > processed. I copied out the XML. It came out as shown (well, actually
> > the above is a simplification, but only in not having longer names and
> > attributes).
> >
> > I was really freaked out. I checked with contains(@x,'Andet'):
> > no, it does not contain Andet. I checked with contains(@x,'A'): yes,
> > it does. Actually all of them do, even the ones where the attribute x
> > does not have an A in it.
> >
> > so at this point I do all sorts of checks, among which I do
> > translate(@x,'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz')
> > and put it in a variable x.
> >
> > then I check does $x ='andet' no it does not. I output the value of
> > $x. The value of $x = more.
> > Now this is where things get weird.
> >
> > you see I know this application has some sort of XML based translation
> > between languages. You know the sort of thing, an XML document with
> > the words that are meaningful in different languages and dependent on
> > which language the application is configured to all buttons and so
> > forth come out with the pertinent text. Andet, which is Danish, is
> > More in English. Furthermore, all the other values of x, which I was
> > not able to match either, have their English values when the XPath
> > translate function is run over them.
> >
> > So I test. If I write
> > <xsl:value-of select="@x"/> it equals Andet.
> > If I translate it using xpath translate it equals more.
> >
> >
> > The XSL-T does not have any msxml extensions in it. There are no
> > included or imported stylesheets to mess with anything. It is
> > basically just that the value of the attribute x seems to be slightly
> > more unstable than one is used to.
> >
> > Now if it was in wd-xsl I could understand because there was a bug
> > associated with that so many years ago whereby one could alter the
> > tree during the transformation. But I am not aware of any such bug for
> > XSL-T even in such an old version of MSXML. It seems most likely that
> > the XML is being altered somewhere at the higher server level, but I
> > am not aware of any way that one can alter the XML mid-transformation
> > in any of the APIs available under MSXML. I don't really have access
> > to the server side but it may be that I will get it when the bug
> > report goes back to the central office. But what I am wondering, has
> > anyone ever seen a similar error, is anyone aware of a bug in earlier
> > versions of MSXML that can cause this?
> >
> > Believe me, this does not require the XML or the code. you cannot
> > replicate the error with the code and the extracted XML. If I copy the
> > XML into the output and then save it and run my transform everything
> > works as it should. (I ended up by having to remove the x with x=
> > Andet by counting the preceding siblings. an atrocious hack)
> >
> > Cheers,
> > Bryan Rasmussen
