Subject: Aw: Re: [xsl] How to copy attribute value to text? (Suspected bug involving supplementary characters)
From: "Martin Honnen martin.honnen@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 7 Jul 2016 20:22:13 -0000
I think you can file problems at https://saxonica.plan.io/projects/saxon/issues, but make sure you mention the Java version and the way you use Saxon (command line, API).

--
This message was sent from my Android mobile phone with GMX Mail.

On 07.07.2016, 20:54, "Kenneth Reid Beesley krbeesley@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

From: Kenneth Reid Beesley <krbeesley@xxxxxxxxx>
Subject: Re: [XSL-List: The Open Forum on XSL] Digest for 2016-07-06
Date: July 7, 2016 at 12:43:54 PM EDT
To: "XSL-List: The Open Forum on XSL" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>

Many thanks to Martin Honnen for his response below. I add more comments below (suspected bug in Saxon).

On 7Jul2016, at 05:28, XSL-List: The Open Forum on XSL <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

From: Martin Honnen <martin.honnen@xxxxxx>
Subject: Re: [xsl] How to copy attribute value to text?
Date: 7 July 2016 at 00:43:37 MDT
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx

On 07.07.2016 07:22, Kenneth Reid Beesley krbeesley@xxxxxxxxx wrote:

If I start with an input XML document that contains mixed text with <word> elements like this:

    … this is just <word correction="too">to</word> funny

I’d like to write an XSLT stylesheet that yields as output

    … this is just <word origerror="to">too</word> funny

So in the output I effectively want (in the same <word> element) to

1. Set the value of a new attribute to the original text() value, and
2. Reset the text() value to be the value of the original @correction attribute

I’ve tried many variants of the following, so far without success. I’m using SaxonHE9-7-0-6J; it runs, but the results are not as expected/hoped. I’ve tried matching the text() in a separate template, but I can’t seem to reference the attribute values of the parent node (i.e., <word>) of the text() and the parent node’s attributes. E.g., the following doesn’t work for me, failing somehow in the select="../@correction" reference.

    <xsl:template match="word[@correction]/text()">
      <xsl:value-of select="../@correction"/>
    </xsl:template>

You can use

    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:template>

    <xsl:template match="word[@correction]/text()">
      <xsl:value-of select="../@correction"/>
    </xsl:template>

    <xsl:template match="word/@correction">
      <xsl:attribute name="origerror" select=".."/>
    </xsl:template>

Your solution looks perfect and appears to work perfectly for ASCII-based XML input examples like the following

    <?xml version="1.0" encoding="UTF-8"?>
    <foo>
      <bar>this is just <word correction="too">to</word> funny</bar>
    </foo>

yielding the correct/desired output

    <?xml version="1.0" encoding="UTF-8"?>
    <foo>
      <bar>this is just <word origerror="to">too</word> funny</bar>
    </foo>

I now see that some of my own attempts also worked, on the same ASCII-based example.

***** Suspected bug involving supplementary characters *****

But my real task involves an input XML document, in UTF-8 encoding, that consists of Deseret Alphabet characters, which are encoded in the supplementary area. In such a case, the resulting text content in the <word> element, copied from an original attribute value, is corrupted. I saw such corruption in my own attempts, and couldn’t understand what was happening.

Using the following input document (the Deseret Alphabet characters may not display correctly for you)

    <?xml version="1.0" encoding="UTF-8"?>
    <foo>
      <bar>pp.p p.p p>p2pp; <word correction="p;p">pp/p p.</word> pp2pp.</bar>
    </foo>

the output, using your script, is corrupted. The text() value in the output is not the same as the original @correction value. Extra characters (just one in this case) are inserted. The longer the original attribute value, the more extra characters are inserted.

    <?xml version="1.0" encoding="UTF-8"?>
    <foo>
      <bar>pp.p p.p p>p2pp; <word origerror="pp/p p.">p;p;p</word> pp2pp.</bar>
    </foo>

This kind of corruption is exactly what I was seeing using my own scripts, leading me to bother the group. I suspect a bug in the XSLT engine involving supplementary characters. Again, I’m using SaxonHE9-7-0-6J. What’s my next step?

Thanks,

Ken

********************************
Kenneth R. Beesley, D.Phil.
PO Box 540475
North Salt Lake UT 84054
USA
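
For anyone reproducing the problem for a bug report, it may help to attach a self-contained stylesheet. The following is only a minimal sketch: it wraps the templates quoted above in an xsl:stylesheet element, assuming XSLT 2.0 (the select attribute on xsl:attribute is not available in 1.0). The xsl:message instruction is an addition for diagnosis and was not part of the original answer; it prints the Unicode code points of @correction so they can be compared with what ends up in the output.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Replace the text of a corrected word with the value of @correction. -->
  <xsl:template match="word[@correction]/text()">
    <!-- Diagnostic only (assumption: not in the original answer).
         Lists the code points of the attribute value, space-separated. -->
    <xsl:message>
      <xsl:text>correction code points: </xsl:text>
      <xsl:value-of select="string-to-codepoints(../@correction)"/>
    </xsl:message>
    <xsl:value-of select="../@correction"/>
  </xsl:template>

  <!-- Turn @correction into @origerror holding the original text of <word>. -->
  <xsl:template match="word/@correction">
    <xsl:attribute name="origerror" select=".."/>
  </xsl:template>

</xsl:stylesheet>
```

If the code points reported by the message are the expected supplementary values while the serialized text still contains extra characters, that would suggest the problem lies in how the result is written out rather than in the input; either way, the report should state the Java version and whether Saxon was run from the command line or through the API.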