Subject: Re: [xsl] Character entities in attribute values
From: mark_fletcher@xxxxxxxxxxxxxx
Date: Wed, 23 Apr 2003 10:00:38 -0700

Hi Mike (and others who have responded),

First, I've found and fixed the problem. I'm using Arbortext's E3
product to do my processing and there was an instruction in their
internal code to write out non-ASCII characters as numeric character
references. So, that's how the accented unicode characters in the tag
attributes became character references. Once I fixed that problem, the
HTML output was fine, as there were no ampersands in any of the
attribute values.

However, it still sounds like you're all saying that even when a
character reference does exist in an attribute value, I should not be
seeing escaped ampersands when that attribute value is output as text.

Well, if anyone's interested (and I'm not sure why you would be, at
this point ;-) here's a sample of my previous input and output data and
my xsl code that demonstrates the problem I was having:

source xml tag:

    <xref linkend="i090f42a68009c2c9"
          book_code="cmkt"
          book_title="Guide Marketing du système GRC de PeopleSoft, version 8.8"
          chapter_title="Définition des entités de l'application Marketing de PeopleSoft"
          XREF_type="3"
          target_title="Définition des entités de l'application Marketing de PeopleSoft"
          chapter_type="Chapitre"
          file_name="cmkt03.htm"/>

xsl template for this element:

    <xsl:template name="xref">
      <A HREF="../../{@book_code}/htm/{@file_name}#{@linkend}"><xsl:value-of select="@target_title"/></A>
    </xsl:template>

html output:

    <A HREF="../../cmkt/htm/cmkt03.htm#i090f42a68009c2c9">D&#xe9;finition des entit&#xe9;s de l'application Marketing de PeopleSoft</A>

Mark Fletcher
PeopleSoft Language Engineering
925.694.3753
mark_fletcher@xxxxxxxxxxxxxx


"Mike Brown" <mike@xxxxxxxx>
Sent by: owner-xsl-list@xxxxxxxxxxx
04/23/2003 06:05 AM
Please respond to xsl-list

To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
cc:
Subject: Re: [xsl] Character entities in attribute values


mark_fletcher@xxxxxxxxxxxxxx wrote:
> the output text looks something like this: &eacute; instead of this:
> é

First please realize that when
you output XML or HTML, the XSLT processor is (effectively, not
necessarily) running a node tree through a serializer, and the
serializer is what is escaping "&" and "<" and certain other characters
appearing in places where they would otherwise be confused with markup.

If you're getting &eacute; in the output, then you must have put the 8
characters "&" "e" "a" "c" "u" "t" "e" ";" into an attribute node (or
text node, but you mentioned attribute) in your result tree, perhaps by
copying this text from the source tree. Since you told the processor
you wanted the *node* to contain those 8 characters, rather than one
entity reference, it serialized the node in such a way that you'd get
the characters when the output document is parsed. In other words, it
preserved the semantics of the data, clearly distinguishing between
character data and the structures implied by markup.

Given that the XML parser feeding parsed data to the XSLT processor
would have interpreted "&eacute;" in your original source document as a
reference to the entity named eacute, there's no way the 8 characters
could have ended up in your source tree unless you did one of the
following:

- explicitly constructed that string in your stylesheet
- copied text that was originally written like &amp;eacute;
- copied text that was originally written like <![CDATA[&eacute;]]>

Both of the latter two mean exactly the same thing, and since the most
common FAQ and misconception on this list (well, one of the most
common) concerns the mistaken assumptions people make about what CDATA
sections are, I'm going to guess that whoever made your XML decided to
try to use it as a transport for entity-laden, non-well-formed HTML,
saying that this data is just text, not markup. Then you tried to use
XSLT to copy it through, and were surprised to see that you can't use
XSLT to pretend character data is actually markup.

However, as others have mentioned, this is just a wild guess. Explain
more about what you're doing, with sample code (brief).
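
The parser/serializer distinction described above can be checked with a
small sketch. It assumes Python's standard xml.etree.ElementTree module
in place of an XSLT processor, purely so the example is self-contained;
the element name "a" and the attribute name "title" are hypothetical.
It only illustrates the general XML behaviour, not what Arbortext E3 or
any particular XSLT processor does internally.

```python
import xml.etree.ElementTree as ET

# Three spellings in a source document ("a" and "title" are made-up names).
char_ref = ET.fromstring('<a title="D&#xe9;finition"/>')
escaped_amp = ET.fromstring('<a title="&amp;eacute;"/>')
cdata = ET.fromstring('<a><![CDATA[&eacute;]]></a>')

# The parser resolves a numeric character reference: the tree holds the
# single character U+00E9, not the text "&#xe9;".
ref_value = char_ref.get("title")

# An escaped ampersand and a CDATA section both put the same 8 literal
# characters into the tree: & e a c u t e ;
amp_value = escaped_amp.get("title")
cdata_value = cdata.text

# The serializer escapes "&" again, so that parsing the output gives
# back exactly those 8 characters.
serialized = ET.tostring(escaped_amp, encoding="unicode")
serialized_cdata = ET.tostring(cdata, encoding="unicode")

print(len(ref_value))           # 10
print(len(amp_value))           # 8
print(cdata_value == amp_value) # True
print(serialized_cdata)         # <a>&amp;eacute;</a>
```

In the raw serialized output the 8 characters appear as &amp;eacute;,
which a browser then displays as the literal text &eacute; rather than
as an accented letter.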

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list