Subject: Re: [xsl] Unicode usage
From: "Jonathan Perret" <jonathan@xxxxxxxxxxxx>
Date: Fri, 25 Jan 2002 18:02:04 +0100

> I also loaded each result into Notepad on Win95. Notepad displayed the iso
> file correctly, but not the utf-8 result (it showed that "A" character with
> a little circle above it), ahead of the trademark symbol. This is what I
> was suggesting would happen. BTW, Notepad on the Win2000 computer did
> display both results correctly.

I don't see what this proves that wasn't already obvious.

Notepad on Windows 95 supports only one encoding, the one that matches the
installed code page. On a Western-language version of Windows that encoding
is generally windows-1252 (what Windows calls 'ANSI', or sometimes even
'ASCII', yuk!). Feeding it utf-8 text, regardless of the actual codepoints
used, is akin to opening a Word document with it: though some text might
appear readable, the general result is garbage.

On Windows 2000, Notepad has been upgraded to know about UTF-8, so again it's
no surprise that it can display the text correctly, given that the file
probably starts with a BOM (byte order mark) that signals it as being utf-8
encoded. Note that without the BOM, Windows 2000 Notepad would probably have
'failed' the same way as its Win95 cousin, since it would have assumed an
ANSI-encoded file.

> Summarizing, what you will see displayed for high-order characters can
> depend on the encoding, OS, and the viewing program. On older versions of
> Windows, at least, non-browsers are likely to display the wrong thing.

The fact is that what you will see is completely predictable (give or take
the odd bug). If the viewing program is not told what encoding the text is
in, it will assume one, and that assumption will quite frequently be wrong.

In the Notepad example, the OS itself has nothing to do with the issue:
Notepad/Win95 and Notepad/Win2000 are two very different programs. If you
were to take the Win95 Notepad binary and run it under Win2000, it would
behave exactly the same as under Win95. Why not try it?

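To make the Notepad behaviour described above concrete, here is a minimal Python sketch of a viewer that treats a file as utf-8 only when it starts with a BOM and otherwise falls back to the 'ANSI' code page. The function name `guess_and_decode` and the sample string are made up for illustration, and this is only an approximation of the logic I am assuming for Windows 2000 Notepad, not its actual implementation.

```python
import codecs


def guess_and_decode(data: bytes, fallback: str = "cp1252") -> str:
    """Hypothetical viewer logic: trust a UTF-8 BOM, else assume the code page."""
    if data.startswith(codecs.BOM_UTF8):
        # BOM present: strip it and decode the rest as UTF-8.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    # No BOM: assume the installed 'ANSI' code page (windows-1252 here).
    return data.decode(fallback, errors="replace")


text = "caf\u00e9 \u2122"          # e-acute and the trademark sign
utf8_bytes = text.encode("utf-8")  # what a utf-8 <xsl:output> would write

with_bom = guess_and_decode(codecs.BOM_UTF8 + utf8_bytes)
without_bom = guess_and_decode(utf8_bytes)

print(with_bom == text)      # the BOM lets the viewer pick the right encoding
print(without_bom == text)   # without it, the bytes are misread as cp1252
print(ascii(without_bom))    # shows the garbage codepoints that result
```

The same bytes give two different displays depending only on what the program assumes about the encoding, which is why the result is predictable rather than OS-dependent.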
> In fact, even on my Win2000 machine, using XML Cooktop to run and display
> the transformation gave an incorrect display (and it uses the IE activeX
> control to display the results!), so you can't be sure even on Win2000 that
> high order characters will display the intended way, depending on the app.

If XML Cooktop (which I've never used) has the same bug as XML Spy, then it
has trouble with MSXML's transformNode method, which always transforms to
UTF-16 regardless of the <xsl:output> element. That would cause what you've
been seeing.

Cheers,
--Jonathan

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list