Subject: Re: [xsl] unreadable characters from indesign
From: David Carlisle <davidc@xxxxxxxxx>
Date: Thu, 18 Jan 2007 02:01:57 GMT

> because I still don't understand how those characters end up
> in my xml

The usual cause of unexpected characters is incorrectly specified encodings.

For example, if the original document had a "PARAGRAPH SEPARATOR" character (Unicode hex 2029, decimal 8233) and that file was written using utf-8, then this character would take three bytes, with hex codes E2 80 A9, that is, decimal 226 128 169, which are the three numbers you mentioned in the original post.

If the file is correctly read as utf-8, these three bytes will make a single character, accessible using the same code, or &#x2029; for example, but if it is incorrectly read using iso-8859-1 then the three bytes will appear as three spurious characters:

> U+00E2, LATIN SMALL LETTER A WITH CIRCUMFLEX
> U+0080, control
> U+00A9, COPYRIGHT SIGN

David
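
The byte arithmetic above can be checked with a short script. This is only an illustrative sketch in Python (not something used in the thread); the variable names are made up for the example, and it assumes the stray characters came from utf-8 bytes being decoded as iso-8859-1, as described above.

```python
import unicodedata

# U+2029 PARAGRAPH SEPARATOR, the character assumed to be in the original file.
original = "\u2029"

# Writing it as utf-8 yields three bytes.
utf8_bytes = original.encode("utf-8")
hex_codes = [format(b, "02X") for b in utf8_bytes]
decimal_codes = list(utf8_bytes)
print(hex_codes, decimal_codes)

# Reading those bytes back with the right encoding gives one character.
read_as_utf8 = utf8_bytes.decode("utf-8")

# Reading them with the wrong encoding gives three spurious characters.
read_as_latin1 = utf8_bytes.decode("iso-8859-1")
for ch in read_as_latin1:
    # Control characters have no Unicode name, so supply a fallback label.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "control"))

# If the data was only mis-decoded (not altered afterwards), the damage can
# be reversed by re-encoding as iso-8859-1 and decoding as utf-8.
repaired = read_as_latin1.encode("iso-8859-1").decode("utf-8")
```

The last step only works when every byte survived the round trip unchanged; if some other tool already replaced or dropped the control character, the original cannot be recovered this way.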