Subject: Re: [xsl] where to look for xsl folk..
From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 3 Jul 2016 21:02:00 -0000
On Sun, Jul 03, 2016 at 04:13:09PM -0000, Terry Badger terry_badger@xxxxxxxxx scripsit:
> Graydon, The document.xml I have found and worked with taken from a
> .docx file always have a prolog that has encoding="UTF-8" so I have
> not worried about invalid Unicode characters and can process any text
> in Word using an xsl stylesheet. Do you have a sample where a docx
> file has non Unicode encodings?

Not on hand, and if I did, it wouldn't be my data to share.

I've hit two cases of code point 0x96 -- a codepage 1252 en dash -- in an XSLT document (which is admittedly not Word) during paid work in the last couple of weeks, though. It does happen.

It won't cause problems until something checks for UTF-8 encoding specifically, rather than the XML character set. It's entirely possible to have the whole XSLT toolchain completely happy -- as it was in that case -- and something downstream -- checking for encoding -- not happy at all.

I have certainly hit this problem with the XML versions of Office documents in the past.

Pre-XML ver 5, it was possible to trust the parser to tell if your document wasn't UTF-8 because XML's character set was a subset of UTF-8. With ver 5, that's no longer the case.

-- Graydon
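
One way to catch the stray codepage 1252 en dash described above (byte 0x96) before a downstream encoding check does is to scan the decoded text for C1 control characters (U+0080 to U+009F). The following is only a minimal Python sketch, not anything from the original toolchain; the function name find_c1_controls is hypothetical, and the codepage 1252 guess is an assumption about where such characters usually come from.

```python
import re

# C1 control range U+0080..U+009F. These code points are rarely intended in
# real text; they often mean codepage 1252 bytes were passed through as-is.
C1_CONTROLS = re.compile("[\u0080-\u009f]")


def find_c1_controls(text):
    """Return (offset, code point, cp1252 guess) for each C1 control in text.

    The guess is the character the same byte value maps to in codepage 1252,
    or None where cp1252 leaves that byte undefined.
    """
    hits = []
    for match in C1_CONTROLS.finditer(text):
        code_point = ord(match.group())
        try:
            guess = bytes([code_point]).decode("cp1252")
        except UnicodeDecodeError:
            guess = None  # byte has no assigned character in cp1252
        hits.append((match.start(), code_point, guess))
    return hits
```

For the string "a\u0096b" this reports one hit at offset 1 and suggests U+2013 (en dash) as the intended character; text that already contains a real U+2013 produces no hits.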