Subject: RE: [xsl] Entities: The worst of both worlds :-(
From: Graham Hannington <Ghannington@xxxxxxx>
Date: Fri, 10 Oct 2003 20:20:35 +0100

From Zarella's 2 Oct 2001 email:

> There is a way to process character entities, but it requires a bit of
> hacking to get the XML parser to work for you. Take your Entity declaration
> files and create a new set that you will use just for transformation
> purposes. Each entity will need to be modified to have the form:
>
> <!ENTITY tilde "<ent>&amp;tilde;</ent>">
>
> Now, this will create new <ent> elements in your XML file before it gets to
> the XSLT processor. So, in XSLT, you can now use a template rule as follows:
>
> <xsl:template match="ent">
>   <xsl:value-of disable-output-escaping="yes" select="text()"/>
> </xsl:template>

This is very, very useful to me. I can now output XHTML with entity references preserved.

But the fun's not over yet. In addition to producing English XHTML, I need to output a set of Shift-JIS encoded (Japanese) XHTML files (from Shift-JIS encoded Japanese XML source provided by the translators). I have some numeric entity references (&#nnnn;) in both my original XML source and my XSLT stylesheet, and they are causing problems with this Shift-JIS XML source.

The MSXML (v3.0) transformNodeToObject method I'm using to invoke the XSLT stylesheet works without problem against the Shift-JIS encoded Japanese XML source: I can view the resulting transformed, Shift-JIS encoded X(HT)ML. But when I call the Save method on the XML object, it saves only up to where the first &#nnnn; reference (such as  ) was, and reports an error (Err.Number = 0; Err.Description is blank). This might just be an MSXML bug, I don't know. Any ideas?

Finally, using Zarella's tip means that the resulting XML document must now refer to a doctype that defines these "preserved" entities (such as )... (when I said the output of the XSLT was XHTML, what I really meant was "something very similar to XHTML, but without the external DTD reference")... so now this whole process relies on being connected to the Web (so that the XHTML "http://..." DTD reference can be resolved by MSXML, so it's happy that the entities have definitions)... or otherwise I have to insert a local file system-specific DOCTYPE SYSTEM DTD reference, which I'd really like not to have...

Can someone throw me a lifeline, and point me in the right direction for getting MSXML to validate an XML document when the DOCTYPE refers to an "http://..." address, but you're not net-connected? (I've read that you can use the Add method to associate an XDR/XSD schema file with a schema URI... ah, boy... mebbe I should go look for the W3C XHTML XSD files... I could only find the DTD and .ENT files last time I looked... this would probably mean upgrading to v4 or ditching MSXML, 'cos v3 supports only XDR... starting to ramble now, time to hit Send.)

Pearls of wisdom gratefully accepted.

Graham Hannington

P.S. For anyone out there who wants to try Zarella's tip, here are the search'n'replace regular expressions I used (in jEdit) to hack the W3C XHTML .ent files (a trivial, truly pitiful attempt at trying to "give something back", I know):

Search expression: <!ENTITY\s*([^\s]*)[^>]*>
Replace expression: <!ENTITY $1 "<ent>&amp;$1;</ent>">

P.P.S. Off-topic (sorry), but I've seen similar queries on "appropriate" discussion groups with only tumbleweeds for answers: does anyone know how to make VBScript write Shift-JIS encoded files? (I can only "make" it do this by employing XSLT and the XML Save method... VBScript's native FileSystemObject seems limited to Unicode and ASCII... although I suspect that, on a Japanese PC, it might default to Shift-JIS instead of ASCII.)

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
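
For anyone without jEdit to hand, the search'n'replace in the P.S. can be approximated in a few lines of Python. This is only a rough sketch, not the procedure actually used above: the function name wrap_entities and the sample declarations are made up for illustration, Python writes the back-reference as \1 where jEdit uses $1, and the replacement assumes the ampersand should be escaped as &amp; so that the rewritten declaration does not refer to itself.

```python
import re

# Search expression from the P.S.: capture the entity name, then skip
# the rest of the declaration up to the closing ">".
ENTITY_DECL = re.compile(r'<!ENTITY\s*([^\s]*)[^>]*>')


def wrap_entities(ent_text: str) -> str:
    """Rewrite each entity declaration so it expands to an <ent> element.

    Hypothetical helper for illustration. The ampersand is written as
    &amp; (an assumption) so that the new replacement text contains the
    literal characters "&name;" rather than a reference to itself.
    """
    return ENTITY_DECL.sub(r'<!ENTITY \1 "<ent>&amp;\1;</ent>">', ent_text)


if __name__ == "__main__":
    # Made-up sample in the style of the W3C XHTML .ent files.
    sample = (
        '<!ENTITY nbsp   "&#160;"> <!-- no-break space -->\n'
        '<!ENTITY iexcl  "&#161;"> <!-- inverted exclamation mark -->\n'
    )
    print(wrap_entities(sample))
```

Like the original expression, this treats whatever follows "<!ENTITY" as the entity name, so a parameter entity declaration (one whose name is preceded by "%") would need separate handling.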