Subject: Re: [xsl] Selective escaping of special characters
From: "Thomas B. Passin" <tpassin@xxxxxxxxxxxx>
Date: Wed, 13 Mar 2002 10:25:59 -0500
[Kyrre Wathne]

> My apologies if this question has been asked before, I haven't found posts
> that address this exact issue.
>
> My problem is that I want to transform junk HTML generated by Microsoft
> Word. This contains markup, of course, so my first instinct was to use
> disable-output-escaping. However, this also disables escaping of other
> special characters, like the special dash character –. These are then
> output in a format my browser (Internet Explorer) doesn't understand (I
> use "ISO-8859-1" as encoding in output).
>

Not exactly what you asked for, but HTML-Tidy has a setting that causes it to
remove all the Microsoft junk from Word 2000 output. There are Java and C
versions, with various wrappers, including one for Python. One fast
preprocessing pass with Tidy will do a really nice job of getting rid of all
that noise, and it is much easier than trying to get a stylesheet working.

Cheers,

Tom P

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
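For readers who want to try the Tidy preprocessing pass suggested above, here is a minimal sketch in Python that shells out to the command-line `tidy` program. It assumes `tidy` is installed and on the PATH. The function names and file names are hypothetical, and the particular mix of options (`word-2000`, `output-xhtml`, `numeric-entities`, `char-encoding`) is one plausible choice for this problem, not a setup taken from the original post.

```python
import subprocess


def tidy_command(input_path, output_path, tidy_exe="tidy"):
    """Build a Tidy command line for cleaning up Word-generated HTML.

    All option choices here are assumptions for illustration.
    """
    return [
        tidy_exe,
        "-q",                          # suppress non-essential output
        "--word-2000", "yes",          # strip Word 2000 specific markup
        "--output-xhtml", "yes",       # well-formed output an XSLT processor can parse
        "--numeric-entities", "yes",   # write entities in numeric form, not named
        "--char-encoding", "latin1",   # ISO-8859-1, as in the question
        "-o", output_path,
        input_path,
    ]


def run_tidy(input_path, output_path):
    """Run Tidy once over input_path, writing the cleaned file to output_path."""
    result = subprocess.run(tidy_command(input_path, output_path))
    # Tidy exits with 0 when clean, 1 when there were warnings, 2 on errors.
    if result.returncode >= 2:
        raise RuntimeError("tidy reported errors for " + input_path)
    return result.returncode


if __name__ == "__main__":
    # Hypothetical file names.
    run_tidy("word-export.html", "word-export-clean.xhtml")
```

The cleaned XHTML can then be fed to the stylesheet as ordinary XML input, which should avoid the need for disable-output-escaping on the markup itself.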