Subject: RE: [xsl] resolve html entities
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 31 Oct 2005 09:11:08 -0000
I would suggest parsing the HTML using John Cowan's TagSoup parser. This looks to the XSLT processor just like an XML parser, so you can probably integrate it directly - depending on the XSLT processor that you are using.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Maximilian Gärber [mailto:max@xxxxxxxxxx]
> Sent: 31 October 2005 08:40
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] resolve html entities
>
> Hi,
>
> I know this is a common question but I could not find a specific answer to this:
>
> I am exporting texts from a database that contains html markup. Now I need to transform the html to something usable in a DTP application.
>
> The tags are not the problem because I am only allowing a subset of html but the html entities (german umlauts, special characters) would need to be transformed to plain Unicode (UTF-8) characters.
>
> What is the best way to achieve this?
>
> Thanks,
>
> Max Gaerber
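
To make the underlying idea concrete (parse the HTML with an HTML-aware parser so that entity references arrive as real Unicode characters, then hand well-formed XML to the stylesheet), here is a minimal sketch. It is not TagSoup and not the method recommended in the reply: it swaps in Python's standard `html.parser` module purely for illustration. The names `HtmlToXml` and `resolve_entities` are hypothetical, and the sketch assumes the input is already a well-balanced subset of HTML, as the original question describes. Unlike TagSoup, it does not repair unclosed or misnested tags.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class HtmlToXml(HTMLParser):
    """Re-emit an HTML fragment as XML text with HTML entities resolved."""

    def __init__(self):
        # convert_charrefs=True makes the parser turn &uuml;, &#228;, etc.
        # into Unicode characters before handle_data() is called.
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attribute values are re-quoted so the output stays valid XML.
        rendered = "".join(f" {name}={quoteattr(value or '')}" for name, value in attrs)
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Only &, < and > are escaped again; everything else stays
        # as plain Unicode text.
        self.parts.append(escape(data))


def resolve_entities(fragment):
    """Return the fragment with HTML entities replaced by Unicode characters."""
    parser = HtmlToXml()
    parser.feed(fragment)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.parts)


if __name__ == "__main__":
    print(resolve_entities("<p>M&uuml;ller &amp; S&ouml;hne</p>"))
```

Writing the returned string to a file encoded as UTF-8 gives input that an XML parser, and therefore an XSLT processor, can read without needing the HTML entity declarations.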