Re: Parsing errors on unknown entities (unicode characters)

Subject: Re: Parsing errors on unknown entities (unicode characters)
From: Michael Laing <mpl@xxxxxxxx>
Date: Wed, 24 Nov 1999 18:12:15 -0500
> Tangi Vass wrote:
> 
> Hi,
> 
> I've got an XML file, built from the result of a request on a search
> engine (via a private API), that may contain weird Unicode entities
> (such as &laqno;). Of course the parser crashes because my DTD only
> contains the most usual Unicode entities.
> 
> Has anyone a smarter idea than building a DTD with all Unicodes?
> 
> Tangi

Hmm - &laqno; is probably a misspelling of &laquo; (U+00AB, the
left-pointing double angle quotation mark). I would suggest using
Sebastian Rahtz's unicode.xml, a very nice and complete base of
information about Unicode characters. Using XSLT, one can easily
extract the character entities from it to build a DTD, and it holds
lots of other useful info as well.
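
A minimal sketch of that extraction, written in Python as a stand-in
for the XSLT stylesheet. The element and attribute names below
(character, entity, dec, id) and the SAMPLE excerpt are assumptions
made for illustration; the actual layout of unicode.xml may differ,
so adjust the names to match your copy.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt: the real unicode.xml layout may differ.
SAMPLE = """<unicode>
  <character id="U000AB" dec="171">
    <entity id="laquo"/>
  </character>
  <character id="U000BB" dec="187">
    <entity id="raquo"/>
  </character>
  <character id="U00026" dec="38">
    <entity id="amp"/>
  </character>
  <character id="U00041" dec="65"/>
</unicode>"""

# XML predefines these five; redeclaring them needs extra escaping,
# so this sketch simply skips them.
PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}


def entity_declarations(xml_text):
    """Return one DTD <!ENTITY ...> declaration per named character."""
    root = ET.fromstring(xml_text)
    declarations = []
    for char in root.iter("character"):
        code = int(char.get("dec"))
        # A character may carry zero, one, or several entity names.
        for ent in char.findall("entity"):
            name = ent.get("id")
            if name in PREDEFINED:
                continue
            declarations.append('<!ENTITY %s "&#%d;">' % (name, code))
    return declarations


if __name__ == "__main__":
    print("\n".join(entity_declarations(SAMPLE)))
```

The printed declarations can be saved to a file and pulled into the
DTD as an external parameter entity, so the parser recognizes every
name the source file lists.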

I got my copy with jadetex, but it is lying around the internet in a few
places. I can mail it to you if you wish or chase down a link if it
would be of general interest.

ml


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

