Subject: Re: & in SGML vs XML
From: "Christopher R. Maden" <crism@xxxxxxxxxx>
Date: Sun, 05 Nov 2000 23:47:35 -0800
I have another tricky &-related question:
I have SGML documents which can easily be converted to XML just by exchanging the declaration in the first line(s), except that they contain stand-alone &'s, as in <line>you & me</line>.
This is legal in SGML, but XML parsers and XT do not accept it. Is there a way of getting this right other than by string replacement (& -> &amp;)? (Which is tricky because "real" entity references, like the one for Č, must not be destroyed.) James Clark's sx does it all right, but I'd prefer a Java solution (ideally, one line of declaration either in the stylesheets or the XML).
In other words: is there a way of treating a document containing <line>you & me</line> as XML?
s/&\([^a-zA-Z]\)/\&\1/g # ampersand followed by innocuous character # is replaced by & and character s/&$/\&/ # ampersand at end of line is replaced by # &
-Chris
-- 
Christopher R. Maden, Senior XML Analyst, Lexica LLC
222 Kearny St., Ste. 202, San Francisco, CA 94108-4510
+1.415.901.3631 tel./+1.415.477.3619 fax
<URL:http://www.lexica.net/> <URL:http://www.oreilly.com/%7Ecrism/>