Subject: Re: [xsl] Parsing illegal characters out of XML before XSL:FO transform?
From: "J.Pietschmann" <j3322ptm@xxxxxxxx>
Date: Fri, 19 Sep 2003 22:29:02 +0200
...Hey all.. a portion of the website we are building takes some XML data and transforms it with an XSL stylesheet, giving FO code, which is then transformed into PDF.. the current problem at hand is that I'm taking a bunch of our Java objects and making an XML document out of them by hand, something like this:

  xmlData.append("<?xml version='1.0' encoding='UTF-8'?><USERLIST><STATE>");
  xmlData.append("<SYSTEM>" + flexReference.getPlatform() + "</SYSTEM>");
  xmlData.append("<MEMBERS>" + flexReference.getMembers() + "</MEMBERS>");
  xmlData.append("<ASSETS>" + flexReference.getAssets() + "</ASSETS>");
Anyway, this is all pulled from our database, and some of the values that will be inserted contain characters that are illegal in XML, e.g. an &, etc.. Is there a parser out there that can parse a Java String or StringBuffer and add the proper escape codes needed for the XML to be well formed, so that it can then be transformed correctly?
It depends somewhat on whether your database entries may also contain markup. If they don't, either fix the strings using a Java function which quotes & and < (and perhaps filters control characters):

  xmlData.append("<SYSTEM>");
  xmlData.append(xmlQuote(flexReference.getPlatform()));
  xmlData.append("</SYSTEM>");

or use a Java API, somewhat vaguely like

  SAXTransformerFactory factory =
      (SAXTransformerFactory) TransformerFactory.newInstance();
  TransformerHandler transformer = factory.newTransformerHandler(
      new StreamSource(new File(xslFileName)));
  // call transformer.setResult(...) here so the FO output has somewhere to go
  Attributes attr = new AttributesImpl();
  transformer.startDocument();
  transformer.startElement("", "USERLIST", "USERLIST", attr);
  transformer.startElement("", "STATE", "STATE", attr);
  transformer.startElement("", "SYSTEM", "SYSTEM", attr);
  char[] c = flexReference.getPlatform().toCharArray();
  transformer.characters(c, 0, c.length);
  transformer.endElement("", "SYSTEM", "SYSTEM");
  ...

Another possibility is to build a DOM. Using SAX is often the most efficient way, YMMV. You don't need to quote special characters in the API, actually you must not (think about it): characters() takes the text literally, so a pre-quoted "&amp;" would reach the stylesheet as those five characters.
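For illustration, here is a rough sketch of both routes in one place. The post only names xmlQuote, so the body below is an assumption about what such a helper might look like; the class name XmlEscapeSketch and the method systemElementViaSax are likewise made up for the example. The SAX half uses an identity TransformerHandler (no stylesheet) and writes to a string, purely so the escaping done by the serializer is visible. Surrogate pairs and the U+FFFE/U+FFFF code points are not checked here.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

class XmlEscapeSketch {

    /** Hypothetical xmlQuote: escapes &, < and >, and drops forbidden control characters. */
    static String xmlQuote(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:
                    // Below U+0020, XML 1.0 permits only tab, LF and CR.
                    if (c >= 0x20 || c == '\t' || c == '\n' || c == '\r') {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    /** SAX route: pass the raw text to characters() and let the serializer escape it. */
    static String systemElementViaSax(String platform) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            // No stylesheet argument: this handler performs an identity transform.
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                   .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter sink = new StringWriter();
            handler.setResult(new StreamResult(sink));

            AttributesImpl noAttrs = new AttributesImpl();
            char[] text = platform.toCharArray();
            handler.startDocument();
            handler.startElement("", "SYSTEM", "SYSTEM", noAttrs);
            handler.characters(text, 0, text.length); // unescaped on purpose
            handler.endElement("", "SYSTEM", "SYSTEM");
            handler.endDocument();
            return sink.toString();
        } catch (TransformerConfigurationException | SAXException e) {
            throw new IllegalStateException("SAX serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // String-building route: the value must be quoted by hand.
        System.out.println("<SYSTEM>" + xmlQuote("AT&T <Unix>") + "</SYSTEM>");
        // SAX route: the same raw value is passed through untouched.
        System.out.println(systemElementViaSax("AT&T <Unix>"));
    }
}
```

In the real setup you would pass the StreamSource for the stylesheet to newTransformerHandler, as in the snippet above, and point the result at whatever consumes the FO.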