Subject: RE: Multiple pages of well formed HTML ---> XML
From: "Maxime Levesque" <maximel@xxxxxxxxxxxxxx>
Date: Tue, 3 Aug 1999 11:20:44 -0700
As a *workaround* for the unimplemented document() function, you could
implement a 'composite parser' (or aggregating parser ...) that calls back
its org.xml.sax.DocumentHandler so that the handler thinks it is handling
a single document ...

That will work if you are using a SAX-based XSL processor (e.g. XT); if it's
DOM-based, you can just 'glue' the trees together ...

    import org.xml.sax.AttributeList;
    import org.xml.sax.DocumentHandler;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.AttributeListImpl;

    public class CompositeParser
        implements org.xml.sax.DocumentHandler, org.xml.sax.Parser {

      private InputSource[] inputSources_;
      private DocumentHandler documentHandler_;
      private org.xml.sax.Parser aRealParser_; // = ... your favorite parser ...

      public CompositeParser(InputSource[] inputSources) {
        inputSources_ = inputSources;
      }

      public void setDocumentHandler(DocumentHandler handler) {
        documentHandler_ = handler;
      }

      public void parse(InputSource source)
          throws SAXException, java.io.IOException {
        // ignore source ...

        // fake the start of the 'aggregated' doc
        documentHandler_.startDocument();

        // fake a root start
        documentHandler_.startElement("YourFakeRoot", new AttributeListImpl());

        // receive the callbacks from all the inputSources_ :
        for (int i = 0; i < inputSources_.length; i++) {
          aRealParser_.setDocumentHandler(this);
          aRealParser_.parse(inputSources_[i]);
        }

        // fake a root end
        documentHandler_.endElement("YourFakeRoot");

        // fake the end of the 'aggregated' doc
        documentHandler_.endDocument();
      }

      // forward the content callbacks to the real handler

      public void startElement(String name, AttributeList atts)
          throws SAXException {
        documentHandler_.startElement(name, atts);
      }

      public void endElement(String name) throws SAXException {
        documentHandler_.endElement(name);
      }

      public void characters(char[] ch, int start, int length)
          throws SAXException {
        documentHandler_.characters(ch, start, length);
      }

      public void ignorableWhitespace(char[] ch, int start, int length)
          throws SAXException {
        documentHandler_.ignorableWhitespace(ch, start, length);
      }

      public void processingInstruction(String target, String data)
          throws SAXException {
        documentHandler_.processingInstruction(target, data);
      }

      // silence these calls: each real document's start/end must not
      // reach the handler, which should see only one document
      public void startDocument() throws SAXException {}
      public void endDocument() throws SAXException {}

      // .... implement the other methods of org.xml.sax.Parser
      // (and DocumentHandler.setDocumentLocator) with empty methods ....
      // ... or delegate them to 'aRealParser_' ...
    }

Maxime Levesque

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxx]On Behalf Of McKisson, Shawn
> Sent: Tuesday, August 03, 1999 8:38 AM
> To: 'xsl-list@xxxxxxxxxxxxxxxx'
> Subject: Multiple pages of well formed HTML ---> XML
>
>
> Thanks to all those that helped me with the linear to deep xsl
> transformation - the information you gave was priceless to a beginner like
> myself. (see post XSL problem 8/2/1999)
> Special thanks to David Carlisle and Dave Pawson who went out of their way
> to help.
>
> Related to this, I now have the need to gather well formed HTML from
> multiple web pages and form it into a single XML document. It seems like
> the only trick here is to get each of the HTML trees
> to hang off of the root node of the DOM tree that XSL is going to
> manipulate.
> ie.
>
> (wp = webpage)
>
>           DOM
>           root
>          / | \
>         /  |  \
>        /   |   \
>      wp1  wp2..wpn
>
> With that accomplished, it seems that I could use XSL in the standard way to
> generate the XML.
> Does this sound like a reasonable solution to the problem?
> Any other suggestions? (I haven't looked into XLink, so I'm not sure
> exactly what it is or if it is relevant here)
>
> --shawn
>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
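
For the DOM-based route mentioned in the reply ("you can just 'glue' the trees together"), a minimal sketch might look like the following. It is not from the original post: the class name DomGlue, the method glue, and the use of the JAXP DocumentBuilder API are assumptions made for illustration, and the pages are passed in as strings rather than fetched from the web. Each page's root element is copied under a single fake root, which gives the root / wp1 .. wpn shape drawn in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the original post.
public class DomGlue {

    /** Parses each well-formed page and hangs its root element off one fake root. */
    public static Document glue(String rootName, String... pages) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // The combined document that the stylesheet would later be applied to.
        Document combined = builder.newDocument();
        Element root = combined.createElement(rootName);
        combined.appendChild(root);

        for (String page : pages) {
            Document doc = builder.parse(new InputSource(new StringReader(page)));
            // A node belongs to the document that created it, so make a deep
            // copy owned by 'combined' before appending it under the fake root.
            Node copy = combined.importNode(doc.getDocumentElement(), true);
            root.appendChild(copy);
        }
        return combined;
    }

    public static void main(String[] args) throws Exception {
        Document d = glue("YourFakeRoot",
            "<html><body><p>page one</p></body></html>",
            "<html><body><p>page two</p></body></html>");
        Element root = d.getDocumentElement();
        // prints "YourFakeRoot has 2 children"
        System.out.println(root.getTagName() + " has "
            + root.getChildNodes().getLength() + " children");
    }
}
```

The resulting Document can then be handed to a DOM-based processor as if it were a single input document; how that is done depends on the processor and is not shown here.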