Subject: RE: [xsl] SAX filters (was Re: Generating NCRs)
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Fri, 17 Jan 2003 09:04:33 -0000
> Michael Kay wrote:
> > Use a text editor, or perhaps a SAX filter, to replace "&#" by
> > "&". Why use a power drill when you can do the job with a hammer?
>
> A long while ago, Dave Pawson told me that you and I mention
> SAX filters a bit too often to ignore in the FAQ. Can we maybe
> put together a list of use cases for when a SAX filter is more
> appropriate than XSLT, and maybe a brief demo of one?
>

No time to do a proper job on this, but here's a starter for ten (sorry, that's a catchphrase from a UK TV programme).

1. A SAX filter can sometimes be used instead of an XSLT transformation, and it can sometimes be used for pre-processing the input to an XSLT transformation, or for post-processing the output.

2. The main cases where a SAX filter can be useful are:

(a) where the XML file is too large to be processed by XSLT

(b) where you need to perform operations, usually text processing, that can't be done easily in XSLT

(c) where you need to preserve information that the XSLT/XPath data model does not retain

3. To solve problems of document size, you can:

(a) do all the processing in a SAX application (if the processing is simple and purely serial)

(b) use a preprocessing SAX filter to create a smaller input document for the transformation to work with (e.g. by projection or restriction)

(c) use a preprocessing SAX filter to split the large document into many small documents, each of which is then transformed independently by XSLT. If necessary, you can then use a postprocessing SAX filter to put the transformed pieces back together again.

4. A SAX filter can be used to transform the input data into a form that is more amenable to XSLT processing. Examples include:

(a) preparsing a structured text field (e.g. CSV) into a set of separate elements

(b) changing the representation of a date field to the ISO 8601 form yyyy-mm-dd

(c) computing a derived attribute, e.g. adding @value as the product of @price and @qty, making it easier for the XSLT stylesheet to do sorting and totalling

(d) simple grouping of elements, for example adding a <list> element around any consecutive sequence of one or more <list-item> elements

5. A SAX filter can be used to capture features of the source document that are not representable in the XSLT data model. For example, entity references and CDATA sections, as well as DTD declarations, can all be captured in a SAX filter and translated into elements that are visible to the XSLT stylesheet.

6. A postprocessing SAX filter (or simply a SAX ContentHandler) is useful in two principal situations:

(a) to undo the changes made by a preprocessing filter

(b) to achieve serialization effects that cannot be achieved using the standard serialization methods (as an alternative to disable-output-escaping). Sometimes a user-written serializer can be produced by subclassing the standard serializer supplied with your chosen product. This will of course be product-dependent, and your code may not work with future releases of the product.

7. It's also possible to write a SAX filter to preprocess the stylesheet. This is less common, but it can be used to tackle problems such as dynamic sort keys, or XPath expressions that are contained within source documents.

The new STX specification provides the prospect of being able to write SAX filters without needing to do low-level Java coding. If this takes off, I think that the idea of doing a complex transformation as a pipeline of SAX filters, some generated using XSLT and some using STX, may become increasingly attractive.

Although XSLT 2.0 deals with nearly all the limitations of XSLT 1.0 in areas such as text processing, grouping, and aggregation, it doesn't address the problem of handling large input documents.

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
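
As a brief demo of the kind asked for above, here is a rough Java sketch of case 4(c): a preprocessing filter that adds a value attribute equal to price times qty. It is only an illustration under some assumptions. The names DerivedValueFilter and transform are made up for this example, the attributes are assumed to be unprefixed and to hold plain decimal numbers, and the JAXP identity transformer stands in for the point where a real stylesheet would be plugged in.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.math.BigDecimal;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical example filter: adds value = price * qty where both attributes exist. */
class DerivedValueFilter extends XMLFilterImpl {

    DerivedValueFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // Assumption: price and qty are unprefixed attributes holding decimal numbers.
        String price = atts.getValue("", "price");
        String qty = atts.getValue("", "qty");
        if (price != null && qty != null && atts.getValue("", "value") == null) {
            // Copy the incoming attributes and append the derived one.
            AttributesImpl extended = new AttributesImpl(atts);
            BigDecimal value = new BigDecimal(price).multiply(new BigDecimal(qty));
            extended.addAttribute("", "value", "value", "CDATA", value.toPlainString());
            atts = extended;
        }
        // Every event, modified or not, is passed on down the pipeline.
        super.startElement(uri, localName, qName, atts);
    }

    /** Parses xml through the filter and serializes the filtered event stream. */
    static String transform(String xml) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader filter = new DerivedValueFilter(spf.newSAXParser().getXMLReader());

        // Identity transformation; newTransformer(stylesheetSource) would run XSLT instead.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                           new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<order><line price=\"2.50\" qty=\"4\"/></order>"));
    }
}
```

Because the filter is itself an XMLReader, it can be handed to the transformer in a SAXSource, so the stylesheet only ever sees the enriched document. Elements lacking either attribute pass through unchanged.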