Subject: Re: About the source library
From: Joerg Wittenberger <Joerg.Wittenberger@xxxxxxxxx>
Date: 04 May 1999 03:52:55 +0200

>>>>> "ADC" == Adam Di Carlo <adam@xxxxxxxxxxx> writes:

ADC> Joerg Wittenberger <Joerg.Wittenberger@xxxxxxxxx> writes:
>> I *guess* there is some flaw in jade making it so slow. At least
>> the backend interface is a design flaw I can hardly live with.

ADC> I'm actually not sure whether it's jade, or an implementational
ADC> flaw in the DSSSL spec.

I'm not sure either. IMHO a straightforward implementation of DSSSL
*probably* leads to a complexity worse than O(n) (several tree
traversals), but I'm not sure that it has to. It's definitely easy to
write complex, hence slow, DSSSL programs without noticing.

ADC> BTW, I've used Joerg's sdc -- it's very good and fast, although
ADC> since it doesn't understand DSSSL, any additional DTD style
ADC> support has to be written from scratch.

That's a bit misleading, and I think Adam got that impression because
I did a bad (no) job of documenting the internal design.

Sdc, while it started as a practical formatter, was a prototype of a
transformation engine. I tried to get along without the distinction
between a style engine and a transformation engine. Instead I
envisioned a "pipe" of transformations. As I needed a working program,
it came quite far. I think this design would still be good to
explain/discuss here, because it might have some value as a proposal
for whatever will be done next:

The idea was to come up with a set of FOs. Those were supposed to be
defined using a DTD. Supporting a certain DTD on input would mean
writing a transformation from that DTD to the FO-DTD (maybe via
multiple, cascaded transformations). The FO-DTD, which never got
written, has approximately what "basic" LaTeX (plus HTML minus plain
TeX) has to offer (paragraphs, inline FOs, graphics, lists,
xref/links). Those FOs are implemented to a feasible extent on a
variety of backends (ranging from GNU info and nroff -mm to LaTeX,
Lout and HTML).
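
To make the pipe idea concrete, here is a minimal sketch in Python. It
is not sdc's actual code. The input element names (letter, par, hi),
the FO names (fo-doc, para, emph) and all helper functions are made up
for illustration, since the FO-DTD was never written. Each stage is a
single traversal of the tree, and supporting a new input DTD means
putting one more stage in front:

```python
import xml.etree.ElementTree as ET
from functools import reduce

# Hypothetical vocabularies: the FO-DTD was never written, so every element
# name below is an assumption chosen only to illustrate the pipe design.

def rename(tree, mapping):
    """One pipe stage: a single traversal that renames elements via `mapping`."""
    out = ET.Element(mapping.get(tree.tag, tree.tag), dict(tree.attrib))
    out.text, out.tail = tree.text, tree.tail
    for child in tree:
        out.append(rename(child, mapping))
    return out

def letter_to_fo(tree):
    """Stage 1: map a made-up letter DTD onto the made-up FO vocabulary."""
    return rename(tree, {"letter": "fo-doc", "par": "para", "hi": "emph"})

def fo_to_html(tree):
    """Stage 2 (backend): map the FO vocabulary onto HTML element names."""
    return rename(tree, {"fo-doc": "body", "para": "p", "emph": "em"})

def pipe(*stages):
    """Compose stages left to right; each stage gets the previous stage's output."""
    return lambda tree: reduce(lambda t, stage: stage(t), stages, tree)

source = ET.fromstring("<letter><par>Dear <hi>Adam</hi>, thanks.</par></letter>")
html = pipe(letter_to_fo, fo_to_html)(source)
print(ET.tostring(html, encoding="unicode"))
```

Under these assumptions, another backend is just a different last
stage, and an extra preprocessing step is one more function passed to
pipe().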

A single transformation, in contrast to DSSSL, was unable to express
arbitrary, multiple traversals of the tree. Those would have to be
filtered out (say, when analysing the DSSSL program) and could be
placed either into a prior pipe stage or, in the worst case, folded
into a second, full traversal. SUBDOC support, pretty expensive with
jade, came in quite naturally and very cheaply as one such pipe stage.

Only a few such transformations have been written so far: mostly a DTD
of mine, which was designed after QWERTZ, omitting everything which
looks vaguely like style (<toc> etc.) and saving me any typing effort
possible; some special additions designed to prepare classes (overhead
slides mixed with handouts and teacher annotations, filtering at the
transformation level, so as not to clutter up my writings with
"<![ %HandOut; [ ...]]>"); a letter DTD; man page; and linuxdoc.

In addition there is a mostly unique feature: sensible NDATA handling.
Any NDATA entity, even CDATA element content, is transparently
converted into something sensible for the output. (That way you can
have some pic or Lout pictures within your GNU info document, which
are embedded as GIF in HTML.)

The sorry detail (and here Adam is right):

a) the separation of style transformation from FO implementation is
   not yet completed;
b) there is no style language reader yet. Instead I hand-coded the
   data structures the style definition reader (DSSSL or whatever)
   should produce. Hence I must regard that (Scheme code) as the style
   definition. I'm very sorry about that.

ADC> capability to it (and maybe DSSSL later). I don't know if a
ADC> grove-based engine would make all the performance gains, and code
ADC> simplicity disappear. It's definitely a 'community' project
ADC> rather than an 'individual' project.

That was actually the thing I did not manage to find out yet. It looks
to me as if one has to be *very* careful about the representation
issues of grove nodes (especially separating out access to nodes which
are closer to the root).
Another problem yet to be investigated is the one-node-per-character
illusion.

----------------------------------------------------------------------------

I'd like to propose that we design/document the following interfaces:

Backend Interface
-----------------

Flow Object tree DTD (FO-DTD). The backend reads a well-formed XML
document (without DTD? for performance) to produce the desired
formatted object. I'd like to see this working with a pipe interface,
but other interfaces might be more appropriate.

Front End Parser
----------------

The front end parser composes the document from storage objects and
delivers something easy to parse, or any stream interface to be agreed
upon. A second, SAX-style event API might be worthwhile, but is
certainly not as efficient for transformers as it is for interactive
programs. There shall be a variety of parsers, ranging from fully
SGML-compliant to well-formed, non-validating XML. Also "basic" LaTeX
might be fun (useful), and database queries come to mind.

Style Definition Reader Interface
---------------------------------

The style definition reader delivers a set of transformation
specifications. I think this is the part which needs most of the
discussion. The representation must at least allow transformations to
be reordered (and may contain uninterpreted data). Access to upper
nodes and any other way to create self-references within a node should
be easy to spot. A variety of style languages shall be possible,
certainly DSSSL and XSL. I don't dare yet to propose a certain
interface/representation. Maybe closer investigation will reveal that
we could even stick with slightly extended XSL here (I could imagine
it, but I don't think so).

Style Extension Interface
-------------------------

The "uninterpreted data" (above), if present at all, gets invoked on a
pluggable interpreter with some access to the already styled FOT. Or
whatever. I did not think about that yet. I just wanted to open the
door to non-functional approaches.
But that might not be desirable at all.

NDATA Handler Interface
-----------------------

For each NDATA definition there shall be a defined way to transform
the corresponding entity into an FO. This step must allow external
programs to be started, which are told about the storage entity and
the desired output format. Those programs are not allowed to have side
effects visible to the transformation/formatting process.

/Jerry

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist