Subject: (dsssl) Re: [OpenJade-devel] Re: OpenJade development
From: "Paul Tyson" <paul@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Oct 2001 14:28:30 -0700
Javier,

I am trying to contain my excitement at the possibility of further
development on openjade, leading to a more complete implementation of
the DSSSL standard. There has been occasional talk of it in the last few
years. But it seems there are fewer than 20 people in the world who:
a) are concerned about the subject; and b) understand what DSSSL can do.
And of those people, none are able to commit sustained effort to create
code.

I am cross-posting this to the dsssl-list, because I think the topics
you discuss are of interest to DSSSL users as well as potential
developers. My apologies for the length, but I wanted to include full
quotes of your and Adam's comments for the benefit of dsssl-list
readers.

See my specific comments and responses to your many thoughtful
observations below.

Legend:
[jf] = Javier Farreres
[ad] = Adam Di Carlo (2-level quoted text)
[pt] = Paul Tyson

[jf,ad]
> Well, when I first decided to contact someone from OpenJade, it was just to get
> information.
> Now that I have two answers, it is my turn to put questions to the list to
> resolve some doubts I have.
> The fact is, before I involve any student in this, I want to have everything
> clear.
>
> First I contacted Adam Di Carlo as it was the name I got from some place.
>
> >> The thing is, I am not aware of the level of development of OpenJade. As
> >> far as I know, it doesn't have a transformation language developed,
> >
> >It does have a proprietary transformation language. It doesn't have
> >ISO DSSSL standard transformation, however. On the other hand, the
> >ISO DSSSL transformation language is considered overcomplicated by
> >many and already obsoleted by XSLT, not to mention that many consider
> >the proprietary variant, implemented by James Clark in the original
> >Jade, much better anyhow.
>
> Ok. Here comes the first question, which is a bit long.
> First came SGML, then DSSSL and HyTime, later came XML and later XSL.
> XML can be seen as a simplified SGML (it has no marked sections for example).
> XSLT is a translation stylesheet language based on XML. It is not a programming
> language. There is no construct (as far as I know) in XSLT to say repeat this
> structure n times. Because XSLT is not a programming language. It is good for
> lay people to write stylesheets easily, but DSSSL, being a programming
> language, is much more powerful than XSLT.
> With all this, I mean XSLT --CANNOT-- substitute for DSSSL as a general tool.
> Apart from the fact that the development of XML has ignored groves for long,
> and has no general model underneath.
> Am I wrong in something?
>

[pt] Only in that you may underestimate the power of XSLT. It is
possible, for instance, to repeat a structure n times. For someone who
knows functional programming techniques, XSLT can do some wonderful
things, but the syntax is often awkward and verbose.

I stop short of saying XSLT could ever be a complete replacement or
substitute for DSSSL, partially because the application domains are
different (with a large overlapping area). I would estimate that for 90%
of *current* publishing applications you *could* use XML, XSLT and
XSL-FO. But there is a large set of potential applications, exploiting
the power of groves and HyTime, that have not even been considered in
*current* publishing applications, which could not even be imagined from
the XML/XSL paradigm.

You are right on target with your observation that the lack of a
comprehensive data model in XSLT is a big weakness.

[jf]
> Now to the transformation stuff. The transformation language in DSSSL is based
> on groves. And the transformation language in DSSSL transforms groves, not
> documents. XSLT only transforms elements into elements. And well, it is not
> that DSSSL transformation is overcomplicated. I think no one has tried to teach
> it consequently. I am trying it in a course I am planning on DSSSL.
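
To illustrate the exchange above about repeating a structure n times and
about transforming trees rather than documents, here is a minimal sketch
in Python. It is not DSSSL, XSLT, or anything taken from OpenJade: the
`Node` class and the `repeat` and `transform` functions are hypothetical
names invented for this illustration, and a real grove carries far more
than an element name and a list of children. It only shows the functional
technique under discussion: recursion in place of a loop construct, and a
new tree built from an input tree.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy stand-in for a tree node (an assumption, not a real grove node)."""
    name: str
    children: list = field(default_factory=list)


def repeat(node, n):
    """Return a list containing node n times, built by recursion, not a loop."""
    if n <= 0:
        return []
    return [node] + repeat(node, n - 1)


def transform(node, rename):
    """Build a new tree from the input tree by recursing over its children.

    rename maps input element names to output names; unmapped names are kept.
    """
    return Node(
        rename.get(node.name, node.name),
        [transform(child, rename) for child in node.children],
    )


# A source tree whose root holds the same structure three times.
source = Node("doc", repeat(Node("para"), 3))
result = transform(source, {"doc": "html", "para": "p"})
print(result.name, [child.name for child in result.children])
# prints: html ['p', 'p', 'p']
```

The input tree is left untouched and the output is a separate tree, which
is the tree-in, tree-out shape of processing being discussed here, reduced
to a toy.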
>

[pt] The DSSSL transformation language is, conceptually, very
sophisticated and challenging. Nothing less would be so powerful.
Unfortunately, to understand it fully requires: a) a good grasp of
functional programming methods, especially recursion on tree structures;
and b) an understanding of the abstract structure of documents according
to the grove model. Very few people, apparently, have (or bother to
acquire) those prerequisite skills.

I agree that perhaps it is lack of good teaching that has contributed to
its disregard. Also, a lack of immediate practical applications other
than creating SGML documents. James Clark himself asked (perhaps
cynically) what the transformation language would do that jade's sgml
output method couldn't do. He's right, in the space of applications that
only want sgml-to-sgml transformations. But the full model is
sgml-to-grove-to-grove(s)-to-whatever, with the TL doing the
grove-to-grove(s) part. What you do with the output grove is only limited
by your imagination (and programming skill!). This provides a standard,
universal model for practically all data and document transformation
needs, using a single consistent data model and processing paradigm.

The reality is, 99% of SGML/XML users will (or must) settle for
special-purpose, limited processing solutions because that is what gets
today's jobs done today.

[jf]
> Apart from teaching at the university as a part-time professor, and doing my
> PhD, I am also involved in the publishing theme. I worked some years in a
> publishing company that developed encyclopaedias. And I would say without fear
> that SGML, being as complicated as it is, is even too simple for the problem.
> Perhaps HyTime is the solution, but I know of no implementation of it.

[pt] Again, you are right on target here. I believe HyTime processing
could easily be implemented on top of a complete DSSSL engine.
There are a few HyTime implementations, the most noteworthy being
GroveMinder, now held by Epremis (www.epremis.com). But as far as I know
this doesn't use or understand DSSSL. The HyTime users group home page
http://www.hytime.org is old, but has some useful information. You can
also find a few more HyTime links at the SGML/XML Cover Pages
http://xml.coverpages.org.

[jf]
> Now I offer courses to publishing companies, all related to SGML. One of them
> is DSSSL. I have not yet finished it, but I think I have found a good way to
> teach every part of the standard. And well, when someone understands it, the
> transformation language in DSSSL is not so complicated.
>
> My question is: would it be useful to implement this part of DSSSL into
> OpenJade?
>

[pt] I myself would find it useful, and I believe it would have the
potential to support many wonderfully useful applications. But the market
has not demonstrated a need for this sort of thing. The fact is, for any
*particular* SGML transformation, there are at least half a dozen "good"
ways to accomplish it, and XSLT is quickly becoming about the easiest way
to specify such transformations. Developers do not hesitate to string
together a long sequence of preprocessing and/or postprocessing steps,
with XML+XSLT in the middle. Two or three other scripting languages may
be involved in the process, such as ASP, SQL, etc., and neither the
developers nor the system architects are bothered by polyglot solutions.
Companies big and small are offering integrated (but inevitably
proprietary and non-standards-based) solutions to these problems.

A DSSSL+HyTime grove-based transformation system would drastically
simplify these sorts of applications, and raise the capability of
"single-source publishing" to an entirely new level.

[jf,ad]
> >> nor does it have a full implementation of the grove paradigm. It
> >> also doesn't have a full implementation of all the flow objects, nor
> >> the full query language.
> >
> >I think there is a partial query implementation in there.
>
> About it, I think Jade and OpenJade implement only the core query language, but
> not the full query language. But implementing the full query language only
> makes sense if the full grove paradigm is implemented.
> Anyway, DSSSL query language is designed to work on the default grove plan.
> There are no query constructs for parts of the grove outside the default plan.
>
> Would it be interesting to implement the full query language?
>

[pt] Definitely interesting, but as with the TL the market has not
demonstrated a need for this. XPath implemented the 80% of query that is
most useful. I admit I don't fully understand node regular expressions,
but I believe they could be put to creative use if they were implemented.
Any industrial-strength DSSSL engine should have a complete query
capability. A useful extension would be a shorthand notation--perhaps
XPath itself. I believe this could be implemented as DSSSL procedures.

[jf,ad]
> >The thing I hear most often complained about is that we only implement
> >a subset of the DSSSL page layout stuff. So full-on page layout, at
> >least, last I checked, was the most often requested feature.
> >
> >I think this might be difficult to do since you'd have to think about
> >both the DSSSL processor, but much of the work would be in the
> >backends as well, which is probably of less academic interest. But
> >you asked!
>
> Well, yes. My opinion is the same. The fact that the page feature is not fully
> implemented pretty much leaves Jade as a toy. In the publishing area where
> I teach, this feature is of utmost interest. And well, I think that for a
> student doing his final project, everything can be formative. Implementing this
> part could be very nice, I think.
>
> Take into account, I have an unlimited source of students. I need not limit
> myself to offering one project. I can offer several projects, to implement
> different parts of OpenJade.
>
> >Also, see develdoc/TODO in the sources.
>
> I will, thanks.
>
> > Dear Javier,
> >
> > a while ago, I did a work on DSSSL myself, developing a PDF backend for
> > (Open)Jade in Java. Unfortunately, the (Open)Jade Project seems to be -
> > sorry guys - on a pretty low level, due to the other things most developers
> > do and the general shift towards XSL after all.
>
> This is my question. XSL and XML have great impact, as the internet is
> behind them. OpenJade will always be the more complicated tool for professional
> publishing.
> But I think it is ok this way.
>
> > While DSSSL (and its main tool, Jade) is certainly a great concept, it is
> > a) quite hard to understand and b) even harder to understand the code for
> > "normal" people. James Clark has done an awesome job, no one doubts that, but
> > the complexity is definitely high.
>
> With code, you mean the code of the DSSSL language, or the code of the Jade
> implementation?
> Understanding DSSSL is just understanding lisp (ok, scheme), which is not that
> difficult.
> And well, after all, Jade and DSSSL are not for "normal people" but for
> professional programmers and publishing technicians.
> And XML, with its simplicity, is no longer HTML, and it also requires some
> information structuring concepts, so it is not that easy either.
>
> > From my point of view, a project both interesting as well as good for the
> > community would be to implement the full group of the "page-sequence" flow
> > object - but the main part would be to get the backends render this stuff
> > correctly. As with the separate PDF backend, it would be possible and most
> > challenging, but I doubt it would be possible at all for RTF, plain text etc.
> > Only the TeX-engine could certainly cope with it easily, but getting this
> > done is certainly a thing for itself...
>
> Ok. As far as I see, there is a kind of consensus about the goodness of
> implementing the page feature.
> It was also one of my ideas.
>

[pt] Definitely, high-end publishing requires complex page layout and
sequencing. In my experience, for the formatting needs I have
encountered, the style language is capable of expressing all the
requirements. It remains only to implement them. I don't believe getting
this information into the flow object tree is the difficult part--rather,
it's in the back-end to actually do the page layout based on the fot.
(I could be completely wrong about this, though.)

[jf]
> Any more comments? What about implementing full groves and a real grove plan?
>

[pt] Full grove implementation is essential for adding any HyTime
addressing and linking capabilities. It would also allow other
possibilities, such as automatic DTD analysis and transformation. By
allowing additional front-end notation processors to supply input groves,
openjade could become a general-purpose transforming wizard. For
instance, Rich Text Format documents could be delivered as groves to a
DSSSL transformation. (GroveMinder has the capability to process other
notations to create groves, but does not use DSSSL for transformation.)
Flexible grove plans would allow creation of "lite" groves that could
enhance performance in some situations.

As I see it, groves are at the center of the division between ISO and
W3C. The conception of groves is what finally led to the breakthrough
that unified DSSSL and HyTime. The failure to appreciate and understand
groves is what led W3C to create fragmented, inconsistent standards and
special-purpose syntax for local, limited-scope applications of DSSSL and
HyTime concepts. So, for better or for worse, any implementation of DSSSL
or HyTime must have full grove-processing capabilities.

Other comments:

1. ISO standards are reviewed every 5 years for renewal, revision, or
cancellation. With the lack of activity on DSSSL implementations, I am
concerned that it could lose its status as an active ISO standard.
(I think scheme itself is in a similar predicament as an IEEE standard.)
Of course, status as a standard doesn't affect its usefulness or
integrity as a language, but being a non-standard (or worse, a has-been
standard) is just one more reason for people not to choose DSSSL. If
indeed you are prepared to commit some real resources to further
implementations of DSSSL, it might behoove others (especially non-coders
such as myself) to get involved with their national standards bodies as
they review the standard.

2. The built-in backends for jade made its implementation of the style
language immediately useful. Some built-in backends for the
transformation language would be essential. At minimum, it should be able
to emit SGML documents (including declarations and DTDs), and a canonical
grove representation. Other possibilities would be STEP Express
instances, and CGM files. I believe some preliminary work on grove plans
for both of these notations was done a few years ago, but I don't know if
it is possible to recover or resuscitate these efforts.

3. Anything that would make openjade more directly usable as the back end
of an http server would make it more attractive to a wider market. I have
no idea what this would entail, but some useful features would be:
persistent grove representation; ability to invoke in-memory "compiled"
transformations; database-to-grove translation component. If, in the best
of all possible worlds, openjade came with a mini-http server built in,
you could have an instant SGML server that would eliminate the need for
*any* translation or preprocessing of your SGML source data for web
publishing. It would support HTML viewing or on-demand composition and
delivery of PDF files from the same SGML source. With additional
front-end notation processors, it could be a complete "information
server", as envisioned (and partially implemented) by GroveMinder.

[jf]
> Javi
>

[pt] I wish you success with your efforts, Javier.
As you can tell, I agree with you 100% that it is a Good Thing to do, and
I would like to help if possible. But the skeptical side of me says that
the tidal wave of XML and related W3C standards has all but completely
washed away the hopes and dreams embodied in ISO standards DSSSL:1996 and
HyTime TC:1997. But the tidal wave didn't wash away the real difficulties
inherent in processing complex document-based information. It remains to
be seen whether, when the wave recedes, the efforts so ably conceived and
executed by the ISO committees will thrive and prosper, or if completely
new solutions will be discovered.

Good luck,

Paul Tyson, Principal Consultant
Precision Documents
paul@xxxxxxxxxxxxxxxxxxxxxx
http://precisiondocuments.com
"The art and science of document engineering."

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist