Re: Formatting Objects considered harmful

Subject: Re: Formatting Objects considered harmful
From: David Carlisle <davidc@xxxxxxxxx>
Date: Fri, 16 Apr 1999 14:51:00 +0100 (BST)

You give an example of using CSS+HTML as an alternative to using
formatting objects.

    The result is: 
      <H1 STYLE="font-size:1.3em; margin-top:1.5em; margin-bottom:0.4em">
         The headline
      </H1>

    The result contains both semantics and presentation. This is a good thing. 


But what is a bad thing (from my point of view a fatally bad thing) is
that if you start with HTML then you start with underspecified and
browser-specific default presentation for the elements. Unless my
understanding is faulty, the CSS only modifies these defaults.

You could use CSS over a `clean' XML DTD with no default semantics,
but then most of the arguments you make against FO would also apply to
that case.



    XFO was not designed to be used over the Web, and most people
    interested enough to read this far will agree that the use of
    XTL+XFO described in this document is abuse. However, it seems that
    there is no way to stop the abuse. Rather, it seems like conforming
    implementations are required to support XFO on the Web.

    Here's a demonstration: 
      1. download the XML/XSL browser from InDelv
      2. point it to a document which only contains XFO

Firstly, XSL engines are _not_ obliged to linearise FO as XML; they may
just render the objects directly. Thus I would say that the problems
that you discuss (which are, I agree, real problems) are the fault of the
person who made the file referred to in stage 2 above.

No matter what the semantics are in the original document, it
will always be possible to divert a half-digested form to a file, and
then put that file on the web. I might put up a PCL file intended for my
DeskJet printer. It would not be very portable (and I probably wouldn't
do it), but I don't see how you can legislate against the production of
PCL files just because I might.


    For these reasons, I believe W3C should encourage authors to publish
    documents in semantically rich HTML and XML [2] with attached style
    sheets. The style sheets should be evaluated on the client
    side. This gives us the best of both worlds: rich applications and
    rich presentations.

I rather hope HTML will just quietly vanish (although I expect it won't).
With any style language available (be it CSS or XSL), it would
surely be better to have a base set of elements on which to hang the
style that does not have the historical and unfortunate default
presentational semantics that come with HTML.


You make the point about aural rendering. It may be that once someone
implements an aural FO renderer, it turns out that what, to the
print-minded view, seems like a single formatting object ought to be two
or more, to cater for aural distinctions. But surely that can be fixed by
suitable FO definitions; it does not mean that there is something
intrinsically wrong with the FO model. If you use HTML, sometimes you
will be able to carry some semantic information through to the formatter
and sometimes not. If that information is important, then it is surely
better to do it right and specify some mechanism for passing it down the
line. Using HTML to solve this problem seems strange: it has only
a slightly ad hoc collection of semantic possibilities, and it brings
all sorts of complications in separating the desired formatting from
unwanted defaults based on that semantic information.


  High-quality printing is very hard, and can't be done without looking
  at the shape of the glyphs.

There aren't many systems that look at the shape of the glyphs
(although that would be useful for doing some `skyline' optimisations of
the spacing). You do need to know the character metrics. Certainly, coming
from TeX, a DSSSL/XSL approach of deferring all such knowledge to a
`back end' processor hidden from the style sheet seems rather limiting,
but it may possibly work out all right, given sufficiently powerful
formatting objects. (Not that the formatting objects in either DSSSL or
XSL are sufficiently powerful yet.)

  Express formatting objects in something other than XML. The only
  technical reason why formatting objects are expressed in XML today is
  that XTL can only output XML. (There's probably a few non-technical
  reasons as well, but let's ignore those for now.)

XSL does not mandate how the output tree is linearised. It does not have
to be XML at all. For example, if you use the HTML namespace, it can
linearise the output as HTML, with SGML rather than XML syntax.
Thus this `technical reason' doesn't seem to be valid.
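
To illustrate the difference between a result tree and its
linearisation, here is a minimal sketch. It uses Python's standard
xml.etree.ElementTree module rather than an XSL engine, and the small
tree it builds is made up for the example. The same in-memory tree is
written out once with XML syntax and once with HTML syntax:

```python
import xml.etree.ElementTree as ET

# Build one in-memory tree; nothing here fixes its serialised syntax.
# (The element content is an assumption made up for this illustration.)
body = ET.Element("body")
para = ET.SubElement(body, "p")
para.text = "The headline"
ET.SubElement(body, "br")  # an empty element

# Linearise the same tree in two different syntaxes.
as_xml = ET.tostring(body, encoding="unicode", method="xml")
as_html = ET.tostring(body, encoding="unicode", method="html")

print(as_xml)   # empty element written XML-style, as <br />
print(as_html)  # empty element written HTML-style, as <br>
```

The tree is identical in both cases; only the final serialisation step
differs, which is the sense in which the output syntax is not dictated
by the transformation.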

  By extending XTL to express FOs in some abstract, non-syntaxed manner
  the problem can be avoided. DSSSL and CSS use this model.

DSSSL also doesn't specify how the output tree is linearised.
Certainly Jade (the most popular DSSSL engine, I would guess)
is quite happy to output the result tree as SGML (the fot backend).
FOT files generated via DSSSL have pretty much exactly the same status
as an FO XML file generated by XSL. In both cases you can format the
file with an essentially identity transform. (The transform looks
slightly more complicated in the DSSSL case, as the mapping from the
names in the FOT file to the flow objects is not built in and so has to
be done again in the stylesheet, but conceptually I see no difference.)
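
As a rough sketch of what an identity transform over an already
linearised formatting-object file amounts to, the following Python
fragment (standard library only) copies a tree node for node. It is not
an XSL or DSSSL stylesheet, and the element and attribute names in the
sample input are hypothetical, not taken from either specification:

```python
import xml.etree.ElementTree as ET

def identity(node):
    """Return a deep copy of node: same tag, attributes, text and children."""
    copy = ET.Element(node.tag, dict(node.attrib))
    copy.text = node.text
    copy.tail = node.tail
    for child in node:
        copy.append(identity(child))
    return copy

# Hypothetical linearised formatting-object tree (names are illustrative only).
source = '<sequence><paragraph font-size="12pt">Hello</paragraph></sequence>'
tree = ET.fromstring(source)

# The "transform" does no real work: the result has the same structure.
result = identity(tree)
print(ET.tostring(result, encoding="unicode"))
```

All of the styling decisions were made when the file was generated, so
the only thing left for the processor to do is hand the objects to the
formatter.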

David

I should perhaps say I'm not involved with the XSL definition in any way
(except for some very preliminary contacts between the Math and XSL
groups at the beginning of this year, which have no bearing on the
examples here).


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

