Re: Formatting Objects considered harmful

From: Håkon Wium Lie <howcome@xxxxxxxxxxxxxxxxx>
Date: Sat, 17 Apr 1999 12:28:30 +0200 (MET DST)

David Carlisle wrote:

 > You give an example of using CSS+HTML as an alternative to using
 > formatting objects.
 > 
 >     The result is: 
 >       <H1 STYLE="font-size:1.3em; margin-top:1.5em; margin-bottom:0.4em">
 >          The headline
 >       </H1>
 > 
 >     The result contains both semantics and presentation. This is a good thing. 
 > 
 > But what is a bad thing (from my point of view a fatally bad thing) is
 > that if you start with HTML then you start with under specified and
 > browser specific default presentation for the elements. 

Yes, HTML doesn't specify presentation so saying it's underspecified is
an understatement (:-). 

 > Unless my
 > understanding is faulty, the CSS only modifies these defaults.

Yes and no. The concept of cascading allows user or author style
sheets to be combined with the browser default style sheet. This makes
it possible to write short style sheets. For example, if you only care
about EM elements being red, you can write a very short style sheet:

 EM { color: red }

You can also, if you prefer, supply a complete style sheet which
sets all values on all elements. So, you can entirely replace the
browser default with your own settings -- the defaults are only there
as a convenient fallback.
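
As a rough sketch of the second approach, the excerpt below states the
presentation explicitly instead of relying on the browser default style
sheet. The choice of elements and most of the values are illustrative
assumptions (the H1 values are taken from the example quoted above); a
truly complete style sheet would cover every element and property.

```css
/* Excerpt of an author style sheet that sets values explicitly
   rather than inheriting them from the browser default style sheet.
   Elements and values are illustrative, not a complete sheet. */
BODY { font-family: serif; font-size: 1em; color: black; background: white; margin: 1em }
H1   { display: block; font-size: 1.3em; font-weight: bold; margin-top: 1.5em; margin-bottom: 0.4em }
P    { display: block; margin-top: 1em; margin-bottom: 1em }
EM   { font-style: italic; color: red }
```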

 > You could use CSS over a `clean' XML dtd with no default semantics
 > but then most of the arguments you make against FO would also apply to
 > that case.

How so? The XML+CSS combination retains both semantics and
presentation.
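
A minimal sketch of what I mean, assuming a made-up XML vocabulary
(ARTICLE, HEADLINE, PARA and EMPH are hypothetical names, not from any
real DTD): the document keeps its own element names, and the style
sheet supplies all of the presentation, including the display type,
since there are no defaults to fall back on.

```css
/* Style sheet for a hypothetical XML vocabulary.
   The element names carry the semantics; nothing here has a
   built-in presentation, so 'display' is set explicitly. */
ARTICLE  { display: block; font-family: serif }
HEADLINE { display: block; font-size: 1.3em; margin-top: 1.5em; margin-bottom: 0.4em }
PARA     { display: block; margin: 1em 0 }
EMPH     { display: inline; color: red }
```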

 >     XFO was not designed to be used over the Web, and most people
 >     interested enough to read this far will agree that the use of
 >     XTL+XFO described in this document is abuse. However, it seems that
 >     there is no way to stop the abuse. Rather, it seems like conforming
 >     implementations are required to support XFO on the Web.
 > 
 >     Here's a demonstration: 
 >       1.download the XML/XSL browser from InDelv 
 >       2.point it to a document which only contains XFO 
 > 
 > Firstly XSL engines are _not_ obliged to linearise FO as XML, they may
 > just render the objects directly, thus I would say that the problems
 > that you discuss (which are, I agree real problems) the fault of the
 > person who made the file referred to in stage 2 above.

I agree. But given that the fault has been made, a compliant
implementation of XTL+XFO is required to collaborate. It's like having
laws against theft but requiring the police to help people steal
if they so decide.

 > No matter what the original semantics are in the original document, it
 > will always be possible to divert a half digested form to a file, and
 > then put that file on the web. I might put up a pcl file intended for my
 > deskjet printer, it's not very portable (and I probably wouldn't do it)
 > but I don't see how you can legislate against the production of pcl
 > files, just because I might.

On the Web we can't. People can -- and have -- put PS, PDF, PCL etc.
on the Web. But, should W3C encourage the use of these formats? Should
they state that such formats "enhances the functionality and
interoperability of the Web" (a common phrase in W3C Recommendations)?

 > I rather hope HTML will just quietly vanish (although I expect it won't)
 > With any style language being available (be it CSS or XSL) it would
 > surely be better to have a base set of elements on which to hang the
 > style that does not have the historical and unfortunate default
 > presentational semantics that come with HTML.

I hope HTML, rephrased in XHTML, stays around. Its semantics are very
well known, and I'd rather build on it than start again from scratch.
I agree that the presentational conventions are unfortunate.

 > You make the point about aural rendering. It may be so that once someone
 > implements an aural FO renderer it turns out that what, to the print
 > minded view, seems like a single formatting object ought to be two or
 > more, to cater for aural distinctions, but surely that can be fixed by
 > suitable FO definitions. 

True. That's #1 in my list of requirements:

  1 there must be a specification for aural formatting objects
  2 there must be implementations of aural formatting objects
  3 the fact that the user has an aural client must be known to the server
  4 all web sites must install XTL sheets to transform content into
    aural formatting objects

For the chain to be complete, however, steps 2-4 must also be put in
place. 

I also agree that there will often be many aural formatting objects
where there's only one visual formatting object. For example, while
emphasis is only sparingly used in print (e.g. by boldfacing a
certain word), emphasis must be computed for all words and sentences
when synthesizing speech. It's <word>im<em>pra</em>ctical</word> to
mark up all text to the level of detail required for speech synthesis
-- instead the formatting objects will be created very close to the
output device, sometimes in hardware.

 >   Express formatting objects in something other than XML. The only
 >   technical reason why formatting objects are expressed in XML today is
 >   that XTL can only output XML. (There's probably a few non-technical
 >   reasons as well, but let's ignore those for now.)
 > 
 > XSL does not mandate how the output tree is linearised. It does not have
 > to be XML at all. For example if you use the HTML namespace it can
 > linearise the output as HTML, with SGML rather than XML syntax.
 > Thus this `technical reason' doesn't seem to be valid.

I'd be very happy if this were the case. What non-XML format would you
suggest for formatting objects?

 >   By extending XTL to express FOs in some abstract, non-syntaxed manner
 >   the problem can be avoided. DSSSL and CSS use this model.
 > 
 > DSSSL also doesn't specify how the output tree is linearised.
 > Certainly jade (the most popular dsssl engine, I would guess)
 > is quite happy to output the result tree as SGML (the fot backend)
 > FOT files generated via dsssl have pretty much exactly the same status
 > as an FO XML file generated by XSL. In both cases you can format the
 > file with an essentially identity transform. (The transform looks
 > slightly more complicated in the dsssl case as the mapping from the
 > names in the FOT file to the flow objects is not built in, so has to be
 > done again in the stylesheet, but conceptually I see no difference.)

Well, XTL outputs "XML trees", which doesn't leave much room for
imagination about how to linearise them.

-h&kon

Håkon Wium Lie             http://www.operasoftware.com/people/howcome
howcome@xxxxxxxxxxxxxxxxx                      simply a better browser



 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

