HTML is a formatting/UI language was: RE: Formatting Objects considered harmful

Subject: HTML is a formatting/UI language was: RE: Formatting Objects considered harmful
From: "Jonathan Borden" <jborden@xxxxxxxxxxxx>
Date: Sun, 18 Apr 1999 21:07:17 -0400
Håkon Wium Lie wrote:

>
> Alas, many people consider HTML to be a formatting language and use it
> accordingly. This is wrong and W3C tries to educate people otherwise.
> Deprecating a bunch of presentational tags in HTML 4 is a clear sign
> of direction.

	HTML at its core basically *is* a formatting language (more properly a
*user interface language*). If it were able to handle semantics in a robust
fashion there wouldn't be such a profound need for XML. HTML is in essence a
language to instruct a browser what to display in a window, how to build a
GUI, but it's hardly up to the task of developing complex semantic
structures. The semantics I think you are talking about here only scratch
just below the surface of the GUI.

>
>  > In your document, for example, most
>  > element types are not very semantic. The only element types in
> it without
>  > FO equivalents are EM, H?, META and LINK.
>
> And TITLE and P (which is more than a generic container). Also, the
> note uses classes (byline, abstract, question, answer) which -- if
> conventions are developed -- can communicate semantics.
>
	And I can communicate semantics using Morse code, yet we have moved far
beyond that. The class attribute works great for letting a browser know how
to display an element, but let's not try to push it too far, because it
breaks quickly.

...
>
> Unless the semantics of the vocabulary is known on the other side, the
> client is worse off receiving arbitrary XML than receiving documents
> that have been downtranslated (as in the ladder of abstraction) to
> HTML or XHTML. I don't see any significant bandwidth or performance
> issues here.

	This is untrue for a variety of reasons, but the simplest one is that XSL
stylesheets, which change infrequently, are cached locally by the browser.
Only relatively small XML documents need to be transmitted, and from these
complex HTML pages can be built in the browser. Most clients don't have much
to do (especially in the days of the 500 MHz client!) while most servers are
quite busy, so this is an excellent arrangement. So even when the client has
no notion of the semantics of the document, XSL can reduce bandwidth.
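
	The bandwidth argument above comes down to simple arithmetic, sketched
here in Python. Every size below is a hypothetical placeholder, not a
measurement, and the function name is made up for illustration. The
stylesheet is assumed to be fetched once and then served from the browser
cache.

```python
def bytes_transferred(pages, xml_size, html_size, stylesheet_size):
    """Return (client_side, server_side) byte totals for `pages` page views.

    client_side: the stylesheet is downloaded once (then cached) and each
                 view costs only one small XML document.
    server_side: every view costs one fully rendered HTML page.
    """
    client_side = stylesheet_size + pages * xml_size
    server_side = pages * html_size
    return client_side, server_side


# Hypothetical sizes in bytes; substitute real measurements before
# drawing any conclusion.
for pages in (1, 10):
    client, server = bytes_transferred(
        pages, xml_size=2_000, html_size=15_000, stylesheet_size=20_000
    )
    print(pages, client, server)
```

	With these made-up numbers a single page view is cheaper when rendered on
the server, while ten views already favor the cached stylesheet. The saving
depends on the stylesheet being reused across many documents.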

	The phrase "semantics of the vocabulary is known" leaves a huge gap for
interpretation. What does it mean to "know semantics"? The point is that it
is very difficult for you to predict how a client may need to process a
document. With XSL transformations in particular, a document can be
processed without the client having any specific 'knowledge' beyond that
supplied by a stylesheet (e.g. DSSSL or XSL).

>
> One solution which captures both is to downtranslate your internal XML
> to XHTML while retaining the original element types in the CLASS
> attribute:
>
>  <p class="question">How come?</p>
>  <p class="answer">Because.</p>
>
	I have been developing a Web-based electronic medical records/workflow
system. Here is an example document fragment:

<person xmlns="urn:hl7" UID=".." SSN="...">
	<n>
	   <FirstName>John</FirstName>
	   <LastName>Smith</LastName>
	</n>
	<a type="home"><Address1>... ...</a>
	<DOB dt:dt="iso...">...
	<insurance ID="...">
		<business role="Insuror">...</business>
		...
	</insurance>
	<person role="Provider">...
	<person role="EmergencyContact"> ...
	<diagnosis CPT="...">...
	<medication ...>

	How would you render this in HTML+CSS, and maintain the 'semantic' content?

	My application downloads such documents and maintains them internally, and
they are rendered as HTML with multiple views (e.g. one download may produce
ten screens). Some of these views are forms which are used to edit and
otherwise process the document. Element reordering and other graph
transformations are the norm rather than the exception. The client
application is a browser + JavaScript + XSL. Where is the semantic
'knowledge' maintained? In the browser? It knows *nothing* about the HL7
namespace. In the JavaScript? It also knows *nothing* about the HL7
namespace. The 'semantic knowledge' hence must be contained within the XSL
stylesheets. How does XHTML help here? Are you suggesting that this can all
be accomplished with <P CLASS="X"> tags?
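
	To make the "one download, many views" idea concrete, here is a minimal
sketch. It is written in Python with the standard xml.etree.ElementTree
module purely as a stand-in for XSL stylesheets, and it is not the
application's actual code. The record is a simplified, hypothetical version
of the fragment above, and the function names (display_name, summary_view,
edit_view) are invented for illustration. Note that all knowledge of the
urn:hl7 vocabulary lives in the transformation functions, which play the
role of the stylesheets.

```python
import xml.etree.ElementTree as ET

HL7 = {"hl7": "urn:hl7"}  # namespace URI taken from the fragment above

# Hypothetical, simplified record modeled on the fragment above.
RECORD = """<person xmlns="urn:hl7">
  <n><FirstName>John</FirstName><LastName>Smith</LastName></n>
  <person role="Provider">
    <n><FirstName>Jane</FirstName><LastName>Doe</LastName></n>
  </person>
</person>"""


def display_name(person):
    """Reorder the name parts into 'LastName, FirstName'."""
    first = person.findtext("hl7:n/hl7:FirstName", namespaces=HL7)
    last = person.findtext("hl7:n/hl7:LastName", namespaces=HL7)
    return f"{last}, {first}"


def summary_view(patient):
    """Read-only view: a heading plus one list item per related person."""
    root = ET.Element("div")
    ET.SubElement(root, "h1").text = display_name(patient)
    ul = ET.SubElement(root, "ul")
    for rel in patient.findall("hl7:person", HL7):
        item = ET.SubElement(ul, "li")
        item.text = f"{rel.get('role')}: {display_name(rel)}"
    return ET.tostring(root, encoding="unicode")


def edit_view(patient):
    """Form view: the same name data exposed as editable inputs."""
    form = ET.Element("form")
    for tag in ("LastName", "FirstName"):
        value = patient.findtext(f"hl7:n/hl7:{tag}", namespaces=HL7)
        ET.SubElement(form, "input", {"name": tag, "value": value})
    return ET.tostring(form, encoding="unicode")


# One parsed document, two different HTML renderings.
patient = ET.fromstring(RECORD)
print(summary_view(patient))
print(edit_view(patient))
```

	Both views are derived from the same downloaded document. Each one
reorders or regroups elements rather than just decorating them, which is
exactly what a class attribute plus CSS cannot express.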

	Perhaps a more general question: How does XHTML+CSS accomplish graph-graph
transformations?

	You may argue that this isn't the question ... and that my example has
nothing to do with FO, which is true. My point is that XML is comparable not
to HTML but rather to SGML, and XHTML is merely an XML document type. Hence
your call to transmit XHTML rather than 'arbitrary' XML is merely a call to
perform the transformation on the server rather than the client. This is
wasteful of server resources and severely limits the scope for sophisticated
client-side processing.

Jonathan Borden
http://jabr.ne.mediaone.net


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list