RE: [xsl] odf2xhtml: Processing nested element content seperatly ?

Subject: RE: [xsl] odf2xhtml: Processing nested element content seperatly ?
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 01 Nov 2006 18:11:05 -0500
At 12:11 PM 10/27/2006, Mike wrote:
> ... to be fair, he wasn't asking how to solve a
> coding problem, he was asking how to define a mapping between two data
> models, and preferably how to make that mapping reversible. To which the
> only answer I can come up with is: it isn't easy, and it may be impossible.
> (Actually I wrote a PhD thesis many moons ago which said much the same
> thing, though not of those two particular data models, and I said it in
> 100,000 words rather than eight.)

Nonetheless, it bears repeating.


This mapping isn't clean since it crosses between two essentially dissimilar semantic domains.

In ODF, as in many document formats, a "paragraph" is defined as a fairly flexible construct which, while it is distinct in layout from nearby structures (its distinction being indicated by vertical whitespace or at least a line indent), nevertheless can contain other structures (as David noted) which themselves claim the same distinction. Lists, tables and figures can all occur "within" the paragraph even while in rendition they all appear at "block level".

HTML takes a rather stricter view of it, in which a "p" is the lowest-level block-level element, which by definition admits no other block-level element as content.

They are thus incompatible by their respective definitions of what it is to be a paragraph. (As David also notes, the more flexible and capacious model is closer to what we learned in school to call a "paragraph".)

Unpacking the semantics of "paragraph" in the two cases shows that the only possible mapping into HTML is what David suggested, into a div, and even then there is arguably some loss, since what was previously claimed to be a paragraph (a thing properly within the domain of rhetoric, not typography) is no longer claimed to be one.

Despite the fancy circumlocutions it can support, XSLT is mainly a tool for renaming or relabelling. In this case, the thing we are renaming has to be called "div" in HTML, not "p", since many cases are excluded by the HTML rules for "p". Only two sizes being available, we have to choose the one that's too large.

Calling it <div class="p"> might help, if you can bear it.
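For what it's worth, the relabelling might be sketched roughly as below. This is only a minimal XSLT 1.0 sketch, not a tested odf2xhtml stylesheet: it assumes the source paragraphs are text:p elements in the usual ODF text namespace, that the output is XHTML, and that templates for whatever sits inside the paragraph are defined elsewhere.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="text">

  <!-- Assumption: every ODF paragraph is relabelled as a div, never a p,
       so that block-level content nested inside it stays valid in XHTML. -->
  <xsl:template match="text:p">
    <!-- The class value "p" records that the source called this a paragraph. -->
    <div class="p">
      <!-- Children (inline or block-level) are handled by other templates. -->
      <xsl:apply-templates/>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

The class attribute is what keeps a reverse mapping at least thinkable: a stylesheet going the other way could match div[@class='p'] and emit text:p again, though nothing in this sketch guarantees the round trip.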

Cheers,
Wendell
