Re: [xsl] odf2xhtml: Processing nested element content seperatly ?

From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Mon, 06 Nov 2006 12:01:22 -0500

Dear Andreas,

At 08:46 PM 11/3/2006, you wrote:
On 04.11.2006 01:20 Andreas M. wrote

> BTW: I would be very glad if someone pointed me to a document on
> the web, where I can find (un)ordered lists being parts of sentences in

No need. Found it myself. Just a few seconds ago, when I wrote:

/*
 * VLC's HTTP capabilities allow for two different things:
 *  a) Using VLC as a media-server
 *  b) controlling VLC via HTTP requests (making VLC de facto a web-server)
 * resulting in [...]

Now, that's what I call an ordered list within a paragraph, and it seems
correct language to me.

Indeed. Here are more examples:
http://home.ccil.org/~cowan/style-revised.html#9
(and see also #10, whose first paragraph contains a complete sentence that contains a list.)


Also, regarding a list within a paragraph, I think it's worth considering that:

* This issue applies to other elements besides lists, such as tables, figures, and, notably, block quotations, any of which might occur within paragraphs.
* There are many XML document models in wide use that permit lists and other "block-level" elements within paragraphs, including DocBook, TEI and NCBI Journal (to name only three of the best established).


I think by raising the accessibility issue you've revealed the core of the difficulty with HTML. It's natural and proper to expect that an accessibility application such as a reader would take the term "paragraph" at face value and treat it differently from a div, even if the latter were designated "class='paragraph'". In this, such applications are simply taking HTML at its word, to mean that its "p" is what we mean by a "paragraph" and not just some arbitrary "division"; and "paragraph" is a term whose implied semantics (as you're discovering) go well beyond "vertical whitespace" into other things. That's what I mean when I say a paragraph is a "rhetorical thing" (some might say a "logical" thing) rather than a "typographical thing". Students in school learn to write paragraphs according to sets of rules that only incidentally include what should happen with respect to whitespace; in fact, the whitespace that identifies a paragraph will vary from case to case.

(A "paragraph" is literally a mark that used to be placed in the margin or the flow of text to identify text divisions in an age when paper was expensive and whitespace frequently wasn't used for this purpose.)

But due to the constraints imposed by its content models, an HTML "p" becomes just about useless for marking what are actually the "paragraphs" in a text (too many paragraphs require more than one p). This certainly does pose a problem for HTML-rendering applications that don't work by providing paragraphs with vertical whitespace (about the only thing an HTML p can be trusted to do).
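
To make "more than one p" concrete, here is a minimal sketch of how one source paragraph containing a list might end up as several HTML elements. It is written in Python rather than XSLT, purely so it can be run as-is. The source element name "para", the set of block-level tags, and the helper name split_paragraph are all assumptions made for illustration, and the div with class='paragraph' is only one of the possible mappings, not a recommended one.

```python
import copy
import xml.etree.ElementTree as ET

# Assumption: an illustrative subset of elements that HTML's p cannot contain.
BLOCK_TAGS = {"ul", "ol", "table", "blockquote"}


def split_paragraph(para):
    """Map one logical paragraph to a div holding p runs and block children.

    Hypothetical helper: inline content is collected into p elements, and
    each block-level child closes the current p and becomes a sibling of it.
    """
    wrapper = ET.Element("div", {"class": "paragraph"})
    current = None  # the p currently collecting inline content, if any

    def open_p():
        nonlocal current
        if current is None:
            current = ET.SubElement(wrapper, "p")
        return current

    def add_text(text):
        if not text:
            return
        if current is None and not text.strip():
            return  # do not open an empty p for whitespace alone
        p = open_p()
        if len(p):
            # text following an inline child belongs in that child's tail
            p[-1].tail = (p[-1].tail or "") + text
        else:
            p.text = (p.text or "") + text

    add_text(para.text)
    for child in para:
        node = copy.deepcopy(child)  # leave the source tree untouched
        node.tail = None
        if child.tag in BLOCK_TAGS:
            current = None           # the block ends the running p
            wrapper.append(node)
        else:
            open_p().append(node)
        add_text(child.tail)
    return wrapper


# Input modelled on the VLC comment quoted earlier in this message.
src = ET.fromstring(
    "<para>VLC's HTTP capabilities allow for two different things:"
    "<ol><li>using VLC as a media server</li>"
    "<li>controlling VLC via HTTP requests</li></ol>"
    "resulting in [...]</para>"
)
out = split_paragraph(src)
print([c.tag for c in out])  # ['p', 'ol', 'p']
```

The single source paragraph comes out as two p elements with the list between them, so nothing in the result but the wrapping div records that the three pieces were one paragraph.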

There are some problems that transformations can't fix by themselves. We've discussed the best mapping and the choice, in this instance, between too tight and too loose. XSLT can even do an incorrect mapping if you like. But why HTML doesn't support a correct mapping cleanly is out of scope for this list.

Regards,
Wendell
