Re: [xsl] odf2xhtml: Processing nested element content seperatly ?

Subject: Re: [xsl] odf2xhtml: Processing nested element content seperatly ?
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Sat, 28 Oct 2006 02:12:04 +0200
Andreas M. wrote:
ODF has a different approach to lining out text than HTML. HTML is sensible: Within html:p there may be no other block-elements. Only inline-elements are allowed. The same is valid for inline elements (ie: html:span, html:img, html:a). They may contain no block-elements (html:div, html:h*, etc.)

If that's a fact (sensible ---> sensitive, I know ;-), why did they create XHTML 1.1 with modularization, which allows for modules and thus expansion? I tried a few things, adding and changing elements, but couldn't really get it to work (the W3C validator kept complaining; I'm sure I did something wrong). Others, though, have managed to mix XHTML 1.1 with MathML. I believe that what you want, different content models for some elements, should be possible through this expansion mechanism (but getting it to display correctly in browsers is another story).


Somewhere in the docs it says "minimal content model" for html:p elements and the like, set to PCDATA and %inline. Doesn't "minimal" mean that it can be expanded?


I also have no clue as to what technique to use in order to get the
<text:p> and the <draw:frame> correct. In HTML the only element,
that would match a draw-frame would be a <div>, but a <div> is not
allowed within <p>. So, for ODF this is a perfect fit, and it is
also perfectly legal to have an <img> within a <p> in HTML, but as
soon as we get the frame, there seems to be a problem.

I would be very glad if someone knew of a solution, since right
now I make everything a <div>, and this is surely not how HTML should
be marked up.

Well, you can put a <p> inside a <p>, but in standards compliance mode browsers will complain (well, not literally, but they will not try to render it correctly with CSS styling attached unless you switch to quirks mode; but hey, you wanted XHTML, so you have to obey its rules), and validators will not allow it (it is not XHTML, as you pointed out). You can do some trickery to get it working, but I am almost sure that "trickery" is not your best bet in a task such as ODF-to-XHTML conversion.


Maybe this is an option (but only if you allow CSS): inline elements may be contained inside inline elements, and CSS lets you change any element from inline to block and vice versa. Using inline CSS (not between tags, that is not allowed anymore; put it in the head section instead), you can use <span> elements instead of <div> and, moreover, you can use them inside <p>.
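
As a rough sketch of that span idea (the class name "frame" and the file name "image.png" are made up for illustration; they are not something your converter or ODF prescribes), a frame inside a paragraph could come out like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
   <title>Block-level span inside a paragraph</title>
   <style type="text/css">
       /* span is still an inline element for the validator,
          but the browser lays it out as a block */
       span.frame {display: block; margin: 1em 2em; border: 1px solid #999;}
   </style>
</head>
<body>
   <p>
       Text before the frame.
       <span class="frame"><img src="image.png" alt="Framed image" /></span>
       Text after the frame, still in the same paragraph.
   </p>
</body>
</html>

The paragraph is never closed around the frame, so the <p> keeps its meaning, and only the style sheet decides that the span behaves like a box.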

But alas, this will not bring you any closer to putting a <p> inside a <p>, which leaves you with semantic problems. Search engines especially are picky about this and would rather have elements that emphasize the content model. You can mimic this, however, by using <em>, <strong> and the like and changing them to style="display:none".

With CSS, there is also another option, which requires more work and testing, but will result in legal XHTML and will retain your <p> tags, even where you would actually use a <p> inside a <p>. Using classes will give you more flexibility. And it renders well in all browsers.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
   <title>Webitor Compatibility Tests Home</title>
   <style type="text/css">
       p {margin-left: 2em;}
       div p {margin-left: 4em;}
       span {display: block; margin: 120px;}
   </style>
</head>
<body class="compat">
   <p>
       Opening P with some text to make it a P....
   </p>
   <div>
       <p>
           This paragraph is inside the parent P,
           even though the DOM tree shows differently, and it retains
           its semantics of being a paragraph.
       </p>
   </div>
   <p>
       ... and here continues the text of the
       original P again. Looks a bit like a
       blockquote, this way, doesn't it?
   </p>
</body>
</html>


But all of these are workarounds for dealing with the screen mentality of (X)HTML while trying to apply the print mentality of ODF to it.


Btw: on the topic of keeping two-way conversion stable, I'd recommend an approach I often see from Microsoft, though others do it too: adding specially designed comments with data that more closely resembles the original.
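
To give an idea of what such a comment might look like (the "odf:" prefix and the layout of the comment are purely my own invention here, not an existing convention; only the draw:frame attributes are taken from ODF), a fragment for inside the body could be:

   <p>
       Text before the frame.
       <!-- odf: draw:frame draw:name="Frame1" text:anchor-type="paragraph" -->
       <span class="frame">Content of the frame.</span>
       Text after the frame.
   </p>

A reverse transformation could then read the comment to restore the original draw:frame, rather than guessing from the span. Just make sure the data you store never contains "--", since that is not allowed inside an XML comment.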

Good luck!

Cheers,
-- Abel Braaksma
  http://www.nuntia.com
