Subject: [xsl] odf2xhtml: Processing nested element content separately?
From: "Andreas M." <sfamix@xxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 27 Oct 2006 15:24:50 +0200
Hi,

I am trying to write an OASIS ODF -> XHTML XSLT stylesheet, and I want the mapping to be as close to 1:1 as possible. I have run into some problems that I cannot find a way to solve. I am using XSLT 1.0 and currently process with MSXML.NET in oXygen.

A quick outline of the problem: ODF takes a different approach to laying out text than HTML does. HTML is sensible: html:p may not contain block elements, only inline elements. The same holds for inline elements (e.g. html:span, html:img, html:a): they may not contain block elements (html:div, html:h*, etc.). ODF, on the other hand, can intermix paragraphs with tables and frames (for which html:div would be the most logical translation).

Now, if the source document has a paragraph containing a frame with an image, and this image frame itself contains a paragraph of text (a description of the image), the problems start. It seems, at least with my knowledge and skills, impossible to create a clean ODF -> XHTML translation. Have a look at the horrible result below. Of course, it is completely invalid XHTML.
"content.xml" (the source): <text:h text:style-name="Heading_20_1" text:outline-level="1"> <draw:frame draw:style-name="fr2" draw:name="Grafik1" text:anchor-type="paragraph" svg:x="2.27cm" svg:y="2.057cm" svg:width="5.689cm" style:rel-width="22%" svg:height="5.539cm" style:rel-height="scale" draw:z-index="11"> <draw:image xlink:href="Pictures/100000000000012C0000012CBED4AE2D.jpg" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad"/> </draw:frame>TITLE_TEXT </text:h> <text:p text:style-name="Text_20_body"> <draw:frame draw:style-name="fr3" draw:name="KaratekaPrincess" text:anchor-type="paragraph" svg:x="15.727cm" svg:y="0.279cm" svg:width="10.16cm" svg:height="7.17cm" draw:z-index="4"> <draw:image xlink:href="Pictures/10000201000001800000010FE410B668.png" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" /> </draw:frame> SOME_PARAGRAPH_TEXT <text:span text:style-name="Emphasis">THIS_WILL_BE_EMPHASIZED </text:span>. PARAGRAPH_TEXT_CONTINUES </text:p> "content.html" (the result): <div style="top:2.27cm;left:2.057cm;height:5.539cm;width:5.689cm;border:1px solid black;"> <img src="Pictures/100000000000012C0000012CBED4AE2D.jpg" alt="Pictures/100000000000012C0000012CBED4AE2D.jpg"/> </div>TITLE_TEXT <p> <div style="top:15.727cm;left:0.279cm;height:7.17cm;width:10.16cm;border:1px solid black;"> <img src="Pictures/10000201000001800000010FE410B668.png" alt="Pictures/10000201000001800000010FE410B668.png"/> </div>SOME_PARAGRAPH_TEXT<span>THIS_WILL_BE_EMPHASIZED</span>.PARAGRAPH_TEXT_CONTINUES. </p> This is completly crazy! Please note, that both images are outlined "at paragraph" in OpenOffice. So it should not happen, imo, that the first image gets put into the <text:h>, since there is clearly a new paragraph following the heading. I mean, the title comes _before_ the image in the document, which is aligned at the side to the paragraph following the heading. 
I also have no clue which technique to use to get the <text:p> and the <draw:frame> right. In HTML, the only element that would match a draw:frame is a <div>, but a <div> is not allowed within <p>. For ODF this nesting is a perfect fit, and it is also perfectly legal to have an <img> within a <p> in HTML, but as soon as the frame comes in, there is a problem. I would be very glad if someone knew of a solution, since right now I turn everything into a <div>, and this is surely not how HTML should be marked up.

I also checked the XSL FAQ, especially the entry about xsl:copy. I had hoped that I could somehow rearrange the elements in question programmatically: first extract all the text from the text:p element and remember everything else contained in it, then process those other elements separately, after the text:p has been transformed neatly into html:p. However, if I use text() I get only the first fragment of the text, and since I still need to issue an xsl:apply-templates, I even get the text twice.

Thanks.

--
Bye, Andreas M.
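
P.S. To make the rearrangement described above concrete, here is a minimal XSLT 1.0 sketch of what I have in mind. It is only a sketch under several assumptions, not a working converter: frames appear only as direct children of text:p or text:h, text:outline-level is always present and between 1 and 6, and all positioning (svg:x, svg:y and so on) is left out. It avoids xsl:value-of select="text()", which in XSLT 1.0 outputs only the first node of the selected set, and uses xsl:apply-templates with a filtered select so that every child is processed exactly once.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:draw="urn:oasis:names:tc:opendocument:xmlns:drawing:1.0"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="text draw xlink">

  <!-- Paragraph: inline content goes into <p>, frames are hoisted out -->
  <xsl:template match="text:p">
    <p>
      <!-- all child nodes except frames, in document order -->
      <xsl:apply-templates select="node()[not(self::draw:frame)]"/>
    </p>
    <!-- frames become siblings that follow the closing </p> -->
    <xsl:apply-templates select="draw:frame"/>
  </xsl:template>

  <!-- Heading: same idea; assumes text:outline-level is 1..6 -->
  <xsl:template match="text:h">
    <xsl:element name="h{@text:outline-level}">
      <xsl:apply-templates select="node()[not(self::draw:frame)]"/>
    </xsl:element>
    <xsl:apply-templates select="draw:frame"/>
  </xsl:template>

  <xsl:template match="text:span">
    <span><xsl:apply-templates/></span>
  </xsl:template>

  <!-- Frame: a block-level <div>, now outside of <p> and <hN> -->
  <xsl:template match="draw:frame">
    <div>
      <xsl:apply-templates select="draw:image"/>
    </div>
  </xsl:template>

  <xsl:template match="draw:image">
    <img src="{@xlink:href}" alt="{@xlink:href}"/>
  </xsl:template>

</xsl:stylesheet>
```

Text nodes are copied by the built-in template rule, so no explicit template is needed for them. Whether a frame should be emitted before or after its paragraph is a layout decision; the sketch simply puts it after, which at least matches the heading case, where the title comes before the image.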