[xsl] [Ann (sort of)] ODF 1.2 to XHTML 1.1

Subject: [xsl] [Ann (sort of)] ODF 1.2 to XHTML 1.1
From: Gannon Dick <gannon_dick@xxxxxxxxx>
Date: Sat, 6 Dec 2008 11:38:45 -0800 (PST)
Open Office 3.0 has an XSLT export filter to make XHTML 1.1 versions.  An identity transform yields a flat ODF 1.2 file, so in effect the transform is ODF 1.2 to XHTML 1.1.

The XSLT can be run (on the flat file) outside of Open Office with a Java XSLT processor (like Saxon 9). A GRDDL transform available from the W3C will convert metadata (expressed according to Dublin Core's XHTML Rec.) to RDF/XML.
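
Since the stylesheets expect the flat (single XML file) form rather than the zipped package, it may be worth checking the input before handing it to an external processor. The following is only a sketch in Python, not part of the Open Office filter; the function name `is_flat_odf` and the sample file names are made up for illustration. It assumes a flat ODF file has `office:document` as its root element.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
import zipfile

# ODF "office" namespace; a flat ODF file has <office:document> as its root.
OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"


def is_flat_odf(path):
    """Return True if `path` looks like a flat (single XML file) ODF document."""
    if zipfile.is_zipfile(path):
        # A zipped package (.odt, .ods, ...) is not the flat form.
        return False
    try:
        with open(path, "rb") as fh:
            # Only the first start event (the root element) is inspected.
            for _event, elem in ET.iterparse(fh, events=("start",)):
                return elem.tag == f"{{{OFFICE_NS}}}document"
    except ET.ParseError:
        return False
    return False


# Small demonstration with two throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    flat = os.path.join(tmp, "sample.fodt")
    with open(flat, "w", encoding="utf-8") as fh:
        fh.write(f'<office:document xmlns:office="{OFFICE_NS}"/>')
    packaged = os.path.join(tmp, "sample.odt")
    with zipfile.ZipFile(packaged, "w") as zf:
        zf.writestr("content.xml", "<x/>")
    flat_ok = is_flat_odf(flat)
    packaged_ok = is_flat_odf(packaged)
```

A file that passes this check can then be given to the XSLT processor of your choice; the check says nothing about whether the document is valid ODF 1.2.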

I posted two small patches today at http://www.openoffice.org/issues/show_bug.cgi?id=44257, and with them everything should work, validate and graph correctly.

I would certainly appreciate hearing about any bugs arising from use outside of OO, other than the nag messages about using XSLT 1.0.  I know that there is a MathML display bug (around svg:desc).


1. Validation
   body.xsl Line 529 The attribute name was changed from "name" to "id"
   From the XHTML DTD - "The destination (or link 'target') is
   identified via its 'id' attribute rather than the 'name'
   attribute as was used in HTML."
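
To make the effect of that change concrete, here is a small Python sketch (not the patch itself, which is a one-attribute edit in body.xsl). The function name `name_to_id` and the sample markup are made up for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def name_to_id(root):
    """Move 'name' to 'id' on XHTML <a> elements that have no 'id' yet.

    Returns the number of anchors changed.
    """
    changed = 0
    for anchor in root.iter(f"{{{XHTML_NS}}}a"):
        if "name" in anchor.attrib and "id" not in anchor.attrib:
            anchor.set("id", anchor.attrib.pop("name"))
            changed += 1
    return changed


# Hypothetical input: an HTML-style anchor target plus a link to it.
sample = ET.fromstring(
    f'<p xmlns="{XHTML_NS}"><a name="sec1"/>Intro, see <a href="#sec1">here</a>.</p>'
)
count = name_to_id(sample)
```

The link `href="#sec1"` is untouched; only the target anchor changes, which is what the XHTML 1.1 DTD asks for.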

2. RDF Validation
   header.xsl Line 278 Provenance simplified
   The MeSH style CURIE was flattened as it was found that the
   W3C RDF Validator failed to produce a graph.  The
   [dcterms:provenance] was only printed if both a <meta:printed-by />
   and a <meta:print-date /> were present.  If a Literal is inadequate
   for processing, the "Print Publisher" or, for example,
   [dcterms:dateSubmitted] can be expressed in the user defined
   metadata fields.
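
The rule "emit a provenance Literal only when both elements are present" can be sketched in Python as follows. This is an illustration of the logic, not the code in header.xsl; the function name `provenance_literal` and the wording of the Literal ("Printed by ... on ...") are assumptions of mine.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def provenance_literal(meta_root):
    """Return a plain-text provenance string, or None if either
    <meta:printed-by> or <meta:print-date> is missing or empty."""
    printed_by = meta_root.findtext(f".//{{{META_NS}}}printed-by")
    print_date = meta_root.findtext(f".//{{{META_NS}}}print-date")
    if not printed_by or not print_date:
        return None
    # The exact wording of the Literal is hypothetical.
    return f"Printed by {printed_by} on {print_date}"


# Hypothetical metadata: one document with both elements, one with only a name.
complete = ET.fromstring(
    f'<office:meta xmlns:office="{OFFICE_NS}" xmlns:meta="{META_NS}">'
    "<meta:printed-by>A. Printer</meta:printed-by>"
    "<meta:print-date>2008-12-06T11:38:45</meta:print-date>"
    "</office:meta>"
)
partial = ET.fromstring(
    f'<office:meta xmlns:office="{OFFICE_NS}" xmlns:meta="{META_NS}">'
    "<meta:printed-by>A. Printer</meta:printed-by>"
    "</office:meta>"
)
full_value = provenance_literal(complete)
missing_value = provenance_literal(partial)
```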

3. Validation / Display
   body.xsl Lines 769-812 (no changes)
   The display of maths depends heavily upon links to CSS with
   functions added via XSL.  The necessary links are not built
   into the Transform.  Note to users: Here is your chance to
   fiddle with it - you know you want to.  That said, the
   validation of embedded formulae has no problem, although the
   namespace (MathML 1.01 vs. MathML 2.0) might cause problems in
   some applications, nor does the PDF output have any display
   problems.

Future Directions

It seems to me, and by all reports to me alone, that HTML seen as
an Object has an implied "self" RDF subject that is encapsulated
in the <head> only. Therefore it is mostly improper to mix
metadata by referencing other documents (also with a "self") in the
<head>.  There are exceptions (links to schema, links to original
sources, etc.), but in the main there exists no natural ontology
to sort out the semantics after the mixing.  Web surfing is not
Reasoning, for now anyway[a].

The transform markup is clean, and easy to read and edit by hand.
I see no reason why RDFa should not be used to clarify metadata
in the <body> of a document. The extra namespaces do interfere
with DTD validation though.  There is a transform to extract
RDF/XML triples from XHTML, and RDFa permits the development
of hierarchies[b].
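
As a rough picture of what RDFa in the <body> looks like, the Python sketch below parses a small XHTML fragment carrying `property` attributes and lists the (property, value) pairs. It is a toy, not a conforming RDFa processor: it does not resolve CURIEs or subjects, and the fragment and the function name `property_pairs` are my own illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative fragment; the dc: prefix is declared but not resolved here.
fragment = f"""<div xmlns="{XHTML_NS}"
     xmlns:dc="http://purl.org/dc/elements/1.1/" about="">
  <h1 property="dc:title">ODF 1.2 to XHTML 1.1</h1>
  <p>By <span property="dc:creator">Gannon Dick</span>,
     <span property="dc:date" content="2008-12-06">6 Dec 2008</span>.</p>
</div>"""


def property_pairs(root):
    """Collect (property, value) pairs in document order.

    The value is the 'content' attribute if present, otherwise the
    element's text content.
    """
    pairs = []
    for el in root.iter():
        prop = el.get("property")
        if prop is None:
            continue
        value = el.get("content")
        if value is None:
            value = "".join(el.itertext()).strip()
        pairs.append((prop, value))
    return pairs


pairs = property_pairs(ET.fromstring(fragment))
```

The `xmlns:dc` declaration on the <div> is the kind of extra namespace that a plain XHTML 1.1 DTD will not accept.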

There is one school of thought, again I might be the only student,
that says bibliographic references are a display for document ideas
just as MathML is a display for mathematical ideas.  So, why not
embed MODS (Metadata Object Description Schema) just like MathML?
MathML is Entity intensive and less amenable to schema processing,
so in fact this project is easier[c].
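
To illustrate the "reference as a display of a MODS record" idea, here is a small Python sketch that reads a minimal MODS fragment and renders a one-line citation. The record contents, the citation layout and the function name `citation` are assumptions for illustration only; a real record would carry far more detail.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Minimal illustrative record, describing reference [b] of this post.
record = f"""<mods xmlns="{MODS_NS}">
  <titleInfo><title>RDFa Primer</title></titleInfo>
  <originInfo><dateIssued>2008-10-14</dateIssued></originInfo>
  <location>
    <url>http://www.w3.org/TR/2008/NOTE-xhtml-rdfa-primer-20081014/</url>
  </location>
</mods>"""


def citation(mods):
    """Render a MODS record as a plain-text reference line."""
    ns = {"m": MODS_NS}
    title = mods.findtext("m:titleInfo/m:title", default="", namespaces=ns)
    date = mods.findtext("m:originInfo/m:dateIssued", default="", namespaces=ns)
    url = mods.findtext("m:location/m:url", default="", namespaces=ns)
    return f"{title} ({date}). <{url}>"


line = citation(ET.fromstring(record))
```

Because MODS is defined by an XML Schema with ordinary elements, this kind of lookup needs no entity handling, which is the sense in which it is easier than MathML.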


[1] PUBLIC "-//W3C//DTD XHTML 1.1 plus MathML 2.0 plus SVG 1.1//EN"

[2] W3C RDF Validator <http://www.w3c.org/>
    GRDDL <link rel="transformation" href="http://xml.openoffice.org/odf2xhtml/rdf-extract.xsl" />

[3] *.odf/content.xml PUBLIC "-//OpenOffice.org//DTD Modified W3C MathML 1.01//EN"

[a] <http://www.rustprivacy.org/>
[b] <http://www.w3.org/TR/2008/NOTE-xhtml-rdfa-primer-20081014/>
[c] <http://www.loc.gov/standards/mods/>
