Subject: Re: [xsl] where to look for xsl folk..
From: "Flynn, Peter pflynn@xxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 21 Jun 2016 08:59:10 -0000

On 21/06/16 04:42, adam adam@xxxxxxxxxxxxxxx wrote:
> thanks, I know the PKP project and mentioned it in an earlier post. I'm
> not looking to adapt that approach.
>
> Rather I am looking to convert docx to HTML with xsl.

I am assuming that you mean "docx with no named styles" here. That is,
the Word document is a sequence of undistinguished paragraphs, and the
only clue to their function (headings, etc) is in the fonts and spacing
that they use (with minimal help wrt list items).

> No magic involved.

If the above is true, then you *will* need some magic, in the form of
heuristics expressed in XSLT, to guess the function from the OOXML.
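
As a rough sketch of what such a heuristic might look like (the 14pt
threshold and the choice of h2 are illustrative assumptions, not
anything Word prescribes), the following treats a paragraph as a heading
when every text-bearing run directly inside it is both bold and at
least 14pt. It only sees formatting applied directly to the runs, so
anything inherited from the document defaults would be missed:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:output method="html"/>

  <!-- Guess: all runs bold and w:sz of 28 half-points (14pt) or more
       means "heading". The threshold and the h2 are assumptions. -->
  <xsl:template match="w:p[w:r/w:t]
      [not(w:r[w:t][not(w:rPr/w:b[not(@w:val='0' or @w:val='false')])
                    or not(w:rPr/w:sz/@w:val &gt;= 28)])]">
    <h2>
      <xsl:apply-templates select=".//w:t"/>
    </h2>
  </xsl:template>

  <!-- Everything else stays an ordinary paragraph -->
  <xsl:template match="w:p">
    <p>
      <xsl:apply-templates select=".//w:t"/>
    </p>
  </xsl:template>

</xsl:stylesheet>

Real documents will need several such rules (spacing, centring, font
changes), and they will still guess wrong some of the time.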

If, on the other hand, there are some named styles involved (even just
the Word built-in set), then a reasonable "good-enough" conversion can
be obtained with minimal effort: install a new copy of Word, note down
the names of the default styles available, and write a template for each.
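
A minimal sketch of that approach, assuming an English-language Word
where the built-in headings carry the style IDs "Heading1" and
"Heading2" (w:pStyle refers to the style ID, not the display name, and
the IDs can differ in localized installations, so check styles.xml):

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    exclude-result-prefixes="w">

  <xsl:output method="html"/>

  <!-- One template per named style; the style IDs are assumptions -->
  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading1']">
    <h1>
      <xsl:apply-templates select=".//w:t"/>
    </h1>
  </xsl:template>

  <xsl:template match="w:p[w:pPr/w:pStyle/@w:val='Heading2']">
    <h2>
      <xsl:apply-templates select=".//w:t"/>
    </h2>
  </xsl:template>

  <!-- Fallback for paragraphs with no style, or a style not handled -->
  <xsl:template match="w:p">
    <p>
      <xsl:apply-templates select=".//w:t"/>
    </p>
  </xsl:template>

</xsl:stylesheet>

The templates with a predicate get a higher default priority than the
bare w:p one, so the fallback only fires when no named style matches.
Extending it later is just a matter of adding another template.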

> Good enough HTML is good enough.

<xsl:template match="w:p">
  <p>
    <xsl:apply-templates/>
  </p>
</xsl:template>

> I was looking for someone to help me
> build this as well structured stylesheets that can be extended later.

There is an extensive set of XSLT2 stylesheets which my university uses
in their on-site hosted journals. They are migrating the journals to
OJS, so this code will become redundant, and I am sure they would be
amenable to sharing it if requested. It is, however, highly specific to
the set of named styles used by the journals, so it would need extensive
paring-back if you wanted it to function as a generic transformation.

<plug>
It may be useful to know that many of the techniques involved are
covered in various sessions at the XML Summerschool in Oxford in Sept.
</plug>

///Peter
