Re: [xsl] generating Office Open XML parts using xslt

Subject: Re: [xsl] generating Office Open XML parts using xslt
From: "Jirka Kosek jirka@xxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 28 Jul 2014 16:00:59 -0000
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 28.7.2014 15:14, Wendell Piez wapiez@xxxxxxxxxxxxxxx wrote:
> This is fantastic ... and brings up the related question -- how
> about going the other way, reading data out of XLSX format?

Well, this is actually pretty easy if you use a Java-based XSLT engine.
In Java you can prepend jar: to a URI and append !/ followed by a path
inside the archive, which gives direct access to files stored inside a
ZIP file (and OOXML files are just ZIP files with additional metadata).
Several lookups are necessary to find the proper files in the OPC
package, but it's perfectly doable. I don't have an XLSX example at
hand, but please find below an example of reading some statistical data
from a DOCX file. With XLSX you can do a similar thing. The comments
were originally written in Czech, but the code should be understandable
without them as well.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:r="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
                xmlns:dc="http://purl.org/dc/elements/1.1/"
                exclude-result-prefixes="r w dc">

  <!-- Parameter for passing the address of the document we want to process -->
  <xsl:param name="url">file:../wordprocessingml/zahlavi.docx</xsl:param>

  <!-- Variables representing the individual parts inside the OOXML file -->
  <!-- The jar: scheme allows transparent access to ZIP/JAR archives -->
  <xsl:variable name="rels"
                select="doc(concat('jar:', $url, '!/_rels/.rels'))"/>

  <!-- URI of the main part in the package -->
  <xsl:variable name="mainPartUri"
                select="$rels/r:Relationships/r:Relationship[@Type =
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument']/@Target"/>

  <!-- Document of the main part -->
  <xsl:variable name="doc"
                select="doc(concat('jar:', $url, '!/', $mainPartUri))"/>

  <!-- URI of the metadata part -->
  <xsl:variable name="metaPartUri"
                select="$rels/r:Relationships/r:Relationship[@Type =
'http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties']/@Target"/>

  <!-- Document with the metadata -->
  <xsl:variable name="meta"
                select="doc(concat('jar:', $url, '!/', $metaPartUri))"/>


  <!-- Template with which the transformation starts -->
  <xsl:template name="stat">
    <html>
      <head>
        <title>Document statistics
          <xsl:value-of select="$meta/*/dc:title"/>
        </title>
      </head>
      <body>
        Document title: <xsl:value-of select="$meta/*/dc:title"/><br/>
        Document author: <xsl:value-of select="$meta/*/dc:creator"/><br/>
        Number of paragraphs: <xsl:value-of select="count($doc//w:p)"/><br/>
      </body>
    </html>
  </xsl:template>

</xsl:stylesheet>
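
For XLSX the same lookups might look roughly like the sketch below. It
is not taken from a real project: the file name book.xlsx, the template
name "dump" and all variable names are made up for illustration. It
assumes a Transitional (not Strict) workbook, relationship targets that
are relative to the workbook part, and an XSLT 2.0 engine running on
Java so that jar: URIs resolve. It prints the cells of the first
worksheet as tab-separated text, looking up shared strings on the way.

```xslt
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:pr="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
                xmlns:s="http://schemas.openxmlformats.org/spreadsheetml/2006/main"
                exclude-result-prefixes="xs pr r s">

  <xsl:output method="text"/>

  <!-- Hypothetical sample workbook; override with url=... -->
  <xsl:param name="url">file:book.xlsx</xsl:param>

  <!-- Common prefix of every part URI inside the package -->
  <xsl:variable name="pkg" select="concat('jar:', $url, '!/')"/>

  <!-- Package-level relationships name the workbook part (usually xl/workbook.xml) -->
  <xsl:variable name="workbookUri"
                select="replace(doc(concat($pkg, '_rels/.rels'))/pr:Relationships/pr:Relationship[@Type =
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument']/@Target, '^/', '')"/>
  <xsl:variable name="workbook" select="doc(concat($pkg, $workbookUri))"/>

  <!-- Folder of the workbook part; relationship targets are assumed relative to it -->
  <xsl:variable name="folder" select="replace($workbookUri, '[^/]+$', '')"/>

  <!-- Relationships of a part live in _rels/NAME.rels next to that part -->
  <xsl:variable name="wbRels"
                select="doc(concat($pkg, $folder, '_rels/',
                                   tokenize($workbookUri, '/')[last()], '.rels'))"/>

  <!-- First worksheet, found through the r:id of the first sheet element -->
  <xsl:variable name="sheet"
                select="doc(concat($pkg, $folder,
                        $wbRels/pr:Relationships/pr:Relationship
                          [@Id = $workbook/s:workbook/s:sheets/s:sheet[1]/@r:id]/@Target))"/>

  <!-- Shared string table; the part is absent when the workbook holds no strings -->
  <xsl:variable name="sstTarget"
                select="$wbRels/pr:Relationships/pr:Relationship[@Type =
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings']/@Target"/>
  <xsl:variable name="sst"
                select="if ($sstTarget) then doc(concat($pkg, $folder, $sstTarget)) else ()"/>

  <!-- Entry template: one output line per row, cells separated by tabs -->
  <xsl:template name="dump">
    <xsl:for-each select="$sheet/s:worksheet/s:sheetData/s:row">
      <xsl:value-of separator="&#9;"
                    select="for $c in s:c return
                            if ($c/@t = 's')
                            then string-join($sst/s:sst/s:si[xs:integer($c/s:v) + 1]//s:t, '')
                            else if ($c/@t = 'inlineStr')
                            then string-join($c/s:is//s:t, '')
                            else string($c/s:v)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon the named template can serve as the entry point through the
-it:dump option. Empty cells are simply missing from the sheet XML, so
columns can shift in sparse rows; a real implementation would have to
look at the r attribute of each cell to place it correctly.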


Of course, it would be nice to have a set of XPath functions providing
an easier API for accessing all the data in such documents.
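
As a rough idea of what such functions could look like in plain XSLT
2.0, here is a minimal sketch. The opc prefix, its urn:example:opc
namespace and the function names opc:part-doc and opc:root-part are
invented for this illustration and do not belong to any existing
library; the sketch again assumes a Java-based engine that resolves
jar: URIs.

```xslt
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:r="http://schemas.openxmlformats.org/package/2006/relationships"
                xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
                xmlns:opc="urn:example:opc"
                exclude-result-prefixes="xs r w opc">

  <xsl:output method="text"/>

  <xsl:param name="url">file:../wordprocessingml/zahlavi.docx</xsl:param>

  <!-- Hypothetical helper: parse one part of a ZIP-based package.
       A leading slash in the part path is dropped. -->
  <xsl:function name="opc:part-doc" as="document-node()">
    <xsl:param name="package" as="xs:string"/>
    <xsl:param name="path" as="xs:string"/>
    <xsl:sequence
        select="doc(concat('jar:', $package, '!/', replace($path, '^/', '')))"/>
  </xsl:function>

  <!-- Hypothetical helper: document of the first part that the
       package-level relationships list for the given relationship type,
       or the empty sequence when there is none. -->
  <xsl:function name="opc:root-part" as="document-node()?">
    <xsl:param name="package" as="xs:string"/>
    <xsl:param name="type" as="xs:string"/>
    <xsl:variable name="target"
                  select="opc:part-doc($package, '_rels/.rels')
                          /r:Relationships/r:Relationship[@Type = $type][1]/@Target"/>
    <xsl:sequence
        select="if ($target) then opc:part-doc($package, $target) else ()"/>
  </xsl:function>

  <!-- Usage: count the paragraphs of the main document part -->
  <xsl:template name="stat">
    <xsl:value-of select="count(opc:root-part(string($url),
'http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument')//w:p)"/>
  </xsl:template>

</xsl:stylesheet>
```

The two functions only hide the jar: URI construction and the first
relationship lookup; relationships of individual parts would need a
third function along the same lines.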

				Jirka

- -- 
- ------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka@xxxxxxxx      http://xmlguru.cz
- ------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
- ------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
- ------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
- ------------------------------------------------------------------
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.17 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlPWc6gACgkQzwmSw7n0dR6wNQCfcRF3kbKez14D5+63sfm+u/g3
yRIAoIwLU9pE4ZRqLnTfjHZB45c4mx5Z
=gr+z
-----END PGP SIGNATURE-----
