Re: [xsl] creating a collection from an archive

Subject: Re: [xsl] creating a collection from an archive
From: "Graydon graydon@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 20 Apr 2018 13:09:44 -0000
On Thu, Apr 19, 2018 at 08:52:42PM -0000, Terry Badger terry_badger@xxxxxxxxx scripsit:
>    Here is part of the solution you want, I think. I think I used the various
>    namespaces later. This works in the current version of Oxygen. There is an
>    XML file in the Word archive that is a manifest of all the files in the zip
>    (Word), and you could extract that, then use it to get the names of the
>    other files.
> 
>    <?xml version="1.0" encoding="UTF-8"?>
>    <xsl:stylesheet  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>    xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs"
>    version="2.0"  xmlns:file="java.io.File" 
>    xmlns:StringUtils="java:org.apache.commons.lang.StringUtils" 
>    xmlns:class="http://saxon.sf.net/java-type"
>    xmlns:ZipFile="java.util.zip.ZipFile" 
>    xmlns:ZipInputStream="java.util.zip.ZipInputStream">
>    <xsl:output indent="yes"/>
>    <!-- ========================================= -->
>    <xsl:template name="main" match="/">
>    <xsl:variable name="doc-content"
>    select="doc('jar:file:///G:/Badger/xslt-with-java/XML_Projects.docx!/word/document.xml')"/>
>    <xsl:result-document href="document.xml">
>    <xsl:copy-of select="$doc-content"/>
>    </xsl:result-document>
>    </xsl:template>
>    </xsl:stylesheet>

If I define:
xmlns:ct="http://schemas.openxmlformats.org/package/2006/content-types"


<xsl:variable name="wordArchive" as="document-node()+">
   <xsl:variable name="pathPrefix" as="xs:string" select="concat('jar:file://', $wordArchiveURI)"/>
   <xsl:for-each select="doc(concat($pathPrefix,'!/[Content_Types].xml'))/ct:Types/ct:Override/@PartName/string()" >
       <xsl:sequence select="doc(concat($pathPrefix, '!', .))" />
   </xsl:for-each>
</xsl:variable>

<xsl:template name="xsl:initial-template">
   <bucket>
       <xsl:sequence select="$wordArchive/document-uri(.)" />
   </bucket>
</xsl:template>

works!

Thank you!

This doesn't quite get everything; /word/_rels/ is apparently invisible, for example.  But I should be able to figure that out.
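
A likely explanation: in [Content_Types].xml the .rels parts are normally covered by a ct:Default element keyed on the file extension rather than by individual ct:Override elements, so walking the Override entries never names them. One possible workaround, sketched below as a guess rather than a tested fix, is to derive the conventional relationships path for each part (/word/document.xml becomes /word/_rels/document.xml.rels), add the package-level /_rels/.rels, and keep only the ones doc-available() can actually open. The xsl:param declaration, the variable names partNames, relsNames and relsParts, and the part wrapper element are assumptions made for illustration; it also assumes the processor resolves jar: URIs the same way as in the working example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:ct="http://schemas.openxmlformats.org/package/2006/content-types"
    exclude-result-prefixes="xs ct">

  <!-- Assumption: absolute path of the archive, beginning with '/' -->
  <xsl:param name="wordArchiveURI" as="xs:string" required="yes"/>

  <xsl:variable name="pathPrefix" as="xs:string"
      select="concat('jar:file://', $wordArchiveURI)"/>

  <!-- Part names listed as Override entries, e.g. /word/document.xml -->
  <xsl:variable name="partNames" as="xs:string*"
      select="doc(concat($pathPrefix, '!/[Content_Types].xml'))
              /ct:Types/ct:Override/@PartName/string()"/>

  <!-- Candidate relationship parts: the package-level one, plus
       <dir>/_rels/<name>.rels for every listed part -->
  <xsl:variable name="relsNames" as="xs:string*"
      select="'/_rels/.rels',
              for $p in $partNames
              return replace($p, '^(.*/)([^/]+)$', '$1_rels/$2.rels')"/>

  <!-- Most parts have no relationships file, so probe before loading -->
  <xsl:variable name="relsParts" as="document-node()*">
    <xsl:for-each select="$relsNames">
      <xsl:variable name="uri" as="xs:string"
          select="concat($pathPrefix, '!', .)"/>
      <xsl:if test="doc-available($uri)">
        <xsl:sequence select="doc($uri)"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <bucket>
      <!-- One element per relationships part that was found -->
      <xsl:for-each select="$relsParts">
        <part><xsl:value-of select="document-uri(.)"/></part>
      </xsl:for-each>
    </bucket>
  </xsl:template>
</xsl:stylesheet>
```

Probing with doc-available() avoids a hard failure for the many parts that have no relationships file; the alternative of following Relationship/@Target from /_rels/.rels would also reach parts that [Content_Types].xml does not list as Overrides.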

-- Graydon
