Subject: Re: [xsl] creating a collection from an archive
From: "Terry Badger terry_badger@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 19 Apr 2018 20:52:35 -0000
Here is part of the solution you want, I think. I think I used the various namespaces later. This works in the current version of Oxygen. There is an XML file in the Word document that is a manifest of all the files in the zip (the .docx), and you could extract that and then use it to get the names of the other files.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0"
    xmlns:file="java.io.File"
    xmlns:StringUtils="java:org.apache.commons.lang.StringUtils"
    xmlns:class="http://saxon.sf.net/java-type"
    xmlns:ZipFile="java.util.zip.ZipFile"
    xmlns:ZipInputStream="java.util.zip.ZipInputStream">
    <xsl:output indent="yes"/>
    <!-- ========================================= -->
    <xsl:template name="main" match="/">
        <!-- The jar: URI addresses a single entry inside the .docx zip -->
        <xsl:variable name="doc-content"
            select="doc('jar:file:///G:/Badger/xslt-with-java/XML_Projects.docx!/word/document.xml')"/>
        <xsl:result-document href="document.xml">
            <xsl:copy-of select="$doc-content"/>
        </xsl:result-document>
    </xsl:template>
</xsl:stylesheet>

Terry

On Thursday, April 19, 2018 02:07:56 PM, Graydon graydon@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

So I have a Word document, localtest.docx, which is in the 2016 strict version of the OOXML standard. As such, it's a zip archive of a bunch of XML files. I want to apply XSLT to the XML files.

I could use the arch module and the collection function to write the whole thing to disk and then load it from disk as a collection before doing whatever to it and writing it to disk as an archive again, but this seems inefficient. It would be better to read the archive into an in-memory collection, manipulate it, and then write that back out as an archive.

I'm using XSLT 3.0 via Saxon 9.8.0.8 in oXygen.
<xsl:variable name="wordArchive" as="document-node()+">
  <xsl:variable name="arch" select="file:read-binary($wordArchiveURI)"/>
  <xsl:variable name="entries" select="arch:entries($arch)"/>
  <xsl:variable name="dirs" select="$entries[ends-with(.,'/')]"/>
  <xsl:sequence select="for $x in ($entries except $dirs)
                        return arch:extract-text($arch,$x) => parse-xml()" />
</xsl:variable>

works, in that I get a sequence of document nodes and those documents have the expected XML content.

I don't get document nodes with associated document-uri() values or any of the rest of the archive structure. Those URIs are in the values returned by arch:entries, but I'm not seeing how I assign a document-uri value to a document node. xsl:document doesn't seem to have a facility for assigning a document-uri value, and of course you can't create an attribute whose parent is a document node, even if document-uri was an attribute in the first place.

What I want is a collection where the structure matches the Word archive, various subdirectories and all, and I can use the doc() function to access the various component documents. I can't shake the feeling that I'm missing something obvious, but this feeling is no help in discerning what the obvious thing is!

Thanks!
Graydon
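
One possible way to keep each parsed document paired with its archive path, sketched here under stated assumptions rather than taken from the thread, is to hold the documents in an XPath 3.1 map keyed by entry name. This does not give the nodes a document-uri() and does not make doc() resolve those names; it only replaces doc() with a map lookup. The sketch assumes that the file: and arch: prefixes are bound to the EXPath File and Archive module namespaces, that those modules are available in the Saxon edition in use, that every non-directory entry is well-formed XML (as in the original snippet), and that the default path in the wordArchiveURI parameter is a placeholder.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:file="http://expath.org/ns/file"
    xmlns:arch="http://expath.org/ns/archive"
    exclude-result-prefixes="#all">

  <xsl:output indent="yes"/>

  <!-- Assumption: placeholder location of the .docx; override when invoking -->
  <xsl:param name="wordArchiveURI" as="xs:string" select="'localtest.docx'"/>

  <!-- The whole archive as binary, read once -->
  <xsl:variable name="arch" select="file:read-binary($wordArchiveURI)"/>

  <!-- Every entry except directory entries, as in the original snippet -->
  <xsl:variable name="entries"
      select="arch:entries($arch)[not(ends-with(., '/'))]"/>

  <!-- Map from entry path (for example word/document.xml) to parsed document -->
  <xsl:variable name="parts" as="map(xs:string, document-node())"
      select="map:merge(
                for $x in $entries
                return map:entry(string($x),
                                 parse-xml(arch:extract-text($arch, $x))))"/>

  <!-- Lists each entry path with the name of its root element -->
  <xsl:template name="xsl:initial-template">
    <parts>
      <xsl:for-each select="map:keys($parts)">
        <xsl:sort select="."/>
        <part name="{.}" root="{name($parts(.)/*)}"/>
      </xsl:for-each>
    </parts>
  </xsl:template>

</xsl:stylesheet>
```

With this in place, an expression such as $parts('word/document.xml') would stand in for a doc() call, assuming entry names are stored without a leading slash. The map keys preserve the subdirectory structure of the archive, so they can be reused as entry names when writing the archive back out.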