RE: [xsl] xslt2: Retrieving a directory's non-XML file names

Subject: RE: [xsl] xslt2: Retrieving a directory's non-XML file names
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 1 Oct 2008 11:40:39 +0100
on-error=warn means output a warning that the file is not valid XML, exclude
it from the collection, and then carry on to process other files; in the
end, collection() returns the subset of the files that are valid.

I think you will need an extension function for this. You could write a
CollectionURIResolver that returns the URIs of the files wrapped as XML
documents <text href="file://a/b/c/d.txt"/> and then use unparsed-text() to
read the file; but writing an extension function is no more work and equally

Michael Kay
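
For illustration, here is a minimal sketch of what such an extension function might look like in Java. The class name DirListing and the method name listTxt are hypothetical, and the sketch assumes $myDir holds a plain filesystem path rather than a URI. It returns the matching files as file: URIs so that each one could be passed on to unparsed-text().

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not part of Saxon.
public class DirListing {

    // Returns the URIs of all regular files in 'dir' whose names end in ".txt".
    // 'dir' is assumed to be a filesystem path, not a URI.
    public static String[] listTxt(String dir) {
        File[] entries = new File(dir).listFiles();
        if (entries == null) {
            // Not a directory, or it could not be read: return an empty list.
            return new String[0];
        }
        List<String> uris = new ArrayList<>();
        for (File f : entries) {
            if (f.isFile() && f.getName().endsWith(".txt")) {
                uris.add(f.toURI().toString());
            }
        }
        String[] result = uris.toArray(new String[0]);
        // listFiles() guarantees no particular order, so sort for stable output.
        Arrays.sort(result);
        return result;
    }
}
```

In a Saxon version that supports reflexive Java extension functions, this could probably be bound with a namespace declaration such as xmlns:dir="java:DirListing" and called as dir:listTxt($myDir). Whether that mechanism is available, and how the returned array is converted to a sequence of strings, depends on the Saxon version and edition in use.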

> -----Original Message-----
> From: Yves Forkl [mailto:Y.Forkl@xxxxxx] 
> Sent: 01 October 2008 11:07
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] xslt2: Retrieving a directory's non-XML file names
> Hi,
> from a directory whose path is stored in $myDir I would like 
> to retrieve the names of all files with extension ".txt". Of 
> course that could be done easily using shell mechanisms, but I 
> want to do this using XSLT 2 (with Saxon) only.
> That appears to be somewhat difficult; at least I couldn't 
> find the solution anywhere. Knowing that collection() gives 
> access to all of the XML documents in a given directory, I tried this:
>      <xsl:variable name="txt_files" as="xs:string*">
>        <xsl:for-each
>          select="collection(concat($myDir,
>            '?select=*.txt;on-error=warn'))/saxon:discard-document(.)">
>          <xsl:value-of select="unparsed-entity-uri(.)"/>
>        </xsl:for-each>
>      </xsl:variable>
> This gives me
> "Error SXXP0003: Error reported by XML parser: Premature end of file.
> Transformation failed: Run-time errors were reported"
> I guess the reason is that collection() is unable to return 
> the document node for any of these text files.
> So how can I obtain the list of .txt files from my directory in XSLT 2?
>    Yves
