Re: [xsl] document( URI ) with accented chars fails

Subject: Re: [xsl] document( URI ) with accented chars fails
From: "Alexandre Hoïde alexandre.hoide@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 18 Nov 2020 11:32:38 -0000
On Tue, Nov 17, 2020 at 10:51:25PM -0000, Alexandre Hoïde alexandre.hoide@xxxxxxxxxx wrote:
> On Tue, Nov 17, 2020 at 09:26:23PM -0000, Michael Kay mike@xxxxxxxxxxxx wrote:
> > The document() function expects a URI, not a filename, and URIs never contain accented characters.
> > 
> > XSLT 2.0+ has functions to escape special characters using %HH escapes so you can turn arbitrary filenames into valid URIs.
> > 
> > For xsltproc you'll need some processor-specific solution and I can't help you with that.

This is not directly XSLT-related, but just in case:

EXSLT has a `str:encode-uri`¹ function but, unfortunately,
`xsltproc` from `libxslt` does not implement it.

So I have now extended the bash script that builds
fileslist.xml with a small Perl script that uses the Perl
module “URI”² and is applied to each file path.

~~~{filename-to-uri.pl}
#!/usr/bin/env perl
# Print the URI form of the file path given as the first argument.
use strict;
use warnings;
use URI::file;
my $uri = URI::file->new( $ARGV[0] );
print $uri . "\n";
~~~

applied to each file name with:
~~~{bash command line}
$ perl filename-to-uri.pl <the-filename-to-convert-to-uri>
~~~
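
For setups where the Perl URI module is not available, the percent-encoding step alone can be approximated in plain shell. The following is only a sketch under assumptions: the function name `path_to_uri_path` is hypothetical, it escapes every byte of the path except unreserved characters and `/` as `%HH`, and it neither adds a `file:` scheme nor resolves relative paths. It is a swapped-in technique, not a replacement for what the URI module does.

```bash
# Hypothetical helper (not part of the original script): percent-encode
# a file path byte by byte so it can be used as a URI path.
# The body runs in a subshell "( ... )" so its variables stay local.
path_to_uri_path() (
  out=''
  # od prints each byte of the argument as two lowercase hex digits;
  # -v prevents od from collapsing repeated lines into "*".
  for hex in $(printf '%s' "$1" | od -An -v -tx1); do
    case $hex in
      # "-" "." "/" "_" "~", digits, A-Z and a-z pass through unchanged
      2d|2e|2f|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
        out="$out$(printf "\\$(printf '%03o' "0x$hex")")" ;;
      # every other byte becomes an uppercase %HH escape
      *)
        out="$out%$(printf '%02X' "0x$hex")" ;;
    esac
  done
  printf '%s\n' "$out"
)
```

It would be called like the Perl script, for example `path_to_uri_path "$file"` inside the loop that builds the list. Working on bytes rather than characters keeps the result independent of the current locale, at the cost of a few extra `printf` calls per byte.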

Best regards and thanks again,
Alexandre Hoïde

1. http://exslt.org/str/functions/encode-uri/index.html
2. https://metacpan.org/pod/URI
   (on GNU Guix the package is `perl-uri`)
