Subject: Re: [xsl] Escaping special characters for *nix file path
From: Lighton Phiri <lighton.phiri@xxxxxxxxx>
Date: Sun, 29 Jul 2012 15:23:02 +0200
Great! Thank you. I guess all I needed to do was revise my regular
expression skills. I eventually settled on the expression below. I
couldn't quite figure out how to put the single quote in the second
group, so I just use an extra replace function.

replace(replace($filename,&quot;'&quot;,&quot;\\'&quot;),
'(&quot;|\(|\)|\[|\]|\s+)', '\\$1')
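
For readers who want to try the logic outside an XSLT processor, here is a rough Python sketch of the same two-step escaping. It uses Python's re module rather than XPath's replace(), so the group reference is spelled \1 instead of $1. The function names are made up for illustration. The second function shows one way the single quote could be folded into the same group by using a character class; that variant is an assumption, not something proposed in the thread.

```python
import re


def escape_path(filename: str) -> str:
    """Mirror the nested replace() calls with Python's re module."""
    # Inner call: put a backslash in front of every single quote.
    step1 = filename.replace("'", "\\'")
    # Outer call: backslash-escape double quotes, parentheses, square
    # brackets, and each run of whitespace (the match is group 1).
    return re.sub(r'("|\(|\)|\[|\]|\s+)', r'\\\1', step1)


def escape_path_single_pass(filename: str) -> str:
    """Hypothetical variant: one character class that also covers '."""
    return re.sub(r'''(["'()\[\]]|\s+)''', r'\\\1', filename)


if __name__ == "__main__":
    name = "3218AD Eland's Bay/BB15 (a)[b]"
    print(escape_path(name))
    print(escape_path_single_pass(name))
```

Note that \s+ matches a whole run of whitespace, so two consecutive spaces receive only one backslash between them. If every whitespace character needs its own backslash, \s without the + may be the safer choice.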

Lighton Phiri
http://lightonphiri.org


On 29 July 2012 14:53, Liam R E Quin <liam@xxxxxx> wrote:
> On Sun, 2012-07-29 at 06:30 +0200, Lighton Phiri wrote:
>
>> Yes they are allowed, but special characters need to be escaped for
>> one to access a file path.
> [..]
>> phiri@PHRLIG001:~$ touch ../data/Sites/3218AD Eland\'s
>> Bay/Bobbejaansberg/BB15/testFile.txt
>> touch: cannot touch `Bay/Bobbejaansberg/BB15/testFile.txt': No such
>> file or directory
>
> This is because the shell splits arguments at spaces, so you actually
> gave the touch command several filenames. If you used quotes
> $ touch "../data/Sites/3218AD Eland's
> Bay/Bobbejaansberg/BB15/testFile.txt"
>
> then it would work fine.
>
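
The word splitting described above can be reproduced without touching the file system. This is a small sketch that uses Python's shlex module, which follows POSIX shell quoting rules closely enough for this purpose. It is an illustration only and does not involve a real shell.

```python
import shlex

path = "../data/Sites/3218AD Eland's Bay/Bobbejaansberg/BB15/testFile.txt"

# Unquoted, with only the apostrophe escaped: each space still ends an argument.
unquoted = shlex.split(
    "touch ../data/Sites/3218AD Eland\\'s Bay/Bobbejaansberg/BB15/testFile.txt"
)

# Wrapped in double quotes: the whole path stays a single argument.
quoted = shlex.split('touch "' + path + '"')

# shlex.quote() builds a shell-safe form of an arbitrary string.
safe = shlex.split("touch " + shlex.quote(path))

if __name__ == "__main__":
    print(unquoted)
    print(quoted)
    print(safe)
```

The unquoted form breaks the path apart at each space, while both quoted forms hand touch exactly one path.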
> It is not necessary to escape characters from the shell inside an XSLT
> stylesheet, because the shell isn't involved.
>
> [...]
>
>> >replace($filename, "[ '\\`&;]", "\\&")
>> >will probably do what you want.
>> >Or use \s instead of space if there might be newlines.
>> >You might also need to replace " with \"
>>
>> I am getting an error when I try what you suggested.
>>
>> <xsl:value-of select="{replace($filename, "[ '\\`&;]", "\\&")}" />
>
> You don't want the curly braces there - value-of is expecting an
> expression, not a string.
>
> [...]
>
>> My file paths have special characters in them and those characters
>> include 'square brackets', 'parentheses', 'ampersands', etc. all of
>> which are interpreted differently by the regular expression given by
>> second argument of replace function.
>
> That doesn't matter.
>
> Actually what I tend to do myself is something like
>    replace($filename, "[^a-zA-Z0-9]+", "-")
> to turn any sequence of characters other than letters or digits into a
> hyphen. If it's a path rather than a filename, include / after the 9
> there. But if the data is untrusted you should not normally allow /
> inside it.
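
A quick way to see what this hyphen-substitution approach produces is the following Python sketch. It uses re.sub in place of XPath's replace(); the function names are hypothetical.

```python
import re


def slugify_name(filename: str) -> str:
    # Collapse every run of characters other than ASCII letters or digits
    # into a single hyphen.
    return re.sub(r"[^a-zA-Z0-9]+", "-", filename)


def slugify_path(path: str) -> str:
    # Same idea, but "/" is kept so directory separators survive.
    # Only appropriate when the input is trusted.
    return re.sub(r"[^a-zA-Z0-9/]+", "-", path)


if __name__ == "__main__":
    print(slugify_name("3218AD Eland's Bay"))
    print(slugify_path("Sites/3218AD Eland's Bay/BB15"))
```

One side effect to keep in mind: the dot before a file extension is also replaced, so testFile.txt becomes testFile-txt unless the dot is added to the allowed characters.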
>
> Finally, as Mike Kay implied in a separate message, you probably want
> \$0 as the replacement, not \& (I use too many regular expression
> libraries, sorry)
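
The whole-match reference is spelled differently in each regular expression library, which is the source of the confusion above. As a point of comparison, here is a sketch of the earlier character-class suggestion in Python, where the whole match is written \g<0>. The function name is made up for illustration, and the XPath spelling should be checked against your processor's documentation.

```python
import re


def backslash_escape(filename: str) -> str:
    # Character class from the thread: space, single quote, backslash,
    # backtick, ampersand, semicolon.
    # In the replacement, "\\" yields a literal backslash and \g<0>
    # is Python's reference to the entire match.
    return re.sub(r"[ '\\`&;]", r"\\\g<0>", filename)


if __name__ == "__main__":
    print(backslash_escape("Eland's Bay & Co"))
```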
>
> Liam
>
> --
> Liam Quin - XML Activity Lead, W3C, http://www.w3.org/People/Quin/
> Pictures from old books: http://fromoldbooks.org/
> Ankh: irc.sorcery.net irc.gnome.org freenode/#xml
