Re: [xsl] Question on translate() function

Subject: Re: [xsl] Question on translate() function
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 25 Sep 2017 21:45:52 -0000
> I have always presumed that translate() is faster than replace().[1]

Probably. But you never know. It all depends on how much effort has gone into
the implementation.

I did a little test, running Saxon XQuery from the command line:

-qs:"declare variable $in as xs:string external; for $i in 1 to 1000 return
translate($in||$i, '!@B#$%^&amp;*()_+=', '##############')"
 in="a(b)@c" -o:test.out -t -repeat:20

Average execution time: 2.414801ms

-qs:"declare variable $in as xs:string external; for $i in 1 to 1000 return
replace($in||$i, '[!@B#$%^&amp;*()_+=]', '#')" in="a(b)@c" -o:test.out -t
-repeat:20

Average execution time: 4.025353ms

But if the latter were a common idiom then it wouldn't be hard for the regex
optimizer to generate an identical execution plan.
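
To illustrate why the two calls could in principle share one execution plan, here is a minimal sketch that checks they return the same string for the test input. The loop bound of 3, the default value for $in, and the variable names $chars and $hashes are assumptions made for this illustration; they are not part of the original test.

```xquery
xquery version "3.1";

(: Same input and character list as in the test; &amp; is the XQuery escape for "&". :)
declare variable $in as xs:string external := "a(b)@c";
declare variable $chars as xs:string := '!@B#$%^&amp;*()_+=';

(: One "#" per listed character, so translate() replaces instead of deleting. :)
declare variable $hashes as xs:string :=
  string-join((1 to string-length($chars)) ! '#', '');

for $i in 1 to 3
let $s := $in || $i
(: Wrapping $chars in [...] is only safe here because it contains no
   backslash, bracket, or hyphen, and does not start with "^". :)
return translate($s, $chars, $hashes) eq replace($s, '[' || $chars || ']', '#')
```

Each comparison should come out true, which is the sense in which a single-character class replaced by a single character is just a translate() in disguise.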

Michael Kay
Saxonica


> I know that I, the mere XSLT programmer, am not supposed to worry my
> pretty little head about optimization unless I actually have a
> problem. And I know that if one of my students asked, that's what I
> would answer: "You have 1 MiB of XML data, and that computer on your
> lap would have been considered a supercomputer a mere two decades
> ago. Yes, A is probably more efficient than B, but the number of
> microseconds you save, even when added up over thousands of
> iterations of this transformation, will end up being less time than
> this conversation. Besides, we don't know what optimization the XSLT
> engine is (or is not) doing -- for all we know it might be better at
> optimizing B, even if un-optimized B is slower. So don't fret the
> speed unless something is running too slow."
>
> But I can't help it, sometimes -- I'd really like to know if
> translate() is significantly more efficient (computationally) than
> replace() or not.
>
> Notes
> -----
> [1] Not sure why I have this prejudice. Perhaps because translate()
>    was in XSLT1, but replace() was not; more likely because
>    replace() uses regular expressions, which I imagine take quite a
>    bit of computing; but most likely as a leftover from my IBM S360
>    Assembler days, when a translate was done in a single machine
>    instruction. (Of course, it only operated on the 256 8-bit chars
>    of EBCDIC, not on Unicode.)
>
>
>> translate($string ,'()''+-*$=' , '#')
>>
>> That means "replace '(' by '#', remove any occurrences of ')' or '
>> or '+' or '-' or '*' or '$' or '=', and leave anything else
>> unchanged."
>>
>> If you want all the characters in the second argument to be
>> replaced by '#' characters then you need to write
>>
>> translate($string ,'()''+-*$=' , '########')
>>
>> Alternatively, use the replace() function.
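
As a minimal sketch of the three variants described in the quoted explanation, the query below applies each one to the same string. The sample value of $s is a hypothetical input chosen for illustration; it does not come from the thread.

```xquery
xquery version "3.1";

(: Hypothetical sample input, not taken from the thread. :)
let $s := "f(x)=a+b"
return (
  (: Third argument shorter than the second: "(" becomes "#",
     every other listed character is deleted. :)
  translate($s, "()'+-*$=", "#"),

  (: Third argument as long as the second (8 characters each):
     every listed character becomes "#". :)
  translate($s, "()'+-*$=", "########"),

  (: The same replacement with a character class; the hyphen is
     escaped so that it is not read as a range. :)
  replace($s, "[()'+\-*$=]", "#")
)
```

For this input the first call yields "f#xab", while the second and third both yield "f#x##a#b".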
