Re: [xsl] String hashing code

Subject: Re: [xsl] String hashing code
From: Abel Braaksma <abel.online@xxxxxxxxx>
Date: Sat, 15 Dec 2007 01:48:51 +0100
Deborah Pickett wrote:
Since I can't be sure that only one XSLT processor instance will process
*all* my files, I can't rely on the uniqueness of generate-id($node)
where $node is a locally-scoped variable containing a document (à la
Abel's UUID thread, which was nonetheless fascinating).

If you looked carefully, you will have seen that it also uses a timestamp. Together, the two should provide uniqueness both within one run and between runs, unless of course all runs on your system start simultaneously in the same microsecond (which is not likely, since starting a processor alone takes somewhere between 20 and 400 ms). But I can't judge your system; perhaps relying on clock and node is not enough to guarantee stability?
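
For illustration, a minimal XSLT 2.0 sketch of the clock-plus-node idea. The function name my:run-unique-id and the urn:example:my namespace are made up for this example and are not taken from the UUID thread. Note that current-dateTime() returns the same value for the whole transformation, so the timestamp only separates runs; generate-id() is what separates nodes within a run.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <!-- Hypothetical helper: timestamp (differs between runs)
       plus generate-id (differs between nodes within one run). -->
  <xsl:function name="my:run-unique-id" as="xs:string">
    <xsl:param name="node" as="node()"/>
    <xsl:sequence select="concat(
        format-dateTime(current-dateTime(),
                        '[Y0001][M01][D01][H01][m01][s01][f001]'),
        '-', generate-id($node))"/>
  </xsl:function>

  <!-- Usage: emit the identifier for the input document node. -->
  <xsl:template match="/">
    <id><xsl:value-of select="my:run-unique-id(.)"/></id>
  </xsl:template>

</xsl:stylesheet>
```

The picture string only goes down to milliseconds here; how fine-grained the clock really is depends on the processor and platform.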


If the problem you are facing is that file names may already exist, you can combine two methods. One: create the filename in the most randomizing way available. Two: use unparsed-text-available(...) to check whether a previous run already created that file or, better, loop with a counter until you find a free "slot" (use the most significant bits of the random number in the UID, or simply add another counter yourself).
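
The counter loop could look roughly like the following XSLT 2.0 sketch. The names my:free-slot, out-dir and 'chunk', and the file:///tmp/out/ location, are assumptions made for this example. It uses an absolute URI on purpose: unparsed-text-available() resolves relative URIs against the stylesheet's base URI, while xsl:result-document resolves them against the base output URI, so relative names might not refer to the same file.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">

  <!-- Assumed output location; must be an absolute URI. -->
  <xsl:param name="out-dir" as="xs:string" select="'file:///tmp/out/'"/>

  <!-- Hypothetical helper: tries base-0.xml, base-1.xml, ... and
       returns the first URI that cannot be read as text. -->
  <xsl:function name="my:free-slot" as="xs:string">
    <xsl:param name="base" as="xs:string"/>
    <xsl:param name="n" as="xs:integer"/>
    <xsl:variable name="candidate"
                  select="concat($base, '-', $n, '.xml')"/>
    <xsl:sequence select="if (unparsed-text-available($candidate))
                          then my:free-slot($base, $n + 1)
                          else $candidate"/>
  </xsl:function>

  <!-- Usage: write the input document to the first free slot. -->
  <xsl:template match="/">
    <xsl:result-document
        href="{my:free-slot(concat($out-dir, 'chunk'), 0)}">
      <xsl:copy-of select="."/>
    </xsl:result-document>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats with this sketch: the check and the write are not atomic, so two processors running at the same moment could still pick the same slot, and unparsed-text-available() also returns false for a file that exists but cannot be decoded as text, so such a file would be treated as a free slot.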

These two techniques combined will cover the 0.1% that this "UUID"-lookalike algorithm doesn't guarantee ;)

HTH,
Cheers,
-- Abel --

PS: of course, using a hash can be a close-to-perfect solution too. But remember that on small strings hashes tend to collide in some xxx% of the cases. So, even with hashes, it is probably best to also add both a counter (extra-run) and a function to provide uniqueness (intra-run).
