Re: [xsl] persistent ids/hashes from strings?

From: "Andrew Welch" <andrew.j.welch@xxxxxxxxx>
Date: Tue, 13 Feb 2007 21:00:18 +0000
On 2/13/07, Joern Nettingsmeier <nettings@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> how can i generate short persistent ids from string content?
>
> the situation is this: we have an xml data file with a list of
> companies, and each company is going to get a sub-directory on a web
> server. due to length and charset considerations, it is not feasible to
> use the company names directly or in a plain urlencoded way as directory
> names.
>
> ideally, i would like to map the names to short numeric or alphanumeric
> hashes. since they will be created on disk just once, it is important
> that the mapping be persistent over multiple processing runs.
> that means the xpath generate-id() function is out.
>
> i googled for some sort of simple message digest function, but i only
> found rather heavyweight, cryptographically hard java solutions which
> are overkill imho, and nothing at all with plain xsl/xpath.
>
> my toolset includes saxon8, so xpath2/xsl2 solutions are fine. i'm also
> prepared to use saxon's java extension feature, but i would prefer a
> simple call of a static message digest method directly from the xsl over
> having to write an external java wrapper and calling into that.

Maybe a function (in XSLT 2.0) that accepts the company name as a parameter and returns an id would be sufficient? For example:

"sum(string-to-codepoints($companyName))"

You could prefix it with the first few letters of the company name to
help distinguish companies whose names add up to the same sum, for
example names made of the same letters in a different order.
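
Roughly, that idea could be wrapped up as a stylesheet function like the one below. This is only a sketch: the f: namespace, the function name f:company-id, the three-letter prefix and the company/name input structure are all made up for illustration, so adjust them to your data.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- Hypothetical helper: short letter prefix plus the code point sum. -->
  <xsl:function name="f:company-id" as="xs:string">
    <xsl:param name="companyName" as="xs:string"/>
    <!-- Keep only ASCII letters and digits so the prefix is safe in a directory name. -->
    <xsl:variable name="safe"
        select="replace(lower-case($companyName), '[^a-z0-9]', '')"/>
    <!-- Depends only on the characters of the name, so it is the same on every run. -->
    <xsl:variable name="sum"
        select="sum(string-to-codepoints($companyName))"/>
    <xsl:sequence select="concat(substring($safe, 1, 3), '-', $sum)"/>
  </xsl:function>

  <!-- Assumed input: company elements that each have exactly one name child. -->
  <xsl:template match="/">
    <xsl:for-each select="//company">
      <xsl:value-of select="f:company-id(name)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With this sketch, a company named "Acme Ltd" would come out as acm-698. Note that it is not a real hash: two names that share the first three letters and the same sum (say "Acme Ltd" and "Acme Ldt") still collide, so it is worth checking the generated ids for duplicates before creating the directories.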
