[xsl] Re: Re: Generate Id's usage as primary and foriegn key in the database!!

Subject: [xsl] Re: Re: Generate Id's usage as primary and foriegn key in the database!!
From: "Dimitre Novatchev" <dnovatchev@xxxxxxxxx>
Date: Sat, 25 Oct 2003 16:13:37 +0200
"David Tolpin" <dvd@xxxxxxxxxxxxxx> wrote in message
> > > For a given xsl file when i run one xml file, i get generate id which
> > > is the same when i change the input xml file having the same structure
> > > and based on the same dtd.  I am using msxsl as XSLT engine for
> > > processing my input xml
> > >
> > > My questions are
> > > 1)Can generate-id() be used to serve purpose like this ?
> > > 2)Is there any other technique available in XSL 1.0 to do the same ?
> >
> > From the XSLT 1.0 spec (http://www.w3.org/TR/xslt#misc-func):
> >
> > "An implementation is under no obligation to generate the same
> > identifiers each time a document is transformed. There is no guarantee
> > that a generated unique identifier will be distinct from any unique IDs
> > specified in the source document."
> This is not related to the problem described, by the way. The question is
> the opposite: how to ensure that different documents have non-intersecting
> sets of identifiers.

This is related to the problem. It shows that the spec makes no promises
about generating unique ids for nodes in different documents -- it does not
speak at all about any guarantees for nodes in different documents. This
seems to leave only one possibility: to load all necessary xml documents and
process them in a single run, thereby ensuring that no two nodes will be
given the same id.

However, the fact that the generated id for the same node changes from one
run to another also makes generate-id() unsuitable for the task, because the
database would have to be completely re-populated after each run (even
supposing that someone decides to load all xml documents and process them in
a single run of an XSLT processor).

> The answer is that an nmtoken, unique for each document in the set, such
> as an arbitrary identifier concatenated with a sequential number of the
> document in the set, must be passed to the stylesheet as a parameter and
> concatenated with each generated identifier. And the URI of each document
> would serve well, except that the URI contains characters not allowed in
> identifiers.

The question was to generate primary (and foreign) keys for database
records -- not identifiers!

> >
> > A more stable unique key generation is to use the XPath expression that
> > selects exactly the node, concatenated with the URI of the xml document.
> >
> 1) What does 'more stable unique key generation' mean? Is there such a
> thing as 'almost unique keys'?

More stable means one that does not change from one run of the XSLT
processor to the next. As long as a particular xml document remains the
same and the same algorithm is used to generate an XPath expression that
selects the node, the same XPath expression will be generated every time.

> 2) I don't see how this solves the problem, given that concatenation of a
> node-set and URI (a string) will give a string which is not an identifier,

If you read the original message you'll understand that the task is to have
strings suitable as primary and foreign keys in a database. These are not
required to be identifiers.

> and
> generate-id(document($uri)|.)
> will generate identical sets of identifiers on different documents of
> identical structure regardless of $uri's value on most XSLT processors.

You did not read my message carefully -- I proposed to concatenate the XPath
expression for the node to the URI of the document. In a pseudo-notation
this is:

concat(document-uri, $identifyingXpathAddress)
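
The pseudo-notation above could be sketched in XSLT 1.0 roughly as follows.
This is only an illustration under some assumptions: XSLT 1.0 has no function
that returns the URI of the source document, so the URI is passed in through
a parameter (called docUri here, a name made up for this sketch), and the
named template buildPath is likewise hypothetical. It handles element nodes
only; attributes, text nodes and namespace nodes would need extra branches.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the caller supplies the URI of the source document. -->
  <xsl:param name="docUri" select="''"/>

  <!-- Write one key per element, one key per line. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:value-of select="$docUri"/>
      <xsl:call-template name="buildPath"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

  <!-- Hypothetical helper: builds /a[1]/b[2]/... for the context element.
       xsl:for-each visits the ancestors in document order, root first. -->
  <xsl:template name="buildPath">
    <xsl:for-each select="ancestor-or-self::*">
      <xsl:text>/</xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text>[</xsl:text>
      <!-- position among preceding siblings that have the same name -->
      <xsl:value-of
          select="count(preceding-sibling::*[name() = name(current())]) + 1"/>
      <xsl:text>]</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With docUri set to doc1.xml and the input <a><b/><b/></a>, this should write
doc1.xml/a[1], doc1.xml/a[1]/b[1] and doc1.xml/a[1]/b[2]. The key of a parent
element, which is a prefix of its children's keys, could then serve as the
foreign key. Note that name() returns the prefixed name, so the keys stay
stable only as long as the namespace prefixes in the document do not change.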

A similar solution would use xsl:number, instead of an XPath expression, to
identify a node. However, if namespace nodes were also to be put into the
database, xsl:number (alone) would not be usable.
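
A minimal sketch of the xsl:number variant, again only as an illustration:
the parameter name docUri and the "#" separator are assumptions of this
sketch (XSLT 1.0 cannot obtain the URI of the source document by itself), and
only element nodes are numbered.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumption: the caller supplies the URI of the source document. -->
  <xsl:param name="docUri" select="''"/>

  <!-- Write one key per element, one key per line. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <xsl:value-of select="$docUri"/>
      <!-- separator chosen for this sketch, so that a URI ending in a
           digit does not run into the number that follows -->
      <xsl:text>#</xsl:text>
      <!-- hierarchical number: each component is the element's position
           among its element siblings at that level -->
      <xsl:number level="multiple" count="*" format="1.1"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With docUri set to doc1.xml and the input <a><b/><b/></a>, this should write
doc1.xml#1, doc1.xml#1.1 and doc1.xml#1.2. As with the XPath variant, the
keys stay the same from run to run as long as the document itself does not
change.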


Dimitre Novatchev.
http://fxsl.sourceforge.net/ -- the home of FXSL

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
