Subject: Re: [xsl] randomly selecting a node from a node set
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Sun, 06 Jan 2013 01:33:15 +0100
So I have a bunch of boiled-down document structure that looks like:
<tree parent="para">
  <count count="2331"/>
  <child>text</child>
  <child>item</child>
  <child>item</child>
</tree>
<tree parent="para">
  <count count="548"/>
  <child>text</child>
  <child>subpara</child>
  <child>subpara</child>
  <child>subpara</child>
</tree>
And so on; all the parent-child patterns in a (largish, ~3.5 GB) content set, with the count element giving the frequency with which that pattern occurs.
The overall structure of this document structure description is flat; no tree element contains another tree element.
This document is being used for human review, to know what needs to be defined for the output process, which is not my problem, and to hopefully generate in an automated way a test document for that same output processing, which is my problem.
To produce that generated test document, I need to do two things. First, I need to produce a nested version of the document structure, replacing each <child> element with some existing pattern for that element name, taken from the available pile of <tree/> elements whose parent attribute has that name as its value. Second, I need to populate it with actual content. (I do, however, get to lose the counts.)
So I need to produce, for the first para, something like:
<tree parent="para">
  <tree parent="text">
    <tree parent="italic"/>
  </tree>
  <tree parent="item">
    <tree parent="text"/>
  </tree>
  <tree parent="item">
    <tree parent="para">
      <tree parent="text"/>
    </tree>
  </tree>
</tree>
And then populate it from the actual content, so the tree elements are replaced with the elements whose names are the values of the parent attributes.
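The renaming half of that step can be sketched outside XSLT. This is only a Python illustration using the standard library's xml.etree.ElementTree; the NESTED sample and the rename() helper are made up for the example, and filling in actual text content is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested structure in the shape shown above.
NESTED = (
    '<tree parent="para">'
    '<tree parent="text"><tree parent="italic"/></tree>'
    '<tree parent="item"><tree parent="text"/></tree>'
    '</tree>'
)

def rename(tree):
    """Build an element named after @parent, recursing into nested <tree>s."""
    out = ET.Element(tree.get("parent"))
    for sub in tree.findall("tree"):
        out.append(rename(sub))
    return out

result = rename(ET.fromstring(NESTED))
print(ET.tostring(result, encoding="unicode"))
```

In a stylesheet the same idea would presumably be a template matching tree that creates an element whose name is computed from @parent.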
I think I can produce the tree; it's not, and isn't supposed to be, acyclic, but setting an arbitrary limit on the number of times it goes section/block/section/block (or para/item/para, and so on) is acceptable.
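A minimal sketch of that depth-limited expansion, written in Python rather than XSLT just to show the logic. The FLAT sample data, the function names and the max_depth parameter are all assumptions for illustration. The default chooser simply takes the first pattern, so this version is deliberately deterministic; any other selection function can be passed in.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat description; note that para -> item -> para is cyclic.
FLAT = """
<trees>
  <tree parent="para"><child>text</child><child>item</child></tree>
  <tree parent="item"><child>para</child></tree>
  <tree parent="text"/>
</trees>
"""

def build_index(root):
    """Map each parent name to the list of <tree> patterns defined for it."""
    index = {}
    for tree in root.iter("tree"):
        index.setdefault(tree.get("parent"), []).append(tree)
    return index

def expand(name, index, depth, max_depth, choose=lambda patterns: patterns[0]):
    """Return a nested <tree> for `name`, recursing at most max_depth levels."""
    node = ET.Element("tree", {"parent": name})
    patterns = index.get(name, [])
    if depth >= max_depth or not patterns:
        return node  # cut-off: emit a leaf instead of following the cycle
    pattern = choose(patterns)
    for child in pattern.findall("child"):
        node.append(expand(child.text, index, depth + 1, max_depth, choose))
    return node

index = build_index(ET.fromstring(FLAT))
nested = expand("para", index, depth=0, max_depth=4)
print(ET.tostring(nested, encoding="unicode"))
```

The cut-off here counts total nesting depth rather than repetitions of one particular cycle, which is the cruder of the two possible limits but is enough to guarantee termination.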
What I don't know how to do is randomly select elements from the elements available in the appropriate node set.
So I have to replace <child>text</child> with a tree element that has one of the possible patterns for a text element; there are, say, eight such patterns defined, so count(//tree[@parent eq 'text']) is equal to eight.
I don't want _all_ the possible text element patterns; I just want one. But I certainly don't always want to get the first one, either.
So far as I know, the result order of unordered() is implementation-dependent, but an implementation is allowed to be consistent and probably will be. So for this purpose there really isn't any difference between //tree[@parent eq 'text'] and unordered(//tree[@parent eq 'text']); either way I'll always get the same one.
Anyone got suggestions as to how one picks a random node out of a node sequence?
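For illustration only, here is what the selection amounts to once a random source is available, sketched in Python rather than XSLT: collect the candidates for a name, then draw one by index. The FLAT sample, the pick_pattern() helper and the seed value are assumptions made up for the example. Using a fixed seed is a design choice so that the same test document can be regenerated.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical flat description with three patterns for "text".
FLAT = """
<trees>
  <tree parent="text"><child>italic</child></tree>
  <tree parent="text"><child>bold</child></tree>
  <tree parent="text"/>
  <tree parent="para"><child>text</child></tree>
</trees>
"""

def pick_pattern(root, name, rng):
    """Return one randomly chosen <tree> whose @parent equals name, or None."""
    candidates = [t for t in root.iter("tree") if t.get("parent") == name]
    if not candidates:
        return None
    return rng.choice(candidates)

root = ET.fromstring(FLAT)
rng = random.Random(2013)  # fixed seed: reruns pick the same patterns
picked = pick_pattern(root, "text", rng)
print([c.text for c in picked.findall("child")])
```

The XPath analogue would be a positional predicate on the candidate sequence, with the position computed from whatever random value the processor or a stylesheet parameter can supply.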
--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de
Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930
Geschäftsführer: Gerrit Imsieke, Svea Jelonek, Thomas Schmidt, Dr. Reinhard Vöckler