Re: [xsl] XSL pattern needed for begin/end elements
Subject: Re: [xsl] XSL pattern needed for begin/end elements
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 07 Jul 2004 13:01:43 -0400
This is a hard problem, and one for which XSLT was not designed.
Nonetheless, there is enough experience around that some guidance is possible.
At 12:14 PM 7/7/2004, you wrote:
I'm looking for an XSL pattern to solve the problem of going from XML
that has separate begin and end elements to one that does not.
In other words, the "separate begin and end elements" are merely markers
for something not-yet-an-element (actually a sequence of nodes), which you
want to turn into an element.
In other words, this is an up-conversion whereby you want to "wrap" a set
of nodes in another (new) node, depending on their relations to other nodes.
Please, please note that I do not control either the source or target
XML formats. If I did, this would be much easier.
Or not -- the problem they're trying to solve is arguably not well handled
by XML, depending on how the problem is scoped. It could be,
as you imply, that a much simpler solution is possible if the problem is
scoped more narrowly.
Scoped broadly, this is the problem of "multiple concurrent hierarchies"
(short syntax: "overlap"), which is a fairly hot research area: see the
preliminary program for the Extreme conference in Montreal, at
http://www.mulberrytech.com/Extreme/Program.html -- especially Wednesday,
Source XML snip:
<hyperlink_begin id="111" end="222">
<locator_url protocol="http" host_name="www.sf.net"/>
<hyperlink_end id="222" begin="111"/>
Target XML example:
In my case I can assume that associated begin and end hyperlink tags
will occur as siblings -- though generally this is not the case and in
fact, this is the reason the begin and end tags are unique elements.
If you can bank on this assumption, it makes it possible to address this
using "positional grouping". There are two main approaches to this in XSLT
1.0 (covered in the FAQ), but neither is as clean and simple as the XSLT
2.0 grouping construct (xsl:for-each-group), which you have available in Saxon 8.
(If Jeni isn't busy with mini-Jeni at the moment, maybe she'll offer this
one, or Mike or someone else will. Having only poked at it, I can say only
that it's somewhat trickier than the general case: you can't use the
"group-starting-with" grouping criterion because your end-markers are a
different element type. ;-)
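
One way the difficulty might be sidestepped, offered as a sketch under stated assumptions rather than a definitive answer: let the group-starting-with pattern match both marker elements, so that every marker opens a new group, and wrap only the groups that open with a begin marker. The wrapper element name hyperlink is hypothetical (the target format is not shown here), and the sketch assumes the markers are empty sibling elements that are properly paired and never nested.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element with a begin marker among its children:
       regroup those children positionally. -->
  <xsl:template match="*[hyperlink_begin]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Every begin or end marker starts a new group. -->
      <xsl:for-each-group select="node()"
          group-starting-with="hyperlink_begin | hyperlink_end">
        <xsl:choose>
          <!-- Group opened by a begin marker: wrap what follows it.
               "hyperlink" is an assumed target element name. -->
          <xsl:when test="self::hyperlink_begin">
            <hyperlink>
              <xsl:apply-templates select="current-group() except ."/>
            </hyperlink>
          </xsl:when>
          <!-- Group opened by an end marker: drop the marker,
               pass the rest through unwrapped. -->
          <xsl:when test="self::hyperlink_end">
            <xsl:apply-templates select="current-group() except ."/>
          </xsl:when>
          <!-- Leading nodes before the first marker. -->
          <xsl:otherwise>
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a begin marker with no matching end marker would wrap everything up to the next marker or the end of the parent, so this only behaves sensibly when the pairing assumption holds. The id/end/begin attributes on the markers are not consulted at all.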
If you can't assume these are siblings, then you're in uncharted territory
("Here be Dragons"). You could pull XSLT into service as a tag-writing
application (requires that you invoke a serializer to implement the
conversion, and use the dreaded "disable-output-escaping" feature to write
tags) -- but this can't guarantee well-formed output. In fact, if you have
to do this (if you can't use a grouping technique) you can more or less
assume your output will be XML only by accident.
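
To make the danger concrete, here is a minimal sketch of what the tag-writing approach looks like. The element name hyperlink is again hypothetical, and the markers are assumed to be empty elements. Nothing in the stylesheet ties a written start tag to a written end tag, which is exactly why well-formedness cannot be guaranteed.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Write a start tag as raw text. To the processor this is only
       a string; no element node is created in the result tree. -->
  <xsl:template match="hyperlink_begin">
    <xsl:text disable-output-escaping="yes">&lt;hyperlink&gt;</xsl:text>
  </xsl:template>

  <!-- Write the end tag as raw text, wherever the marker happens
       to sit, even inside a different parent element. -->
  <xsl:template match="hyperlink_end">
    <xsl:text disable-output-escaping="yes">&lt;/hyperlink&gt;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

This only has any effect when the processor serializes the result itself, and a processor is permitted to ignore disable-output-escaping altogether. If a begin marker sits in one paragraph and its end marker in the next, the output will contain overlapping tags and will not parse.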
Wendell Piez mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
Mulberry Technologies: A Consultancy Specializing in SGML and XML