Re: [xsl] Matching string values across element boundaries

Subject: Re: [xsl] Matching string values across element boundaries
From: Michael Müller-Hillebrand <mmh@xxxxxxxxx>
Date: Mon, 8 Apr 2013 20:58:49 +0200

David,

Can you give a more complex example showing how "variable in structure" those
citations may be? This may also shed some light on the kind of processing you
want to do. Changing tags to characters (why are you using ASCII instead of
some high Unicode character from the private use area?) and then back to tags
does not seem a very interesting thing to do.
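
A minimal sketch of the tags-to-characters round trip described in the quoted
message below, using private use area characters as delimiters, might look
like this. It is only an illustration under stated assumptions: the elements
are in no namespace, U+E000 and U+E001 do not occur in the input, the citation
regex is a hypothetical "Surname, italic title" pattern, and the <ref> is
emitted without the attribute that would point to the bibliographic expansion.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed delimiters: private use area characters standing in for
       the start and end tags of hi rend="italic". -->
  <xsl:variable name="open" select="'&#xE000;'"/>
  <xsl:variable name="close" select="'&#xE001;'"/>

  <!-- Hypothetical citation pattern: Surname, [open]Title[close] -->
  <xsl:variable name="cite"
      select="concat('\p{Lu}\p{L}+, ', $open, '[^', $close, ']+', $close)"/>

  <!-- Run the three passes over temporary trees. -->
  <xsl:template match="/">
    <xsl:variable name="flat">
      <xsl:apply-templates mode="flatten"/>
    </xsl:variable>
    <xsl:variable name="wrapped">
      <xsl:apply-templates select="$flat" mode="wrap"/>
    </xsl:variable>
    <xsl:apply-templates select="$wrapped" mode="restore"/>
  </xsl:template>

  <!-- Identity copy, shared by all modes. -->
  <xsl:template match="@* | node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1: replace the hi tags with delimiter characters. The adjacent
       text nodes merge into one when the parent is constructed. -->
  <xsl:template match="hi[@rend = 'italic']" mode="flatten">
    <xsl:value-of select="$open"/>
    <xsl:apply-templates mode="flatten"/>
    <xsl:value-of select="$close"/>
  </xsl:template>

  <!-- Pass 2: wrap each matched citation string in a ref element.
       The explicit priority avoids a conflict with the identity template. -->
  <xsl:template match="text()" mode="wrap" priority="1">
    <xsl:analyze-string select="." regex="{$cite}">
      <xsl:matching-substring>
        <ref><xsl:value-of select="."/></ref>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Pass 3: turn each delimiter pair back into a hi element. -->
  <xsl:template match="text()" mode="restore" priority="1">
    <xsl:analyze-string select="." regex="{$open}([^{$close}]+){$close}">
      <xsl:matching-substring>
        <hi rend="italic"><xsl:value-of select="regex-group(1)"/></hi>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Given an input such as

  <p>See Sprague, <hi rend="italic">Braintree Families</hi>, 12.</p>

this should produce

  <p>See <ref>Sprague, <hi rend="italic">Braintree Families</hi></ref>, 12.</p>

Markup nested inside an italic <hi> survives pass 1 but splits the text node,
so a citation containing such markup would not be matched by this sketch.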

- Michael

Am 08.04.2013 um 20:15 schrieb David Sewell <dsewell@xxxxxxxxxxxx>:

> I expect this has been discussed here before, but I can't locate any relevant
> discussion, so here goes.
>
> We have input data with many unmarked short-title citations that look like this:
>
>   Sprague, <hi rend="italic">Braintree Families</hi>
>
> We want to wrap them inside another element, in our case a <ref> to the
> bibliographic expansion. We have a venerable chain of XSLT 2.0 transforms that
> does this, and pretty well, by preprocessing the data to convert all those <hi>
> tags into a pair of unique ASCII characters, so that we can do string-matching
> operations within a single text node that now includes something like
>
>   Sprague, "Braintree Families%
>
> which is easy to handle with xsl:analyze-string. Then once we've wrapped all the
> strings we need to, we post-process with xsl:analyze-string to put the <hi>
> elements back in.
>
> In practice, given the proper regexes, this works quite well and provides the
> desired output, but I always feel a bit guilty about the hackishness of the
> approach. Given that the citations are quite variable in structure (usually but
> not always containing <hi> elements, with various combinations of text nodes at
> start and end), I've never come up with a good general-purpose way to operate
> purely on elements and text nodes without the convert-tags-to-characters step.
> Is there one (or more)?
>
> David S.
