Re: [xsl] tricky string matching

Subject: Re: [xsl] tricky string matching
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Mon, 14 Mar 2011 10:20:14 +0100
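
Another approach:

- In every element that contains a 'tief' element:
  - First pass: use xsl:analyze-string to replace each run of whitespace characters with an empty marker element (let's call it 'ws') that keeps the original whitespace in an attribute.
  - Second pass: group starting with ws and wrap the rest of each group in a 'word' element, e.g.,
    <ws string=" "/>CO<tief>2</tief> => <ws string=" "/><word>CO<tief>2</tief></word>
  - Third pass: replace every word that has a tief child with <alias kw="{word}">, using word/node() as its content.
  - In the same pass, dissolve every word without tief back to plain text, and turn each ws back into the whitespace it stands for.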
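
The passes above can be sketched as one XSLT 2.0 stylesheet. This is an untested sketch, not a finished solution. It assumes the input elements are in no namespace, as in the sample. The mode name 'mark-ws' and the variable names 'marked' and 'word' are made up for illustration. To keep it short, the second and third passes are folded into a single xsl:for-each-group, so the intermediate 'word' element is never actually built.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 1: replace each whitespace run in a text node with <ws string="..."/> -->
  <xsl:template match="text()" mode="mark-ws">
    <xsl:analyze-string select="." regex="\s+">
      <xsl:matching-substring>
        <ws string="{.}"/>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

  <!-- Pass 1: child elements (tief and any others) are kept as they are -->
  <xsl:template match="* | comment() | processing-instruction()" mode="mark-ws">
    <xsl:copy-of select="."/>
  </xsl:template>

  <!-- Every element that has a tief child -->
  <xsl:template match="*[tief]">
    <!-- Temporary tree holding the ws-marked content -->
    <xsl:variable name="marked">
      <xsl:apply-templates select="node()" mode="mark-ws"/>
    </xsl:variable>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- Passes 2 and 3: each group is an optional ws followed by one "word" -->
      <xsl:for-each-group select="$marked/node()" group-starting-with="ws">
        <!-- Restore the whitespace that opened this group (if any) -->
        <xsl:value-of select="current-group()[self::ws]/@string"/>
        <xsl:variable name="word" select="current-group()[not(self::ws)]"/>
        <xsl:choose>
          <!-- A word that contains tief becomes an alias -->
          <xsl:when test="$word[self::tief]">
            <alias kw="{string-join($word, '')}">
              <xsl:apply-templates select="$word"/>
            </alias>
          </xsl:when>
          <!-- Any other word is passed through unchanged -->
          <xsl:otherwise>
            <xsl:apply-templates select="$word"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

For the two sample paragraphs this should yield kw="CO2" and kw="H2O". Note that a "word" here is everything between two whitespace runs, so punctuation that directly follows the formula (as in "CO<tief>2</tief>,") would end up inside the alias and its kw value; it would need to be trimmed in an extra step.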


Gerrit

On 2011-03-14 09:52, Szabo, Patrick (LNG-VIE) wrote:
Hi,

I'm using XSLT 2.0 and Saxon 9.

Example-snippet from my input:

...
<absatz>text text text text text text text text CO<tief>2</tief>  text
text text text text text</absatz>
<absatz>text text text text text text text text H<tief>2</tief>O text
text text text text text</absatz>
...

What I have to do is turn this into:

...
<absatz>text text text text<alias kw="CO2">CO<tief>2</tief></alias>
text text text text text text</absatz>
<absatz>text text text text<alias kw="H2O">H<tief>2</tief>O</alias>
text text text text text text</absatz>
...

I do have an idea of how to solve this problem, but it sounds very
inefficient to me.

What would you suggest?

I would compile a list of all the possible strings, like

...
CO2
H2O
...

Then I would flatten the absatz so that there are no <tief> elements anymore.
After that I would tokenize all the text() nodes and check whether any token
matches an entry in my list.

Is there a better way?

Kind regards

. . . . . . . . . . . . . . . . . . . . . . . . . .
Patrick Szabo
  XSLT Developer
LexisNexis
Marxergasse 25, 1030 Wien

mailto:patrick.szabo@xxxxxxxxxxxxx
Tel.: +43 (1) 534 52 - 1573
Fax: +43 (1) 534 52 - 146


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler