Re: [xsl] text replacement with mixed content

Subject: Re: [xsl] text replacement with mixed content
From: Andrew Welch <andrew.j.welch@xxxxxxxxx>
Date: Wed, 31 Aug 2011 09:40:47 +0100
On 30 August 2011 20:35, Geert Bormans <geert@xxxxxxxxxxxxxxxxxxx> wrote:
> Hi all,
>
> thanks for reading this.
>
> I have an interesting task.
>
> All through a document I need to replace each occurrence of "my foo" with
> "<replaced>your bar</replaced>"
> But the texts contain mixed content tags, so I might also find "my
> <bold>foo</bold>", which needs to become "<replaced>your bar</replaced>" as
> well
>
> Note that I need to keep the tags balanced, so I must not end up with
> "<replaced>your bar</replaced></bold>" in the latter case
>
> I have some algorithms in mind, but I am not happy with any of them.
> So I thought I might as well ask here, hoping one of you can come up with
> something really elegant
>
> the replacement tags are pulled out of another document,
> so as a bonus, the text to be replaced could be "my.foo", which would
> likely require me to build correct regexes automatically

This isn't a trivial task, so you may or may not get someone to give
you a working solution for free.

One way to tackle this is to:

- tokenize the search string into individual words

- mark up those individual words in the document

- identify sequences of that markup

- replace the sequences with the replacement markup
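
To make the four steps above concrete, here is a rough sketch in Python
(not XSLT, and not a tested production solution). It uses the standard
xml.dom.minidom module. The function names replace_phrase and text_nodes
are made up for this example. It assumes that "marking up" a word can be
approximated by isolating it in its own text node, that words in a
sequence are separated only by whitespace, and that an element left empty
by the replacement (such as <bold>) should be dropped.

```python
import re
from xml.dom import minidom


def text_nodes(node):
    """Yield every text node below `node`, in document order."""
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            yield child
        else:
            yield from text_nodes(child)


def replace_phrase(xml, phrase, tag, text):
    """Replace each occurrence of `phrase` with <tag>text</tag> (hypothetical helper)."""
    doc = minidom.parseString(xml)

    # Step 1: tokenize the search string into individual words.
    # re.escape keeps a search word such as "my.foo" literal.
    words = phrase.split()
    alternatives = "|".join(
        re.escape(w) for w in sorted(words, key=len, reverse=True))
    word_re = re.compile(r"(?<!\w)(" + alternatives + r")(?!\w)")

    # Step 2: "mark up" each word by isolating it in a text node of its own
    # (a stand-in for wrapping it in a temporary element).
    for node in list(text_nodes(doc)):
        parent = node.parentNode
        for piece in word_re.split(node.data):
            if piece:
                parent.insertBefore(doc.createTextNode(piece), node)
        parent.removeChild(node)

    # Step 3: identify runs of marked words, in order, separated only by
    # whitespace, regardless of which elements the words sit in.
    flat = list(text_nodes(doc))
    i = 0
    while i < len(flat):
        run, j = [], i
        for k, word in enumerate(words):
            while k > 0 and j < len(flat) and flat[j].data.isspace():
                run.append(flat[j])
                j += 1
            if j < len(flat) and flat[j].data == word:
                run.append(flat[j])
                j += 1
            else:
                break
        else:
            # Step 4: put the replacement where the first word was and
            # remove the rest of the run. Tags are never cut in half, so
            # the result stays balanced.
            new = doc.createElement(tag)
            new.appendChild(doc.createTextNode(text))
            run[0].parentNode.replaceChild(new, run[0])
            for node in run[1:]:
                parent = node.parentNode
                parent.removeChild(node)
                # Drop wrappers such as <bold> that the removal left empty.
                while not parent.childNodes and parent is not doc.documentElement:
                    parent, empty = parent.parentNode, parent
                    parent.removeChild(empty)
            i = j
            continue
        i += 1

    return doc.documentElement.toxml()


if __name__ == "__main__":
    print(replace_phrase("<p>I like my <bold>foo</bold> and my foo.</p>",
                         "my foo", "replaced", "your bar"))
    # prints: <p>I like <replaced>your bar</replaced> and <replaced>your bar</replaced>.</p>
```

If the first word of the phrase sits inside an element, the replacement
ends up inside that element, so "<bold>my</bold> foo" becomes
"<bold><replaced>your bar</replaced></bold>". Whether that is the right
place for it depends on your data.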


-- 
Andrew Welch
http://andrewjwelch.com
