Subject: Re: [xsl] Processing two documents, which order?
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Fri, 8 Apr 2011 10:49:58 +0200

On 8 April 2011 09:15, Dave Pawson <davep@xxxxxxxxxxxxx> wrote:
> > > Given
> > > <property>absolute-position</property>
> > > <property>bottom</property>
> > > <property>left</property>
> > > <property>right</property>
> > > <property>top</property>
> > > as the input... what would the keys look like?
>
> The 'list to be marked up' is as above
> The other document is xml, containing, in other elements those words
>
> Required output
>
> <para> Blah blah blah <property>right</property>
>
> 'items' must be followed by [\s\p{P}] so left-handed doesn't get
> marked up etc.

If, given "left", "left-handed" should not match, the set of stoppers
must include space and non-letters (\PL) and not punctuation
characters (\pP).

If a regular expression is used, the pattern may also have to include
the anchor $. And, possibly, the symmetric pattern (using '^') should
precede the pattern.

It may well be that a regular expression substitution applied to text
nodes in their entirety can compete with any other approach. A simple
algorithm can be used to optimize the regular expression, away from
the "brute force" pattern joining all words with '|'.

Example: Given the words

   bee bonnet bounce bounty burn burst sea seal

the optimized and anchored regex is

   (^|\s|\p{P})((?:b(?:ee|o(?:nnet|un(?:ce|ty))|ur(?:n|st))|sea(?:|l)))($|\s|\p{P})

Here is a text:

   <p>Bee in my bonnet bounces from bounty. Burst on a bee-line into
   the sea as a seal</p>

Applying global case-insensitive substitution with

   $1<x>$2</x>$3

produces:

   <p><x>Bee</x> in my <x>bonnet</x> bounces from <x>bounty</x>.
   <x>Burst</x> on a <x>bee</x>-line into the <x>sea</x> as a
   <x>seal</x></p>

Disclaimer: My XSLT skills aren't sufficient to create the optimized
regex from the word list. If someone is interested enough, I can
provide the details.

-W

> > > regards
> >
> > --
> > regards
>
> --
> Dave Pawson
> XSLT XSL-FO FAQ.
> http://www.dpawson.co.uk
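
The message says the optimized regex can be derived from the word list but does not show how. Below is a minimal sketch in Python (not XSLT), assuming that the "simple algorithm" amounts to building a prefix tree (trie) and emitting one alternation per branch point. The names build_trie, trie_to_regex and mark_up are hypothetical. Python's re module has no \p{P}, so the class [^\w\s] stands in for punctuation here; that is an approximation, not an equivalent.

```python
import re


def build_trie(words):
    """Insert each word into a nested-dict trie; the key "" marks a word end."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie


def trie_to_regex(node):
    """Turn a trie node into a regex fragment that shares common prefixes."""
    alternatives = []
    for ch in sorted(node):
        if ch == "":
            # A word ends here: emit an empty alternative.
            alternatives.append("")
        else:
            alternatives.append(re.escape(ch) + trie_to_regex(node[ch]))
    if len(alternatives) == 1:
        return alternatives[0]
    return "(?:" + "|".join(alternatives) + ")"


# Assumption: [^\w\s] approximates \p{P}, which Python's re does not support.
STOPPER = r"\s|[^\w\s]"


def mark_up(text, words, tag="x"):
    """Wrap each listed word in <tag>...</tag> when delimited by stoppers."""
    core = trie_to_regex(build_trie(words))
    pattern = re.compile(
        "(^|" + STOPPER + ")(" + core + ")($|" + STOPPER + ")",
        re.IGNORECASE,
    )
    # Same shape as the $1<x>$2</x>$3 replacement in the message.
    return pattern.sub(r"\1<" + tag + r">\2</" + tag + r">\3", text)


words = ["bee", "bonnet", "bounce", "bounty", "burn", "burst", "sea", "seal"]
print(trie_to_regex(build_trie(words)))
print(mark_up("Bee in my bonnet bounces from bounty. "
              "Burst on a bee-line into the sea as a seal", words))
```

For the word list in the message, the generated core is the same as the inner group of the regex quoted above. One caveat of this pattern shape: the trailing stopper is consumed by the match, so when two listed words are separated by a single space, the second one is skipped by a global substitution.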