Subject: Re: [xsl] Processing two documents, which order?
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Fri, 8 Apr 2011 12:39:55 +0200

OK, it isn't XSLT, but it produces an optimized regex for matching any
number of words.
Change the lines after __DATA__, or modify the script to read the words
from stdin.
.W
#! /usr/bin/perl -w
use strict;
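package Trie;

# Create a node for one letter. 'word' is true when a word ends at this
# node; 'tries' maps each following letter to its child node.
sub new {
    my( $class, $letter ) = @_;
    return bless { letter => $letter, word => 0, tries => {} }, $class;
}

# Return the child node for $letter, creating it on first use.
sub getTrie {
    my( $self, $letter ) = @_;
    my $tries = $self->{tries};
    if( exists( $tries->{$letter} ) ){
        return $tries->{$letter};
    }
    return $tries->{$letter} = Trie->new( $letter );
}

# Insert $word one letter at a time; the node reached when the word is
# used up is flagged as the end of a word.
sub insertWord {
    my( $self, $word ) = @_;
    if( length( $word ) == 0 ){
        $self->{word} = 1;
    } else {
        my $head = substr( $word, 0, 1 );
        my $tail = substr( $word, 1 );
        $self->getTrie( $head )->insertWord( $tail );
    }
}

# Debugging aid: print the letters of the trie in depth-first order.
sub dumpTrie {
    my( $self ) = @_;
    for my $letter ( sort keys( %{$self->{tries}} ) ){
        print $letter, ":\n";
        $self->{tries}->{$letter}->dumpTrie();
    }
}

# Build the regex fragment matching everything below this node.
sub regex {
    my( $self ) = @_;
    my @keys = sort keys( %{$self->{tries}} );
    my $res = '';
    # Parentheses are needed for several alternatives, or for a single
    # continuation that is optional because a word also ends here.
    my $paren = @keys > 1 || @keys == 1 && $self->{word};
    if( @keys ){
        $res .= '(' if $paren;
        # An empty first alternative makes the continuation optional.
        $res .= '|' if $self->{word};
        $res .= join( '|',
            map( ( $_ . $self->getTrie( $_ )->regex() ), @keys ) );
        $res .= ')' if $paren;
    }
    return $res;
}
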
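package main;

# Read one word per line from the __DATA__ section below.
my @words = sort map {chomp; $_} <DATA>;
## print join( '-', @words ), "\n";

# The root node carries no letter of its own; "root" is only a label.
my $trie = Trie->new( "root" );
for my $word ( @words ){
    $trie->insertWord( $word );
}

my $regex = $trie->regex();
print "$regex\n";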
__DATA__
bee
bonnet
bounce
bounty
burn
burst
sea
seal
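
As a rough sketch of how the result might be used: for the eight sample words above, the script should print the pattern hard-coded below as `$alternatives`. The names `$alternatives`, `$re` and `@candidates` are only illustrative and are not part of the script. The anchors `\A` and `\z` are an assumption about the intended use (whole-word matching); the generated pattern itself is unanchored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pattern as generated from the sample word list (copied by hand here).
my $alternatives = '(b(ee|o(nnet|un(ce|ty))|ur(n|st))|sea(|l))';

# Anchor at both ends so that only complete words match.
my $re = qr/\A(?:$alternatives)\z/;

# A few words from the list, plus two that are not in it.
my @candidates = qw(bee bounty sea seal bun seals);

for my $candidate ( @candidates ){
    printf "%-7s %s\n", $candidate,
        ( $candidate =~ $re ? 'match' : 'no match' );
}
```

The first four candidates match; "bun" and "seals" do not. The letters are copied into the pattern unescaped, so words containing regex metacharacters would need escaping in whatever regex dialect the pattern is finally used with.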
On 8 April 2011 12:11, Dave Pawson <davep@xxxxxxxxxxxxx> wrote:
> On Fri, 08 Apr 2011 11:06:52 +0100
> David Carlisle <davidc@xxxxxxxxx> wrote:
>
>> On 08/04/2011 11:00, Dave Pawson wrote:
>> > Efficiency isn't an issue. The alternative is hand editing.
>> > 15 minutes run time is good with that sort of comparison
>>
>> then why the question which started this thread, which is purely an
>> efficiency question?
>
> Sorry, I was thinking of efficiency wrt hand editing.
>
>
>>
>> An alternative, as I suggested, isn't hand editing but doing it all
>> the properties at once in a single template with a single regex, and
>> just starting saxon once.
>
> I really couldn't contemplate writing such a regex to be honest David.
>
>
>
> --
>
> regards
>
> --
> Dave Pawson
> XSLT XSL-FO FAQ.
> http://www.dpawson.co.uk