Re: [xsl] XSLT1.0 and wildcards

Subject: Re: [xsl] XSLT1.0 and wildcards
From: "Pankaj Bishnoi" <pankaj.bishnoi@xxxxxxxxxxx>
Date: Sun, 3 Sep 2006 15:08:04 +0530
Thanks Abel and Michael
----- Original Message ----- 
From: "Abel Braaksma" <abel.online@xxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, October 03, 2006 1:51 PM
Subject: Re: [xsl] XSLT1.0 and wildcards


> Pankaj Bishnoi wrote:
> > For example The address line is like this: melkweg 51a.
> >
> > This means I have to map this like this:  street = melkweg, number = 51,
> > extension = a
> > Are there such wildcards, and/or is there a better way to do this?
>
> Hi Pankaj,
>
> Yes, there's a better way. As a matter of fact, this happens to be a
> specialized field of science. Depending on your needs, there are
> numerous ways to resolve this. Since you appear to live in Holland and
> the address exists in Amsterdam, consider the following address lines:
>
> 1) Melkweg 51 A
> 2) Plein 40-45 123-IV
> 3) 1ste J vd Heijdenstraat 12-hs
>
> ad 1) this is a common extension
> ad 2) street is "Plein 40-45", nr is "123", suffix (floor) is "IV"
> ad 3) the suffix 'hs' stands for 'huis', which means ground floor in
> Holland. Note the number in the street name.
>
> Perhaps you'd thought of all this already. International addresses pose
> even more challenges: the French and the English place the street
> number at the start of the address line. I hope you won't
> have to deal with non-Western characters or Hebrew digits... Hopefully
> nobody entered the postal code or city name on the same line ;-)
>
> (That's why postal companies offer products to normalize addresses
> to some well-known format. But beware: they match about 95% of
> addresses, and the rest will still drop out.)
>
> Now, for a solution with XSLT 1, it will be quite a challenge. I think
> you will have to pass the address line multiple times through the
> translate-filter that was proposed by Michael.
>
> When you can resort to XSLT 2, or to a filter before processing (on the
> client side you may be able to use JavaScript with a regular expression;
> on the server side you may use Java/.NET/Perl/PHP with a regular
> expression), the regular expression may look like
> this (needs tweaking):
>
> ^(.*) ([0-9]+)([ -]?([a-zA-Z]+))?$
>
> $1 contains the street name
> $2 contains the number
> $4 contains the suffix (use $3 if you want it to include the space or hyphen)
>
> The regex will work for the above three examples (spaces are important
> in the regex).
>
> Cheers,
> -- Abel Braaksma
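
For anyone trying the pre-processing route outside XSLT, here is a minimal sketch in Python of how the pattern from Abel's message could be applied. It is not from the original thread. It assumes the suffix group accepts one or more letters (`[a-zA-Z]+`) so that multi-letter suffixes such as "IV" and "hs" can match. The names `ADDRESS_RE` and `split_address` are made up for illustration.

```python
import re

# Pattern from the message above. Assumption: the suffix group is
# "[a-zA-Z]+" (one or more letters) so that "IV" and "hs" can match.
ADDRESS_RE = re.compile(r"^(.*) ([0-9]+)([ -]?([a-zA-Z]+))?$")


def split_address(line):
    """Return (street, number, suffix), or None if the line does not match.

    Hypothetical helper: group 1 is the street name, group 2 the number,
    group 4 the suffix without its leading space or hyphen.
    """
    match = ADDRESS_RE.match(line.strip())
    if match is None:
        return None
    street, number, _separator_and_suffix, suffix = match.groups()
    # The suffix group is optional, so it is None when absent.
    return street, number, suffix or ""


if __name__ == "__main__":
    samples = [
        "melkweg 51a",
        "Melkweg 51 A",
        "Plein 40-45 123-IV",
        "1ste J vd Heijdenstraat 12-hs",
    ]
    for sample in samples:
        print(sample, "->", split_address(sample))
```

Because `(.*)` is greedy, the street name absorbs everything up to the last space-plus-digits run, which is why "Plein 40-45" and "1ste J vd Heijdenstraat" end up whole in the street part. Lines that carry a postal code or city name will still need separate handling.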
