Subject: RE: [xsl] Identifying place names in text...
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 21 Jul 2005 17:38:07 +0100
This isn't difficult; there's no need to contemplate doing it in Java.

You can tokenize the text using the tokenize() function in XSLT 2.0, or the str:tokenize() function/template in EXSLT (www.exslt.org). Then look up each token in your list of place names, using a key for efficiency.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Karl Koch [mailto:TheRanger@xxxxxxx]
> Sent: 21 July 2005 14:56
> To: Mulberry list
> Subject: [xsl] Identifying place names in text...
>
> Hello group,
>
> I would like to find a way of automatically identifying references to
> places in XML text. The thing is that I have a very large set of
> content. In this content there are sometimes references to particular
> places, which I want to know about.
>
> This is my XML structure (made up for simplification):
>
> <bookshelf>
>   <book>
>     <title>1000 years of London's history</title>
>     ...
>   </book>
>   <book>
>     <title>1984</title>
>     ...
>   </book>
> </bookshelf>
>
> Can I use XSLT to search for place names in the title of all the
> books? I would like to use a wordlist of geographical place names
> (which I already have). This would contain country and city names. The
> stylesheet would match occurrences of these words in the <title> XML
> element. The output here would be a list of all books which have
> references to locations in the title. In this example, the result
> would only be the first book, because it has "London" in the title.
>
> Perhaps this is the point where XSLT is getting too complicated and I
> should consider Java as a solution. However, I am continuously
> impressed by the power of XSLT and therefore I ask here because I
> think there might even be a solution for that problem using XSLT.
>
> A note on the side: The output of this stylesheet would be a helper
> and an additional control for a mainly handcrafted process. I could
> discover books which I have overlooked in the manual process.
>
> Any help would be greatly appreciated.
>
> Kind Regards,
> Karl
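
The tokenize-then-key-lookup approach suggested in the reply could be sketched roughly as follows in XSLT 2.0. This is only an illustration, not the poster's own stylesheet. It assumes the wordlist is stored in a hypothetical file places.xml of the form <places><place>London</place>...</places>; that file name, the key name place-by-name, and the output element books-with-places are all made up for the example. Matching is case-insensitive and works on single-word tokens only, so multi-word names such as "New York" would need extra handling.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: the wordlist is a separate document, places.xml,
       located next to this stylesheet. -->
  <xsl:variable name="places" select="document('places.xml')"/>

  <!-- Index every <place> element by its lower-cased text. -->
  <xsl:key name="place-by-name" match="place" use="lower-case(.)"/>

  <xsl:template match="/bookshelf">
    <books-with-places>
      <xsl:for-each select="book">
        <!-- Split the title on runs of non-letter, non-digit characters
             and drop any empty tokens. -->
        <xsl:variable name="tokens"
            select="tokenize(lower-case(title), '[^\p{L}\p{N}]+')[. != '']"/>
        <!-- The three-argument key() searches the wordlist document;
             it returns a node if any token is a known place name. -->
        <xsl:if test="exists(key('place-by-name', $tokens, $places))">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </books-with-places>
  </xsl:template>

</xsl:stylesheet>
```

With the sample bookshelf from the original message and a wordlist containing "London", this would copy only the first book, since "London's" is split into the tokens "london" and "s". An XSLT 1.0 version would need the EXSLT str:tokenize() extension mentioned in the reply, and a different way of switching context to the wordlist document before calling key().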