Subject: Re: [xsl] Inverting names with Jr and Sr considered
From: Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
Date: Tue, 6 Nov 2012 10:57:57 +0100
Hopefully you won't have names like "Augustus De Morgan", which should
not be transformed to "Morgan, Augustus De". And I think this is the time
and the place to quote this article once more:
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

-W

On 06/11/2012, Mark <mark@xxxxxxxxxxxx> wrote:
> I agree, my specification is likely not complete. However, my input is a
> single document written by one person indexing a single journal. There is a
> great deal of consistency to the data and I doubt that there are as many as
> 1000 names. That said:
>
> I received an answer off the list (thus do not feel authorized to post it
> here) that will help me discover what oddities I have not covered. It
> explained the regex expressions it used so that perhaps if modification is
> required, I may be able to do it.
>
> Thanks for your time, Michael; as always this list provides the most
> consistent and practical advice around, something you all can be proud of.
>
> Mark
>
> -----Original Message-----
> From: Michael Kay
> Sent: Tuesday, November 06, 2012 2:10 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] Inverting names with Jr and Sr considered
>
> I wouldn't even attempt to write any code based on this as the
> specification. For this to work at all well, you're going to need to
> iteratively adapt the solution to handle all the names in your dataset,
> or at least a sample of a couple of thousand of them. There's just too
> much variation in the names you might encounter. Are "Jr" and "Sr"
> really the only suffixes, and are they always spelt this way, or do you
> also get "III" and "Jnr" and "Jnr."?
>
> If I'm wrong, and the names are all regular and in the pattern you
> describe, then I think you can just tokenize on whitespace and do
> something like
>
> suffix := $tokens[last()][. = ('Jr', 'Sr')]
> stem := if ($suffix) then remove($tokens, count($tokens)) else $tokens
> value-of select="concat($stem[last()], ','), remove($stem,
> count($stem)), if ($suffix) then concat('(', $suffix, ')') else ''"
>
> Michael Kay
> Saxonica
>
> On 05/11/2012 23:45, Mark wrote:
>> This must have been done many times, so can someone show me where to find
>> the answer?
>>
>> I have a series of personal names in natural order that I need to invert.
>> The surname is always last except when followed by Jr, or Sr (either
>> of which may not be present). I want to represent:
>>
>> J Allen Rogers > Rogers, J Allen
>> Bill T Wilson Jr > Wilson, Bill T (Jr)
>> A B Brown > Brown, A B
>> John Victor Case Sr > Case, John Victor (Sr)
>>
>> and so on. There may be a single space or multiple spaces between some of
>> the elements of the name.
>>
>> It looks like <xsl:analyze-string> will do this, but I do not know how to
>> write regex.
>>
>> Thanks,
>> Mark
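
For anyone finding this thread in the archive, the tokenize-based outline quoted above could be fleshed out roughly as follows in XSLT 2.0. This is only a sketch under the assumptions Mark stated (surname last, optionally followed by exactly "Jr" or "Sr"); the namespace URI, the function name f:invert-name, the variable $suffixes and the matched element "name" are all made up for illustration and appear nowhere in the thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:names"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical suffix list; extend it once the real data has been checked. -->
  <xsl:variable name="suffixes" as="xs:string*" select="('Jr', 'Sr')"/>

  <!-- Hypothetical helper: "Bill T  Wilson Jr" becomes "Wilson, Bill T (Jr)". -->
  <xsl:function name="f:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- normalize-space() collapses runs of whitespace, so one space is a safe separator -->
    <xsl:variable name="tokens" as="xs:string*"
                  select="tokenize(normalize-space($name), ' ')"/>
    <!-- The last token, but only if it is one of the known suffixes -->
    <xsl:variable name="suffix" as="xs:string?"
                  select="$tokens[last()][. = $suffixes]"/>
    <!-- Everything except the suffix -->
    <xsl:variable name="stem" as="xs:string*"
                  select="if (exists($suffix))
                          then $tokens[position() lt last()]
                          else $tokens"/>
    <xsl:sequence select="
        if (count($stem) lt 2)
        then normalize-space($name)  (: nothing to invert, leave as is :)
        else string-join(
               (concat($stem[last()], ','),
                $stem[position() lt last()],
                if (exists($suffix)) then concat('(', $suffix, ')') else ()),
               ' ')"/>
  </xsl:function>

  <!-- Usage, assuming each name sits in a (hypothetical) <name> element -->
  <xsl:template match="name">
    <xsl:value-of select="f:invert-name(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

As written this still turns "Augustus De Morgan" into "Morgan, Augustus De", so the caveats raised in the thread apply unchanged: any surname particles or further suffixes ("III", "Jnr.") would have to be added by hand after looking at the actual data.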