Re: [xsl] Inverting names with Jr and Sr considered

Subject: Re: [xsl] Inverting names with Jr and Sr considered
From: "Mark" <mark@xxxxxxxxxxxx>
Date: Tue, 6 Nov 2012 02:41:28 -0700
I agree, my specification is likely not complete. However, my input is a single document written by one person indexing a single journal. There is a great deal of consistency to the data and I doubt that there are as many as 1000 names. That said:

I received an answer off the list (thus do not feel authorized to post it here) that will help me discover what oddities I have not covered. It explained the regular expressions it used, so if modification is required I may be able to do it myself.

Thanks for your time, Michael; as always this list provides the most consistent and practical advice around, something you all can be proud of.

Mark

-----Original Message-----
From: Michael Kay
Sent: Tuesday, November 06, 2012 2:10 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Inverting names with Jr and Sr considered


I wouldn't even attempt to write any code based on this as the
specification. For this to work at all well, you're going to need to
iteratively adapt the solution to handle all the names in your dataset,
or at least a sample of a couple of thousand of them. There's just too
much variation in the names you might encounter. Are "Jr" and "Sr"
really the only suffixes, and are they always spelt this way, or do you
also get "III" and "Jnr" and "Jnr."?

If I'm wrong, and the names are all regular and in the pattern you
describe, then I think you can just tokenize on whitespace and do
something like

suffix := $tokens[last()][. = ('Jr', 'Sr')]
stem := if ($suffix) then remove($tokens, count($tokens)) else $tokens
value-of select="concat($stem[last()], ','), remove($stem,
count($stem)), if ($suffix) then concat('(', $suffix, ')') else ()"
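
Spelled out as a complete XSLT 2.0 stylesheet, the outline above might look like the sketch below. The function name f:invert-name, the urn:example:names namespace and the <names>/<name> input structure are assumptions made for illustration; none of them come from the original data. normalize-space() is applied first so that runs of spaces do not produce empty tokens.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:names"
    exclude-result-prefixes="xs f">

  <xsl:output method="text"/>

  <!-- f:invert-name is a hypothetical helper, not part of any library -->
  <xsl:function name="f:invert-name" as="xs:string">
    <xsl:param name="name" as="xs:string"/>
    <!-- Collapse whitespace, then split on single spaces -->
    <xsl:variable name="tokens"
        select="tokenize(normalize-space($name), ' ')"/>
    <!-- The last token, but only if it is one of the known suffixes -->
    <xsl:variable name="suffix"
        select="$tokens[last()][. = ('Jr', 'Sr')]"/>
    <!-- Everything except the suffix -->
    <xsl:variable name="stem"
        select="if (exists($suffix))
                then remove($tokens, count($tokens))
                else $tokens"/>
    <!-- Surname + comma, remaining names, optional "(suffix)" -->
    <xsl:sequence select="string-join(
        (concat($stem[last()], ','),
         remove($stem, count($stem)),
         if (exists($suffix)) then concat('(', $suffix, ')') else ()),
        ' ')"/>
  </xsl:function>

  <!-- Assumed input shape: <names><name>Bill T Wilson Jr</name>...</names> -->
  <xsl:template match="/">
    <xsl:for-each select="//name">
      <xsl:value-of select="f:invert-name(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Note that a one-word name comes out with a trailing comma (for example "Cher,"), so such cases would need separate handling if they occur in the data.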

Michael Kay
Saxonica

On 05/11/2012 23:45, Mark wrote:
This must have been done many times, so can someone show me where to find the answer?

I have a series of personal names in natural order that I need to invert. The surname is always last except when followed by "Jr" or "Sr" (either of which may not be present). I want to represent:

J Allen Rogers -> Rogers, J Allen
Bill T Wilson Jr -> Wilson, Bill T (Jr)
A B Brown -> Brown, A B
John Victor Case Sr -> Case, John Victor (Sr)

and so on. There may be a single space or multiple spaces between some of the elements of the name.

It looks like <xsl:analyze-string> will do this, but I do not know how to write regex.
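
For reference, one possible shape of an <xsl:analyze-string> solution is sketched below. It is only an illustration under stated assumptions: the names are taken to sit in <name> elements (a hypothetical structure), "Jr" and "Sr" are the only suffixes, and XSLT 2.0 is available. The regex reads: given names (group 1, as short as possible), a space, the surname (group 2), then optionally a space and the suffix (group 4).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumed input shape: <names><name>John Victor Case Sr</name>...</names> -->
  <xsl:template match="/">
    <xsl:apply-templates select="//name"/>
  </xsl:template>

  <xsl:template match="name">
    <!-- normalize-space() reduces multiple spaces to one before matching -->
    <xsl:analyze-string select="normalize-space(.)"
        regex="^(.+?) (\S+)( (Jr|Sr))?$">
      <xsl:matching-substring>
        <!-- Surname first, then the given names -->
        <xsl:value-of
            select="concat(regex-group(2), ', ', regex-group(1))"/>
        <!-- Group 4 is a zero-length string when no suffix was present -->
        <xsl:if test="regex-group(4) != ''">
          <xsl:value-of select="concat(' (', regex-group(4), ')')"/>
        </xsl:if>
      </xsl:matching-substring>
      <xsl:non-matching-substring>
        <!-- Names that do not fit the pattern are copied unchanged -->
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

Because unmatched names pass through unchanged, comparing the output with the input is a quick way to spot names that do not follow the expected pattern.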

Thanks,
Mark
