[xsl] Using analyze-string to catch roman numerals?

From: "Tony Zanella" <tony.zanella@xxxxxxxxx>
Date: Thu, 9 Oct 2008 15:18:01 -0400

Hello all,

Given the following input:

<root>
    <head>CHAPTER II. THE WRECKED FOUNDATIONS OF DOMESTICITY</head>
    <head>PROBLEMA. HELOISE XXIX.</head>
    <head>Selected Letters</head>
    <head>The Second Part of Henry IV.</head>
    <head>VIII</head>
    <head>APPENDIX VII</head>
    <head>Appendix VII</head>
    <head>APPENDIX</head>
    <head>CALVIN XVII</head>
    <head>ILLUSTRATION</head>
</root>

and the following template:

<xsl:template match="head">
    <xsl:choose>
        <xsl:when test="not(matches(.,'^(.*?)([IVXL]+)(.*?)$'))">
            <xsl:value-of select="lower-case(.)"/>
        </xsl:when>
        <xsl:when test="matches(.,'^(.*?)([IVXL]+)(.*?)$')">
            <xsl:analyze-string select="." regex="^(.*?)([IVXL]+)(.*?)$">
                <xsl:matching-substring>
                    <xsl:value-of select="lower-case(regex-group(1))"/>
                    <xsl:value-of select="upper-case(regex-group(2))"/>
                    <xsl:value-of select="lower-case(regex-group(3))"/>
                </xsl:matching-substring>
            </xsl:analyze-string>
        </xsl:when>
        <xsl:otherwise/>
    </xsl:choose>
</xsl:template>

I'm trying to use analyze-string to do the following:
Test each head for a roman numeral. If there isn't one, lower-case the
whole string. If there is one, split the string into its roman-numeral
and non-roman-numeral parts, and lower-case only the latter.

The output I get is:

    chapter II. the wrecked foundations of domesticity
    probLema. heloise xxix.
    selected Letters
    the second part of henry IV.
    VIII
    appendIX vii
    appendix VII
    appendIX
    caLVIn xvii
    ILLustration

When what I want is this:

    chapter II. the wrecked foundations of domesticity
    problema. heloise XXIX.
    selected letters
    the second part of henry IV.
    VIII
    appendix VII
    appendix VII
    appendix
    calvin XVII
    illustration
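
My guess at what goes wrong: the reluctant first group stops at the
first I, V, X or L anywhere in the string, so letters inside ordinary
words ("probLema", "appendIX") get captured as the numeral. XSLT 2.0
regexes have no \b word-boundary anchor, so one direction I have been
wondering about is to split each head into words first and test each
word on its own. The following is only a rough sketch under that
assumption, not a tested solution. It assumes that any whole word made
up solely of upper-case I, V, X and L is a roman numeral, so it would
also keep real words such as "I" or "ILL" in upper case:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="text"/>

    <xsl:template match="head">
        <!-- Split the heading into runs of word characters (\w+)
             and everything in between (spaces, punctuation). -->
        <xsl:analyze-string select="." regex="\w+">
            <xsl:matching-substring>
                <xsl:choose>
                    <!-- Assumption: a whole word consisting only of
                         I, V, X, L is a roman numeral; keep it as is. -->
                    <xsl:when test="matches(., '^[IVXL]+$')">
                        <xsl:value-of select="."/>
                    </xsl:when>
                    <!-- Any other word is lower-cased. -->
                    <xsl:otherwise>
                        <xsl:value-of select="lower-case(.)"/>
                    </xsl:otherwise>
                </xsl:choose>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Spaces and punctuation pass through unchanged. -->
                <xsl:value-of select="."/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>

</xsl:stylesheet>

Because the inner test is case-sensitive and anchored with ^ and $, a
mixed-case word such as "Appendix" should simply be lower-cased, and
"ILLUSTRATION" should no longer be split. I have kept the character
class to [IVXL] as in my template above; C, D and M would need adding
for larger numerals.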

I'm relatively inexperienced with both regexes and XSLT, so thanks in
advance for any help!
Tony
