[xsl] How to map XML vocabulary 1 to XML vocabulary 2 and vice versa?

Subject: [xsl] How to map XML vocabulary 1 to XML vocabulary 2 and vice versa?
From: "Costello, Roger L. costello@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 4 Apr 2019 19:02:41 -0000
Hi Folks,

I have two XML vocabularies with closely related data. I want to convert XML
instances that conform to vocabulary 1 into equivalent instances that conform
to vocabulary 2, and vice versa. The data in the two vocabularies might be
structured quite differently.

Below is one approach for doing this kind of mapping between vocabularies. I
am interested in hearing other approaches.

The two vocabularies deal with magnetic variation.

For those who are unaware, magnetic variation is the difference between true
north and magnetic north.

Here is an XML instance that uses vocabulary 1:

<MAG_VAR>W014404</MAG_VAR>

I want to convert it to vocabulary 2:

<magneticVariation>
    <magneticVariationEWT>West</magneticVariationEWT>
    <magneticVariationValue>14.7</magneticVariationValue>
</magneticVariation>

And vice-versa.

Here is the mapping of the direction indicator between the two formats:

Vocabulary 1    Vocabulary 2
W               West
E               East
n/a             True

Vocabulary 1 expresses magnetic variation in degrees, minutes, and tenths of
minutes. Here is the structure of the magnetic variation data:

	xyyyzzw

x = W or E corresponding to West and East, respectively
yyy = the part of the magnetic variation in degrees
zz = the part of the magnetic variation in minutes
w = the part of the magnetic variation in tenths of a minute

So, the data W014404 means the magnetic variation is West, 014 degrees, 40.4
minutes.

Vocabulary 2 expresses magnetic variation as a decimal value 0 to 180, with
one digit to the right of the decimal point.

So, the data West, 14.7 means the magnetic variation is 14.7 degrees West.

Converting W014404 to West, 14.7 requires:

Map the first character of W014404 to West.
Map the second, third, and fourth characters to 14 degrees.
Map the fifth, sixth, and seventh characters (404, i.e. 40.4 minutes) to .7
degrees. This involves some arithmetic: 40.4 minutes / 60 = 0.673 degrees,
which rounds to 0.7.
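
As a minimal sketch of that arithmetic, the standalone stylesheet below
hard-codes the sample value and prints the decimal degrees as text. The named
template "main" and the variable names are illustrative assumptions, not part
of either vocabulary; with Saxon it could be started with the -it:main option
instead of a source document.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <xsl:output method="text"/>

    <!-- Hypothetical entry point; the name "main" is an assumption -->
    <xsl:template name="main">
        <xsl:variable name="magVar" select="'W014404'"/>
        <!-- Characters 2 to 4: whole degrees (014) -->
        <xsl:variable name="degrees"
            select="xs:integer(substring($magVar, 2, 3))"/>
        <!-- Characters 5 to 7: tenths of a minute (404 means 40.4 minutes) -->
        <xsl:variable name="tenths-of-minute"
            select="xs:integer(substring($magVar, 5, 3))"/>
        <!-- One degree is 60 minutes, i.e. 600 tenths of a minute -->
        <xsl:variable name="decimal-degrees"
            select="$degrees + $tenths-of-minute div 600"/>
        <!-- Round to one decimal place -->
        <xsl:value-of select="round($decimal-degrees * 10) div 10"/>
    </xsl:template>
</xsl:stylesheet>
```

For W014404 this prints 14.7, since 14 + 404/600 is roughly 14.673.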

Doing the reverse mapping from West, 14.7 to W014404 requires:

Map West to W.
Map 14 to 014.
Map .7 to 420 (0.7 degrees x 60 = 42.0 minutes). Notice the loss of
precision: the round trip yields W014420, not the original W014404.
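
The reverse arithmetic can be sketched the same way. This standalone
stylesheet hard-codes West and 14.7; the named template "main" and the use of
format-number() for zero padding are my assumptions for illustration, not
something taken from either vocabulary's specification.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <xsl:output method="text"/>

    <!-- Hypothetical entry point; the name "main" is an assumption -->
    <xsl:template name="main">
        <xsl:variable name="value" select="xs:decimal('14.7')"/>
        <!-- Whole degrees: 14 -->
        <xsl:variable name="degrees" select="xs:integer(floor($value))"/>
        <!-- Fractional degrees times 600 gives tenths of a minute: 420 -->
        <xsl:variable name="tenths-of-minute"
            select="($value - $degrees) * 600"/>
        <!-- Pad both parts to three digits and prefix the direction code -->
        <xsl:value-of select="concat('W',
            format-number($degrees, '000'),
            format-number($tenths-of-minute, '000'))"/>
    </xsl:template>
</xsl:stylesheet>
```

This prints W014420. The padding matters for small fractions: .1 degrees is
60 tenths of a minute and has to be written as 060.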

One approach to implementing these mappings is to use XSLT: one template that
maps vocabulary 1 to vocabulary 2, and a second template that maps vocabulary
2 to vocabulary 1. The templates can be placed in the same XSLT file and
distinguished using XSLT modes. To run the vocabulary 1 to vocabulary 2
mapping, we can invoke Saxon on the command line and select the initial mode
with the -im flag:

java -jar saxon/saxon9he.jar -s:magnetic-variation-vocab1.xml \
    -xsl:transform-magnetic-variation.xsl -o:magnetic-variation-vocab2.xml \
    -im:vocab1_to_vocab2
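
For the -im flag to have something to select, both moded templates need to
sit in one stylesheet that declares the xs prefix used by the XPath
constructor functions. A rough skeleton, with the template bodies left as
comments, might look like this:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <xsl:output method="xml" indent="yes"/>

    <!-- Runs when the initial mode is vocab1_to_vocab2 -->
    <xsl:template match="MAG_VAR" mode="vocab1_to_vocab2">
        <!-- build the magneticVariation element here -->
    </xsl:template>

    <!-- Runs when the initial mode is vocab2_to_vocab1 -->
    <xsl:template match="magneticVariation" mode="vocab2_to_vocab1">
        <!-- build the MAG_VAR element here -->
    </xsl:template>

</xsl:stylesheet>
```

Running the reverse mapping would then only require swapping the input and
output files and passing -im:vocab2_to_vocab1.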

At the bottom of this message is the actual XSLT code to perform the mapping.
It works, but I am wondering if there is a better approach to mapping
vocabularies than hand-coding a bunch of template rules? Perhaps an
alternative approach would be to write a regular expression description of
each vocabulary's data and then write code that consumes the regex
descriptions and automatically generates XSLT templates? How would a regex
indicate the units? Would that approach need augmenting to indicate units?
Perhaps there are other approaches?
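
To make the regex idea a little more concrete, here is a hand-written sketch
(not a generator) of what a regex-driven vocabulary 1 template could look
like. The regex describes the xyyyzzw structure, and a small lookup table
holds the direction pairs in one place. The names "directions", "direction",
"vocab1" and "vocab2" are made up for this example. Note that the regex only
describes the shape of the data; the knowledge that group 3 is in tenths of a
minute still lives in the arithmetic.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <xsl:output method="xml" indent="yes"/>

    <!-- Hypothetical lookup table pairing the direction codes -->
    <xsl:variable name="directions">
        <direction vocab1="W" vocab2="West"/>
        <direction vocab1="E" vocab2="East"/>
    </xsl:variable>

    <xsl:template match="MAG_VAR" mode="vocab1_to_vocab2">
        <!-- regex is an attribute value template, so literal braces
             are doubled -->
        <xsl:analyze-string select="normalize-space(.)"
            regex="^([WE])(\d{{3}})(\d{{3}})$">
            <xsl:matching-substring>
                <xsl:variable name="code" select="regex-group(1)"/>
                <!-- Group 2 is degrees; group 3 is tenths of a minute
                     (600 per degree) -->
                <xsl:variable name="value"
                    select="xs:integer(regex-group(2))
                            + xs:integer(regex-group(3)) div 600"/>
                <magneticVariation>
                    <magneticVariationEWT>
                        <xsl:value-of
                            select="$directions/direction[@vocab1 = $code]/@vocab2"/>
                    </magneticVariationEWT>
                    <magneticVariationValue>
                        <xsl:value-of
                            select="format-number(round($value * 10) div 10, '0.0')"/>
                    </magneticVariationValue>
                </magneticVariation>
            </xsl:matching-substring>
            <xsl:non-matching-substring>
                <!-- Anything that does not fit the pattern is rejected -->
                <xsl:sequence select="error(xs:QName('MAG_VAR'),
                    concat('Unexpected MAG_VAR value: ', .))"/>
            </xsl:non-matching-substring>
        </xsl:analyze-string>
    </xsl:template>
</xsl:stylesheet>
```

The same lookup table could serve the reverse direction by selecting
direction[@vocab2 = ...]/@vocab1, which is one small step away from
hand-coding each xsl:choose twice.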

Here is XSLT that converts vocabulary 1 to vocabulary 2:

<xsl:template match="MAG_VAR" mode="vocab1_to_vocab2">
    <magneticVariation>
        <xsl:variable name="magVar" select="./text()"/>
        <magneticVariationEWT>
            <xsl:choose>
                <xsl:when test="starts-with($magVar, 'W')">West</xsl:when>
                <xsl:when test="starts-with($magVar, 'E')">East</xsl:when>
                <xsl:otherwise>
                    <xsl:value-of select="error(xs:QName('MAG_VAR'), 'Invalid first character')"/>
                </xsl:otherwise>
            </xsl:choose>
        </magneticVariationEWT>
        <magneticVariationValue>
            <!-- Characters 2 to 4 are degrees; characters 5 to 7 are tenths
                 of a minute (600 per degree) -->
            <xsl:variable name="unrounded-value"
                select="xs:integer(substring($magVar, 2, 3)) + (xs:integer(substring($magVar, 5, 3)) div 600)"/>
            <!-- Round to one decimal place -->
            <xsl:variable name="rounded-value" select="round($unrounded-value * 10) div 10"/>
            <!-- Always emit one digit after the decimal point (14.0, not 14) -->
            <xsl:value-of select="format-number($rounded-value, '0.0')"/>
        </magneticVariationValue>
    </magneticVariation>
</xsl:template>

Here is XSLT that converts vocabulary 2 to vocabulary 1:

<xsl:template match="magneticVariation" mode="vocab2_to_vocab1">
    <MAG_VAR>
        <xsl:choose>
            <xsl:when test="magneticVariationEWT eq 'West'">W</xsl:when>
            <xsl:when test="magneticVariationEWT eq 'East'">E</xsl:when>
            <xsl:when test="magneticVariationEWT eq 'True'">
                <xsl:value-of select="error(xs:QName('magneticVariation'), 'Vocab1 does not support magVar True')"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:value-of select="error(xs:QName('magneticVariation'), 'Invalid value for magneticVariationEWT')"/>
            </xsl:otherwise>
        </xsl:choose>
        <!-- f:make-3-digits is a helper function that is not shown here -->
        <xsl:variable name="degrees"
            select="f:make-3-digits(substring-before(magneticVariationValue, '.'))"/>
        <xsl:value-of select="$degrees"/>
        <xsl:variable name="tenths-degree"
            select="concat('.', substring-after(magneticVariationValue, '.'))"/>
        <!-- Despite its name, this holds tenths of a minute:
             .7 degrees * 600 = 420 -->
        <xsl:variable name="minutes" select="xs:decimal($tenths-degree) * 600"/>
        <!-- Pad to three digits so that, e.g., .1 degrees becomes 060 -->
        <xsl:value-of select="format-number($minutes, '000')"/>
    </MAG_VAR>
</xsl:template>
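
The helper f:make-3-digits is not shown above. Assuming it simply left-pads
the degrees with zeros to three digits, one possible definition is sketched
below. The namespace URI bound to the f prefix is a placeholder of my
choosing, and the parameter name is illustrative.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.org/functions"
    exclude-result-prefixes="xs f">

    <!-- Assumed behavior: '14' becomes '014', '7' becomes '007' -->
    <xsl:function name="f:make-3-digits" as="xs:string">
        <xsl:param name="digits" as="xs:string"/>
        <!-- xs:integer() raises an error for an empty or non-numeric
             string -->
        <xsl:sequence select="format-number(xs:integer($digits), '000')"/>
    </xsl:function>

</xsl:stylesheet>
```

With this definition, a magneticVariationValue that lacks a decimal point
(say 14) would fail, because substring-before() would return an empty string.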

/Roger
