[xsl] Inferring XSD schema from XML document

Subject: [xsl] Inferring XSD schema from XML document
From: "Will Hawes" <info@xxxxxxxxxxxx>
Date: Fri, 24 Jan 2003 08:52:54 -0000
I'm using an XSL transformation to infer an XSD schema for an XML
document for the purposes of creating a database table. I am using a
Customers table to illustrate matters here but please note that I'm
looking for a generic solution.

XML:

<root>
<Customers>
<Company/>
</Customers>
<Customers>
<CustomerID/>
<Company/>
</Customers>
</root>

The XSL is used to generate elements containing complexTypes for any
node in the XML with children (e.g. a database table), and elements
containing simpleTypes for each node with no child nodes (e.g. columns
in a table):

XSL:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema">

<xsl:output
method="xml"/>

<xsl:key name="kDistinctComplexTypes" match="*[count(*) &gt; 0]"
use="local-name()"/>
<xsl:key name="kDistinctSimpleTypes" match="*[count(*) = 0]"
use="local-name()"/>

<xsl:template match="/">
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xsl:apply-templates select="*"/>
</xs:schema>
</xsl:template>

<xsl:template match="*"/>

<!-- complexType -->
<xsl:template match="*[generate-id() =
generate-id(key('kDistinctComplexTypes',local-name()))]">
<xs:element>
<xsl:attribute name="name">
<xsl:value-of select="local-name()"/>
</xsl:attribute>

<xsl:variable name="ElementName">
<xsl:choose>
<xsl:when test="count(*/*) &gt; 0">xs:choice</xsl:when>
<xsl:otherwise>xs:sequence</xsl:otherwise>
</xsl:choose>
</xsl:variable>

<xs:complexType>
<xsl:element name="{$ElementName}">
<!-- output min/maxOccurs for xs:choice -->
<xsl:if test="$ElementName = 'xs:choice'">
<xsl:attribute name="minOccurs">0</xsl:attribute>
<xsl:attribute name="maxOccurs">unbounded</xsl:attribute>
</xsl:if>

<xsl:apply-templates select="*[generate-id() =
generate-id(key('kDistinctComplexTypes',local-name()))]"/>

<xsl:variable name="CurrentLocalName" select="local-name()"/>
<xsl:for-each select="../*/*[generate-id() =
generate-id(key('kDistinctSimpleTypes',local-name())[local-name(..) =
$CurrentLocalName])]">
<xs:element>
<xsl:attribute name="name">
<xsl:value-of select="local-name()"/>
</xsl:attribute>
<xsl:attribute name="type">
<xsl:text>xs:string</xsl:text>
</xsl:attribute>
<xsl:attribute name="minOccurs">0</xsl:attribute>
</xs:element>
</xsl:for-each>

</xsl:element>
</xs:complexType>

</xs:element>
</xsl:template>

</xsl:stylesheet>


For each distinctly named node with children in the source document,
the stylesheet outputs a corresponding complexType in the schema. It
then outputs a simpleType for every distinct childless child found
under that node or under any sibling with the same name. In the above
XML for example, a single "Customers" complexType would be output with
child simpleTypes of "CustomerID" and "Company".
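
To make the classification rule concrete outside XSLT, here is a rough
sketch in Python using only the standard library. It is not part of the
stylesheet; the function name classify and the SAMPLE string are made up
for illustration. Like the two keys in the stylesheet, it treats a name
as "complex" if some element with that name has children and as "simple"
if some element with that name has none.

```python
import xml.etree.ElementTree as ET

# The sample document from this post (hypothetical constant name).
SAMPLE = """<root>
<Customers><Company/></Customers>
<Customers><CustomerID/><Company/></Customers>
</root>"""


def classify(xml_text):
    """Return (complex_names, simple_names) as two sets of element names.

    A name lands in complex_names if any element with that name has
    child elements, and in simple_names if any element with that name
    has none. A name could therefore appear in both sets.
    """
    root = ET.fromstring(xml_text)
    complex_names, simple_names = set(), set()
    for elem in root.iter():  # root.iter() includes the root itself
        if len(elem) > 0:
            complex_names.add(elem.tag)
        else:
            simple_names.add(elem.tag)
    return complex_names, simple_names
```

For the sample, "root" and "Customers" come out as complex (tables) and
"Company" and "CustomerID" as simple (columns).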

This part works after a fashion, but there is a problem with the
ordinal positions of simpleType nodes in the resulting schema. In the
above XML document for example, the order of the children for a
Customers node should clearly be "CustomerID", "Company". However, the
stylesheet actually outputs these nodes in the order "Company",
"CustomerID".

I can see why this happens: by default the simpleTypes are output in
document order. In this case, "Company" appears in the document as a
child of the first "Customers" node and before the first instance of
"CustomerID", which is a child of the second "Customers" node.
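
The effect can be reproduced outside XSLT. The following is only a
sketch in Python (standard library only; first_seen_order and SAMPLE are
hypothetical names) that collects distinct childless child names in the
order they are first met in the document, which is effectively what the
for-each in the stylesheet does.

```python
import xml.etree.ElementTree as ET

# The sample document from this post (hypothetical constant name).
SAMPLE = """<root>
<Customers><Company/></Customers>
<Customers><CustomerID/><Company/></Customers>
</root>"""


def first_seen_order(xml_text, parent_name):
    """List distinct childless child names of parent_name elements,
    in the order they first occur in the document."""
    root = ET.fromstring(xml_text)
    seen = []
    for parent in root.iter(parent_name):  # visits parents in document order
        for child in parent:
            if len(child) == 0 and child.tag not in seen:
                seen.append(child.tag)
    return seen
```

Called as first_seen_order(SAMPLE, "Customers"), this yields "Company"
before "CustomerID", matching the order the stylesheet produces.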

I need to ensure that the stylesheet follows the order of the relevant
parent node in the source XML and not document order. I can see that
this would be possible to do from code after the stylesheet has been
processed, but I'd be interested to know if it's possible to do within
the stylesheet itself and if so how I might go about this.
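
For reference, the "from code" route might look roughly like the
following. This is only a sketch in Python with made-up names
(merged_child_order, SAMPLE), and it rests on an assumption of mine:
that the wanted order is one where a column precedes another whenever it
does so inside at least one parent node. It records those "comes before"
pairs and then emits names one at a time, always picking a name with no
unplaced predecessor, using first-seen order only to break ties.

```python
import xml.etree.ElementTree as ET

# The sample document from this post (hypothetical constant name).
SAMPLE = """<root>
<Customers><Company/></Customers>
<Customers><CustomerID/><Company/></Customers>
</root>"""


def merged_child_order(xml_text, parent_name):
    """Order childless child names so that the relative order seen
    inside each individual parent element is respected where possible."""
    root = ET.fromstring(xml_text)
    names = []    # child names in first-seen (document) order
    before = {}   # name -> set of names seen after it within some parent
    for parent in root.iter(parent_name):
        tags = [child.tag for child in parent if len(child) == 0]
        for i, tag in enumerate(tags):
            if tag not in before:
                names.append(tag)
                before[tag] = set()
            for later in tags[i + 1:]:
                if later != tag:
                    before[tag].add(later)

    # Count, for each name, how many distinct names must precede it.
    indegree = {name: 0 for name in names}
    for name in names:
        for later in before[name]:
            indegree[later] += 1

    ordered = []
    remaining = list(names)
    while remaining:
        ready = [name for name in remaining if indegree[name] == 0]
        if not ready:
            # Parents disagree about the order (a cycle); fall back to
            # first-seen order for whatever is left.
            ordered.extend(remaining)
            break
        chosen = ready[0]
        ordered.append(chosen)
        remaining.remove(chosen)
        for later in before[chosen]:
            indegree[later] -= 1
    return ordered
```

For the sample document this places "CustomerID" ahead of "Company".
It does not answer whether the same thing can be done inside the
stylesheet, which is what I'd really like to know.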

Regards

Will


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

