Subject: RE: [xsl] Re: up-converting
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 28 Sep 2004 14:23:33 +0100

I haven't looked through your code in detail, but it looks similar to a
problem I used as an exercise at the Oxford Summer School. Here we had a set
of records with COBOL-like level numbers
<A level="1"/>
<B level="2"/>
<C level="3"/>
<D level="2"/>
and the task was to create a hierarchically nested structure. (The actual
input was a GEDCOM file.)
The solution is a recursive grouping like this:
<xsl:template name="g">
  <xsl:param name="sequence" as="element()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <!-- each element at the current level starts a new group -->
  <xsl:for-each-group select="$sequence"
                      group-starting-with="*[@level=$level]">
    <xsl:copy>
      <!-- nest the rest of the group one level deeper -->
      <xsl:call-template name="g">
        <xsl:with-param name="sequence" select="current-group() except ."/>
        <xsl:with-param name="level" select="$level+1"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:for-each-group>
</xsl:template>
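
For anyone wanting to try the template, here is a minimal sketch of a complete
XSLT 2.0 stylesheet that calls it. The wrapper element name "records" and the
starting level of 1 are assumptions for illustration; they are not part of the
original exercise. Note that xsl:copy copies the element itself but not its
attributes, so the level attribute does not appear in the result.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- "records" is a hypothetical wrapper around the flat A, B, C, D elements -->
  <xsl:template match="records">
    <records>
      <xsl:call-template name="g">
        <xsl:with-param name="sequence" select="*"/>
        <xsl:with-param name="level" select="1"/>
      </xsl:call-template>
    </records>
  </xsl:template>

  <!-- recursive grouping: one call per nesting level -->
  <xsl:template name="g">
    <xsl:param name="sequence" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:for-each-group select="$sequence"
                        group-starting-with="*[@level=$level]">
      <xsl:copy>
        <!-- an empty remainder produces no groups, which ends the recursion -->
        <xsl:call-template name="g">
          <xsl:with-param name="sequence" select="current-group() except ."/>
          <xsl:with-param name="level" select="$level+1"/>
        </xsl:call-template>
      </xsl:copy>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With the four sample records above placed inside a records element, this
should nest C inside B, and B and D inside A. The sketch assumes the first
record in each sequence is at the expected level; if it is not, the leading
group is still copied as if its first element were the group head.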
Now it seems to me your problem is very similar, except you have no explicit
level number. But I think you could use a similar approach, where the same
template is used for each level of grouping and the only thing that changes
is the grouping key.
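
A minimal sketch of that idea, for illustration only: it assumes the heading
elements are named exactly a, b, c, d, e (held in a hypothetical $heads
variable), that everything else is content, and that the desired wrappers are
div-a, div-b, and so on, each with a title child. The template name "nest" is
made up here. Instead of wrapping every level and later discarding divs that
have no title, this variant emits no wrapper when a group does not start with
the heading for the current level, which is one way to cope with omitted
levels.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <xsl:output method="xml" indent="yes"/>

  <!-- assumed heading element names, outermost level first -->
  <xsl:variable name="heads" as="xs:string*" select="('a', 'b', 'c', 'd', 'e')"/>

  <xsl:template match="document">
    <document>
      <xsl:call-template name="nest">
        <xsl:with-param name="sequence" select="*"/>
        <xsl:with-param name="level" select="1"/>
      </xsl:call-template>
    </document>
  </xsl:template>

  <xsl:template name="nest">
    <xsl:param name="sequence" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:choose>
      <!-- past the deepest heading level: what remains is plain content -->
      <xsl:when test="$level gt count($heads)">
        <xsl:copy-of select="$sequence"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="name" as="xs:string" select="$heads[$level]"/>
        <xsl:for-each-group select="$sequence"
                            group-starting-with="*[local-name() = $name]">
          <xsl:choose>
            <!-- the group starts with a heading for this level: wrap it -->
            <xsl:when test="local-name() = $name">
              <xsl:element name="div-{$name}">
                <title><xsl:copy-of select="node()"/></title>
                <xsl:call-template name="nest">
                  <xsl:with-param name="sequence" select="current-group() except ."/>
                  <xsl:with-param name="level" select="$level + 1"/>
                </xsl:call-template>
              </xsl:element>
            </xsl:when>
            <!-- no heading at this level (omitted level or leading content):
                 go one level deeper without emitting a wrapper -->
            <xsl:otherwise>
              <xsl:call-template name="nest">
                <xsl:with-param name="sequence" select="current-group()"/>
                <xsl:with-param name="level" select="$level + 1"/>
              </xsl:call-template>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The only thing that changes from one level to the next is the element name
looked up in $heads, so adding a deeper level should only mean adding a name
to that list.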
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Jim_Albright@xxxxxxxxxxxx [mailto:Jim_Albright@xxxxxxxxxxxx]
> Sent: 28 September 2004 13:07
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: RE: [xsl] Re: up-converting
>
> I have a solution for the up-converting problem that I had. It isn't as
> elegant as I was hoping for. Maybe someone here can give me a few more
> pointers.
> Thanks again for including the for-each-group structure, as that makes the
> solution much easier.
>
> My general problem is conversion of a flat XML (WordML) document to one
> with hierarchy.
> After tossing out all of the formatting info, the first step is to map all
> the paragraphs that indicate divs to their appropriate level. I use a, b,
> c, d, ... for new element names in order to make this more generic.
> The a, b, c, ... indicate the head or title for the div.
> The div may be nested: a, b, c, d.
> Some divs may be omitted: a, b, d.
> Divs may be followed by other divs or paragraphs. Paragraphs may contain
> spans.
>
> Next use the for-each-group structure to put an aa element around the a
> elements.
> Next use the for-each-group structure to put a bb element around the b
> elements, and aaa instead of aa.
> ...
> Each of these steps builds the required hierarchy one step at a time.
> Since some divs may be omitted I couldn't find a way to combine these
> steps.
> Next the head/title is pulled out.
> Toss out any div with no head/title.
>
> sample input
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
> <a>level aaaa head 1</a>
> <b>level bbbb head 2</b>
> <c>level ccccc head 3</c>
> <dfg>cc 4 blah</dfg>
> <e>level eeee head 5 </e>
> <fhh>cc blah 6</fhh>
> <c>level ccccc head 7</c>
> <df>cc 8 blah<kkk>kkk within df within c</kkk>
> </df>
> <d>level dddd head 9</d>
> <iuo>dd 10 blah</iuo>
> <jtt>dd blah 11</jtt>
> <c>level ccccc head 12</c>
> <df>cc 13 blah</df>
> <e>cc level eeeee head 14</e>
> <fss>ee blah 15</fss>
> <b>level bbbbb head 16</b>
> <c>level ccccc head 17</c>
> <df>cc 18 blah</df>
> <e>cc level eeeee head 19</e>
> <fhy>ee blah 20</fhy>
> </document>
>
> and the required output is
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <div-a>
>     <title>level aaaa head 1</title>
>     <div-b>
>       <title>level bbbb head 2</title>
>       <div-c>
>         <title>level ccccc head 3</title>
>         <dfg>cc 4 blah</dfg>
>         <div-e>
>           <title>level eeee head 5 </title>
>           <fhh>cc blah 6</fhh>
>         </div-e>
>       </div-c>
>       <div-c>
>         <title>level ccccc head 7</title>
>         <df>cc 8 blah<kkk>kkk within df within c</kkk>
>         </df>
>         <div-d>
>           <title>level dddd head 9</title>
>           <iuo>dd 10 blah</iuo>
>           <jtt>dd blah 11</jtt>
>         </div-d>
>       </div-c>
>       <div-c>
>         <title>level ccccc head 12</title>
>         <df>cc 13 blah</df>
>         <div-e>
>           <title>cc level eeeee head 14</title>
>           <fss>ee blah 15</fss>
>         </div-e>
>       </div-c>
>     </div-b>
>     <div-b>
>       <title>level bbbbb head 16</title>
>       <div-c>
>         <title>level ccccc head 17</title>
>         <df>cc 18 blah</df>
>         <div-e>
>           <title>cc level eeeee head 19</title>
>           <fhy>ee blah 20</fhy>
>         </div-e>
>       </div-c>
>     </div-b>
>   </div-a>
> </document>
>
>
>
> Next use the for-each-group structure to put an aa element around the a
> elements.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <!-- xsl:for-each-group requires an XSLT 2.0 processor -->
>       <xsl:for-each-group select="*" group-starting-with="a">
>         <aa>
>           <xsl:copy-of select="current-group()"/>
>         </aa>
>       </xsl:for-each-group>
>     </document>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
>
> with output of
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
> <aa>
> <a>level aaaa head 1</a>
> <b>level bbbb head 2</b>
> <c>level ccccc head 3</c>
> <dfg>cc 4 blah</dfg>
> <e>level eeee head 5 </e>
> <fhh>cc blah 6</fhh>
> <c>level ccccc head 7</c>
> <df>cc 8 blah<kkk>kkk within df within c</kkk>
> </df>
> <d>level dddd head 9</d>
> <iuo>dd 10 blah</iuo>
> <jtt>dd blah 11</jtt>
> <c>level ccccc head 12</c>
> <df>cc 13 blah</df>
> <e>cc level eeeee head 14</e>
> <fss>ee blah 15</fss>
> <b>level bbbbb head 16</b>
> <c>level ccccc head 17</c>
> <df>cc 18 blah</df>
> <e>cc level eeeee head 19</e>
> <fhy>ee blah 20</fhy>
> </aa>
> </document>
>
> Next use the for-each-group structure to put a bb element around the b
> elements, and aaa instead of aa.
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aa">
>     <aaa>
>       <!-- xsl:for-each-group requires an XSLT 2.0 processor -->
>       <xsl:for-each-group select="*" group-starting-with="b">
>         <bb>
>           <xsl:copy-of select="current-group()"/>
>         </bb>
>       </xsl:for-each-group>
>     </aaa>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
> <aaa>
> <bb>
> <a>level aaaa head 1</a>
> </bb>
> <bb>
> <b>level bbbb head 2</b>
> <c>level ccccc head 3</c>
> <dfg>cc 4 blah</dfg>
> <e>level eeee head 5 </e>
> <fhh>cc blah 6</fhh>
> <c>level ccccc head 7</c>
> <df>cc 8 blah<kkk>kkk within df within c</kkk>
>
> </df>
> <d>level dddd head 9</d>
> <iuo>dd 10 blah</iuo>
> <jtt>dd blah 11</jtt>
> <c>level ccccc head 12</c>
> <df>cc 13 blah</df>
> <e>cc level eeeee head 14</e>
> <fss>ee blah 15</fss>
> </bb>
> <bb>
> <b>level bbbbb head 16</b>
> <c>level ccccc head 17</c>
> <df>cc 18 blah</df>
> <e>cc level eeeee head 19</e>
> <fhy>ee blah 20</fhy>
> </bb>
> </aaa>
> </document>
>
>
> Continue adding the levels:
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aaa">
>     <aaa>
>       <xsl:apply-templates/>
>     </aaa>
>   </xsl:template>
>   <xsl:template match="bb">
>     <bbb>
>       <!-- xsl:for-each-group requires an XSLT 2.0 processor -->
>       <xsl:for-each-group select="*" group-starting-with="c">
>         <cc>
>           <xsl:copy-of select="current-group()"/>
>         </cc>
>       </xsl:for-each-group>
>     </bbb>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
> <aaa>
> <bbb>
> <cc>
> <a>level aaaa head 1</a>
> </cc>
> </bbb>
> <bbb>
> <cc>
> <b>level bbbb head 2</b>
> </cc>
> <cc>
> <c>level ccccc head 3</c>
> <dfg>cc 4 blah</dfg>
> <e>level eeee head 5 </e>
> <fhh>cc blah 6</fhh>
> </cc>
> <cc>
> <c>level ccccc head 7</c>
> <df>cc 8 blah<kkk>kkk within df within c</kkk>
>
>
> </df>
> <d>level dddd head 9</d>
> <iuo>dd 10 blah</iuo>
> <jtt>dd blah 11</jtt>
> </cc>
> <cc>
> <c>level ccccc head 12</c>
> <df>cc 13 blah</df>
> <e>cc level eeeee head 14</e>
> <fss>ee blah 15</fss>
> </cc>
> </bbb>
> <bbb>
> <cc>
> <b>level bbbbb head 16</b>
> </cc>
> <cc>
> <c>level ccccc head 17</c>
> <df>cc 18 blah</df>
> <e>cc level eeeee head 19</e>
> <fhy>ee blah 20</fhy>
> </cc>
> </bbb>
> </aaa>
> </document>
>
> ....
>
>
> Finally, at aaa, see if there is a descendant a. If so, that is the title
> for this group; otherwise there is no title.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:strip-space elements="*"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aaa">
>     <div-a>
>       <xsl:choose>
>         <xsl:when test="descendant::a">
>           <title>
>             <xsl:apply-templates select="descendant::a"/>
>           </title>
>           <xsl:apply-templates select="child::*"/>
>         </xsl:when>
>         <xsl:otherwise>
>           <xsl:apply-templates select="child::*"/>
>         </xsl:otherwise>
>       </xsl:choose>
>     </div-a>
>   </xsl:template>
>   <xsl:template match="bbb">
>     <xsl:choose>
>       <xsl:when test="descendant::b">
>         <div-b>
>           <title>
>             <xsl:apply-templates select="descendant::b"/>
>           </title>
>           <xsl:apply-templates select="child::*"/>
>         </div-b>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates select="child::*"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="ccc">
>     <xsl:choose>
>       <xsl:when test="descendant::c">
>         <div-c>
>           <title>
>             <xsl:apply-templates select="descendant::c"/>
>           </title>
>           <xsl:apply-templates select="descendant::*[preceding-sibling::c]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-c>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*[not(c)]|descendant::*[preceding-sibling::c]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="ddd">
>     <xsl:choose>
>       <xsl:when test="descendant::d">
>         <div-d>
>           <title>
>             <xsl:apply-templates select="descendant::d"/>
>           </title>
>           <xsl:apply-templates select="descendant::*[preceding-sibling::d]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-d>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*|descendant::*[preceding-sibling::d]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="eee">
>     <xsl:choose>
>       <xsl:when test="descendant::e">
>         <div-e>
>           <title>
>             <xsl:apply-templates select="descendant::e"/>
>           </title>
>           <xsl:apply-templates select="descendant::*[preceding-sibling::e]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-e>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*|descendant::*[preceding-sibling::e]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
>   <xsl:template match="a|b|c|d|e|f|g|h|i">
>     <xsl:apply-templates/>
>   </xsl:template>
>   <xsl:template match="@*|node()">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> And then we can get rid of divs that have no title, thus solving the
> missing div problem.
>
>
>
> Jim Albright
> 704 843-0582
> Wycliffe Bible Translators