Subject: RE: [xsl] Re: up-converting
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Tue, 28 Sep 2004 14:23:33 +0100
I haven't looked through your code in detail, but it looks similar to a
problem I used as an exercise at the Oxford Summer School. Here we had a set
of records with COBOL-like level numbers

<A level="1"/>
<B level="2"/>
<C level="3"/>
<D level="2"/>

and the task is to create a hierarchically nested structure. (The actual
input was a GEDCOM file.) The solution is a recursive grouping like this:

<xsl:template name="g">
  <xsl:param name="sequence" as="element()*"/>
  <xsl:param name="level" as="xs:integer"/>
  <xsl:for-each-group select="$sequence"
      group-starting-with="*[@level=$level]">
    <xsl:copy>
      <xsl:call-template name="g">
        <xsl:with-param name="sequence" select="current-group() except ."/>
        <xsl:with-param name="level" select="$level+1"/>
      </xsl:call-template>
    </xsl:copy>
  </xsl:for-each-group>
</xsl:template>

Now it seems to me your problem is very similar, except you have no explicit
level number. But I think you could use a similar approach, where the same
template is used for each level of grouping and the only thing that changes
is the grouping key.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: Jim_Albright@xxxxxxxxxxxx [mailto:Jim_Albright@xxxxxxxxxxxx]
> Sent: 28 September 2004 13:07
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: RE: [xsl] Re: up-converting
>
> I have a solution for the up-converting problem that I had. It isn't as
> elegant as I was hoping for. Maybe someone here can give me a few more
> pointers.
> Thanks again for including the for-each-group structure, as that makes the
> solution much easier.
>
> My general problem is conversion of a flat XML (WordML) document to one
> with hierarchy.
> After tossing out all of the formatting info:
> The first step is to map all the paragraphs that indicate divs to their
> appropriate level. I use a, b, c, d, ... for new element names in order to
> make this more generic.
> The a, b, c, indicate the head or title for the div.
> The div may be nested: a, b, c, d.
> Some divs may be omitted: a, b, d.
> Divs may be followed by other divs or paragraphs. Paragraphs may contain
> spans.
>
> Next use the for-each-group structure to put an aa element around the a
> elements.
> Next use the for-each-group structure to put a bb element around the b
> elements and aaa instead of aa.
> ...
> Each of these steps builds the required hierarchy one step at a time.
> Since some divs may be omitted I couldn't find a way to combine these
> steps.
> Next the head/title is pulled out.
> Toss out any div with no head/title.
>
> sample input
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <a>level aaaa head 1</a>
>   <b>level bbbb head 2</b>
>   <c>level ccccc head 3</c>
>   <dfg>cc 4 blah</dfg>
>   <e>level eeee head 5 </e>
>   <fhh>cc blah 6</fhh>
>   <c>level ccccc head 7</c>
>   <df>cc 8 blah<kkk>kkk within df within c</kkk>
>   </df>
>   <d>level dddd head 9</d>
>   <iuo>dd 10 blah</iuo>
>   <jtt>dd blah 11</jtt>
>   <c>level ccccc head 12</c>
>   <df>cc 13 blah</df>
>   <e>cc level eeeee head 14</e>
>   <fss>ee blah 15</fss>
>   <b>level bbbbb head 16</b>
>   <c>level ccccc head 17</c>
>   <df>cc 18 blah</df>
>   <e>cc level eeeee head 19</e>
>   <fhy>ee blah 20</fhy>
> </document>
>
> and the required output is
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <div-a>
>     <title>level aaaa head 1</title>
>     <div-b>
>       <title>level bbbb head 2</title>
>       <div-c>
>         <title>level ccccc head 3</title>
>         <dfg>cc 4 blah</dfg>
>         <div-e>
>           <title>level eeee head 5 </title>
>           <fhh>cc blah 6</fhh>
>         </div-e>
>       </div-c>
>       <div-c>
>         <title>level ccccc head 7</title>
>         <df>cc 8 blah<kkk>kkk within df within c</kkk>
>         </df>
>         <div-d>
>           <title>level dddd head 9</title>
>           <iuo>dd 10 blah</iuo>
>           <jtt>dd blah 11</jtt>
>         </div-d>
>       </div-c>
>       <div-c>
>         <title>level ccccc head 12</title>
>         <df>cc 13 blah</df>
>         <div-e>
>           <title>cc level eeeee head 14</title>
>           <fss>ee blah 15</fss>
>         </div-e>
>       </div-c>
>     </div-b>
>     <div-b>
>       <title>level bbbbb head 16</title>
>       <div-c>
>         <title>level ccccc head 17</title>
>         <df>cc 18 blah</df>
>         <div-e>
>           <title>cc level eeeee head 19</title>
>           <fhy>ee blah 20</fhy>
>         </div-e>
>       </div-c>
>     </div-b>
>   </div-a>
> </document>
>
>
> Next use the for-each-group structure to put an aa element around the a
> elements.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:for-each-group select="*" group-starting-with="a">
>         <aa>
>           <xsl:for-each select="current-group()">
>             <xsl:copy-of select="."/>
>           </xsl:for-each>
>         </aa>
>       </xsl:for-each-group>
>     </document>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
>
> with output of
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <aa>
>     <a>level aaaa head 1</a>
>     <b>level bbbb head 2</b>
>     <c>level ccccc head 3</c>
>     <dfg>cc 4 blah</dfg>
>     <e>level eeee head 5 </e>
>     <fhh>cc blah 6</fhh>
>     <c>level ccccc head 7</c>
>     <df>cc 8 blah<kkk>kkk within df within c</kkk>
>     </df>
>     <d>level dddd head 9</d>
>     <iuo>dd 10 blah</iuo>
>     <jtt>dd blah 11</jtt>
>     <c>level ccccc head 12</c>
>     <df>cc 13 blah</df>
>     <e>cc level eeeee head 14</e>
>     <fss>ee blah 15</fss>
>     <b>level bbbbb head 16</b>
>     <c>level ccccc head 17</c>
>     <df>cc 18 blah</df>
>     <e>cc level eeeee head 19</e>
>     <fhy>ee blah 20</fhy>
>   </aa>
> </document>
>
> Next use the for-each-group structure to put a bb element around the b
> elements and aaa instead of aa.
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aa">
>     <aaa>
>       <xsl:for-each-group select="*" group-starting-with="b">
>         <bb>
>           <xsl:for-each select="current-group()">
>             <xsl:copy-of select="."/>
>           </xsl:for-each>
>         </bb>
>       </xsl:for-each-group>
>     </aaa>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <aaa>
>     <bb>
>       <a>level aaaa head 1</a>
>     </bb>
>     <bb>
>       <b>level bbbb head 2</b>
>       <c>level ccccc head 3</c>
>       <dfg>cc 4 blah</dfg>
>       <e>level eeee head 5 </e>
>       <fhh>cc blah 6</fhh>
>       <c>level ccccc head 7</c>
>       <df>cc 8 blah<kkk>kkk within df within c</kkk>
>       </df>
>       <d>level dddd head 9</d>
>       <iuo>dd 10 blah</iuo>
>       <jtt>dd blah 11</jtt>
>       <c>level ccccc head 12</c>
>       <df>cc 13 blah</df>
>       <e>cc level eeeee head 14</e>
>       <fss>ee blah 15</fss>
>     </bb>
>     <bb>
>       <b>level bbbbb head 16</b>
>       <c>level ccccc head 17</c>
>       <df>cc 18 blah</df>
>       <e>cc level eeeee head 19</e>
>       <fhy>ee blah 20</fhy>
>     </bb>
>   </aaa>
> </document>
>
>
> continue adding the levels
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aaa">
>     <aaa>
>       <xsl:apply-templates/>
>     </aaa>
>   </xsl:template>
>   <xsl:template match="bb">
>     <bbb>
>       <xsl:for-each-group select="*" group-starting-with="c">
>         <cc>
>           <xsl:for-each select="current-group()">
>             <xsl:copy-of select="."/>
>           </xsl:for-each>
>         </cc>
>       </xsl:for-each-group>
>     </bbb>
>   </xsl:template>
>   <xsl:template match="@*|node()" name="copy-current-node">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> <?xml version="1.0" encoding="UTF-8"?>
> <document>
>   <aaa>
>     <bbb>
>       <cc>
>         <a>level aaaa head 1</a>
>       </cc>
>     </bbb>
>     <bbb>
>       <cc>
>         <b>level bbbb head 2</b>
>       </cc>
>       <cc>
>         <c>level ccccc head 3</c>
>         <dfg>cc 4 blah</dfg>
>         <e>level eeee head 5 </e>
>         <fhh>cc blah 6</fhh>
>       </cc>
>       <cc>
>         <c>level ccccc head 7</c>
>         <df>cc 8 blah<kkk>kkk within df within c</kkk>
>         </df>
>         <d>level dddd head 9</d>
>         <iuo>dd 10 blah</iuo>
>         <jtt>dd blah 11</jtt>
>       </cc>
>       <cc>
>         <c>level ccccc head 12</c>
>         <df>cc 13 blah</df>
>         <e>cc level eeeee head 14</e>
>         <fss>ee blah 15</fss>
>       </cc>
>     </bbb>
>     <bbb>
>       <cc>
>         <b>level bbbbb head 16</b>
>       </cc>
>       <cc>
>         <c>level ccccc head 17</c>
>         <df>cc 18 blah</df>
>         <e>cc level eeeee head 19</e>
>         <fhy>ee blah 20</fhy>
>       </cc>
>     </bbb>
>   </aaa>
> </document>
>
> ....
>
>
> finally at aaa see if there is a descendant a; if so that is the title for
> this group, otherwise no title
>
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet version="1.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>   <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>   <xsl:strip-space elements="*"/>
>   <xsl:template match="document">
>     <document>
>       <xsl:apply-templates/>
>     </document>
>   </xsl:template>
>   <xsl:template match="aaa">
>     <div-a>
>       <xsl:choose>
>         <xsl:when test="descendant::a">
>           <title>
>             <xsl:apply-templates select="descendant::a"/>
>           </title>
>           <xsl:apply-templates select="child::*"/>
>         </xsl:when>
>         <xsl:otherwise>
>           <xsl:apply-templates select="child::*"/>
>         </xsl:otherwise>
>       </xsl:choose>
>     </div-a>
>   </xsl:template>
>   <xsl:template match="bbb">
>     <xsl:choose>
>       <xsl:when test="descendant::b">
>         <div-b>
>           <title>
>             <xsl:apply-templates select="descendant::b"/>
>           </title>
>           <xsl:apply-templates select="child::*"/>
>         </div-b>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates select="child::*"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="ccc">
>     <xsl:choose>
>       <xsl:when test="descendant::c">
>         <div-c>
>           <title>
>             <xsl:apply-templates select="descendant::c"/>
>           </title>
>           <xsl:apply-templates
>               select="descendant::*[preceding-sibling::c]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-c>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*[not(c)]|descendant::*[preceding-sibling::c]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="ddd">
>     <xsl:choose>
>       <xsl:when test="descendant::d">
>         <div-d>
>           <title>
>             <xsl:apply-templates select="descendant::d"/>
>           </title>
>           <xsl:apply-templates
>               select="descendant::*[preceding-sibling::d]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-d>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*|descendant::*[preceding-sibling::d]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>   <xsl:template match="eee">
>     <xsl:choose>
>       <xsl:when test="descendant::e">
>         <div-e>
>           <title>
>             <xsl:apply-templates select="descendant::e"/>
>           </title>
>           <xsl:apply-templates
>               select="descendant::*[preceding-sibling::e]"/>
>           <xsl:apply-templates select="child::*"/>
>         </div-e>
>       </xsl:when>
>       <xsl:otherwise>
>         <xsl:apply-templates
>             select="child::*|descendant::*[preceding-sibling::e]"/>
>       </xsl:otherwise>
>     </xsl:choose>
>   </xsl:template>
>
>   <xsl:template match="a|b|c|d|e|f|g|h|i">
>     <xsl:apply-templates/>
>   </xsl:template>
>   <xsl:template match="@*|node()">
>     <xsl:copy>
>       <xsl:apply-templates select="@*|node()"/>
>     </xsl:copy>
>   </xsl:template>
> </xsl:stylesheet>
>
> and then we can get rid of divs that have no title, thus solving the
> missing div problem.
>
>
>
> Jim Albright
> 704 843-0582
> Wycliffe Bible Translators
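
The suggestion in the reply, that one template be reused at every level with only the grouping pattern changing, might be sketched for the level-less input as follows. This is a minimal, untested sketch, not the poster's or the list's confirmed solution. It assumes an XSLT 2.0 processor and that the heading elements are exactly a, b, c, d, e as in the sample input; the template name "nest" and the variable "heads" are hypothetical names introduced only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Assumption: heading element names, outermost level first. -->
  <xsl:variable name="heads" as="xs:string*" select="('a', 'b', 'c', 'd', 'e')"/>

  <xsl:template match="document">
    <document>
      <xsl:call-template name="nest">
        <xsl:with-param name="sequence" select="*"/>
        <xsl:with-param name="level" select="1"/>
      </xsl:call-template>
    </document>
  </xsl:template>

  <!-- Hypothetical recursive template: one call per heading level. -->
  <xsl:template name="nest">
    <xsl:param name="sequence" as="element()*"/>
    <xsl:param name="level" as="xs:integer"/>
    <xsl:choose>
      <!-- Past the deepest heading level: copy what is left unchanged. -->
      <xsl:when test="$level gt count($heads)">
        <xsl:copy-of select="$sequence"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:variable name="head" select="$heads[$level]"/>
        <xsl:for-each-group select="$sequence"
            group-starting-with="*[local-name() eq $head]">
          <xsl:choose>
            <!-- The group starts with a heading of this level: wrap it. -->
            <xsl:when test="local-name() eq $head">
              <xsl:element name="div-{$head}">
                <title><xsl:copy-of select="node()"/></title>
                <xsl:call-template name="nest">
                  <xsl:with-param name="sequence" select="current-group() except ."/>
                  <xsl:with-param name="level" select="$level + 1"/>
                </xsl:call-template>
              </xsl:element>
            </xsl:when>
            <!-- No heading of this level (an omitted div): do not wrap,
                 pass the whole group on to the next level. -->
            <xsl:otherwise>
              <xsl:call-template name="nest">
                <xsl:with-param name="sequence" select="current-group()"/>
                <xsl:with-param name="level" select="$level + 1"/>
              </xsl:call-template>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Because a group that does not start with the current level's heading is handed to the next level without a wrapper, omitted levels (for example a c followed directly by an e) produce no empty div, so the separate pass that discards untitled divs should not be needed under these assumptions.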