Subject: RE: Converting non-pure trees to pure trees
From: Kay Michael <Michael.Kay@xxxxxxx>
Date: Tue, 21 Nov 2000 10:00:37 -0000

> I have a XML file which I have automatically converted from
> msword, the basic structure is:
>
> <worddocument>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <pagebreak/>
>   <p>2/1</p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <pagebreak/>
>   <p>2/2</p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
>   <p>paragraph <b>hello</b> <i>world</i></p>
> </worddocument>

This is a grouping problem, of the kind I call "grouping by position".
Grouping problems in XSLT are not easy: for background, see
www.jenitennison.com.

All grouping problems require two nested loops. The outer loop selects a
representative element for each group, which in this case seems to be a <p>
element that is immediately preceded by a <pagebreak> element:

  <xsl:for-each select="p[preceding-sibling::*[1][self::pagebreak]]">
    <mongraph id="{.}">
      ...
    </mongraph>
  </xsl:for-each>

Inside this you need an inner loop that processes all the elements within
one group. In this case these are all the <p> elements that follow the
"representative" element, up to the next "representative" element. Or to put
it another way, all following <p> elements whose first preceding <pagebreak>
is the same as the first preceding <pagebreak> of the current element. So
the inner loop can be:

  <xsl:for-each select="following-sibling::p[
      generate-id(preceding-sibling::pagebreak[1]) =
      generate-id(current()/preceding-sibling::pagebreak[1])]">
    <xsl:copy-of select="."/>
  </xsl:for-each>

In Saxon there is a simpler solution using the saxon:leading() extension
function.

Mike Kay

>
> I wish to transform this tree using some knowledge I have
> about the document:
> The first page is always the "introduction", whilst all
> subsequent pages are "monographs"
>
> <semanticdocument>
>   <introduction>
>     <p>paragraph <b>hello</b> <i>world</i></p>
>     <p>paragraph <b>hello</b> <i>world</i></p>
>     <p>paragraph <b>hello</b> <i>world</i></p>
>   </introduction>
>   <mongraphs>
>     <mongraph id="2/1">
>       <p>paragraph <b>hello</b> <i>world</i></p>
>       <p>paragraph <b>hello</b> <i>world</i></p>
>       <p>paragraph <b>hello</b> <i>world</i></p>
>     </mongraph>
>     <mongraph id="2/2">
>       <p>paragraph <b>hello</b> <i>world</i></p>
>       <p>paragraph <b>hello</b> <i>world</i></p>
>       <p>paragraph <b>hello</b> <i>world</i></p>
>     </mongraph>
>   </mongraphs>
> </semanticdocument>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
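
For readers who want to see the two nested loops from the reply in one place, here is a minimal sketch of how they might be assembled into a complete XSLT 1.0 stylesheet. It assumes the input is well-formed, with <worddocument> as the root element and <pagebreak/> as the separator, as in the quoted sample. The <introduction> handling (copying every <p> that has no preceding <pagebreak>) is an assumption added for illustration, as the reply only covers the grouped pages. The element names <semanticdocument>, <introduction>, <mongraphs> and <mongraph> are taken from the requested output.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/worddocument">
    <semanticdocument>

      <!-- Assumption (not in the reply): the introduction is every <p>
           that appears before the first <pagebreak>. -->
      <introduction>
        <xsl:copy-of select="p[not(preceding-sibling::pagebreak)]"/>
      </introduction>

      <mongraphs>
        <!-- Outer loop: one representative <p> per group, namely the <p>
             whose nearest preceding element sibling is a <pagebreak>. -->
        <xsl:for-each select="p[preceding-sibling::*[1][self::pagebreak]]">
          <!-- The representative's text ("2/1", "2/2") becomes the id. -->
          <mongraph id="{.}">
            <!-- Inner loop: following <p> siblings whose nearest preceding
                 <pagebreak> is the same node as the representative's. -->
            <xsl:for-each select="following-sibling::p[
                generate-id(preceding-sibling::pagebreak[1]) =
                generate-id(current()/preceding-sibling::pagebreak[1])]">
              <xsl:copy-of select="."/>
            </xsl:for-each>
          </mongraph>
        </xsl:for-each>
      </mongraphs>

    </semanticdocument>
  </xsl:template>

</xsl:stylesheet>
```

The representative <p> itself is not copied into its group, since the inner loop starts on the following-sibling axis; its text is used only for the id attribute, which matches the requested output. Comparing generate-id() values is the usual XSLT 1.0 way to test whether two expressions select the same node.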