Subject: RE: [xsl] Transforming flat 'WordML' source to a hierarchical XML output.
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Wed, 12 Sep 2007 11:38:59 +0100

There's an example of XSLT 2.0 code for converting a hierarchy expressed as
a flat structure with level numbers into a real XML hierarchy at

http://www.idealliance.org/proceedings/xml04/papers/111/mhk-paper.html

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: David Medley [mailto:DAVEMEDLEY@xxxxxxxxxx]
> Sent: 11 September 2007 15:27
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Transforming flat 'WordML' source to a
> hierarchical XML output.
>
> Using following:
>
> Saxon XSLT processor, version 8.9
>
> XSLT 2.0
>
> I am trying to process XML source generated by Microsoft Word
> (WORDML).
>
> WordML has no concept of hierarchy, and so each paragraph in
> the source looks like below:
>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Normal"/>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Normal Paragraph</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="0"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Top Level List</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="0"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Top Level List</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Bulleted"/>
>     <w:listPr>
>       <w:ilvl w:val="1"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Nested List Level 1</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Bulleted"/>
>     <w:listPr>
>       <w:ilvl w:val="1"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Nested List Level 1</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="2"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Nested List Level 2</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="3"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Nested List Level 3</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="4"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Nested List Level 4</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="4"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr>
>       <w:i/>
>     </w:rPr>
>     <w:t>Nested List Level 4</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="5"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr>
>       <w:b/>
>     </w:rPr>
>     <w:t>Nested List Level 5</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Number"/>
>     <w:listPr>
>       <w:ilvl w:val="5"/>
>     </w:listPr>
>   </w:pPr>
>   <w:r>
>     <w:rPr>
>       <w:u w:val="single"/>
>     </w:rPr>
>     <w:t>Nested List Level 5</w:t>
>   </w:r>
> </w:p>
> <w:p>
>   <w:pPr>
>     <w:pStyle w:val="Normal"/>
>   </w:pPr>
>   <w:r>
>     <w:rPr/>
>     <w:t>Normal Paragraph</w:t>
>   </w:r>
> </w:p>
>
> This displays in word as follows:
>
> Normal Paragraph
> 1. Top Level List
> 2. Top Level List
>    * Nested List Level 1
>    * Nested List Level 1
>      1. Nested List Level 2
>         a. Nested List Level 3
>            i. Nested List Level 4
>            ii. Nested List Level 4
>                1. Nested List Level 5
>                2. Nested List Level 5
> Normal Paragraph
>
> I need the outcome to be as follows:
>
> <Paragraph>Normal Paragraph</Paragraph>
> <List type="numbered">
>   <Item>Top Level List</Item>
>   <Item>Top Level List
>     <List type="bulleted">
>       <Item>Nested List Level 1</Item>
>       <Item>Nested List Level 1
>         <List type="numbered">
>           <Item>Nested List Level 2
>             <List type="numbered">
>               <Item>Nested List Level 3
>                 <List type="numbered">
>                   <Item>Nested List Level 4</Item>
>                   <Item>Nested List Level 4
>                     <List type="numbered">
>                       <Item>Nested List Level 5</Item>
>                       <Item>Nested List Level 5</Item>
>                     </List>
>                   </Item>
>                 </List>
>               </Item>
>             </List>
>           </Item>
>         </List>
>       </Item>
>     </List>
>   </Item>
> </List>
> <Paragraph>Normal Paragraph</Paragraph>
>
> I think what is required is a grouping procedure, grouping
> the paragraphs depending on the value of x-path
> 'w:pPr/w:listPr/w:ilvl/@w:val' for each paragraph.
> My attempt to do this has been unsuccessful resulting in
> problems of not all paragraphs having the x-path
> 'w:pPr/w:listPr/w:ilvl/@w:val' and therefore the grouping falls over.
>
> I hope you can help me in this matter, thank you for reading.
>
> Thank you,
> David Medley
> IT Specialist
>
> Application Services, GBS
> IBM Office Internal: 299263 External: +44 (0) 1252 55 9263
> Mobile: +44 (0) 7790-778801
> E-mail: davemedley@xxxxxxxxxx
> Notes: David Medley/UK/IBM@IBMGB
>
> Unless stated otherwise above:
> IBM United Kingdom Limited - Registered in England and Wales
> with number 741598.
> Registered office: PO Box 41, North Harbour, Portsmouth,
> Hampshire PO6 3AU
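
As a language-neutral illustration of turning level numbers into a real hierarchy (this is not the XSLT 2.0 code from the paper linked in the reply), here is a minimal Python sketch. It assumes each paragraph has already been reduced to a (level, style, text) tuple, with level set to None when 'w:pPr/w:listPr/w:ilvl/@w:val' is absent, which is how it sidesteps the problem of paragraphs that lack that path. The PARAS data, the build_tree name, the Document wrapper element, and the style-to-type mapping are all hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in for the WordML paragraphs: (level, style, text).
# level is None when the paragraph has no w:ilvl (not a list item).
PARAS = [
    (None, "Normal", "Normal Paragraph"),
    (0, "Number", "Top Level List"),
    (0, "Number", "Top Level List"),
    (1, "Bulleted", "Nested List Level 1"),
    (1, "Bulleted", "Nested List Level 1"),
    (2, "Number", "Nested List Level 2"),
    (3, "Number", "Nested List Level 3"),
    (4, "Number", "Nested List Level 4"),
    (4, "Number", "Nested List Level 4"),
    (5, "Number", "Nested List Level 5"),
    (5, "Number", "Nested List Level 5"),
    (None, "Normal", "Normal Paragraph"),
]

# Assumed mapping from the w:pStyle value to the desired List/@type.
LIST_TYPE = {"Number": "numbered", "Bulleted": "bulleted"}


def build_tree(paras):
    """Nest flat (level, style, text) tuples into Paragraph/List/Item elements."""
    root = ET.Element("Document")  # hypothetical wrapper element
    open_lists = []  # open_lists[i] is the currently open <List> at level i
    for level, style, text in paras:
        if level is None:
            # A non-list paragraph ends every open list.
            open_lists.clear()
            ET.SubElement(root, "Paragraph").text = text
            continue
        # Close any lists deeper than this paragraph's level.
        del open_lists[level + 1:]
        # Open new lists until one exists at this level.
        while len(open_lists) <= level:
            if not open_lists:
                parent = root
            else:
                outer = open_lists[-1]
                # Nest under the last item of the enclosing list; if a level
                # was skipped, create an empty item to hold the nested list.
                parent = outer[-1] if len(outer) else ET.SubElement(outer, "Item")
            list_type = LIST_TYPE.get(style, "numbered")
            open_lists.append(ET.SubElement(parent, "List", type=list_type))
        ET.SubElement(open_lists[-1], "Item").text = text
    return root


if __name__ == "__main__":
    tree = build_tree(PARAS)
    ET.indent(tree)  # pretty-printing helper, available from Python 3.9
    print(ET.tostring(tree, encoding="unicode"))
```

The sketch keeps a stack of open lists, one per level, which plays roughly the same role that recursion plays in a grouping-based solution. One simplification to be aware of: a change of style at the same level (for example Number followed by Bulleted at level 0) is treated as a continuation of the same list rather than the start of a new one.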