Subject: Re: [xsl] Transforming flat 'WordML' source to a hierarchical XML output.
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 11 Sep 2007 11:40:25 -0400
Cheers, Wendell Piez
I am using the following:
Saxon XSLT processor, version 8.9
XSLT 2.0
I am trying to process XML source generated by Microsoft Word (WordML).
WordML has no concept of hierarchy, so the paragraphs in the source form a flat sequence like the one below:
<w:p>
  <w:pPr>
    <w:pStyle w:val="Normal"/>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Normal Paragraph</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="0"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Top Level List</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="0"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Top Level List</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Bulleted"/>
    <w:listPr>
      <w:ilvl w:val="1"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Nested List Level 1</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Bulleted"/>
    <w:listPr>
      <w:ilvl w:val="1"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Nested List Level 1</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="2"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Nested List Level 2</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="3"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Nested List Level 3</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="4"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Nested List Level 4</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="4"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr>
      <w:i/>
    </w:rPr>
    <w:t>Nested List Level 4</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="5"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr>
      <w:b/>
    </w:rPr>
    <w:t>Nested List Level 5</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Number"/>
    <w:listPr>
      <w:ilvl w:val="5"/>
    </w:listPr>
  </w:pPr>
  <w:r>
    <w:rPr>
      <w:u w:val="single"/>
    </w:rPr>
    <w:t>Nested List Level 5</w:t>
  </w:r>
</w:p>
<w:p>
  <w:pPr>
    <w:pStyle w:val="Normal"/>
  </w:pPr>
  <w:r>
    <w:rPr/>
    <w:t>Normal Paragraph</w:t>
  </w:r>
</w:p>
This displays in Word as follows:

Normal Paragraph
1. Top Level List
2. Top Level List
   * Nested List Level 1
   * Nested List Level 1
     1. Nested List Level 2
        a. Nested List Level 3
           i. Nested List Level 4
           ii. Nested List Level 4
               1. Nested List Level 5
               2. Nested List Level 5
Normal Paragraph
I need the outcome to be as follows:
<Paragraph>Normal Paragraph</Paragraph>
<List type="numbered">
  <Item>Top Level List</Item>
  <Item>Top Level List
    <List type="bulleted">
      <Item>Nested List Level 1</Item>
      <Item>Nested List Level 1
        <List type="numbered">
          <Item>Nested List Level 2
            <List type="numbered">
              <Item>Nested List Level 3
                <List type="numbered">
                  <Item>Nested List Level 4</Item>
                  <Item>Nested List Level 4
                    <List type="numbered">
                      <Item>Nested List Level 5</Item>
                      <Item>Nested List Level 5</Item>
                    </List>
                  </Item>
                </List>
              </Item>
            </List>
          </Item>
        </List>
      </Item>
    </List>
  </Item>
</List>
<Paragraph>Normal Paragraph</Paragraph>
I think what is required is a grouping procedure that groups the paragraphs by the value of the XPath 'w:pPr/w:listPr/w:ilvl/@w:val' for each paragraph. My attempts at this have been unsuccessful: not all paragraphs have a node at 'w:pPr/w:listPr/w:ilvl/@w:val', so the grouping falls over.
I hope you can help me in this matter, thank you for reading.
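One way the grouping described above might be sketched in XSLT 2.0 is shown here. It is an illustration under stated assumptions, not a tested solution for real Word output: the `w` prefix is assumed to be bound to the WordML 2003 namespace, the paragraphs are assumed to be sibling children of a single wrapper element, the list type is guessed from the `w:pStyle` value of the first paragraph at each level, and list levels are assumed never to skip (0, 1, 2, ...). The named template `make-list` and its parameters are made up for this sketch. Paragraphs without `w:ilvl` are first separated from list paragraphs with `group-adjacent`, which sidesteps the missing-path problem; each run of list paragraphs is then nested recursively with `group-starting-with`.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="xs w">

  <xsl:output method="xml"/>

  <!-- Assumption: any element with w:p children is a paragraph container. -->
  <xsl:template match="*[w:p]">
    <!-- Split the flat sequence into runs of list and non-list paragraphs.
         exists() always yields a boolean, so paragraphs that lack w:ilvl
         do not break the grouping key. -->
    <xsl:for-each-group select="w:p"
        group-adjacent="exists(w:pPr/w:listPr/w:ilvl)">
      <xsl:choose>
        <xsl:when test="current-grouping-key()">
          <xsl:call-template name="make-list">
            <xsl:with-param name="paras" select="current-group()"/>
            <xsl:with-param name="level" select="0"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:for-each select="current-group()">
            <Paragraph>
              <xsl:value-of select="string-join(w:r/w:t, '')"/>
            </Paragraph>
          </xsl:for-each>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

  <!-- Hypothetical helper: builds one List for $paras, in which every
       paragraph at $level starts a new Item and any deeper paragraphs
       that follow it become that Item's nested List. -->
  <xsl:template name="make-list">
    <xsl:param name="paras" as="element(w:p)*"/>
    <xsl:param name="level" as="xs:integer"/>
    <!-- Assumption: the style of the first paragraph decides the type. -->
    <List type="{if ($paras[1]/w:pPr/w:pStyle/@w:val = 'Bulleted')
                 then 'bulleted' else 'numbered'}">
      <xsl:for-each-group select="$paras"
          group-starting-with="w:p[xs:integer(w:pPr/w:listPr/w:ilvl/@w:val)
                                   = $level]">
        <Item>
          <!-- The context item is the first paragraph of the group. -->
          <xsl:value-of select="string-join(w:r/w:t, '')"/>
          <xsl:if test="current-group()[2]">
            <xsl:call-template name="make-list">
              <xsl:with-param name="paras"
                  select="current-group()[position() gt 1]"/>
              <xsl:with-param name="level" select="$level + 1"/>
            </xsl:call-template>
          </xsl:if>
        </Item>
      </xsl:for-each-group>
    </List>
  </xsl:template>

  <!-- Suppress stray text outside the paragraphs handled above. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

The two-stage design keeps the concerns apart: the outer grouping only decides whether a paragraph belongs to a list at all, and the recursive template only ever sees paragraphs that do carry a level. Run formatting such as `w:i`, `w:b` and `w:u` is ignored here, and a document whose levels jump (say from 0 straight to 2) would need extra handling.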
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================