Subject: Re: [xsl] WordML to XML
From: Vasu Nanjangud <vasdeep@xxxxxxxxx>
Date: Fri, 11 Feb 2005 18:14:57 -0800 (PST)
Joris, et al.,

My requirement is specifically to convert WordML to XML, i.e. strip off the WordML-specific tags but retain the formatting instructions.

For example, for a Word document whose content is "I have bold and italics and underscore", this is the source WordML document:

------------------------------------------------------
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:sl="http://schemas.microsoft.com/schemaLibrary/2003/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:dt="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882" w:macrosPresent="no" w:embeddedObjPresent="no" w:ocxPresent="no" xml:space="preserve"><o:DocumentProperties><o:Title>I have bold and italics and underscore</o:Title><o:Author>vnanjang</o:Author><o:LastAuthor>vnanjang</o:LastAuthor><o:Revision>1</o:Revision><o:TotalTime>0</o:TotalTime><o:Created>2005-02-12T01:52:00Z</o:Created><o:LastSaved>2005-02-12T01:52:00Z</o:LastSaved><o:Pages>1</o:Pages><o:Words>5</o:Words><o:Characters>34</o:Characters><o:Company>Oracle Corporation</o:Company><o:Lines>1</o:Lines><o:Paragraphs>1</o:Paragraphs><o:CharactersWithSpaces>38</o:CharactersWithSpaces><o:Version>11.5604</o:Version></o:DocumentProperties><w:fonts><w:defaultFonts w:ascii="Times New Roman" w:fareast="SimSun" w:h-ansi="Times New Roman" w:cs="Times New Roman"/><w:font w:name="SimSun"><w:altName w:val="e.d="/><w:panose-1 w:val="02010600030101010101"/><w:charset w:val="86"/><w:family w:val="Auto"/><w:pitch w:val="variable"/><w:sig w:usb-0="00000003" w:usb-1="080E0000" w:usb-2="00000010" w:usb-3="00000000" w:csb-0="00040001" w:csb-1="00000000"/></w:font><w:font w:name="@SimSun"><w:panose-1 w:val="02010600030101010101"/><w:charset w:val="86"/><w:family w:val="Auto"/><w:pitch w:val="variable"/><w:sig w:usb-0="00000003" w:usb-1="080E0000" w:usb-2="00000010" w:usb-3="00000000" w:csb-0="00040001" w:csb-1="00000000"/></w:font></w:fonts><w:styles><w:versionOfBuiltInStylenames w:val="4"/><w:latentStyles w:defLockedState="off" w:latentStyleCount="156"/><w:style w:type="paragraph" w:default="on" w:styleId="Normal"><w:name w:val="Normal"/><w:rPr><wx:font wx:val="Times New Roman"/><w:sz w:val="24"/><w:sz-cs w:val="24"/><w:lang w:val="EN-US" w:fareast="ZH-CN" w:bidi="AR-SA"/></w:rPr></w:style><w:style w:type="character" w:default="on" w:styleId="DefaultParagraphFont"><w:name w:val="Default Paragraph Font"/><w:semiHidden/></w:style><w:style w:type="table" w:default="on" w:styleId="TableNormal"><w:name w:val="Normal Table"/><wx:uiName wx:val="Table Normal"/><w:semiHidden/><w:rPr><wx:font wx:val="Times New Roman"/></w:rPr><w:tblPr><w:tblInd w:w="0" w:type="dxa"/><w:tblCellMar><w:top w:w="0" w:type="dxa"/><w:left w:w="108" w:type="dxa"/><w:bottom w:w="0" w:type="dxa"/><w:right w:w="108" w:type="dxa"/></w:tblCellMar></w:tblPr></w:style><w:style w:type="list" w:default="on" w:styleId="NoList"><w:name w:val="No List"/><w:semiHidden/></w:style></w:styles><w:docPr><w:view w:val="print"/><w:zoom w:percent="100"/><w:doNotEmbedSystemFonts/><w:proofState w:spelling="clean" w:grammar="clean"/><w:attachedTemplate w:val=""/><w:defaultTabStop w:val="720"/><w:characterSpacingControl w:val="DontCompress"/><w:optimizeForBrowser/><w:validateAgainstSchema/><w:saveInvalidXML w:val="off"/><w:ignoreMixedContent w:val="off"/><w:alwaysShowPlaceholderText w:val="off"/><w:compat><w:dontAllowFieldEndSelect/><w:applyBreakingRules/><w:useWord2002TableStyleRules/><w:useFELayout/></w:compat></w:docPr><w:body><wx:sect><w:p><w:pPr><w:rPr><w:b/><w:b-cs/><w:i/><w:i-cs/><w:u w:val="single"/></w:rPr></w:pPr><w:r><w:rPr><w:b/><w:b-cs/><w:i/><w:i-cs/><w:color w:val="000000"/><w:u w:val="single"/></w:rPr><w:t>I have bold and italics and underscore</w:t></w:r></w:p><w:sectPr><w:pgSz w:w="12240" w:h="15840"/><w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800" w:header="720" w:footer="720" w:gutter="0"/><w:cols w:space="720"/><w:docGrid w:line-pitch="360"/></w:sectPr></wx:sect></w:body></w:wordDocument>
------------------------------------------------------

I need to write an XSLT that will give me the following output:

------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<b>
  <i>
    <u>I have bold and italics and underscore</u>
  </i>
</b>
------------------------------------------------------

Though this looks like HTML, HTML output is not what I'm interested in. What I have shown here is a simplification of my requirement. In reality, my WordML document will contain some of my own custom tags, and data like the above will sit inside those custom tags. For example, the XML output could be:

------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<vasuarticletag>
  <b>
    <i>
      <u>I have bold and italics and underscore</u>
    </i>
  </b>
</vasuarticletag>
------------------------------------------------------

So, I need help writing an XSLT that will:

1. Traverse every "w:r" block.
2. Look for "w:rPr" tags with "w:i", "w:b", "w:u" children.
3. If they exist, output the <i>, <b>, <u> tags, then output the contents of the corresponding "w:t" block, and then close the <i>, <b>, <u> tags.

Requesting your help...

Regards,
Vasu
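For what it's worth, the three steps above could be sketched roughly as follows in XSLT 1.0. This is only a starting point, not a tested solution: the mode names (bold, italic, underline) are made up for illustration, and the checks for w:val="off" and w:val="none" assume that WordML 2003 uses those values to switch a property off. It also assumes a single run, as in the sample; a document with several runs would need a wrapping root element (such as the custom tag shown earlier) for the result to be well-formed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    exclude-result-prefixes="w">

  <xsl:output method="xml" encoding="UTF-8" indent="no"/>

  <!-- Step 1: visit only the runs inside the body, so that metadata
       such as o:Title never reaches the output. -->
  <xsl:template match="/">
    <xsl:apply-templates select="w:wordDocument/w:body//w:r"/>
  </xsl:template>

  <!-- Each run enters a chain of checks: bold, then italic, then underline.
       The mode names are hypothetical. -->
  <xsl:template match="w:r">
    <xsl:apply-templates select="." mode="bold"/>
  </xsl:template>

  <!-- Steps 2 and 3 for w:b (assumption: w:val="off" disables it). -->
  <xsl:template match="w:r" mode="bold">
    <xsl:choose>
      <xsl:when test="w:rPr/w:b[not(@w:val = 'off')]">
        <b><xsl:apply-templates select="." mode="italic"/></b>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="italic"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Steps 2 and 3 for w:i (same assumption about w:val="off"). -->
  <xsl:template match="w:r" mode="italic">
    <xsl:choose>
      <xsl:when test="w:rPr/w:i[not(@w:val = 'off')]">
        <i><xsl:apply-templates select="." mode="underline"/></i>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="." mode="underline"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Steps 2 and 3 for w:u (assumption: w:val="none" means no underline).
       The innermost step emits the text of the run's w:t children via the
       built-in text template. -->
  <xsl:template match="w:r" mode="underline">
    <xsl:choose>
      <xsl:when test="w:rPr/w:u[not(@w:val = 'none')]">
        <u><xsl:apply-templates select="w:t"/></u>
      </xsl:when>
      <xsl:otherwise>
        <xsl:apply-templates select="w:t"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Chaining the checks through modes is one way to get properly nested <b>, <i>, <u> elements without writing a separate branch for every combination of the three properties. With indent="no" the result comes out on one line rather than indented as in the desired output above.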