Subject: [xsl] XSLT2 node comparison, wordlists
From: "James Cummings" <cummings.james@xxxxxxxxx>
Date: Wed, 24 Oct 2007 17:56:02 +0100
I'm sure this is easy to do in XSLT2, but I've not got my head around how to compare things properly and efficiently.

Let's say I have a wordlist, automatically generated from another file, which records each instance of how a word was used. In many cases these instances are identical in spelling, and what I want to do is merge them and store links between the original file and the wordlist as stand-off markup. Say the file has an entry for each word, like this:

=====
<entry xml:id="let22-w27">
  <form>
    <orth type="hw">the</orth>
    <form type="orthVar">
      <orth xml:id="w72">The</orth>
      <orth xml:id="w3955">The</orth>
      <orth xml:id="w4513">The</orth>
      <orth xml:id="w4578">The</orth>
      <orth xml:id="w4650">The</orth>
      <orth xml:id="w4672">The</orth>
      <orth xml:id="w4703">The</orth>
      <orth xml:id="w4824">The</orth>
      <orth xml:id="w4830">The</orth>
      <orth xml:id="w2045">the</orth>
      <orth xml:id="w2079">the</orth>
      <orth xml:id="w2101">the</orth>
      <orth xml:id="w2112">the</orth>
      <orth xml:id="w2333">the</orth>
      <orth xml:id="w2400">the</orth>
      <orth xml:id="w2442">the</orth>
      <orth xml:id="w1402">T<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w2422">T<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w6458">T<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w7822">T<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w2097">t<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w2155">t<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w2482">t<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w5887">t<ex>h</ex><hi rend="sup">e</hi></orth>
      <orth xml:id="w5642">T<ex>h</ex>e</orth>
      <orth xml:id="w5378">t<ex>h</ex>e</orth>
    </form>
  </form>
</entry>
=====

What I want to end up with is, for each form[@type='orthVar'], only the distinct values of the orth elements therein, with new @xml:id values, and the old values preserved at the bottom of the file, linking the new values to the current ones (which are copies from a different file).
So something like:

=====
<div>
  <entry xml:id="let22-w27">
    <form>
      <orth type="hw">the</orth>
      <form type="orthVar" n="6"> <!-- n= num of diff variants-->
        <orth xml:id="let22-w27-vA">The</orth>
        <orth xml:id="let22-w27-vB">the</orth>
        <orth xml:id="let22-w27-vC">T<ex>h</ex><hi rend="sup">e</hi></orth>
        <orth xml:id="let22-w27-vD">t<ex>h</ex><hi rend="sup">e</hi></orth>
        <orth xml:id="let22-w27-vE">T<ex>h</ex>e</orth>
        <orth xml:id="let22-w27-vF">t<ex>h</ex>e</orth>
      </form>
    </form>
  </entry>
  <!-- more entries -->
  <!-- at bottom of file -->
  <div type="links">
    <linkGrp xml:id="let22-w27-lg">
      <!-- links between the orth form above with its instance in file.xml -->
      <link targets="#let22-w27-vA file.xml#w72 file.xml#w3955 file.xml#w4513 file.xml#w4578 file.xml#w4650 file.xml#w4672 file.xml#w4703 file.xml#w4824 file.xml#w4830"/>
      <link targets="#let22-w27-vB file.xml#w2045 file.xml#w2079 file.xml#w2101 file.xml#w2112 file.xml#w2333 file.xml#w2400 file.xml#w2442"/>
      <link targets="#let22-w27-vC file.xml#w1402 file.xml#w2422 file.xml#w6458 file.xml#w7822"/>
      <link targets="#let22-w27-vD file.xml#w2097 file.xml#w2155 file.xml#w2482 file.xml#w5887"/>
      <link targets="#let22-w27-vE file.xml#w5642"/>
      <link targets="#let22-w27-vF file.xml#w5378"/>
    </linkGrp>
    <!-- more linkGrps -->
  </div>
</div>
=====

XSLT2 is certainly usable in this case, but all of my attempts have either been hideously inefficient or failed to compare the nested children accurately.

Suggestions?

Thanks,
-James

--
James Cummings, Cummings dot James at GMail dot com
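
For reference, the merge described above (group the orth elements by their full nested content, then mint new ids and link them back to the old ones) can be sketched outside XSLT. The following is a minimal Python illustration of the logic only, not a stylesheet; in XSLT 2.0 the analogous grouping construct would be xsl:for-each-group with a key computed from each orth's content. The names content_key and merge_variants, the source parameter, and the A, B, C... lettering scheme (limited here to 26 variants per entry) are assumptions made for this example.

```python
import string
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def content_key(orth):
    """Hypothetical helper: serialize an <orth>'s text and child markup
    into a string, ignoring the xml:id on the <orth> itself, so that
    'T<ex>h</ex>e' and 'The' produce different keys."""
    parts = [orth.text or ""]
    for child in orth:
        # tostring() includes the child's tail text, so mixed content
        # such as the trailing 'e' in T<ex>h</ex>e is preserved.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def merge_variants(entry, source="file.xml"):
    """Hypothetical helper: return one (new_id, key, targets) tuple per
    distinct variant, in first-seen order. 'source' is the assumed name
    of the file the original ids point into."""
    entry_id = entry.get(XML_ID)
    groups = {}  # content key -> original xml:ids, insertion-ordered
    for orth in entry.findall(".//form[@type='orthVar']/orth"):
        groups.setdefault(content_key(orth), []).append(orth.get(XML_ID))

    rows = []
    # zip() stops after 26 groups; a real implementation would need a
    # longer id scheme for entries with more variants.
    for letter, (key, old_ids) in zip(string.ascii_uppercase, groups.items()):
        new_id = f"{entry_id}-v{letter}"
        targets = " ".join([f"#{new_id}"] + [f"{source}#{i}" for i in old_ids])
        rows.append((new_id, key, targets))
    return rows
```

Each orth is visited once and grouped through a dictionary lookup, which avoids comparing every orth against every other one. The count of returned tuples corresponds to the n attribute in the desired output, and each targets string corresponds to one link element.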