Subject: [xsl] Collating riffled lists
From: "Mat Myszewski" <mmyszew@xxxxxxxxxxx>
Date: Mon, 29 Sep 2003 13:42:56 -0400

This is my first XSLT project. I have a recursive solution to a problem
which I hope one of you can improve on.

This is an abstraction of a problem that arose in the context of scraping
PDF docs:

    PDF->Adobe->HTML->tidy->XML->scrape with XSLT->...

The PDF->HTML conversion (or, for that matter, lassoing the text in Acrobat
Reader and cutting and pasting it) yields a different order than what
Acrobat Reader displays on screen. It's not so badly mangled that it can't
be recovered. However, related items are no longer near one another, and I
need to recover the original relationship between them.

I'm hoping someone can come up with a better solution than the one I
present below, which I believe is O(n squared), where n is large (the
original document is 170+ pages).

I've considered outputting the a's and b's into two result files with two
XSLT programs and processing those. I think XSLT 1.1 would allow this to be
done within a single XSLT program by building two node sets, but I'd like
to stick to 1.0, if possible.

XML source:

    <?xml version="1.0"?>
    <list>
      <a>a1</a>
      <a>a2</a>
      <a>a3</a>
      <b>b1</b>
      <b>b2</b>
      <a>a4</a>
      <b>b3</b>
      <a>a5</a>
      <b>b4</b>
      <a>a6</a>
      <b>b5</b>
      <b>b6</b>
    </list>

The XSLT:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                    version="1.0">

      <!-- match first a and make top level call to recursive template -->
      <xsl:template match="a[1]">
        <xsl:call-template name="do_a">
          <xsl:with-param name="ix" select="1" />
        </xsl:call-template>
      </xsl:template>

      <!-- recursive template counts a's -->
      <xsl:template name="do_a">
        <xsl:param name="ix" />
        <!-- output this a -->
        <xsl:text>&#10;</xsl:text>
        <xsl:value-of select="$ix" /><xsl:text>: </xsl:text><xsl:value-of select="."/>
        <!-- output corresponding b -->
        <xsl:text> </xsl:text><xsl:copy-of select="/list/b[$ix]/text()" />
        <!-- This for-each moves to the next a; doesn't loop. -->
        <xsl:for-each select="following-sibling::a[1]">
          <!-- increment counter and output rest of a's -->
          <xsl:call-template name="do_a">
            <xsl:with-param name="ix" select="$ix+1" />
          </xsl:call-template>
        </xsl:for-each>
      </xsl:template>

      <!-- suppress other output -->
      <xsl:template match="text()" />

    </xsl:stylesheet>

And this is the output (Saxon 6.5.1 with XFactor GUI):

    <?xml version="1.0" encoding="utf-8"?>
    1: a1 b1
    2: a2 b2
    3: a3 b3
    4: a4 b4
    5: a5 b5
    6: a6 b6

Thanks,
Mat M.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
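
For comparison, the same pairing can be sketched without recursion. This is only a minimal XSLT 1.0 sketch, not a tested replacement: it assumes that all a's and b's are direct children of a single list element, as in the sample source above, and the variable names bs and i are made up for illustration. Whether it actually runs faster than the recursive version depends on how the processor evaluates the positional lookup $bs[$i].

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:template match="list">
    <!-- hypothetical name: collect all b's once, in document order -->
    <xsl:variable name="bs" select="b" />
    <!-- visit the a's in document order; position() numbers them 1, 2, 3, ... -->
    <xsl:for-each select="a">
      <!-- hypothetical name: remember this a's position before the context changes -->
      <xsl:variable name="i" select="position()" />
      <xsl:text>&#10;</xsl:text>
      <xsl:value-of select="$i" /><xsl:text>: </xsl:text><xsl:value-of select="." />
      <!-- $i is a number, so $bs[$i] selects the b at the same position -->
      <xsl:text> </xsl:text><xsl:value-of select="$bs[$i]" />
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Run against the sample source, this should pair a1 with b1, a2 with b2, and so on. Because the template matches list and does not apply templates to its children, no separate rule is needed to suppress the remaining text nodes.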