Subject: Re: [xsl] Processing mixed content. [Was: Parsing complex line (mixed text and markup)]
From: "Manfred Staudinger" <manfred.staudinger@xxxxxxxxx>
Date: Sun, 17 Feb 2008 14:38:09 +0100

On 16/02/2008, Ilya Lifshits <chehlo@xxxxxxxxx> wrote:
> I wonder if Michael's first suggestion has disadvantages in your opinion and
> you are trying to improve on it, or is this just another possible solution?

I would think this solution is more general, but I had hoped to get
Michael to comment on that. Certainly it's easy to implement in XSLT 1.0.
Anyway, here is a _corrected_ version of the above, tested with Saxon 9.0:

<!-- The xs prefix must be bound to http://www.w3.org/2001/XMLSchema -->
<xsl:template match="tbentry">
  <xsl:copy>
    <xsl:apply-templates select="@*"/>
    <xsl:variable name="curr" select="."/>
    <!-- Flatten the content: each child element becomes an '@xyxyN@xy' marker -->
    <xsl:variable name="temp">
      <xsl:apply-templates select="node()" mode="text"/>
    </xsl:variable>
    <xsl:for-each select="tokenize($temp, ',')">
      <entry>
        <!-- Split this comma-separated piece at the markers -->
        <xsl:for-each select="tokenize(., '@xy')">
          <xsl:choose>
            <xsl:when test="starts-with(., 'xy')">
              <!-- current() is the token 'xyN'; N is the child position in $curr -->
              <xsl:apply-templates
                select="$curr/node()[xs:integer(substring(current(), 3))]"/>
            </xsl:when>
            <xsl:otherwise>
              <xsl:value-of select="."/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each>
      </entry>
    </xsl:for-each>
  </xsl:copy>
</xsl:template>

<xsl:template match="*" mode="text">
  <xsl:value-of select="concat('@xyxy', position(), '@xy')"/>
</xsl:template>

Manfred

On 16/02/2008, Ilya Lifshits <chehlo@xxxxxxxxx> wrote:
> While I'm not in a position to comment on whether this solution is valid,
> since I'm a complete newbie, I wonder if Michael's first suggestion
> has disadvantages in your opinion and you are trying to improve on it, or
> is this just another possible solution?
> From my newbie point of view, Michael's suggestion is more
> straightforward and clear.
>
> Ilya.
>
>
> On Feb 15, 2008 10:43 PM, Manfred Staudinger
> <manfred.staudinger@xxxxxxxxx> wrote:
> > Hi All,
> >
> > I would like to propose a third variant and to get your comments about it.
> >
> > On 15/02/2008, Michael Kay <mike@xxxxxxxxxxxx> wrote:
> > > On 14/02/2008, Ilya Lifshits <chehlo@xxxxxxxxx> wrote:
> > > > I'm using XSLT 2.0 processors, both Saxon and Altova.
> > > >
> > > > I'm trying to parse a complex line like:
> > > > <tbentry>Some text, Some more text <xref linkend="somelink"/>
> > > > even more text , , ,</tbentry>
> > > >
> > > > and get the following output:
> > > >
> > > > <row>
> > > >   <entry>Some text</entry>
> > > >   <entry>Some more text <xref
> > > >   linkend="ut_man_related_docs"/> and even more text </entry>
> > > > </row>
> > > >
> > > > The number of entries is not constant.
> > > >
> > > > I easily found a solution for the case without mixed
> > > > text and markup by using the tokenize function,
> > > > but failed to separate text and markup using this approach.
> > > > An example can be found here: http://pastebin.com/m40fd204f
> > > >
> > > > To formalize the goal: I want to simplify the life of our tech
> > > > writers by creating wrappers on top of DocBook that will
> > > > help transform from my defined syntax to standard DocBook code.
> > > > So if there is another, more appropriate way (which is not a WYSIWYG
> > > > editor) to achieve this, I can completely change the source line:
> > > > <tblrow>Some text, Some more text <xref linkend="somelink"/>
> > > > even more text </tblrow> as long as it's still easy to write.
> > >
> > > This problem has come up in the past and it's not particularly easy. There
> > > seem to be two main approaches:
> > >
> > > (a) convert the string delimiters into element markup, and then use grouping
> > > facilities (xsl:for-each-group) to analyze the overall structure
> > >
> > > (b) convert the markup into string delimiters, and then use
> > > xsl:analyze-string.
> > >
> > > Both work, but I think (a) is probably a bit easier.
> > >
> > > Do all the delimiters (commas) occur in top-level text nodes, or can they
> > > occur nested within elements? I'll assume the former.
> > >
> > > Start by making a copy of the data in which the commas are replaced by
> > > <comma/> elements:
> > >
> > > <xsl:template match="tbentry">
> > >   <xsl:variable name="temp">
> > >     <xsl:apply-templates mode="replace-commas"/>
> > >   </xsl:variable>
> > >   <xsl:for-each-group select="$temp/child::node()"
> > >       group-starting-with="comma">
> > >     <entry><xsl:copy-of select="current-group()[not(self::comma)]"/></entry>
> > >   </xsl:for-each-group>
> > > </xsl:template>
> > >
> > > <xsl:template match="*" mode="replace-commas">
> > >   <xsl:copy-of select="."/>
> > > </xsl:template>
> > >
> > > <xsl:template match="text()" mode="replace-commas">
> > >   <xsl:analyze-string select="." regex=",">
> > >     <xsl:matching-substring><comma/></xsl:matching-substring>
> > >     <xsl:non-matching-substring><xsl:value-of
> > >       select="."/></xsl:non-matching-substring>
> > >   </xsl:analyze-string>
> > > </xsl:template>
> > >
> >
> > (c) convert the elements into strings which contain the position()
> > of the element. After processing the string, reinsert those elements.
> >
> > Let's assume the document does not contain 'xy'.
> > Then
> > <xsl:template match="tbentry">
> >   <xsl:variable name="temp">
> >     <xsl:apply-templates mode="text"/>
> >   </xsl:variable>
> >   <xsl:for-each select="tokenize($temp, ',')">
> >     <entry>
> >       <xsl:for-each select="tokenize(., '@xy')">
> >         <xsl:choose>
> >           <xsl:when test="starts-with(., 'xy')">
> > <!-- A -->  <xsl:apply-templates
> >               select="/node()[xs:integer(substring(., 3))]"/>
> >           </xsl:when>
> >           <xsl:otherwise>
> >             <xsl:value-of select="."/>
> >           </xsl:otherwise>
> >         </xsl:choose>
> >       </xsl:for-each>
> >     </entry>
> >   </xsl:for-each>
> > </xsl:template>
> >
> > <xsl:template match="*" mode="text">
> >   <xsl:value-of select="concat('@xyxy', position(), '@xy')"/>
> > </xsl:template>
> > <xsl:template match="text()" mode="text">
> >   <xsl:value-of select="."/>
> > </xsl:template>
> >
> > Not tested, and I'm uncertain about (A), but a very similar solution
> > works fine in XSLT 1.0, where the processing of the string is done by
> > recursive templates.
> >
> > Thanks in advance,
> >
> > Manfred
> > http://documenta.rudolphina.org/Indices/Index.html
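
The thread mentions that variant (c) can be done in XSLT 1.0 with recursive
templates but does not show it. The following is only a rough sketch of how
that might look, not the stylesheet referred to above. The template names
"split-entries" and "restore", the <row> wrapper, the use of xsl:copy-of to
reinsert elements, and the skipping of whitespace-only entries are all
assumptions made for illustration. As before, it assumes the text never
contains the marker string '@xy'.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="tbentry">
    <!-- Assumption: wrap the entries in <row>, as in the requested output -->
    <row>
      <!-- Flatten the content: each child element becomes '@xyxyN@xy' -->
      <xsl:variable name="temp">
        <xsl:apply-templates select="node()" mode="text"/>
      </xsl:variable>
      <xsl:call-template name="split-entries">
        <xsl:with-param name="str" select="string($temp)"/>
        <xsl:with-param name="curr" select="."/>
      </xsl:call-template>
    </row>
  </xsl:template>

  <!-- N is the position among all child nodes of tbentry -->
  <xsl:template match="*" mode="text">
    <xsl:value-of select="concat('@xyxy', position(), '@xy')"/>
  </xsl:template>

  <!-- Hypothetical helper: emit one <entry> per comma-separated piece -->
  <xsl:template name="split-entries">
    <xsl:param name="str"/>
    <xsl:param name="curr"/>
    <xsl:variable name="head">
      <xsl:choose>
        <xsl:when test="contains($str, ',')">
          <xsl:value-of select="substring-before($str, ',')"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$str"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <!-- Assumption: whitespace-only pieces (e.g. after ", , ,") are dropped -->
    <xsl:if test="normalize-space($head)">
      <entry>
        <xsl:call-template name="restore">
          <xsl:with-param name="str" select="string($head)"/>
          <xsl:with-param name="curr" select="$curr"/>
        </xsl:call-template>
      </entry>
    </xsl:if>
    <xsl:if test="contains($str, ',')">
      <xsl:call-template name="split-entries">
        <xsl:with-param name="str" select="substring-after($str, ',')"/>
        <xsl:with-param name="curr" select="$curr"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Hypothetical helper: replace each marker by the element it stands for -->
  <xsl:template name="restore">
    <xsl:param name="str"/>
    <xsl:param name="curr"/>
    <xsl:choose>
      <xsl:when test="contains($str, '@xyxy')">
        <xsl:value-of select="substring-before($str, '@xyxy')"/>
        <xsl:variable name="rest" select="substring-after($str, '@xyxy')"/>
        <xsl:variable name="pos"
            select="number(substring-before($rest, '@xy'))"/>
        <xsl:copy-of select="$curr/node()[position() = $pos]"/>
        <xsl:call-template name="restore">
          <xsl:with-param name="str" select="substring-after($rest, '@xy')"/>
          <xsl:with-param name="curr" select="$curr"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$str"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Applied to a well-formed version of the sample line (with <xref .../> closed),
this should give one <entry> for "Some text" and one for the piece containing
the <xref>. Very deep recursion on long lines may be a concern with some
XSLT 1.0 processors.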