Subject: RE: [xsl] Parsing complex line (mixed text and markup)
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Thu, 14 Feb 2008 23:14:30 -0000
This problem has come up in the past and it's not particularly easy. There
seem to be two main approaches:
(a) convert the string delimiters into element markup, and then use grouping
facilities (xsl:for-each-group) to analyze the overall structure
(b) convert the markup into string delimiters, and then use
xsl:analyze-string.
Both work, but I think (a) is probably a bit easier.
Do all the delimiters (commas) occur in top-level text nodes, or can they
occur nested within elements? I'll assume the former.
Start by making a copy of the data in which the commas are replaced by
<comma/> elements:
<xsl:template match="tbentry">
<xsl:variable name="temp">
<xsl:apply-templates mode="replace-commas"/>
</xsl:variable>
..[G]..
</xsl:template>
<xsl:template match="*" mode="replace-commas">
<xsl:copy-of select="."/>
</xsl:template>
<xsl:template match="text()" mode="replace-commas">
<xsl:analyze-string select="." regex=",">
<xsl:matching-substring><comma/></xsl:matching-substring>
<xsl:non-matching-substring><xsl:value-of
select="."/></xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
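Then (at [G] above) process the new tbentry content using grouping. Each group
after the first starts with a <comma/>, so filter that element out when copying:
<xsl:for-each-group select="$temp/child::node()"
group-starting-with="comma">
<entry><xsl:copy-of select="current-group()[not(self::comma)]"/></entry>
</xsl:for-each-group>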
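Putting the two steps together, a complete stylesheet might look like the sketch below. It is equally untested against Saxon or Altova. The <row> wrapper is taken from the desired output in the question; the test that skips groups containing only whitespace (to cope with the trailing ", , ,") is my own assumption about what is wanted, and the input is assumed to be well-formed (for example, <xref linkend="somelink"/>).

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="tbentry">
    <!-- Step 1: copy the content, turning each comma in a
         top-level text node into a <comma/> marker element -->
    <xsl:variable name="temp">
      <xsl:apply-templates mode="replace-commas"/>
    </xsl:variable>
    <!-- Step 2: start a new group at every <comma/> marker -->
    <row>
      <xsl:for-each-group select="$temp/node()"
                          group-starting-with="comma">
        <!-- drop the marker itself from the group -->
        <xsl:variable name="content"
                      select="current-group()[not(self::comma)]"/>
        <!-- assumption: skip groups that hold only whitespace -->
        <xsl:if test="$content[self::*] or
                      normalize-space(string-join($content/string(), ''))">
          <entry><xsl:copy-of select="$content"/></entry>
        </xsl:if>
      </xsl:for-each-group>
    </row>
  </xsl:template>

  <!-- child elements are copied unchanged, so commas nested
       inside them are not treated as delimiters -->
  <xsl:template match="*" mode="replace-commas">
    <xsl:copy-of select="."/>
  </xsl:template>

  <xsl:template match="text()" mode="replace-commas">
    <xsl:analyze-string select="." regex=",">
      <xsl:matching-substring><comma/></xsl:matching-substring>
      <xsl:non-matching-substring>
        <xsl:value-of select="."/>
      </xsl:non-matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

Leading and trailing whitespace inside each entry is kept as it appears in the source; trimming it would need an extra pass over the first and last text nodes of each group.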
Not tested!
Michael Kay
http://www.saxonica.com/
> -----Original Message-----
> From: Ilya Lifshits [mailto:chehlo@xxxxxxxxx]
> Sent: 14 February 2008 22:38
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] Parsing complex line (mixed text and markup)
>
> Hello experts,
>
> I'm using XSLT 2.0 processors, both Saxon and Altova.
>
> I'm trying to parse a complex line like:
> <tbentry>Some text, Some more text <xref linkend="somelink">
> even more text , , ,</tbentry>
>
> and get the following output:
>
> <row>
> <entry>Some text</entry>
> <entry>Some more text <xref
> linkend="ut_man_related_docs"> and even more text </entry> </row>
>
> The number of entries is not constant.
>
> I easily found a solution for plain text, without mixed
> markup, by using the tokenize function,
> but failed to separate text and markup using this approach.
> Example can be found here : http://pastebin.com/m40fd204f
>
> To formalize the goal: I want to simplify the life of our tech
> writers by creating wrappers on top of DocBook that will
> help transform from my own syntax to standard DocBook code.
> So if there is another, more appropriate way (which is not a WYSIWYG
> editor) to achieve this, I can completely change the source line:
> <tblrow>Some text, Some more text <xref linkend="somelink">
> even more text </tblrow> as long as it's still easy to write
> :) The only solution I found is to pass the linkend entry as an
> attribute to tblrow, with another attribute that specifies
> the entry number.
> But this is a very limited solution and will not allow me to
> use xref in 2 entries, for example.
> Additional note: I'm an absolute newbie in XML.
>
> Thanks in advance,
> Ilya.