Subject: [xsl] text nodes
From: James Cummings <James.Cummings@xxxxxxxxxxxxxx>
Date: Thu, 15 Apr 2004 11:15:23 +0100 (BST)

I'm converting a strangely formed html file, which I've 'tidy'ed to xhtml, into TEI xml. The creators have included line numbers inside the span they are using to mark lines, and have always padded them to 4 digits in order to left-justify them. They have then marked the digits that aren't used with a <font> tag in the same colour as the background. (*sigh*).

What I want to achieve is to move the line numbers into an attribute on a line element and remove them from the text of that line. My first attempt, using xsl:number, position() and/or count() to simply renumber the lines, doesn't work because the line numbers have been editorially decided in certain places to compensate for missing lines, etc., and may have no bearing on the number of lines in that particular file.

Assuming <b> is the removable <font> tag, but that all the rest of the intellectual content needs to be preserved, given:

-----
<root>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line</a>
  <c><a>This is a line</a></c>
  <a>5<b>000</b> line <d>five</d></a>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line</a>
  <c><a>10<b>00</b> line ten</a></c>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line</a>
  <a>This is a line and lots missing</a>
  <a><d>1000 This is</d> also a line</a>
  <a><d>This </d>is a line</a>
  <a>3523 This is a later line</a>
  <a>This is a line</a>
</root>
-----

My xsl currently looks like:

-----
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><div><xsl:apply-templates /></div></xsl:template>
  <xsl:template match="b"/>
  <xsl:template match="c"><p><xsl:apply-templates/></p></xsl:template>
  <xsl:template match="d"><d><xsl:apply-templates/></d></xsl:template>
  <xsl:template match="//a">
    <!-- strip letters, a little punctuation and spaces from the first text child; what is left should be the line number -->
    <xsl:variable name="num"><xsl:value-of select="translate(text()[1], 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz&amp;;-_ ', '')"/></xsl:variable>
    <l>
      <xsl:if test="not($num = '')">
        <xsl:attribute name="n"><xsl:value-of select="$num"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates/>
    </l>
  </xsl:template>
</xsl:stylesheet>
-----

and so produces something like:

-----
<div>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line</l>
  <p><l>This is a line</l></p>
  <l n="5">5 line <d>five</d></l>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line</l>
  <p><l n="10">10 line ten</l></p>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line</l>
  <l>This is a line and lots missing</l>
  <l><d>1000 This is</d> also a line</l>
  <l><d>This </d>is a line</l>
  <l n="3523">3523 This is a later line</l>
  <l>This is a line</l>
</div>
-----

I've tried matching and translate()'ing text()[1] to remove the numbers but, like my way of getting the line numbers, it fails if the line number happens to be inside another element, as with 1000 in my example, because text()[1] only looks at text nodes that are direct children of <a>.

So how do I
a) grab the line number more successfully for the @n, and
b) remove the line number from the text of the line without removing anything I shouldn't, or missing one?

Suggestions? Solutions?

-James
---
Dr James Cummings, Oxford Text Archive, University of Oxford
James.Cummings at ota.ahds.ac.uk
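
One direction that might address both (a) and (b) is sketched below, as an illustration rather than a tested solution. It looks at the first non-blank text node anywhere inside the line (skipping the filler <b>), instead of only at text()[1]. It assumes that a line number, when present, is always the first whitespace-delimited token of that text node and consists only of digits, so a line whose real text begins with a number would be misread. The variable names $first and $tok are my own.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><div><xsl:apply-templates/></div></xsl:template>
  <xsl:template match="b"/>
  <xsl:template match="c"><p><xsl:apply-templates/></p></xsl:template>
  <xsl:template match="d"><d><xsl:apply-templates/></d></xsl:template>

  <xsl:template match="a">
    <!-- first non-blank text node at any depth in this line, ignoring <b> -->
    <xsl:variable name="first"
        select="(.//text()[not(ancestor::b)][normalize-space()])[1]"/>
    <!-- its first whitespace-delimited token (assumed to hold the number) -->
    <xsl:variable name="tok"
        select="substring-before(concat(normalize-space($first), ' '), ' ')"/>
    <l>
      <!-- only treat the token as a line number if it is all digits -->
      <xsl:if test="$tok != '' and translate($tok, '0123456789', '') = ''">
        <xsl:attribute name="n"><xsl:value-of select="$tok"/></xsl:attribute>
      </xsl:if>
      <xsl:apply-templates/>
    </l>
  </xsl:template>

  <xsl:template match="a//text()">
    <!-- recompute the same "first text node" for the enclosing line -->
    <xsl:variable name="first"
        select="(ancestor::a[1]//text()[not(ancestor::b)][normalize-space()])[1]"/>
    <xsl:variable name="tok"
        select="substring-before(concat(normalize-space(.), ' '), ' ')"/>
    <xsl:choose>
      <!-- this is the node carrying the number: drop the leading token -->
      <xsl:when test="generate-id(.) = generate-id($first)
                      and $tok != '' and translate($tok, '0123456789', '') = ''">
        <xsl:value-of select="substring-after(., $tok)"/>
      </xsl:when>
      <xsl:otherwise><xsl:value-of select="."/></xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

With this, the number in <d>1000 This is</d> should be found and removed even though it sits inside a child element. The space that followed the number is left behind at the start of the text, so a further trim may be wanted.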