Subject: [xsl] Merging lines of 3 words or less
From: James Cummings <cummings.james@xxxxxxxxx>
Date: Thu, 8 Sep 2005 10:32:19 +0100

In doing a transcription of some psalms, someone marked up as separate lines instances where the editor of the print version had wrapped (and indented) a line. What I want to do is pass the file through a stylesheet and merge any lines with 3 or fewer words into the line before. If the source file looks something along the lines of:

----
<?xml version="1.0" encoding="UTF-8"?>
<div type="psalm" n="5">
  <lg n="5:1">
    <l n="1"><w>Myne</w> <w>wordes</w>, <w>lauerd</w>, <w>with</w> <w>eres</w></l>
    <l n="2"><w>byse;</w></l>
    <l n="3"><w>Vnderstande</w> <w><c type="thorn">þ</c>e</w> <w>crie</w> <w>ofe</w> <w>me</w>.</l>
  </lg>
  <lg n="5:2">
    <l n="1"><w>Bihald</w> <w>vnto</w> <w>my</w> <w>bede</w> <w>steuene</w>,</l>
    <l n="2"><w>Mi</w> <w>kynge</w> <w>and</w> <w>my</w> <w>god</w> <w>ofe</w> <w>heuene</w>.</l>
  </lg>
  <lg n="5:3">
    <l n="1"><w>For</w> <w>to</w> <w><c type="thorn">þ</c>e</w>, <w>lauerd</w>, <w>bidde</w> <w>sal</w> .<w>I</w>.<w>;</w></l>
    <l n="2"><w>Mi</w> <w>steuene</w> <w>sal</w> <w>tou</w> <w>here</w> <w>erli</w>.</l>
  </lg>
  <lg n="5:4">
    <l n="1"><w>Erli</w> <w>sal</w> .<w>I</w>. <w>to</w> <w><c type="thorn">þ</c>e</w> <w>se</w> <w>and</w> <w>stande;</w></l>
    <l n="2"><w>For</w> <w>noght</w> <w>god</w> <w>artou</w> <w>wiknes</w> <w>willande</w>,</l>
  </lg>
  <lg n="5:5">
    <l n="1"><w>Ne</w> <w>wone</w> <w>sal</w> <w>lither</w> <w>biside</w> <w><c type="thorn">þ</c>e</w> ,</l>
    <l n="2"><w>Ne</w> <w>vnrightwise</w> <w>bifor</w> <w><c type="thorn">þ</c>in</w> <w>eyhen</w> <w>be</w>.</l>
  </lg>
  <lg n="5:6">
    <l n="1"><w><c type="THORN">Þ</c>ou</w> <w>hated</w> <w>al</w> <w><c type="thorn">þ</c>at</w> <w>wirkes</w> <w>wiknesse;</w></l>
    <l n="2"><w><c type="THORN">Þ</c>at</w> <w>lighe</w> <w>spekes</w> <w>leses</w> <w>tou</w> <w>mare</w> <w>and</w></l>
    <l n="3"><w>lesse</w>,</l>
  </lg>
  (etc.)
</div>
-----

Now, the way I'm doing it, which *seems* to work, is:

-----
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:template match="/"><xsl:apply-templates/></xsl:template>

  <!-- Identity template: copy anything not handled below. -->
  <xsl:template match="node()|@*" priority="-1">
    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
  </xsl:template>

  <xsl:template match="lg">
    <lg n="{@n}">
      <xsl:for-each select="l">
        <xsl:choose>
          <xsl:when test="count(w) > 3">
            <!-- Renumber, counting only the lines that survive the merge. -->
            <xsl:variable name="lineNum"><xsl:number count="l[count(w) > 3]" from="lg"/></xsl:variable>
            <l n="{$lineNum}">
              <xsl:apply-templates />
              <!-- Pull in the next line if it has 3 or fewer words. -->
              <xsl:if test="following-sibling::l[1][count(w) &lt; 4]">
                <xsl:apply-templates select="following-sibling::l[1]"/>
              </xsl:if>
            </l>
          </xsl:when>
          <xsl:otherwise/>
        </xsl:choose>
      </xsl:for-each>
    </lg>
  </xsl:template>

  <xsl:template match="l[count(w) > 3]">
    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
  </xsl:template>

  <!-- Short lines contribute only their content, not the l wrapper. -->
  <xsl:template match="l[count(w) &lt; 4]">
    <xsl:apply-templates />
  </xsl:template>

</xsl:stylesheet>
-----

I'm just wondering if this is having any unforeseen side effects that I'm not noticing. In 150 psalms there are only about 20 instances of lg/l elements containing fewer than 4 words which are in fact real lines. The rest should be merged. I figured it was easier to go and correct these 20 after automatically fixing the hundreds (a few per psalm) which are wrong. Is this the best way to do it?

-James
--
James Cummings, Cummings dot James at GMail dot com
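
For comparison, the same merge can be sketched with XSLT 2.0 grouping (xsl:for-each-group with group-starting-with). This is only a sketch under stated assumptions, not a replacement for the stylesheet above: it assumes that any l with more than 3 w children begins a new output line, that every shorter l belongs to the line before it, and that a single space is wanted at each join (the inserted space is an assumption, not something the source markup specifies).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy><xsl:apply-templates select="node()|@*"/></xsl:copy>
  </xsl:template>

  <xsl:template match="lg">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- Each l with more than 3 w children starts a new group;
           shorter lines fall into the group of the line before them. -->
      <xsl:for-each-group select="l" group-starting-with="l[count(w) &gt; 3]">
        <!-- position() here is the group's position, so lines are renumbered. -->
        <l n="{position()}">
          <xsl:for-each select="current-group()">
            <!-- Assumption: separate merged lines with one space. -->
            <xsl:if test="position() &gt; 1"><xsl:text> </xsl:text></xsl:if>
            <xsl:apply-templates select="node()"/>
          </xsl:for-each>
        </l>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Two behaviours differ from the for-each version. A short line that comes first in its lg is kept as a line of its own, because the first item always starts a group, whereas the for-each version drops it. A run of several consecutive short lines is merged in full, whereas the for-each version picks up only the first of them.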