Subject: Re: [xsl] Correcting misplaced spaces in XML documents
From: "Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sun, 26 Mar 2023 13:39:01 -0000
Thank you Gerrit, that looks like a very useful project which I will have a close look at. I would not have thought of the complication with footnotes without your comments, but that's something I could well encounter in our documents.

Thanks to others who made suggestions too. (Syd) I can't be completely generic because there are elements where leading spaces really are significant (e.g. code snippets). But I'll look at your methods as well.

cheers
T

-----Original Message-----
From: Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Sunday, 26 March 2023 23:21
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Correcting misplaced spaces in XML documents

Hi Trevor,

emphasis-normalize-space [1] can deal with whitespace within nested elements and with embedded footnotes whose accidental leading or trailing whitespace shouldn't be pulled out and put into the surrounding paragraph.

Gerrit

[1] https://github.com/gimsieke/emphasis-normalize-space

On 26.03.2023 03:33, Trevor Nicholls trevor@xxxxxxxxxxxxxxxxxx wrote:
> I suppose this falls into the category of data cleanup.
>
> In the very simple case I am importing documents which have content
> like this:
>
> <para>Press the<keyname> Escape </keyname>key.</para>
>
> You'll notice that the adjacent spaces are wrapped in the keyname
> element when they should just be adjacent to it, not in it.
>
> This is a pathological case, usually the keyname is correct, but
> occasionally there is a leading or a trailing space, hardly ever both.
>
> I've written a simple stylesheet which corrects this situation,
> identifying leading and trailing whitespace, and outputting the
> appropriate breakdown:
>
> <xsl:template match="keyname">
>   <xsl:variable name="leading">…</xsl:variable>
>   <xsl:variable name="trailing">…</xsl:variable>
>   <xsl:variable name="content">…</xsl:variable>
>   <xsl:if test="$leading != ''"><xsl:value-of select="$leading"/></xsl:if>
>   <xsl:element name="keyname">
>     <xsl:apply-templates select="@*"/>
>     <xsl:value-of select="$content"/>
>   </xsl:element>
>   <xsl:if test="$trailing != ''"><xsl:value-of select="$trailing"/></xsl:if>
> </xsl:template>
>
> This is all fine, and it's adequate for the job when the "greedy"
> elements only contain text, which is the case for keynames.
>
> However now I want to extend the stylesheet to correct some other
> cases where the content model of the element is not just simple text.
>
> For example:
>
> <para>Select the<filename> <var>username</var>.profile </filename>file.</para>
>
> Although the cases I am looking at right now only have a content model
> of text or <var> elements, a more general solution would be welcome
> because other cases are going to turn up where elements are nested two
> or three levels deep.
>
> I've got myself neck deep into conditionals trying to extend my simple
> template to cope with this, and I'm sure there's a straightforward way
> of doing it that doesn't need several hundred lines of code.
>
> Can anyone point me to a cleaner way of doing it?
>
> cheers
>
> T
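
One possible shape for the nested case described above, offered as a minimal sketch and not as the approach taken by emphasis-normalize-space. It assumes an XSLT 2.0 processor (tunnel parameters and replace() are not available in 1.0). The mode name "trim", the parameter names, and the list of elements to fix (keyname, filename) are illustrative choices. The idea is to measure the leading and trailing whitespace on the element's string value, emit it outside the element, and let each descendant text node drop whatever part of itself falls inside those two ranges, however deeply it is nested.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Identity copy, shared by the default mode and the "trim" mode. -->
  <xsl:template match="@* | node()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <!-- Elements to fix (an assumed list; extend as needed). Elements that
       contain nothing but whitespace are left to the identity template. -->
  <xsl:template match="keyname[normalize-space()] | filename[normalize-space()]">
    <xsl:variable name="s" select="string(.)"/>
    <!-- Number of leading whitespace characters in the string value. -->
    <xsl:variable name="lead" as="xs:integer"
        select="string-length($s) - string-length(replace($s, '^\s+', ''))"/>
    <!-- Length of the string value once trailing whitespace is removed. -->
    <xsl:variable name="keep-to" as="xs:integer"
        select="string-length(replace($s, '\s+$', ''))"/>

    <!-- Leading whitespace goes in front of the element. -->
    <xsl:value-of select="substring($s, 1, $lead)"/>
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()" mode="trim">
        <xsl:with-param name="host" select="." tunnel="yes"/>
        <xsl:with-param name="lead" select="$lead" tunnel="yes"/>
        <xsl:with-param name="keep-to" select="$keep-to" tunnel="yes"/>
      </xsl:apply-templates>
    </xsl:copy>
    <!-- Trailing whitespace goes after the element. -->
    <xsl:value-of select="substring($s, $keep-to + 1)"/>
  </xsl:template>

  <!-- Each text node inside a host element keeps only the characters whose
       offset in the host's string value lies between $lead and $keep-to.
       The explicit priority avoids a tie with the identity template. -->
  <xsl:template match="text()" mode="trim" priority="1">
    <xsl:param name="host" as="element()" tunnel="yes"/>
    <xsl:param name="lead" as="xs:integer" tunnel="yes"/>
    <xsl:param name="keep-to" as="xs:integer" tunnel="yes"/>
    <!-- Offset of this text node within the host's string value. -->
    <xsl:variable name="start" as="xs:integer"
        select="string-length(string-join($host//text()[. &lt;&lt; current()], ''))"/>
    <xsl:variable name="a" select="max(($lead - $start, 0))"/>
    <xsl:variable name="b" select="min(($keep-to - $start, string-length(.)))"/>
    <xsl:value-of select="substring(., $a + 1, $b - $a)"/>
  </xsl:template>

</xsl:stylesheet>
```

Applied to the filename example above, this should give <para>Select the <filename><var>username</var>.profile</filename> file.</para>. Because only the listed elements are matched, elements where leading spaces are significant (code snippets, for instance) pass through the identity template unchanged. Two limitations of the sketch: a listed element nested inside another listed element is copied as-is rather than fixed in turn, and the offset calculation rescans the preceding text nodes for every text node, which is fine for short inline elements but would be slow on large ones.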