Subject: Re: [xsl] Finding list items in XHTML
From: "Ritu" <rkama@xxxxxxxxxxx>
Date: Thu, 12 Sep 2002 17:03:32 -0500

Try this

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" version="1.0" encoding="us-ascii"
  omit-xml-declaration="no"
  doctype-public="+//ISBN 0-9673008-1-9//DTD OEB 1.0.1 Document//EN"
  doctype-system="oebdoc101.dtd" indent="no" />

<!-- Turn a <p> whose <span> child starts with the marker into an <li> -->
<xsl:template match="p[span[starts-with(.,'➤')]]">
<li><xsl:apply-templates /></li>
</xsl:template>

<xsl:template match="p[span[starts-with(.,'&(!!char1!!);')]]">
<li><xsl:apply-templates /></li>
</xsl:template>

<!-- Drop the marker <span>; normalize-space() strips the trailing space,
     so the comparison string must not include it -->
<xsl:template match="span[normalize-space(text())='➤']" />

<xsl:template match="span[normalize-space(text())='&(!!char1!!);']" />

<!-- The Identity Transformation, as in your stylesheet -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

</xsl:stylesheet>

Ritu

----- Original Message -----
From: "Chris Loschen" <loschen@xxxxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, November 12, 2002 5:43 PM
Subject: Re: [xsl] Finding list items in XHTML

> Thank you very much for your help!
>
> I'm trying to tackle the <p> -> <li> problem first, since that seemed to be
> easier. However, I don't seem
> to have it right yet.
> Here's the stylesheet as it currently exists:
>
> *****
>
> <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
> <xsl:output method="xml" version="1.0" encoding="us-ascii"
> omit-xml-declaration="no" doctype-public="+//ISBN 0-9673008-1-9//DTD OEB
> 1.0.1 Document//EN" doctype-system="oebdoc101.dtd" indent="no" />
>
> <xsl:template match="p[starts-with(.,'➤ ')]">
> <li><xsl:apply-templates /></li>
> </xsl:template>
>
> <xsl:template match="p[starts-with(.,'&(!!char1!!); ')]">
> <li><xsl:apply-templates /></li>
> </xsl:template>
>
> <xsl:template match="span[.='➤ ']" />
>
> <xsl:template match="span[.='&(!!char1!!); ']" />
>
> <!-- The Identity Transformation -->
> <!-- Whenever you match any node or any attribute -->
> <xsl:template match="node()|@*">
> <!-- Copy the current node -->
> <xsl:copy>
> <!-- Including any attributes it has and any child nodes -->
> <xsl:apply-templates select="@*|node()"/>
> </xsl:copy>
> </xsl:template>
>
> </xsl:stylesheet>
>
> ***
>
> I need the output to be encoded as us-ascii because some of the downstream
> tools are expecting
> Unicode character references rather than pure UTF-8.
>
> Unfortunately, it doesn't seem to be finding the nodes I want, either the
> <p> or the <span> elements.
> Might it be the Unicode reference or the entity that's causing the problem?
> From what I can figure out,
> the syntax looks to be OK, but perhaps I'm wrong.
>
> Here's a sample of the input XML:
>
> ***
>
> <p class="hang-text-6"><span class="hang-text-2">➤ </span>Everyone
> in a company should have a written job description that accurately
> reflects their responsibilities and is related to their compensation.</p>
> <a id="page-059"></a>
>
> <p class="hang-text-3"><span class="hang-text-2">&(!!char1!!);
> </span>HR can help you deal with problem employees. Because most companies fear
> wrongful termination suits, problem employees often require delicate handling.
> Know your firm’s procedures for these situations and work closely
> with HR to
> resolve them quickly.</p>
>
> ***
>
> This is unchanged by the transformation. Do you see the error of my ways?
>
> Perhaps I just need to look at it with fresh eyes in the morning. Thanks
> again for your help.
>
>
> At 03:02 PM 11/12/02, you wrote:
> >Hi Chris,
> >
> >At 12:45 PM 11/12/2002, you wrote:
> >>My input (and output) is essentially XHTML (actually OEB, but they're
> >>almost identical). It has
> >>a series of <p> elements, and in two specific cases, I need to convert
> >>the <p> elements
> >>to <li> elements. Those cases are:
> >>
> >>(1) when the <p> element starts with a <span> with the contents "➤ "
> >>(yes, that's a character reference), and
> >>(2) when the <p> element starts with a <span> with the contents
> >>"&(!!char1!!); " (that string exactly)
> >>
> >>In both of these cases, I need to replace the <p> element with an <li>
> >>element
> >>and delete the child <span> element entirely.
> >
> >This is tractable.
> >
> >Understand, first, that your solution is in XPath, not XSLT. That is, your
> >code may be able to use (for example) either a template-driven approach,
> >or a for-each iteration or even other techniques; but either way you'll
> >need XPath to do your testing.
> >
> >Since you're wisely starting with an identity template, and since it's
> >probably the best solution in any case, we'll assume a template-driven
> >approach.
> >
> >As you know, you can match <p> elements with a template. You can also
> >qualify the match. So
> >
> ><xsl:template match="p[starts-with(.,'➤')]">
> >  <li>
> >    <xsl:apply-templates/>
> >  </li>
> ></xsl:template>
> >
> >matches <p> elements that start with this character. Note the match
> >succeeds irrespective of the presence or absence of a <span> element,
> >since all that's being tested is the string value of the <p>, which
> >includes any elements inside it. This may or may not be good enough.
> >Note also that any <p> elements that don't start with this character will fail
> >to match, and thus presumably will be picked up by the identity template.
> >
> >If you wanted a stricter test, you could say, for example,
> >
> ><xsl:template match="p[child::*[1]/self::span[.='➤']]">
> >
> >This template matches any <p> element whose first child element is a
> ><span> element with value '➤'.
> >
> >You get the idea -- you need to be very precise on exactly how you want
> >the test to work. The major gotcha to keep in mind here is the presence or
> >absence of text node children of your <p>, especially any whitespace-only
> >text nodes that are there only for formatting your source. (Come back and
> >ask if this is mysterious or if the expressions you try aren't getting the
> >results you anticipate.)
> >
> >Deleting the child span is easy; just include the template
> >
> ><xsl:template match="span[.='➤']"/>
> >
> >in your identity transform. (This expression matches any span whose string
> >value is that character. The template, having matched such a span, does
> >nothing with it, so it doesn't appear in your output.)
> >
> >>Perhaps the more difficult part of this is that I also would like to take
> >>a series of
> >>such elements and surround the entire series with a <ul> </ul> element
> >>structure.
> >
> >This is basically a grouping problem, and appears in the archive of this
> >list (and the XSL FAQ) under various guises, such as introducing hierarchy
> >into flat structures, etc. In your case this'll be much easier to do in a
> >separate pass. (It can be done in one pass, but you need to understand the
> >logic for the two tasks separately in any case.)
> >
> >For that second level down (lists inside lists), the same basic techniques
> >will work. If you break the processing into two passes (1. change <p> to
> ><li>, 2. group <li> structures), make sure in pass 1.
> >that the second-level <li> elements have some kind of attribute or other marker to
> >distinguish them, so they can be grouped properly in the second pass to
> >interpolate the correct hierarchy.
> >
> >If these hints aren't enough to get you rolling, or if you need help with
> >exactly how to write the XPath, come back and ask. (If asking about XPath
> >and matching, show us your source so we can see about any of those pesky
> >text nodes, etc.) But hopefully this will help you break the problems down.
> >
> >Regards,
> >Wendell
> >
> >======================================================================
> >Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
> >Mulberry Technologies, Inc.                http://www.mulberrytech.com
> >17 West Jefferson Street                    Direct Phone: 301/315-9635
> >Suite 207                                          Phone: 301/315-9631
> >Rockville, MD 20850                                  Fax: 301/315-8285
> >----------------------------------------------------------------------
> >  Mulberry Technologies: A Consultancy Specializing in SGML and XML
> >======================================================================
> >
> >
> >XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>
> --Chris
>
> ----------------------------------------------------------------------------------------
> Texterity ~ XML and PDF ePublishing Services
> ----------------------------------------------------------------------------------------
> Chris Loschen, XML Developer
> Texterity, Inc.
> 144 Turnpike Road
> Southborough, MA 01772 USA
> tel: +1.508.804.3033
> fax: +1.508.804.3110
> email: loschen@xxxxxxxxxxxxx
> http://www.texterity.com/
>
>
> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
>
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
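
The second pass described in the quoted message (wrapping each run of adjacent <li> elements in a <ul>) is not shown anywhere in this thread. One possible XSLT 1.0 sketch follows; it is untested against the real OEB files and is only one of several ways to do the grouping. It assumes the first pass has already turned the marked <p> elements into <li> elements, that the elements are in no namespace, and that only one level of list is needed. The mode name "in-run" is made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transformation: copy everything not handled below. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An <li> whose nearest preceding sibling element is not an <li>
       starts a new run: open a <ul> and walk along the run. -->
  <xsl:template match="li[not(preceding-sibling::*[1][self::li])]">
    <ul>
      <xsl:apply-templates select="." mode="in-run"/>
    </ul>
  </xsl:template>

  <!-- Every other <li> has already been emitted by the walk,
       so it produces nothing in the default mode. -->
  <xsl:template match="li"/>

  <!-- Copy this <li>, then move on to the next sibling element,
       but only if that element is also an <li>. -->
  <xsl:template match="li" mode="in-run">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::*[1][self::li]"
                         mode="in-run"/>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats under these assumptions. The <a id="page-059"></a> element in the sample input sits between the two list paragraphs, so this sketch would put them in two separate <ul> elements unless such anchors are moved or skipped first. Whitespace-only text nodes between items do not break a run, but they are copied after the <ul> rather than inside it.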