Re: [xsl] Finding list items in XHTML

Subject: Re: [xsl] Finding list items in XHTML
From: "Ritu" <rkama@xxxxxxxxxxx>
Date: Thu, 12 Sep 2002 17:03:32 -0500
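Try this. It matches the <p> itself rather than the <span>, and compares
against normalize-space(.) so that trailing spaces or line breaks inside
the <span> don't matter:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" version="1.0" encoding="us-ascii"
omit-xml-declaration="no" doctype-public="+//ISBN 0-9673008-1-9//DTD OEB
1.0.1 Document//EN" doctype-system="oebdoc101.dtd" indent="no" />

<!-- A <p> whose first child element is a bullet <span> becomes an <li> -->
<xsl:template match="p[*[1][self::span][normalize-space(.)='&#10148;']]">
<li><xsl:apply-templates /></li>
</xsl:template>

<xsl:template match="p[*[1][self::span][normalize-space(.)='&amp;(!!char1!!);']]">
<li><xsl:apply-templates /></li>
</xsl:template>

<!-- Drop the bullet <span> itself. normalize-space() strips trailing
     whitespace, so the comparison string must not end in a space. -->
<xsl:template match="span[normalize-space(.)='&#10148;']" />

<xsl:template match="span[normalize-space(.)='&amp;(!!char1!!);']" />

<!-- Identity template from your stylesheet: copy everything else -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

</xsl:stylesheet>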
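
For the <ul> grouping that Wendell describes below as a separate pass,
here is a minimal sketch rather than a tested solution for your documents.
It assumes the first pass has already produced <li> elements that sit as
siblings among the remaining <p> elements, and that none of them is inside
a <ul> yet. The mode name "in-list" is just a label I picked. It does not
handle the second-level lists.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity transformation for everything not handled below -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- First <li> of a run (previous sibling element is not an <li>):
       open a <ul> and walk along the run -->
  <xsl:template match="li[not(preceding-sibling::*[1][self::li])]">
    <ul>
      <xsl:apply-templates select="." mode="in-list"/>
    </ul>
  </xsl:template>

  <!-- Later <li> elements are emitted by the walk, so suppress them here -->
  <xsl:template match="li[preceding-sibling::*[1][self::li]]"/>

  <!-- Copy this <li>, then continue only if the very next sibling
       element is also an <li> -->
  <xsl:template match="li" mode="in-list">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::*[1][self::li]"
        mode="in-list"/>
  </xsl:template>

</xsl:stylesheet>
```

Run it as a second transformation over the output of the first. Because the
tests look at sibling elements only, whitespace between items is harmless,
but any element between two <li> elements (such as the <a id="page-059">
anchor in your sample) ends the run, so those two items would land in
separate <ul> elements.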


Ritu
----- Original Message -----
From: "Chris Loschen" <loschen@xxxxxxxxxxxxx>
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, November 12, 2002 5:43 PM
Subject: Re: [xsl] Finding list items in XHTML


> Thank you very much for your help!
>
> I'm trying to tackle the <p> -> <li> problem first, since that seemed to be
> easier. However, I don't seem to have it right yet. Here's the stylesheet
> as it currently exists:
>
> *****
>
> <xsl:stylesheet version="1.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>
> <xsl:output method="xml" version="1.0" encoding="us-ascii"
> omit-xml-declaration="no" doctype-public="+//ISBN 0-9673008-1-9//DTD OEB
> 1.0.1 Document//EN" doctype-system="oebdoc101.dtd" indent="no" />
>
> <xsl:template match="p[starts-with(.,'&#10148; ')]">
> <li><xsl:apply-templates /></li>
> </xsl:template>
>
> <xsl:template match="p[starts-with(.,'&amp;(!!char1!!); ')]">
> <li><xsl:apply-templates /></li>
> </xsl:template>
>
> <xsl:template match="span[.='&#10148; ']" />
>
> <xsl:template match="span[.='&amp;(!!char1!!); ']" />
>
> <!-- The Identity Transformation -->
>    <!-- Whenever you match any node or any attribute -->
>    <xsl:template match="node()|@*">
>      <!-- Copy the current node -->
>      <xsl:copy>
>        <!-- Including any attributes it has and any child nodes -->
>        <xsl:apply-templates select="@*|node()"/>
>      </xsl:copy>
>    </xsl:template>
>
> </xsl:stylesheet>
>
> ***
>
> I need the output to be encoded as us-ascii because some of the downstream
> tools are expecting Unicode character references rather than pure UTF-8.
>
> Unfortunately, it doesn't seem to be finding the nodes I want, either the
> <p> or the <span> elements. Might it be the Unicode reference or the entity
> that's causing the problem? From what I can figure out, the syntax looks to
> be OK, but perhaps I'm wrong.
>
> Here's a sample of the input XML:
>
> ***
>
> <p class="hang-text-6"><span class="hang-text-2">&#10148; </span>Everyone
> in a company should have a written job description that accurately
> reflects their responsibilities and is related to their compensation.</p>
> <a id="page-059"></a>
>
> <p class="hang-text-3"><span class="hang-text-2">&amp;(!!char1!!);
> </span>HR can help you deal with problem employees. Because most companies
> fear wrongful termination suits, problem employees often require delicate
> handling. Know your firm&rsquo;s procedures for these situations and work
> closely with HR to resolve them quickly.</p>
>
> ***
>
> This is unchanged by the transformation. Do you see the error of my ways?
>
> Perhaps I just need to look at it with fresh eyes in the morning. Thanks
> again for your help.
>
>
> At 03:02 PM 11/12/02, you wrote:
> >Hi Chris,
> >
> >At 12:45 PM 11/12/2002, you wrote:
> >>My input (and output) is essentially XHTML (actually OEB, but they're
> >>almost identical). It has
> >>a series of <p> elements, and in two specific cases, I need to convert
> >>the <p> elements
> >>to <li> elements. Those cases are:
> >>
> >>(1) when the <p> element starts with a <span> with the contents
> >>"&#10148; " (yes, that's a character reference), and
> >>(2) when the <p> element starts with a <span> with the contents
> >>"&amp;(!!char1!!); " (that string exactly)
> >>
> >>In both of these cases, I need to replace the <p> element with an <li>
> >>element
> >>and delete the child <span> element entirely.
> >
> >This is tractable.
> >
> >Understand, first, that your solution is in XPath, not XSLT. That is, your
> >code may be able to use (for example) either a template-driven approach,
> >or a for-each iteration or even other techniques; but either way you'll
> >need XPath to do your testing.
> >
> >Since you're wisely starting with an identity template, and since it's
> >probably the best solution in any case, we'll assume a template-driven
> >approach.
> >
> >As you know, you can match <p> elements with a template. You can also
> >qualify the match. So
> >
> ><xsl:template match="p[starts-with(.,'&#10148;')]">
> >   <li>
> >     <xsl:apply-templates/>
> >   </li>
> ></xsl:template>
> >
> >matches <p> elements that start with this character. Note the match
> >succeeds irrespective of the presence or absence of a <span> element,
> >since all that's being tested is the string value of the <p>, which
> >includes any elements inside it. This may or may not be good enough. Note
> >also that any <p> elements that don't start with this character will fail
> >to match, and thus presumably will be picked up by the identity template.
> >
> >If you wanted a stricter test, you could say, for example,
> >
> ><xsl:template match="p[child::*[1]/self::span[.='&#10148;']]">
> >
> >This template matches any <p> element whose first child element is a
> ><span> element with value '&#10148;'.
> >
> >You get the idea -- you need to be very precise on exactly how you want
> >the test to work. The major gotcha to keep in mind here is the presence or
> >absence of text node children of your <p>, especially any whitespace-only
> >text nodes that are there only for formatting your source. (Come back and
> >ask if this is mysterious or if the expressions you try aren't getting the
> >results you anticipate.)
> >
> >Deleting the child span is easy; just include the template
> >
> ><xsl:template match="span[.='&#10148;']"/>
> >
> >in your identity transform. (This expression matches any span whose string
> >value is that character. The template, having matched such a span, does
> >nothing with it, so it doesn't appear in your output.)
> >
> >>Perhaps the more difficult part of this is that I also would like to take
> >>a series of such elements and surround the entire series with a
> >><ul> </ul> element structure.
> >
> >This is basically a grouping problem, and appears in the archive of this
> >list (and the XSL FAQ) under various guises, such as introducing hierarchy
> >into flat structures, etc. In your case this'll be much easier to do in a
> >separate pass. (It can be done in one pass, but you need to understand the
> >logic for the two tasks separately in any case.)
> >
> >For that second level down (lists inside lists), the same basic techniques
> >will work. If you break the processing into two passes (1. change <p> to
> ><li>, 2. group <li> structures), make sure in pass 1. that the
> >second-level <li> elements have some kind of attribute or other marker to
> >distinguish them, so they can be grouped properly in the second pass to
> >interpolate the correct hierarchy.
> >
> >If these hints aren't enough to get you rolling, or if you need help with
> >exactly how to write the XPath, come back and ask. (If asking about XPath
> >and matching, show us your source so we can see about any of those pesky
> >text nodes, etc.) But hopefully this will help you break the problems down.
> >
> >Regards,
> >Wendell
> >
> >
> >
> >======================================================================
> >Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
> >Mulberry Technologies, Inc.                http://www.mulberrytech.com
> >17 West Jefferson Street                    Direct Phone: 301/315-9635
> >Suite 207                                          Phone: 301/315-9631
> >Rockville, MD  20850                                 Fax: 301/315-8285
> >----------------------------------------------------------------------
> >   Mulberry Technologies: A Consultancy Specializing in SGML and XML
> >======================================================================
> >
> >
> >XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>
> --Chris
>
> ----------------------------------------------------------------------------------------
> Texterity ~ XML and PDF ePublishing Services
> ----------------------------------------------------------------------------------------
> Chris Loschen, XML Developer
> Texterity, Inc.
> 144 Turnpike Road
> Southborough, MA 01772 USA
> tel: +1.508.804.3033
> fax: +1.508.804.3110
> email: loschen@xxxxxxxxxxxxx
> http://www.texterity.com/
>



