Re: [xsl] Finding list items in XHTML

Subject: Re: [xsl] Finding list items in XHTML
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Tue, 12 Nov 2002 15:02:17 -0500
Hi Chris,

At 12:45 PM 11/12/2002, you wrote:
> My input (and output) is essentially XHTML (actually OEB, but they're almost identical). It has
> a series of <p> elements, and in two specific cases, I need to convert the <p> elements
> to <li> elements. Those cases are:
>
> (1) when the <p> element starts with a <span> with the contents "&#10148; " (yes, that's a character reference), and
> (2) when the <p> element starts with a <span> with the contents "&amp;(!!char1!!); " (that string exactly)
>
> In both of these cases, I need to replace the <p> element with an <li> element
> and delete the child <span> element entirely.

This is tractable.


Understand, first, that your solution is in XPath, not XSLT. That is, your code may be able to use (for example) either a template-driven approach, or a for-each iteration or even other techniques; but either way you'll need XPath to do your testing.

Since you're wisely starting with an identity template, and since it's probably the best solution in any case, we'll assume a template-driven approach.

As you know, you can match <p> elements with a template. You can also qualify the match. So

<xsl:template match="p[starts-with(.,'&#10148;')]">
  <li>
    <xsl:apply-templates/>
  </li>
</xsl:template>

matches <p> elements that start with this character. Note the match succeeds irrespective of the presence or absence of a <span> element, since all that's being tested is the string value of the <p>, that is, all the text inside it, including the text inside any child elements. This may or may not be good enough. Note also that any <p> elements that don't start with this character will fail to match, and thus presumably will be picked up by the identity template.

If you wanted a stricter test, you could say, for example,

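<xsl:template match="p[child::*[1]/self::span[normalize-space(.)='&#10148;']]">

This template matches any <p> element whose first child element is a <span> element whose value, ignoring leading and trailing whitespace, is '&#10148;'. (The normalize-space() is there because you say your span contains the character followed by a space; a plain .='&#10148;' test would not match that.)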

You get the idea -- you need to be very precise on exactly how you want the test to work. The major gotcha to keep in mind here is the presence or absence of text node children of your <p>, especially any whitespace-only text nodes that are there only for formatting your source. (Come back and ask if this is mysterious or if the expressions you try aren't getting the results you anticipate.)
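To make the text-node gotcha concrete, here is a minimal sketch of a stricter match. It assumes your elements are in no namespace (if your source declares the XHTML namespace, you would need to bind a prefix and write x:p, x:span and so on) and that the source looks something like

<p>
  <span>&#10148; </span>First item</p>

where the first child node of the <p> is a whitespace-only text node, not the <span>.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies whatever nothing else matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Match a <p> only if (a) its first child element is the marker span
       and (b) no text other than whitespace comes before that span.
       Whitespace-only text nodes ahead of the span are tolerated. -->
  <xsl:template match="p[*[1][self::span][normalize-space(.)='&#10148;']
                        and not(*[1]/preceding-sibling::text()[normalize-space(.)])]">
    <li>
      <xsl:apply-templates/>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

With a source like the one above, starts-with(., '&#10148;') on the <p> would fail, because the string value of the <p> begins with a line break and two spaces; the predicate in this sketch looks at child elements and so is not thrown off. Note this sketch only renames the <p>; the marker span is still copied through.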

Deleting the child span is easy; just include the template

<xsl:template match="span[normalize-space(.)='&#10148;']"/>

in your identity transform. (This expression matches any span whose string value, apart from leading and trailing whitespace, is that character. The template, having matched such a span, does nothing with it, so it doesn't appear in your output.)
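Putting the pieces together, a first pass might look roughly like the following. This is a sketch under assumptions, not a tested solution for your data: it assumes elements in no namespace, that the second marker appears in your source markup as &amp;(!!char1!!); (so its string value begins with a plain ampersand), and that you want to keep any attributes the <p> carried.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies whatever nothing else matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A <p> whose first child element is one of the two marker spans
       becomes an <li>. Attributes of the <p> are carried over. -->
  <xsl:template match="p[*[1][self::span][normalize-space(.)='&#10148;'
                                          or normalize-space(.)='&amp;(!!char1!!);']]">
    <li>
      <xsl:apply-templates select="@*|node()"/>
    </li>
  </xsl:template>

  <!-- The marker span itself, when it is the first child element
       of a <p>, is dropped. -->
  <xsl:template match="p/*[1][self::span][normalize-space(.)='&#10148;'
                                          or normalize-space(.)='&amp;(!!char1!!);']"/>

</xsl:stylesheet>
```

The two specific templates have a predicate and so get a higher default priority than the identity template, which is why they win for the nodes they match. Restricting the deletion to p/*[1] is a design choice: it leaves alone any span with the same content that turns up elsewhere.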

> Perhaps the more difficult part of this is that I also would like to take a series of
> such elements and surround the entire series with a <ul> </ul> element structure.

This is basically a grouping problem, and appears in the archive of this list (and the XSL FAQ) under various guises, such as introducing hierarchy into flat structures, etc. In your case this'll be much easier to do in a separate pass. (It can be done in one pass, but you need to understand the logic for the two tasks separately in any case.)
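As an illustration of the second pass, here is one common XSLT 1.0 way to do it, by walking along the siblings. It is a sketch that assumes the input is the output of the first pass: <li> elements sitting directly among their former <p> siblings, in no namespace, with no <ul> or <ol> in the document already (an existing list would get wrapped a second time).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies whatever nothing else matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An <li> whose nearest preceding sibling element is not an <li>
       starts a run: open the <ul> and walk the run from here. -->
  <xsl:template match="li[not(preceding-sibling::*[1][self::li])]">
    <ul>
      <xsl:apply-templates select="." mode="in-list"/>
    </ul>
  </xsl:template>

  <!-- Any other <li> has already been written by the walk,
       so it produces nothing in the default mode. -->
  <xsl:template match="li"/>

  <!-- Copy this <li>, then step to the next sibling element,
       but only if that element is also an <li>. -->
  <xsl:template match="li" mode="in-list">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
    <xsl:apply-templates select="following-sibling::*[1][self::li]"
                         mode="in-list"/>
  </xsl:template>

</xsl:stylesheet>
```

Whitespace-only text nodes between the items are not part of the walk; they are copied by the identity template in their original position, and so end up after the closing </ul>. That is normally harmless.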


For that second level down (lists inside lists), the same basic techniques will work. If you break the processing into two passes (1. change <p> to <li>, 2. group <li> structures), make sure in pass 1 that the second-level <li> elements have some kind of attribute or other marker to distinguish them, so they can be grouped properly in the second pass to interpolate the correct hierarchy.
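For instance, pass 1 could mark the second-level items like this. Two things here are purely assumptions for the sake of illustration: that it is the second marker that signals a second-level item (you would have to say what actually distinguishes them), and the attribute name "level", which is a made-up scratch attribute that pass 2 would read and then drop.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies whatever nothing else matches. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- First marker: an ordinary, first-level <li>. -->
  <xsl:template match="p[*[1][self::span][normalize-space(.)='&#10148;']]">
    <li>
      <xsl:apply-templates select="@*|node()"/>
    </li>
  </xsl:template>

  <!-- ASSUMPTION: the second marker means a second-level item.
       "level" is a hypothetical scratch attribute for pass 2. -->
  <xsl:template match="p[*[1][self::span][normalize-space(.)='&amp;(!!char1!!);']]">
    <li level="2">
      <xsl:apply-templates select="@*|node()"/>
    </li>
  </xsl:template>

  <!-- Drop either marker span when it leads a <p>. -->
  <xsl:template match="p/*[1][self::span][normalize-space(.)='&#10148;'
                                          or normalize-space(.)='&amp;(!!char1!!);']"/>

</xsl:stylesheet>
```

The grouping pass can then test @level to decide whether an item belongs in the outer list or in a nested one; that nesting logic is not shown here.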

If these hints aren't enough to get you rolling, or if you need help with exactly how to write the XPath, come back and ask. (If asking about XPath and matching, show us your source so we can see about any of those pesky text nodes, etc.) But hopefully this will help you break the problems down.

Regards,
Wendell



======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list


