RE: Marked up data with sectbreaks (was Re: [xsl] XSLT FAQ ideas)

Subject: RE: Marked up data with sectbreaks (was Re: [xsl] XSLT FAQ ideas)
From: "Passin, Tom" <tpassin@xxxxxxxxxxxx>
Date: Tue, 18 Feb 2003 10:48:16 -0500
[Elliotte Rusty Harold]

> At 6:25 PM -0500 2/17/03, Wendell Piez wrote:
> >So far's I know that's a new solution or one not seen in public
> >anyhow.
> Cool. I invented something! In case anyone wants to see details look 
> at <>
> >The other known solutions are (1) indexing nodes to sectbreaks
> >(generate-id(preceding::sectbreak[1]) works pretty well), then 
> >pulling them when matching sectbreaks ... documented somewhat in the 
> >FAQ under "flat" (flat structure to hierarchy). An alternative is a 
> >forward stepwise tree walk, in which a template matching a sectbreak 
> >has you proceed through the following sibling nodes one by one until 
> >you get a new sectbreak.
> >
> >The problem gets more complicated (much more) if your breaks are not
> >all at the same level.
> Yes, I can see that. Fortunately, mine are at the same level. I 
> wonder if multi-level breaking could be handled just by using the 
> following axis instead of following-sibling?
> -- 

I had a nasty case where I had to extract some specific data items from
an HTML page.  There were markers before and after the data but they
were different markers (different from each other, I mean) and in some
cases at different levels.

Once I could identify the right marker nodes and the candidate target
data items, I took the intersection of all the candidate data items
__after__ the first marker and __before__ the last marker.  I forget now
which of the axes I used for this particular case, following,
following-sibling, etc.
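
A minimal sketch of this kind of intersection in XSLT 1.0 might look like
the following.  The element names and marker strings (h2, p, td
class="item", 'Begin data', 'End data') are invented for illustration and
are not the ones from the real page, and the choice of the following and
preceding axes is an assumption, since the axes actually used are not
recorded above.  It also assumes the elements are in no namespace; if
Tidy's output declares the XHTML namespace, the names would need a prefix.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical markers: the first start marker and the last end marker -->
    <xsl:variable name="start" select="(//h2[contains(., 'Begin data')])[1]"/>
    <xsl:variable name="end" select="(//p[contains(., 'End data')])[last()]"/>

    <!-- Candidates after the start marker, and candidates before the end marker -->
    <xsl:variable name="after" select="$start/following::td[@class='item']"/>
    <xsl:variable name="before" select="$end/preceding::td[@class='item']"/>

    <!-- Intersection: keep a node of $after only if adding it to $before
         does not make that set any bigger, i.e. it is already a member -->
    <xsl:for-each select="$after[count(. | $before) = count($before)]">
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

The count(. | $set) = count($set) test is the usual XPath 1.0 idiom for
set membership, since XPath 1.0 has no intersection operator of its own.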

What made it worse was that the HTML (which I had no control over) was
hand-written, invalid, and from time to time changed the way in which it
was invalid.  So I had to supply alternate expressions to make it work
under the various observed permutations.  (BTW, I used Tidy to convert
the HTML into usable XML).  One example of this is that one of the end
markers was an <a> element that had a <p> inside it (invalid!), and the
<p> had an identifiable string I used to identify that it was indeed the
marker.  Sometimes the nesting of the a and p elements was reversed.
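
One way to write such an alternate expression is a union with one branch
per observed nesting.  This is only a sketch: the marker string 'End of
list' is hypothetical, and it assumes the identifying text sits somewhere
inside the outer element in both cases.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- First branch: <a> wrapping <p>.  Second branch: <p> wrapping <a>.
         'End of list' is a made-up marker string. -->
    <xsl:variable name="end"
        select="(//a[p[contains(., 'End of list')]]
               | //p[a][contains(., 'End of list')])[last()]"/>

    <!-- Report which of the two forms was actually found -->
    <xsl:value-of select="name($end)"/>
  </xsl:template>
</xsl:stylesheet>
```

Whichever nesting turns up, $end is bound to the outer element, so any
expressions that work from the marker need not care which form they got.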

Actually, the first set of items were used to index to the second set
(they were url fragment identifiers pointing into the same page), and
the second set actually pointed to the third set.  The data I needed was
a combination of the second and third sets.  All three sets had to be
extracted using their own before and after markers.  Whew!
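
Following same-page fragment identifiers from one set to the next could be
sketched with a key, as below.  This assumes the targets are <a name="...">
anchors and the links are <a href="#...">, which is a guess at the page's
structure rather than something stated above; the key name is arbitrary.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Index every named anchor by its name -->
  <xsl:key name="anchor-by-name" match="a[@name]" use="@name"/>

  <xsl:template match="/">
    <!-- Follow each same-page link to the anchor it points at -->
    <xsl:for-each select="//a[starts-with(@href, '#')]">
      <xsl:variable name="target"
          select="key('anchor-by-name', substring-after(@href, '#'))"/>
      <xsl:value-of select="concat(@href, ': ', normalize-space($target))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```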


Tom P
