Re: [xsl] Re: transform html h1 with a div

Subject: Re: [xsl] Re: transform html h1 with a div
From: "Imsieke, Gerrit, le-tex" <gerrit.imsieke@xxxxxxxxx>
Date: Wed, 31 Oct 2012 21:33:15 +0100
Giuseppe,

I once posted a solution [1] that I further simplified for your case:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:my="my"
  version="2.0"
  exclude-result-prefixes="my xs"
  >

<xsl:output method="xml" indent="yes" />

  <xsl:template match="body">
    <xsl:copy>
      <xsl:sequence select="my:hierarchize(*)" />
    </xsl:copy>
  </xsl:template>

  <xsl:function name="my:hierarchize" as="element(*)*">
    <xsl:param name="nodes" as="element(*)*" />
    <xsl:variable name="headings" select="$nodes[my:isHeading(.)]" as="element(*)*" />
    <!-- Start each group with a heading whose level number is lower than or
         equal to the minimal level number of all preceding headings.
         Example: h3, h2, h3, h4, h2
         First group starts with first h3 (because there has to be a first group).
         Next two groups start with h2.
    -->
    <xsl:for-each-group select="$nodes"
      group-starting-with="*[my:isHeading(.)][
                             my:hlevel(.) le min(
                               (
                                 for $preceding-heading in $headings[. &lt;&lt; current()]
                                   return my:hlevel($preceding-heading),
                                 6
                               )
                             )
                           ]">
      <xsl:choose>
        <xsl:when test="my:isHeading(.)">
          <div class="{name()}">
            <xsl:sequence select="., my:hierarchize(current-group()[position() gt 1])" />
          </div>
        </xsl:when>
        <xsl:otherwise>
          <xsl:sequence select="current-group()" />
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:function>


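  <!-- True for elements named h followed by a single digit (h1, h2, ...) -->
  <xsl:function name="my:isHeading" as="xs:boolean">
    <xsl:param name="elt" as="element(*)" />
    <!-- xsl:sequence returns the boolean itself instead of a text node -->
    <xsl:sequence select="matches(name($elt), '^h\d$')" />
  </xsl:function>

  <!-- Numeric level of a heading element, e.g. 3 for h3 -->
  <xsl:function name="my:hlevel" as="xs:integer">
    <xsl:param name="elt" as="element(*)" />
    <xsl:sequence select="xs:integer(replace(name($elt), '^h(\d)$', '$1'))" />
  </xsl:function>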

</xsl:stylesheet>


Applying this stylesheet to


<body>
<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h2>Title 2</h2>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h4>Title 4</h4>
<p>bla bla bla</p>
<p>bla bla bla</p>
</body>

gives the following output:

<body>
   <div class="h3">
      <h3>Title 3</h3>
      <p>bla bla bla</p>
      <p>bla bla bla</p>
   </div>
   <div class="h2">
      <h2>Title 2</h2>
      <p>bla bla bla</p>
      <p>bla bla bla</p>
      <div class="h3">
         <h3>Title 3</h3>
         <p>bla bla bla</p>
         <p>bla bla bla</p>
         <div class="h4">
            <h4>Title 4</h4>
            <p>bla bla bla</p>
            <p>bla bla bla</p>
         </div>
      </div>
   </div>
</body>
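
To see why the groups fall this way, the group-starting test can be traced on its own. The following is only a sketch, not part of the solution: it inlines the heading test and the level computation (using substring() instead of my:hlevel, purely to keep it standalone) and reports, for each heading child of body, whether it is at or below the lowest level number seen so far. It reflects only the top-level call; in the recursive calls the same test runs on the remainder of each group.

```xslt
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  version="2.0"
  exclude-result-prefixes="xs">

  <xsl:output method="text" />

  <xsl:template match="body">
    <!-- Same heading test as my:isHeading, inlined for a standalone sketch -->
    <xsl:variable name="headings" select="*[matches(name(), '^h\d$')]" />
    <xsl:for-each select="$headings">
      <xsl:variable name="level" as="xs:integer"
        select="xs:integer(substring(name(), 2))" />
      <!-- Lowest level number among earlier headings; 6 when there are none -->
      <xsl:variable name="min-before" as="xs:integer"
        select="min((for $h in $headings[. &lt;&lt; current()]
                     return xs:integer(substring(name($h), 2)), 6))" />
      <xsl:value-of select="concat(name(), ': ',
        if ($level le $min-before) then 'starts a top-level group'
        else 'nested', '&#10;')" />
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input above, this should report that the first h3 and the h2 each start a top-level group, while the second h3 and the h4 are nested.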

Gerrit


[1] http://markmail.org/message/cbt5ktnscauddcii


On 2012-10-31 20:06, Giuseppe Briotti wrote:
Michael Kay <mike <at> saxonica.com> writes:



I'm not sure why you are going to such lengths to deal with missing
levels. Surely the standard code that simply processes levels h1, h2,
... h6 in turn, recursively, will cope with missing levels without any
special code needed?

Michael Kay
Saxonica

Hi Michael, the problem is that I must process fragments of HTML too. This means that I can have a situation like the following:

<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h2>Title 2</h2>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>
<h4>Title 4</h4>
<p>bla bla bla</p>
<p>bla bla bla</p>

that should be nested as:

<div class="h3">
<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>
</div>

<div class="h2">
<h2>Title 2</h2>
<p>bla bla bla</p>
<p>bla bla bla</p>

<div class="h3">
<h3>Title 3</h3>
<p>bla bla bla</p>
<p>bla bla bla</p>

<div class="h4">
<h4>Title 4</h4>
<p>bla bla bla</p>
<p>bla bla bla</p>
</div>

</div>

</div>

This means that I cannot simply start with h1 and work down to h6. I need a more
complex grouping strategy, and then I must process each top-level group
recursively. In the above example, the first top-level group starts at the first
h3; the second starts at the h2 (which "contains" the second h3 and the h4) and
must be processed recursively to create the divs for the second h3 and the h4.

Using for-each-group with group-starting-with works fine for selecting the
topmost groups. The problem is that the pattern involves preceding-sibling, which
of course doesn't work on the subsequent, deeper current-group()...

Giuseppe.
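
Since fragments may arrive in a wrapper element other than body, one possible way to reuse the my:hierarchize function from the reply at the top of this message is to import that stylesheet and match any element that has heading children. This is a sketch under assumptions, not part of the posted solution: the file name hierarchize.xsl is hypothetical, and the wrapper's attributes are copied as-is.

```xslt
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:my="my"
  version="2.0"
  exclude-result-prefixes="my">

  <!-- Assumption: the stylesheet from the reply is saved under this
       (hypothetical) file name next to this one -->
  <xsl:import href="hierarchize.xsl" />

  <xsl:output method="xml" indent="yes" />

  <!-- Any element with at least one heading child is treated as a wrapper -->
  <xsl:template match="*[*[my:isHeading(.)]]">
    <xsl:copy>
      <xsl:copy-of select="@*" />
      <!-- As in the body template, only element children are passed on -->
      <xsl:sequence select="my:hierarchize(*)" />
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Because my:hierarchize returns the grouped nodes without applying templates to them, wrappers nested inside a group are copied unchanged rather than hierarchized themselves.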


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler
