Re: [xsl] transform html h1 with a div

Subject: Re: [xsl] transform html h1 with a div
From: Giuseppe Briotti <g.briotti@xxxxxxxxx>
Date: Wed, 31 Oct 2012 01:24:28 +0100

So, I am trying to extend this approach:

http://www.xmlplease.com/xhtml/xhtml-hierarchy

so that it works on fragments and, more generally, under "lazier" conditions:

* The highest h* level present can be any of h1, h2, ... h6
* Not all the intermediate levels need to be present

Thus, basically, a conservative approach is to consider a section as
starting at a given Hx and ending at the next Hy, where y <= x (a heading
of the same or a higher level). All the intermediate Hz (with z > x) must
be processed in the same way...
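
As an illustration of this sectioning rule, here is a hand-written sketch of the nesting one would expect for a body fragment containing h3, p, h4, p, h2, p. The content is made up for the example, namespaces are left out, and the class values follow the convention used by the stylesheet below:

```xml
<!-- Input order: h3, p, h4, p, h2, p (illustrative content, not from the post) -->
<body>
    <div class="h3">
        <h3>A</h3>
        <p>text under A</p>
        <!-- h4 is deeper than h3, so it opens a nested section -->
        <div class="h4">
            <h4>B</h4>
            <p>text under B</p>
        </div>
    </div>
    <!-- h2 is higher than h3, so it closes the h3 section; no h1 is needed -->
    <div class="h2">
        <h2>C</h2>
        <p>text under C</p>
    </div>
</body>
```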

I wrote this XSLT and it works fine, but it is probably not very efficient:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0" xmlns="http://www.w3.org/1999/xhtml">
    <xsl:output method="xhtml" indent="yes"/>

    <!--identity template -->
    <xsl:template match="@*|node()" >
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- body of html document -->
    <xsl:template match="body">
            <!-- start of recursion, defining the group by condition
on starting element -->
            <xsl:for-each-group
select="element()|comment()|processing-instruction()"
group-starting-with="*[local-name()='h1'
                or (local-name()='h2' and count(preceding-sibling::h1)=0)
                or (local-name()='h3' and count(preceding-sibling::h1
| preceding-sibling::h2 )=0)
                or (local-name()='h4' and count(preceding-sibling::h1
| preceding-sibling::h2  | preceding-sibling::h3 )=0)
                or (local-name()='h5' and count(preceding-sibling::h1
| preceding-sibling::h2  | preceding-sibling::h3 |
preceding-sibling::h4 )=0)
                or (local-name()='h6' and count(preceding-sibling::h1
| preceding-sibling::h2  | preceding-sibling::h3 |
preceding-sibling::h4 | preceding-sibling::h5 )=0)
                ]">
                <xsl:message><xsl:text>group (body)=
</xsl:text><xsl:value-of
select="current-group()/local-name()"/></xsl:message>
                <xsl:call-template name="gruppo"/>
            </xsl:for-each-group>
    </xsl:template>

    <!-- template with recursion -->
     <xsl:template name="gruppo">
             <xsl:choose>

             <!-- check whether the first element of the current group is an h*.
If so, the div is required -->
              <xsl:when test="local-name()='h1' or local-name()='h2'
or local-name()='h3' or local-name()='h4' or local-name()='h5' or
local-name()='h6'" >
                 <xsl:element name="div">

                    <!-- class attribute of the div element; the header's
local name is its value -->
                    <xsl:attribute name="class"><xsl:value-of
select="local-name()"/></xsl:attribute>

                    <!-- copy of the header tag  -->
                    <xsl:element name="{local-name()}">
                        <xsl:apply-templates select="@*|node()"/>
                    </xsl:element>

                    <!-- recursion applied to the current group,
removing the first, already processed, element -->
                     <xsl:for-each-group select="current-group()
except  ." group-starting-with="*[ (local-name()='h1'
                        or (local-name()='h2' and
count(preceding-sibling::h1)=0)
                        or (local-name()='h3' and
count(preceding-sibling::h1 | preceding-sibling::h2 )=0)
                        or (local-name()='h4' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 )=0)
                        or (local-name()='h5' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 | preceding-sibling::h4 )=0)
                        or (local-name()='h6' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 | preceding-sibling::h4 | preceding-sibling::h5
)=0) )
                        ]">
                                               <xsl:call-template
name="gruppo"/>
                    </xsl:for-each-group>
                </xsl:element>
        </xsl:when>
             <xsl:otherwise>
                 <!-- the first element is not an h*. The div element
is not required; simply copy this first element -->
                 <xsl:element name="{local-name()}">
                     <xsl:apply-templates select="@*|node()"/>
                 </xsl:element>

                    <!-- recursion applied to the current group,
removing the first, already processed, element -->
                 <xsl:for-each-group select="current-group() except
." group-starting-with="*[ (local-name()='h1'
                     or (local-name()='h2' and
count(preceding-sibling::h1)=0)
                     or (local-name()='h3' and
count(preceding-sibling::h1 | preceding-sibling::h2 )=0)
                     or (local-name()='h4' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 )=0)
                     or (local-name()='h5' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 | preceding-sibling::h4 )=0)
                     or (local-name()='h6' and
count(preceding-sibling::h1 | preceding-sibling::h2  |
preceding-sibling::h3 | preceding-sibling::h4 | preceding-sibling::h5
)=0) )
                     ]">
                     <xsl:call-template name="gruppo"/>
                 </xsl:for-each-group>
             </xsl:otherwise>
         </xsl:choose>
    </xsl:template>

    <!-- add some CSS to the HTML head, to visually check the
nested divs  -->
    <xsl:template match="head">
        <head>
        <style type="text/css">
            div.h1 {border: 1px solid; background-color: #FFFF33; width: 95%;}
            div.h2 {border: 1px solid; background-color: #FF9933; width: 95%;}
            div.h3 {border: 1px solid; background-color: #FF0033; width: 95%;}
            div.h4 {border: 1px solid; background-color: #99FFFF; width: 95%;}
            div.h5 {border: 1px solid; background-color: #9999FF; width: 95%;}
            div.h6 {border: 1px solid; background-color: #9933FF; width: 95%;}
        </style>
        <xsl:apply-templates/>
        </head>
    </xsl:template>

</xsl:stylesheet>

This works as expected on an XML file with lazy conditions, like this one:

<?xml version="1.0" encoding="UTF-8"?>
<html>
    <head>
        <title></title>
    </head>
    <body>
        <p>first line of text before the first heading</p>
        <p>second line of text before the first heading</p>
        <h3>h3 heading</h3>
        <p>text after the h3 heading</p>
        <h4>h4 heading</h4>

        <h6>h6 heading</h6>
        <p>this is the text after the h6 heading</p>

        <h2>h2 heading</h2>
        <p>this is the text after the first h2 heading</p>
        <h2>h2 heading</h2>
        <p>this is the text after the second h2 heading</p>
        <h4>h4 heading</h4>
        <p>this is the text after the h4 heading</p>

        <h1>heading 1B</h1>
        <p>this is the 1st text after <i>heading 1B</i></p>
        <p>this is the 2nd text after <i>heading 1B</i></p>
        <p>this is the 3rd text after <i>heading 1B</i></p>
    </body>
</html>

But I think the XSLT is not well written... For instance, I think it
could be improved by simply copying the <p> elements that appear in the
current group before the first h* element.
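
One possible way to do that is sketched below, as an untested illustration rather than a drop-in replacement. Instead of recursing one element at a time, each call finds the highest heading level present in the nodes it was given, splits on headings of that level, and copies any heading-free run of nodes in a single step. The function f:level, its namespace urn:example:sectioning, and the template name "sections" are hypothetical names introduced here. The sketch also assumes no-namespace input and output, and it keeps the body element, unlike the stylesheet above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:sectioning"
    exclude-result-prefixes="xs f">

    <xsl:output method="xml" indent="yes"/>

    <!-- identity template -->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- hypothetical helper: 1..6 for h1..h6, empty sequence for anything else -->
    <xsl:function name="f:level" as="xs:integer?">
        <xsl:param name="n" as="node()"/>
        <xsl:sequence select="if ($n instance of element()
                                  and matches(local-name($n), '^h[1-6]$'))
                              then xs:integer(substring(local-name($n), 2))
                              else ()"/>
    </xsl:function>

    <xsl:template match="body">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:call-template name="sections">
                <xsl:with-param name="nodes" select="node()"/>
            </xsl:call-template>
        </xsl:copy>
    </xsl:template>

    <xsl:template name="sections">
        <xsl:param name="nodes" as="node()*"/>
        <!-- smallest heading number in $nodes, i.e. the highest level present -->
        <xsl:variable name="top" as="xs:integer?"
            select="min(for $n in $nodes return f:level($n))"/>
        <xsl:choose>
            <!-- no headings at all: copy the whole run in one step -->
            <xsl:when test="empty($top)">
                <xsl:apply-templates select="$nodes"/>
            </xsl:when>
            <xsl:otherwise>
                <xsl:for-each-group select="$nodes"
                    group-starting-with="*[f:level(.) = $top]">
                    <xsl:choose>
                        <!-- group opened by a heading of the top level: wrap it -->
                        <xsl:when test="f:level(.) = $top">
                            <div class="h{$top}">
                                <xsl:apply-templates select="."/>
                                <xsl:call-template name="sections">
                                    <xsl:with-param name="nodes"
                                        select="current-group() except ."/>
                                </xsl:call-template>
                            </div>
                        </xsl:when>
                        <!-- leading group before the first top-level heading:
                             it holds only deeper headings or none, so recurse -->
                        <xsl:otherwise>
                            <xsl:call-template name="sections">
                                <xsl:with-param name="nodes" select="current-group()"/>
                            </xsl:call-template>
                        </xsl:otherwise>
                    </xsl:choose>
                </xsl:for-each-group>
            </xsl:otherwise>
        </xsl:choose>
    </xsl:template>

</xsl:stylesheet>
```

Passing the group explicitly as a parameter keeps the grouping pattern in one place, and each recursive call either sees a strictly deeper top level or no headings at all, so the recursion terminates.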

Any suggestions?

Thanks in advance...

--

Giuseppe Briotti
g.briotti@xxxxxxxxx

"Alme Sol, curru nitido diem qui
promis et celas aliusque et idem
nasceris, possis nihil urbe Roma
visere maius."
(Orazio)
