Re: [xsl] XHTML DTD aware transformation and indentation behaviour

Subject: Re: [xsl] XHTML DTD aware transformation and indentation behaviour
From: Ganesh Babu N <nbabuganesh@xxxxxxxxx>
Date: Thu, 2 Feb 2012 16:36:49 +0530
Dear Matthieu,

You can achieve this by downloading all the modules of xhtml11.dtd,
placing them in a local directory, resolving them through catalogs, and
changing indent to "yes", which lays out your XHTML output as an
indented tree. There is no need to comment out the DOCTYPE in the
source file.
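
As a rough sketch of such a catalog, assuming the DTD and its modules
were saved under a hypothetical local directory dtd/xhtml11/ next to the
catalog file (the directory name and the second rewrite rule are my
assumptions; please check which system identifiers your copy of
xhtml11.dtd actually uses for its modules):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <!-- Map the public identifier from the DOCTYPE to the local driver DTD.
       The path dtd/xhtml11/ is hypothetical. -->
  <public publicId="-//W3C//DTD XHTML 1.1//EN"
          uri="dtd/xhtml11/xhtml11.dtd"/>
  <!-- Redirect any system identifier under the W3C DTD directory
       to the local copies. -->
  <rewriteSystem systemIdStartString="http://www.w3.org/TR/xhtml11/DTD/"
                 rewritePrefix="dtd/xhtml11/"/>
  <!-- Assumption: the modules are referenced under this second base URL;
       adjust it to whatever your xhtml11.dtd declares. -->
  <rewriteSystem systemIdStartString="http://www.w3.org/MarkUp/DTD/"
                 rewritePrefix="dtd/xhtml11/"/>
</catalog>
```

The Apache resolver classes you already pass to Saxon should pick this
file up if it is listed in CatalogManager.properties on the classpath, or
named through the xml.catalog.files system property. In the stylesheet,
the only change is indent="no" becoming indent="yes" on xsl:output.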

Regards,
Ganesh


On Thu, Feb 2, 2012 at 4:18 PM, Matthieu Ricaud-Dussarget
<matthieu.ricaud@xxxxxxxxx> wrote:
> Hi all,
>
> In my project I concatenate multiple XHTML files into one XML file. This
> aggregate file has to be edited by hand, which means indentation is
> important here for convenience.
>
> Before I discovered XML Catalog, I used to delete all DOCTYPE declarations
> within the source XHTML files with a Perl script (which also replaced named
> entities with UTF-8 characters). This worked fine: the concatenated files
> were indented exactly like the XHTML sources.
>
> But this was a bit dangerous in case I missed a special entity that the
> Perl script should have replaced. And it was not really good XML practice.
>
> Now I'm using a local XML Catalog and run my transformation with Saxon
> on the command line with these options:
> -r:org.apache.xml.resolver.tools.CatalogResolver
> -x:org.apache.xml.resolver.tools.ResolvingXMLReader
> -y:org.apache.xml.resolver.tools.ResolvingXMLReader
>
> Now to the problem. My XSL is a simple identity template:
>
> <xsl:output method="xhtml" indent="no" encoding="UTF-8"
> omit-xml-declaration="no" doctype-public="-//W3C//DTD XHTML 1.1//EN"
> doctype-system="http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"/>
>
> <xsl:template match="* | @* | processing-instruction() | comment()"
> mode="copy">
> <xsl:copy copy-namespaces="no">
> <xsl:apply-templates select="node()|@*" mode="copy"/>
> </xsl:copy>
> </xsl:template>
>
> <xsl:template match="/">
> <xsl:apply-templates mode="copy"/>
> </xsl:template>
>
> This is my XML source:
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head>
> <title>title</title>
> <link href="my.css" rel="stylesheet" type="text/css" />
> <script type="text/javascript" src="my.js"></script>
> </head>
> <body>
> <div class="body">
> <div class="pageTitre_container">
> <h1>
> <span>Title 1</span>
> </h1>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> </div>
> </div>
> <table>
> <caption>This is a table</caption>
> <thead>
> <tr>
> <td>Col 1</td>
> <td>Col 2</td>
> <td>Col 3</td>
> <td>Col 4</td>
> <td>Col 5</td>
> </tr>
> </thead>
> <tbody>
> <tr>
> <td> </td>
> <td colspan="3" rowspan="7">
> <p class="entitre-en-savoir-">À savoir</p>
> <p class="no">
> <span class="no-style-override-5">Certains grands magasins proposent des
> comparatifs très complets, prenez le temps de les parcourir. Vous pouvez
> également chercher des infos sur Internet via les sites des fabricants, ou
> sur les forums&#160;: rien ne vaut l'avis d'un consommateur pour se faire
> une idée précise du produit&#160;!</span>
> </p>
> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> <td> </td>
> <td> </td>
> <td> </td>
> </tr>
> </tbody>
> </table>
> </body>
> </html>
>
> This gives the following output:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE html
>  PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml"><head><meta
> http-equiv="Content-Type" content="text/html; charset=UTF-8"
> /><title>title</title><link href="my.css" rel="stylesheet" type="text/css"
> /><script type="text/javascript" src="my.js"></script></head><body><div
> class="body">
> <div class="pageTitre_container">
> <h1>
> <span>Title 1</span>
> </h1>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> </div>
> </div><table><caption>This is a table</caption><thead><tr><td>Col
> 1</td><td>Col 2</td><td>Col 3</td><td>Col 4</td><td>Col
> 5</td></tr></thead><tbody><tr><td> </td><td colspan="3" rowspan="7">
> <p class="entitre-en-savoir-">À savoir</p>
> <p class="no">
> <span class="no-style-override-5">Certains grands magasins proposent des
> comparatifs très complets, prenez le temps de les parcourir. Vous pouvez
> également chercher des infos sur Internet via les sites des fabricants, ou
> sur les forums : rien ne vaut l'avis d'un consommateur pour se faire une
> idée précise du produit !</span>
> </p>
> </td><td> </td></tr><tr><td> </td><td> </td></tr><tr><td> </td><td>
> </td></tr><tr><td> </td><td> </td></tr><tr><td> </td><td>
> </td></tr><tr><td>
> </td><td> </td></tr><tr><td> </td><td> </td></tr><tr><td> </td><td>
> </td><td> </td><td> </td><td> </td></tr></tbody></table></body></html>
>
> If I comment out the DOCTYPE in the source, I get:
>
> <?xml version="1.0" encoding="UTF-8"?><!--<!DOCTYPE html PUBLIC
> "-//W3C//DTD XHTML 1.1//EN"
> "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">-->
> <!DOCTYPE html
>  PUBLIC "-//W3C//DTD XHTML 1.1//EN"
> "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
> <html xmlns="http://www.w3.org/1999/xhtml">
> <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
> <title>title</title>
> <link href="my.css" rel="stylesheet" type="text/css" />
> <script type="text/javascript" src="my.js"></script>
> </head>
> <body>
> <div class="body">
> <div class="pageTitre_container">
> <h1>
> <span>Title 1</span>
> </h1>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> <p><span class="big">This</span> is <span class="little">a
> paragraphe</span></p>
> </div>
> </div>
> <table>
> <caption>This is a table</caption>
> <thead>
> <tr>
> <td>Col 1</td>
> <td>Col 2</td>
> <td>Col 3</td>
> <td>Col 4</td>
> <td>Col 5</td>
> </tr>
> </thead>
> <tbody>
> <tr>
> <td> </td>
> <td colspan="3" rowspan="7">
> <p class="entitre-en-savoir-">À savoir</p>
> <p class="no">
> <span class="no-style-override-5">Certains grands magasins proposent des
> comparatifs très complets, prenez le temps de les parcourir. Vous pouvez
> également chercher des infos sur Internet via les sites des fabricants, ou
> sur les forums : rien ne vaut l'avis d'un consommateur pour se faire une
> idée précise du produit !</span>
> </p>
> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> </tr>
> <tr>
> <td> </td>
> <td> </td>
> <td> </td>
> <td> </td>
> <td> </td>
> </tr>
> </tbody>
> </table>
> </body>
> </html>
>
>
> The head element is now indented and so is the table. This is what I would
> like, but I don't want to comment out the DOCTYPE in the source.
>
> Does this have something to do with the XHTML DTD model? Any idea how to
> achieve what I'd like?
>
> Thanks,
>
> Matthieu.
