Re: HTML to XML

Subject: Re: HTML to XML
From: Cristobal Galiano Fernandez <c.galiano@xxxxx>
Date: Tue, 07 Dec 1999 12:07:00 +0100
Hi Marco!

A few days ago I found this reference.

<!-- html2xhtml.xsl:  HTML to XHTML  XSL stylesheet converter
  ========================================================

   Copyright 1999 David Carlisle NAG Ltd


  The following stylesheet takes as input an XSL stylesheet that writes
  HTML, and produces a stylesheet that writes XML that hopefully matches the
  XHTML specification. (It does not check that the output matches the DTD.)
  It does the following things:

  * Adds a DOCTYPE giving FPI and URL for one of the three flavours
    of XHTML1. (Transitional unless the original stylesheet asked for
    Frameset or Strict HTML.)

  * Writes all HTML elements and attributes as lowercase, with
    elements being written in the XHTML namespace.

  * Writes <BR> as <br class="html-compat"/>. (Appendix C recommends
    <br /> rather than <br/> but an XSL stylesheet has no control
    over the concrete syntax of the linearisation, so adding an attribute
    is probably the best that can be done. No attribute is added if
    the element already has attributes.)

  * Changes the output method from html to xml in xsl:output
    (and also in the xt:document extension element).

  * Forces a line break after opening tags of non empty elements to ensure
    that they are never written with XML empty syntax, so
    <p>
    </p>
    not
    <p/>

  * Copies any elements from XSL or XT namespaces through to the new
    stylesheet.

  -->

  <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                  xmlns:axsl="/dev/null"
                  xmlns:xt="http://www.jclark.com/xt"
                  xmlns:xhtml="http://www.w3.org/1999/xhtml"
                  version="1.0"
                  >
  <xsl:namespace-alias stylesheet-prefix="axsl" result-prefix="xsl"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="xsl:*|xt:*">
  <xsl:copy>
    <xsl:copy-of select="@*"/>
    <xsl:apply-templates/>
  </xsl:copy>
  </xsl:template>

  <xsl:template match="xsl:output|xt:document">
  <xsl:copy>
    <xsl:attribute name="method">xml</xsl:attribute>
    <xsl:choose>
    <xsl:when test="contains(@doctype-public,'Frameset')">
      <xsl:attribute name="doctype-public">
       <xsl:text>-//W3C//DTD XHTML 1.0 Frameset//EN</xsl:text>
      </xsl:attribute>
      <xsl:attribute name="doctype-system">
         <xsl:text>http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd</xsl:text>
      </xsl:attribute>
    </xsl:when>
    <xsl:when test="contains(@doctype-public,'Strict')">
      <xsl:attribute name="doctype-public">
       <xsl:text>-//W3C//DTD XHTML 1.0 Strict//EN</xsl:text>
      </xsl:attribute>
      <xsl:attribute name="doctype-system">
         <xsl:text>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</xsl:text>
      </xsl:attribute>
    </xsl:when>
    <xsl:otherwise>
      <xsl:attribute name="doctype-public">
       <xsl:text>-//W3C//DTD XHTML 1.0 Transitional//EN</xsl:text>
      </xsl:attribute>
      <xsl:attribute name="doctype-system">
         <xsl:text>http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd</xsl:text>
      </xsl:attribute>
    </xsl:otherwise>
    </xsl:choose>
    <xsl:attribute name="indent">yes</xsl:attribute>
    <xsl:copy-of select="@*[not(
        name(.)='method' or
        name(.)='doctype-public' or
        name(.)='doctype-system' or
        name(.)='indent'
        )  ]"/>
    <xsl:apply-templates/>
  </xsl:copy>
  </xsl:template>

  <xsl:template match="*|xsl:element">
  <xsl:variable name="n">
    <xsl:choose>
    <xsl:when test="self::xsl:element">
      <xsl:value-of  select="translate(@name,
                     'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                     'abcdefghijklmnopqrstuvwxyz')"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:value-of  select="translate(local-name(.),
                     'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                     'abcdefghijklmnopqrstuvwxyz')"/>
    </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <xsl:element
    name="{$n}"
     namespace="http://www.w3.org/1999/xhtml">
    <xsl:for-each select="self::*[not(self::xsl:element)]/@* |
                         self::xsl:element/@use-attribute-sets">
      <xsl:attribute name="{translate(local-name(.),
                     'ABCDEFGHIJKLMNOPQRSTUVWXYZ',
                     'abcdefghijklmnopqrstuvwxyz')}">
       <xsl:value-of select="."/>
      </xsl:attribute>

    </xsl:for-each>
    <xsl:choose>
    <xsl:when test="not(@*) and ($n='br' or $n='hr')">
      <xsl:attribute name="class">html-compat</xsl:attribute>
    </xsl:when>
    <xsl:otherwise>
     <xsl:text>&#xA;</xsl:text>
    <xsl:apply-templates/>
    </xsl:otherwise>
    </xsl:choose>
  </xsl:element>
  </xsl:template>


  </xsl:stylesheet>
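
To try it out, something like the following could be used as input. This is only a small made-up stylesheet that writes HTML, not part of David's file; the source element names (doc, title) are assumptions for illustration.

```xml
<?xml version="1.0"?>
<!-- Hypothetical input for html2xhtml.xsl: a stylesheet that writes HTML.
     The doc/title source elements are invented for this example. -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <xsl:output method="html"
              doctype-public="-//W3C//DTD HTML 4.0 Transitional//EN"/>

  <xsl:template match="/">
    <HTML>
      <BODY>
        <!-- Upper-case element and attribute names, as in old-style HTML -->
        <P ALIGN="center"><xsl:value-of select="/doc/title"/></P>
        <BR/>
      </BODY>
    </HTML>
  </xsl:template>

</xsl:stylesheet>
```

Running html2xhtml.xsl over this file with an XSLT 1.0 processor should give a new stylesheet, if I read the templates correctly: xsl:output gets method="xml" plus the XHTML 1.0 Transitional doctype attributes, HTML, BODY and P become html, body and p in the XHTML namespace, ALIGN becomes align, and the attribute-less BR becomes br with class="html-compat". The xsl:template and xsl:value-of elements are copied through unchanged.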

Marco.Mistroni@xxxxxxxxx wrote:

> hi all,
>         has anyone written a stylesheet for converting HTML to XML? I would
> be interested in exchanging some ideas.
> with best regards
>         marco
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

