RE: [xsl] pretty-printing XML into HTML

Subject: RE: [xsl] pretty-printing XML into HTML
From: "Lars Huttar" <lars_huttar@xxxxxxx>
Date: Wed, 17 Dec 2003 17:16:08 -0600
Hi all,
In case you're interested, here's my solution.
I was having a dickens of a time getting namespace
declarations to appear in appropriate places
(i.e. where and only where necessary).
I was thinking I'd have to build up association lists of namespaces
needed and namespaces currently in scope, and pass them to
recursive template calls... then pondering how to
construct and parse such lists as strings, and not wanting to have
to mess with escaping
spaces within URIs... Or should I use the node-set() extension
to build association lists, thus sacrificing some portability,
but saving myself a lot of headache and making things more efficient?

Finally I realized (duh!) I could use the namespace nodes provided
in the source tree! I didn't need to keep my own lists of what
had already been declared.
I.e. for each source tree element being serialized, I could just produce
a namespace declaration for each namespace node (of the current element)
that wasn't a duplicate of one of the current element's parent's
namespace nodes.
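
A minimal sketch of that check in isolation, assuming nothing beyond XSLT 1.0: it is not the full solution, only a diagnostic that lists, for each element, the namespace nodes its parent does not already carry with the same prefix and URI. The text output format is an illustrative assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text" />

  <!-- For every element, report each namespace node (other than the implicit
       'xml' one) that has no counterpart with the same prefix and URI on the
       parent. These are the declarations a serializer has to write out. -->
  <xsl:template match="/">
    <xsl:for-each select="//*">
      <!-- The parent of the document element is the root node, which has no
           namespace nodes, so everything in scope there gets reported. -->
      <xsl:variable name="parent-nss" select="../namespace::*" />
      <xsl:for-each select="namespace::*[name() != 'xml']">
        <xsl:variable name="ns-prefix" select="name()" />
        <xsl:variable name="ns-uri" select="string(.)" />
        <xsl:if test="not($parent-nss[name() = $ns-prefix and string(.) = $ns-uri])">
          <!-- name(..) is the element that owns this namespace node -->
          <xsl:value-of select="concat(name(..), ': ', $ns-prefix, ' = ', $ns-uri, '&#10;')" />
        </xsl:if>
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

One case this comparison does not seem to cover is an undeclared default namespace (xmlns=""): the child then simply has no default namespace node, so there is nothing to compare against the parent's.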

So, for anyone who wants to serialize XML to HTML using XSL, my
solution is below.
(If you want, you could make indent-increment and ns-decl-extra-indent
into parameters of the first template, instead of variables.)

Regards,
Lars


<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes" />

  <!-- escape-xml mode: serialize XML tree to text, with indent
    Based very loosely on templates by Wendell Piez -->

  <xsl:variable name="nl"><xsl:text>&#10;</xsl:text></xsl:variable>
  <xsl:variable name="indent-increment" select="'  '" />
  <xsl:variable name="ns-decl-extra-indent" select="'     '" />

  <xsl:template match="*" mode="escape-xml">
    <xsl:param name="indent-string" select="$indent-increment" />
    <!-- true if this is the top of the tree being serialized -->
    <xsl:param name="is-top" select="'true'" />
    <xsl:param name="exclude-prefixes" select="''" /> <!-- ns-prefixes to avoid declaring -->

    <xsl:value-of select="$indent-string" />
    <xsl:call-template name="write-starttag">
      <xsl:with-param name="is-top" select="$is-top" />
      <xsl:with-param name="indent-string" select="$indent-string" />
      <xsl:with-param name="exclude-prefixes" select="$exclude-prefixes" />
    </xsl:call-template>
    <xsl:if test="*"><xsl:value-of select="$nl" /></xsl:if>
    <xsl:apply-templates mode="escape-xml">
      <xsl:with-param name="indent-string" select="concat($indent-string, $indent-increment)" />
      <xsl:with-param name="is-top" select="'false'" />
    </xsl:apply-templates>
    <xsl:if test="*"><xsl:value-of select="$indent-string" /></xsl:if>
    <xsl:if test="*|text()|comment()|processing-instruction()">
      <xsl:call-template name="write-endtag" />
    </xsl:if>
    <xsl:value-of select="$nl" />
  </xsl:template>

  <xsl:template name="write-starttag">
    <xsl:param name="is-top" select="'false'" />
    <xsl:param name="exclude-prefixes" select="''" /> <!-- ns-prefixes to avoid declaring -->
    <xsl:param name="indent-string" select="''" />

    <xsl:text>&lt;</xsl:text>
    <xsl:value-of select="name()"/>
    <xsl:for-each select="@*">
     <xsl:call-template name="write-attribute"/>
    </xsl:for-each>
    <xsl:call-template name="write-namespace-declarations">
      <xsl:with-param name="is-top" select="$is-top" />
      <xsl:with-param name="exclude-prefixes" select="$exclude-prefixes" />
      <xsl:with-param name="indent-string" select="$indent-string" />
    </xsl:call-template>
    <xsl:if test="not(*|text()|comment()|processing-instruction())"> /</xsl:if>
    <xsl:text>></xsl:text>
  </xsl:template>

  <xsl:template name="write-endtag">
     <xsl:text>&lt;/</xsl:text>
     <xsl:value-of select="name()"/>
     <xsl:text>></xsl:text>
  </xsl:template>

  <xsl:template name="write-attribute">
     <xsl:text> </xsl:text>
     <xsl:value-of select="name()"/>
     <xsl:text>="</xsl:text>
     <xsl:value-of select="."/>
     <xsl:text>"</xsl:text>
  </xsl:template>

  <!-- Output namespace declarations for the current element. -->
  <!-- Assumption: if an attribute in the source tree uses a particular namespace, its parent
   element will have a namespace node for that namespace (because the declaration for the
   namespace must be on the parent element or one of its ancestors). -->
  <xsl:template name="write-namespace-declarations">
    <xsl:param name="is-top" select="'false'" />
    <xsl:param name="indent-string" select="''" />
    <xsl:param name="exclude-prefixes" select="''" />

    <xsl:variable name="current" select="." />
    <xsl:variable name="parent-nss" select="../namespace::*" />
    <xsl:for-each select="namespace::*">
      <xsl:variable name="ns-prefix" select="name()" />
      <xsl:variable name="ns-uri" select="string(.)" />
      <xsl:if test="not(contains(concat(' ', $exclude-prefixes, ' xml '), concat(' ', $ns-prefix, ' ')))
                  and ($is-top = 'true' or not($parent-nss[name() = $ns-prefix and string(.) = $ns-uri]))">
        <!-- This namespace node doesn't exist on the parent, at least not with that URI,
          so we need to add a declaration. -->
        <!--
          We could add the test
              and ($ns-prefix = '' or ($current//.|$current//@*)[substring-before(name(), ':') = $ns-prefix])
          i.e. "and it's used by this element or some descendant (or descendant-attribute) thereof".
          The only problem with the above test is that sometimes namespace declarations are needed
          even though they're not used by a descendant element or attribute: e.g. if the input is
          a stylesheet, prefixes have to be declared if they're used in XPath expressions [which
          are in attribute values]. We could have problems in this area with regard to xsp-request.
        -->
        <xsl:value-of select="concat($nl, $indent-string, $ns-decl-extra-indent)" />
        <xsl:choose>
          <xsl:when test="$ns-prefix = ''">
            <xsl:value-of select="concat('xmlns=&quot;', $ns-uri, '&quot;')" />
          </xsl:when>
          <xsl:otherwise>
            <xsl:value-of select="concat('xmlns:', $ns-prefix, '=&quot;', $ns-uri, '&quot;')" />
          </xsl:otherwise>
        </xsl:choose>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
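
As a usage sketch only: the quoted message below describes importing this stylesheet and showing each fragment under an <h2> heading inside a <pre>. A driver along those lines might look like the following. The file name escape-xml.xsl, the URI bound to the source: prefix, and the page layout are assumptions, not taken from the original setup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:source="http://apache.org/cocoon/source/1.0"
    exclude-result-prefixes="source">
  <!-- Assumption: the source: URI above must match whatever the input
       document actually binds to that prefix. -->

  <!-- Hypothetical file name for the stylesheet containing mode="escape-xml" -->
  <xsl:import href="escape-xml.xsl" />

  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="//source:write" />
      </body>
    </html>
  </xsl:template>

  <!-- One heading plus one serialized fragment per source:write element -->
  <xsl:template match="source:write">
    <!-- Prints the whole target URI; trim it here if only part is wanted -->
    <h2><xsl:value-of select="source:source" /></h2>
    <pre>
      <xsl:apply-templates select="source:fragment/*" mode="escape-xml">
        <!-- The source: binding is in scope on the fragment's children but
             belongs to the wrapper, so keep it out of the printed output. -->
        <xsl:with-param name="exclude-prefixes" select="'source'" />
      </xsl:apply-templates>
    </pre>
  </xsl:template>
</xsl:stylesheet>
```

The exclude-prefixes parameter takes a space-separated list of prefixes, so more wrapper namespaces can be suppressed the same way. It only matters for the top element of each fragment, because descendants are compared against their parent's namespace nodes.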


> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of Lars Huttar
> Sent: Wednesday, December 17, 2003 2:50 PM
> To: XSL-List (E-mail)
> Subject: [xsl] pretty-printing XML into HTML
>
>
> (Apologies if you got this twice. I didn't see any responses
> to this message since I sent it yesterday so I wanted to make sure it
> got out.)
>
> Hi all,
>
> Yes, I know pretty-printing XML is an FAQ. I've looked at
> http://www.dpawson.co.uk/xsl/sect2/pretty.html,
> but this all seems to refer to copying an XML tree to XML
> with indentation.
>
> My requirement is this.
> I have an XML source document that includes certain fragments
> (subtrees) that are generated XML, each to be written to a file on the
> file system. Each fragment is contained in a
> <source:fragment> element,
> for processing by Cocoon's SourceWritingTransformer.
> This all works fine.
>
> However, we want to have an alternative view of this XML
> source document,
> which lets the user view the collection of generated files in
> one HTML browser view. E.g. this view (generated by a stylesheet)
> puts the name of each output filename in an <h2> element,
> then "serializes" (i.e. pretty-prints) the corresponding XML fragment
> into HTML. E.g. the data
>  <source:write>
>   <source:source>context:/mount/gem/enterprise/index.xsp</source:source>
>   <source:fragment>
>    <xsp:page>
>     <index-page>
>      <page-set role="focal objects">
>       <page-ref name="select_Ethnologue_Country"
> label="Ethnologue Country" />
>       ...
>      </page-set>
>     </index-page>
>    </xsp:page>
>   </source:fragment>
>  </source:write>
>
> is transformed to:
>
> <h2>enterprise/index.xsp</h2>
>
> <pre>  & lt;xsp:page& gt;
>     & lt;index-page& gt;
>       & lt;page-set role="focal objects"& gt;
>         & lt;page-ref name="select_Ethnologue_Country"
> label="Ethnologue Country" /& gt;
>         ...
>       & lt;/page-set& gt;
>     & lt;/index-page& gt;
>   & lt;/xsp:page& gt;
> </pre>
>
> (I put spaces after ampersands above, hoping this would avoid
> mailer munging.)
> (Notice I haven't yet handled namespace declarations, e.g. for xsp:.)
>
> First of all, am I doing something fundamentally wrong here design-
> wise? Does it not make sense to have XML data that you'd want
> to both treat as XML data, and make visible to a browser user?
> Am I wrongly mixing metaphors of text and markup?
> If I should be taking a different approach, advice would be
> appreciated!
>
> Assuming that I'm on the right track... this pretty-printing XML to
> HTML is a fair bit of work (especially getting the namespace
> declarations in optimal places). I know Tidy is supposed to do
> a good job at this, but don't think you can integrate Tidy
> into the operation of a stylesheet, i.e. process subtrees of your
> source document with Tidy and graft the results into your result
> tree. (If you can do this in Cocoon, without reducing portability,
> I'd be interested to hear about it.)
>
> So... if this is a "normal" thing to do, hopefully somebody has
> already written a pretty good XML-to-HTML pretty-printer in XSL?
> It seemed to me there was an extension function in Saxon or in
> XSLT 2.0 that would serialize a source tree fragment, hopefully
> with indentation, but I can't find a reference to it right now.
> Any ideas?
>
> Below is the stylesheet I import for writing out the XML as
> HTML (template with mode="escape-xml").
>
> Thanks,
> Lars
>


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

