Re: [xsl] How to Remove Empty or Unpopulated Attributes When Copying a File

From: Evan Lenz <evan@xxxxxxxxxxxx>
Date: Sat, 01 Apr 2006 21:57:14 -0800

Hi Greg,

The example on that web page is misguided. While it does effectively perform an identity transformation, you can't simply add a template rule to customize the processing of attributes. The rule is defined in such a way that attributes are *always* copied. You can't override this behavior, since templates are never applied to attributes. (The @* in the match pattern has no effect, so its presence is just misleading.)
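
For illustration, a rule with that problem tends to look something like the sketch below. This is my reconstruction of the general shape, not a quotation from the page, so treat the details as an assumption:

<xsl:template match="@* | node()">
 <xsl:copy>
   <!-- attributes are copied unconditionally; no template rule is consulted -->
   <xsl:copy-of select="@*"/>
   <!-- no select attribute, so templates are applied to child nodes only -->
   <xsl:apply-templates/>
 </xsl:copy>
</xsl:template>

Because nothing in such a rule ever applies templates to attributes, an extra rule for attributes added alongside it (say, an empty template matching a hypothetical @foo) would never fire.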

Here's the true identity transform rule:

<xsl:template match="@* | node()">
 <xsl:copy>
   <xsl:apply-templates select="@* | node()"/>
 </xsl:copy>
</xsl:template>

The select="@* | node()" part ensures that templates are applied to attributes as well as to child nodes, so template rules decide how each attribute is processed (rather than every attribute being copied unconditionally with <xsl:copy-of>).

When the rule gets invoked for an attribute, it copies the attribute, and, since attributes can't have attributes or child nodes, the xsl:apply-templates doesn't do anything, but it's perfectly fine (not an error) to have it there. Thus, a single rule can conveniently apply to both attributes and elements (and the other child nodes--text nodes, comments, and processing instructions).

Now that I re-read your email, it sounds like you may have already gotten this far.

To remove attributes that don't have any non-whitespace characters, just add this higher-priority rule to the mix:

<xsl:template match="@*[not(normalize-space(.))]"/>

The normalize-space() function takes a string, strips all leading and trailing whitespace, and collapses each remaining run of whitespace characters to a single space. For a string consisting only of whitespace, it returns the empty string. The not() function converts its argument to a boolean and returns the complement (true if false, false if true). An empty string converts to boolean false, so the predicate (everything inside []) is true whenever the attribute's value is empty or consists only of whitespace characters. Finally, the template rule itself is empty (it does nothing), so matching attributes are simply left out of the result.
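
Putting the two rules together, a complete stylesheet might look like the following sketch. The xsl:stylesheet wrapper and the sample document are my own additions for illustration, not something taken from your files:

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- identity rule: copy every node, letting more specific rules override -->
 <xsl:template match="@* | node()">
   <xsl:copy>
     <xsl:apply-templates select="@* | node()"/>
   </xsl:copy>
 </xsl:template>

 <!-- drop attributes whose value is empty or whitespace-only -->
 <xsl:template match="@*[not(normalize-space(.))]"/>

</xsl:stylesheet>

Given a hypothetical input such as

<item id="42" note="" flag="   ">
 <name lang="en">Widget</name>
</item>

the note and flag attributes should be dropped, while id and lang are copied along with everything else (ignoring the XML declaration the processor may add):

<item id="42">
 <name lang="en">Widget</name>
</item>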

I hope this helps!

Evan Lenz

P.S. You can find a succinct but comprehensive description of how the processing model for template rules works in the free PDF sample chapter for my XSLT 1.0 Pocket Reference, linked to on this page: http://www.oreilly.com/catalog/xsltpr/



floatingisland@xxxxxxx wrote:
Hi,

I'm trying to copy everything in an XML file except unpopulated attributes or attributes containing only whitespace. So far, I have not been successful, although I did manage to copy a file and delete all the attributes using a modified version of the style sheet found here: http://www.abbeyworkshop.com/howto/xslt/copy_xml_document/index.html

I've tried variations of other examples, but so far no success. As a side note, does it make a difference in this case if the attributes (empty or populated) have namespaces, beyond including the namespaces in the XSLT style sheet?

If it makes a difference, I'm using the XSLT parser in PHP5, which I believe is the libxml parser. Any assistance you could provide would be helpful.

Thanks,

Greg
