Re: [xsl] Validation XSLT using XSLT 1.0

Subject: Re: [xsl] Validation XSLT using XSLT 1.0
From: Abel Braaksma <>
Date: Thu, 03 Jul 2008 07:34:31 +0200
Ganesh Babu N wrote:
Dear All,

I am writing a validation style sheet. I am stuck at the following point
in my validation. Can anyone point me to a solution?

1. I want to validate the @picfile value is equal to entity name and
tiff images must be equal to the actual images file name on the image

My question is how to get the entity name and value into a test.

<!ENTITY I9780073379470_A_TB004 SYSTEM "9780073379470_A_TB004.tif" NDATA TIFF>

<graphic picfile="I9780073379470_A_TB004"/>

You cannot test for the existence of a file without extension functions. In XSLT 2.0 you could use unparsed-text-available(), but only if the target does not contain anything illegal in XML 1.0 or XML 1.1 (meaning: no nulls, which is unlikely for a picture).

You can use the XSLT 1.0 function unparsed-entity-uri() to get the URI of the entity.
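
A minimal sketch of how unparsed-entity-uri() could be used for the first check. It assumes, based only on the single example in the question, that the entity name is the letter "I" followed by the file name without its ".tif" extension. It compares only the tail of the URI, because processors generally return an absolute URI. Treat the naming rule and the message texts as placeholders.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="graphic">
    <!-- Empty string if no unparsed entity with this name is declared -->
    <xsl:variable name="uri" select="unparsed-entity-uri(@picfile)"/>
    <!-- ASSUMPTION: entity name = 'I' + file name without '.tif' -->
    <xsl:variable name="expected"
        select="concat(substring(@picfile, 2), '.tif')"/>
    <!-- XSLT 1.0 has no ends-with(), so cut off the tail of the URI by hand -->
    <xsl:variable name="tail"
        select="substring($uri,
                string-length($uri) - string-length($expected) + 1)"/>
    <xsl:choose>
      <xsl:when test="$uri = ''">
        <xsl:message>No unparsed entity declared for picfile '<xsl:value-of
            select="@picfile"/>'</xsl:message>
      </xsl:when>
      <xsl:when test="$tail != $expected">
        <xsl:message>Entity '<xsl:value-of select="@picfile"/>' points to '<xsl:value-of
            select="$uri"/>', expected a file named '<xsl:value-of
            select="$expected"/>'</xsl:message>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <!-- Suppress the default copying of text nodes -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

This only checks the declared system identifier against the attribute value. Whether the file actually exists on disk still cannot be tested in plain XSLT 1.0.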

It is not possible to access the DTD data (the entity declaration itself) without resorting to extension functions. That is because the DTD is processed by the XML parser before the document gets to the XSLT processor.

2. In the XML file we are using named entities, e.g. &nbsp; &acute;,
instead of &#123; or &#x123;. In XSLT, how to find if the XML file
contains entities other than named entities?

You mean, how to find numeric character references? That is not possible: an entity or character reference, be it numeric or named, is transparent to XSLT, which only sees the character(s) it represents. Again, that is because they are resolved before the document gets to the XSLT processor.

There is a (very complex) workaround in XSLT 2.0 using unparsed-text() on the source, but that requires a hell of a lot of extra processing.
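
To give an idea of the core of such a workaround, here is a naive XSLT 2.0 sketch. The parameter name source-uri and the named template main are hypothetical. It reads the source as plain text and reports anything that looks like a numeric character reference. It does not tell references in element content apart from look-alikes inside comments or CDATA sections, which is part of what makes a real solution complex.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: URI of the XML file to scan as raw text -->
  <xsl:param name="source-uri"/>

  <xsl:template name="main">
    <xsl:variable name="raw" select="unparsed-text($source-uri)"/>
    <!-- Matches decimal (&#123;) and hexadecimal (&#x123;) references -->
    <xsl:analyze-string select="$raw"
        regex="&amp;#(x[0-9A-Fa-f]+|[0-9]+);">
      <xsl:matching-substring>
        <xsl:message>Numeric character reference found: <xsl:value-of
            select="."/></xsl:message>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>

</xsl:stylesheet>
```

The stylesheet would be started at the named template, with source-uri supplied by the caller. How to do that depends on the processor.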

3. How to find non-ascii characters in the XML file and report an
error using XSLT.

You cannot. If characters in the source do not exist in the expected character set, the XML processor will error out (not the XSLT processor).

If the source is correct in terms of codepage and/or all non-ascii characters (assuming you mean 7-bit US-ASCII here???) are correctly escaped as entities, then you could do something like the following in XSLT 2.0 (albeit quite resource intensive):

<xsl:template match="text()[string-to-codepoints(.)[. gt 127]]">
  <xsl:message terminate="yes">Not Ascii!!</xsl:message>
</xsl:template>
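
Since the question is about XSLT 1.0, where string-to-codepoints() is not available, a rough equivalent is to strip every allowed character with translate() and see whether anything is left. This is only a sketch: the variable name ascii is made up, and the allowed set (tab, newline, carriage return and the printable range from space to tilde) is an assumption about what "ASCII" should mean here. The test sits inside the template because XSLT 1.0 does not allow variable references in match patterns.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- ASSUMPTION: tab, LF, CR plus printable US-ASCII (space to tilde) -->
  <xsl:variable name="ascii">&#x9;&#xA;&#xD; !"#$%&amp;'()*+,-./0123456789:;&lt;=&gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~</xsl:variable>

  <xsl:template match="text()">
    <!-- Whatever survives the translate() is outside the allowed set -->
    <xsl:variable name="rest" select="translate(., $ascii, '')"/>
    <xsl:if test="$rest != ''">
      <xsl:message terminate="yes">Not Ascii: <xsl:value-of
          select="$rest"/></xsl:message>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Like the 2.0 version, this only looks at text nodes. Attribute values would need their own template and an explicit apply-templates on @*.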

4. How to find double enters in the XML file and report an error using XSLT.

What is an "enter"? Perhaps you mean a newline? In that case, this fourth one is trivial:

<xsl:template match="text()[contains(., '&#xA;&#xA;')]">
  <xsl:message terminate="yes">double 'enter'</xsl:message>
</xsl:template>

I was able to write around 45 validation points using XSLT but got
stuck on these 4. Please help me resolve these issues.

Some things are not possible to validate, but you shouldn't want to either: if they are wrong, the document will not get past the XML processor, period. Working on entities and entity references should not be your concern. The XSLT processor will choose whether or not to use entities in the output depending on the encoding you have chosen, and these need not be the same named entities as in the XML input.

-- Abel --
