Re: [xsl] XSLT Language Grammar

Subject: Re: [xsl] XSLT Language Grammar
From: Mike Brown <mike@xxxxxxxx>
Date: Tue, 13 May 2003 03:03:18 -0600 (MDT)
Fatih TURKMEN wrote:
> My purpose is to write an XSLT processor for my senior
> project. I am using predictive parsing to parse XPath
> and XSLT expressions (statements). In the W3C specifications
> the XPath grammar is given. I have eliminated left
> recursion in this grammar and am still working on my
> parser. But it is difficult to extract the XSLT grammar
> from the specification. So I need an XSLT grammar for
> my XSLT parser.

I hope you're not trying to reinvent the wheel of XML parsing. XSLT is a
namespace-aware "application of" XML, intended to be processed after all the
syntactic sugar has been interpreted and abstracted away by an XML parser.
There are still some grammatical concerns within the logical stylesheet tree,
but the validation required is in general too complex to be handled in a parser.
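To illustrate what "processed after an XML parser" means in practice, here is a minimal sketch in Python (my choice of language, not anything the spec requires) that lets the standard library's namespace-aware parser do the syntactic work. The helper names `xslt_local_name` and `top_level_names` and the two sample stylesheets are made up for the example. The point is only that an XSLT element is recognized by its namespace URI, not by its prefix:

```python
import xml.etree.ElementTree as ET

# The XSLT 1.0 namespace URI; the prefix bound to it is arbitrary.
XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

def xslt_local_name(elem):
    """Return the local name if elem is in the XSLT namespace, else None."""
    prefix = "{" + XSLT_NS + "}"
    tag = elem.tag
    if isinstance(tag, str) and tag.startswith(prefix):
        return tag[len(prefix):]
    return None

# Two hypothetical stylesheets that differ only in prefix and in the
# choice between the synonyms "stylesheet" and "transform".
SHEET_A = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><out/></xsl:template>
</xsl:stylesheet>"""

SHEET_B = """<t:transform version="1.0"
    xmlns:t="http://www.w3.org/1999/XSL/Transform">
  <t:template match="/"><out/></t:template>
</t:transform>"""

def top_level_names(source):
    """Parse the document and list the XSLT local names of the
    document element and its element children."""
    root = ET.fromstring(source)
    return [xslt_local_name(root)] + [xslt_local_name(child) for child in root]

if __name__ == "__main__":
    print(top_level_names(SHEET_A))
    print(top_level_names(SHEET_B))
```

Nothing here deals with angle brackets, entities, or namespace declarations; all of that is already resolved by the time the processor sees the tree.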

It can be said that the XSLT 1.0 grammar, at a syntactic level, is simply that
of XML 1.0 plus Namespaces in XML 1.0. You might be able to extrapolate a bit
further and recognize attribute values that are XPath expressions, QNames,
Attribute Value Templates, URIs, etc., but 1. these details aren't summarized
anywhere, and certainly aren't summarized as EBNF productions; you just have
to pore over the spec to find the relevant prose; and 2. this is typically
handled not at parse time but in a separate stylesheet
preparation/"compilation" step after the stylesheet tree has been built.
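As a rough idea of what such a preparation step might do with one attribute, the following sketch splits an Attribute Value Template into literal text and embedded expressions. It is a simplified illustration, not a conforming implementation: the function name `split_avt` is invented, the expression strings it returns would still have to go to your XPath parser, and knowing which attributes are AVTs in the first place still has to come from the spec's prose.

```python
def split_avt(value):
    """Split an attribute value template into ("text", s) and ("expr", s) parts.

    "{{" and "}}" are treated as escaped literal braces. A "}" inside a
    quoted string literal in an expression does not end the expression.
    """
    parts = []
    text = []
    i = 0
    n = len(value)
    while i < n:
        ch = value[i]
        if ch == "{":
            if value.startswith("{{", i):
                text.append("{")          # escaped left brace
                i += 2
                continue
            if text:                      # flush pending literal text
                parts.append(("text", "".join(text)))
                text = []
            j = i + 1
            quote = None
            while j < n:                  # scan for the closing brace
                c = value[j]
                if quote:
                    if c == quote:
                        quote = None
                elif c in "'\"":
                    quote = c
                elif c == "}":
                    break
                j += 1
            if j >= n:
                raise ValueError("unterminated expression in AVT: %r" % value)
            parts.append(("expr", value[i + 1:j]))
            i = j + 1
        elif ch == "}":
            if value.startswith("}}", i):
                text.append("}")          # escaped right brace
                i += 2
                continue
            raise ValueError("unescaped '}' in AVT: %r" % value)
        else:
            text.append(ch)
            i += 1
    if text:
        parts.append(("text", "".join(text)))
    return parts

if __name__ == "__main__":
    print(split_avt("img/{@id}.png"))
```

Note that this runs on an attribute value the XML parser has already normalized, which is part of why it belongs after tree construction rather than in the parser itself.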

Also, at a higher level, the rules of containment for XSLT instruction elements
are spelled out as a DTD fragment in Appendix C of XSLT 1.0, but DTDs are not
expressive enough to capture some of the more complex rules described in the
spec's prose. As many people who want to run an XSLT stylesheet through a
validating XML parser discover, it's hard to make a DTD for XSLT documents
because, aside from namespace issues, literal result elements and top-level
data elements can be anything and can appear almost anywhere in the
stylesheet. Likewise, it is going to be very difficult to produce a detailed
grammar for XSLT because you're going to have to allow for essentially random
XML to be embedded throughout the stylesheet.
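For what it's worth, checks like these tend to be easier to write as code over the parsed tree than as grammar productions. The sketch below is only an illustration under some assumptions: the document element is taken to be xsl:stylesheet or xsl:transform, only the top level is examined, and the names `split_tag` and `check_stylesheet` and the sample documents are invented. It lets top-level data elements through untouched, and enforces two rules from the spec's prose that a DTD cannot state: a top-level element must have a non-null namespace URI, and an xsl:template needs a match or a name attribute (and must not have mode without match).

```python
import xml.etree.ElementTree as ET

XSLT_NS = "http://www.w3.org/1999/XSL/Transform"

def split_tag(tag):
    """Split ElementTree's '{uri}local' tag form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

def check_stylesheet(source):
    """Return a list of problem descriptions; an empty list means none found."""
    problems = []
    root = ET.fromstring(source)
    for child in root:
        uri, local = split_tag(child.tag)
        if uri == "":
            # Top-level elements must have a non-null namespace URI.
            problems.append("top-level element <%s> has no namespace" % local)
        elif uri != XSLT_NS:
            # Top-level data element: arbitrary content, nothing to check.
            continue
        elif local == "template":
            if "match" not in child.attrib and "name" not in child.attrib:
                problems.append("xsl:template needs a match or name attribute")
            if "mode" in child.attrib and "match" not in child.attrib:
                problems.append("xsl:template has mode but no match attribute")
    return problems

GOOD = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="urn:example">
  <my:data><anything goes="here"/></my:data>
  <xsl:template match="/"><html><body/></html></xsl:template>
</xsl:stylesheet>"""

BAD = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template mode="m"><p/></xsl:template>
  <data/>
</xsl:stylesheet>"""

if __name__ == "__main__":
    print(check_stylesheet(GOOD))
    for problem in check_stylesheet(BAD):
        print(problem)
```

The literal result elements inside the template in GOOD (html, body) are never inspected at all, which is exactly the "essentially random XML" that makes a closed grammar impractical.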

Food for thought...
