Subject: Re: [xsl] transform optimization for a schema-constrained domain
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Thu, 26 Jul 2001 08:43:28 +0200

> From: "Huebel, David" <dhuebel@xxxxxxxxxxxxxx>
>
> Hello,
>
> Are there any XSLT processors that can use a schema for the input domain to
> improve performance?
[...]
> Has this been implemented anywhere,

As far as I know, no processor uses knowledge of a DTD or Schema attached
to the XML source for anything other than validation (and it's actually the
parser doing that). In the case of DTDs it probably has something to do
with the lack of an API for accessing the element definitions. With schemas
being XML, this is no longer an excuse, but then schema support is not only
horribly complex but also still somewhat in development.

> and does anyone have any comments on its
> usefulness?

It would provide for some optimizations. The most obvious example is that
the processor could use a table which tells whether elements may have
certain descendants, to optimize tree scanning for expressions involving //
and perhaps for building lookup tables for key()s. Of course this would
imply that the source XML must be validated against the DTD or Schema, and
it is not clear whether the up-front costs pay off.

There are further useful optimizations if elements are defined as a
sequence of child elements. For example, if <!ELEMENT e (a,b,c,d,e)>, child
elements could be looked up by index instead of scanning the children for
the node. In the case <!ELEMENT e (a,b?,c?,d?,e?)> you could stop scanning
for c-elements once a c, d or e is found.

The data type support of schemas may provide some more opportunities. If an
element or attribute is defined to be a number, the validation step could
just as well store the value in an internal number representation, as it
has to verify it anyway, and the expression evaluation machinery could
blindly access the numerical value instead of converting it from the string
representation every time. For values constrained by regexps, string
processing operations may also be optimizable.

Note that the optimizations above are all peanuts.
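
As a rough illustration of the descendant table mentioned above, here is a
minimal Python sketch. It assumes the content models have already been
reduced to a plain dictionary of allowed child names (CHILDREN is a made-up
structure with made-up element names, ignoring order and occurrence
indicators); descendant_table and can_skip_subtree are hypothetical helpers,
not part of any existing processor.

```python
# Hypothetical, simplified view of a DTD: each element name maps to the set
# of child element names its content model allows.
CHILDREN = {
    "doc": {"section", "title"},
    "section": {"title", "para", "section"},
    "para": {"emph"},
    "title": set(),
    "emph": set(),
}


def descendant_table(children):
    """Return {element: set of element names that may occur below it}."""
    table = {}
    for name in children:
        reachable = set()
        stack = list(children[name])
        while stack:
            child = stack.pop()
            if child not in reachable:
                reachable.add(child)
                stack.extend(children.get(child, ()))
        table[name] = reachable
    return table


def can_skip_subtree(table, element, target):
    """True if a search for `target` need not descend into `element`."""
    if element not in table:
        return False  # undeclared element: stay conservative and scan it
    return target not in table[element]


TABLE = descendant_table(CHILDREN)
```

With such a table, evaluating something like //emph could skip every title
subtree, since can_skip_subtree(TABLE, "title", "emph") is true, while
section subtrees would still have to be scanned.
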
Such optimizations apply, however, even to carefully crafted XSL code.
There may be a huge potential for optimization of lazily built, otherwise
quite inefficient style sheets. If there is an
<xsl:for-each select="//stuff"> while the stuff-element has already been
removed from the DTD/Schema, getting rid of the for-each could be a
substantial win.

This leads to my last point: A DTD/Schema-aware XSL processor could warn me
of misspellings and of incompatibilities between the XML structure and the
XPath expressions in the XSL. For example, if I have an
<!ELEMENT e (name,stuff)> and <xsl:value-of select="e/naem"/>, the
processor could tell me that I wrote something wrong. If I change the XML
structure, for example by removing the name-child and defining the name as
an attribute, and forget to change the XSL, the processor would also tell
me. If such a feature had been available it would already have saved me an
awful lot of debugging time.

Waiting for implementation of this... :-)

Regards
J.Pietschmann

--
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
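
The misspelling check described in the message could be sketched roughly as
follows, in Python. This is only an illustration under simplifying
assumptions: the DTD is represented as a hand-written dictionary (DTD and
DTD_CHANGED are made up for this example), only plain child steps and a
trailing attribute step are understood, and check_path is a hypothetical
helper rather than a feature of any processor.

```python
# Hypothetical, simplified DTD: element name -> (child elements, attributes).
DTD = {
    "e": ({"name", "stuff"}, set()),
    "name": (set(), set()),
    "stuff": (set(), set()),
}

# The same structure after "name" has been turned into an attribute of e.
DTD_CHANGED = {
    "e": ({"stuff"}, {"name"}),
    "stuff": (set(), set()),
}


def check_path(dtd, path):
    """Return a list of warnings for a path such as 'e/name' or 'e/@name'.

    Predicates, //, wildcards and other axes are outside this sketch.
    """
    warnings = []
    parent = None
    for step in path.split("/"):
        if step.startswith("@"):
            attr = step[1:]
            if parent is not None and attr not in dtd[parent][1]:
                warnings.append(f"'{parent}' has no attribute '{attr}'")
            break
        if step not in dtd:
            warnings.append(f"unknown element '{step}'")
            break
        if parent is not None and step not in dtd[parent][0]:
            warnings.append(f"'{step}' cannot be a child of '{parent}'")
            break
        parent = step
    return warnings
```

Here check_path(DTD, "e/naem") reports the unknown element, and
check_path(DTD_CHANGED, "e/name") flags the stale path after the structure
change, whereas "e/@name" passes against DTD_CHANGED.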