[xsl] use cases for d-o-e

Subject: [xsl] use cases for d-o-e
From: Joerg Pietschmann <joerg.pietschmann@xxxxxx>
Date: Wed, 09 Jan 2002 15:27:59 +0100

Jeni Tennison wrote: [in some other thread]
> However, for all the complaints that
> we make about people using disable-output-escaping, we rarely suggest
> that it should be removed from the language just because people who
> don't know any better use it inappropriately.

Ok, i'll bite.
So far, i'm aware of the following (ab-)uses of d-o-e:
1. Process data which got into the XSLT processor as a string but is
 really marked-up data. This includes simple insertion into the result
 tree. Usually, the string is read from a database or was in a CDATA
 section in a source XML document.
2. Produce a result document which looks like XML or HTML at a cursory
 glance but actually isn't, like PHP or ASP source.
3. Insert entity references into the result.
4. Insert DOCTYPE declarations with an internal subset into the result.
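
For illustration, cases 1 and 3 usually look something like the
following in a stylesheet. This is only a sketch; the element name
"description" is made up.

  <!-- case 1: the string value holds escaped markup, for example
       &lt;b&gt;bold&lt;/b&gt; (hypothetical element "description") -->
  <xsl:value-of select="description" disable-output-escaping="yes"/>

  <!-- case 3: force an entity reference into the serialized result -->
  <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>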

Does somebody know of uses which are not completely theoretical and
don't fit in one of the categories above?

The first use appears often enough that it can't be easily dismissed. Two
solutions without d-o-e:
- Use document() with the data: protocol, which is already well specified:
 document(concat('data:text/xml;charset=ISO-8859-1,',$data))
 (Hope i got it right)
- Get an XPath parse function:
 xf:parse('text/xml','ISO-8859-1',$data)
What's your opinion:
- Is mandating support for the data: protocol a good idea?
- Should we have a parse() function instead?
- Should one or both be mentioned in the standard but made an optional
 feature of the processor (but everyone who likes being taken seriously
 implements it anyway :-)?
The advantage of avoiding d-o-e for this use case is obviously that it
works even if the result is not serialized, for example when a browser
renders directly from the tree generated by the XSLT processor.
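
As a rough sketch only, the two alternatives might be used like this to
copy marked-up string data into the result tree. It assumes a processor
whose document() resolves data: URLs (written here in the RFC 2397 form
data:<mediatype>;charset=...,<data>) and an extension function
xf:parse() that no processor provides today; the xf namespace URI is a
placeholder. function-available() keeps the stylesheet from failing
where the function is missing.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xf="http://example.org/hypothetical-parse">

    <!-- assumption: $data holds well-formed XML as a string -->
    <xsl:param name="data"/>

    <xsl:template match="/">
      <result>
        <xsl:choose>
          <!-- hypothetical parse function from the proposal -->
          <xsl:when test="function-available('xf:parse')">
            <xsl:copy-of
                select="xf:parse('text/xml','ISO-8859-1',$data)"/>
          </xsl:when>
          <!-- data: URL; $data would still need URI escaping -->
          <xsl:otherwise>
            <xsl:copy-of select="document(concat(
                'data:text/xml;charset=ISO-8859-1,',$data))"/>
          </xsl:otherwise>
        </xsl:choose>
      </result>
    </xsl:template>
  </xsl:stylesheet>

Either way the parsed nodes end up in the result tree itself, so
nothing depends on a serializer being present.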

The second is a somewhat harder nut to crack, but d-o-e can be avoided
by setting the output method to "text" and doing the serialization of
elements by hand. Actually, the usual practice of setting the output
method to xml or html *is* incorrect: it suggests that the output is
XML (or HTML), which it actually isn't. Furthermore, PHP and ASP
processors can't be fed an XML tree anyway, as far as i'm aware.
The text output method could be made easier to swallow in XSLT 2.0: the
stylesheet writer could construct a result tree in a variable as if he
were constructing an XML tree, and then apply a generic template set
imported from a library to this tree:
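  <xsl:stylesheet version="2.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:import href="serialize.xsl"/>
   <xsl:output method="text"/>
   <xsl:template match="/">
    <xsl:variable name="result">
      <!-- insert processing here -->
    </xsl:variable>
    <xsl:apply-templates select="$result" mode="serialize"/>
   </xsl:template>
  </xsl:stylesheet>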
So, for this purpose i expect we could drop d-o-e too.
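
To make the idea more concrete, a library like serialize.xsl might
contain templates along these lines. This is a minimal sketch, not an
existing library: it ignores namespaces, writes empty elements as a
start/end tag pair, and deliberately does no escaping of text or
attribute values, since the point is to emit things like PHP source
verbatim. Text nodes are handled by the built-in template rule, which
copies them through in any mode.

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <!-- element: start tag, attributes, content, end tag as text -->
    <xsl:template match="*" mode="serialize">
      <xsl:text>&lt;</xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:apply-templates select="@*" mode="serialize"/>
      <xsl:text>&gt;</xsl:text>
      <xsl:apply-templates mode="serialize"/>
      <xsl:text>&lt;/</xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text>&gt;</xsl:text>
    </xsl:template>

    <!-- attribute: the value is written as-is, without escaping -->
    <xsl:template match="@*" mode="serialize">
      <xsl:text> </xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text>="</xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>"</xsl:text>
    </xsl:template>

    <!-- processing instruction, which is how <?php ... ?> would be
         represented in the tree -->
    <xsl:template match="processing-instruction()" mode="serialize">
      <xsl:text>&lt;?</xsl:text>
      <xsl:value-of select="name()"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="."/>
      <xsl:text>?&gt;</xsl:text>
    </xsl:template>

  </xsl:stylesheet>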

Case three: the most common case is generating entity references which
expand to character references. I don't think it is unreasonable to tell
the perpetrators just not to do it.
I know there are members of this list claiming that the possibility to
output entity references referring to more complex stuff is essential
to their work. Well, i'll just ignore this. :-)
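
For the common case, the d-o-e-free replacement is simply to write the
character itself, for example as a numeric character reference in the
stylesheet. A small sketch using the non-breaking space:

  <!-- with d-o-e: only has an effect if the result is serialized -->
  <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>

  <!-- without d-o-e: the same character, U+00A0, put in the tree -->
  <xsl:text>&#160;</xsl:text>

With the html output method, the serializer is allowed to write that
character out as &nbsp; anyway.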

Case four: This has come up only once so far. Don't know what to do about
it. I'm not very happy with the solution as the output method has to be
set to text while it's really XML (sort of the opposite of case 2).
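
Presumably a text-method workaround for this case would look roughly
like the sketch below: the DOCTYPE and its internal subset are written
out by hand, and the rest of the document then has to be serialized by
hand as well. The names "doc" and "company" are made up, and
serialize.xsl is assumed to provide templates in mode "serialize".

  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:import href="serialize.xsl"/>
    <xsl:output method="text"/>

    <xsl:template match="/">
      <!-- DOCTYPE with internal subset, emitted as plain text -->
      <xsl:text>&lt;!DOCTYPE doc [&#10;</xsl:text>
      <xsl:text>  &lt;!ENTITY company "Example Corp"&gt;&#10;</xsl:text>
      <xsl:text>]&gt;&#10;</xsl:text>
      <!-- the actual XML content, serialized by hand -->
      <xsl:apply-templates select="*" mode="serialize"/>
    </xsl:template>
  </xsl:stylesheet>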

Poll: Who agrees we can drop d-o-e without making too many
customers unhappy? Who does not, and why not?
NAG members are not allowed to invoke case three to thwart the proposal!
 :-) :-)

Regards
J.Pietschmann

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list