David,
At 08:11 PM 7/24/2003, you wrote:
> Continuing on an earlier problem, I have an XML file that has an element
> which will have "escaped" XML content.  David Carlisle helped me
> discover the "disable-output-escaping" attribute of "xsl:value-of",
> which gives me a valid tree fragment.
Actually this is not the case. It writes a *string* to output -- the same 
as it does otherwise. Only the usual escaping of nettlesome characters like 
"<" and "&" (which would break well-formedness if output raw) into their 
entity-reference forms &_lt; and &_amp; (underscores added), is disabled.
Consequently, what you get in your output is a "valid tree fragment" only 
in the sense that it may *turn out to be*, in the serialized output form, 
XML, which could *then* be parsed into an XML tree.
If your string is something like:
"Here's my XML string with a < character. (Not!)"
when it's serialized in this form, it creates output that is *not* a 
well-formed XML snippet and cannot be parsed by an XML parser. (My string 
says 'not' since it's lying when it says it's XML.)
This is only one of many reasons that the d-o-e feature is Not Recommended. 
It puts you at risk of getting unparseable output if you had any garbage in 
your input.
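To make the string-versus-tree point concrete, here is a minimal sketch in Python's standard library (not XSLT). The serialize() helper, the <out> wrapper element and the sample strings are invented for illustration; the helper only imitates what a serializer does with a text node, with and without output escaping.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def serialize(text, disable_output_escaping=False):
    """Imitate serializing a text node inside a wrapper element.

    With escaping on, "<" and "&" are written as entity references.
    With escaping off, the characters are written raw -- still just a string.
    """
    body = text if disable_output_escaping else escape(text)
    return "<out>" + body + "</out>"


def is_well_formed(xml_string):
    """Return True only if an XML parser accepts the string."""
    try:
        ET.fromstring(xml_string)
        return True
    except ET.ParseError:
        return False


good = "<b>bold</b> text"
bad = "Here's my XML string with a < character. (Not!)"

# Escaped output is always parseable, but the "markup" is only text.
print(is_well_formed(serialize(good)))
print(is_well_formed(serialize(bad)))

# Unescaped output is parseable only if the string happened to be XML.
print(is_well_formed(serialize(good, disable_output_escaping=True)))
print(is_well_formed(serialize(bad, disable_output_escaping=True)))
```

Only the last call reports a failure: the raw "<" in the second string breaks well-formedness once escaping is switched off.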
> Now, I need to convert that tree fragment to a nodeset so I can operate
> on it.
Okay....
> I noticed the "xalan:nodeset" (and "exslt:node-set") function.  I assume
> this takes a tree fragment and returns a nodeset.
Yes, but it takes a "result tree fragment" (see the FAQ or the XSLT Rec on 
what this is), not a string, so it's not useful to you.
Using Xalan, you may be hosed -- unless you care to pipeline your XML 
through another parse:
parse/transform #1 -- writes the "XML" strings unescaped (d-o-e) in the serialized result
parse/transform #2 -- does whatever processing you want to do
As long as your XML doesn't break between these two steps (which it will 
if you fed malformed strings to process #1), process #2 will see real 
element nodes there, not a string.
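The two-step pipeline can be sketched roughly as follows, again in Python's standard library rather than with Xalan. The <item>/<body> element names and the sample input are assumptions made up for the example; pass_one() stands in for the transform that writes the string out raw, and pass_two() for the second parse.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the <body> element carries escaped markup.
SOURCE = (
    "<item><body>"
    "&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;"
    "</body></item>"
)


def pass_one(source_xml):
    """Step 1: parse the input and write the <body> text out raw.

    The parser has already turned &lt; back into "<" in the string value;
    concatenating it unescaped is the equivalent of d-o-e. The result is
    still only a string.
    """
    item = ET.fromstring(source_xml)
    raw = item.findtext("body")
    return "<item><body>" + raw + "</body></item>"


def pass_two(serialized):
    """Step 2: parse again; only now is the embedded markup real nodes.

    Raises ET.ParseError if the payload written by step 1 was malformed.
    Returns the tag names found below <body>, in document order.
    """
    item = ET.fromstring(serialized)
    body = item.find("body")
    return [el.tag for el in body.iter()][1:]  # skip <body> itself


intermediate = pass_one(SOURCE)
print(intermediate)
print(pass_two(intermediate))
```

If the escaped string had not been well-formed, the failure would surface in pass_two(), not in pass_one(), which is exactly why the breakage is only discovered between the two steps.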
Saxon has an extension function that lets you feed such strings to a parser 
from within the transform. That's what you need if you want to do this in 
one pass.
(But what you really need is real XML input: that won't be subject to 
breakage. Since you're getting pseudo-XML input you can't tell whether it's 
actually going to work until you try it. You may need a tidier in your 
pipeline.)
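For comparison, the one-pass approach (handing the string to a parser from inside the transform) might look something like this. It is a sketch in Python's standard library, not the Saxon extension itself; expand_escaped(), the throwaway <wrapper> element and the sample documents are all hypothetical. The False return marks the spot where a tidier would have to step in.

```python
import xml.etree.ElementTree as ET


def expand_escaped(elem):
    """Replace elem's escaped-markup text with real child nodes, in place.

    Returns True on success. Returns False, leaving elem untouched, if the
    payload is not well-formed.
    """
    payload = elem.text or ""
    try:
        # A throwaway root lets mixed content and several top-level
        # elements parse as one document.
        wrapper = ET.fromstring("<wrapper>" + payload + "</wrapper>")
    except ET.ParseError:
        return False
    elem.text = wrapper.text
    elem.extend(list(wrapper))
    return True


ok_doc = ET.fromstring("<item><body>&lt;p&gt;Hi&lt;/p&gt; there</body></item>")
ok_body = ok_doc.find("body")
print(expand_escaped(ok_body), [child.tag for child in ok_body])

bad_doc = ET.fromstring("<item><body>a &lt; b &lt;p&gt;</body></item>")
bad_body = bad_doc.find("body")
print(expand_escaped(bad_body), bad_body.text)
```

Leaving the element untouched on failure is one possible design choice; a real pipeline might instead log the offending string or route it through a clean-up step first.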
It's bad design, as the RSS folks are discovering, to mix escaped markup 
into XML and expect it to be picked up later, since it muddies the divide 
between XML-as-character-string and XML-as-model. Although some designers 
think it's a feature that the embedded "markup" doesn't have to be 
well-formed, this is a trap: they are just putting themselves back in the 
world where they have to wage an endless campaign to clean up the bad 
markup they asked for. Like saying they want to live in a Japanese house 
since it's cleaner, but then letting people wear shoes anyhow. (And 
complaining "I thought this house was supposed to stay clean!")
It's just easier to ensure the stuff is well-formed at the point of 
creation than to rely on markup-escaping to try to sneak it in as markup 
later. And there are tools that can help with this. Unfortunately, if 
you're hired as the guy to come in and clean the house after the party, you 
may not be in a position to make no-shoes rules.
Good luck,
Wendell
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list