RE: [xsl] display & as text

Subject: RE: [xsl] display & as text
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Mon, 07 Jun 2010 15:23:56 -0400

At 02:28 PM 6/7/2010, you wrote:
> At 07:08 PM 6/4/2010, you wrote:
>>Is there a book that explains where and how and when to do the string
>>manipulation programmatically on incoming data necessary to allow it to have
>>the illegal entities changed to legal ones? I am the owner of several xml
>>and xsl books that don't seem to cover that part of the process.
> Strictly speaking the question (indeed the entire thread) is off-topic.

I believe the person was trying to help me out by that question. The object of my original post was to get a way to use the "&" character as regular text in an xml file. And I think by her wording she was trying to interface with the high-minded pros on this list to get a relevant solution to my first post.

This thread was already addressed by three or four of the most high-minded pros in the business, including (if memory serves me right) Mike Kay, David Carlisle and Liam Quinn, whose individual and collective expertise in XML is exceeded by no one's. They and other helpful experts have already tried and failed to satisfy your off-topic question, so I'm not hopeful....

> But it is both simple and complicated -- probably why there's no
> treatment of it in a book. The simple version of it is too simple to
> need it. The complicated version is both too deep and too general to
> be much use to a working programmer who has a specific set of issues.

Situation: for non-programmers copying and pasting text with "&" characters.

Non-programmers either need to acquire some measure of "programming" know-how (it's been done), or use tools designed for them, or accept the risks and complications. That may sound harsh, but it is no harsher than in any other technical realm.

In particular, what we are discussing here is no more difficult than learning how to unplug the lamp fixture from the wall before sticking your screwdriver into it. It's like this:

You: "I want to unscrew the fixture from the lamp, but I keep getting an electric shock."
Us: "First, unplug the fixture from the wall."
You: "No, that's too complicated."
Us: "Well, you could switch off the power at the circuit breaker if you wanted" (i.e. use a CDATA marked section).
You: "No. That's too hard. There must be an easy way."

> The complex answer accounts for how some characters aren't allowed in
> XML, so they need to be scrubbed or changed into something else
> (these are mainly control characters you won't ordinarily see), and
> how certain constructs (namely, entity references) will be legal if
> you have declarations for them....

So what is the code?

That's like asking "how do I get to the grocery store?" I can tell you about the one in my neighborhood, but it probably won't help you.

What I can tell you is that it all boils down to simple search/replace. For single files, this is commonly done interactively in a text editor or XML editor. For large sets of files it can be automated, but due to the nature of the requirements and the vagaries of non-XML input, it can't very well be packaged and sold (except as part of a larger toolkit).
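For what it's worth, here is one way the automated, "complicated" version might be sketched in Python. It is only an illustration under stated assumptions, not a packaged solution: the name clean_text and its "declared" parameter are made up for this example, the input is assumed to be a fragment of pasted text (not a whole XML document with markup to preserve), and anything that already looks like a declared or numeric reference is assumed to be intentional and left alone.

```python
import re

# Anything outside the XML 1.0 "Char" production (mostly control characters).
ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)

# Either a complete reference (&name; &#123; &#x7B;) or a bare ampersand.
REFERENCE = re.compile(r"&(#[0-9]+|#x[0-9A-Fa-f]+|[A-Za-z_:][\w.:-]*);|&")

# The five entities every XML parser knows without a declaration.
PREDEFINED = frozenset({"amp", "lt", "gt", "apos", "quot"})


def clean_text(text, declared=PREDEFINED):
    """Scrub and escape a pasted text fragment (hypothetical helper).

    Assumption: 'declared' lists the entity names your DTD declares;
    by default only the predefined XML entities are trusted.
    """
    # Step 1: drop characters that XML 1.0 does not allow at all.
    text = ILLEGAL_XML_CHARS.sub("", text)

    # Step 2: escape every "&" that does not start a usable reference.
    def fix(match):
        name = match.group(1)
        if name is not None and (name.startswith("#") or name in declared):
            return match.group(0)            # already legal, keep as-is
        return "&amp;" + match.group(0)[1:]  # bare "&" or undeclared entity

    text = REFERENCE.sub(fix, text)

    # Step 3: "<" can never appear literally in text content.
    return text.replace("<", "&lt;")


print(clean_text("AT&T &amp; Sons <b> &nbsp;"))
```

Whether an undeclared reference such as "&nbsp;" should be escaped, replaced by the character it stands for, or reported as an error depends on your data, which is exactly why this does not package well.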

An XSLT programmer might use the XPath 2.0 function unparsed-text() from a stylesheet to get non-XML input into an XML context, and work from there. That suggestion was already made by one of the high-minded pros (and it's a good one).
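The underlying idea (bring the raw text in as a string, then let an XML-aware tool do the escaping) can also be sketched outside XSLT. The following Python is an analogue only, not unparsed-text() itself; the function name wrap_as_xml and the element name "para" are invented for the example.

```python
import xml.etree.ElementTree as ET


def wrap_as_xml(raw_text, tag="para"):
    """Put non-XML text inside an element (hypothetical helper).

    The text is stored as a string; the serializer escapes "&" and "<"
    when it writes the element out, so no manual replacing is needed.
    """
    element = ET.Element(tag)
    element.text = raw_text
    return ET.tostring(element, encoding="unicode")


print(wrap_as_xml("Fish & chips < $5"))
```

The point of the design is that the "&" is never typed into XML source by hand, so there is nothing to escape by hand either.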

> The simple answer is that, if these complications don't intervene,
> simply escaping all "&" into "&amp;" and "<" into "&lt;" should do
> the trick....

No thanks. Think about your Grandma's carpal tunnel syndrome or Stephen Hawking's ALS disease.

What, two global search/replace routines, supported in any text editor or word processor, is too hard for Grandma? How did she get this far in her work with XML, I wonder?

I guess I'll write her a macro so the two routines can be executed from a single button. (Oh, oXygen already has one?)
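Such a macro amounts to very little. A minimal sketch in Python, assuming the text being pasted contains no markup or entity references that are meant to survive (the function name escape_for_xml is made up for illustration):

```python
def escape_for_xml(text):
    """Apply the two global replacements in one step (hypothetical helper)."""
    # Order matters: replace "&" first, otherwise the "&" introduced
    # by "&lt;" would itself be escaped on the second pass.
    return text.replace("&", "&amp;").replace("<", "&lt;")


print(escape_for_xml("R&D: a < b"))
```

Python's standard library offers xml.sax.saxutils.escape() for the same job; it also escapes ">".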


Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.      
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
