Subject: Re: [xsl] xslt replace special characters
From: Mike Brown <mike@xxxxxxxx>
Date: Mon, 11 Nov 2002 13:38:52 -0700 (MST)

Alice Fan wrote:
> Thanks Greg. Right in the UI, we want the user to enter their URL. Their
> URL will most likely have name/value pairs. Is there an easier way? There
> is no other way of filtering '&' before it gets processed in the XSL?

It doesn't matter if they're entering a URL/URI or not. Any text that you
intend to put into an XML document needs to be screened, to preserve
well-formedness / parseability.

1. Always note the following:

   - non-XML characters need to be removed or replaced
     (U+0000..U+0008, U+000B, U+000C, U+000E..U+001F,
     U+D800..U+DFFF, U+FFFE..U+FFFF)

   - a string is not a URI if it violates URI syntax, so if the text
     is destined for a URI-pseudotype attribute value (like href or
     src in HTML/XHTML), characters above U+007F should be escaped by
     writing their equivalent UTF-8 bytes as '%xx' for each byte,
     where xx is the hex notation for the byte (though this isn't
     strictly necessary; a conforming HTML user agent will do this
     automatically)

   - additional translation of ASCII-range characters
     (U+0000..U+007F) in text destined for URI attributes is not
     required but is wise, to ensure conformance to URI syntax;
     %-escape everything except a-z, A-Z, 0-9, and these:
     - _ . ! ~ * ' ( ) ; / ? : @ & = + $ , [ ]

2. If and when the XML document exists in serialized form (i.e., as a
   string, not as a DOM object), note the following:

   - if the text is not destined for a CDATA section, markup
     characters '&' and '<' need to be escaped

   - if the text is destined for a CDATA section, the '>' in ']]>'
     needs to be escaped

   - if the text is destined for a comment, it must not contain '--'
     (how you handle such an offense is up to you)

   - if the text is destined for an attribute value delimited by
     apostrophes, then apostrophes in the value must be escaped
     (usually use &apos; unless in HTML)

   - if the text is destined for an attribute value delimited by
     quotes, then quotes in the value must be escaped
     (usually use &quot;)

   - if the text is destined for a non-URI attribute value, then tab,
     LF, and CR need to be escaped to facilitate round-tripping

I probably missed one or two cases, but as you can see, you can't just
slap any old text into a document and call it XML...

   - Mike

____________________________________________________________________________
  mike j. brown                   |  xml/xslt: http://skew.org/xml/
  denver/boulder, colorado, usa   |  resume: http://skew.org/~mike/resume/

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
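
For illustration only, here is a rough Python sketch of three of the checks
listed above: dropping non-XML characters, %-escaping text bound for a URI
attribute, and escaping a quote-delimited attribute value in serialized XML.
The function names (strip_non_xml, to_uri_text, to_attr_value) are made up
for this example and are not from the original post; the choice of Python
is also an assumption, since the thread itself is about XSLT. Note that the
"safe" set is taken literally from the list above, so '%' and '#' get
escaped too; text that already contains %-escapes would be escaped twice.

```python
import re
from urllib.parse import quote
from xml.sax.saxutils import escape

# Characters that are not allowed in an XML 1.0 document (item 1, first bullet).
_NON_XML = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ud800-\udfff\ufffe\uffff]")

# ASCII punctuation left unescaped in URI text (item 1, third bullet).
# quote() never escapes letters, digits, '_', '.', '-' or '~'.
_URI_SAFE = "-_.!~*'();/?:@&=+$,[]"


def strip_non_xml(text, replacement=""):
    """Remove (or replace) characters that XML 1.0 does not permit."""
    return _NON_XML.sub(replacement, text)


def to_uri_text(text):
    """%-escape text for a URI attribute; non-ASCII becomes UTF-8 '%xx' bytes."""
    return quote(strip_non_xml(text), safe=_URI_SAFE)


def to_attr_value(text):
    """Escape text for a quote-delimited attribute value in serialized XML."""
    extra = {'"': "&quot;", "\t": "&#9;", "\n": "&#10;", "\r": "&#13;"}
    # escape() handles '&', '<' and '>' first, then the extra mappings.
    return escape(strip_non_xml(text), extra)


if __name__ == "__main__":
    url = to_uri_text("http://example.com/?a=1&b=\u00e9 x")
    print(url)
    print('<a href="%s">link</a>' % to_attr_value(url))
```

The '&' in the user's URL survives the URI step untouched (it is legal URI
syntax) and only becomes '&amp;' at the point where the value is written
into the serialized document. When the document is built through a DOM or
emitted by an XSLT processor, that last step is normally done by the
serializer rather than by hand.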