Subject: RE: Character entities
From: Mike Brown <mbrown@xxxxxxxxxxxxx>
Date: Mon, 14 Feb 2000 11:23:03 -0700

> Does HTML "know" UTF-8?

Like XML, HTML 4.0 is mostly defined in terms of UCS/Unicode characters, which of course must be encoded. There is a mechanism for a document to signal its own character encoding via a META declaration. This could be overridden by a charset parameter in an HTTP Content-Type header.

Since HTML doesn't prescribe UTF-8 as a default and because the META declaration can appear pretty far down in the document HEAD, the recommendation states that only ASCII (U+0000 through U+007F) characters should be used in the document up to that point. This stuff is discussed at
http://www.w3.org/TR/1999/REC-html401-19991224/charset.html#spec-char-encoding

It is worth pointing out that the value of the recommendation is only as good as the user agents' support for it. The 4.0 browsers seem to do okay with automatically selecting the proper encoding when interpreting a document, but you may have noticed that they also let the user manually choose it even if the document signaled its own encoding.

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
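
The precedence described in the message (a charset parameter in the HTTP Content-Type header wins over the document's META declaration) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the 4096-byte scan window, and the regular expression are hypothetical and are not taken from the recommendation or from any browser. Scanning the raw bytes for the META declaration relies on the point made above that the document should contain only ASCII up to that declaration.

```python
import re

# Hypothetical pattern: finds charset=... inside a META tag in the raw bytes.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def charset_from_content_type(header):
    """Return the charset parameter of a Content-Type header value, or None."""
    if not header:
        return None
    # Parameters follow the media type, separated by semicolons.
    for param in header.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset" and value.strip():
            return value.strip().strip('"').lower()
    return None


def charset_from_meta(body):
    """Return the charset named in a META declaration near the top, or None."""
    # Assumption: only the first 4096 bytes are scanned.
    match = META_CHARSET_RE.search(body[:4096])
    return match.group(1).decode("ascii").lower() if match else None


def detect_charset(content_type, body, fallback=None):
    """HTTP header first, then META, then a caller-supplied fallback."""
    return (
        charset_from_content_type(content_type)
        or charset_from_meta(body)
        or fallback
    )
```

For example, given a body containing `<meta http-equiv="Content-Type" content="text/html; charset=utf-8">`, a header value of `text/html; charset=ISO-8859-1` would take priority, while a bare `text/html` header would leave the META value in effect. The fallback is left to the caller because, as noted above, HTML does not prescribe a default.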