RE: Character entities

> Does HTML "know" UTF-8?

Like XML, HTML 4.0 is mostly defined in terms of UCS/Unicode characters,
which of course must be encoded. There is a mechanism for a document to
signal its own character encoding via a META declaration. That declaration
can be overridden by a charset parameter in the HTTP Content-Type header,
which takes precedence when both are present.
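
As a rough illustration of that precedence, here is a minimal Python sketch.
The function names (charset_from_http, charset_from_meta, pick_encoding), the
4096-byte scan window, and the simplified regular expression are all my own
assumptions for the example; a real user agent parses the HEAD properly
and applies further heuristics.

```python
import re
from email.message import Message

# Simplified pattern (an assumption, not a real HTML parser): finds a
# charset=... value inside a META tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def charset_from_http(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None if absent


def charset_from_meta(body):
    """Return the charset named in a META declaration, or None."""
    match = META_CHARSET.search(body[:4096])  # scan window is arbitrary
    return match.group(1).decode("ascii").lower() if match else None


def pick_encoding(content_type, body, fallback=None):
    """HTTP header first, then META, then whatever the caller falls back to."""
    return charset_from_http(content_type) or charset_from_meta(body) or fallback
```

For example, pick_encoding("text/html; charset=UTF-8", body) ignores whatever
the META declaration in body says, while pick_encoding("text/html", body)
falls through to it.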

Since HTML doesn't prescribe UTF-8 as a default and because the META
declaration can appear pretty far down in the document HEAD, the
recommendation states that only ASCII (U+0000 through U+007F) characters
should be used in the document up to that point.
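
If you want to check a document against that advice, something like the
following Python sketch would do. The function name prefix_is_ascii and the
simplified META pattern are assumptions made for the example, and it works
on the raw bytes of the file rather than on decoded characters.

```python
import re

# Simplified pattern (an assumption): locates a META tag that carries a
# charset declaration, without really parsing the HTML.
META_CHARSET = re.compile(rb'<meta[^>]+charset\s*=', re.IGNORECASE)


def prefix_is_ascii(body):
    """Check the bytes up to and including the META charset declaration.

    Returns True if every byte is in the ASCII range (below 0x80),
    False if any is not, and None if no META declaration is found.
    """
    match = META_CHARSET.search(body)
    if match is None:
        return None
    prefix = body[:match.end()]
    return all(byte < 0x80 for byte in prefix)
```

A TITLE containing an accented letter ahead of the META element would make
this return False, which is exactly the situation the recommendation warns
about.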

This stuff is discussed at
http://www.w3.org/TR/1999/REC-html401-19991224/charset.html#spec-char-encoding

It is worth pointing out that the value of the recommendation is only as
good as the user agents' support for it. The 4.0 browsers seem to do okay
with automatically selecting the proper encoding when interpreting a
document, but you may have noticed that they also let the user manually
choose it even if the document signaled its own encoding.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
