Re: AW: AW: Encoded question

Subject: Re: AW: AW: Encoded question
From: Mike Brown <mike@xxxxxxxx>
Date: Mon, 6 Nov 2000 21:49:43 -0700 (MST)

Josef Vosyka wrote:
> As I was told &#153; is the best way to achieve (TM) symbol --- the majority of
> major browsers display it as TM.

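Do not rely on what people tell you; consult the actual specifications.
In XML, XSLT, and HTML, &#153; is *by definition* ISO/IEC 10646-1:1993
character number 153, which is a control character, not a trademark
symbol. The trademark symbol is character number 8482 (hex 2122), so the
correct numeric references are &#8482; and &#x2122;. HTML also happens to
define &trade; as an entity reference for it.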

Browsers are letting you get away with &#153; because old versions were
lax about what numeric character references meant. It is for backward
compatibility. So yes, you are right, it is the 'best' way in that it is
the most reliable... for as long as browsers support this broken
notation... but it is not 'best' as in 'correct in any way whatsoever'.
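
To make the difference concrete, here is a small Python sketch (Python is
my choice for illustration; it is not part of the original discussion). It
uses only the standard library. Note that html.unescape follows the HTML5
rules, which later wrote this browser leniency into the parsing algorithm,
so it behaves like the lax browsers rather than like a strict reading of
the character number.

```python
import html
import unicodedata

# Code point 153 (U+0099) is a C1 control character, not a printable symbol.
category_153 = unicodedata.category(chr(153))

# The trademark sign lives at code point 8482 (U+2122).
trade = chr(0x2122)
trade_name = unicodedata.name(trade)

# Byte 0x99 means "trademark" only when bytes are decoded as windows-1252.
from_cp1252 = bytes([0x99]).decode("windows-1252")

# html.unescape applies the HTML5 compatibility mapping, so the legacy
# reference is quietly treated as if it were the windows-1252 byte.
lenient = html.unescape("&#153;")
correct = html.unescape("&#8482;")

print(category_153, trade_name)
print(from_cp1252 == trade, lenient == trade, correct == trade)
```

The point is that 153 only looks like a trademark because of a
windows-1252 coincidence that lenient parsers choose to honor.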

> If you use the unicode value, it does not work on some IEs (but it did work on
> my linux :-)

Did you actually try it in the transformation? You will not get &#x2122;
in the HTML just because you put &#x2122; in the stylesheet. The
stylesheet is not a literal specification for output. Try it and see, with
output method="html". I almost guarantee you will not see &#x2122; in the
output.
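
The same effect can be seen outside XSLT. The following is only an analogy,
assuming Python's xml.etree.ElementTree in place of a real XSLT processor,
and the element and its text are made up for the example. The XML parser
resolves the reference to a character before anything else sees it, and
the serializer then picks its own representation for the requested
encoding.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment standing in for a literal result element.
source = "<p>Acme&#x2122;</p>"

element = ET.fromstring(source)

# After parsing, the reference is gone; only the character U+2122 remains.
parsed_text = element.text

# The serializer decides how to write that character for each encoding.
as_ascii = ET.tostring(element, encoding="us-ascii")
as_utf8 = ET.tostring(element, encoding="utf-8")

print(repr(parsed_text))
print(as_ascii)  # a decimal reference, not the hex one that was typed
print(as_utf8)   # the directly encoded three-byte sequence
```

A real XSLT processor with output method="html" may choose differently
(for instance &trade;), but it is equally free to ignore how the character
was spelled in the stylesheet.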

> I managed to make xalan to render it as &#153; if I used encoding="us-ascii".

Then this is bug #48571249854175123484 in Xalan. It seems like half a
dozen Xalan bugs are posted here every week.

> I do not wanna use encoding="windows-1252" because then it only works on
> windows-1252.

The output is going to be bits & bytes in *some* encoding.

In HTML you are allowed to represent the trademark symbol as one of these
references *only*:

  &trade;
  &#8482;
  &#x2122;

(It is only certain browsers that let you use &#153;.)

Since the references themselves consist entirely of ASCII characters, and
since almost all encodings are supersets of ASCII, they are allowed in all
of those encodings.
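
A quick way to check the ASCII argument, sketched in Python with the
standard codecs only. The choice of encodings is mine, and the two
counter-examples (UTF-16 and the EBCDIC code page cp037) are there to show
why the claim is "almost all" rather than "all".

```python
reference = "&trade;"

# The reference as plain ASCII bytes: 0x26 0x74 0x72 0x61 0x64 0x65 0x3B
ascii_bytes = reference.encode("us-ascii")

# Encodings that are supersets of ASCII yield exactly the same bytes.
compatible = ["utf-8", "windows-1252", "iso-8859-1"]
same_everywhere = all(
    reference.encode(name) == ascii_bytes for name in compatible
)

# Encodings that are not ASCII supersets yield different bytes.
utf16_bytes = reference.encode("utf-16-le")
ebcdic_bytes = reference.encode("cp037")

print(same_everywhere)
print(utf16_bytes != ascii_bytes, ebcdic_bytes != ascii_bytes)
```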

Also, instead of using the reference, *if the encoding supports it*, you
can use the directly encoded character. That is,

  if encoding is:   byte sequence for that character is:
  ===============   ====================================
  utf-8             0xE2 0x84 0xA2
  windows-1252      0x99
  iso-8859-1        n/a. must use reference. e.g.,
                     0x26 0x74 0x72 0x61 0x64 0x65 0x3B
                      &    t    r    a    d    e    ;
  us-ascii          n/a. must use reference.
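
The table can be reproduced with a few lines of Python. This is a sketch
under my own assumptions: bytes_for is a hypothetical helper, and the
fallback uses Python's "xmlcharrefreplace" error handler, which emits the
decimal reference &#8482; rather than &trade;.

```python
TRADE_MARK = "\u2122"

def bytes_for(char, encoding):
    """Return the directly encoded bytes, or fall back to a reference."""
    try:
        return char.encode(encoding), "direct"
    except UnicodeEncodeError:
        # The encoding has no slot for this character, so write a
        # numeric character reference made of ASCII characters instead.
        return char.encode(encoding, errors="xmlcharrefreplace"), "reference"

encodings = ("utf-8", "windows-1252", "iso-8859-1", "us-ascii")
table = {name: bytes_for(TRADE_MARK, name) for name in encodings}

for name, (data, kind) in table.items():
    hex_bytes = " ".join("0x%02X" % b for b in data)
    print("%-14s %-10s %s" % (name, kind, hex_bytes))
```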

However, in XSLT you have no way of demanding that the output contain
certain bytes or certain references. You must rely on the wisdom of the
XSLT processor's output method to convert the character you want (and
there is only one trademark character) to the right sequence of bits,
according to the encoding you asked for in xsl:output.

   - Mike
Mike J. Brown, software engineer in Denver, Colorado, USA
