Re: [xsl] escaping/entities on the fly?

Subject: Re: [xsl] escaping/entities on the fly?
From: Kevin Rodgers <kevin.rodgers@xxxxxxx>
Date: Fri, 4 Mar 2005 16:37:13 -0700
Gabriel K. writes:
> > 1. If you click on the link in the mail message, does your mail
> >    composition agent put "=?ISO-8859-1?Q?kabelskcccc=E5p?=" in the
> >    subject line?
>
> Yes it does. Are you saying this is correct, that it should be this way?
>
> > 2. If you change the address to your own and send it, is the message
> >    delivered with the subject intact?  Has your mail user agent added
> >    the appropriate MIME headers:
>
> The message is delivered with the subject: "kabelskccccåp". So this seems
> to work. :)
>
> > MIME-Version: 1.0
> > Content-Type: text/plain
>
> Yes it has:
> Content-Type: text/plain;
>  format=flowed;
>  charset="iso-8859-1";
>  reply-type=original

Good, so now we are back on topic (XSLT).

> >> My link is created like this:
> >>
> >> <a>
> >>     <xsl:attribute name="href">
> >>         mailto:<xsl:value-of select="$_settings/supportMail"/>?subject=
> >>         <xsl:value-of select="$_shared/serviceName"/>: <xsl:value-of
> >> select="$_fullName"/>&#32;<xsl:value-of select="mir:KNP"/>
> >>
> >>     </xsl:attribute>
> >> <xsl:value-of select="$_shared/mailLink/responsible"/>
> >> </a>
> >>
> >> the output of <xsl:value-of select="$_fullName"/> can be "kabelskåp" for
> >> instance.
> >> So how do you suggest I modify the code?
>
> > What does the above XSLT generate?
>
> it generates this:
> <a href="mailto:support@xxxxxxxxxx?subject= Mirakel Webbserver:
> Kabelsk%C3%A5p KS0100">

Besides the non-ASCII character, the spaces need to be handled correctly
(encoded as %20).

> So my problem is that "å" is transformed to "%C3%A5" by the XSL
> processor and that does not display correctly in the mail client.
> Instead it shows: "KabelskC%p", and the message source says:
> "Kabelsk=C3=A5p".

No.  First you must encode the subject text per RFC 2047, then you must
pass that through the XSLT escape-uri() function.  The problem is that
XSLT does not have built-in support for RFC-2047 encoding, so you will
have to implement it.

Here's my attempt to implement it, which cheats by using escape-uri() to
convert each unsafe character and then translate() to convert each "%XX"
to "=XX":

<xsl:function name="rfc2047:qp-encode"
  xmlns:rfc2047="http://ietf.org/rfc/rfc2047.txt">
  <xsl:param name="text"/>
  <xsl:value-of select="'=?UTF-8?Q?'"/>
  <xsl:for-each select="string-to-codepoints($text)">
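    <!-- Preserve printable ASCII characters, except for space (32),
         equals sign (61), question mark (63), and underscore (95),
         which are special in the RFC 2047 "Q" encoding; escape ASCII
         control characters (0-31 and 127) and all non-ASCII
         characters (128-). -->
    <xsl:choose>
      <!-- escape-uri() leaves "_" alone, so encode it by hand. -->
      <xsl:when test=". = 95">
        <xsl:value-of select="'=5F'"/>
      </xsl:when>
      <xsl:when test=". &lt;= 32 or . = 61 or . = 63 or . &gt;= 127">
        <xsl:value-of select="translate(escape-uri(codepoints-to-string(.),
                                                   true()),
                                        '%', '=')"/>
      </xsl:when>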
      <xsl:otherwise>
        <xsl:value-of select="codepoints-to-string(.)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:for-each>
  <xsl:value-of select="'?='"/>
</xsl:function>

I hope the xsl-list readership will suggest improvements to that; in
particular, it wouldn't work when I added as="xs:string" to the
xsl:function declaration.  Once rfc2047:qp-encode() is robust, the same
boilerplate should be used to implement rfc2047:base64-encode().
Anyway, you should be able to do something like:

<a href="{concat('mailto:',
                 $_settings/supportMail,
                 '?subject=',
                 escape-uri(rfc2047:qp-encode(concat($_shared/serviceName,
                                                     ': ',
                                                     $_fullName,
                                                     ' ',
                                                     mir:KNP)),
                            true()))}">
<xsl:value-of select="$_shared/mailLink/responsible"/>
</a>
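
On the as="xs:string" problem: a likely cause is that each xsl:value-of
in the function body creates a separate text node, so the function
returns a sequence of several items rather than a single string.  Below
is a minimal, untested sketch that builds the result as one string with
string-join() and a "for" expression instead.  The xs prefix binding and
the extra handling of "=" (61) and "_" (95) are assumptions on my part
(both characters are special in the RFC 2047 "Q" encoding).  Note also
that escape-uri() is the working-draft name; on processors that
implement the final XPath 2.0 function library, the closest equivalent
is encode-for-uri(), which takes a single argument.

```xml
<xsl:function name="rfc2047:qp-encode" as="xs:string"
  xmlns:rfc2047="http://ietf.org/rfc/rfc2047.txt"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:param name="text" as="xs:string"/>
  <!-- Return one string: the encoded-word prefix, the encoded
       characters joined together, and the suffix.  "_" is encoded by
       hand because escape-uri() leaves it alone; every other unsafe
       character goes through escape-uri(), and each "%XX" it produces
       is turned into "=XX". -->
  <xsl:sequence select="
    concat('=?UTF-8?Q?',
           string-join(
             for $c in string-to-codepoints($text)
             return
               if ($c = 95)
               then '=5F'
               else if ($c le 32 or $c = 61 or $c = 63 or $c ge 127)
               then translate(escape-uri(codepoints-to-string($c), true()),
                              '%', '=')
               else codepoints-to-string($c),
             ''),
           '?=')"/>
</xsl:function>
```

The call inside the href attribute stays the same, since only the way
the function assembles its return value changes.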

--
Kevin Rodgers
