Re: [xsl] cursed RTF outputs

From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 28 Jan 2023 17:24:34 -0000
2004? Rather older than that...

So long as we use multiple character encodings, and don't have reliable
metadata saying what encoding a file or data transfer is using, this problem
will always be with us. And the methodology for solving it will always be the
same as it always was: follow the data through the complete processing
pipeline, looking at the actual binary representation of the data at each
stage, to find where in the chain of processing some component C stored or
transmitted a piece of data in encoding X to a component D that thought it was
in encoding Y.
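
As a rough illustration of what looking at the binary representation can mean in practice, here is a minimal Python sketch that prints a hex dump of a byte string. The helper name hex_dump and the sample characters are assumptions made for this example; they are not taken from the original post, whose characters did not survive the archive. The same function could be pointed at the bytes read from the file produced at each stage of the pipeline.

```python
def hex_dump(data: bytes, width: int = 16) -> str:
    """Return a hex + ASCII dump of data, one row per `width` bytes."""
    rows = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is; everything else becomes '.'
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(rows)


# Hypothetical sample text (two Central European letters), not the poster's data.
sample = "\\par \u010d\u0159"

# The same characters have different byte representations in each encoding.
print(hex_dump(sample.encode("utf-8")))
print(hex_dump(sample.encode("cp1250")))
```

Comparing the two dumps shows two bytes per accented letter in UTF-8 and one byte per letter in Windows-1250, which is the kind of difference that is invisible when looking only at rendered characters.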

Remember, encoding problems don't happen within a component (such as an XSLT
processor). They happen on the boundary between components, where one piece of
software thinks the encoding is X and another piece of software thinks it is
Y.
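
A small sketch of such a boundary mismatch, under the assumption that one component writes UTF-8 and the next one reads the same bytes as Windows-1250 (and the reverse). The variable names and the sample character are hypothetical and chosen only for illustration.

```python
# Hypothetical sample character (U+010D), not taken from the original post.
text = "\u010d"

# Component C stores the character as UTF-8 bytes.
stored = text.encode("utf-8")

# Component D wrongly assumes the bytes are Windows-1250.
# errors="replace" keeps the sketch from raising on undefined byte values.
misread = stored.decode("cp1250", errors="replace")

# The reverse mistake: Windows-1250 bytes read as UTF-8.
stored_1250 = text.encode("cp1250")
misread_reverse = stored_1250.decode("utf-8", errors="replace")

print(stored.hex(), repr(misread))
print(stored_1250.hex(), repr(misread_reverse))
```

In both directions the bytes themselves are untouched; only the interpretation differs, which is why the fault has to be located at the hand-over between the two components.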

And we're not going to be able to solve an encoding problem by looking at
characters rendered on the page (which is what you're asking us to do). You
can only solve it by looking at bytes.

Michael Kay
Saxonica

On 28 Jan 2023, at 16:23, Jean-Paul Rehr
rehrjb@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

It seems I am stuck in 2004 problems.

I am trying to get an RTF-compliant output from XSLT (3.0) but as usual, like
back in the old mailing lists, characters aren't coming through. Has this been
solved with a particular encoding yet?

My test situation is with this node:

<node>C) C( C  B( B0 C.</node>

And this stylesheet using output encoding Windows-1250:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0">

  <xsl:mode on-no-match="shallow-copy"/>
  <xsl:output method="text" encoding="Windows-1250"/>

  <xsl:template match="/">
    {\rtf1
\par <xsl:apply-templates/>
     }
  </xsl:template>

</xsl:stylesheet>

I get an rtf document with these characters

C ? ? B$ B. C4

Many thanks in advance,
Jean-Paul

PS: incidentally https://xsltfiddle.liberty-development.net/6qLYEp2 even says
it won't output because the characters are unsupported. So this just adds to
my confusion about how to make any of this work.

