Re: [xsl] Form feed character () in decoded xs:base64Binary

From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 13 Jul 2020 18:14:17 -0000
You could use bin:to-octets, then look for and remove any octets representing
invalid characters... Not easy. The theory is that (if XML 1.1 isn't enabled)
then the xs:string data type doesn't allow a FF character, therefore any
operation that tries to generate one must fail.
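
One way to read that suggestion, as an untested sketch rather than a confirmed fix: filter the octets before decoding, so that no illegal character ever reaches an xs:string. It assumes the payload is UTF-8, where every octet below 0x20 is a standalone control character and never part of a multi-byte sequence; other encodings would need different handling. The function name local:decode-xml-safe and its namespace are hypothetical; bin:to-octets, bin:from-octets and bin:decode-string come from the EXPath Binary module already used in the original question. Only the C0 controls are dropped here; other characters that XML 1.0 forbids (such as U+FFFE and U+FFFF) are not handled.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:bin="http://expath.org/ns/binary"
    xmlns:local="urn:example:local"
    exclude-result-prefixes="#all">

  <!-- Hypothetical helper, not from the thread: decodes base64 data as UTF-8
       after dropping octets below 0x20 other than tab (9), LF (10) and CR (13),
       which are the only C0 controls XML 1.0 allows. Assumes UTF-8 input. -->
  <xsl:function name="local:decode-xml-safe" as="xs:string?">
    <xsl:param name="in" as="xs:base64Binary?"/>
    <xsl:sequence select="
      if (empty($in)) then ()
      else bin:decode-string(
             bin:from-octets(
               bin:to-octets($in)[. ge 32 or . = (9, 10, 13)]),
             'UTF-8')"/>
  </xsl:function>

</xsl:stylesheet>
```

For example, local:decode-xml-safe(xs:base64Binary('SGVsbG8MV29ybGQ=')) takes the encoding of "Hello", a form feed and "World", and should yield "HelloWorld". Filtering at the octet level sidesteps the problem that neither the stylesheet source nor codepoints-to-string() can express the form feed itself under XML 1.0.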

Michael Kay
Saxonica



> On 13 Jul 2020, at 19:00, Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> What happens if you use version="1.1" in the XML declaration (of the
> stylesheet)?
>
> On 13.07.2020 19:54, Martynas Jusevičius martynas@xxxxxxxxxxxxx wrote:
>> Hi,
>> I'm transforming large JSON files with some email data using XSLT 3.0.
>> They contain xs:base64Binary literals which I'm decoding using
>> bin:decode-string() and want to include the decoded values in the
>> output XML.
>> The problem is that some of the decoded string values have illegal XML
>> 1.0 characters in them, such as Form feed (&#xc;).
>> I want to remove them but cannot find a way.
>> I can't use translate(., '&#xc;', '') because the stylesheet would not
>> be well-formed anymore.
>> I can't even use replace(., codepoints-to-string(12), '') because I
>> get this error (with Saxon 10.1 EE):
>>     codepoints-to-string(): invalid XML character [xc]. Found while
>> atomizing the second argument of fn:replace()
>> Are there any native XSLT options here?
>> Thanks.
