Re: [xsl] JSON-encoding strings in XSLT 2.0

Subject: Re: [xsl] JSON-encoding strings in XSLT 2.0
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Wed, 6 Nov 2013 15:37:08 +0000
I must admit I wasn't really thinking in terms of performance. I guess
for performance it would be better to write

<xsl:variable name="json-escapes">
 <esc j="\\" x="\"/>
 <esc j="\n" x="&#10;"/>
 ...
</xsl:variable>

<xsl:key name="json-escapes" match="esc" use="@x"/>

<xsl:analyze-string select="$in" regex="\\|'|\n|....">
 <xsl:matching-substring>
  <xsl:value-of select="key('json-escapes', ., $json-escapes)/@j"/>
 </xsl:matching-substring>
 <xsl:non-matching-substring>
   <xsl:value-of select="."/>
 </xsl:non-matching-substring>
</xsl:analyze-string>
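
For anyone who wants to try the lookup-table idea end to end, here is a
minimal sketch of a complete stylesheet. Several things in it are
assumptions for illustration only: the function name f:json-escape, the
namespace urn:example:json, and the particular list of escapes, which is
not claimed to be complete. The table is keyed on the raw character (@x)
and returns the JSON form (@j). Only tab, newline and carriage return are
listed among the control characters, since those are the only ones an
XML 1.0 document can contain.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:json"
    exclude-result-prefixes="xs f">

  <!-- Lookup table: @x is the raw character, @j is its JSON escape.
       Illustrative list only; extend it as needed. -->
  <xsl:variable name="json-escapes">
    <esc x="\" j="\\"/>
    <esc x="&quot;" j="\&quot;"/>
    <esc x="&#9;" j="\t"/>
    <esc x="&#10;" j="\n"/>
    <esc x="&#13;" j="\r"/>
  </xsl:variable>

  <!-- Index the table by the raw character. -->
  <xsl:key name="json-escapes" match="esc" use="@x"/>

  <!-- Hypothetical helper: returns $in with the listed characters escaped. -->
  <xsl:function name="f:json-escape" as="xs:string">
    <xsl:param name="in" as="xs:string"/>
    <xsl:variable name="parts" as="xs:string*">
      <xsl:analyze-string select="$in" regex="[\\&quot;\t\n\r]">
        <xsl:matching-substring>
          <!-- The context item here is a string, not a node, so the
               three-argument form of key() is needed. -->
          <xsl:sequence select="string(key('json-escapes', ., $json-escapes)/@j)"/>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:sequence select="."/>
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:sequence select="string-join($parts, '')"/>
  </xsl:function>

  <!-- Escape every text node processed in this mode. -->
  <xsl:template match="text()" mode="json-identity">
    <xsl:value-of select="f:json-escape(.)"/>
  </xsl:template>

</xsl:stylesheet>
```

With this table, the input a"b\c comes out as a\"b\\c. Each string is
scanned once, rather than once per nested replace() call. Whether that
is actually faster on a given processor is something to measure.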

Michael Kay
Saxonica

On 6 Nov 2013, at 14:18, Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx> wrote:

> The DataPower appliance used to have this cascade of regexp:replace()
> statements in stylesheet "store:///jsonx2json.xsl" for "<json:string>" escaping.
>
>
> A customer raised a PMR about poor performance of JSONX <json:string>
> conversions to JSON (which he demonstrated with stylesheet profiling).
>
> I fixed that in this May's fixpack by introducing a new extension function
> that does the escaping in firmware rather than in XSLT, called as
> "dp:encode(., 'json-escape')":
> http://www-01.ibm.com/support/docview.wss?uid=swg1IC90781
>
> This improved conversion runtime for a 2.2MB JSONX customer sample file
> with 740KB JSON output by a factor of 11.
>
>
> So if you have to do JSON escaping in XSLT, you have no choice.
> If not, it is better to do it in a new extension function.
>
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Level 3 support for XML Compiler team and Fixpack team lead
> WebSphere DataPower SOA Appliances
> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
> https://twitter.com/HermannSW/     http://www.stamm-wilbrandt.de/ce/
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz
> Geschaeftsfuehrung: Dirk Wittkopp
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
>
>
> From:    Michael Kay <mike@xxxxxxxxxxxx>
> To:      xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Date:    10/29/2013 12:46 PM
> Subject: Re: [xsl] JSON-encoding strings in XSLT 2.0
>
> On 29 Oct 2013, at 11:02, Martynas Jusevičius <martynas@xxxxxxxxxxxx> wrote:
>
>> Thanks Michael. I was looking at http://json.org and here's what I came
>> up with:
>>
>>   <xsl:template match="text()" mode="json-identity">
>>       <xsl:value-of
>> select="replace(replace(replace(replace(replace(replace(., '\\',
>> '\\\\'), '''', '\\'''), '&quot;', '\\&quot;'), '&#09;', '\\t'),
>> '&#10;', '\\n'), '&#13;', '\\r')"/>
>>   </xsl:template>
>>
>> Can this be improved?
>
> Well, I'm not going to check that the list of characters to be escaped is
> complete, but you've got the right idea. I would code it like this for
> readability:
>
> <xsl:template match="text()" mode="json-identity">
>       <xsl:variable name="v" select="."/>
>       <xsl:variable name="v" select="replace($v, '\\', '\\\\')"/>
>       <xsl:variable name="v" select="replace($v, '&quot;', '\\&quot;')"/>
>       ...
>       <xsl:value-of select="$v"/>
> </xsl:template>
>
> or in 3.0 you can use the "!" operator for function chaining:
>
> <xsl:template match="text()" mode="json-identity">
>       <xsl:value-of select="replace(., '\\', '\\\\') ! replace(., '&quot;', '\\&quot;') ! ....."/>
> </xsl:template>
>
> Michael Kay
> Saxonica
>
>>
>> On Tue, Oct 29, 2013 at 10:37 AM, Michael Kay <mike@xxxxxxxxxxxx> wrote:
>>>
>>> There's no built-in function for the job, but picking out the characters
>>> that need special treatment (e.g. replacing newline by "\n") isn't
>>> difficult. Handling astral characters is a bit tricky because JSON
>>> requires them, when written as \u escapes, to be represented as a
>>> surrogate pair, but again the logic for that isn't really difficult.
>>>
>>> Michael Kay
>>> Saxonica
>>>
>>> On 29 Oct 2013, at 00:56, Martynas Jusevičius <martynas@xxxxxxxxxxxx> wrote:
>>>
>>>> Hey,
>>>>
>>>> is there some way in XSLT 2.0 to encode strings for use in JSON? In my
>>>> case, the stylesheet has to encode all text nodes in an XHTML fragment
>>>> which then gets passed to a WYSIWYM editor constructor. Could this be
>>>> done as an identity transform?
>>>>
>>>> I had solved this problem when I used XSLT 1.0 on PHP by calling
>>>> json_encode() as extension function, but now I'm in the Java world.
>>>> http://php.net/manual/en/function.json-encode.php
>>>>
>>>> Martynas
>>>> graphityhq.com
