RE: [xsl] analyze-string regex

Subject: RE: [xsl] analyze-string regex
From: "Rushforth, Peter" <Peter.Rushforth@xxxxxxxxxxxxxxxxx>
Date: Thu, 27 Mar 2014 17:06:42 +0000

Hi Graydon,

> > allowing me to generate a sequence of elements from a string
> > containing multiple objects that I match.  If that is true (the spec
> > says it should be, I think), then my regex is faulty.  Could you offer
> > any suggestions about the following, please?
>
> Don't get yourself stuck writing a regex like that?

Thanks, I appreciate the soundness of that advice (now).

>
> There might be a JSON-to-XML library out there, which'd be my first choice.


Yes, that was my first instinct too.  I found a generic json2xml.xsl
library, but it performed too slowly.  I figured I needed something more
specific in order to speed things up.


>
> If not, I couldn't figure out precisely what you were trying to do -- I
> would have had to be able to figure out which regex sub-match was which
> -- but the following is the sort of thing I think works a lot better as
> an approach.  Saw up the input with tokenize when you can (the "we don't
> need quotes when we've got braces" bit of JSON isn't a help!) and get
> some use out of the regular structure, restricting string matches to
> replace and relatively short and simple stuff you (or at least I!) can
> comprehend.

I would normally work with tokenize as well, but I thought this might be
a good opportunity to use some declarative code which (might) perform
better.

What I came up with seems to work OK:

  <xsl:function name="ex:locationJson2Options">
    <xsl:param name="json"/>
    <!-- Capture groups, numbered by opening parenthesis:
           1      whole object
           2, 3   "title" pair, title value
           4, 5   "qualifier" pair, qualifier value
           6, 7   "type" pair, type value
           8      bbox-or-geometry alternation
           9, 10  bbox branch
           11, 12 "bbox" pair, text between the brackets
           13, 14 "geometry" pair, geometry object (bbox branch)
           15     geometry-only branch
           16, 17 "geometry" pair, geometry object (geometry-only branch) -->
    <xsl:variable name="regexps" select="'(\{.*?(&quot;title&quot;:.*?&quot;(.*?)&quot;).*?(&quot;qualifier&quot;:.*?&quot;(.*?)&quot;).*?(&quot;type&quot;:.*?&quot;(.*?)&quot;).*?((((&quot;bbox&quot;:.*?\[(.*?)\]).*?(&quot;geometry&quot;:.*?(\{.*?\})).*?\}{1,}))|((&quot;geometry&quot;:.*?(\{.*?\})).*?\}{1,})))'"/>
    <xsl:analyze-string select="$json" regex="{$regexps}" flags="s">
      <xsl:matching-substring>
        <!-- if a bbox exists we've got an option -->
        <xsl:if test="regex-group(11)">
          <xsl:element name="option">
            <xsl:if test="regex-group(9)">
              <xsl:attribute name="data-bbox"
                             select="translate(regex-group(12),' ','')"/>
            </xsl:if>
            <xsl:value-of select="regex-group(3)"/>
          </xsl:element>
        </xsl:if>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>
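
For anyone following along, the one-element-per-match behaviour that the
function above relies on can be seen in a much smaller sketch.  Everything
in it is made up for illustration: the sample string, the variable names
"sample" and "title-regex", and the named template "main" are assumptions,
not my real data or code.  It only pulls out "title" values and would not
cope with escaped quotes inside a title.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: two objects, each with a "title" member. -->
  <xsl:variable name="sample"
      select="'[{&quot;title&quot;: &quot;Ottawa&quot;}, {&quot;title&quot;: &quot;Toronto&quot;}]'"/>

  <!-- Hypothetical pattern: group 1 captures the text between the quotes
       that follow "title":. -->
  <xsl:variable name="title-regex"
      select="'&quot;title&quot;\s*:\s*&quot;(.*?)&quot;'"/>

  <xsl:template name="main">
    <select>
      <xsl:analyze-string select="$sample" regex="{$title-regex}">
        <!-- Evaluated once for each match, in order of occurrence;
             text between matches is dropped because there is no
             xsl:non-matching-substring. -->
        <xsl:matching-substring>
          <option><xsl:value-of select="regex-group(1)"/></option>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </select>
  </xsl:template>

</xsl:stylesheet>
```

Run with an XSLT 2.0 processor starting at the named template (with Saxon,
for example, the -it:main option), this should give a select element
holding two option elements, Ottawa and Toronto.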

I am glad it didn't lead to madness, as it did in this case ;-)
http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

Cheers,
Peter
