RE: [xsl] How to select for ' in XPATH?

Subject: RE: [xsl] How to select for ' in XPATH?
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Wed, 5 Aug 2009 19:56:31 +0200
> I don't really know anything about the shell that you are using and any
> escaping or unescaping that it is doing, so it's a bit hard to tell.
I used this one:
http://www.xmlsh.org

> The general rule in XPath 2.0 is that if a string literal is enclosed
> in single quotes, an apostrophe should be represented as a pair of
> adjacent apostrophes.
I tried that hint as it was given by Martin, too.

In xmlsh this works:
$ xpath '/*/*/*[contains(normalize-space(.),"""")]' <tst.html
<p>apos and quot: ' " </p>
$ xpath '/*/*/*[contains(normalize-space(.),"''")]' <tst.html
<p>lt and gt: &lt; &gt; </p>
<p>apos and quot: ' " </p>
$

You are right, it is not clear what escaping/unescaping the shell does,
at least I do not see why the second xpath matches both <p>'s.


My real problem seems to be that I need an XPATH 1.0 solution, since
I want to do this in a browser environment, right?


The real problem is as follows:
- open an arbitrary web page in Firefox browser

- with a bookmarklet do an arbitrary selection in that page
  (http://en.wikipedia.org/wiki/Bookmarklet)

- then the bookmarklet generates e.g. the following xpath:
  "//*[contains(normalize-space(.),'xyz')]"
  where xyz is replaced by the actual selection data

- then Mozilla's document.evaluate() is used to determine the
  corresponding node in the DOM
  (https://developer.mozilla.org/en/Introduction_to_using_XPath_in_JavaScript)
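
For illustration, the steps above can be sketched roughly as follows. This
is only a minimal sketch, not the actual bookmarklet: the helper names
buildNaiveXPath and findFirstNode are hypothetical, and the escape/unescape
round trip is left out. It shows the selection being pasted between single
quotes without any escaping, which is why an apostrophe in the selection
breaks the expression.

```javascript
// Hypothetical helper: builds the expression as described above, by pasting
// the selection between single quotes with no escaping at all.
function buildNaiveXPath(selection) {
  return "//*[contains(normalize-space(.),'" + selection + "')]";
}

// Hypothetical helper: evaluates the expression against a DOM document.
// Only meaningful in a browser, where doc.evaluate and XPathResult exist.
function findFirstNode(doc, xpath) {
  var result = doc.evaluate(xpath, doc, null,
                            XPathResult.FIRST_ORDERED_NODE_TYPE, null);
  return result.singleNodeValue;
}

// A selection without an apostrophe gives a well-formed expression ...
var ok = buildNaiveXPath("xyz");

// ... but an apostrophe ends the string literal early and leaves a stray
// quote behind, so the expression is no longer valid XPath.
var broken = buildNaiveXPath("'");
```

In a browser, findFirstNode(document, ok) would return the first element
whose normalized text contains "xyz", or null if there is none.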

This all works really fine as long as there is no &apos; character in
the selection ...

It is just this case where I need to figure out how to pass the apos
character to document.evaluate(). For simplicity let us assume that
the selection contains the &apos; character, only.

The XPATH "//*[contains(normalize-space(.),''')]" is definitely wrong,
but what would be right?

Neither "//*[contains(normalize-space(.),'''')]" nor
"//*[contains(normalize-space(.),'\')]" works.


Interestingly "//*[contains(normalize-space(.),'%22')]"
matches for &quot;

Sadly "//*[contains(normalize-space(.),'%27')]"
does not match for &apos;

This is the JavaScript statement for the evaluation:
e = document.evaluate(unescape(s),document,null,
                      XPathResult.FIRST_ORDERED_NODE_TYPE, null);

Any hint what can be done to make this work?
(I have no control over the webpage nor control over user selection)
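
One common way around this in XPath 1.0, which has no escape mechanism
inside string literals, is to choose the delimiter based on the content
and to fall back to concat() when the text contains both kinds of quote.
The following is a hedged sketch of that idea, not a tested bookmarklet:
xpathLiteral and buildXPath are hypothetical names, and the sketch assumes
the expression reaches document.evaluate() unchanged (no escape/unescape
step in between).

```javascript
// Hypothetical helper: turns an arbitrary string into an XPath 1.0
// expression that evaluates to exactly that string.
function xpathLiteral(s) {
  // No apostrophe: single quotes are a safe delimiter.
  if (s.indexOf("'") === -1) return "'" + s + "'";
  // Apostrophe but no double quote: delimit with double quotes instead.
  if (s.indexOf('"') === -1) return '"' + s + '"';
  // Both kinds of quote: split on the apostrophe and glue the pieces back
  // together with concat(), writing each apostrophe as the literal "'".
  var parts = s.split("'");
  var args = [];
  for (var i = 0; i < parts.length; i++) {
    if (i > 0) args.push('"\'"');
    if (parts[i] !== "") args.push("'" + parts[i] + "'");
  }
  return "concat(" + args.join(",") + ")";
}

// Hypothetical helper: same expression shape as in the bookmarklet, but
// with the selection quoted by xpathLiteral().
function buildXPath(selection) {
  return "//*[contains(normalize-space(.)," + xpathLiteral(selection) + ")]";
}
```

For a selection consisting only of the apostrophe, buildXPath() produces
//*[contains(normalize-space(.),"'")], which is a valid XPath 1.0
expression.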


Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Developer, XML Compiler
WebSphere DataPower SOA Appliances
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Martin Jetter
Management: Erich Baier
Registered office: Boeblingen
Registration court: Amtsgericht Stuttgart, HRB 243294


From: "Michael Kay" <mike@xxxxxxxxxxxm>
Date: 08/05/2009 07:20 PM
To: <xsl-list@xxxxxxxxxxxxxxxxxxxxxx>
Subject: RE: [xsl] How to select for &apos; in XPATH?
Please respond to: xsl-list@xxxxxxxxlberrytech.com

I don't really know anything about the shell that you are using and any
escaping or unescaping that it is doing, so it's a bit hard to tell. The
general rule in XPath 2.0 is that if a string literal is enclosed in single
quotes, an apostrophe should be represented as a pair of adjacent
apostrophes.

Regards,

Michael Kay
http://www.saxonica.com/
http://twitter.com/michaelhkay



> -----Original Message-----
> From: Hermann Stamm-Wilbrandt [mailto:STAMMW@xxxxxxxxxx]
> Sent: 05 August 2009 18:04
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] How to select for &apos; in XPATH?
>
>
> Hello,
>
> I tried to select for special characters with XPATH below.
> While I succeeded for some I am unable to select for the
> &apos; character (') and got an error message.
>
> Any hint how this can be done?
>
> $ xmlsh
> $ cat tst.html
> <html><body>
> <p>lt and gt: &lt; &gt; </p>
> <p>apos and quot: &apos; &quot; </p>
> </body></html>
> $ tidy -q -xml tst.html;
> <html>
>   <body>
>     <p>lt and gt: &lt; &gt;</p>
>     <p>apos and quot: ' "</p>
>   </body>
> </html>
>
> $ xpath "/*/*/*[contains(normalize-space(.),'<')]" <tst.html
> <p>lt and gt: &lt; &gt; </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'>')]" <tst.html
> <p>lt and gt: &lt; &gt; </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\"')]" <tst.html
> <p>apos and quot: ' " </p>
> $ xpath "/*/*/*[contains(normalize-space(.),'\'')]" <tst.html
> Exception running: xpath
> net.sf.saxon.s9api.SaxonApiException: XPath syntax error at char 34 in
> {...ontains(normalize-space(.),...}:
>     Unmatched quote in expression
> $
>
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Developer, XML Compiler
> WebSphere DataPower SOA Appliances
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Chairman of the Supervisory Board: Martin Jetter
> Management: Erich Baier
> Registered office: Boeblingen
> Registration court: Amtsgericht Stuttgart, HRB 243294
