Re: [xsl] Scraping to Analyze Structure

Subject: Re: [xsl] Scraping to Analyze Structure
From: "Hank Ratzesberger xml@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 8 Sep 2016 18:40:51 -0000
Ah, yes, curl and tagsoup: that's a good strategy.

Still, since this application requires many inputs to reach its various
pages, Selenium or some other WebDriver is needed to get to them.

I found the xslt example here:

http://stackoverflow.com/questions/953197/how-do-you-output-the-current-element-path-in-xslt

and it outputs something like this:


/html/body/div/div[4]/table/tbody/tr/td[3]/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[6]/td[2]/p/span[1]/strong
/html/body/div/div[4]/table/tbody/tr/td[3]/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[6]/td[2]/p/span[1]/br
/html/body/div/div[4]/table/tbody/tr/td[3]/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[6]/td[2]/p/span[2]
/html/body/div/div[4]/table/tbody/tr/td[3]/div/div[2]/div/div[2]/div/div/div/table/tbody/tr[6]/td[2]/p/span[2]/a

So, that's what I was looking for.  I am having a little trouble getting
this to work in Java, though.
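
For the Java side, one possible route is sketched below. It is not the XSLT
from the Stack Overflow answer; it is a plain DOM walk using only the JDK's
built-in parser, and it assumes the page has already been converted to
well-formed XML (for example by tagsoup). The class name ElementPaths and its
methods are made up for illustration. Like the listing above, it adds a
position such as [2] only when an element has same-named siblings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: lists an absolute path for every element in a document.
public class ElementPaths {

    /** Returns one path per element, in document order. Input must be well-formed XML. */
    public static List<String> paths(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        walk(doc.getDocumentElement(), "", out);
        return out;
    }

    // Depth-first walk: record this element's path, then visit its element children.
    private static void walk(Node element, String parentPath, List<String> out) {
        String path = parentPath + "/" + step(element);
        out.add(path);
        for (Node c = element.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                walk(c, path, out);
            }
        }
    }

    // Element name, followed by [n] only if siblings with the same name exist.
    private static String step(Node element) {
        String name = element.getNodeName();
        int before = 0;
        int after = 0;
        for (Node s = element.getPreviousSibling(); s != null; s = s.getPreviousSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE && s.getNodeName().equals(name)) {
                before++;
            }
        }
        for (Node s = element.getNextSibling(); s != null; s = s.getNextSibling()) {
            if (s.getNodeType() == Node.ELEMENT_NODE && s.getNodeName().equals(name)) {
                after++;
            }
        }
        return (before + after == 0) ? name : name + "[" + (before + 1) + "]";
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in for a scraped page.
        String xml = "<html><body><p><span><strong>a</strong><br/></span>"
                + "<span><a>b</a></span></p></body></html>";
        for (String p : paths(xml)) {
            System.out.println(p);
        }
    }
}
```

Writing these paths to a text file on each test run would allow two runs to be
compared with an ordinary diff.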

Best,
Hank



On Mon, Aug 29, 2016 at 5:28 PM, Ihe Onwuka ihe.onwuka@xxxxxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> On bash I do
>
> curl theWebPage.html | java -jar $HOME/tagsoup-1.2.1.jar --nons  | java
> -jar $HOME/saxon9he.jar -s:- -xsl:yourXSLTFile.xsl
>
> which pipes the web page under test into tagsoup, which converts it to
> well-formed XML, which I then pipe into an XSL transformation.
>
> I don't bother with things like Selenium for exactly the reasons you are
> complaining about but of course your team may not buy into that.
>
>



> On Mon, Aug 29, 2016 at 7:25 PM, Hank Ratzesberger xml@xxxxxxxxxxxx <
> xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>> Hi XSL List,
>>
>> I am hoping to improve our test automation built on Selenium. The XPath
>> to elements in our tests is complicated. Any changes break the workflow, and
>> fixing the XPath is a slow, manual process.
>>
>> If, in the process of running a test, the web page were scraped and
>> put into an XML file, or even a text file, with the XPath to all inputs and
>> other controls, differences could be reported, and those differences might
>> even be cut and pasted to fix the test in the next update.
>>
>> In any case, processing this way could rationalize / normalize the xpath
>> to all controls.  This way, developers don't have to keep deciphering when
>> pages change.
>>
>> Has anyone here seen something like this? It would seem to be something
>> xslt was made for.
>>
>> Best,
>> Hank
>>
>> --
>> Hank Ratzesberger
>> XMLWerks.com
>>
>
>



-- 
Hank Ratzesberger
XMLWerks.com
