[xsl] feasibility of HTML input

Subject: [xsl] feasibility of HTML input
From: didoss@xxxxxxxxxxx
Date: Fri, 17 Mar 2006 17:04:12 +0000
I'm new to the list and to XSL and XSLT.

The goal of this e-mail is just to confirm the feasibility of my endeavor.  It 
would be a bonus if someone pushed me in a helpful direction - or I can keep 
wandering, which is OK too.

I haven't found much about the feasibility of using an HTML file as input.  I 
didn't find anything useful through Google searches, though being new to XSL and 
XSLT, I might not have entered the right phrase.  The two O'Reilly books that I 
have didn't clearly direct me towards a solution either - but they also didn't 
say that it couldn't be done.

Digging through the FAQ here, I *did* finally find a couple of references to 
using HTML as input.  That at least gave me confidence that this is not a 
completely insane idea.  I didn't get a clear idea of the requirements, but I 
definitely understood that I should TIDY my HTML before trying to parse it.  :)

So, here I am thinking that it might be possible, but I have spent a bit of time 
digging, and decided that I might want to check with the experts before spinning 
wheels further.

=========================================
Is this feasible? Worthwhile? Better done with another utility?
=========================================

My team produces nightly JUnit reports and Emma coverage reports for our code.  
I have added a task to copy off the top-level HTML pages for these results for 
historical purposes.  I would like to be able to run a transform across the 
files in each directory (one transform for JUnit and one for Emma) to create 
summary files (probably comma-delimited, so they can be pulled into Excel).  
The summary files could then be used to recognize and learn from trends in these 
results.

If this is feasible and worthwhile, and not better done with another utility, I 
will send my current xsl and what I'm running into with it.

Thanks for any advice and/or direction you can provide,
Dianne
