[xsl] [ANN] Post statistics (SVG via XSLT)

Subject: [xsl] [ANN] Post statistics (SVG via XSLT)
From: "Joris Gillis" <roac@xxxxxxxxxx>
Date: Tue, 06 Sep 2005 21:21:19 +0200
Hi posters of this list,

As an exercise to finally start with XSLT 2.0, I've created a set of stylesheets that produce a graphical representation of the post activity on this list (or any other list).
So what does the output look like? http://users.telenet.be/root-jg/f/postStat/rendered.png is the PNG version of the generated SVG file, http://users.telenet.be/root-jg/f/postStat/result.svg. Due to a MIME-type problem, you'll need to save the SVG file to disk before viewing it; alternatively, try this HTML container: http://users.telenet.be/root-jg/f/postStat/container.html

==The process of generating the output==
In what follows, all URLs are relative to this xml:base: "http://users.telenet.be/root-jg/f/postStat/".
For your convenience, I've bundled this base directory: http://users.telenet.be/root-jg/f/postStat/postStat.zip

===The input XML===
The stylesheet takes a thread-based archive file from biglist as input (found at http://www.biglist.com/lists/xsl-list/archives/#browse). Unfortunately, the format is HTML (and not even valid HTML), so I had to tidy it up first (http://infohound.net/tidy/).
A sample input XML is this tidied-up XHTML version of the thread-based archive of this list:
archive.xml (the original was www.biglist.com/lists/xsl-list/archives/200508/threads.html)
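
For illustration, here is a minimal sketch of reading such a tidied file with XSLT 2.0. The post does not describe the markup of the archive page, so the sketch only lists every hyperlink it finds; it assumes the tidied file puts its elements in the XHTML namespace, which is why xpath-default-namespace is set. It is not one of the stylesheets in the bundle.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative only: not part of the postStat bundle.
     Assumption: the tidied input uses the XHTML namespace. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.w3.org/1999/xhtml">

  <xsl:output method="text"/>

  <!-- Print one line per hyperlink: target URL, a tab, then the link text. -->
  <xsl:template match="/">
    <xsl:for-each select="//a[@href]">
      <xsl:value-of select="@href, normalize-space(.)" separator="&#9;"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Without xpath-default-namespace (or a prefix bound to the XHTML namespace), an unprefixed name such as "a" in a path would only match elements in no namespace and would select nothing from a namespaced XHTML file.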

===The stylesheets===
I used a chain of stylesheets to make the process reusable and extensible.
* The main stylesheet: time-based-linear-svg-stat.xsl
** imports: post-archiver.xsl
** includes: color-scheme-posters.xsl

They are all written in XSLT 2.0.
The processor I used was Michael Kay's Saxon 8.5.
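
As a rough sketch of how such a chain is wired together, the skeleton below shows a main stylesheet that imports one module and includes another. Only the three file names come from the list above; the output settings, the SVG dimensions and the template body are assumptions made for illustration, and the real stylesheets in the bundle may be organised differently.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical skeleton of time-based-linear-svg-stat.xsl.
     Only the href values are taken from the post. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2000/svg">

  <!-- xsl:import must come before any other top-level element.
       Imported declarations have lower import precedence, so the
       main stylesheet can override them. -->
  <xsl:import href="post-archiver.xsl"/>

  <!-- xsl:include merges declarations at the same precedence
       as the including stylesheet. -->
  <xsl:include href="color-scheme-posters.xsl"/>

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Width and height are placeholder values. -->
    <svg width="800" height="600">
      <!-- The dots of the chart would be generated here. -->
    </svg>
  </xsl:template>

</xsl:stylesheet>
```

The difference matters for reuse: a template rule in the importing stylesheet wins over a conflicting rule from the imported module, whereas conflicting rules between an included module and its includer are resolved by priority alone.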

===The output===
The output is an SVG graphic representing a time-based linear dot chart.
Each of the top 10 posters is assigned a color of their own.
I would have liked to scale the dots according to each post's size, but biglist doesn't provide this information. Please point me to better archive pages (preferably XHTML), if you know of any.
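
To make the idea concrete, here is a minimal XSLT 2.0 sketch of a colored dot chart. It is not the code from the bundle. It assumes a hypothetical intermediate document of the form <posts><post author="..." date="2005-08-01T14:05:00Z"/></posts>, an arbitrary ten-color palette, and a layout with the time of day on the x axis and the day of the month on the y axis; none of these details are stated in the post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch; input format, palette and layout are assumptions. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/2000/svg"
    exclude-result-prefixes="xs">

  <!-- Ten SVG color keywords, one per ranked poster. -->
  <xsl:variable name="palette" as="xs:string*"
      select="('red', 'blue', 'green', 'orange', 'purple',
               'brown', 'teal', 'olive', 'navy', 'maroon')"/>

  <!-- Names of the ten most active posters, most active first. -->
  <xsl:variable name="top10" as="xs:string*">
    <xsl:for-each-group select="/posts/post" group-by="@author">
      <xsl:sort select="count(current-group())" order="descending"/>
      <xsl:if test="position() le 10">
        <xsl:sequence select="string(current-grouping-key())"/>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:variable>

  <xsl:template match="/posts">
    <!-- 1440 = minutes per day; 20 units of height per calendar day. -->
    <svg width="1440" height="640">
      <xsl:for-each select="post">
        <xsl:variable name="when" select="xs:dateTime(@date)"/>
        <!-- Position of the author in the ranking, or empty if unranked. -->
        <xsl:variable name="rank"
            select="index-of($top10, string(@author))"/>
        <circle r="3"
            cx="{hours-from-dateTime($when) * 60 + minutes-from-dateTime($when)}"
            cy="{day-from-dateTime($when) * 20}"
            fill="{if (exists($rank)) then $palette[$rank] else 'gray'}"/>
      </xsl:for-each>
    </svg>
  </xsl:template>

</xsl:stylesheet>
```

Grouping with xsl:for-each-group and sorting the groups by count(current-group()) is what makes the ranking a few lines long in XSLT 2.0. If post sizes were available, the fixed r="3" could be replaced by an expression on that value.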

==Conclusions drawn from the statistics==
* There are fewer posts during weekends
* Most activity is between 8 am and 11 pm (GMT)
* Michael Kay is the most active poster; moreover, his posts span over 16 hours.

Please note that I'm just an amateur without statistical education. This part is just for fun.

==My motivations to code it==
* I like statistics. (This is not the first time I have written an XSLT stylesheet to present stats: http://www.ticalc.org/archives/files/fileinfo/300/30075.html)
* It's an exercise to get used to XSLT 2.0 and XPath 2.0, after reading Michael Kay's books on these topics.
* I used this project to test my coding efficiency. I imposed a deadline of 12 hours on myself.
* It makes use of SVG, which I like to promote; it's used far too little IMHO.
* I hope the process needed to obtain the input file makes it clear that the switch to XHTML should have happened yesterday rather than tomorrow.

==Why did I post this on this list?==
* Since I used statistics of this very list as a sample, some posters here might be interested.
* It's coded in XSLT, the topic of this list. Feel free to view the stylesheets and criticize them.

Please don't hesitate to reply with comments of any kind.

Joris Gillis (http://users.telenet.be/root-jg/me.html)