Re: [xsl] Data science, data analytics using XSLT streaming

Hi,

I agree with Andrew. These projects are fun, and XSLT pipelines are
well suited to them, because they are capable of exposing the semantic
issues and keeping syntax out of the way. This is even true when the
first step is a rendering of a non-XML format into an XML
representation, and the last step is serializing the data in a form
optimized for something else (such as your query engine of choice).

And no, streaming is not necessary, although it can help.

Plus, the biggest problem isn't scale anyway: it's the semantic
integrity of the data or (more likely) the lack thereof. This can be
compounded by non-technical issues such as data owners not seeing the
information they actually have because they are blinded by their
expectations of what it is "supposed" to be.

Cheers, Wendell

Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^

On Tue, Nov 5, 2013 at 5:26 AM, Andrew Welch <andrew.j.welch@xxxxxxxxx> wrote:
>> XSLT streaming is all about processing large amounts of (XML-formatted) data.
>>
>> So XSLT streaming should fit in the "data science" and "data analytics" categories.
>>
>> Broad Question: Would you provide a scenario/example of doing data science/data analytics using XSLT streaming please?
>
> Typically the data is held in multiple files rather than one big one, so
> you don't necessarily need streaming, just a set of steps that process
> directories of XML into various intermediate formats, then into the
> final presentation view (such as a table with the data grouped,
> sorted, with counts).
>
> I've done this sort of thing a few times now and I always enjoy it.
>
>
> --
> Andrew Welch
> http://andrewjwelch.com
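
A rough sketch of the kind of multi-step pipeline described above (many
small XML inputs, an intermediate form, then a grouped and sorted table
with counts). It is written in Python rather than XSLT purely for
illustration; the element name "order", the attribute "region", and the
sample documents are all hypothetical and not taken from the thread.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample input: each string stands in for one file in a directory.
DOCS = [
    "<orders><order region='north'/><order region='south'/></orders>",
    "<orders><order region='north'/><order region='east'/>"
    "<order region='north'/></orders>",
]


def extract(xml_text):
    """Step 1: reduce one document to an intermediate form (a list of region names)."""
    root = ET.fromstring(xml_text)
    return [order.get("region") for order in root.iter("order")]


def summarize(docs):
    """Step 2: group by region, count, and sort by descending count, then by name."""
    counts = Counter()
    for doc in docs:
        counts.update(extract(doc))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def render(rows):
    """Step 3: serialize the grouped counts as a plain-text table."""
    return "\n".join(f"{region:<8}{count:>5}" for region, count in rows)


if __name__ == "__main__":
    print(render(summarize(DOCS)))
```

Each document is parsed and discarded in turn, so memory use is bounded by
the largest single file rather than by the whole collection, which is why
streaming is optional in this arrangement.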

Current Thread

[xsl] Data science, data analytics using XSLT streaming
- Costello, Roger L. - 5 Nov 2013 10:12:33 -0000
  - Martynas Jusevičius - 5 Nov 2013 10:25:31 -0000
  - Andrew Welch - 5 Nov 2013 10:26:58 -0000
    - Wendell Piez - 5 Nov 2013 16:35:32 -0000 <=
      - Ihe Onwuka - 5 Nov 2013 17:29:48 -0000
      - Michael Kay - 5 Nov 2013 17:44:11 -0000
      - Ihe Onwuka - 5 Nov 2013 18:23:11 -0000
  - Ihe Onwuka - 5 Nov 2013 10:41:45 -0000