Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?

Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Michael Kay mike@xxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 3 May 2024 15:46:37 -0000
>Is this an argument that streaming should be in the core spec?

It depends what you are trying to achieve. There are too many good XSLT
processors that have fallen by the wayside because their implementors weren't
able to fund further development. You're not going to improve that situation
by making it even more costly to implement the spec.

Michael Kay
Saxonica

> On 3 May 2024, at 16:36, Piez, Wendell A. (Fed) wendell.piez@xxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Mike and XSL-List,
>
> Is this an argument that streaming should be in the core spec?
>
> Since this limit will always be there, even if it moves? (And in view of
observations on how the real world is not always as accommodating as we might
like.)
>
> Agree also with Dimitre and Liam. Know your tools. Even if an XML parser
can't swallow it whole, there are ways.
>
> Cheers, Wendell
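[Editorial note: one concrete instance of "there are ways" is incremental (streaming) parsing, which reads the document as a stream of events and discards elements as it goes, so memory stays bounded regardless of file size. A minimal sketch using Python's standard library (shown in Python rather than XSLT for brevity; the function name, the `needle` parameter, and the tiny inline document are illustrative, not from the thread):]

```python
from io import BytesIO
import xml.etree.ElementTree as ET

def find_string(stream, needle):
    """Scan an XML stream for text nodes containing `needle`,
    without ever holding the full tree in memory."""
    hits = []
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.text and needle in elem.text:
            hits.append((elem.tag, elem.text.strip()))
        elem.clear()  # discard the element's content once processed
    return hits

# Tiny stand-in for a multi-gigabyte report file
doc = b"<log><rec>ok</rec><rec>ERROR disk full</rec><rec>ERROR net</rec></log>"
print(find_string(BytesIO(doc), "ERROR"))
# → [('rec', 'ERROR disk full'), ('rec', 'ERROR net')]
```

The same idea underlies XSLT 3.0 streaming (`xsl:mode streamable="yes"`): the processor makes a single forward pass and never materialises the whole tree.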
>
> From: Michael Kay michaelkay90@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
> Sent: Friday, May 3, 2024 4:32 AM
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: Re: [xsl] An efficient XSLT program that searches a large XML
document for all occurrences of a string?
>
> On 3 May 2024, at 00:25, Dimitre Novatchev dnovatchev@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>
> If I were involved in any activity that collects and structures such large
quantities of data, I would envisage splitting this data into smaller,
manageable chunks wherever possible.
>
>
> That's a good recommendation, but it's a workaround for the fact that the
technology isn't as scalable as we would like.
>
> If a system offers you the opportunity to get an XML report of all the
transactions occurring between two dates at a range of locations, then sooner
or later someone is going to submit a query that delivers a 5 GB report, and in
an ideal world, they wouldn't have to do things differently just because the
amount of data has exceeded some arbitrary threshold.
>
> Growth in data size tends to creep up on you. The log files that we keep of
licenses issued to Saxon users are now much larger than we ever envisaged when
we started. You don't want to have to change the design just because things
have grown incrementally. We did change the design: we switched to one XML
file per year. But it would be nice if we weren't forced into that by
technology limitations.
>
> Michael Kay
> Saxonica
>
> XSL-List info and archive <http://www.mulberrytech.com/xsl/xsl-list>
