Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?

Subject: Re: [xsl] An efficient XSLT program that searches a large XML document for all occurrences of a string?
From: "Piez, Wendell A. (Fed) wendell.piez@xxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 3 May 2024 15:36:33 -0000
Mike and XSL-List,

Is this an argument that streaming should be in the core spec?

Since this limit will always be there, even if it moves? (And in view of
observations on how the real world is not always as accommodating as we might
like.)

Agree also with Dimitre and Liam. Know your tools. Even if an XML parser can't
swallow it whole, there are ways.
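
One such way, as a minimal sketch only and not something proposed in this
thread: pull the document through an incremental parser and discard each
element once it has been searched. The example below uses Python's standard
xml.etree.ElementTree.iterparse; the function name count_occurrences and the
sample data are hypothetical. It counts matches inside individual text nodes
only, so a string split across child elements is not found, and emptied
element shells stay attached to their parent until the parent closes.

```python
import io
import xml.etree.ElementTree as ET


def count_occurrences(source, needle):
    """Count occurrences of `needle` in text nodes, reading `source` incrementally.

    `source` is a filename or a binary file object. Hypothetical helper for
    illustration; not part of any library.
    """
    hits = 0
    # An "end" event fires once an element's closing tag has been parsed.
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Text before the first child is complete by the time the element ends.
        if elem.text:
            hits += elem.text.count(needle)
        # A child's tail (text after its closing tag) is only guaranteed to be
        # complete once the parent ends, so tails are checked here.
        for child in elem:
            if child.tail:
                hits += child.tail.count(needle)
        # Release the searched content, but keep this element's own tail,
        # which the parent still needs to inspect.
        del elem[:]
        elem.text = None
    return hits


if __name__ == "__main__":
    sample = (
        b"<log><entry>saxon license</entry>"
        b"<entry>other <b>saxon</b> tail saxon</entry>"
        b"<entry>none</entry></log>"
    )
    print(count_occurrences(io.BytesIO(sample), "saxon"))
```

Memory use here grows with the number of elements rather than with the
volume of text, which is usually enough to get past the point where building
the whole tree fails. A streaming XSLT 3.0 processor is the closer analogue
within XSLT itself.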

Cheers, Wendell

From: Michael Kay michaelkay90@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Friday, May 3, 2024 4:32 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] An efficient XSLT program that searches a large XML
document for all occurrences of a string?




On 3 May 2024, at 00:25, Dimitre Novatchev dnovatchev@xxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:


If I were related to any activity that collects and structures such large
quantities of data, I would envisage splitting and keeping this data into
smaller, manageable chunks, wherever possible.

That's a good recommendation, but it's a workaround for the fact that the
technology isn't as scalable as we would like.

If a system offers you the opportunity to get an XML report of all the
transactions occurring between two dates at a range of locations, then sooner
or later someone is going to submit a query that delivers a 5 GB report, and in
an ideal world, they wouldn't have to do things differently just because the
amount of data has exceeded some arbitrary threshold.

Growth in data size tends to creep up on you. The log files that we keep of
licenses issued to Saxon users are now much larger than we ever envisaged when
we started. You don't want to have to change the design just because things
have grown incrementally. We did change the design: we switched to one XML
file per year. But it would be nice if we weren't forced into that by
technology limitations.

Michael Kay
Saxonica
