Re: [xsl] Approach to transform 250GB xml data

Subject: Re: [xsl] Approach to transform 250GB xml data
From: "Imsieke, Gerrit, le-tex gerrit.imsieke@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Wed, 10 Sep 2014 10:37:51 -0000
Out of curiosity: how do you intend to access/process the 250 GB once they are transformed?

If it is a huge DB dump, maybe you can dump it in slices or, if it is an XML database with XSLT 2 capabilities, transform it in place.

Gerrit

On 10.09.2014 11:48, Vishnu vishnu@xxxxxxxxxxxx wrote:
The transformation only renames elements or attributes and changes the tree structure (no sorting is involved).

Thanks!

Vishnu Singh

________________________________________
From: Michael Kay mike@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Sent: Wednesday, September 10, 2014 1:42 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Approach to transform 250GB xml data

It is not practical to transform this using XSLT except by use of a streaming XSLT processor such as Saxon-EE, and even then it depends on the detailed nature of the transformation to be performed. Some transformations are readily streamed (e.g. renaming all the elements), others are impossible (e.g. sorting). Tell us more about what the transformation is doing.
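
[Illustrative sketch, not part of the original message.] The "readily streamed" case of renaming elements might look roughly like the XSLT 3.0 stylesheet below. It assumes a processor that implements XSLT 3.0 streaming (such as Saxon-EE, as mentioned above). The names record, entry, id and ref are hypothetical placeholders, since the thread does not say which names are involved. Because the mode is declared streamable, the processor does not have to build the whole source tree in memory, which is what typically causes the Java heap space error on input of this size.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Declare the default mode streamable; nodes without a matching
       template rule are shallow-copied and their content processed. -->
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>

  <!-- Hypothetical element rename: <record> becomes <entry>. -->
  <xsl:template match="record">
    <entry>
      <!-- Attributes first (no movement through the stream needed),
           then a single downward pass over the children. -->
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates/>
    </entry>
  </xsl:template>

  <!-- Hypothetical attribute rename: id becomes ref. -->
  <xsl:template match="@id">
    <xsl:attribute name="ref" select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Each template rule makes at most one downward pass over its input, which is the kind of restriction streaming imposes. Changes to the tree structure that need to revisit or reorder content may not be streamable in this simple form.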

Michael Kay
Saxonica
mike@xxxxxxxxxxxx
+44 (0) 118 946 5893




On 10 Sep 2014, at 08:36, Vishnu vishnu@xxxxxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:


Hi,

I have approximately 250 GB of XML data that I want to transform using XSLT 2.0. What would be the best approach to transform this data?

I tried it with Ant, but it failed with a Java heap space error.

Please suggest.

Thanks!

Vishnu Singh
"This e-mail and any attachments transmitted with it are for the sole use of the intended recipient(s) and may contain confidential , proprietary or privileged information. If you are not the intended recipient, please contact the sender by reply e-mail and destroy all copies of the original message. Any unauthorized review, use, disclosure, dissemination, forwarding, printing or copying of this e-mail or any action taken in reliance on this e-mail is strictly prohibited and may be unlawful."





--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsieke@xxxxxxxxx, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler
