Re: [xsl] Design of XML so that it may be efficiently stream-processed

Firstly, I question the premise that XML should be designed to enable streamed
transformation. One could equally well argue that you should design it so it
doesn't need to be transformed at all. Transformation is only necessary
because the data isn't in the form you want; designing it so that it can
easily be transformed into the form you want seems a little odd. Unless
perhaps you are thinking of designing the intermediate formats in a processing
pipeline.

>
> 1. Use lots of attributes. Store in them the data needed for processing the
> node.

Certainly for data that can conveniently be represented as attributes, this
will make streamed processing easier. But don't overdo it.
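
To see why attributes help, here is a minimal sketch outside XSLT, using
Python's standard-library event parser as a stand-in for a streaming
processor. The element and attribute names (items, item, price) and the
function name are made up for illustration. At the start tag the attributes
are already known, while child elements have not yet been read:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: the first item carries its price as an attribute,
# the second carries it as a child element.
DOC = (b'<items>'
       b'<item price="5"><name>a</name></item>'
       b'<item><name>b</name><price>7</price></item>'
       b'</items>')

def prices_known_at_start(data):
    """For each <item>, report whether its price is known at the start tag."""
    known = []
    # Only "start" events: at that point the attributes are complete, but
    # the element's text and children are not guaranteed to be present.
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("start",)):
        if elem.tag == "item":
            known.append("price" in elem.attrib)
    return known

print(prices_known_at_start(DOC))
```

For the second item, a streaming consumer has to wait for (and remember) the
price child before it can act on it.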
>
> 2. Have one child element only.

No, if there are two things that should naturally be represented as child
elements, then represent them that way. There are plenty of techniques still
available for streamed processing: accumulators, xsl:iterate, fold-left(),
xsl:fork.
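
The common idea behind these techniques is to carry a small amount of state
forward through the stream instead of going back to a sibling. The sketch
below is only a Python analogue of that idea, not XSLT, and the names
(orders, order, price, qty, total_value) are hypothetical. It assumes each
price comes before its qty:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: each order has two child elements.
DOC = (b"<orders>"
       b"<order><price>3</price><qty>2</qty></order>"
       b"<order><price>10</price><qty>1</qty></order>"
       b"</orders>")

def total_value(data):
    """Sum price * qty over all orders in a single forward pass."""
    total = 0
    price = None  # remembered state, playing the role of an iteration parameter
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "price":
            price = int(elem.text)          # remember it; it cannot be revisited
        elif elem.tag == "qty":
            total += price * int(elem.text)
        elif elem.tag == "order":
            elem.clear()                    # drop the finished order's children
    return total

print(total_value(DOC))
```

Only one number is remembered per order, so having two child elements costs
almost nothing.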
>
>
> So, to enable efficient stream processing, design XML like this:
>
> <root a="..." b="..." c="...">
>      <node d="..." e="..." f="...">
>            <node g="..." h="..." i="...">
>                  <node j="..." k="..." l="...">
>                        <node m="..." n="..." o="...">
>                              <node p="..." q="..." r="...">
>                                  ...
>                              </node>
>                        </node>
>                  </node>
>            </node>
>      </node>
> </root>
>
> This results in a massively deep tree. For Gigabyte-sized XML files, the
> nesting could be a billion levels deep (or more).
>
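No, such a design is completely bizarre and defeats the whole purpose of
streaming, which is to reduce memory use. A streaming processor still has to
keep track of every element that is currently open, so the memory it needs
grows with the nesting depth.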
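
The effect of depth can be sketched with a SAX handler in Python (an
illustration only; the class and function names are invented here). It
records how many elements are open at once, which is a lower bound on what
any streaming consumer has to hold:

```python
import xml.sax

class DepthTracker(xml.sax.ContentHandler):
    """Tracks the stack of currently open elements during a streamed parse."""

    def __init__(self):
        super().__init__()
        self.open_elements = []
        self.max_open = 0

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        self.max_open = max(self.max_open, len(self.open_elements))

    def endElement(self, name):
        self.open_elements.pop()

def max_open_elements(data):
    """Return the largest number of simultaneously open elements."""
    tracker = DepthTracker()
    xml.sax.parseString(data, tracker)
    return tracker.max_open

# Same number of <node> elements, laid out as siblings versus nested.
flat = b"<root>" + b"<node/>" * 100 + b"</root>"
nested = b"<root>" + b"<node>" * 100 + b"</node>" * 100 + b"</root>"

print(max_open_elements(flat), max_open_elements(nested))
```

With siblings the stack stays at two entries however long the document is;
with nesting it grows by one entry for every node.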

I would add some more important design criteria. Put metadata and reference
information (stuff that's needed for reference throughout document processing)
at the start of the document rather than the end, or in a separate document.
Use hierarchic nesting for relationships rather than id/idref style pointers
(perhaps even if it means holding the data redundantly).
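
The cost of putting reference information at the end can be sketched as
follows, again in Python rather than XSLT and with hypothetical names (doc,
rates, rate, pay, convert). The function makes one forward pass and reports
how many records it had to hold back because the lookup data had not yet
arrived:

```python
import io
import xml.etree.ElementTree as ET

def convert(data):
    """Return (converted total, peak number of records held back)."""
    rates, pending = {}, []
    total = peak = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "rate":
            cur = elem.get("cur")
            rates[cur] = int(elem.get("value"))
            # Resolve any records that were waiting for this rate.
            total += sum(a * rates[cur] for c, a in pending if c == cur)
            pending = [(c, a) for c, a in pending if c != cur]
        elif elem.tag == "pay":
            cur, amount = elem.get("cur"), int(elem.get("amount"))
            if cur in rates:
                total += amount * rates[cur]     # can be handled immediately
            else:
                pending.append((cur, amount))    # must be buffered
                peak = max(peak, len(pending))
    return total, peak

RATES = b'<rates><rate cur="EUR" value="2"/><rate cur="USD" value="1"/></rates>'
PAYS = (b'<pay cur="EUR" amount="5"/><pay cur="USD" amount="3"/>'
        b'<pay cur="EUR" amount="1"/>')

rates_first = b"<doc>" + RATES + PAYS + b"</doc>"
rates_last = b"<doc>" + PAYS + RATES + b"</doc>"

print(convert(rates_first), convert(rates_last))
```

Both layouts give the same total, but with the rates at the end every record
has to be buffered first, and that buffer grows with the size of the
document.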

Michael Kay
Saxonica
