RE: [xsl] Advice/feedback on stylesheet?

Subject: RE: [xsl] Advice/feedback on stylesheet?
From: "Jim Stoll" <jestoll@xxxxxxxxxx>
Date: Wed, 31 Mar 2004 10:44:41 -0500
M. David,
I looked at my stylesheet again and have simplified it somewhat - as I mentioned earlier, the LEVEL_ element wasn't really needed, as top-level nodes are identifiable by the lack of a <PARENTROWID_> element - thus they can be identified w/o using LEVEL_.  So, I took the LEVEL_ stuff out.  Also, the requirement of a <ROWSET> root tag was a little arbitrary on my part - by default, the Oracle functions that result in the 'flattened' XML wrap the results in a <ROWSET> tag, but that can be set to anything by the user, so I've allowed for the root tag to be anything, w/ the assumption that the hierarchical elements are always directly under the root element, whatever it may be named. (Also, I am no longer providing the option of whether to include the root element in the output - the expectation is that the user will subsequently transform the 'reconstituted' hierarchical XML themselves anyway, so they can remove the root in their particular use if desired - KISS... :-)

At any rate, in allowing an arbitrarily-named root element, I matched top-level (ie, no parent 'data' elements) on "/*/*[not(PARENTROWID_)]" - you indicated in your prior email that you'd do "child::*" - is there a functional or performance difference between "/*/*" and "child::*" ? (I tried substituting child::* in for my current /*/*, but it didn't match properly - I think I likely need to change something else, too, but haven't taken the time to dig into it yet...)

I'm including the new XSL below - the original XML can now lose the LEVEL_ elements, too - but them being there won't hurt anything, as I do a blind copy of all sub-elements in a 'data' node, so LEVEL_ will just be treated generically as another part of the 'data' element.
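
For illustration, here is a minimal sketch of the kind of flattened input the stylesheet expects. The element names NODE and NAME and the row values are made up for this example; only ROWID_ and PARENTROWID_ are actually required:

```xml
<ROWSET>
  <NODE>
    <ROWID_>1</ROWID_>
    <NAME>root</NAME>
  </NODE>
  <NODE>
    <ROWID_>2</ROWID_>
    <PARENTROWID_>1</PARENTROWID_>
    <NAME>child</NAME>
  </NODE>
</ROWSET>
```

With the default parameter values, the result should nest the second NODE under the first, along these lines (whitespace aside; note that a leaf still gets an empty CHILDREN wrapper):

```xml
<ROWSET>
  <NODE>
    <NAME>root</NAME>
    <CHILDREN>
      <NODE>
        <NAME>child</NAME>
        <CHILDREN/>
      </NODE>
    </CHILDREN>
  </NODE>
</ROWSET>
```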

Thanks Again for all of your help!!

Jim

<<XSL>>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

	<!--
		Preconditions on the database-generated XML processed by this stylesheet:
			- there will always be a root element that is not part of the 'data' hierarchy (ie, it's part of the XML hierarchy, but not part of the original relational CONNECT BY data hierarchy)
			- all 'data' elements will be siblings underneath the root element (result of the 'flattening' nature of the database's result-set-to-XML conversion)
			- every 'data' element will have a child  'utility' element named ROWID_ that is a unique identifier for the row in the original CONNECT BY data hierarchy
			- every data element that is a child of another data element will have a child  'utility' element named PARENTROWID_ that identifies the data element's parent
	-->
						
	<!--child elements will be wrapped in an element of their own - this wrapper element may be specified by the user - defaults to CHILDREN-->
	<xsl:param name="children-tag-param">CHILDREN</xsl:param>
	
	<!--by default, the mandatory ROWID_ and PARENTROWID_ tags provided in the database-generated XML (and required for this stylesheet's operations) will not be included in the output, but can be if desired-->
	<xsl:param name="output-rowid-tag-param">false</xsl:param>
	<xsl:param name="output-parent-rowid-tag-param">false</xsl:param>

	<!--recall precondition that the database-generated XML will always have a root element (ROWSET by default, but any name is allowed), and that all 'data' elements will be siblings under the root, so index every 'data' element by its parent's ROWID_-->
	<xsl:key name="key-nodes-by-parent" match="/*/*" use="PARENTROWID_"/>
	
	<xsl:template match="/*">
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<xsl:apply-templates/>
		</xsl:copy>
	</xsl:template>
	
	<!--all top-level elements will have no PARENTROWID_ element - only match/process these, as they will in turn process their children via the key map-->
	<xsl:template match="/*/*[not(PARENTROWID_)]">
		<xsl:call-template name="duplicate-node">
			<xsl:with-param name="duplicatee-param" select="."/>
		</xsl:call-template>
	</xsl:template>
	
	<xsl:template match="/*/*[PARENTROWID_]">
		<!--these are all children of parent elements, and will be called by the parents in order to nest properly, so don't do anything for these... -->
	</xsl:template>
	
	<!--for each node - either top-level parents resulting from a match, or children resulting from key iteration - duplicate the node, its children and attributes-->
	<xsl:template name="duplicate-node">
		<xsl:param name="duplicatee-param"/>
		<xsl:copy>
			<xsl:copy-of select="@*"/>
			<!--output  the 'utility' elements if/as specified by params-->
			<xsl:copy-of select="*[not(self::thisElement) and (not(name()='ROWID_') or $output-rowid-tag-param='true') and (not(name()='PARENTROWID_') or $output-parent-rowid-tag-param='true')]"/>
			<xsl:element name="{$children-tag-param}">
				<xsl:for-each select="key('key-nodes-by-parent', ROWID_)">
					<xsl:call-template name="duplicate-node">
						<xsl:with-param name="duplicatee-param" select="."/>
					</xsl:call-template>
				</xsl:for-each>
			</xsl:element>
		</xsl:copy>
	</xsl:template>
</xsl:stylesheet>



>>
Subject: RE: [xsl] Advice/feedback on stylesheet?
From: "M. David Peterson" <m.david@xxxxxxxxxx>
Date: Tue, 30 Mar 2004 13:08:11 -0700
 

Sorry for the late reply to this Jim...  The weekend became a bit more
hectic than originally planned and I retired quite early last night to
catch up on some much needed sleep.

> building a temporary tree and of using the mode... Are there any
> concerns/drawbacks of taking this type of approach when working with
> large sets of data? (ie, would memory/resource use possibly be a
> problem?)

I don't know of any specifically, other than the obvious fact that the
larger your XSLT tree is, the more memory it will consume.  Beyond the
obvious (I'm sure you are referring more to particular methods of
transforming data taking more resources than other methods), unless
there could potentially be hundreds of different mode matches for the
same element name, I don't see it becoming much of a resource issue.
However, the better person to answer a resource-related question would
probably be Michael Kay or any of the other guys on this list who are
intimately familiar with the inner workings of XSLT processors and the
effects caused by different methods of transforming data.

> So, any further info/advice that you can provide in the context of
> this generic/service approach to the problem would be most welcome!

Let me take a look at the original source file for this, but it seems
pretty cut and dried.  Each node (whatever the name) will always be a
child element of "rowset" or whatever other name Oracle decides to call
the containing parent element.  So, using "child::*" in the match
attribute of the first template and then "child::*" in the select
attribute of the first apply-templates (within this first template) will
give you the set of children whose parent is a child of root.  Then, by
just using "*" as your match attribute along with the correct mode, you
would get the same result as if you had used the actual name of the
element.  Keep in mind that this assumes that all the grandchildren of
root have the same name or, if they don't, that the same rules for
processing apply to them.
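
As a rough sketch of the name-independent approach described above (not a tested solution for your data): the mode name "data" is just a placeholder I picked, and the second template simply copies each element where the real nesting logic would go.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- In the default mode this fires for the document element (ROWSET or
       whatever it is named), since that is the only element reached there. -->
  <xsl:template match="child::*">
    <xsl:copy>
      <!-- Hand the containing element's children (the 'data' elements)
           to the hypothetical "data" mode, whatever they are called. -->
      <xsl:apply-templates select="child::*" mode="data"/>
    </xsl:copy>
  </xsl:template>

  <!-- In "data" mode, "*" matches every data element regardless of name. -->
  <xsl:template match="*" mode="data">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

Note that as a match pattern "child::*" is the same as "*" and matches an element at any depth; it is the select in apply-templates plus the mode that limits which elements each template actually sees.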

I'll look at this a little later this afternoon and see if I can make
the same stylesheet work with the same data without using any
specific element names.

Chat with you later...

<M:D/>



-----Original Message-----
From: Jim Stoll [mailto:jestoll@xxxxxxxxxx] 
Sent: Monday, March 29, 2004 2:14 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] Advice/feedback on stylesheet?

Hi M. David,
Thanks for your reply and example - I look forward to seeing your site!
(and I'm glad that I could provide some input to your site, even if in a
wholly passive manner... :-)

I definitely learned some things from your response (and am still
contemplating a few of them) - I especially like the possibilities
associated with building a temporary tree and of using the mode
attribute of the select and match conditions - that's a new technique to
me and looks to have a lot of very cool uses!  Are there any
concerns/drawbacks of taking this type of approach when working with
large sets of data? (ie, would memory/resource use possibly be a
problem?)

As regards the original problem itself, what I'm trying to do is to
provide a generic stylesheet that users can apply to any set of data
resulting from a particular type of database query - I'm using Oracle
9.1, which allows hierarchical relational queries (ie, the relational
data can be meaningfully represented hierarchically, via level and
heritage (specifically, parent) data), and which allows xml-conversion
of relational data, but that unfortunately 'flattens' the hierarchical
relational data in the process of converting that relational data to
XML.  So, in order to do this, I need to have the 'utility' elements
that I identified (LEVEL_, ROWID_ and PARENTROWID_) that establish the
depth/heritage in the hierarchy [sidebar: in all actuality, I just need
to identify those that are top-level and those that aren't - which it
now occurs to me I could do simply based on the presence/absence of a
PARENTROWID_ element - but Oracle's CONNECT BY query produces a 'level'
value that fits the bill nicely], and then I want to be able to
generically 're-constitute' the hierarchical nature of _any_ data that
is generated via the CONNECT BY (hierarchical relational) query (but
that is subsequently squashed flat by the XML conversion process).  Thus, I
can't reference 'data' elements in the stylesheet (like NODE, SRCD,
etc), as I don't know what those will actually be in the generic/runtime
use of the stylesheet - I can only rely on the presence of these
'utility' elements that I have predicated as requirements for anyone
wishing to use this 'reconstitution' service.

So, any further info/advice that you can provide in the context of this
generic/service approach to the problem would be most welcome!

Thanks for the help!!

Jim
