RE: [xsl] Matching a recursive local element structure

Subject: RE: [xsl] Matching a recursive local element structure
From: "David Lee" <dlee@xxxxxxxxxxx>
Date: Sat, 5 Feb 2011 06:56:19 -0500
Brandon: Can you summarize the algorithm you've got so far?

It's a two-phase process.

Phase 1 (in Java):
Walk the schema using the Apache Schema API and construct an XML document
that describes, in simplified form, all types, elements, and attributes.



Elements
   Add it
   For each Attribute
      Add as child
   For each Particle
      Add as child
   [ TBD: For each Wildcard ... ???]

Particle
   If element, add as child
   If model group, for each element add as child
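
A minimal sketch of the walk outlined above, written in Python rather than the Java actually used, and against a made-up dict-based schema model instead of the Apache Schema API. The keys `name`, `attributes`, and `particles`, the function names, and the unprefixed output element names are all assumptions for illustration. Nested model groups are flattened recursively, which goes slightly beyond the outline.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy model: an element declaration is a dict; a model group
# is a plain list of particles. This is NOT the Apache Schema API.
def walk_element(decl, parent):
    """Add the element, then its attributes and particles, as children."""
    node = ET.SubElement(parent, "element")
    ET.SubElement(node, "name", localname=decl["name"])
    for attr in decl.get("attributes", []):
        attr_node = ET.SubElement(node, "attribute")
        ET.SubElement(attr_node, "name", localname=attr)
    for particle in decl.get("particles", []):
        walk_particle(particle, node)
    return node

def walk_particle(particle, parent):
    """An element is added directly; a model group adds each member."""
    if isinstance(particle, list):
        for member in particle:
            walk_particle(member, parent)
    else:
        walk_element(particle, parent)

link = {"name": "link", "attributes": ["type", "app", "target"]}
para = {"name": "para", "particles": [link]}
title_affil = {"name": "title_affil", "particles": [para]}
peer_reviewer = {
    "name": "peer_reviewer",
    "particles": [[{"name": "name"}, {"name": "degree"}, title_affil]],
}
peer_reviewers = {"name": "peer_reviewers", "particles": [peer_reviewer]}

dump = ET.Element("schema")
walk_element(peer_reviewers, dump)
print(ET.tostring(dump, encoding="unicode"))
```

Because nothing here remembers which declarations are already being expanded, a declaration that refers back to itself would recurse until the stack runs out.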

For non-recursive schemas this works and creates an XML document like this:

.... Snippet
      <jxon:element typeCategory="complex" contentType="element">
         <jxon:name uri="" localname="peer_reviewers"/>
         <jxon:element typeCategory="complex" contentType="element">
            <jxon:name uri="" localname="peer_reviewer"/>
            <jxon:element typeCategory="simple" variety="atomic">
               <jxon:name uri="" localname="name"/>
               <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
            </jxon:element>
            <jxon:element typeCategory="simple" variety="atomic">
               <jxon:name uri="" localname="degree"/>
               <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
            </jxon:element>
            <jxon:element typeCategory="complex" contentType="element">
               <jxon:name uri="" localname="title_affil"/>
               <jxon:element typeCategory="complex" contentType="mixed">
                  <jxon:name uri="" localname="para"/>
                  <jxon:element typeCategory="complex" contentType="simple">
                     <jxon:name uri="" localname="link"/>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="type"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="app"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
                     <jxon:attribute typeCategory="simple" variety="atomic">
                        <jxon:name uri="" localname="target"/>
                        <jxon:type uri="http://www.w3.org/2001/XMLSchema" localname="string"/>
                     </jxon:attribute>
--------------------

Phase 2:
An XQuery program that reads this simplified schema dump and, among many
other things, attempts to create an <xsl:template> to match each unique entry.

The relevant part for elements:


(: Generate a match string for an element :)
declare function common:match_elem( $name as element(), $e as element() ) as xs:string
{
	fn:string-join( ($e/ancestor::*/jxon:name/@localname, $name/@localname), "/" )
};


And no, it hasn't been enhanced for namespaces yet.  TBD.  (Not hard, I think;
I have the relevant info.)
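
For readers without an XQuery processor at hand, the same join can be sketched in Python. This is only an approximation of match_elem: the namespace URI `urn:example:jxon` and the `jxon:schema` wrapper element are placeholders (the real ones are not shown in this message), and the ancestor names are carried down the recursion instead of being read from the ancestor axis.

```python
import xml.etree.ElementTree as ET

JXON = "urn:example:jxon"  # placeholder; the real namespace URI is not shown
NS = {"jxon": JXON}

def match_strings(node, ancestors=()):
    """Yield one '/'-joined match string per nested jxon:element."""
    for child in node:
        if child.tag != f"{{{JXON}}}element":
            continue  # skip jxon:name, jxon:type, jxon:attribute
        name = child.find("jxon:name", NS).get("localname")
        path = ancestors + (name,)
        yield "/".join(path)
        yield from match_strings(child, path)

DUMP = f"""<jxon:schema xmlns:jxon="{JXON}">
  <jxon:element><jxon:name uri="" localname="peer_reviewers"/>
    <jxon:element><jxon:name uri="" localname="peer_reviewer"/>
      <jxon:element><jxon:name uri="" localname="name"/></jxon:element>
    </jxon:element>
  </jxon:element>
</jxon:schema>"""

paths = list(match_strings(ET.fromstring(DUMP)))
for p in paths:
    print(p)
```

Like the XQuery version, this ignores namespaces of the described elements and only uses local names.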

This produces simple match strings like
	peer_reviewers/peer_reviewer/name

and a template like
	<template match="peer_reviewers/peer_reviewer/name" priority="{depth of match string}">


I use a priority to disambiguate cases where the bare element is also matched,
like


	<template match="name" priority="{depth of match string}">

I want the more explicit match to take precedence.
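
A small sketch of the depth-as-priority rule, again in Python rather than the XQuery that actually generates the templates. The function name `template_for` is made up, and "depth" is assumed here to mean the number of steps in the match string.

```python
from xml.sax.saxutils import quoteattr

def template_for(match):
    """Return an xsl:template start tag whose priority is the step count."""
    depth = len(match.split("/"))
    return (f"<xsl:template match={quoteattr(match)} "
            f"priority={quoteattr(str(depth))}>")

templates = [
    template_for(m)
    for m in ("name", "peer_reviewers/peer_reviewer/name")
]
for t in templates:
    print(t)
```

Without an explicit priority, XSLT gives a bare name pattern a default priority of 0 and every multi-step pattern 0.5, so two multi-step patterns of different length would tie. An explicit value based on depth is one way to make the longer pattern win.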



This works perfectly as long as the schema is not recursive.
When it is recursive, the Java part runs forever (until it runs out of
stack).
If I set an arbitrary recursion depth, the data is incomplete, and it
works only as long as instance documents never have elements nested as deeply
as the recursion limit.

So my thinking is that I need to catch the recursion, mark it somehow, and
generate match strings that indicate it.
To make things more exciting, I also need to generate match strings for a
reverse transformation, but that's beyond the scope of this discussion.

My current thinking is to detect the first point of recursive entry into the
"loop" and mark that, then go only 1 level deep.
Then, at that position, use a "//" instead of a "/".
I don't think this will be perfect, but perhaps in combination with the
priorities it may be close enough.
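
One possible reading of this idea, sketched in Python over the pseudo DTD from earlier in the thread. Everything here is an assumption for illustration: the dict-based `SCHEMA`, the names `walk` and `to_match`, and the choice to put the "//" right after the first occurrence of the element that gets re-entered. It is not the actual Java or XQuery code.

```python
# Toy model of the pseudo DTD: element name -> allowed child elements.
SCHEMA = {
    "section": ["text"],
    "text": ["list", "para", "bold"],
    "list": ["item"],
    "item": ["text", "subheading"],
    "subheading": ["text"],
    "para": [],
    "bold": [],
}

def walk(schema, name, path=(), paths=None, marked=None):
    """Collect element paths; record the prefix where a loop is entered."""
    paths = [] if paths is None else paths
    marked = set() if marked is None else marked
    if name in path:
        # Recursive re-entry: mark the first occurrence, do not unroll.
        marked.add(path[: path.index(name) + 1])
        return paths, marked
    path = path + (name,)
    paths.append(path)
    for child in schema[name]:
        walk(schema, child, path, paths, marked)
    return paths, marked

def to_match(path, marked):
    """Join steps with '/', or '//' right after a marked loop entry."""
    out = path[0]
    for i in range(1, len(path)):
        out += ("//" if path[:i] in marked else "/") + path[i]
    return out

paths, marked = walk(SCHEMA, "section")
matches = [to_match(p, marked) for p in paths]
for m in matches:
    print(m)
```

The walk now terminates, but a pattern such as section/text//list/item still matches a list/item nested under subheading, so this shows the approximation rather than an exact per-declaration match.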







----------------------------------------
David A. Lee
dlee@xxxxxxxxxxx
http://www.xmlsh.org

-----Original Message-----
From: Brandon Ibach [mailto:brandon.ibach@xxxxxxxxxxxxxxxxxxx]
Sent: Friday, February 04, 2011 10:48 PM
To: xsl-list
Subject: Re: [xsl] Matching a recursive local element structure


Yeah, still no joy from Sourceforge, yet.  Can you summarize the
algorithm you've got so far?

-Brandon :)


On Fri, Feb 4, 2011 at 8:27 PM, David Lee <dlee@xxxxxxxxxxx> wrote:
> Everything is checked into SourceForge, but it's giving me network errors now
> (under https://xmlsh.svn.sourceforge.net/svnroot/xmlsh/extensions/json).
> This URL is giving me network errors:
> http://xmlsh.svn.sourceforge.net/viewvc/xmlsh/
> But I don't recommend it ... it's quite complex and large.  I'm not really
> asking for people to read or write this for me ... or analyze the code
> (if you did it would be awesome! But it's a challenging task).
> The basic concept is that I'm extracting a 'minimal' description of the
> element structure from the XSD using the Apache Schema API, then using XQuery
> to attempt to produce match expressions for each element (and attribute)
> declaration while trying to avoid infinite recursion.  It's amazingly
> non-trivial.
>
> Which is why I'm asking for abstract ideas ... and am of course willing to
> accept abstract answers ... or none, of course ...
> I'm not asking for a solution, just hoping for a suggestion on paths to
> explore.
> It just seems like such an obvious problem that people would have run
> into it before and just know it off the top of their heads ...
> I'm hoping there is a 'simple pattern' by which match expressions might
> 'match up' with XSD structures in an 'obvious' way ...
> But alas, I suspect that may be asking too much.
>
> My next thought is that this might be best solved with a schema-aware XSLT
> expression, but in the general case these may not be types, just recursive
> references.
>
> Recursion is fun !
>
>
>
> ----------------------------------------
> David A. Lee
> dlee@xxxxxxxxxxx
> http://www.xmlsh.org
>
>
> -----Original Message-----
> From: Brandon Ibach [mailto:brandon.ibach@xxxxxxxxxxxxxxxxxxx]
> Sent: Friday, February 04, 2011 8:14 PM
> To: xsl-list
> Subject: Re: [xsl] Matching a recursive local element structure
>
>
> Can we see the code you have so far?  It'd be a lot easier to address
> specific issues in existing code than to philosophize about an
> abstract approach.
>
> -Brandon :)
>
>
> On Fri, Feb 4, 2011 at 8:06 PM, David Lee <dlee@xxxxxxxxxxx> wrote:
>> Thanks for the ideas (all!)
>> Let me restate my question; maybe it will lead to another idea (I'm still
>> floundering!)
>>
>> For every element declaration in an XSD I would like to generate a unique
>> XSLT match expression that matches that element declaration (but no others).
>> I've got it working quite well for both global and local elements until I
>> hit a recursive structure; then, well ... it recurses :)
>>
>> Thanks for any suggestions!
>>
>> I *feel* this should be solvable because, while the structures are infinitely
>> recursive, each level of the recursion matches the same element declaration,
>> so it shouldn't have to be unrolled ... I just can't yet get my head around a
>> match expression to catch it right.
>>
>> But maybe it's not finitely solvable?
>>
>>
>>
>> ----------------------------------------
>> David A. Lee
>> dlee@xxxxxxxxxxx
>> http://www.xmlsh.org
>>
>>
>> -----Original Message-----
>> From: Brandon Ibach [mailto:brandon.ibach@xxxxxxxxxxxxxxxxxxx]
>> Sent: Friday, February 04, 2011 7:44 PM
>> To: xsl-list
>> Subject: Re: [xsl] Matching a recursive local element structure
>>
>>
>> Perhaps this approach is not as generic as you may have had in mind,
>> but for this case, I think it would work.
>>
>> <template match="section/text//list/item[not(ancestor::subheading)]">
>>
>> -Brandon :)
>>
>>
>> On Fri, Feb 4, 2011 at 7:01 PM, David Lee <dlee@xxxxxxxxxxx> wrote:
>>> Suppose I have a schema which describes a recursive structure as local
>>> elements.
>>> Example (pseudo DTD and pseudo XML; I can provide more formal defs if
>>> needed)
>>>
>>> Element section  (text)*
>>> Element text ( list | para | bold | #PCDATA )*
>>> Element list ( item*)
>>> Element item ( text | subheading ) *
>>> Element subheading (text)*
>>>
>>> So, for example, a doc may look like
>>>
>>> <section>
>>>   <text>Text
>>>     <list>
>>>       <item><para>Item Text</para></item>
>>>       <item><para>Item Text2</para></item>
>>>       <item><para>Item Text</para>
>>>         <list><item><para>More text</para></item></list></item>
>>>     </list>
>>>   </text>
>>> </section>
>>>
>>>
>>> The key point is that the schema is recursive, so an XPath (or XSLT match)
>>> might be
>>>
>>>                 section/text
>>>                 section/text/list/item/para
>>>                 section/text/list/item/list/item/list/item/list/item/list/item ...
>>>
>>> Can get really long here !!!!
>>>
>>>
>>>
>>> Now suppose I want to avoid an infinite number of XSLT match strings, but I
>>> want to match, say, list/item but ONLY within section/text
>>> (presume there may be a different list/item locally defined within, say,
>>> subheading).
>>>
>>>
>>> Suggestions on a good way to do that?
>>>
>>> <template match="section/text//list/item">
>>>
>>> But this might match
>>>                 section/text/subheading/list/item
>>> or
>>>                 section/text/list/item/subheading/list/item
>>>
>>>
>>> which I don't want.
>>>
>>> I only want to match the list/item which is a local element definition
>>> below section (recursively),
>>> so the match should select
>>>                 section/text/list/item/list/item/list/item
>>> but not
>>>                 section/text/list/item/subheading/list/item
>>>
>>> ( which I would say match with
>>>                 subheading/list/item
>>>                 subheading/list/item/list/item
>>> )
>>>
>>>
>>> Is there an obvious way to do this?
>>> It's entirely possible that I'm asking an impossible question (that is, the
>>> schemas may simply not allow this restriction in the first place),
>>> but I'm trying to solve a general problem, so I'm asking a general question.
>>>
>>> This is based on generating match strings from XSD element declarations, so
>>> it's really an XSD question as well.
>>> Maybe it's impossible to describe a schema such that a descendant list/item
>>> is distinguishable by whether it's under section or subheading?
>>>
>>> Thanks for any suggestion!
>>>
>>>
>>> -David
>>>
>>> ----------------------------------------
>>> David A. Lee
>>> dlee@xxxxxxxxxxx
>>> http://www.xmlsh.org
