Re: topological sort

> I have been waiting with bated breath for a response to this post,

Me too.

> > Some pseudo code for orientation:
> > 
> > processed_list=nil
> > template process
> >   generate structures which are not in the processed_list
> >                       and only refer to structure types not in the
                                                            ^^^ wrong
> >                             processed_list
> >   add the structure to the processed list
> >   call process
> ?

My mistake. Sorry.

> I finally decided that the select was:
> NOT processed
> AND (
>       NOT (
>                 field/type/ref exists
>                 AND
>                         NOT field/type/ref reference processed
>           )
>      )

Should be correct.
The problem here is that quantifiers and predicates are (implicitely)
used. The condition should read
  NOT processed
  AND NOT EXISTS field :
    EXISTS type/ref
    AND NOT type/ref : processed
A count()=0 means NOT EXISTS.

> which, when somebody's (De Morgan's?) rule
> ( !(A and B) ) = ( !A or !B )
> is applied to  (!(A and (!B))) gives
> (!A or !(!B)), i.e. (!A or B), which looks easier to me.  So the select
> becomes
> NOT processed
> AND (
>         field/type/ref does NOT exist
>         OR
>         field/type/ref reference processed
>     )

More precisely
  NOT processed
  AND NOT EXISTS field/type/ref
      OR FORALL field/type/ref : processed
You are distributing over a predicate and over quantifiers. A FORALL
would have a representation like
  count(node[predicate])=count(node)

> The other thing concerns the duplication of the select.  Why not just
> use the $actualprocessed variable for both requirements?

One select constructs the actualprocessed variable, the other produces
the desired output, in my example only the struct name. In the real
application, processing of struct elements is more complex.

Well, rescanning the actualprocessed variable may be faster than doing
two selects. I will check it.

Is anybody out there who knows a trick which avoids duplicating the
foreach/select? I have the feeling that the selects are slow because
of repeated lookups in the  processed_list, especially as in real world
the structure names are long and some are substrings of other names
which makes even boyer-moore lookup slow (Does xalan use boyer-moore
or simple substring search? I'm too lazy to check...)

> Anyway, the
> following stylesheet seems to work in the same way as the orginal, on
> the same data.

Incomplete test case :-) Try again after inserting either
    <field>
      <name>D2</name>
      <type><long/></type>
    </field>
or
    <field>
      <name>D2</name>
      <type><ref>A</ref></type>
    </field>
into the D struct, which will both cause D to be processed before E
despite the dependency. 
Your style sheet does not match the predicate expression above.

> I suspect that something useful could be done with keys here, but I have
> no idea what.

Me too. Can one of the gurus please enlighten us? Is there another
algorithm which doesn't use string lists for bookkeeping? Should i
rearrange my XML structure to make a more efficient algorithm feasible?
I'm still waiting for any ideas.

J.Pietschmann


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
<- Previous	Index	Next ->
Re: topological sort, Peter B. West	Thread	Re[2]: topological sort, Jeni Tennison
unparsed-entity-uri ??, Carlberg, Anders	Date	Re: Computing string-value of nodes, Taras Tielkes
	Month
<-prev [Thread] next->	<-prev [Date] next->
Month Index \| List Home