Subject: Re: [xsl] finding a minimum set of references From: Dimitre Novatchev <dnovatchev@xxxxxxxxx> Date: Wed, 9 Jan 2013 04:56:44 -0800 |
This can be done in three steps: 1. Get the distinct values of concat(@area,'+',@site) for all tree elements. 2. Get a sorted sequence of the corresponding `loc` elements that were produced in step 1. The sort is by the count of `tree` elements that have that particular loc element as descendant. 3. Recursively add to the "minList" the top of the sorted loc elements that isn't yet in this "minList". Stop condition -- when there is nothing more to add. I believe that this would really produce the minimum sequence of locations -- due to the fact that we pick locations in decreasing order of trees that reference them. I would be willing to produce the code that implements this algorithm, but I would need to be provided with a complete (but not too long) source XML document -- so that the result could be easily verified. Cheers, Dimitre On Tue, Jan 8, 2013 at 5:37 PM, Graydon <graydon@xxxxxxxxx> wrote: > So I have a lot (~600) tree elements that look like: > > <tree parent="attribution"> > <count count="1"/> > <locations> > <loc area="loiprop" cite="2010-07-prop10"/> > </locations> > </tree> > <tree parent="block"> > <count count="10"/> > <locations> > <loc area="loiprop" cite="2004-09-bud2004"/> > <loc area="loiprop" cite="2010-07-prop10"/> > <loc area="loiprop" cite="2010-08-allprop"/> > </locations> > </tree> > <tree parent="block"> > <key>section</key> > <count count="6689"/> > <locations> > <loc area="CRCc945-en" cite="2606"/> > <loc area="CRCc945-en" cite="402"/> > <loc area="CRCc945-en" cite="6804"/> > <loc area="CRCc945-en" cite="8301"/> > <loc area="CRCc945-en" cite="8308.1"/> > .... > </locations> > <child>section</child> > </tree> > > The @area,@cite pairs represent documents with the locations of particular patterns (via the @parent and the child elements of the tree elements) of documentation content, one tree element per pattern. (So there are lots of tree/@parent="block" elements, etc.) > > What's wanted is the minimum number of documents that contain _all_ the patterns, for output testing purposes. > > So I need to produce the, or at least _a_, shortest list of <loc/> elements so that every (all ~600) tree element contains at least one loc element on the list. > > This has turned out to be much less straightforward than I thought. I can't shake the feeling that there's a grouping solution, but I'm not seeing it if there is. So I'd appreciate any algorithm hints anybody has got; unfortunately, efficiency is a concern. > > Demonstration that this is really an NP-complete problem also grateful accepted! > > Thanks! > Graydon > -- Cheers, Dimitre Novatchev --------------------------------------- Truly great madness cannot be achieved without significant intelligence. --------------------------------------- To invent, you need a good imagination and a pile of junk ------------------------------------- Never fight an inanimate object ------------------------------------- To avoid situations in which you might make mistakes may be the biggest mistake of all ------------------------------------ Quality means doing it right when no one is looking. ------------------------------------- You've achieved success in your field when you don't know whether what you're doing is work or play ------------------------------------- Facts do not cease to exist because they are ignored. ------------------------------------- Typing monkeys will write all Shakespeare's works in 200yrs.Will they write all patents, too? :) ------------------------------------- I finally figured out the only reason to be alive is to enjoy it.
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] finding a minimum set of , Wolfgang Laun | Thread | Re: [xsl] finding a minimum set of , Graydon |
Re: [xsl] finding a minimum set of , Wolfgang Laun | Date | Re: [xsl] problem defining param va, Wendell Piez |
Month |