Re: [xsl] Recursively traversing an outline with level gaps

Subject: Re: [xsl] Recursively traversing an outline with level gaps
From: Martynas Jusevicius <martynas.jusevicius@xxxxxxxxx>
Date: Wed, 24 Mar 2010 22:25:10 +0100
The styles.xml file describes which style belongs to which level of
the outline. You know the outline numbering in Word or OpenOffice?
Only XHTML elements with styles described there have to go into the
ToC.

The XHTML could also look like this:

<p class="Heading_1">1</p>
<p class="Text_body">Text 1</p>
<p class="Heading_3">1.1.1</p>
<p class="Quote">Quote</p>
<p class="Heading_1">2</p>
<p class="Heading_2">2.1</p>
<p class="Text_body">Text 2</p>
<p class="Heading_3">2.1.1</p>
<p class="Text_body">Text 3</p>
<p class="Heading_2">2.2</p>

But if styles.xml is the same as before, the outline should look the
same, because Text_body and Quote elements do not go into the ToC (no
@text:outline-level defined on Text_body):
1
 1.1.1
2
 2.1
   2.1.1
 2.2

I'm pretty sure the problem is with the level gaps, just not sure how
to fully solve it :)
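
One possible way around the gaps, as a minimal XSLT 2.0 sketch rather than
a tested solution: instead of stepping through levels 1, 2, 3 in order,
take the lowest level actually present in the current run of headings,
start a group at each heading of that level, and recurse on the rest of
each group. Assumptions that are not from the thread: the f:level function,
the toc-items template name, the urn:example:toc namespace, the location of
styles.xml, and that each @class holds exactly one style name. Putting
text:outline-level directly on style:style follows the simplified test
case quoted below.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:h="http://www.w3.org/1999/xhtml"
    xmlns:style="urn:oasis:names:tc:opendocument:xmlns:style:1.0"
    xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"
    xmlns:f="urn:example:toc"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xs h style text f">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: styles.xml sits next to this stylesheet. -->
  <xsl:param name="styles-doc" select="document('styles.xml')"/>

  <xsl:key name="style-by-name" match="style:style" use="@style:name"/>

  <!-- Hypothetical helper: outline level of an element, or the empty
       sequence if its class has no text:outline-level in styles.xml. -->
  <xsl:function name="f:level" as="xs:integer?">
    <xsl:param name="e" as="element()"/>
    <xsl:sequence select="(key('style-by-name', $e/@class, $styles-doc)
                           /@text:outline-level)[1]/xs:integer(.)"/>
  </xsl:function>

  <xsl:template match="/">
    <ol>
      <xsl:call-template name="toc-items">
        <!-- Only elements whose style defines an outline level. -->
        <xsl:with-param name="headings"
                        select="//h:*[@class][exists(f:level(.))]"/>
      </xsl:call-template>
    </ol>
  </xsl:template>

  <xsl:template name="toc-items">
    <xsl:param name="headings" as="element()*"/>
    <!-- Lowest level present in this run, not "previous level + 1". -->
    <xsl:variable name="min"
                  select="min(for $x in $headings return f:level($x))"/>
    <xsl:for-each-group select="$headings"
                        group-starting-with="*[f:level(.) = $min]">
      <xsl:choose>
        <xsl:when test="f:level(.) = $min">
          <li>
            <xsl:value-of select="."/>
            <xsl:if test="current-group()[2]">
              <ol>
                <xsl:call-template name="toc-items">
                  <xsl:with-param name="headings"
                      select="current-group()[position() gt 1]"/>
                </xsl:call-template>
              </ol>
            </xsl:if>
          </li>
        </xsl:when>
        <xsl:otherwise>
          <!-- Leading headings deeper than $min: regroup them on their
               own lowest level, as siblings in the same list. -->
          <xsl:call-template name="toc-items">
            <xsl:with-param name="headings" select="current-group()"/>
          </xsl:call-template>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

In the sample above, the only heading following "1" within its group is
1.1.1, so the lowest level in that run is 3 and 1.1.1 should end up nested
directly under 1 instead of being dropped. Text_body and Quote paragraphs
never reach the grouping because f:level returns nothing for them.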

On Wed, Mar 24, 2010 at 10:12 PM, Dimitre Novatchev
<dnovatchev@xxxxxxxxx> wrote:
> It seems to me that the additional xml file is not necessary, nor is
> recursive processing needed.
>
> Just output the values of every <p> element.
>
> Or is there something that I'm getting wrong?
>
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> Never fight an inanimate object
> -------------------------------------
> You've achieved success in your field when you don't know whether what
> you're doing is work or play
>
>
>
> On Wed, Mar 24, 2010 at 1:31 PM, Martynas Jusevicius
> <martynas.jusevicius@xxxxxxxxx> wrote:
>> Sorry, you're right. I can make a simplified test case:
>>
>> ODT (styles.xml):
>> <style:style style:name="Heading_1" text:outline-level="1"/>
>> <style:style style:name="Heading_2" text:outline-level="2"/>
>> <style:style style:name="Heading_3" text:outline-level="3"/>
>> <style:style style:name="Heading_4" text:outline-level="4"/>
>>
>> XHTML:
>> <p class="Heading_1">1</p>
>> ...
>> <p class="Heading_3">1.1.1</p>
>> ...
>> <p class="Heading_1">2</p>
>> ...
>> <p class="Heading_2">2.1</p>
>> ...
>> <p class="Heading_3">2.1.1</p>
>> ...
>> <p class="Heading_2">2.2</p>
>>
>> Desired outline (also what OpenOffice.org produces):
>> 1
>>  1.1.1
>> 2
>>  2.1
>>    2.1.1
>>  2.2
>>
>> But if you process it recursively level after level as I described, you get:
>> 1
>> 2
>>  2.1
>>    2.1.1
>>  2.2
>>
>> Notice 1.1.1 is missing, because 1.1 is missing as well -- in other
>> words, there is a gap between levels 1 and 3.
>>
>> Does that make it clearer? How would you process such an outline so that
>> all the levels are included, even if there are gaps between them?
>>
>> Martynas
>>
>> On Wed, Mar 24, 2010 at 8:03 PM, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>>> Nobody needs the copyrighted files. I was asking for sample files.
>>> This is a *class of problems* and many sample files exist (not only the
>>> copyrighted ones). Or are you attempting to copyright all xml
>>> files that are appropriate samples for this problem? :)
>>>
>>> Please, do understand that by not providing a well-defined problem, you
>>> are severely decreasing the chances of somebody willing to spend their
>>> time guessworking.
>>>
>>>
>>>
>>> On Wed, Mar 24, 2010 at 11:58 AM, Martynas Jusevicius
>>> <martynas.jusevicius@xxxxxxxxx> wrote:
>>>> These are copyrighted texts, so I'm unfortunately not able to provide them.
>>>>
>>>> On Wed, Mar 24, 2010 at 7:55 PM, Dimitre Novatchev <dnovatchev@xxxxxxxxx> wrote:
>>>>> Where are the sample Xhtml and xml documents?
>>>>>
>>>>> On Wed, Mar 24, 2010 at 11:51 AM, Martynas Jusevicius
>>>>> <martynas.jusevicius@xxxxxxxxx> wrote:
>>>>>> Hey list,
>>>>>>
>>>>>> I want to create a nested list (ToC) from an XHTML source, which
>>>>>> contains @class attributes on elements.
>>>>>> A separate document, styles.xml, contains information about styles and
>>>>>> which outline level they belong to. @style:name matches @class in
>>>>>> XHTML.
>>>>>>
>>>>>> The tricky part is that not all styles are necessarily used in XHTML.
>>>>>> And if they are, the outline hierarchy is not necessarily maintained
>>>>>> -- for example, only styles with level 1, 3 and 5 can be used.
>>>>>>
>>>>>> How would you traverse such a structure? My approach is to do it
>>>>>> recursively, by finding all styles for the current level, and all
>>>>>> elements of these styles (simplified):
>>>>>>
>>>>>> <xsl:template match="h:*">
>>>>>>  <xsl:param name="level" select="1"/>
>>>>>>  <xsl:variable name="level-classes" select="key('style-by-level',
>>>>>> $level, $styles-doc)/@style:name"/>
>>>>>>  <xsl:variable name="level-elements" select="key('element-by-class',
>>>>>> $level-classes)"/>
>>>>>>  <li>
>>>>>>    <xsl:value-of select="."/>
>>>>>>    <ol>
>>>>>>      <xsl:apply-templates select="$level-elements">
>>>>>>        <xsl:with-param name="level" select="$level + 1"/>
>>>>>>      </xsl:apply-templates>
>>>>>>    </ol>
>>>>>>  </li>
>>>>>> </xsl:template>
>>>>>>
>>>>>> But this gives problems since levels are not necessarily consecutive.
>>>>>> The first level can be 2, for example.
>>>>>> I also tried iterating only through actually used levels like (1, 3,
>>>>>> 5), but it's not a full solution either, because the level hierarchy
>>>>>> can differ in each branch.
>>>>>>
>>>>>> Help appreciated.
>>>>>>
>>>>>> Martynas
>>>>>> odt2epub.com
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cheers,
>>>>> Dimitre Novatchev
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Dimitre Novatchev
