[xsl] cleanup of <div>-elements

Subject: [xsl] cleanup of <div>-elements
From: "Madlik, Monika (LNG-VIE) monika.madlik@xxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 27 Feb 2023 16:30:52 -0000

Hi,

I have a problem with an XML-file that has to be converted.

I receive XML files that are semi-structured: they contain h1/h2 headings
as well as tables, lists, and so on.
Paragraphs are usually tagged with <p>, but not always. Sometimes the <p> is
missing and a weird construct of <div> elements is wrapped around text and
other elements instead.

Is there a way to unravel these div constructs without losing text or
structure? I need a <p> element around text and inline markup, e.g.
strong or italic text.

My problem is that the div elements can appear in any form and at any depth,
and several div elements may be nested inside one another.

Example-XML:
<root>
  <h1>...</h1>
  <p>...</p>
  <ul>
    <li>...</li>
    <li>...</li>
  </ul>
  <div>
    <h1>...</h1>
    <h2>...</h2>
    <p>...</p>
    <h2>...</h2>
    <p>...</p>
    <h1>...</h1>
    <p>...</p>
    <h2>...</h2>
    <p>...</p>
    <div>
      <h1>...</h1>
      <div>...<sup><a href="#footnote-9" id="9" rel="footnote">[9]</a></sup></div>
    </div>
    <div>
      <br/> ... <strong>...</strong> ...<sup><a href="#footnote-10" id="10" rel="footnote">[10]</a></sup>
      <div>
        <h1>...</h1>
      </div>
    </div>
    <p>...</p>
  </div>
</root>

The text marked in yellow (the two inner <div> blocks that follow the last
<h2>/<p> pair) should look like this after my transformation:
<h1>...</h1>
<p>...<sup><a href="#footnote-9" id="9" rel="footnote">[9]</a></sup></p>
<p><br/> ... <strong>...</strong> ...<sup><a href="#footnote-10" id="10"
rel="footnote">[10]</a></sup></p>
<h1>...</h1>


Thanks a lot,
Monika
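
One way this kind of cleanup might be approached is sketched below. It is
only a sketch under several assumptions that are not stated in the post: an
XSLT 2.0 (or later) processor is available, since xsl:for-each-group does not
exist in XSLT 1.0; the input elements are in no namespace; and the list of
elements treated as "block level" (div, h1, h2, p, ul, ol, table) is a guess
that would need to be extended for the real data. Each <div> is unwrapped,
block-level children are passed through, and every run of adjacent text and
inline nodes is wrapped in a <p>.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Identity template: copy anything no other template handles. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Unwrap every div, at any depth. The div itself is never copied. -->
  <xsl:template match="div">
    <!-- Split the children into runs of block-level nodes (key = true)
         and runs of text/inline nodes (key = false).
         The element list is an assumption; adjust it to the real data. -->
    <xsl:for-each-group select="node()"
        group-adjacent="exists(self::div | self::h1 | self::h2 | self::p
                               | self::ul | self::ol | self::table)">
      <xsl:choose>
        <!-- Block-level run: process as is; nested divs recurse here. -->
        <xsl:when test="current-grouping-key()">
          <xsl:apply-templates select="current-group()"/>
        </xsl:when>
        <!-- Inline run with real content: wrap it in a p. -->
        <xsl:when test="current-group()[self::* or normalize-space()]">
          <p>
            <xsl:apply-templates select="current-group()"/>
          </p>
        </xsl:when>
        <!-- Whitespace-only runs between blocks are dropped. -->
      </xsl:choose>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Because nested divs are handled by the same template, the depth of nesting
should not matter. Note that the whitespace between block elements inside a
div is discarded by this sketch, so the output may need re-indenting, and
leading or trailing whitespace inside a generated <p> is kept as it is.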
