Subject: Re: [xsl] Changing a from unstructured HTML to XML
From: Martin Honnen <Martin.Honnen@xxxxxx>
Date: Tue, 21 Sep 2010 15:29:44 +0200

I am working with an HTML input file, and I'd like to group its content by section (ultimately, with the intent of using xsl:result-document to create a new file for each section).
What I have is not uncommon:
<h1 class="section">Section Name</h1>
<h1 class="headline">Headline name</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 2</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 3</h1>
[... assorted HTML marked up text ...]
<h1 class="section">Section 2</h1>
<h1 class="headline">Headline 4</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 5</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 6</h1>
[... assorted HTML marked up text ...]
and so on.
What I'd like to end up with, if possible, is:
<section id="Section Name">
<headline id="Headline name">
[...marked up text...]
</headline>
<headline id="Headline 2">
[...marked up text...]
</headline>
<headline id="Headline 3">
[...marked up text...]
</headline>
</section>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@*, node()"/>
</xsl:copy>
</xsl:template>

<body>
<h1 class="section">Section Name</h1>
<h1 class="headline">Headline name</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 2</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 3</h1>
[... assorted HTML marked up text ...]
<h1 class="section">Section 2</h1>
<h1 class="headline">Headline 4</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 5</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 6</h1>
[... assorted HTML marked up text ...]
</body>

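
The stylesheet text above stops after the identity template, so the grouping step itself is not shown. As a minimal sketch of how that step could be written, assuming an XSLT 2.0 processor, input elements in no namespace, and nested xsl:for-each-group instructions with group-starting-with (my assumption about the approach, not necessarily the code from the original post), a complete stylesheet might look like this:

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <xsl:output indent="yes"/>

  <!-- Identity template: copies anything not handled by a more specific rule -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <xsl:copy>
      <!-- Outer grouping: every h1 with class "section" starts a new group -->
      <xsl:for-each-group select="node()"
                          group-starting-with="h1[@class = 'section']">
        <xsl:choose>
          <xsl:when test="self::h1[@class = 'section']">
            <section id="{.}">
              <!-- Inner grouping over the rest of the section:
                   every h1 with class "headline" starts a new group -->
              <xsl:for-each-group select="current-group() except ."
                                  group-starting-with="h1[@class = 'headline']">
                <xsl:choose>
                  <xsl:when test="self::h1[@class = 'headline']">
                    <headline id="{.}">
                      <xsl:apply-templates select="current-group() except ."/>
                    </headline>
                  </xsl:when>
                  <xsl:otherwise>
                    <!-- Nodes between the section heading and its first headline -->
                    <xsl:apply-templates select="current-group()"/>
                  </xsl:otherwise>
                </xsl:choose>
              </xsl:for-each-group>
            </section>
          </xsl:when>
          <xsl:otherwise>
            <!-- Nodes that precede the first section heading -->
            <xsl:apply-templates select="current-group()"/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The xsl:otherwise branches are there because the first group produced by group-starting-with may begin with a node that does not match the pattern (for example, whitespace before the first heading). If the input is XHTML in the http://www.w3.org/1999/xhtml namespace, the element names in the patterns would need a prefix or an xpath-default-namespace attribute. To write each section to its own file, the section element could be wrapped in xsl:result-document with an href built per group; the file naming scheme is left open here.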
<body>
<section id="Section Name">
<headline id="Headline name">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 2">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 3">
[... assorted HTML marked up text ...]
</headline>
</section>
<section id="Section 2">
<headline id="Headline 4">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 5">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 6">
[... assorted HTML marked up text ...]
</headline>
</section>
</body>

Martin Honnen
http://msmvps.com/blogs/martin_honnen/