[xsl] Breaking paragraphs on linebreaks

Subject: [xsl] Breaking paragraphs on linebreaks
From: "Manuel Souto Pico terminolator@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Thu, 9 May 2019 13:44:01 -0000
Dear all,

I have a bilingual TMX file with many tu elements like this one, each
containing full paragraphs:

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
   <header segtype="paragraph" adminlang="en"/>
   <body>
      <tu tuid="1">
         <tuv xml:lang="es">
            <seg>El PSOE ganaría en 10 de las 12 comunidades donde habrá
elecciones autonómicas el 26 de mayo, según el último barómetro del CIS.
&lt;br&gt;Las excepciones serían Cantabria, donde el PRC, el partido de
Miguel Ángel Revilla, sería primera fuerza. &lt;br&gt;&lt;br&gt;Navarra
Suma, la coalición de PP, Ciudadanos y UPN, sería primera fuerza en la
comunidad foral.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>PSOE, MDHning eng so'nggi barometri bo'yicha 26 mayda
bo'lib o'tadigan mintaqaviy saylovlarda 12 ta jamoaning 10tasida g'olib
chiqadi.&lt;br&gt;Istisnolarga ko'ra, Cantabria, XXR, Migel Anxel Revilla
partiyasi birinchi kuch bo'ladi.&lt;br&gt;&lt;br&gt;"Navarra Suma", PP,
Cuudadanos va UPN koalitsiyasi mintaqaviy hamjamiyatning birinchi kuchi
bo'ladi.</seg>
         </tuv>
      </tu>
   </body>
</tmx>

As you can see, there are a few (escaped) line break tags between sentences.

I would like to transform that into something like this, where every tu
element contains a single sentence per language:

<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
   <header segtype="paragraph" adminlang="en"/>
   <body>
      <tu tuid="1">
         <tuv xml:lang="es">
            <seg>El PSOE ganaría en 10 de las 12 comunidades donde habrá
elecciones autonómicas el 26 de mayo, según el último barómetro del
CIS.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>PSOE, MDHning eng so'nggi barometri bo'yicha 26 mayda
bo'lib o'tadigan mintaqaviy saylovlarda 12 ta jamoaning 10tasida g'olib
chiqadi.</seg>
         </tuv>
      </tu>
      <tu tuid="2">
         <tuv xml:lang="es">
            <seg>Las excepciones serían Cantabria, donde el PRC, el partido
de Miguel Ángel Revilla, sería primera fuerza. </seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>Istisnolarga ko'ra, Cantabria, XXR, Migel Anxel Revilla
partiyasi birinchi kuch bo'ladi.</seg>
         </tuv>
      </tu>
      <tu tuid="3">
         <tuv xml:lang="es">
            <seg>Navarra Suma, la coalición de PP, Ciudadanos y UPN, sería
primera fuerza en la comunidad foral.</seg>
         </tuv>
         <tuv xml:lang="uz">
            <seg>"Navarra Suma", PP, Cuudadanos va UPN koalitsiyasi
mintaqaviy hamjamiyatning birinchi kuchi bo'ladi.</seg>
         </tuv>
      </tu>
   </body>
</tmx>

Do you think I can use XSLT to do this more or less easily?
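
To make the question concrete, here is a rough sketch of one possible
approach. It is only an illustration under several assumptions: an XSLT 2.0
processor (for example Saxon) is available, every tuv holds exactly one seg,
the separator always appears as the literal text <br> once the file is
parsed, and a tu carries nothing worth keeping besides its tuv children. The
names $br and the "split" mode are made up for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes" encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumed separator: one or more literal <br> markers, together
       with any whitespace around them. -->
  <xsl:variable name="br" select="'\s*(&lt;br&gt;\s*)+'"/>

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Split every tu first, then renumber the results sequentially. -->
  <xsl:template match="body">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:variable name="split">
        <xsl:apply-templates select="tu" mode="split"/>
      </xsl:variable>
      <xsl:for-each select="$split/tu">
        <tu tuid="{position()}">
          <xsl:copy-of select="node()"/>
        </tu>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- Emit one tu per piece, pairing the i-th piece of each language. -->
  <xsl:template match="tu" mode="split">
    <xsl:variable name="tu" select="."/>
    <!-- Largest number of non-empty pieces found in any tuv of this tu. -->
    <xsl:variable name="n" select="max(for $s in tuv/seg return
        count(tokenize($s, $br)[normalize-space(.) != '']))"/>
    <xsl:for-each select="1 to $n">
      <xsl:variable name="i" select="."/>
      <tu>
        <xsl:for-each select="$tu/tuv">
          <xsl:copy>
            <xsl:copy-of select="@*"/>
            <seg>
              <xsl:value-of select="tokenize(seg, $br)
                  [normalize-space(.) != ''][$i]"/>
            </seg>
          </xsl:copy>
        </xsl:for-each>
      </tu>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Pieces are paired purely by position, so this only lines up correctly when
both languages contain the same number of <br>-separated pieces. If one side
has fewer, the sketch emits an empty seg for the missing piece rather than
trying to realign anything.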

I wrote a few XSLT stylesheets years ago but I'm far from being a savvy
user.

Thanks in advance for any tips.

Cheers, Manuel
