[xsl] xsl:output method="xhtml", indent, linefeeds problem

From: Raimund Kammering <raimund.kammering@xxxxxxx>
Date: Thu, 15 Aug 2013 16:17:37 +0200

Hello (again ;-),

Some time ago I switched from Xalan to Saxon (Saxon-B 9.1, since I need
extensions) and, at the same time, from XSLT 1.0 to 2.0.

With this change I ran into the heavily discussed problem of forbidden HTML
characters (and unfortunately these are spread widely throughout the data we
need to process).
As an easy way out I changed the output method from 'html' to 'xml', which
solved the problem with the forbidden characters, but forced me to take
special care of empty elements, e.g. by introducing artificial spaces.

An example:

XML
<root>
  ...
  <text></text>

</root>

XSL

...
  <xsl:template match="text">
    <textarea name="text" style="width: 100%" rows="14" cols="100">
        <xsl:value-of select="."/>
    </textarea>
  </xsl:template>


When the 'text' element is empty, this leads to the 'textarea' tag not being
closed properly, and therefore the rest of the produced output appears inside
the textarea! Adding an artificial space solves this, but in this case it
puts a predefined space into the textarea, which is not nice!

So I found that if I use 'xhtml' as the output method, I a) get around the
problem with the forbidden HTML chars and b) do not need the trick described
above! BUT now I run into a new problem: the '<br />' tag is not handled
properly!

I have an XSL loop dumping data into a single table cell and therefore need to
force a line break after each item to get the right formatting:

XML:
  ...
  <entry>
      <operator>Doe, John</operator>
      <operator>Mustermann, Max</operator>
      <operator>Kammering, Raimund</operator>
  </entry>
  

XSL:
  <?xml version="1.0"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xhtml" indent="no"/>
  ...
    <td>
      <xsl:for-each select="operator">
        <xsl:value-of select="." /><br />
      </xsl:for-each>
    </td>
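
To reproduce this, a complete stylesheet along the following lines should be
enough. It is only a sketch: the match="entry" template and the table/tr
wrapper are assumptions added to make the fragment self-contained, and
version="2.0" is assumed here because the 'xhtml' output method is an XSLT 2.0
feature (the fragment above declares 1.0).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Same output settings as in the fragment above -->
  <xsl:output method="xhtml" indent="no"/>

  <!-- Assumption: 'entry' is matched directly; the real stylesheet
       reaches it from elsewhere in a larger document -->
  <xsl:template match="entry">
    <!-- Assumption: table/tr wrapper added only so the td has a parent -->
    <table>
      <tr>
        <td>
          <!-- One line per operator, each followed by a line break -->
          <xsl:for-each select="operator">
            <xsl:value-of select="."/><br />
          </xsl:for-each>
        </td>
      </tr>
    </table>
  </xsl:template>

</xsl:stylesheet>
```

Run against the 'entry' XML above, only the xsl:output method needs to be
switched between 'xhtml', 'xml' and 'html' to compare the three cases.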
  

HTML:
  
  <td>Doe, John<br></br>Mustermann, Max<br></br>Kammering, Raimund<br></br></td>

This is rendered by (m)any browsers (tested with recent versions of Chrome,
Safari and IE; Safari even shows wrong source code, <br><br>!!) as:

    Doe, John

    Mustermann, Max

    Kammering, Raimund

-> two linefeeds instead of one!

Whereas if I use output method 'xml' I get:

  <td>Doe, John<br/>Mustermann, Max<br/>Kammering, Raimund<br/></td>

which is nicely rendered as:

    Doe, John
    Mustermann, Max
    Kammering, Raimund

Using output method 'html' I get:

  <td>Doe, John<br>Mustermann, Max<br>Kammering, Raimund<br></td>

which is still rendered as wanted as:

    Doe, John
    Mustermann, Max
    Kammering, Raimund

Long story short: I guess I did not do my homework about doctype, handling of
<br></br> vs <br/> or even <br /> and the like... but is there an easy way
out that does not involve searching for and replacing all chars which are
forbidden when using output method 'html'?

Raimund
