RE: [xsl] xsl:output method="xhtml", indent, linefeeds problem

Subject: RE: [xsl] xsl:output method="xhtml", indent, linefeeds problem
From: "Lizzi, Vincent" <Vincent.Lizzi@xxxxxxxxxxxxxxxxxxxx>
Date: Thu, 15 Aug 2013 19:32:21 +0000

Hi Raimund,

I see others have responded about the <br /> tags. In regard to:

>> is there an easy way out without searching and replacing all chars which
>> are forbidden when using output method html?

You can declare the encoding of the output as "US-ASCII" or "ISO646-US", which
forces every character outside these limited character sets to be written as a
numeric character reference. This tends to be an easier approach than the
alternatives.
By setting

<xsl:output encoding="US-ASCII"/>

characters other than (English) letters, numbers and punctuation are output as
numeric character references. For example, À is output using the decimal
Unicode value as:

&#192;

and browsers should display the À. You didn't mention what character encodings
you're using, but this might do what you need.
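
As a rough sketch of the above (not taken from your stylesheet: the template
and the sample string are made up for illustration, and I have assumed the
xhtml output method from your original post), the declaration could sit in a
stylesheet like this:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- US-ASCII cannot represent characters beyond ASCII, so the
       serializer has to write them as character references. -->
  <xsl:output method="xhtml" encoding="US-ASCII" indent="no"/>

  <!-- Hypothetical template: emits one paragraph containing a
       non-ASCII character (U+00C0) to show the effect. -->
  <xsl:template match="/">
    <p><xsl:value-of select="'&#192; la carte'"/></p>
  </xsl:template>

</xsl:stylesheet>
```

The serializer should then write the first character as a numeric character
reference. Whether it appears in decimal (&#192;) or hexadecimal (&#xC0;) form
may depend on the processor and its settings; both refer to the same character.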

Vincent


-----Original Message-----
From: Raimund Kammering [mailto:raimund.kammering@xxxxxxx]
Sent: Thursday, August 15, 2013 10:18 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] xsl:output method="xhtml", indent, linefeeds problem

Hello (again ;-),

some time ago I switched from using Xalan to Saxon (Saxon-B 9.1 since I need
extensions) together with a switch over from XSL 1.0 to 2.0.

With this change I ran into the heavily discussed problem of forbidden HTML
characters (and unfortunately these are widely spread throughout the data we
need to process).
To have an easy way out I changed the output method from 'html' to 'xml',
which solved the problem with the forbidden characters, but forced me to take
special care of empty elements, e.g. by artificially introducing spaces.

An example:

XML
<root>
  ...
  <text></text>
  ...
</root>

XSL

...
  <xsl:template match="text">
    <textarea name="text" style="width: 100%" rows="14" cols="100">
        <xsl:value-of select="."/>
    </textarea>
  </xsl:template>
...

leads to the 'textarea' tag not being closed properly, and therefore the rest
of the produced output appears in the textarea! Adding an artificial space
solves this, but in this case leaves a predefined space in the textarea, which
is not nice!

So I found that if I use xhtml as output method, I a) get around the problem
with the forbidden HTML chars and b) do not need to do the above described
trick! BUT now I run into a new problem with '<br />' tag not properly being
handled!

I have an XSL loop dumping data into a single table cell and therefore need to
forcibly end the line to get the right formatting:

XML:
  ...
  <entry>
      <operator>Doe, John</operator>
      <operator>Mustermann, Max</operator>
      <operator>Kammering, Raimund</operator>
  </entry>
  ...

XSL:
  <?xml version="1.0"?>
  <xsl:stylesheet version="1.0"
      xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xhtml" indent="no"/>
  ...
    <td>
      <xsl:for-each select="operator">
        <xsl:value-of select="." /><br />
      </xsl:for-each>
    </td>
  ...

HTML:
  ...
  <td>Doe, John<br></br>Mustermann, Max<br></br>Kammering,
Raimund<br></br></td>

This is rendered by (m)any browsers (tested in recent versions of Chrome,
Safari (<- even shows wrong source code <br><br>!!) and IE) as:

    Doe, John

    Mustermann, Max

    Kammering, Raimund

-> two linefeeds instead of one!
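
(One possible explanation, offered as an assumption and not something
established in this message: with the xhtml output method, a serializer may
only treat 'br' as an empty element when it is in the XHTML namespace, and the
literal result elements above are in no namespace. A minimal sketch that
declares the XHTML namespace as the default for the result elements follows;
the match on 'entry' is assumed from the sample XML.)

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <xsl:output method="xhtml" indent="no"/>

  <!-- The default namespace declared above puts the literal result
       elements td and br into the XHTML namespace. It does not change
       how the unprefixed names in match/select are resolved. -->
  <xsl:template match="entry">
    <td>
      <xsl:for-each select="operator">
        <xsl:value-of select="."/><br/>
      </xsl:for-each>
    </td>
  </xsl:template>

</xsl:stylesheet>
```

(If the assumption holds, each 'br' should be serialized as <br /> and not as
<br></br>; the 'td' element will additionally carry the XHTML namespace
declaration.)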

While if I use output method 'xml' I got:

  <td>Doe, John<br/>Mustermann, Max<br/>Kammering, Raimund<br/></td>

which is nicely rendered as:

    Doe, John
    Mustermann, Max
    Kammering, Raimund

Using output method 'html' I get:

  <td>Doe, John<br>Mustermann, Max<br>Kammering, Raimund<br></td>

which is still rendered as wanted as:

    Doe, John
    Mustermann, Max
    Kammering, Raimund

Long story short: I guess I did not do my homework about doctype, handling of
<br></br> vs <br/> or even <br /> and the like... but is there an easy way
out without searching and replacing all chars which are forbidden when using
output method html?

Raimund
