RE: RE: [xsl] Saxon's handling of line breaks

Subject: RE: RE: [xsl] Saxon's handling of line breaks
From: Salvatore Mangano <smangano@xxxxxxxxxx>
Date: Mon, 6 May 2002 21:37:45 -0400
Thanks Evan. Now it is clear as day!
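
A minimal sketch of the practical consequence, offered as an illustration rather than anything Evan or Mike posted. A literal line break inside xsl:text always reaches the processor as a single LF, but a character reference such as &#13; is resolved after the parser's line-end normalization and so survives it. The parameter name "eol" is hypothetical, and the sketch assumes a serializer that writes characters through unchanged, as Mike describes for Saxon's text method. A serializer that itself rewrites LF to the platform convention may still alter the result.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: defaults to CR+LF written as character
       references, which the XML parser does not normalize to LF.
       A caller could supply '&#10;' instead to get LF only. -->
  <xsl:param name="eol" select="'&#13;&#10;'"/>

  <xsl:template match="/">
    <xsl:text>foo</xsl:text>
    <!-- Emits whatever line ending the parameter holds -->
    <xsl:value-of select="$eol"/>
    <xsl:text>bar</xsl:text>
    <xsl:value-of select="$eol"/>
  </xsl:template>

</xsl:stylesheet>
```

Keeping the line ending in one parameter means the stylesheet no longer depends on which editor saved it, which is in the spirit of Mike's suggestion to supply it as a stylesheet parameter.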



---- On Mon, 6 May 2002, Evan Lenz (evan@xxxxxxxxxxxx) wrote:

> Mike's global statement was correct:
> > Line breaks in the input document and the stylesheet are
> > automatically converted to a single NL character by the
> > XML parser - that's defined by the XML standard.
> 
> The normative reference can be read at [1].
> 
> However, I think he spoke a little hastily in each of these two
> sentences:
> 
> > The XSLT specification doesn't give the
> > processor license to
> > do anything else.
> 
> No, the XSLT specification doesn't forbid outputting CR, LF, or CRLF;
> any of these is fine, because they will all be interpreted the same by
> an XML parser.
> 
> > If you want to output CRLF, you must do it
> > explicitly, by
> > writing <xsl:text>
> > </xsl:text>.
> 
> No, that won't work of course, because the XML parser can't tell the
> difference between CR, LF, or CRLF.
> 
> The upshot is that Saxon is right and Xalan is right, and XML doesn't
> give you control over which character to output. That's up to the
> discretion of the serializer (which might provide various mechanisms
> for parameterization).
> 
> Evan
> 
> [1] http://www.w3.org/TR/REC-xml#sec-line-ends
> 
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of Salvatore
> > Mangano
> > Sent: Monday, May 06, 2002 3:13 PM
> > To: Michael Kay
> > Subject: Re: RE: [xsl] Saxon's handling of line breaks
> >
> >
> > If you look at the sample I provide I do indeed output
> > <xsl:text>
> > </xsl:text>. Yet the result is as if the CR is stripped.
> >
> > Also, I do not mention notepad because it is my preferred
> > editor. I mention it only as a tool for diagnosing the problem
> > simply *because* it doesn't do what many other editors
> > automatically do.
> >
> > To make the problem plain as day please consider the following
> > stylesheet:
> >
> > <xsl:stylesheet version="1.0"
> > xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> >
> > <xsl:output method="text"/>
> >
> > <xsl:template match="/">
> > foo<xsl:text>
> > </xsl:text>bar
> > </xsl:template>
> >
> > </xsl:stylesheet>
> >
> > According to your explanation foo and bar should be separated
> > by whatever is enclosed in the xsl:text element. In this case
> > it should be a CRLF combination because the stylesheet was
> > created in an editor that writes out CR+LF at the end of each line.
> > However, after processing the stylesheet the CR was indeed
> > stripped with saxon but not with xalan. Explain?
> >
> >
> >
> > > Line breaks in the input document and the stylesheet are
> > > automatically converted to a single NL character by the XML
> > > parser - that's defined by the XML standard.
> > >
> > > With the "text" output method, Saxon outputs the characters that
> > > it finds, without change. The XSLT specification doesn't give the
> > > processor license to do anything else. If you want to output CRLF,
> > > you must do it explicitly, by writing <xsl:text>
> > > </xsl:text>. You could make this platform-dependent by putting it
> > > in an external entity or supplying it as a stylesheet parameter.
> > >
> > > I think most modern text editors will understand NL as a newline
> > > character even on the Windows platform: perhaps it's time you
> > > moved off Notepad.
> > >
> > > Michael Kay
> > > Software AG
> > > home: Michael.H.Kay@xxxxxxxxxxxx
> > > work: Michael.Kay@xxxxxxxxxxxxxx
> > >
> > > > -----Original Message-----
> > > > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > > > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> > > > Sal Mangano
> > > > Sent: 06 May 2002 15:41
> > > > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > > > Subject: [xsl] Saxon's handling of line breaks
> > > >
> > > >
> > > >
> > > > Working with Saxon 6.5.1 on the Windows platform I noticed that
> > > > line breaks literally represented as text elements are being
> > > > output incorrectly for the Windows platform.
> > > >
> > > > For example,
> > > >
> > > > <xsl:stylesheet version="1.0"
> > > > xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
> > > >
> > > > <xsl:output method="text" encoding="UTF-8"/>
> > > >
> > > > <xsl:strip-space elements="*"/>
> > > >
> > > > <xsl:template match="number">
> > > >   <xsl:value-of select="."/><xsl:text>
> > > > </xsl:text>
> > > > </xsl:template>
> > > >
> > > > </xsl:stylesheet>
> > > >
> > > > When I capture the output produced by this stylesheet in a file
> > > > and open it in the Windows notepad editor it does not display
> > > > correctly because notepad expects CR+NL pairs. Now if I open the
> > > > stylesheet in notepad, it DOES display correctly which leads me
> > > > to believe that the <xsl:text> element is actually enclosing a
> > > > CR+NL pair. It seems that either the stylesheet parser or the
> > > > output serializer in saxon is stripping the CR. When I use the
> > > > same stylesheet with xalan it works correctly.
> > > >
> > > > Is this a bug in saxon or a misunderstanding on my part?
> > > >
> > > > In general, how are stylesheets supposed to deal with line breaks
> > > > in a portable fashion?
> > > >
> > > > Thanks,
> > > >
> > > > Sal
> > > >
> > > >
> > > >
> > > >  XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

