RE: [xsl] Saxon 9.4 <bold></bold> Transformed to (newline)</bold> Problem

Subject: RE: [xsl] Saxon 9.4 <bold></bold> Transformed to (newline)</bold> Problem
From: Raymond Lillibridge <RLillibridge@xxxxxxxxxxxx>
Date: Wed, 16 Jan 2013 17:53:51 +0000
Here is an update for this thread.

Because I need to process the output file further with Perl (reading it line
by line), the paragraph content must not have any newlines or other white-space
introduced.  I also cannot use indent="no" because of existing
post-processing Perl applications.

I've tried adding the result-document attribute of
saxon:suppress-indentation="bold para" but the results are still the same,
with or without it.  However, I may not have it set up correctly in my style
sheet.

Looking at the <para/> (fntest.xml below) that begins with "THIS IS THE
PARAGRAPH THAT IS NOT BEHAVING PROPERLY." I'll list the command line input and
the resulting XML after the transformations using Saxon (java), Saxon (.Net),
and XMLSpy.  Note that the input <bold></bold> (inline tag, not a start tag)
has been converted to <bold/> (a good thing, but not necessary if this is
what is creating the newlines, etc.).

----- Transformation using:  Saxon (java) -----
C:\>java net.sf.saxon.Transform -s:fntest.xml -xsl:book.xsl -t

Saxon-EE 9.3.0.4J from Saxonica
Java version 1.7.0_09
Stylesheet compilation time: 687 milliseconds
Processing file:/V:/XSLT/MCC/FN/fntest.xml
Using parser
com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser
Building tree for file:/V:/XSLT/MCC/FN/fntest.xml using class
net.sf.saxon.tree.tiny.TinyBuilder
Tree built in 16 milliseconds
Tree size: 38 nodes, 449 characters, 31 attributes
Writing to file:/V:/XSLT/MCC/FN/Book_ALL.xml
Execution time: 171ms
Memory used: 6704936
NamePool contents: 30 entries in 30 chains. 8 prefixes, 8 URIs

----- RESULTS -----
...
   <para block_type="block" gclevel="0">
Testing multiple noanchor tags.  Here is tag ONE<footnoteref id="Rfn_A_28"
linkend="fn_A_28" mark="ONE"/> and here is tag TWO<footnoteref id="Rfn_A_29"
linkend="fn_A_29" mark="TWO"/>
      <bold/>, and finally here is tag THREE (single-dagger)<footnoteref
id="Rfn_A_30" linkend="fn_A_30" mark=""/> Good-Bye!


----- Transformation using:  Saxon (.NET) -----
C:\>transform -xsl:book.xsl -s:fntest.xml -t

Saxon-EE 9.4.0.1N from Saxonica
.NET 2.0.50727.5444 on Microsoft Windows NT 6.1.7601 Service Pack 1
Found registry key at HKEY_LOCAL_MACHINE\Software\Saxonica\SaxonEE-N\Settings
Software installation path: c:\Program Files\Saxonica\SaxonEE9.4N
Using license serial number M008550
URIResolver.resolve href="file:/V:/XSLT/MCC/FN/book.xsl" base="null"
Using parser org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser
Stylesheet compilation time: 1497 milliseconds
Processing file:/V:/XSLT/MCC/FN/fntest.xml
Building tree for file:///V:/XSLT/MCC/FN/fntest.xml using class
net.sf.saxon.tree.tiny.TinyBuilder
Tree built in 16 milliseconds
Tree size: 38 nodes, 449 characters, 31 attributes
Writing to file:/V:/XSLT/MCC/FN/Book_ALL.xml
Execution time: 390ms
Memory used: 3024792
NamePool contents: 22 entries in 22 chains. 6 URIs

----- RESULTS -----
...
   <para block_type="block" gclevel="0">
Testing multiple noanchor tags.  Here is tag ONE<footnoteref id="Rfn_A_28"
linkend="fn_A_28" mark="ONE"/> and here is tag TWO<footnoteref id="Rfn_A_29"
linkend="fn_A_29" mark="TWO"/>
      <bold/>, and finally here is tag THREE (single-dagger)<footnoteref
id="Rfn_A_30" linkend="fn_A_30" mark=""/> Good-Bye!


----- Transformation using:  XMLSpy (2013 sp1) Built-in XSLT Engine (NOT
Command-line) -----
...
	<para block_type="block" gclevel="0">
Testing multiple noanchor tags.  Here is tag ONE<footnoteref id="Rfn_A_28"
linkend="fn_A_28" mark="ONE"/> and here is tag TWO<footnoteref id="Rfn_A_29"
linkend="fn_A_29" mark="TWO"/><bold/>, and finally here is tag THREE
(single-dagger)<footnoteref id="Rfn_A_30" linkend="fn_A_30" mark=""/>
Good-Bye!


------------------------------
What we need:
We have over 50 style sheets and processes all using Saxon (java) and would
really like this book.xsl transformation (listed below) to work the same way,
so that converting <bold></bold> to <bold/> does not also insert any newlines,
spaces, or other white-space.  Please see the sample output from XMLSpy
(above).

If I have saxon:suppress-indentation="para bold" set up incorrectly, or if
anyone has any suggestions, please let me know.  Also, if my last resort is to
create a schema, I'll do that.  Thanks for any and all help.



Below you can find the sample input XML and XSL files I'm using.

------------------------------
 fntest.xml
------------------------------
<?xml version="1.0" encoding="UTF-8"?>

<level1 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  origin="fn_A">
<title>CHARTER</title>
<subtitle>Testing footnotes</subtitle>

<para block_type='block' gclevel="0">
This is a block of text entered here just for the fun of it!
</para>

<para block_type='block' gclevel="0">
THIS IS THE PARAGRAPH THAT IS NOT BEHAVING PROPERLY.  HA!  Here is tag
ONE<footnoteref id="Rfn_A_28" linkend="fn_A_28" mark="ONE"/> and here is tag
TWO<footnoteref id="Rfn_A_29" linkend="fn_A_29" mark="TWO"/><bold></bold>, and
finally here is tag THREE (single-dagger)<footnoteref id="Rfn_A_30"
linkend="fn_A_30" mark=""/> Good-Bye!

<footnote id="fn_A_28" location="end_page"  numbered="yes" >
<refeditor>
	<para block_type="hang" gclevel="0"><bold>History-I</bold>How is THIS for a
wonky reference note?</para>
</refeditor>
</footnote>

<footnote id="fn_A_29" location="end_page"  numbered="yes" >
<refeditor>
	<para block_type="hang" gclevel="0"><bold>History-II</bold>How is THIS for
another wonky reference note?</para>
</refeditor>
</footnote>

<footnote id="fn_A_30" location="end_page"  numbered="yes" >
<refeditor>
	<para block_type="hang" gclevel="0"><bold>History-III</bold>How is THIS for
another wonky reference note?</para>
</refeditor>
</footnote>
</para>

<para block_type='block' gclevel="0">
This is another block of text entered here just for the fun of it! (again)
</para>
</level1>


------------------------------
book.xsl
------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:mcc="http://www.municode.com/xslt"
xmlns:saxon="http://saxon.sf.net">

<xsl:strip-space elements="*" />
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
	<xsl:result-document href="Book_ALL.xml" saxon:suppress-indentation="para bold">
		<xsl:apply-templates select="node()" />
	</xsl:result-document>
</xsl:template>

<xsl:template match="book">
		<xsl:element name="book">
			<xsl:element name="bookinfo">
				<xsl:element name="title"></xsl:element>
				<xsl:element name="subtitle"></xsl:element>
			</xsl:element>
			<xsl:apply-templates select="node()" />
		</xsl:element>
</xsl:template>

<!-- CATCH-ALL ==================================================== -->
	<xsl:template match="@*|node()">
		<xsl:copy-of select="." copy-namespaces="no" />
	</xsl:template>

</xsl:stylesheet>




-----Original Message-----
From: Michael Kay [mailto:mike@xxxxxxxxxxxx]
Sent: Tuesday, January 15, 2013 12:04 PM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: Re: [xsl] Saxon 9.4 <bold></bold> Transformed to (newline)</bold>
Problem

There are several ways you could fix this problem.

First, you could switch indentation off entirely (indent="no").

Secondly, you could use schema-awareness on the output side (that is, validate
the output against a schema). If you do this, Saxon will not add whitespace
within an element that has a mixed-content content model.
Validation of course needs Saxon-EE.

Thirdly, you could use the suppress-indentation xsl:output parameter.
This was introduced as a Saxon extension (saxon:suppress-indentation) and has
found its way into the XSLT 3.0 specification. Either way, you will need
Saxon-PE or higher.
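
A minimal sketch of the third option follows, assuming Saxon-PE or higher and
assuming that listing "para" is enough to cover its inline children.  The
identity template is only there to make the sketch self-contained; it is not
taken from book.xsl.  One detail worth checking: the Saxon extension namespace
URI is http://saxon.sf.net/ with a trailing slash.  A prefix bound to a URI
without the slash names a different namespace, and an attribute in an
unrecognised namespace is ignored rather than reported as an error.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <!-- Indent the result as a whole, but ask Saxon not to add
       whitespace inside the listed mixed-content elements. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"
      saxon:suppress-indentation="para"/>

  <!-- Identity copy, for illustration only. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The same attribute can be written on xsl:result-document when the output goes
through that instruction instead of the principal result.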

Note that the fact that the bold element is empty has nothing to do with it;
the indentation will occur for any start tag unless it is suppressed.

I assume that you're not complaining about the translation of <bold></bold> to
<bold/>?

Michael Kay
Saxonica


On 15/01/2013 14:14, Raymond Lillibridge wrote:
> List members,
>
> Due to some batch processing, some of my input XML may have empty elements.
>
> Here is some sample XML:
> <level1>
> <para> Here is some text inside a para tag. <bold></bold> Note that
> the 'bold' element before the word Note is empty.  I would like it to
> stay that way without the insertion of a newline.</para> </level1>
>
>
> When I transform this XML, using Saxon 9.4, the <bold></bold> element is
getting converted similar to the following:
> <level1>
> <para> Here is some text inside a para tag.
>           <bold/> Note that the 'bold' element before the word Note is
> empty.  I would like it to stay that way.</para> </level1>
>
>
> The Problem:
> Due to further batch processing needs, I do not want the insertion of a
newline before the <bold/> element, which is being created after running an
XSLT transformation on the sample XML above.  (XMLSpy does not insert the
newline, by the way, but I want to use Saxon for my transformation.)  In my
xsl file I do not have an explicit template match for the 'bold' element.
>
> My XSL:
> <?xml version="1.0" encoding="UTF-8"?> <xsl:stylesheet version="2.0"
> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:fo="http://www.w3.org/1999/XSL/Format"
> xmlns:mcc="http://www.municode.com/xslt">
>
> <xsl:strip-space elements="*" />
> <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
>
> <!-- To Get:  {$InputDocPath} -->
> <xsl:include href="./MCC_LIB.xsl"/>
>
> <xsl:template match="/">
> 	<xsl:result-document href="{$InputDocPath}/Book_ALL.xml">
> 		<xsl:apply-templates select="node()" />
> 	</xsl:result-document>
> </xsl:template>
>
> <xsl:template match="book">
> 		<xsl:element name="book">
> 			<xsl:element name="bookinfo">
> 				<xsl:element name="title"></xsl:element>
> 				<xsl:element name="subtitle"></xsl:element>
> 			</xsl:element>
> 			<xsl:apply-templates select="node()"/>
> 		</xsl:element>
> </xsl:template>
>
>
> <xsl:template match="level1|level2|level3|level4|level5|level6">
> 	<xsl:copy-of select="./node()" copy-namespaces="no" />
> </xsl:template>
>
>
> <!-- CATCH-ALL ==================================================== -->
> 	<xsl:template match="@*|node()">
> 		<xsl:copy-of select="./node()" copy-namespaces="no" />
> 	</xsl:template>
> </xsl:stylesheet>
>
>
>
> Looking in the Saxon documentation, I was not able to find a switch to
control the transformation behavior that changes the <bold></bold> to
(newline)<bold/>.
>
> I'd rather not use XMLSpy since all my other batch transformations are using
Saxon.
> If there is a configuration switch for Saxon, could someone direct me where
I may learn about it.
>
> Or, would it be more practical to write a template to remove the "empty"
<bold></bold> element, or better yet, remove all empty elements?  I don't know
how this would be written, and would appreciate any insights someone may
offer.
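
One way such a template might be written is sketched below.  It is an
illustration, not part of the original stylesheet: an identity transform plus
one empty template that drops any element with no child nodes and no
attributes.  The attribute test is an assumption made so that empty elements
which carry data, such as <footnoteref id="..."/>, are kept.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Whitespace-only text is stripped first, so <bold> </bold>
       counts as empty too. -->
  <xsl:strip-space elements="*"/>

  <!-- Identity copy of everything not matched more specifically. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop elements that have neither child nodes nor attributes.
       To remove only empty bold elements, use match="bold[not(node())]". -->
  <xsl:template match="*[not(node()) and not(@*)]"/>

</xsl:stylesheet>
```

This does not cascade: a parent that becomes empty only because its children
were removed is still copied.  It also does nothing about indentation, which
is applied to any start tag unless it is suppressed.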
>
>
> Kind regards,
>
> Raymond Lillibridge
> Sr. Software Engineer
> rlillibridge@xxxxxxxxxxxx
> Municipal Code Corporation | Facebook | Twitter
