Re: [xsl] preserve structure of input XML file

Subject: Re: [xsl] preserve structure of input XML file
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Wed, 18 May 2011 11:54:32 -0400
Hi,

To clarify Richard's comment: he is describing a situation in which the whitespace is preserved in the result document but does not appear in the rendition, because the application processing the result does not display it.

Richard mentions how this can happen in browser display of HTML and describes a workaround; but of course this can be a problem generally and not just in HTML. For example, XSL-FO formatters will also collapse whitespace except where they are told not to.
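
As a rough illustration of "telling the formatter not to", here is a minimal sketch of a template that emits an fo:block with the standard XSL-FO whitespace properties set. The match pattern "abstract" is borrowed from the example in the quoted message below, and the monospace font is an assumption to keep columns aligned. The fragment is not a complete FO document (there is no fo:root or page layout), and how faithfully each property is honored may vary between formatters.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Assumption: the source element is named "abstract", as in the quoted example. -->
  <xsl:template match="abstract">
    <!-- white-space-collapse="false": keep runs of spaces.
         white-space-treatment="preserve": keep whitespace around line breaks.
         linefeed-treatment="preserve": keep the author's line breaks.
         wrap-option="wrap": still wrap lines that are too long for the block. -->
    <fo:block font-family="monospace"
              white-space-collapse="false"
              white-space-treatment="preserve"
              linefeed-treatment="preserve"
              wrap-option="wrap">
      <xsl:value-of select="."/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```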

This is why it's important to distinguish between the requirement for "preserving whitespace in the result document" and the requirement for "preserving whitespace in display". Although you can't preserve it in display if you haven't preserved it at all, these are still two separate questions. And only one of them (the first) is properly an XSLT question; the other is a question about the output format and how an application handles it.

Cheers,
Wendell

On 5/17/2011 7:35 PM, Richard Fozzard wrote:
Another question is: what is your output?

As Wendell points out, if your output is XML, whitespace is usually
preserved. But if you're trying to generate HTML from an XML element like:

<abstract>
The primary parameters measured in this dataset are:
  - temperature
  - wind speed
  - humidity

The units are:

Temperature  Wind Speed  Humidity
==============================
degrees C    km/h        percent

Global Attributes of level 1a datasets are: Mission and Documentation,
Data Time, Data Quality, File Metrics, and Scene Coordinates. Vgroups
included in the dataset are Scan-Line Attributes, Raw SeaStar Data,
Converted Telemetry, Navigation, Sensor Tilt, and Calibration. Of the
six Vgroups, four Vgroups, Scan-Line Attributes, Raw SeaStar Data,
Converted Telemetry, and Navigation, contain data that are functions of
scan lines.
</abstract>

and your XSLT does:

<p><xsl:value-of select="abstract"/></p>

Any HTML browser would collapse all your significant whitespace, losing
the indenting and the table, squishing everything together into an
unreadable mess.

If you simply used <pre>...</pre> instead, then you'd keep the indenting
and table, but the final paragraph would scroll endlessly to the right,
rather than wrapping with the window size.
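
One middle ground, sketched here as an assumption rather than as anything from the stylesheets mentioned in this thread, is to keep the <pre> element but ask the browser to wrap long lines with the CSS declaration "white-space: pre-wrap". Indenting and table columns are kept, and long paragraphs wrap at the window edge, although the author's hard line breaks inside paragraphs are kept as well. The match pattern "abstract" follows the example above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Assumption: the source element is named "abstract", as in the example above. -->
  <xsl:template match="abstract">
    <!-- pre-wrap preserves spaces and line breaks but lets long lines wrap. -->
    <pre style="white-space: pre-wrap;"><xsl:value-of select="."/></pre>
  </xsl:template>

</xsl:stylesheet>
```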

If this is your problem, you might consider using our printFormatted.xsl
template which tries to guess the intent of the author, and preserve
whitespace when it finds consecutive spaces and tabs, but outputs as an
ordinary paragraph otherwise:

http://www.ngdc.noaa.gov/metadata/published/views/xml2text/xml-to-text-ISO.xsl


which imports:


http://www.ngdc.noaa.gov/metadata/published/views/xml2text/printFormatted.xsl


We've found it to work reasonably well on many different combinations of whitespace.
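
To give a feel for the kind of heuristic described above, here is a minimal sketch. It is not the code in printFormatted.xsl, whose actual logic is not reproduced here; it is an assumed simplification that needs an XSLT 2.0 processor (for tokenize() and matches()). It splits the text into blocks at blank lines, emits a block as <pre> when it contains two consecutive spaces or a tab, and otherwise emits it as an ordinary paragraph.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Assumption: the source element is named "abstract", as in the example above. -->
  <xsl:template match="abstract">
    <!-- Split at blank lines and skip blocks that contain only whitespace. -->
    <xsl:for-each select="tokenize(., '\n[ \t]*\n')[normalize-space(.)]">
      <xsl:choose>
        <!-- Two consecutive spaces or a tab suggest deliberate layout. -->
        <xsl:when test="matches(., '  |\t')">
          <pre><xsl:value-of select="."/></pre>
        </xsl:when>
        <!-- Otherwise treat the block as running text and let the browser wrap it. -->
        <xsl:otherwise>
          <p><xsl:value-of select="normalize-space(.)"/></p>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

A block indented by a single space would not be detected by this test, so the threshold is a judgment call that depends on how the source text was written.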

! or ?
--Rich

Richard Fozzard, Computer Scientist
Geospatial Metadata at NGDC: http://www.ngdc.noaa.gov/metadata

Cooperative Institute for Research in Environmental Sciences (CIRES)
Univ. Colorado & NOAA National Geophysical Data Center, Enterprise Data
Systems 325 S. Broadway, Skaggs 1B-305, Boulder, CO 80305
Office: 303-497-6487, Cell: 303-579-5615, Email: richard.fozzard@xxxxxxxx

--
======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD 20850                                  Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
