RE: AW: [xsl] Detecting carriage return and newline feed in XML Data

Subject: RE: AW: [xsl] Detecting carriage return and newline feed in XML Data
From: "Michael Kay" <mike@xxxxxxxxxxxx>
Date: Mon, 1 Nov 2004 11:26:04 -0000
OK, it seems that what you need to preserve is the fact that there is a
newline - not the particular representation of the newline (NL vs CR/LF).

Newlines in attribute values are not considered significant by an XML
parser, and are converted to spaces. For this kind of content it would be
much better to generate elements rather than attributes, then the newlines
(but not their particular representation) would be preserved. The
alternative would be to represent the newlines within the attribute as
&#xa;. If you can't change the code that generates the XML, then your only
option is to preprocess the XML with some non-XML-aware tool.
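
To make the attribute-versus-element difference concrete, here is a minimal sketch in Python (not part of the original exchange). It assumes the standard library's xml.etree.ElementTree, which uses the expat parser; the element name `e`, the attribute name `v`, and the helper `attr_value` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def attr_value(xml_text, name="v"):
    """Parse a one-element document and return the named attribute's value."""
    return ET.fromstring(xml_text).get(name)


# A literal newline inside an attribute value is normalized to a space.
literal = attr_value('<e v="one\ntwo"/>')

# The same newline written as a character reference survives parsing.
escaped = attr_value('<e v="one&#10;two"/>')

# In element content the newline is kept, but CR/LF is reported as a single LF.
element_text = ET.fromstring('<e>one\r\ntwo</e>').text

print(repr(literal), repr(escaped), repr(element_text))
```

Only the second and third forms give a stylesheet anything it can split into paragraphs.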

XSLT can't help you here, I'm afraid: the damage is done before XSLT kicks
in.
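
A rough sketch of the kind of non-XML-aware preprocessing meant above, again in Python. The function name `escape_attr_newlines` is hypothetical, and the scanner is deliberately naive: it assumes the input has already been decoded to text, and that comments, CDATA sections and processing instructions contain no quote characters or line breaks that would confuse it. Treat it as a starting point, not a robust tool.

```python
import xml.etree.ElementTree as ET


def escape_attr_newlines(text):
    """Rewrite line breaks inside quoted attribute values as &#10;.

    Naive character scanner: it only tracks "inside a tag" and
    "inside a quoted value", nothing else.
    """
    # Treat CR/LF and bare CR as a single newline, as an XML parser would.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    out = []
    in_tag = False
    quote = None  # the quote character that opened the current value
    for ch in text:
        if quote:
            if ch == quote:
                quote = None
            elif ch == "\n":
                out.append("&#10;")
                continue
        elif in_tag:
            if ch in "\"'":
                quote = ch
            elif ch == ">":
                in_tag = False
        elif ch == "<":
            in_tag = True
        out.append(ch)
    return "".join(out)


raw = '<p v="first\r\nsecond"\n   w="x"/>'
fixed = escape_attr_newlines(raw)
value = ET.fromstring(fixed).get("v")
print(fixed)
print(repr(value))
```

Line breaks between attributes are left alone; only those inside a quoted value are rewritten. Any indentation that follows a line break inside the value stays in the value and would still need trimming downstream.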

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: michella@xxxxxxx [mailto:michella@xxxxxxx]
> Sent: 01 November 2004 10:49
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: AW: AW: [xsl] Detecting carriage return and newline
> feed in XML Data
>
> > XML input is processed by the XML parser before it gets anywhere near the
> > XSLT processor. The only way to prevent XML's normalization of whitespace
> > characters (whether in element or attribute content) is to write the
> > characters as character references, e.g. &#x0D; You can of course do that by
> > preprocessing the file in some non-XML-aware tool before submitting it to
> > the XML parser.
> >
> > Are you really sure you need to do this? Somehow, you're not using XML the
> > way it was intended to be used and that's always bad news. I've forgotten
> > what your original problem was, if you ever explained it.
> >
> > Michael Kay
>
>
> OK, let me explain the whole problem:
>
> 1. The XML document is generated by System Architect (Popkin
> Software). This software is intended to help build EAI
> (Enterprise Application Integration).
> 2. Each diagram, as well as each symbol it contains, has its
> own user-defined properties. One of them is a free text field
> (here SAProperty/@SAPrpValue), which we use to freely describe
> the property of its respective symbol.
> 3. The text inside is divided into a number of paragraphs (which
> are commonly separated by a carriage return and line feed).
> 4. System Architect cleverly exports all structured
> diagrams and their properties into one single XML document. The text
> field described above is also stored as an attribute of an
> XML element. Below is a (tiny) part of the overall 60MB XML document:
>
> <?xml version="1.0" encoding="UTF-16" ?>
> <Classes>
> 	<Class>
> 		<SADefinition SAObjId="_2753"
> SAObjName="app_HybridPost" SAObjMinorTypeName="Application"
> 		SAObjMinorTypeNum="309" SAObjMajorTypeNum="3"
> SAObjAuditId="MiL"
> SAObjUpdateDate="25.08.2004" SAObjUpdateTime="09:20:26">
> 		<SAProperty SAPrpName="Description"
> 		SAPrpValue="Mit der Anwendung HybridPost wird
> die bestehende Infrastruktur von Postfinance für den
> 	Druck und die Verpackung von Kundendokumenten von
> Drittkunden im Printcenter Zürich genutzt.
> 		Das Projekt &quot;Strategie HybridPost&quot;,
> das sich zur Zeit in der Voranalyse-Phase befindet, hat zum
> 	Ziel, die HybridPost-Lösung weiterzuentwickeln und
> zusätzliche Komponenten wie Archivierung, Billing,
> 	Druck, Verpackung und Call-Center in die bestehende
> Lösung zu integrieren.
> 		Die Plattform HypoShare wird als Teil des
> Anwendungssystems HybridPost modelliert." SAPrpEditType="1"
> 	SAPrpLength="4074"/>
> 		<SAProperty SAPrpName="GUID"
> SAPrpValue="b1318511-4b95-11d6-8062-00c09f0645a1"
> 		SAPrpEditType="1" SAPrpLength="64"/>
> 		...
> 		BLABLABLA....
> 		...
> 	</Class>
> </Classes>
>
> 5. You'll see that after the words "genutzt." and
> "integrieren." there is a carriage return (assuming that your
> browser handles it).
> 6. I need my FOP-processed PDF document to be printed
> without losing the paragraphs.
>
> I hope it will help ;-)
>
> Cheers
>
> Lawrence
