Re: [xsl] best practices for preserving spaces in mixed content when making XML to XML

Subject: Re: [xsl] best practices for preserving spaces in mixed content when making XML to XML
From: "John P. McCaskey" <groups@xxxxxxxxxxxxxxxx>
Date: Sat, 07 Sep 2013 16:58:02 -0400
Dorothy,

Challenges of preserving whitespace in mixed content come up often in TEI encoding. You may find this helpful:
http://wiki.tei-c.org/index.php/XML_Whitespace
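
On question 2 (and partly 3), here is a minimal sketch of the usual identity-transform approach, assuming XSLT 1.0. The parameter names label-text and label-sep and the 'images/' prefix are made up for illustration and do not come from your files. The main points are to avoid xsl:strip-space, keep indent="no", and let apply-templates walk node() so that text nodes are copied along with the inline elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No xsl:strip-space here, and indent="no": the processor should
       neither drop whitespace-only text nodes between inlines nor add
       new whitespace inside mixed content. -->
  <xsl:output method="xml" indent="no"/>

  <!-- Hypothetical parameters: override per language when invoking -->
  <xsl:param name="label-text" select="'label text'"/>
  <xsl:param name="label-sep" select="': '"/>

  <!-- Identity template: copies every node, text nodes included -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Put the label inside the first para of a warning note -->
  <xsl:template match="note[@type='warning']/para[1]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <label><xsl:value-of select="concat($label-text, $label-sep)"/></label>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Example of changing an inline: rewrite image/@href.
       The 'images/' prefix is an assumption for illustration. -->
  <xsl:template match="para//image/@href">
    <xsl:attribute name="href">
      <xsl:value-of select="concat('images/', .)"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

If spaces still disappear with something like this, the likely culprits are an xsl:strip-space declaration elsewhere, a template that uses xsl:value-of or normalize-space() on the para, or a parser that drops whitespace-only text nodes before the stylesheet ever sees them.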

-- John



On 9/6/2013 12:32 PM, Dorothy Hoskins wrote:
Hi, I would like advice on the following situation:
I am copying the majority of one XML to another XML but I want to
insert some label text into some elements.
Beginning XML:
<note type="warning"><para>This is <b>mixed content</b> text for a
specific purpose.</para></note>

Final XML could be this:
<note type="warning"><para><label>label text: </label>This is <b>mixed
content</b> text for a specific purpose.</para></note>

or it could be this:
<note type="warning"><label>label text: </label><para>This is <b>mixed
content</b> text for a specific purpose.</para></note>
The <label> will have distinctive output formatting.

My questions:
1) Which output is generally preferable, when <para> is a block
element - <label> before <para> or <label> inside <para>? Since <para>
is a block element, it seems that it would be better to put the
<label> inside the <para> rather than before it, within the <note>.

2) Assuming that I might want to make changes within the <para> also,
what is the best method to preserve the spaces around inline elements
such as <b>? For example, if there is an inline <image> in the <para>,
I might want to change its @href, but probably I would simply copy
the <b>. If there aren't any inlines, I could just copy the <para>
without further processing, but I've had a problem retaining the
spaces around inlines when using apply-templates.

3) Is it better practice to insert the punctuation (colon and space)
after generated label text as part of the <label> content, or to
output them as text literals after the <label>? For example, I can
generate the text of the label inside the <label> element, followed by
<xsl:text>: </xsl:text> or use some other method such as a variable to
externalize the following punctuation and pull that variable value in
with logic, so that the punctuation could be different for different
languages.

Thanks, Dorothy
