Subject: Re: [xsl] best practices for preserving spaces in mixed content when making XML to XML
From: "John P. McCaskey" <groups@xxxxxxxxxxxxxxxx>
Date: Sat, 07 Sep 2013 16:58:02 -0400
Challenges of preserving whitespace in mixed content come up often in TEI encoding. You may find this helpful: http://wiki.tei-c.org/index.php/XML_Whitespace
Hi, I would like advice on the following situation: I am copying most of one XML document to another, but I want to insert some label text into some elements. Beginning XML: <note type="warning"><para>This is <b>mixed content</b> text for a specific purpose.</para></note>
Final XML could be this: <note type="warning"><para><label>label text: </label>This is <b>mixed content</b> text for a specific purpose.</para></note>
or it could be this: <note type="warning"><label>label text: </label><para>This is <b>mixed content</b> text for a specific purpose.</para></note> The <label> will have distinctive output formatting.
My questions: 1) Which output is generally preferable, when <para> is a block element - <label> before <para> or <label> inside <para>? Since <para> is a block element, it seems that it would be better to put the <label> inside the <para> rather than before it, within the <note>.
2) Assuming that I might want to make changes within the <para> also, what is the best method to preserve the spaces around inline elements such as <b>? For example, if there is an inline <image> in the <para>, I might want to change its @href, but probably I would simply copy the <b>. If there aren't any inlines, I could just copy the <para> without further processing, but I've had a problem retaining the spaces around inlines when using apply-templates.
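One common way to approach question 2 is an identity transform with a few overriding templates. The following is only a minimal sketch, assuming XSLT 1.0, the element names from the example above, and a hypothetical 'images/' prefix for the rewritten @href. It relies on apply-templates with select="node()", which visits text nodes as well as elements, and it avoids xsl:strip-space, normalize-space() and indent="yes", each of which can change whitespace in mixed content.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- indent="no" so the serializer does not add whitespace inside mixed content.
       Note that there is deliberately no xsl:strip-space declaration. -->
  <xsl:output method="xml" indent="no"/>

  <!-- Identity template: copies every attribute and node unchanged,
       including the text nodes that carry the spaces around inlines. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Insert the label as the first child of the first para in a warning note. -->
  <xsl:template match="note[@type='warning']/para[1]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <label>label text: </label>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Rewrite the href of inline images; the 'images/' prefix is a
       hypothetical example, not something taken from the original post. -->
  <xsl:template match="para//image/@href">
    <xsl:attribute name="href">
      <xsl:value-of select="concat('images/', .)"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

With this layout, <b> and any other inline without its own template falls through to the identity template, so the surrounding text nodes are copied as they are. If spaces still disappear, the parser or processor configuration (for example a whitespace-stripping option on the input side) may be worth checking.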
3) Is it better practice to insert the punctuation (colon and space) after generated label text as part of the <label> content, or to output them as text literals after the <label>? For example, I can generate the text of the label inside the <label> element, followed by <xsl:text>: </xsl:text> or use some other method such as a variable to externalize the following punctuation and pull that variable value in with logic, so that the punctuation could be different for different languages.
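For question 3, one option is to keep both the label text and the trailing punctuation in stylesheet parameters, so that a localized run can override them without touching the templates. This is a sketch under assumptions: the parameter names label-separator and warning-label, and the default value 'Warning', are hypothetical and chosen only for illustration.

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Hypothetical parameters; both can be overridden per language
       when the transformation is invoked. -->
  <xsl:param name="warning-label" select="'Warning'"/>
  <xsl:param name="label-separator" select="': '"/>

  <!-- Identity template: copies everything not matched more specifically. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The separator is emitted inside the label, so the label text and its
       punctuation share the same distinctive formatting. -->
  <xsl:template match="note[@type='warning']/para[1]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <label>
        <xsl:value-of select="concat($warning-label, $label-separator)"/>
      </label>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Whether the separator belongs inside or after the <label> depends on how the label is styled downstream; moving the xsl:value-of for $label-separator to just after the closing </label> tag gives the other variant with the same parameters.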