RE: [xsl] How To Map From Hierarchy to Wrapped Text Sequences?

Subject: RE: [xsl] How To Map From Hierarchy to Wrapped Text Sequences?
From: "Bjorndahl, Brad" <brad.bjorndahl@xxxxxxxxxxxxxxxx>
Date: Thu, 10 Apr 2008 11:19:26 -0400
Eliot,

This is how I handle a similar problem (DITA to Adobe MIF):
www.biglist.com/lists/lists.mulberrytech.com/xsl-list/archives/200803/msg00608.html
I think this is relevant.
I also have a special "FM formats" file in XML that has variables that contain
element/FM style mappings. I xsl:include it in my transforms so I can look up
the correlation.

Brad


-----Original Message-----
From: Eliot Kimber [mailto:ekimber@xxxxxxxxxxxx]
Sent: April 10, 2008 10:43 AM
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: [xsl] How To Map From Hierarchy to Wrapped Text Sequences?

I am working on generating Adobe InCopy article (INCX) files from DITA source.
The challenge I face is that the DITA source is typical documentation XML,
where you have mixed content with embedded inline elements that may be nested
to any depth, e.g.:

<p>Some text <i>italic text <b>now bold italic</b> back to italic</i> more
text</p>

In the INCX representation of this, each text string with distinct formatting
is separately wrapped as a "text run", making the above into:

<txsr><pcnt>Some text </pcnt></txsr>
<txsr><pcnt>italic text </pcnt></txsr>
<txsr><pcnt>now bold italic</pcnt></txsr>
<txsr><pcnt> back to italic</pcnt></txsr>
<txsr><pcnt> more text&#x0a;</pcnt></txsr>

(INCX details omitted for simplicity)

An INCX file is essentially just a long sequence of txsr elements. There is no
structural nesting in the InCopy data--newlines are the only structural
markers (newlines signal paragraph breaks, so all input newlines have to be
normalized to whitespace and newlines emitted only at the ends of visual
blocks).

There is no requirement that the data be normalized so as to produce the
smallest number of text runs to achieve the formatting result, but I do have to
correlate specific input element types to the appropriate character and
paragraph styles for each text run (not shown in the example above, but each
txsr points to the character and paragraph style that determines its
formatting).

The only obvious solution I can think of for this using XSLT 2 is to do two
passes:

1. Generate an intermediate data set where blocks are wrapped and each font
change is indicated by an empty marker element.

2. Use for-each-group to translate the text with markers into text runs.
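
The two passes above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a tested INCX generator: the <block> and <mark> intermediate elements, the f:style function, the "plain" style name, the style attribute on txsr, and the <story> wrapper are all hypothetical names invented for the sketch (real INCX style references are omitted here, as in the example above), and only <i> and <b> are treated as formatting elements. It needs an XSLT 2.0 processor.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="urn:example:local"
    exclude-result-prefixes="xs f">

  <!-- Hypothetical helper: derive a style name from the enclosing
       i/b elements, outermost first (e.g. "i-b"), or "plain" if none. -->
  <xsl:function name="f:style" as="xs:string">
    <xsl:param name="node" as="node()"/>
    <xsl:variable name="names"
        select="$node/ancestor-or-self::*[self::i or self::b]/name()"/>
    <xsl:sequence
        select="if (empty($names)) then 'plain' else string-join($names, '-')"/>
  </xsl:function>

  <!-- Driver: run pass 1 into a temporary tree, then pass 2 over it. -->
  <xsl:template match="/">
    <xsl:variable name="marked">
      <xsl:apply-templates select="//p" mode="mark"/>
    </xsl:variable>
    <story>
      <xsl:apply-templates select="$marked/block"/>
    </story>
  </xsl:template>

  <!-- Pass 1: each paragraph becomes a flat block that opens with a marker. -->
  <xsl:template match="p" mode="mark">
    <block>
      <mark style="plain"/>
      <xsl:apply-templates mode="mark"/>
    </block>
  </xsl:template>

  <!-- Pass 1: an empty marker at every font change, on entry and on exit. -->
  <xsl:template match="i | b" mode="mark">
    <mark style="{f:style(.)}"/>
    <xsl:apply-templates mode="mark"/>
    <mark style="{f:style(..)}"/>
  </xsl:template>

  <!-- Pass 1: input newlines are normalized to spaces. -->
  <xsl:template match="text()" mode="mark">
    <xsl:value-of select="translate(., '&#10;&#13;', '  ')"/>
  </xsl:template>

  <!-- Pass 2: each marker starts a group; non-empty groups become text runs. -->
  <xsl:template match="block">
    <xsl:for-each-group select="node()" group-starting-with="mark">
      <xsl:variable name="text"
          select="string-join(current-group()[self::text()], '')"/>
      <xsl:if test="$text != ''">
        <txsr style="{@style}"><pcnt><xsl:value-of select="$text"/></pcnt></txsr>
      </xsl:if>
    </xsl:for-each-group>
    <!-- The only newline emitted: a separate run that ends the block. -->
    <txsr style="plain"><pcnt><xsl:text>&#10;</xsl:text></pcnt></txsr>
  </xsl:template>

</xsl:stylesheet>
```

Because every block opens with a marker, the context item inside for-each-group is always the marker that starts the group, so @style can be read from it directly. The paragraph-ending newline goes into its own run rather than being appended to the last text run, which relies on there being no requirement to minimize the number of runs.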

Is there a simpler or more elegant solution I'm missing?

Thanks,

Eliot

--
Eliot Kimber
Senior Solutions Architect
"Bringing Strategy, Content, and Technology Together"
Main: 610.631.6770
www.reallysi.com
www.rsuitecms.com
