Re: [xsl] Managing cross references when combining XML files into one document

Subject: Re: [xsl] Managing cross references when combining XML files into one document
From: "Eric Larson" <ionrock@xxxxxxxxx>
Date: Fri, 29 Dec 2006 10:21:00 -0600
You can try to break the processing into stages by using something
simple such as:

<!-- source: Michael Kay (I think) -->
<xsl:variable name="phase-1-output">
 <xsl:apply-templates select="/" mode="phase-1"/>
</xsl:variable>

<xsl:variable name="phase-2-output">
 <xsl:apply-templates select="$phase-1-output" mode="phase-2"/>
</xsl:variable>

<xsl:template match="/">
 <xsl:copy-of select="$phase-2-output"/>
</xsl:template>


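I am not sure my example is correct, but it might spark some ideas on how
to create something of a mini-pipeline within one stylesheet. One caveat:
in XSLT 1.0 a variable built this way is a result tree fragment, so
applying templates to it requires a node-set() extension function (EXSLT's
exsl:node-set() or Xalan's xalan:nodeset(), for example).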
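
For an XSLT 1.0 processor, the same two-phase idea might be sketched as
below. This is only a sketch under assumptions: it presumes the processor
supports the EXSLT common namespace (recent Xalan builds are documented as
doing so, but check your version), and the identity templates are
placeholders for whatever each phase really needs to do.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <!-- Phase 1: capture the first pass as a result tree fragment -->
  <xsl:variable name="phase-1-output">
    <xsl:apply-templates select="/" mode="phase-1"/>
  </xsl:variable>

  <!-- Entry point: convert the fragment to a node-set (assumes EXSLT
       support) and run phase 2 over it; its output is the final result -->
  <xsl:template match="/">
    <xsl:apply-templates select="exsl:node-set($phase-1-output)"
                         mode="phase-2"/>
  </xsl:template>

  <!-- Placeholder identity templates, one per mode; override specific
       elements in each mode to do the real work -->
  <xsl:template match="@*|node()" mode="phase-1">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="phase-1"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="@*|node()" mode="phase-2">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="phase-2"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

In this arrangement the include-expanding templates would live in mode
"phase-1" and the link-fixing templates in mode "phase-2", so the second
phase sees the fully merged tree.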

Also, depending on what you want to do with your output (you mentioned
html) you can use something like WebWorks ePublisher to process the
output. It uses XSL for all its processing, which makes it very powerful
for those who know XSL. It would take care of all your cross reference
issues and probably a good deal more.

***Note: I do work for Quadralay, who make ePublisher, so I am very
biased in this case. With that said, I truly think it is a great tool
for working with FrameMaker and XSL.

At the very least, I hope the single stylesheet pipeline technique
might give you some ideas.

Good luck.

Eric

On 12/29/06, Trevor Nicholls <trevor@xxxxxxxxxxxxxxxxxx> wrote:
Hi, I have a problem with multiple documents and a couple of approaches
whose feasibility I want to explore.

Background:

I have a library of XML documents, and a document schema which allows
documents to "include" other documents via an
  <include srcfile="uri" />
element. Each XML file is structured with a top-level <document> node.

The documents are edited using structured FrameMaker, which allows an XSL
process to run on the opened file before the user sees it (this process
performs a few Frame-required mods to the source, such as wrapping all
graphic elements in a paragraph-type element, and adjusting table
structures), and similarly another XSL process to run on the saved file
(this is the logical converse of the import XSL, obviously).

So far so good. At this point I have two XSL files which merge all the
"included" XML into one input file for FrameMaker to edit, and split it all
apart again on save. FrameMaker uses Xalan-C as its XSL engine, so we are
stuck with XSLT 1.0 features, but this is sufficient (with Xalan's
<redirect:write> extension), and our files roundtrip.

The basic import XSL (without the Frame requirements) is this:
  <xsl:template match="/">
    <xsl:apply-templates>
      <xsl:with-param name="currdoc" select="''" />
    </xsl:apply-templates>
  </xsl:template>

  <!-- Include content of subfiles (inside an 'include' element) -->
  <xsl:template match="include">
    <xsl:element name="include">
      <xsl:attribute name="srcfile">
        <xsl:value-of select="@srcfile" />
      </xsl:attribute>
      <xsl:apply-templates select="document(@srcfile)/document">
        <xsl:with-param name="currdoc" select="@srcfile" />
      </xsl:apply-templates>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*">
    <xsl:param name="currdoc" />
    <xsl:copy>
      <xsl:apply-templates select="@*|node()">
        <xsl:with-param name="currdoc" select="$currdoc" />
      </xsl:apply-templates>
    </xsl:copy>
  </xsl:template>

Because the outer document name is unknown, the currdoc parameter is empty
unless the current node is inside at least one include level.

The requirement now is to handle cross-document references. These are noted
with a
  <link src="uri" | idref="id" />
element (where one or the other attribute is supplied, but not both). The
src attribute is used for external references, while the idref attribute is
used for internal references - this requirement is due to the way in which
FrameMaker records cross-references.

I don't believe that the export side of this is going to be too difficult;
most links in the merged file will be internal (idref), and I can use this
idref to look up the id; that element's parent <include> gives me the target
file while the link's parent <include> gives me the source file; if they are
the same I retain the idref attribute, otherwise I replace it with the src
attribute. No problems there.
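
The export-side rule just described could be sketched roughly as follows.
The details are assumptions, not something fixed above: target elements
are taken to carry an "id" attribute, the key name "by-id" is made up for
illustration, and the "file#id" form written into src is hypothetical
(substitute whatever URI form the schema really expects).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Index every element with an id (assumed attribute name) -->
  <xsl:key name="by-id" match="*[@id]" use="@id"/>

  <xsl:template match="link[@idref]">
    <!-- File containing the link: its nearest enclosing include -->
    <xsl:variable name="source-file"
                  select="string(ancestor::include[1]/@srcfile)"/>
    <!-- File containing the target: the target's nearest enclosing include -->
    <xsl:variable name="target-file"
                  select="string(key('by-id', @idref)/ancestor::include[1]/@srcfile)"/>
    <link>
      <xsl:choose>
        <!-- Same file: keep the internal reference -->
        <xsl:when test="$source-file = $target-file">
          <xsl:attribute name="idref">
            <xsl:value-of select="@idref"/>
          </xsl:attribute>
        </xsl:when>
        <!-- Different file: switch to an external reference
             (hypothetical "file#id" form) -->
        <xsl:otherwise>
          <xsl:attribute name="src">
            <xsl:value-of select="concat($target-file, '#', @idref)"/>
          </xsl:attribute>
        </xsl:otherwise>
      </xsl:choose>
    </link>
  </xsl:template>

  <!-- Everything else is copied unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Note that a target sitting in the outer document has no enclosing include,
so $target-file comes out empty; that is the same unknown-filename problem
described for the import side.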

The difficulties lie on the import side. Here the lookup to the target id is
not possible because that id (probably) is not in the initial input file but
in one of the included documents. If I had control over the XSL steps I
could execute a second stylesheet to do a second pass after the document had
been fully expanded, but I can only invoke a single XSL step and I don't
know how to do two-stage processing, or even if it is possible. An
alternative option would be to treat all links as external (even when they
point to an internal element in the expanded file) but to do this I need to
know the document filename. Again, if I had control of the XSL steps I could
pass the name in a parameter, but I don't and I can't. Is there a smart way
of accessing the document name (it would become a Xalan-specific question at
this point)?

Thanks for any advice
Trevor
