Re: [xsl] Merging two sets of files

From: davep <davep@xxxxxxxxxxxxx>
Date: Tue, 03 Apr 2012 17:46:45 +0100
On 03/04/12 09:55, Emma Burrows wrote:
I'm currently using XSLT 2.0 (using Saxon 9.3 via Oxygen 12) to merge
two sets of XML files together based on a third file which is a kind
of lookup table. However, I'm coming across a problem when I need to
effectively merge two source files into the same output file, and I
need some suggestions on a change of approach.

I have the following XML files:

- Main document - let's call it book.xml. This contains various types
of topics, including about 4000 topics related to drugs, each
identified by a unique id.

- Ancillary drug information files auto-generated from an online drug
database. These are about 10,000 little XML files, each named after
the unique id of the drug information in the online catalogue.

- An XML file - let's call it lookup.xml - that is essentially a
look-up table, matching ids in book.xml to one or more drug catalogue
ids, and vice versa. However, not all records in book.xml have an
entry in lookup.xml.

Now my requirement is to convert book.xml from its current
proprietary format into a DITA-based specialisation, and while I'm
doing that:

1- Output the records with no corresponding catalogue entry as
standalone documents.

2- Merge each drug record in book.xml that has catalogue entries with
the corresponding auto-generated catalogue file(s), based on

3- If a record in book.xml has more than one catalogue id in
lookup.xml, I need to copy the book.xml record into every one of the
corresponding auto-generated files.

4- If more than one record in book.xml corresponds to one catalogue
id in lookup.xml, I need to merge all the book.xml records with that
same catalogue file.

Point 4 is the immediate stumbling block because my solution to
fulfilling points 2 and 3 was as follows:

1. Convert the book.xml drug record into the desired DITA format and
place that in a variable. I'm doing this based on a matched template,
so this happens whenever the processor "encounters" a drug record as
it traverses book.xml. This ensures that I can export records with no
catalogue id and keep track of where the record was in the

Re point 4

How will you collect those records / catalogs together so that you
can search for count(some id value) > 1?

Think two (or more) passes, possibly with one of them checking for
'errors' such as this.
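
A minimal sketch of the two-pass idea, in XSLT 2.0. The structure of
lookup.xml was not shown in the thread, so the element and attribute
names used here (lookup, map, @book-id, @cat-id) are purely
hypothetical, as are the output names (merge-plan, catalogue, record,
@shared). Pass 1 inverts the lookup table with xsl:for-each-group so
that each catalogue id carries the list of book.xml records pointing at
it; pass 2 walks that temporary tree and flags the catalogue ids that
have more than one record (the point 4 case).

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumes a hypothetical lookup.xml shaped like:
       <lookup>
         <map book-id="b1" cat-id="c10"/>
         <map book-id="b2" cat-id="c10"/>
         <map book-id="b2" cat-id="c11"/>
       </lookup>
     Run it with lookup.xml as the source document. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/lookup">
    <!-- Pass 1: group the lookup entries by catalogue id and store
         the result in a temporary tree. -->
    <xsl:variable name="by-catalogue">
      <xsl:for-each-group select="map" group-by="@cat-id">
        <catalogue id="{current-grouping-key()}">
          <!-- distinct-values() guards against duplicate entries -->
          <xsl:for-each
              select="distinct-values(current-group()/@book-id)">
            <record ref="{.}"/>
          </xsl:for-each>
        </catalogue>
      </xsl:for-each-group>
    </xsl:variable>

    <!-- Pass 2: process the grouped data, one catalogue id at a time. -->
    <merge-plan>
      <xsl:apply-templates select="$by-catalogue/catalogue" mode="plan"/>
    </merge-plan>
  </xsl:template>

  <!-- Mark catalogue ids that more than one book.xml record maps to. -->
  <xsl:template match="catalogue" mode="plan">
    <xsl:copy>
      <xsl:copy-of select="@id"/>
      <xsl:attribute name="shared" select="count(record) gt 1"/>
      <xsl:copy-of select="record"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With the sample data in the comment, catalogue c10 would be marked as
shared (two records map to it) and c11 would not. In a real stylesheet
the second pass would presumably write one output file per catalogue
id, pulling in every converted book.xml record listed for it, rather
than writing a plan document.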

Sounds like the DITA conversion is the easy bit, so look after
the other bits first and leave that till later.


Dave Pawson
