Subject: Re: [xsl] Merging two sets of files
From: davep <davep@xxxxxxxxxxxxx>
Date: Tue, 03 Apr 2012 17:46:45 +0100
I'm currently using XSLT 2.0 (using Saxon 9.3 via Oxygen 12) to merge two sets of XML files together based on a third file which is a kind of lookup table. However, I'm coming across a problem when I need to effectively merge two source files into the same output file, and I need some suggestions on a change of approach.
I have the following XML files:
- Main document - let's call it book.xml. This contains various types of topics, including about 4000 topics related to drugs, each identified by a unique id.
- Ancillary drug information files auto-generated from an online drug database. These are about 10,000 little XML files, each named after the unique id of the drug information in the online catalogue.
- An XML file - let's call it lookup.xml - that is essentially a look-up table, matching ids in book.xml to one or more drug catalogue ids, and vice versa. However, not all records in book.xml have an entry in lookup.xml.
Now my requirement is to convert book.xml from its current proprietary format into a DITA-based specialisation, and while I'm doing that:
1- Output the records with no corresponding catalogue entry as standalone documents.
2- Merge each drug record in book.xml that has catalogue entries with the corresponding auto-generated catalogue file(s), based on lookup.xml.
3- If a record in book.xml has more than one catalogue id in lookup.xml, I need to copy the book.xml record into every one of the corresponding auto-generated files.
4- If more than one record in book.xml corresponds to one catalogue id in lookup.xml, I need to merge all the book.xml records with that same catalogue file.
Point 4 is the immediate stumbling block because my solution to fulfilling points 2 and 3 was as follows:
1. Convert the book.xml drug record into the desired DITA format and place that in a variable. I'm doing this based on a matched template, so this happens whenever the processor "encounters" a drug record as it traverses book.xml. This ensures that I can export records with no catalogue id and keep track of where the record was in the hierarchy.
How could you collect those records and catalogue entries together so that you can test for count(some id value) > 1?
Think in terms of two (or more) passes, possibly with one of them checking for 'errors' such as this.
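As a rough sketch of such a checking pass in XSLT 2.0, run against lookup.xml on its own: the structure of lookup.xml isn't shown in the post, so the lookup and map elements and the book-id and cat-id attributes below are assumptions to be swapped for the real names. It groups the lookup entries by catalogue id and lists every catalogue id that more than one book.xml record points at.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- ASSUMED shape of lookup.xml (hypothetical names):
       <lookup>
         <map book-id="b1" cat-id="c9"/>
         <map book-id="b2" cat-id="c9"/>
       </lookup> -->
  <xsl:template match="/lookup">
    <!-- One group per catalogue id -->
    <xsl:for-each-group select="map" group-by="@cat-id">
      <xsl:variable name="book-ids"
          select="distinct-values(current-group()/string(@book-id))"/>
      <!-- Report only the ids shared by several book records (point 4) -->
      <xsl:if test="count($book-ids) gt 1">
        <xsl:value-of select="current-grouping-key()"/>
        <xsl:text>: </xsl:text>
        <xsl:value-of select="string-join($book-ids, ', ')"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

The output is one line of text per shared catalogue id, followed by the book ids that map to it. Swapping the two attributes around gives the opposite check, i.e. book records that map to more than one catalogue id (point 3).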
Sounds like the DITA conversion is the easy bit, so look after the other bits first and leave that till later.
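With the DITA conversion set aside, a later pass could be driven by catalogue id rather than by the book.xml record, so that each output file is written exactly once however many records feed into it. What follows is only a sketch under assumptions, not a tested solution: lookup.xml is assumed to hold map elements with book-id and cat-id attributes, drug records in book.xml are assumed to be drug elements with an id attribute, the catalogue files are assumed to sit in a catalogue/ directory next to the stylesheet, and the merged wrapper element and out/ paths are made up for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- The source document is book.xml. All element, attribute and
       path names below are hypothetical; adjust to the real data. -->
  <xsl:variable name="book" select="/"/>
  <!-- Relative URIs here resolve against the stylesheet's location -->
  <xsl:variable name="lookup" select="document('lookup.xml')"/>

  <xsl:key name="drug-by-id" match="drug" use="@id"/>

  <xsl:template match="/">
    <!-- One output file per catalogue id (points 2, 3 and 4) -->
    <xsl:for-each-group select="$lookup/lookup/map" group-by="@cat-id">
      <xsl:variable name="cat-id" select="current-grouping-key()"/>
      <xsl:variable name="cat"
          select="document(concat('catalogue/', $cat-id, '.xml'))"/>
      <xsl:result-document href="out/{$cat-id}.xml">
        <merged cat-id="{$cat-id}">
          <xsl:copy-of select="$cat/*"/>
          <!-- Every book.xml record mapped to this catalogue id;
               the third argument makes key() search book.xml -->
          <xsl:copy-of
              select="key('drug-by-id', current-group()/@book-id, $book)"/>
        </merged>
      </xsl:result-document>
    </xsl:for-each-group>

    <!-- Records with no entry in lookup.xml go out standalone (point 1) -->
    <xsl:for-each select="//drug[not(@id = $lookup/lookup/map/@book-id)]">
      <xsl:result-document href="out/standalone-{@id}.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the grouping is on catalogue id, a book record listed against several catalogue ids is copied into each of their files, and several book records sharing one catalogue id all land in the same file. The xsl:copy-of calls are where the DITA conversion templates would eventually be applied instead.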
-- Dave Pawson XSLT XSL-FO FAQ. http://www.dpawson.co.uk