Re: [xsl] sgml to xml

Subject: Re: [xsl] sgml to xml
From: Wendell Piez <wapiez@xxxxxxxxxxxxxxxx>
Date: Thu, 22 Oct 2009 15:02:49 -0400
Hi,

At 01:50 PM 10/22/2009, you wrote:
Another issue I have is that even if the XMLized version is well
formed, I have to deal with the inclusions in the data. Since the XML
format that I am converting to conforms to a schema, I am having a very
hard time writing a transformation to handle these inclusions.

SGML inclusions are bad news if you need to write a truly generalized transformation, i.e. a transformation capable of converting any document conforming to (SGML) DTD A into an equivalent conforming to DTD B. Of course, some cases are worse than others.


Your choices for alleviating the problem are pretty much limited to one of these or a combination:

1. Identify an element in your target format that is legal everywhere the inclusion may appear, and use that.

2. If no such element is available, construct a more complex mapping that accounts for the inclusion in different ways, depending on where the elements turn up.

3. Don't attempt a fully general transformation; instead, control the problem by identifying (usually by analyzing the source data itself, irrespective of its DTD) where the inclusion is actually used, and work from there.

If this isn't possible (maybe your data set is open-ended), then perform a kind of triage, declaring what's in scope for your transformation and which kinds of structures, formally legal in your source data (but hopefully unattested in actual data and unlikely in future data), should be declared out of scope in your transformation. (It can sometimes be helpful to formalize this limitation, for example by having an XML variant of your SGML DTD, without inclusions, to which the data must validate before you accept it as fit for transformation.)
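A rough sketch of the kind of survey meant in option 3, in Python and assuming the data has already been XMLized and is well-formed. The element names (`footnote`, `para`, `title`) and both function names are made up for illustration; substitute whatever your DTD declares as an inclusion.

```python
import xml.etree.ElementTree as ET
from collections import Counter

def survey_inclusion_parents(xml_text, inclusion_name):
    """Count the parent element names under which inclusion_name occurs."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for parent in root.iter():
        for child in parent:
            if child.tag == inclusion_name:
                counts[parent.tag] += 1
    return counts

def out_of_scope(counts, supported_parents):
    """Return parent names attested in the data but not handled by the transformation."""
    return sorted(set(counts) - set(supported_parents))

# Hypothetical sample: 'footnote' stands in for an element declared as an inclusion.
sample = """<doc>
  <para>Text <footnote>a note</footnote> more text.</para>
  <title>Heading <footnote>another note</footnote></title>
  <para>Again <footnote>third</footnote></para>
</doc>"""

counts = survey_inclusion_parents(sample, "footnote")
print(dict(counts))
print(out_of_scope(counts, {"para"}))
```

Run over the whole data set, the counts show which contexts are actually attested, and the second function lists the ones you would either have to support or declare out of scope.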

In other words, you need to regard an SGML inclusion as what it actually is: a modeling escape hatch. If you can't close the hatch, you have to either find ways to handle the things that go through it or decide they aren't worth the effort, because they're either unimportant or unlikely. Often, the best available alternative is some judicious combination of these.
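To make the "handle it or set it aside" idea concrete, here is a minimal sketch of a context-dependent mapping (option 2), again in Python and again with invented names: the `CONTEXT_MAP` table, the target names `note` and `title-note`, and the function `map_inclusions` are all assumptions, not anything from a real schema. Contexts missing from the table are left untouched and reported, so they can be reviewed or declared out of scope.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: (parent name, inclusion name) -> target element name.
CONTEXT_MAP = {
    ("para", "footnote"): "note",
    ("title", "footnote"): "title-note",
}

def map_inclusions(xml_text, inclusion_name):
    """Rename each inclusion according to its parent; report unmapped contexts."""
    root = ET.fromstring(xml_text)
    # Collect the hits first so that renaming does not affect later lookups.
    hits = [(parent.tag, child)
            for parent in root.iter()
            for child in parent
            if child.tag == inclusion_name]
    skipped = []
    for parent_name, child in hits:
        target = CONTEXT_MAP.get((parent_name, inclusion_name))
        if target is None:
            skipped.append(parent_name)  # out of scope: left as-is
        else:
            child.tag = target
    return ET.tostring(root, encoding="unicode"), skipped

source = ("<doc><para>a<footnote>n</footnote></para>"
          "<item><footnote>x</footnote></item></doc>")
result, skipped = map_inclusions(source, "footnote")
print(result)
print(skipped)
```

In XSLT the same idea would normally be a set of templates matching the inclusion under each parent; the table above is only a compact way to show the shape of the mapping.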

Cheers,
Wendell



======================================================================
Wendell Piez                            mailto:wapiez@xxxxxxxxxxxxxxxx
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9635
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================
