Re: Flat file conversion to XML

Subject: Re: Flat file conversion to XML
From: Mike Brown <mike@xxxxxxxx>
Date: Fri, 31 Mar 2000 03:13:02 -0700 (MST)
> > Hello, just wondering if it would be possible to convert any random data
> > file into XML using just XSLT?
> No.
> Any well-formed XML document can be represented as a tree of nodes. XSLT is
> based upon this model

Hmm... I bet there's a way he could do it, though. The file could be
treated as an external parsed entity, if care were taken to ensure the data
file was parseable (markup characters escaped, and the entire file
interpretable as the UTF-8 encoding of characters allowed in XML
documents). Then he could use an entity reference in an actual XML
document that will be the basis of the source tree.
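
As a rough sketch of that "care", here is how the check and the escaping
might look in Python. The function names are my own invention, and it
assumes the whole file fits in memory; treat it as an illustration, not a
tested tool.

```python
import re
from xml.sax.saxutils import escape

# Characters permitted by the "Char" production of XML 1.0.
_XML_CHARS = re.compile(
    "[\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]*"
)

def is_xml_text(raw: bytes) -> bool:
    """True if raw decodes as UTF-8 and holds only XML 1.0 characters."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return _XML_CHARS.fullmatch(text) is not None

def to_entity_text(raw: bytes) -> str:
    """Escape &, < and > so the data parses as plain character data."""
    if not is_xml_text(raw):
        raise ValueError("data cannot be used as a parsed entity as-is")
    return escape(raw.decode("utf-8"))
```

If is_xml_text() returns False, escaping alone will not help, and the data
needs some other encoding before a parser will accept it.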

Of course, the contents of the file would be just one big text node, but
that might be sufficient for his needs. However, there are encoding
issues. By default, an external parsed entity with no text declaration
indicating otherwise is going to be interpreted as UTF-8 (hmm, I know this
is full of caveats). So if the file is going to contain byte sequences
that, when interpreted as UTF-8, do not map back to characters in the
allowable range defined in the XML spec, he would have to preprocess the
data to make sure it is truly parseable. He could do this by Base64
encoding the data, and have the XSLT act on that version of the data.
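
The preprocessing step could be something like the following Python
sketch. The helper names are made up for illustration, and the output
file name is just the one used in the example document in this message.

```python
import base64

def base64_for_entity(raw: bytes) -> str:
    """Encode arbitrary bytes as Base64 text, wrapped at 76 columns."""
    # The result uses only A-Z, a-z, 0-9, '+', '/', '=' and newlines, so it
    # contains no markup characters and is valid UTF-8 by construction.
    return base64.encodebytes(raw).decode("ascii")

def write_entity_file(src_path: str, dst_path: str = "myBase64File.txt") -> None:
    """Read any data file and write its Base64 form for use as an entity."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "w", encoding="ascii", newline="\n") as dst:
        dst.write(base64_for_entity(raw))
```

Note that XSLT 1.0 has no built-in way to decode Base64, so the stylesheet
would still only see the encoded characters.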

Impractical, but not impossible. :)

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE myData [
	<!ENTITY Base64Data SYSTEM "myBase64File.txt">
	<!NOTATION Base64 SYSTEM "">
	<!ELEMENT myData (#PCDATA)>
	<!ATTLIST myData encoding NOTATION ( Base64 ) #IMPLIED>
]>
<myData encoding="Base64">&Base64Data;</myData>
<!-- Good luck working with your Base64 data in a single text node! -->

> - an XSLT transform is a set of rules where each rule
> consists of a pattern to be matched against elements in the source tree.
> Whenever there is a match, a template is executed to create part of the
> result tree.

And it is important to note that processing begins at the root node, and
its order is thereafter determined by instructions in the templates
that are instantiated. (A common misconception that people should be
steered away from is the idea that all nodes are recursively processed in
one pass. Recursive processing of element and text nodes occurs because of
built-in, default templates mandated by the XSLT 1.0 spec.)
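
To make the point concrete, here is a toy model in Python. It is not an
XSLT processor: "templates" are plain functions keyed by element name (my
own simplification, where real XSLT uses match patterns), and the
fallback branches imitate the built-in rules for elements and text.

```python
import xml.etree.ElementTree as ET

def apply_templates(elem, templates, out):
    """Rough analogue of xsl:apply-templates: visit children in document order."""
    if elem.text:
        out.append(elem.text)              # built-in rule for text: copy it through
    for child in elem:
        process(child, templates, out)
        if child.tail:
            out.append(child.tail)         # text that follows the child element

def process(elem, templates, out):
    """Run the matching template, or fall back to the built-in element rule."""
    rule = templates.get(elem.tag)
    if rule is not None:
        rule(elem, templates, out)         # the template alone decides what comes next
    else:
        apply_templates(elem, templates, out)   # built-in rule: recurse into children

def transform(xml_text, templates):
    """Start at the top of the tree and collect whatever the rules emit."""
    out = []
    process(ET.fromstring(xml_text), templates, out)
    return "".join(out)

SOURCE = "<a><b>one</b><c>two<d>three</d></c></a>"

def stop_at_c(elem, templates, out):
    out.append("[c]")                      # emits a marker and never recurses

print(transform(SOURCE, {}))
print(transform(SOURCE, {"c": stop_at_c}))
```

With no templates at all, the fallback rules walk the whole tree and the
first call prints "onetwothree". Once a template matches c and does not
ask for its children to be processed, the d element is never visited and
the second call prints "one[c]".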

   - Mike
Mike J. Brown, software engineer, Webb Interactive Services