Re: include text file

Subject: Re: include text file
From: Mike Brown <mike@xxxxxxxx>
Date: Thu, 16 Nov 2000 17:33:25 -0700 (MST)

David Carlisle wrote:
> I think you have to separate two cases. Omitting end tags (and in some
> cases begin tags) isn't hideousness, it is a standard SGML feature

I know that much. What makes my example hideous is the overlapping tags, I
think. You may have already inferred an end tag as per the SGML DTD, and
then all of a sudden you've found another one hanging out there. Or are
you also allowed to infer start tags when you encounter end ones, as you
would have to in this situation to produce the intended result...

  <p><i>italic</p>also italic</i>

...because you will have already inferred </i> before the </p>?
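
To make the overlap concrete: a plain tokenizer sees nothing wrong with this
markup, and the question of inferred tags only arises once something tries to
build a tree from the events. The following is a minimal sketch using the
html.parser module from Python's standard library. It is an illustration only,
not how sx or any DTD-driven SGML parser works, and the names EventLogger and
tag_events are hypothetical.

```python
from html.parser import HTMLParser


class EventLogger(HTMLParser):
    """Records start-tag, end-tag, and text events in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))


def tag_events(markup):
    """Return the raw event stream for a piece of markup (hypothetical helper)."""
    logger = EventLogger()
    logger.feed(markup)
    logger.close()
    return logger.events


if __name__ == "__main__":
    for event in tag_events("<p><i>italic</p>also italic</i>"):
        print(event)
```

The events come out in source order, with the </p> arriving while <i> is still
open and the </i> arriving after the paragraph has already ended. Nothing is
inferred at this level; any inference is a decision made by the tree builder.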

> This is how sx (for example) in James Clark's sp suite can parse
> HTML (or any SGML) files and output the parse tree in XML syntax.

Right, it was toying with sx that led me to this train of thought.

> You want (I think) to do the same without the overhead of
> writing to a file and reading back. So you just want a SAX enabled SGML
> parser.


> You could perhaps have a sax interface to such a permissive parser,
> but unlike the case above, here you'd have to accept
> that the parse might fail in more interesting ways

I don't see how that changes anything. The browsers have proven that you
can parse a document into a DOM pretty much no matter what it looks like.
Yes, there are interesting judgement calls that are made during the parse,
but there's no reason it should fail rather than just producing an
interpretation that is perhaps not quite what the document author intended.

I am not suggesting that the extension function would need to divine the
intent of the author, only that it would need to always produce a node-set
from the parse results. This isn't outside the realm of possibility, is it?
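
As a rough sketch of "always produce a tree", here is one possible set of
recovery rules written against Python's standard html.parser and
xml.etree.ElementTree modules. The assumptions are mine: an end tag closes the
nearest open element of that name (implicitly closing anything still open
inside it), and a stray end tag is ignored. This is not what the browsers or sx
actually do, it knows nothing about DTDs or empty elements such as <br>, and
LenientTreeBuilder, parse_to_tree, and the "root" wrapper element are
hypothetical names.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class LenientTreeBuilder(HTMLParser):
    """Builds an element tree from tag soup without rejecting bad nesting."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")  # hypothetical wrapper element
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None; store them as empty strings.
        attributes = {name: value or "" for name, value in attrs}
        self.stack.append(ET.SubElement(self.stack[-1], tag, attributes))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name, along with
        # anything still open inside it. Index 0 (the wrapper) never closes.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                return
        # No matching open element: a stray end tag, silently ignored.

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element is stored as that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def parse_to_tree(markup):
    """Parse any markup into a tree under a wrapper element."""
    builder = LenientTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


if __name__ == "__main__":
    tree = parse_to_tree("<p><i>italic</p>also italic</i>")
    print(ET.tostring(tree, encoding="unicode"))
```

For the overlapping example this prints
<root><p><i>italic</i></p>also italic</root>, so "also italic" ends up outside
the <i>. That is a defensible interpretation but not the intended one, and a
different set of rules (re-opening <i> after the </p>, say) would give a
different tree. Either way there is something to hand back as a node-set.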

   - Mike
Mike J. Brown, software engineer at         My XML/XSL resources: in Denver, Colorado, USA 
