Re: XSLT and Text Processing Languages

Subject: Re: XSLT and Text Processing Languages
From: Dan Vint <dvint@xxxxxxxx>
Date: Thu, 7 Sep 2000 07:18:26 -0700 (PDT)
> > OmniMark advantages :
> > - Less resource hogging
> I beg to differ.  Given a 150 Mb XML input file, my memory usage with
> Omnimark 5.1 grew to 400 Mb, and I was working with a strictly local
> program.

This sounds like you were using some features of OmniMark that don't exist in
XSLT, such as shelves (arrays) and probably referents for forward processing of
the document without processing it twice or holding the entire source document
in memory.

> Additionally Omnimark has serious trouble with underscores in
> element-names.  Apparently XML-input data has to conform to a DTD, which
> does not always apply for our data.

Yes, the current version of OmniMark does require a DTD for XML work. That is
changing this fall, I believe, when v6 is supposed to be released. To fix the
underscore issue, all you have to do is provide an SGML Declaration that allows
'_' to be used in names; by default SGML doesn't allow this. A two-minute fix
at most.

> > - For any other format -> XML, Omnimark is mandatory
> Not entirely so.  XML is very easy to generate with Perl, and it is
> equally easy to parse a lot of non-complex text formats in Perl (if you
> can put it in a regular expression).  Parsing XML with Perl is rather
> slow, though.

Yeah, and how many times do you want to go back in and try to remember what you
did in that regular expression? What is nice about OmniMark (this is somewhat
forward-looking, based on v6) is that SGML/XML awareness was designed into the
language, just as it was with XSLT. That is why XSLT is such a nice and easy
tool for cranking out HTML from an XML document, very much like OmniMark. With
Perl or any of the other languages, I have to get the language tools, pick one
of many different parsers and XML/XSLT/XPath modules (of varying quality),
integrate that with my language support, and then write programs in a language
that has a different paradigm and understanding of the SGML/XML world.
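
As a rough illustration of the regular-expression approach to text -> XML
discussed above, here is a minimal sketch in Python (not Perl, and not taken
from the original thread). The "key: value" input format, the LINE_RE pattern
and the text_to_xml helper are all hypothetical, made up for this example.
Writing the pattern in verbose mode with comments is one way to ease the
"what did I do in that regular expression?" problem.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input format: one "key: value" field per line.
LINE_RE = re.compile(
    r"""
    ^\s*(?P<key>[A-Za-z_][\w.-]*)   # field name, restricted to XML-safe names
    \s*:\s*                         # separator
    (?P<value>.*?)\s*$              # field value, trailing whitespace dropped
    """,
    re.VERBOSE,
)


def text_to_xml(text, root_name="record"):
    """Turn 'key: value' lines into child elements of a single root element."""
    root = ET.Element(root_name)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # silently skip lines that do not fit the assumed format
        child = ET.SubElement(root, match.group("key"))
        child.text = match.group("value")
    # ElementTree escapes characters such as '&' when serializing.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(text_to_xml("title: A & B\nauthor_name: Dan"))
```

This only covers the easy direction (flat text in, XML out). It says nothing
about parsing XML or SGML, which is where a markup-aware language earns its
keep.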

Yes, you could do any of this work in any language and make it work, sometimes,
it seems, by deprecating features and trying to "simplify" XML even more. Alas,
that is a different rant I might get into ;-) But as people have seen, working
with XSLT is very easy and generally efficient for working directly with XML
data. Yes, you may want or need network access and ODBC connections for other
uses and more complete solutions, but when you have a basic XML -> ? problem,
XSLT is by far easier than any of the other XML solutions in Perl and Java, and
OmniMark is, at least in the SGML world, a complete solution for DTD-based
processing, soon to have support for DTD-less work.

Find the best tool for the job; it's that hammer-and-nail issue all over again.
I use XSLT heavily, OmniMark for many related issues and similar work, and when
needed I move into Perl and Java and all those other things. Each one has
tradeoffs and limitations that, to use it effectively, you need to understand
and work with, work around, or find other solutions for, depending on how
serious the problem is.


> --
>   Thorbjørn Ravn Andersen             "...and...Tubular Bells!"
>  XSL-List info and archive:
