Re: DSSSL side effect-freeness

Subject: Re: DSSSL side effect-freeness
From: "W. Eliot Kimber" <eliot@xxxxxxxxxx>
Date: Wed, 28 Jan 1998 13:53:28 -0600
At 06:39 PM 1/28/98 +0100, Pierre Mai wrote:

>    FAC> distributed, though -- and I can't imagine distributed groves
>    FAC> being a typical case anyway.
>Well, if we look at the corporate world, where large technical
>document repositories are being maintained and deployed as IETMs for
>example, I would rather think that distributed groves might be heavily

In fact, there's no other way to manage very large documents, at least not
that I can see.  If you look at tools like Chrystal's Astoria or Texcel's
Information Manager, they are essentially grove managers.  As soon as these
tools let you connect two physically separate repositories to form a
single, logical grove, you've got distributed groves.  The GroveMinder
system being developed by TechnoTeacher will provide a very complete, very
robust grove management layer that will include support for distributed
groves (in the sense that a single grove or hypergrove could be constructed
from data stored in a variety of repositories).  

[Note that neither Astoria nor Texcel (nor any "grove manager" product of
which I'm aware) is a true grove manager for two reasons: 1. they got
entities wrong, which is fatal, 2. they don't necessarily reflect the SGML
property set. Both of these failings should be easy to fix, should either
company decide to do so. These critiques are not news to them, trust me.]

The issue of distributed groves is ultimately one of processing
optimization: if you're going to process huge documents in a reasonable
amount of time, it's probably unreasonable to construct the hypergrove anew
every time you want to process something. Thus, you reconstruct each bit of
the grove whenever its source data changes so that the total hypergrove is
immediately available when you need it.  Of course, this is at the cost of
large amounts of disk storage, but DASD is cheap, right? (Note that I'm
assuming that large volumes of data will not be stored as single SGML
documents, since doing so would be both foolish and impossible. The processing of multiple
documents (and non-SGML component objects) always results in a hypergrove.)
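
To make the rebuild-on-change idea concrete, here is a minimal sketch in
Python. It is not how any of the products named above work; the class name,
the build_grove callback, and the use of a content hash to detect changes
are all assumptions made for illustration. A "grove" here is simply
whatever the callback returns for one source object.

```python
import hashlib
from typing import Callable, Dict, Tuple


class HypergroveCache:
    """Hypothetical cache: one stored grove per source object."""

    def __init__(self, build_grove: Callable[[bytes], object]):
        # build_grove stands in for a real parser/grove builder (assumption).
        self._build_grove = build_grove
        # source id -> (content digest, grove built from that content)
        self._groves: Dict[str, Tuple[str, object]] = {}
        self.rebuild_count = 0

    def update(self, source_id: str, data: bytes) -> bool:
        """Rebuild the grove for source_id only if its data has changed.

        Returns True when a rebuild happened, False when the stored
        grove was still current.
        """
        digest = hashlib.sha256(data).hexdigest()
        cached = self._groves.get(source_id)
        if cached is not None and cached[0] == digest:
            return False
        self._groves[source_id] = (digest, self._build_grove(data))
        self.rebuild_count += 1
        return True

    def hypergrove(self) -> Dict[str, object]:
        """Return the stored groves without re-parsing any source."""
        return {sid: grove for sid, (_, grove) in self._groves.items()}
```

The trade-off is the one described above: every grove stays on disk (here,
in memory) so that the whole hypergrove is available at once, and only the
sources that actually changed pay the construction cost again.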

Also, if you assume that the size of documents reflects the tasks they
support and not available computing resources, then document size should
not increase dramatically over time but computing power will increase
exponentially (given that Moore's law holds).  That means that at some
point, computers will be so fast that the thought of regenerating the
hypergrove for a multi-gigabyte body of data will not frighten the IS
folks. In other words, if it takes X pages to describe a ship today, it
will probably take X pages in 10 years, but computers will be many times
faster.

Note also that it's only a problem at very large scales, like preparing the
documents for ships or planes, which combine high data volumes with rapid
refresh cycles. At smaller scales, systems are more than fast enough to
brute force their way through most problems.


<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202.  214.953.0004
