Re: Architectural Forms, separation of formatting and loose-leaf management

Subject: Re: Architectural Forms, separation of formatting and loose-leaf management
From: Peter Newcomb <peter@xxxxxxxxxx>
Date: Wed, 6 May 1998 15:37:26 -0500
[Paul Prescod <papresco@xxxxxxxxxxxxxxxx> on Wed, 06 May 1998 09:22:07 -0400]
>> Jordi Mulet wrote:
>> How difficult would it be to build a similar system?
> It would be a lot of C++ code for parsing PDFs, building a grove and so
> forth. I think that without giving away any secrets I can mention that
> TechnoTeacher is building infrastructure to make these kinds of projects
> easier.
Yup... In fact, you've basically described GroveMinder.  You'd still
need a PDF parser, but the building and maintenance of a PDF grove
(given an appropriate property set) is made dramatically easier by the
GroveMinder libraries and toolset.  PDF groves (and the individual
nodes that make them up) thus constructed would also be able to
participate in relationships with nodes built by other
GroveMinder-based grove implementations, where GroveMinder takes on
the responsibility for maintaining them.

However, it is not always necessary to take as comprehensive an
approach to the problem as GroveMinder does.  Much of the benefit of
groves is simply that the abstract interface to them is standardized.
This means that both grove-generic and property-set-specific
applications can be written such that they are insulated from the
actual storage implementation of the data they access.
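
To make the insulation point concrete, here is a minimal sketch in
Python.  It is not GroveMinder's API, nor that of any other real
system: the names Node, DictNode, PairNode, walk and class_names are
all hypothetical, and the sketch assumes that every node-valued
property is a subnode (real groves also have reference properties,
which a traversal would have to skip).

```python
from abc import ABC, abstractmethod
from typing import Any, Iterator, List, Tuple


class Node(ABC):
    """Hypothetical abstract grove node: a class name plus named properties."""

    @abstractmethod
    def class_name(self) -> str:
        """Name of the node's class in its property set."""

    @abstractmethod
    def property_names(self) -> List[str]:
        """Names of the properties this node exhibits."""

    @abstractmethod
    def get(self, name: str) -> Any:
        """Value of one property: a primitive, a Node, or a list of Nodes."""


class DictNode(Node):
    """Storage choice 1: properties kept in a dict."""

    def __init__(self, cls: str, props: dict):
        self._cls, self._props = cls, dict(props)

    def class_name(self) -> str:
        return self._cls

    def property_names(self) -> List[str]:
        return list(self._props)

    def get(self, name: str) -> Any:
        return self._props[name]


class PairNode(Node):
    """Storage choice 2: properties kept as a list of (name, value) pairs."""

    def __init__(self, cls: str, pairs: List[Tuple[str, Any]]):
        self._cls, self._pairs = cls, list(pairs)

    def class_name(self) -> str:
        return self._cls

    def property_names(self) -> List[str]:
        return [name for name, _ in self._pairs]

    def get(self, name: str) -> Any:
        for key, value in self._pairs:
            if key == name:
                return value
        raise KeyError(name)


def walk(node: Node) -> Iterator[Node]:
    """Grove-generic traversal: uses only the abstract Node interface."""
    yield node
    for name in node.property_names():
        value = node.get(name)
        for child in (value if isinstance(value, list) else [value]):
            if isinstance(child, Node):
                yield from walk(child)


def class_names(root: Node) -> List[str]:
    """An 'application' that never learns how the nodes are stored."""
    return [n.class_name() for n in walk(root)]
```

The same class_names call works on a tree of DictNode objects and on
a tree of PairNode objects, which is the sense in which the
application is insulated from the storage implementation.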

For example, Eliot Kimber is developing a grove-based system in Visual
Basic that supports groves for multiple datatypes with multiple
property sets, and even keeps track of relationships between nodes of
different groves.  It is therefore essentially functionally equivalent
to GroveMinder, lacking only GroveMinder's greater scalability and
portability.  All of its interfaces are based on the grove object
model and on specific property sets; the former is a standard and the
latter are defined in terms of a standard.  Applications built on it
are therefore protected from drastic changes in its implementation
and, to a slightly lesser degree, from dependence on the system
itself.  A programmer porting an application from Eliot's system to
GroveMinder, or vice versa, thus faces at most a purely syntactic
translation, where the differences lie only in how each system binds
the grove model to the specific language.  In the case of the SDQL
query language, this binding is already standardized.

The level of effort required for a project like Eliot's, while not
trivial, is certainly not insurmountable.  (Eliot is developing his in
his spare time, of which I know he has little; even so, he has
already made his first limited release.)  Most of the benefits of a
grove-based system can be had with even simpler implementations.

>> It will be necessary to define property sets and grove plans for the
>> Layout scheme and Page scheme, won't it? Is there any working experience
>> on this topic? (PDF, Quark, FrameMaker, ...)
> If anyone has defined grove plans for a layout-based language, it would be
> Peter Newcomb at TechnoTeacher, but he may not have got around to doing so
> yet. (so many grove plans, so little time)

I'm afraid I've not yet done any _real_ work on these sorts of
property sets, though I have studied the problem to some extent.
Basically, it comes down to deciding what level of detail you want to
include, and that comes down to deciding upon what applications you
intend to use the groves for.  If you don't want to make such
presuppositions, you can build a "complete" property set, as was done
with SGML and HyTime, where _everything_ is included in the property
set, and then grove plans are used to describe subsets needed for
specific applications.
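
The relationship between a "complete" property set and a grove plan
can be sketched as follows.  This is a deliberate simplification and
not the HyTime mechanism itself: the sketch selects individual
classes and properties by name only, and the property set shown
(COMPLETE), the plan (TEXT_EXTRACTION_PLAN) and the function
apply_plan are all invented for illustration.

```python
from typing import Dict, Set

# Class name -> names of the properties defined for that class.
PropertySet = Dict[str, Set[str]]

# Hypothetical, tiny "complete" property set for a page-based format.
COMPLETE: PropertySet = {
    "document": {"pages", "title", "producer"},
    "page": {"number", "width", "height", "blocks"},
    "textblock": {"text", "font", "size", "x", "y"},
}

# Hypothetical plan for an application that only extracts text.
TEXT_EXTRACTION_PLAN: PropertySet = {
    "document": {"pages"},
    "page": {"number", "blocks"},
    "textblock": {"text"},
}


def apply_plan(full: PropertySet, plan: PropertySet) -> PropertySet:
    """Return the subset of `full` named by `plan`.

    A plan may only select classes and properties that the complete
    property set defines; anything else is reported as an error.
    """
    selected: PropertySet = {}
    for cls, props in plan.items():
        if cls not in full:
            raise KeyError(f"class not in property set: {cls}")
        unknown = props - full[cls]
        if unknown:
            raise KeyError(f"properties not defined for {cls}: {sorted(unknown)}")
        selected[cls] = set(props)
    return selected
```

The complete property set is written once; each application then
carries only a small plan describing the part of it that it needs.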

It seems likely that a complete property set for something like PDF
will be extremely complex and difficult to maintain, especially if
you're not Adobe.  However, a generic page- and/or layout- based
property set that could work for a range of page description languages
might be both easier to develop and ultimately more useful.

Eliot has created property sets for several multimedia (graphics,
video, etc.) formats and has implemented grove constructors for them
based on available ActiveX controls.  Though far simpler, these are
similar to the PDF problem in that the source data contains much more
information than is represented in the grove.
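
As a rough illustration of a grove constructor that deliberately
represents less than the source contains, consider the sketch below.
It is not Eliot's code and does not use any ActiveX control: the
property names in IMAGE_PROPERTIES, the function
construct_image_node and the shape of the source record are all
assumptions made for the example.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical property set for a raster image: only the properties
# that the intended applications are assumed to need.
IMAGE_PROPERTIES: FrozenSet[str] = frozenset({"width", "height", "format"})


def construct_image_node(
    source: Dict[str, Any],
    properties: FrozenSet[str] = IMAGE_PROPERTIES,
) -> Dict[str, Any]:
    """Build a node from decoded source data, keeping only `properties`.

    Anything else in `source` (pixel data, comments, and so on) is
    simply not represented in the resulting node.
    """
    missing = properties - set(source)
    if missing:
        raise ValueError(f"source lacks required properties: {sorted(missing)}")
    node: Dict[str, Any] = {"class": "image"}
    for name in sorted(properties):
        node[name] = source[name]
    return node
```

The source record can be arbitrarily rich; the node exposes only
what the property set names, which keeps both the property set and
the constructor small.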


Peter Newcomb                          at TTI: +1 972 231 4098
TechnoTeacher, Inc.                 at ISOGEN: +1 214 953 0004 x141
email: peter@xxxxxxxxxx
