RE: sgml-parse and GC

Subject: RE: sgml-parse and GC
From: Peter Nilsson <pnidv96@xxxxxxxxxxxxxx>
Date: Tue, 20 Jul 1999 14:17:50 +0200 (CEST)
On Mon, 19 Jul 1999, Didier PH Martin wrote:

> Hi Peter,
> 
> Peter said:
> The reason it's cached is that if you call sgml-parse many times for one
> entity, it would be very inefficient to parse the same document over and
> over again. It doesn't seem easy to me to know if the sgml-parse would get
> called further on.
> 
> I think the best solution would be to replace the current in memory grove
> implementation with an implementation on disk with mmap'd files. Then all
> groves would be cached and memory management would be passed to the
> OS kernel. This, I think, was proposed several times before.
> 
> 
> Didier says:
> Matthias did some documentation about the memory allocation mechanism and
> more particularly the garbage collector. It seems that we can improve the
> memory management from what we learned and what Matthias documented. Of
> course the sgml-parse has to occur only once or have the node-list tagged
> with an identifier. The last solution is quite hard to implement with this
> type of language. Then we can check how sgml-parse behaves by tracing it and

sgml-parse works by asking the GroveManager to load the entity with the
system id sysid, then returning a NodePtrNodeListObj pointing to the root
of the new grove. This NodePtrNodeListObj is GC'd like any ELObj, so I
think there is no problem with garbage collection.

The reason the grove stays in memory is that the GroveManager (actually
DssslApp) caches the groves resulting from parsing documents. If the
groves should be removed, then you ought to be quite sure they won't be
used again during the same run of jade. How would one know this in advance?
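
To make the point concrete, here is a rough C++ sketch of that caching
behaviour. It is not the actual Jade code: Grove and GroveCache are names
made up for illustration, and std::shared_ptr only stands in for whatever
ownership scheme the real GroveManager uses. It shows why dropping the
node-list handle does not free the grove: the cache still holds it.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a parsed grove; not a Jade class.
struct Grove {
    std::string sysid;
};

// Hypothetical cache keyed by system id (an assumption for illustration,
// loosely modelled on the caching described above).
class GroveCache {
public:
    // Return the grove for sysid, "parsing" only on the first request.
    std::shared_ptr<Grove> load(const std::string& sysid) {
        auto it = groves_.find(sysid);
        if (it != groves_.end())
            return it->second;          // cache hit: no second parse
        ++parseCount_;                  // a real parse would happen here
        auto grove = std::make_shared<Grove>(Grove{sysid});
        groves_.emplace(sysid, grove);  // the cache keeps the grove alive
        return grove;
    }

    std::size_t parseCount() const { return parseCount_; }
    std::size_t size() const { return groves_.size(); }

private:
    std::map<std::string, std::shared_ptr<Grove>> groves_;
    std::size_t parseCount_ = 0;
};
```

Calling load() twice with the same sysid parses once and returns the same
grove. Releasing both returned handles leaves the cache entry in place, so
the memory is only reclaimed if something explicitly evicts the entry.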

BTW, the nodes in the grove have to stay accessible until the FOT is
built. This I think is true for all nodes resulting in something in the
FOT. See FOTBuilder::startNode()/endNode().

So my conclusion is that you'll need a lot of virtual memory (or other
storage for the groves) to process large documents. I don't see how to
make this different. (Of course you may have the groves in a database.)

> documenting it. But actually, I am still working on the StyleEngine class
> and the dssslEventXXXX class and documenting them both internally with a
> header (finally) and an external document giving more details. About the
> permanent grove, I did some experiments and the trick is to overload the
> grove class with a CORBA or DCOM interface so that the grove could be seen
> also as a component and implemented either in memory, on memory-mapped files
> or in an object database. This way, we can bypass the XML/SGML parsing to
> only do the formatting. But before going further, I need to better
> understand the OpenJade memory management.
> 
I believe the grove interface is independent of the OJ memory management.

Regards
/Peter Nilsson
--
'(?P . (?e . (?t . (?e . (?r)))))


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

