sgml-parse and GC

Subject: sgml-parse and GC
From: Peter Nilsson <pnidv96@xxxxxxxxxxxxxx>
Date: Mon, 19 Jul 1999 23:23:12 +0200 (CEST)
On Mon, 19 Jul 1999, Didier PH Martin wrote:

> Brandon said:
>    I find your batch-processing idea interesting, but I think we'd be
> better off equipping OpenJade with a proper batch mode, where it can
> handle resources (like the gobs of memory those groves must take)
> better.  Doing it the way you describe, while it may seem elegant,
> could lead to major resource consumption problems, as Jade may be
> unable to effectively judge how long to keep each of those groves
> around, possibly resulting in huge amounts of memory usage.  Somebody
> more familiar with the details of this sort of thing can feel free to
> correct my statements here, if I'm mistaken. :)
> 
> Didier says:
> This is what we are studying also. We are documenting and understanding
> different parts of our heritage: Jade. OpenJade can become whatever we decide
> it should become, and grove management is part of what we can modify. For
> instance, for some constructs, garbage collection could be improved. Let's say,
> for example, that you create a node list with
> sgml-parse("<url>http://www.metfolder.com/mydoc.sgml") and process this node
> list with (process-node-list). Once the whole list has been processed, the
> garbage collector should release the memory taken by the node list. This is
> an area where we have to improve OpenJade. And don't worry, I am already
> familiar with the details :-).
> 
The grove is cached because, if you call sgml-parse many times for one
entity, it would be very inefficient to parse the same document over and
over again. I don't see an easy way to know whether sgml-parse will be
called again later for the same entity.

I think the best solution would be to replace the current in-memory grove
implementation with an on-disk implementation backed by mmap'd files. Then
all groves would be cached, and memory management would be left to the
OS kernel. This, I think, has been proposed several times before.
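
As a rough illustration of the mmap idea (again in Python, and only a
sketch under my own assumptions), the following stores nodes as fixed-size
records in a file and reads them back through a memory mapping. The record
layout (parent index, first-child index, 8-byte name) is invented for the
example and has nothing to do with Jade's real grove structures.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size node record: parent index, first-child index,
# and an 8-byte, NUL-padded element name (little-endian, 16 bytes total).
NODE = struct.Struct("<ii8s")


def write_grove(path, nodes):
    """Write (parent, first_child, name) tuples as packed records."""
    with open(path, "wb") as f:
        for parent, first_child, name in nodes:
            f.write(NODE.pack(parent, first_child, name.encode("ascii")))


def read_node(mapped, index):
    """Decode the record at position index directly from the mapping."""
    parent, first_child, raw = NODE.unpack_from(mapped, index * NODE.size)
    return parent, first_child, raw.rstrip(b"\0").decode("ascii")


path = os.path.join(tempfile.mkdtemp(), "grove.bin")
write_grove(path, [(-1, 1, "doc"), (0, -1, "para")])

with open(path, "rb") as f:
    # Map the whole file read-only; the kernel pages records in on demand.
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
        root = read_node(mapped, 0)
        child = read_node(mapped, 1)
```

Only the pages that are actually touched need to be resident, and the
kernel can evict them under memory pressure without the application having
to decide when a grove is no longer needed.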

Regards
/Peter Nilsson

--
'(?P . (?e . (?t . (?e . (?r)))))


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

