Subject: RE: sgml-parse and GC
From: "Didier PH Martin" <martind@xxxxxxxxxxxxx>
Date: Tue, 20 Jul 1999 19:48:45 -0400

Hi Peter,

Peter said:
sgml-parse works by asking the GroveManager to load the entity with the system id sysid, then returning a NodePtrNodeListObj pointing to the root of the new grove. This NodePtrNodeListObj is GC'd like any ELObj, so there is no problem with garbage collection, I think. The reason the grove stays in memory is that the GroveManager (actually DssslApp) caches the groves resulting from parsing documents. If the groves should be removed, then you ought to be quite sure they won't be used again during the same run of jade. How would one know this in advance?

Didier says:
I saw what you noticed in the code, but I didn't know why the grove is not garbage collected. So your observation (as a good OpenJade archaeologist) led you to conclude that DssslApp prevents garbage collection and forces the grove to stay in memory, and thus implements a grove cache. Thanks for the info; I'll now investigate DssslApp.

About grove caching, I am not so sure that keeping a grove is a good thing. For example, the main grove (i.e. the grove created for the processed document) could be released as soon as the (process-children) procedure is finished on the root. The same goes for a grove returned from sgml-parse, where the processing is finished when (process-node-list) on the grove's root is finished. In both cases the grove could be released, because the FOT is complete once processing of the root element (and therefore of all its children) is complete. The default for the processor could then be to have no cache, with each grove garbage collected when its processing is finished. If necessary, a switch could indicate that caching is required, and then all groves would be kept in memory. We would then be able to process collections of large documents.

Peter said:
BTW, the nodes in the grove have to stay accessible until the FOT is built. This I think is true for all nodes resulting in something in the FOT. See FOTBuilder::startNode()/endNode().
So my conclusion is that you'll need a lot of virtual memory (or other storage for the groves) to process large documents. I don't see how to make this different. (Of course you may have the groves in a database.)

Didier says:
The grove has to be present as long as processing is not completed for the root node, and therefore until all its children are processed. Thus, at least two groves could be present at a time: a) the source document grove, b) the grove resulting from sgml-parse. Of course, some scripts may lead to a situation where more than two groves are simultaneously present, which would require a lot of virtual memory (and cause swapping).

Speaking of swapping, it depends a lot on how we create objects in the heap. If objects are created as near to each other as possible, swapping is kept to a minimum; if objects are spread out too much, swapping increases. Thus, a good object allocation strategy could improve object access so that paging of virtual memory is minimized. Some products, like SmartHeap, do that.

regards
Didier PH Martin
mailto:martind@xxxxxxxxxxxxx
http://www.netfolder.com

DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist
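
The caching policy proposed in this message (no grove cache by default, with a switch to keep groves for the whole run) could be sketched roughly as follows. This is only an illustration under stated assumptions, not OpenJade code: `Grove`, `GroveStore`, `load` and `processAndCountLive` are hypothetical names, and the real GroveManager/DssslApp interfaces are not reproduced here. The sketch uses standard C++ reference counting in place of the ELObj garbage collector.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed grove; NOT the OpenJade class.
struct Grove {
    explicit Grove(std::string id) : sysid(std::move(id)) { ++liveCount; }
    ~Grove() { --liveCount; }
    Grove(const Grove&) = delete;
    Grove& operator=(const Grove&) = delete;

    std::string sysid;
    static int liveCount;  // number of groves currently held in memory
};
int Grove::liveCount = 0;

// Hypothetical grove store (an assumption, not the GroveManager API).
// Caching off (the proposed default): only weak references are kept, so a
// grove is freed as soon as its last user releases it.
// Caching on (the proposed switch): strong references keep every grove alive
// for the lifetime of the store.
class GroveStore {
public:
    explicit GroveStore(bool cacheGroves) : cacheGroves_(cacheGroves) {}

    std::shared_ptr<Grove> load(const std::string& sysid) {
        assert(!sysid.empty());
        // Reuse a grove that is still alive, e.g. one still being processed.
        auto it = weak_.find(sysid);
        if (it != weak_.end()) {
            if (auto alive = it->second.lock()) return alive;
        }
        auto grove = std::make_shared<Grove>(sysid);  // stands in for parsing
        weak_[sysid] = grove;
        if (cacheGroves_) strong_[sysid] = grove;
        return grove;
    }

private:
    bool cacheGroves_;
    std::map<std::string, std::weak_ptr<Grove>> weak_;
    std::map<std::string, std::shared_ptr<Grove>> strong_;
};

// Simulates processing the root of a loaded grove, then reports how many
// groves are still in memory once the local reference has been dropped.
int processAndCountLive(GroveStore& store, const std::string& sysid) {
    {
        std::shared_ptr<Grove> root = store.load(sysid);
        // ... processing of the root and all its children would happen here ...
    }
    return Grove::liveCount;
}
```

With `GroveStore store(false)`, calling `processAndCountLive(store, "a.sgml")` leaves no grove in memory afterwards; with `GroveStore store(true)`, each processed document stays resident until the store itself is destroyed, which is the behaviour the switch would select.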