Subject: RE: About Constructions rules From: Avi Kivity <Avi@xxxxxxxxxxxxx> Date: Thu, 15 Jul 1999 23:25:58 +0300 |
On Thursday, July 15, 1999 22:34, Brandon Ibach [SMTP:bibach@xxxxxxxxxxxxxx] wrote: > Valid point, but unless I'm being naive here (which is a reasonable > possibility :), I don't think it's insurmountable. The idea of being > able to start processing before the entire document is parsed is very > nice, but does Jade really do this? I believe 1.2.1 does, at least on Windows (using something based on 1.1.1 myself). There is a parser thread and a processing thread. Grep for Thread and you'll see them too. > It would have to happen one of two ways. Jade would have to be > multi-threaded, with separate threads for the parsing (and grove > building) and processing. I'm fairly certain this is not the case. > The alternative would be that Jade parses the document on an as-needed > basis. It's possible that this could be happening, but is it? > Matthias, Didier, Avi, etc? Can you answer this? > Whichever (if either) the case is, I think there's a way to fit > query construction rules in without needing to have the entire > document parsed before you evaluate it. Just as the parsing is done > piece by piece, could not the query evaluation be done as well? In > other words, as much as possible, build up the result node list as you > build up the source grove? Keep in mind, the node list as a whole is > irrelevant to the query rule. All we care at any given point is > whether a certain node is in the list, and if we're to the point of > processing that node, chances are (though not certain) that enough of > the document has been parsed to have been able to determine whether > that node is in the result list of the query. There are usually more than one node eligible for processing (e.g. during evaluation of (process-children)). So if one of them waits due to a query requiring complete traversal of the grove, so what? There's plenty of work to feed a theoretical multithreaded dsssl processor. Node lists are lazy, so the query can return *immediately* after its called. It must only be (partially) computed when the first node is processed. This is not related directly to queries -- normal processing can also do this: (element x (let ( (r (with-mode rain (process-children))) (s (with-mode sunshine (process-children))) ) (if (raining?) r s) ) ) On first sight, it looks as if both r and s are evaluated fully, but a clever processor need not do that. It can (and perhaps does?) just keep a pointer to the process-children expression with its context. Only when the result of processing node x is incorporated into the flow object tree does one of {r, s} have to eb evaluated. And if it is discarded, so be it. When (process-children) is evaluated, the same applies: the processor can evaluate whatever nodes are available immediately, and wait for the parser to generate other nodes. What we have here is some tree-structured variant of the unix pipeline: the processing threads wait for the parser, and the formatting thread (which builds the FOT) waits for the processing threads. The processing threads are not started until their output is actually needed. /--processor 1--\ parse------+--processor 2--+--formatter--> \--processor 3--/ (This is all theoretical, of course, but I don't think it would be *too* difficult to implement.) > The disclaimer here is that this is highly dependent upon the > nature of the query and how it is written. Having just done a crash > course in relational query optimization (a la Oracle), I know that > there are many, many ways to break a query such that it can't return > results as it finds them. For instance, if the final results of the > query are to be sorted, then you won't get any results back from it > until the query is complete. The engine needs to see *all* of the > results in order to return any, because of the sort. > Again, this is all quite theoretical. Does Jade actually begin > processing anything before the source document has been completely > parsed and "groved"? If not, then it doesn't really matter. > It does, but I can't see any big win in it. Having two threads performing asymetric tasks yields a 100% improvement in speed if parsing and processing are evenly matched, but much less if they are not. And it doesn't scale with more than two processors. Now, parsing is inherently serial. Processing can conceptually be parallelized due to the side-effect-freeness, but of course depends on the parser. High-quality formatting (read: page-sequence) is probably serial in nature too, mostly. Multithreading can benefit if (a) processing is so complex it outweighs parsing time or (b) you have the grove in a random access database. In any case, waiting for the parser to return the last node can affect wall-clock time, but not the computational complexity which is more important (right?). --- "The only words which have meaning are the last ones spoken" DSSSList info and archive: http://www.mulberrytech.com/dsssl/dssslist
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
RE: About Constructions rules, Didier PH Martin | Thread | RE: About Constructions rules, Avi Kivity |
Re: About Constructions rules, Matthias Clasen | Date | Re: About Constructions rules, Chris Maden |
Month |