Subject: Re: About Constructions rules
From: Brandon Ibach <bibach@xxxxxxxxxxxxxx>
Date: Thu, 15 Jul 1999 14:34:25 -0500
Quoting Chris Maden <crism@xxxxxxxxxxx>:
> But here's why, as I understand it, James didn't implement the query
> construction rule: You can run the query against the entire document,
> but that means you can't start processing the document until you've
> parsed the entire thing.  Jade's current architecture allows parsing
> to begin almost immediately, in most cases.  But the alternative is to
> evaluate every query construction rule's pattern against every node
> (meaning every single data character, etc.), which is prohibitively
> slow.  As I found out a few months ago, Jade is optimized to treat
> sequences of characters as monolithic, and when it has to break that
> optimization, by treating each character as a single node, performance
> goes to hell.  I'm not sure that there's a way around that.
> 
   Valid point, but unless I'm being naive here (which is a reasonable
possibility :), I don't think it's insurmountable.  The idea of being
able to start processing before the entire document is parsed is very
nice, but does Jade really do this?
   It would have to happen one of two ways.  Jade would have to be
multi-threaded, with separate threads for the parsing (and grove
building) and processing.  I'm fairly certain this is not the case.
The alternative would be that Jade parses the document on an as-needed
basis.  It's possible that this could be happening, but is it?
Matthias, Didier, Avi, etc?  Can you answer this?
   Whichever (if either) the case is, I think there's a way to fit
query construction rules in without needing to have the entire
document parsed before you evaluate it.  Just as the parsing is done
piece by piece, could not the query evaluation be done as well?  In
other words, as much as possible, build up the result node list as you
build up the source grove?  Keep in mind, the node list as a whole is
irrelevant to the query rule.  All we care about at any given point is
whether a certain node is in the list, and if we've reached the point of
processing that node, chances are (though not certain) that enough of
the document has been parsed to have been able to determine whether
that node is in the result list of the query.
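
   To make the idea concrete, here is a minimal sketch in Python (not
DSSSL, and not how Jade works internally).  Everything in it is a
hypothetical stand-in: the Node class, the ("start"/"end", gi) event
stream and the title_in_chapter query are invented for illustration.
It assumes a query that depends only on a node and its ancestors, so
membership can be decided the moment the node is added to the tree.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a grove element node."""
    gi: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)


def build_and_match(
    events: Iterable[Tuple[str, str]],
    query: Callable[[Node], bool],
) -> Iterator[Tuple[Node, bool]]:
    """Grow the tree from parse events and report, as soon as each node
    is opened, whether it belongs to the query's result list."""
    current: Optional[Node] = None
    for kind, gi in events:
        if kind == "start":
            node = Node(gi, parent=current)
            if current is not None:
                current.children.append(node)
            current = node
            # Decided here, before the rest of the document is parsed.
            yield node, query(node)
        elif current is not None:  # "end" event: climb back to the parent
            current = current.parent


def title_in_chapter(node: Node) -> bool:
    """Example query: select every title whose parent is a chapter."""
    return (
        node.gi == "title"
        and node.parent is not None
        and node.parent.gi == "chapter"
    )


events = [
    ("start", "book"),
    ("start", "title"), ("end", "title"),
    ("start", "chapter"),
    ("start", "title"), ("end", "title"),
    ("end", "chapter"),
    ("end", "book"),
]

matches = [
    (node.gi, node.parent.gi)
    for node, hit in build_and_match(events, title_in_chapter)
    if hit
]
```

   A query that looks at following siblings or descendants could not
be answered this early, which is where the "chances are (though not
certain)" caveat comes in.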
   The disclaimer here is that this is highly dependent upon the
nature of the query and how it is written.  Having just done a crash
course in relational query optimization (a la Oracle), I know that
there are many, many ways to break a query such that it can't return
results as it finds them.  For instance, if the final results of the
query are to be sorted, then you won't get any results back from it
until the query is complete.  The engine needs to see *all* of the
results in order to return any, because of the sort.
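
   The difference can be sketched with Python generators.  This is
only an analogy, not Oracle's or Jade's machinery, and the helper
rows_read_before_first_result is a hypothetical name.  It counts how
many input rows an operator has to read before it can hand back its
first result.

```python
from typing import Callable, Iterable, Iterator, List, Tuple


def rows_read_before_first_result(
    operator: Callable[[Iterator[int]], Iterator[int]],
    data: Iterable[int],
) -> Tuple[int, int]:
    """Return (first result, number of input rows consumed to get it)."""
    read: List[int] = []

    def source() -> Iterator[int]:
        for row in data:
            read.append(row)  # record every row the operator pulls
            yield row

    first = next(operator(source()))
    return first, len(read)


def streaming_filter(rows: Iterator[int]) -> Iterator[int]:
    """Pass rows through as they arrive; nothing is buffered."""
    return (row for row in rows if row > 0)


def blocking_sort(rows: Iterator[int]) -> Iterator[int]:
    """Sorting must see every row before it can emit the first one."""
    return iter(sorted(rows))


data = [3, 1, 2]
filter_result = rows_read_before_first_result(streaming_filter, data)
sort_result = rows_read_before_first_result(blocking_sort, data)
```

   Here filter_result is (3, 1), one row read, while sort_result is
(1, 3), the whole input read.  A query rule whose pattern behaves like
the filter can be evaluated as the grove grows; one that behaves like
the sort cannot.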
   Again, this is all quite theoretical.  Does Jade actually begin
processing anything before the source document has been completely
parsed and "groved"?  If not, then it doesn't really matter.

-Brandon :)


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

