Subject: Re: Global Variables and JADE vs OMNIMARK
From: Brandon Ibach <bibach@xxxxxxxxxxxxxx>
Date: Tue, 8 Dec 1998 19:35:30 -0600 (CST)

Mike Sosteric said:
>
> >Hi, Mike...
> >   First of all, let me just lay a little groundwork, to make sure
> >we're on the same page.
>
> <snip> thanks for the info!  I don't quite understand it all, but
> whenever I go back to it, I understand a bit more.
>
My pleasure...  I find much the same (as far as going back to it).

> I'll walk through this, correct me i'm wrong
>
Okay, but first, I may as well explain a few of the DSSSL/Scheme
constructs I've used.

The first things I'll explain are the various "let" constructs.  A let
construct looks like this:

    (let [name] (defs) body)

The optional "name" is used to create a "named let".  More on that in a
minute.  The "defs" are variable definitions (or "bindings").  Each one
takes the form (varname expr), where the result of the expression
"expr" gets assigned to "varname".  The "body" is one big expression,
the result of which becomes the return value of the whole let
construct.

In a normal let, the "scope" or duration of a binding is the duration
of the body.  In a "let*", the duration of a binding is all of the
following bindings, as well as the body.  This lets you reference
earlier bindings in later ones.  There is also a "letrec", or recursive
let, where the duration of a binding is all of the bindings (including
earlier ones) as well as the body.  I haven't quite figured out how
best to use this yet, so I generally don't. :)

A "named let" is the same as a regular let, only the expression which
makes up the "body" is given a name so that you can treat the body like
a function and call it.  This allows for recursive calls, and is used
for basic looping, as well.  We use both techniques here.

The last thing I want to cover is the short-circuit (or) and (and)
constructs.  These are basic boolean logic functions.  Each one can
take any number of arguments.  The (or) will return true if any of the
arguments evaluate to true, and the (and) will return true only if all
of the arguments evaluate to true.
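
To make the let variants described above concrete, here is a minimal sketch in plain Scheme. None of these names appear in the stylesheet under discussion; "plain", "sequential", "recursive", "summed" and "count-down" are made up for illustration, and the code assumes only standard Scheme forms, with no DSSSL-specific procedures.

```scheme
;; Plain let: the bindings cannot see each other; only the body sees them.
(define plain
  (let ((x 2)
        (y 3))
    (* x y)))                       ; 6

;; let*: each binding can use the bindings that come before it.
(define sequential
  (let* ((x 2)
         (y (* x 10)))              ; refers to x from the previous binding
    (+ x y)))                       ; 22

;; letrec: a binding can refer to itself, so a local procedure can recurse.
(define recursive
  (letrec ((count-down
            (lambda (n)
              (if (= n 0)
                  'done
                  (count-down (- n 1))))))
    (count-down 3)))                ; done

;; Named let: "loop" names the body so the body can call itself with
;; new values for n and total.  The values given in the binding list
;; (4 and 0) are used only for the first trip through.
(define summed
  (let loop ((n 4) (total 0))
    (if (= n 0)
        total
        (loop (- n 1) (+ total n)))))   ; 10
```

Replacing the let* in "sequential" with a plain let is an error unless some outer x happens to be defined, because y's expression would be evaluated outside the scope of the new x.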
It is important to note that an argument is considered true if it
returns any value other than #f, which is Scheme's representation of
boolean false.  So, if you return the string "hello", it is considered
to be true.

The interesting (and very useful) thing here is that the (or) and (and)
constructs implement what is called short-circuit evaluation.  That
means that they will only evaluate as many of their arguments as they
need in order to determine the outcome.  In the case of an (or), it
only needs to go as far as the first "true" value, because it only
needs one true value to return true.  The cool thing is, as soon as it
finds this "true" value, it returns the actual value, so if that value
was the string "hello", it would return that string.  Likewise with
(and), it will only evaluate arguments until it finds a false value.
If every argument is true, (and) returns the value of the last one.

So, on to the code...

> >(define (get-publisher-name node)
> >  (let* ((gr (node-property 'grove-root node))
>
> This first one assigns the root of the grove to the variable gr.  Why
> the named let.  I tried this without the * and it didn't work.
>
Actually, this isn't a named let; it's a let*.  And the * is important
because we reference the value of "gr" in the next binding.

> >         (de (node-property 'document-element gr)))
>
> This assigns the top level document element to de.  THis would be <HTML>
> if the document was HTML
>
That's right.  Notice the reference to "gr".

> >    (let loop ((nl (node-property 'content de)))
>
> this assigns everything below <HTML> as a nodelist to nl.
>
Close.  The assignment to "nl" is a node list containing the immediate
children of <HTML>.  This would be the <HEAD> and <BODY> elements.  So,
this node list is only of length 2.

Here is another important aspect of a named let.  When we call this
"function", to which we've assigned the name "loop", we call it with
one argument.  This is established by there being exactly one binding
(to "nl") in the let construct.
The node list we're assigning to "nl" here is just the value of "nl"
for the first trip through the "function".  In further calls, an
explicit value will be specified for "nl".

> >      (if (node-list-empty? nl) #f
>
> this tests to make sure we haven't descended all the way down and
> found nothing.
>
Correct.  More specifically, it returns #f if we reach the end of the
list.

> >          (let ((nd (node-list-first nl)))
>
> this assigns the left hand, top most node list to nd and then tests it
>
Actually, this assigns the first node in the node list "nl" to "nd".
Remember, though, that we never deal with individual nodes, so we're
actually assigning a singleton node list to "nd".  Same effect, though.

> >            (or (and (equal? 'element (node-property 'class-name nd))
> I HAVEN"T a clue what (or (and (equal means except maybe its
> a joke played on the unwitting masses by the developers of LISP :-)
>
Heh. :)  This can look confusing.  I actually should have kept this a
bit simpler and just used some (if) constructs, but a little
explanation of the short-circuit logic here should make sense of it.

Let's work from the inside out.  We'll start with the (and) construct.
The first argument to the (and) tests to see if the node we're working
with is an "element" node.  It does this by comparing the "class-name"
property of the node to the symbol "element".  If we're not dealing
with an element, the (and) will return immediately with a false value.
Otherwise...

> >                     (if (and (string=? "NAME" (gi nd))
> >                              (string=? "PUBLISHER"
> >                                        (gi (node-property 'parent nd))))
> if the current gi is NAME and the parent of the current gi is PUBLISHER
> >                         (data nd)
> assign the data in nd to data and return a string
> >                         (loop (node-property 'content nd))))
>
The first part is right: we're checking the tag names of the current
node and its parent.  If they're the ones we're looking for, then we
return the result of the (data) function, which extracts the textual
content of the current node.
Otherwise, we call our "loop" "function" again, this time with a list
of the immediate children of the current element.  This call will
either return a string, indicating that we found the element we were
looking for somewhere in the descendants of the current node, or it
will return a false value, indicating that it got to the end of the
list of children (having possibly recursed into other elements along
the way).

So, if the current node is the one we're looking for, or if the node
we're looking for is found in the descendants of this node, the (if)
construct will return a string representing the contents of the correct
node.  This would be considered a true value, causing our (and)
construct to return that value.  This would cause the (or) construct
which encloses our (and) to return immediately, returning the string
we're looking for.  If we don't find the node we're looking for, we'll
return a false value, which will cause the (or) to move on to its next
argument...

> >                (loop (node-list-rest nl))))))))

Whereas our other call to our "loop" function recursed down a level,
this call just completes our basic loop by calling our function with
the same node list, minus the first node (which we already processed).

This uses a feature of Scheme called "tail recursion".  If you look at
the flow and logic of our function, you'll see that if we get to this
point, all we'll really do is return the result of this call.  So,
instead of recursing into the function, we can effectively just jump
back up to the top of the function with our new value for "nl".  That
way, if we have a list of 100 nodes, we won't recurse down 100 levels,
we'll just loop through the code 100 times.  This is why you don't find
things like while loops or for loops in Scheme.  Loops are harder to
write this way (until you know what you're doing and it becomes second
nature), but it keeps the language simpler.

Well, I hope I've made everything clear here.  Let me know if you still
have some questions.
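
For anyone who wants to experiment with this control flow outside of Jade, here is a rough sketch of the same pattern in plain Scheme. It is not the stylesheet code: it assumes a made-up tree representation in which an element is a list whose first item is its name and whose remaining items are its children (strings stand in for text), and "find-publisher-name", "sample" and the extra "parent" argument are all invented for the illustration. The first three definitions show the value-returning behaviour of (or) and (and) on their own.

```scheme
;; (or) stops at the first value that is not #f and returns that value.
;; The (car '()) would be an error, but it is never evaluated.
(define first-true (or #f "hello" (car '())))     ; "hello"

;; (and) stops at the first #f; if there is none, it returns the last value.
(define all-true   (and 1 2 "last"))              ; "last"
(define some-false (and 1 #f (car '())))          ; #f

;; Hypothetical stand-in for a grove: (name child ...).
(define sample
  '("HTML"
    ("HEAD" ("PUBLISHER" ("NAME" "Acme Press")))
    ("BODY" ("NAME" "not this one"))))

;; Same shape as the procedure discussed above: a named let walks a
;; list of children, recursing down into each element and looping
;; across siblings.
(define (find-publisher-name tree)
  (let loop ((nl (cdr tree))            ; children of the root
             (parent (car tree)))       ; name of the element owning nl
    (if (null? nl)
        #f                              ; end of this list, nothing found
        (let ((nd (car nl)))
          (or (and (pair? nd)           ; skip text, look only at elements
                   (if (and (string=? "NAME" (car nd))
                            (string=? "PUBLISHER" parent))
                       (cadr nd)        ; assumes the first child is the text
                       (loop (cdr nd) (car nd))))   ; recurse down a level
              (loop (cdr nl) parent))))))           ; tail call: next sibling
```

With these definitions, (find-publisher-name sample) evaluates to "Acme Press", and a tree with no NAME directly inside a PUBLISHER gives #f. Only the last call to "loop" is a tail call; the one inside the (and) has to come back so that the (or) can inspect its result.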

As for the replacement of "NAME" and "PUBLISHER", you answered that
question in your next post.  However, you do ask if it's possible to
check multiple levels of nesting.  With some work, it would certainly
be possible to create a version of this procedure which would check an
arbitrary number of levels, possibly even such that the levels could be
separated (i.e., with other elements occurring in between).  However,
once you start getting that complicated, it may be time to look into
the higher level functions that I mentioned.

Chapter 10 of the DSSSL specification (available online at
ftp://ftp.ornl.gov/pub/sgml/WG8/DSSSL/, I believe) covers the query
language, including a subset known as Core SDQL.  Jade implements Core
SDQL, as well as some of the functions in Chapter 10 outside of the
Core stuff.  The Jade page (http://www.jclark.com/jade/) lists which of
these functions it supports.

Good luck, and happy DSSSLing! :)

-Brandon :)

 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist