Re: Global Variables and JADE vs OMNIMARK

Subject: Re: Global Variables and JADE vs OMNIMARK
From: Brandon Ibach <bibach@xxxxxxxxxxxxxx>
Date: Tue, 8 Dec 1998 19:35:30 -0600 (CST)
Mike Sosteric said:
> 
> 
> > 
> >Hi, Mike...
> >   First of all, let me just lay a little groundwork, to make sure
> >we're on the same page.  
> 
> <snip> thanks for the info! I don't quite understand it all, but
> whenever I go back to it, I understand a bit more. 
> 
   My pleasure... I find much the same (as far as going back to it).

> I'll walk through this; correct me if I'm wrong.
> 
   Okay, but first, I may as well explain a few of the DSSSL/Scheme
constructs I've used.  The first things I'll explain are the various
"let" constructs.  A let construct looks like this:
	(let [name] (defs) body)
   The optional "name" is used to create a "named let".  More on that
in a minute.  The "defs" are variable definitions (or "bindings").
Each one takes the form of: (varname expr), where the result of the
expression "expr" gets assigned to "varname".  The "body" is one big
expression, the result of which becomes the return value of the whole
let construct.
   In a normal let, the "scope" or duration of the binding is the
duration of the body.  In a "let*", the duration of a binding is all
of the following bindings, as well as the body.  This lets you
reference earlier bindings in later ones.  There is also a "letrec",
or recursive let, where the duration of a binding is all of the
bindings (including earlier ones) as well as the body.  I haven't
quite figured out how best to use this yet, so I generally don't. :)
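   A minimal sketch of the three forms, in plain Scheme rather than
DSSSL.  All of the names here are made up for illustration; none of
them come from the stylesheet.

```scheme
;; Illustrative names only.
(define x 1)

;; let: the bindings can't see each other, so the "x" in
;; (y (+ x 1)) is the outer x (1), not the new x (10).
(define let-result
  (let ((x 10)
        (y (+ x 1)))
    (list x y)))            ; (10 2)

;; let*: each binding sees the ones before it, so y is 11.
(define let*-result
  (let* ((x 10)
         (y (+ x 1)))
    (list x y)))            ; (10 11)

;; letrec: a binding can refer to itself (and to the others),
;; which is what a local recursive function needs.
(define letrec-result
  (letrec ((count-down
            (lambda (n)
              (if (= n 0)
                  '()
                  (cons n (count-down (- n 1)))))))
    (count-down 3)))        ; (3 2 1)
```

   If there were no outer "x" at all, the plain let version would
fail with an unbound variable, because "x" is not yet visible while
the other bindings are being evaluated.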
   A "named let" is the same as a regular let, only the expression
which makes up the "body" is given a name so that you can treat the
body like a function and call it.  This allows for recursive calls,
and is used for basic looping, as well.  We use both techniques here.
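   As a small illustration in plain Scheme (the procedure name
"sum-list" is hypothetical), here is a named let used as a simple
loop.  The bindings give the values for the first trip through; each
call to "loop" supplies the values for the next trip.

```scheme
;; Sum the numbers in a list with a named let called "loop".
;; "rest" starts as the whole list and "total" starts at 0.
(define (sum-list lst)
  (let loop ((rest lst)
             (total 0))
    (if (null? rest)
        total                                     ; nothing left: return the sum
        (loop (cdr rest) (+ total (car rest)))))) ; go around again

(define sum-example (sum-list '(1 2 3 4)))        ; 10
```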
   The last thing I want to cover is short-circuit (or) and (and)
constructs.  These are basic boolean logic functions.  Each one can
take any number of arguments.  The (or) will return true if any of the
arguments evaluate to true, and the (and) will return true only if all
of the arguments evaluate to true.  It is important to note that an
argument is considered true if it returns any value other than #f,
which is Scheme's representation of boolean false.  So, if you return
the string "hello", it is considered to be true.
   The interesting (and very useful) thing here is that the (or) and
(and) constructs implement what is called short-circuit evaluation.
That means that they will only evaluate as many of their arguments as
they need in order to determine the outcome.  In the case of an (or), it
only needs to go as far as the first "true" value, because it only
needs one true value to return true.  The cool thing is, as soon as it
finds this "true" value, it returns the actual value, so if that value
was the string "hello", it would return that string.  Likewise with
(and), it will only evaluate arguments until it finds a false value.
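   A quick sketch of this behavior in plain Scheme.  The variable
names are just for illustration.  The (car '()) calls would signal an
error if they were ever evaluated, so the fact that these definitions
work shows that evaluation stopped early.

```scheme
;; (or) stops at the first true value and returns that value itself.
(define or-found (or #f "hello" (car '())))   ; "hello"

;; (or) with no true argument returns #f.
(define or-none (or #f #f))                   ; #f

;; (and) stops at the first false value.
(define and-stopped (and 1 #f (car '())))     ; #f

;; If every argument is true, (and) returns the last value.
(define and-all (and 1 2 "last"))             ; "last"
```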
   So, on to the code...

> >(define (get-publisher-name node)
> >  (let* ((gr (node-property 'grove-root node))
> 
> This first one assigns the root of the grove to the variable gr. Why
> the named let. I tried this without the * and it didn't work.
> 
   Actually, this isn't a named let.  And the * is important because
we reference the value of "gr" in the next binding.

> >         (de (node-property 'document-element gr)))
> 
> This assigns the top level document element to de. This would be <HTML>
> if the document was HTML
> 
   That's right.  Notice the reference to "gr".

> >    (let loop ((nl (node-property 'content de)))
> 
> this assigns everything below <HTML> as a nodelist to nl.
> 
   Close.  The assignment to "nl" is a node list containing the
immediate children of <HTML>.  This would be the <HEAD> and <BODY>
elements.  So, this node list is only of length 2.  Here is another
important aspect of a named let.  When we call this "function", to
which we've assigned the name "loop", we call it with one argument.
This is established by there being exactly one binding (to "nl") in
the let construct.  The node list we're assigning to "nl" here is just
the value for nl for the first trip through the "function".  In
further calls, an explicit value will be specified for "nl".

> >      (if (node-list-empty? nl) #f
> 
>  this tests to make sure we haven't descended all the way down and
> found nothing. 
> 
   Correct.  More specifically, it returns #f if we reach the end of
the list.

> >          (let ((nd (node-list-first nl)))
> 
>  this assigns the left hand, top most node list to nd and then tests it
> 
   Actually, this assigns the first node in the node list "nl" to "nd".
Though, remember that we never deal with individual nodes, so we're
actually assigning a singleton node list to "nd".  Same effect, tho.

> >            (or (and (equal? 'element (node-property 'class-name nd))
>          I HAVEN'T a clue what (or (and (equal means except maybe it's
> a joke played on the unwitting masses by the developers of LISP :-)
>
   Heh. :)  This can look confusing.  I actually should have kept this
a bit simpler and just used some (if) constructs, but a little
explanation of the short-circuit logic here should make sense of it.
Let's work from the inside out.  We'll start with the (and) construct.
The first argument to the (and) tests to see if the node we're working
with is an "element" node.  It does this by comparing the "class-name"
property of the node to the symbol "element".  If we're not dealing
with an element, the (and) will return immediately with a false value.
Otherwise...

> >                     (if (and (string=? "NAME" (gi nd))
> >                              (string=? "PUBLISHER"
> >                                        (gi (node-property 'parent nd))))
>  if the current gi is NAME and the parent of the current gi is PUBLISHER
> >                         (data nd)
>  assign the data in nd to data and return a string
> >                         (loop (node-property 'content nd))))
> 
   The first part is right: we're checking the tag names of the
current node and its parent.  If they're the ones we're looking for,
then we return the result of the (data) function, which extracts the
textual content of the current node.
   Otherwise, we call our "loop" "function" again, this time with a
list of the immediate children of the current element.  This call will
either return with a string, indicating that we found the element we
were looking for somewhere in the descendants of the current node, or
it will return a false value, indicating that it got to the end of the
list of children (having possibly recursed into other elements along
the way).
   So, if the current node is the one we're looking for, or if the
node we're looking for is found in the descendants of this node, the
(if) construct will return a "string" representing the contents of the
correct node.  This would be considered a true value, causing our
(and) construct to return a true value.  This would cause the (or)
construct which encloses our (and) to return immediately, returning
the string we're looking for.  If we don't find the node we're looking
for, we'll return a false value, which will cause the (or) to move on
to its next value...

> >                (loop (node-list-rest nl))))))))
   Whereas our other call to our "loop" function recursed down a
level, this call just completes our basic loop by calling our function
with the same node list, minus the first node (which we already
processed).
   This uses a feature of Scheme called "tail recursion".  If you look
at the flow and logic of our function, you'll see that if we get to
this point, all we'll really do is return the result of this call.
So, instead of recursing into the function, we can effectively just
jump back up to the top of the function with our new value for "nl".
That way, if we have a list of 100 nodes, we won't recurse down 100
levels, we'll just loop through the code 100 times.  This is why you
don't find things like while loops or for loops in Scheme.  Loops are
harder to write this way (until you know what you're doing and it
becomes second nature), but it keeps the language simpler.
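
   To try the same search pattern outside of Jade, here is a rough
plain-Scheme analogue of the walkthrough above.  It is only a sketch
under some assumptions: there is no grove here, so an "element" is
modeled as a list of the form (gi child ...), text is a plain string,
and the parent's GI is passed along as a second loop argument instead
of being looked up with node-property.  The names
"find-name-in-publisher", "element-text" and "sample" are
hypothetical.

```scheme
;; Concatenate all of the text inside an element (stands in for "data").
(define (element-text el)
  (let loop ((kids (cdr el)) (acc ""))
    (cond ((null? kids) acc)
          ((string? (car kids))
           (loop (cdr kids) (string-append acc (car kids))))
          (else
           (loop (cdr kids)
                 (string-append acc (element-text (car kids))))))))

(define (find-name-in-publisher doc)
  (let loop ((nl (cdr doc))              ; children of the document element
             (parent-gi (car doc)))
    (if (null? nl)
        #f                               ; end of this list of children
        (let ((nd (car nl)))
          (or (and (pair? nd)            ; an element, not text
                   (if (and (equal? "NAME" (car nd))
                            (equal? "PUBLISHER" parent-gi))
                       (element-text nd)
                       ;; Not a tail call: its result still has to be
                       ;; tested by the enclosing (and) and (or).
                       (loop (cdr nd) (car nd))))
              ;; Tail call: whatever this returns is the final result,
              ;; so it can run as a jump back to the top of the loop.
              (loop (cdr nl) parent-gi))))))

;; Made-up sample document.
(define sample
  '("ARTICLE"
    ("FRONT"
     ("AUTHOR" ("NAME" "Someone Else"))
     ("PUBLISHER" ("NAME" "Example Press")))
    ("BODY" ("P" "text"))))

(define found (find-name-in-publisher sample))   ; "Example Press"
```

   Note that the NAME inside AUTHOR is skipped, because its parent
is not PUBLISHER.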

   Well, I hope I've made everything clear here.  Let me know if you
still have some questions.  As for the replacement of "NAME" and
"PUBLISHER", you answered that question in your next post.  However,
you do ask if it's possible to check multiple levels of nesting.  With
some work, it would certainly be possible to create a version of this
procedure which would check an arbitrary number of levels, possibly
even such that the levels could be separated (i.e., other elements
occurring in between).  However, once you start getting that
complicated, it may be time to look into the higher level functions
that I mentioned.
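   As a rough idea of what the multi-level test itself could look
like, here is a sketch in plain Scheme.  It assumes you have already
collected the GI of the current node followed by the GIs of its
ancestors, innermost first, into an ordinary list (in a stylesheet
that list would come from following the parent property, which is not
shown here).  The procedure names are hypothetical.

```scheme
;; "chain" is the node's GI followed by its ancestors' GIs, innermost
;; first, e.g. ("NAME" "PUBLISHER" "FRONT" "ARTICLE").
;; "wanted" is the pattern in the same order, e.g. ("NAME" "PUBLISHER").

;; Strict version: every level must match, with nothing in between.
(define (chain-matches? wanted chain)
  (cond ((null? wanted) #t)
        ((null? chain) #f)
        ((equal? (car wanted) (car chain))
         (chain-matches? (cdr wanted) (cdr chain)))
        (else #f)))

;; Loose version: the first level must match the node itself, but
;; other elements may sit between the remaining levels.
(define (chain-matches-loosely? wanted chain)
  (and (pair? wanted)
       (pair? chain)
       (equal? (car wanted) (car chain))
       (let loop ((w (cdr wanted)) (c (cdr chain)))
         (cond ((null? w) #t)                    ; every level was found
               ((null? c) #f)                    ; ran out of ancestors
               ((equal? (car w) (car c)) (loop (cdr w) (cdr c)))
               (else (loop w (cdr c)))))))       ; skip this ancestor
```

   For example, the pattern ("NAME" "FRONT") fails the strict test
against ("NAME" "PUBLISHER" "FRONT") but passes the loose one, since
PUBLISHER is allowed to sit in between.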
   Chapter 10 of the DSSSL specification (available online at
ftp://ftp.ornl.gov/pub/sgml/WG8/DSSSL/, I believe) covers the query
language, including a subset known as Core SDQL.  Jade implements Core
SDQL, as well as some of the functions in Chapter 10 outside of the
Core stuff.  The Jade page (http://www.jclark.com/jade/) lists which
of these functions it supports.
   Good luck, and happy DSSSLing! :)

-Brandon :)


 DSSSList info and archive:  http://www.mulberrytech.com/dsssl/dssslist

