Re: [xsl] Recursive function referencing an XML file, best programming technique

Subject: Re: [xsl] Recursive function referencing an XML file, best programming technique
From: Dimitre Novatchev <dnovatchev@xxxxxxxxx>
Date: Tue, 8 Jun 2010 12:39:36 -0700
> Theoretically, it's better to read the library.xml file once, into a
> (global, probably) variable, e.g.:

Actually, this is not true.

The XSLT specification requires that, regardless of the number of
different expressions containing

   document(someUrl)

every one of them returns the same document node, so the XML document at
someUrl is read only once during the life of the transformation.

Therefore, *performance-wise* it does not matter.
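
One way to check this on a particular processor is the small stylesheet
below. It is only a sketch: it assumes a file named library.xml sits next
to the stylesheet, and it relies on generate-id() returning the same string
only when both arguments are the same node.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Two separate document() calls with the same URI (library.xml is
         an assumed file name). Equal ids mean both calls returned the
         same document node. -->
    <xsl:value-of
        select="generate-id(document('library.xml'))
                = generate-id(document('library.xml'))"/>
  </xsl:template>

</xsl:stylesheet>
```

A conforming processor should report true here, which is why repeating the
call costs no extra file I/O. Putting the document in a variable is then a
matter of readability, not speed.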


--
Cheers,
Dimitre Novatchev
---------------------------------------
Truly great madness cannot be achieved without significant intelligence.
---------------------------------------
To invent, you need a good imagination and a pile of junk
-------------------------------------
Never fight an inanimate object
-------------------------------------
You've achieved success in your field when you don't know whether what
you're doing is work or play


On Tue, Jun 8, 2010 at 12:28 PM, Richard Fozzard
<Richard.Fozzard@xxxxxxxx> wrote:
> Theoretically, it's better to read the library.xml file once, into a
> (global, probably) variable, e.g.:
>
>   <xsl:variable name="nameLibrary" select="doc('library.xml')"/>
>
> This variable now holds the document node of the entire XML document, and can be
> referenced with XPath just as with your main input document. (By the way,
> your stylesheet will be more portable if you just use a shorter, relative
> document URL, as above, than a full "file://..." path.)
>
> Then inside your loop or template, use the variable to avoid doing file i/o
> each time:
>
>   <xsl:template match="...">
>       <xsl:if test="$nameLibrary/names/name...">
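>
> Put together, the idea might look like the XSLT 2.0 sketch below. The
> library layout (names/name), the my: namespace, the my:fix-word function
> and the "title" match pattern are all assumptions made for illustration,
> and a for expression stands in for explicit recursion.
>
> ```xslt
> <xsl:stylesheet version="2.0"
>     xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>     xmlns:xs="http://www.w3.org/2001/XMLSchema"
>     xmlns:my="urn:example:my"
>     exclude-result-prefixes="xs my">
>
>   <!-- Assumed library layout: <names><name>XSLT</name>...</names> -->
>   <xsl:variable name="nameLibrary" select="doc('library.xml')"/>
>
>   <!-- Hypothetical helper: return the library spelling of a word if one
>        exists (case-insensitive match), otherwise capitalize the first
>        letter and lowercase the rest. -->
>   <xsl:function name="my:fix-word" as="xs:string">
>     <xsl:param name="word" as="xs:string"/>
>     <xsl:variable name="known"
>         select="$nameLibrary/names/name
>                   [lower-case(.) = lower-case($word)][1]"/>
>     <xsl:sequence select="
>         if ($known) then string($known)
>         else concat(upper-case(substring($word, 1, 1)),
>                     lower-case(substring($word, 2)))"/>
>   </xsl:function>
>
>   <!-- "title" is a placeholder for whichever element holds the string. -->
>   <xsl:template match="title">
>     <xsl:value-of
>         select="for $w in tokenize(normalize-space(.), ' ')
>                 return my:fix-word($w)"
>         separator=" "/>
>   </xsl:template>
>
> </xsl:stylesheet>
> ```
>
> Because the function reads the global variable directly, the library does
> not have to be passed in as a parameter on every call.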
>
> However in practice, unless your library is huge and/or you are doing lots
> of recursion and looping, you may not notice much of a performance difference
> between defining the variable, and making the doc() call over and over.
>
> My intuition says you'll not see any significant performance problem on 400
> files -- but why not write it the more efficient way? It's more elegant and
> maintainable (what if you later decide to change the library file location?
> or put it on a web server far away? or discover you need to parse 400,000
> files, not 400?)
>
> Good luck!
> --Rich
>
>
>
> Richard Fozzard, Computer Scientist
> Geospatial Metadata at NGDC: http://www.ngdc.noaa.gov/metadata
>
> Cooperative Institute for Research in Environmental Sciences (CIRES)
> Univ. Colorado & NOAA National Geophysical Data Center, Enterprise Data
> Systems 325 S. Broadway, Skaggs 1B-305, Boulder, CO 80305
> Office: 303-497-6487, Cell: 303-579-5615, Email: richard.fozzard@xxxxxxxx
>
>
>
> Mario Madunic said the following on 06/08/2010 12:57 PM:
>>
>> This is more of a programming technique question than a how to.
>>
>> I'll be creating a recursive function in XSLT that will manipulate a
>> tokenized string (on spaces), capitalize the initial letter in each word and
>> lowercase the rest of the word. An issue arises for product names and
>> acronyms, as the way they are spelt is quite specific. So I'll create a
>> library XML file that will contain the correct spelling for these strings
>> and do a check on each string.
>>
>> So my question is,
>>
>> What is the best method to call the library XML file? As a parameter into
>> the function (placing the content of the library into a variable that is
>> placed into the function), a straight <xsl:value-of
>> select="doc('file:///pathto/library.xml')..." /> each time the function is
>> called, or does it really matter as it might make no difference or
>> discernible difference? (I'll be parsing 400+ files all less than 30k.)
>>
>> Just trying to find the best method someone with a programming background
>> and education (read post secondary education in computer science) would
>> use.
>>
>> Marijan (Mario) Madunic
>> Publishing Specialist
>> New Flyer Industries
>>
