Subject: Re: [xsl] Using memory addressing to retrieve a value vice using a software string library to retrieve a value From: "Dimitre Novatchev dnovatchev@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> Date: Fri, 20 Nov 2015 16:38:55 -0000 |
In addition to the replies by Dr. Kay and David Carlisle: It depends on the concrete case and the concrete XML document and the string itself. 1. Imagine a string starting with a space and then having the wanted substring (that is not too-long). This may be faster than scanning sequentially through thousands of nodes -- as in general an XPath engine may use a linked-list and not a vector. 2. On the other side, imagine an XPath engine that has the sequence of nodes in a vector (self-expandable array structure). Then the wanted node can be obtained in O(1). Imagine at the same time that you have a sting long many Megs, snd the first space happens only at the end of the string, and the wanted substring is also very long. In this case searching through the sting using the specified expression can be orders of magnitude slower. A general remark about finding the first/all occurrence of a given string as a substring of another given string, or whether a string1 contains a string2. I am aware of at least two very efficient algorithms: 1. Using *suffix-trees* -- all possible suffixes of the given string are organized as a trie. The search (same as the contains() function) is very fast. 2. Using the hash of the search-string and scanning the given string and sequentially computing the hash of every next substring with the same length as the search string. Because computing the hash of the moving "window" can be done incrementally, this algorithm is O(N) -- much better than the naC/ve O(N*M) algorithm. More about this -- in the book of Steven Skiena "The Algorithm Design Manual", http://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/dp/1849967202/ref =sr_1_1?s=books&ie=UTF8&qid=1448037467&sr=1-1&keywords=skiena&pebp=1448037471 640&perid=0D1Z0T6Y90MCMDPDA330 Cheers, Dimitre To summarize: All depends. On Fri, Nov 20, 2015 at 4:13 AM, Costello, Roger L. costello@xxxxxxxxx <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote: > Hi Folks, > > I want to retrieve "west". > > Which of these is faster? > > ------------------- > Approach #1 > ------------------- > The <edge> element contains text: > > <edge>garden west door</edge> > > "west" is retrieved using this XPath: > > substring-before(substring-after(., ' '), ' ') > > Note: assume that <edge> is the context node. > > ------------------- > Approach #2 > ------------------- > The <edge> element contains markup: > > <edge> > <garden/> > <west/> > <door/> > </edge> > > "west" is retrieved using this XPath: > > *[2]/name() > > > I believe that Approach #2 is much, much faster. Do you agree? > > As I see it, Approach #2 is simply a memory addressing task (which computers do very well) to obtain <west/> and then output the symbols at that memory address. Approach #1, on the other hand, requires the use of software string libraries, which, I imagine, results in hundreds or thousands of machine instructions. In fact, I would imagine that Approach #2 is hundreds or thousands of times faster than Approach #1. Do you agree? > > /Roger > -- Cheers, Dimitre Novatchev --------------------------------------- Truly great madness cannot be achieved without significant intelligence. --------------------------------------- To invent, you need a good imagination and a pile of junk ------------------------------------- Never fight an inanimate object ------------------------------------- To avoid situations in which you might make mistakes may be the biggest mistake of all ------------------------------------ Quality means doing it right when no one is looking. ------------------------------------- You've achieved success in your field when you don't know whether what you're doing is work or play ------------------------------------- To achieve the impossible dream, try going to sleep. ------------------------------------- Facts do not cease to exist because they are ignored. ------------------------------------- Typing monkeys will write all Shakespeare's works in 200yrs.Will they write all patents, too? :) ------------------------------------- Sanity is madness put to good use. ------------------------------------- I finally figured out the only reason to be alive is to enjoy it.
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Re: [xsl] Using memory addressing t, David Carlisle d.p.c | Thread | Re: [xsl] Using memory addressing t, Dimitre Novatchev dn |
Re: [xsl] Using memory addressing t, David Carlisle d.p.c | Date | Re: [xsl] Using memory addressing t, Dimitre Novatchev dn |
Month |