|
Subject: Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx> Date: Fri, 7 Dec 2012 10:09:08 +0100 |
Hello Dimitre,
yes, it depends on the constant factors.
In computer science the alphabet is A={a,b,...,z}.
The language of all words of length k (not real words) is of size
V_max = 26^k
But the maximum node degree(v) is k*25.
V_max * k*25 is much less than (V_max ^ 2)
For your samples of length 5 and 6 the avarage degree is below 6!
(see other email reply to Wolfgang).
$ xpath++ "count(/*/w[string-length()=5])" dictEnglishBig.xml
10228
$ xpath++ "count(/*/w[string-length()=6])" dictEnglishBig.xml
17705
$
Mit besten Gruessen / Best wishes,
Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294
|------------>
| From: |
|------------>
>------------------------------------------------------------------------------------------------------------------------------------------|
|Dimitre Novatchev <dnovatchev@xxxxxxxxx> |
>------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| To: |
|------------>
>------------------------------------------------------------------------------------------------------------------------------------------|
|xsl-list@xxxxxxxxxxxxxxxxxxxxxx, |
>------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Date: |
|------------>
>------------------------------------------------------------------------------------------------------------------------------------------|
|12/06/2012 09:57 PM |
>------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Subject: |
|------------>
>------------------------------------------------------------------------------------------------------------------------------------------|
|Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem |
>------------------------------------------------------------------------------------------------------------------------------------------|
On Thu, Dec 6, 2012 at 10:29 AM, Hermann Stamm-Wilbrandt
<STAMMW@xxxxxxxxxx> wrote:
> Hi Dimitre,
>
> you are right with O(|E|), but not with O(N^2).
> Implementation of "5.c" was a bit sloppy, initEdges() is indeed O(N^2).
>
> But if you have N real language words of length k, each single word can
> have at most k*26 = O(1) neighbors.
>
> Testing whether a word "c_1 c_2 ... c_k" is among the "words" can be done
> in
> constant time, too, if your language allows for constant time array
access
> (just use an array of size 26^k = O(1)).
Sure, if you would allocate 8GB for this array (for k = 7).
Also, calculating the index in the array requires k steps (and
probably multiplication and addition at each such step).
>
> Now with each node having constant number of neighbors O(|E|) = O(|V|) =
O
> (N).
>
I was speaking about graph traversal in general. One may argue that
the big O notation isn't too useful when the value of N is limited and
not too-big.
For example 26*N in this particular case is bigger than log2(N) * N
(unless we have more than 3 million words). So, this "O(N)"
implementation may take more actual time than an "O(N*log(N)))
To put it in other words, the constants do matter when N is limited.
Cheers,
Dimitre
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Level 3 support for XML Compiler team and Fixpack team lead
> WebSphere DataPower SOA Appliances
> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
> https://twitter.com/#!/HermannSW/
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz
> Geschaeftsfuehrung: Dirk Wittkopp
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
>
> |------------>
> | From: |
> |------------>
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |Dimitre Novatchev <dnovatchev@xxxxxxxxx>
|
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |------------>
> | To: |
> |------------>
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,
|
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |------------>
> | Date: |
> |------------>
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |12/06/2012 07:11 PM
|
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |------------>
> | Subject: |
> |------------>
>
>------------------------------------------------------------------------------------------------------------------------------------------|
> |Re: [xsl] Word Ladders as an example of a "Find shortest path between
two nodes in a graph" problem |
>
>------------------------------------------------------------------------------------------------------------------------------------------|
>
>
>
>
>
> Hi Herman,
>
> I think that in a near-worst case the time-complexity is O(N^2),
> because every arc needs to be inspected and in a well-connected graph
> with N vertices there can be N*(N-1) vertices.
>
> This is also confirmed by Wikipedia and the 2-3 books on algoritms I
have:
>
> "The time complexity can be expressed as O(|E|) since every vertex and
> every edge will be explored in the worst case. Note: O(|E|) may vary
> between O(|V|) and O(|V|^2), depending on how sparse the input graph
> is (assuming that the graph is connected)".
>
>
> To summarize, this is a property of the graph and not of the
> programming language that is used.
>
>
> Cheers,
> Dimitre
>
>
>
> On Thu, Dec 6, 2012 at 10:01 AM, Hermann Stamm-Wilbrandt
> <STAMMW@xxxxxxxxxx> wrote:
>> Hi Dimitre,
>>
>>> That I have initially used a not probably the most efficient algorithm
>>> / implementation, shouldn't be used to make general conclusions about
>>> the appropriateness of using XSLT in solving a particular class of
>>> problems.
>>>
>> I agree with you -- and your solution is nice.
>>
>> But breadth-first-search algorithm can be implemented as linear time
>> algorithm in C or C++ -- I doubt that you can do linear time
>> implementation in XSLT since constant time array access is missing ...
>>
>>
>> Mit besten Gruessen / Best wishes,
>>
>> Hermann Stamm-Wilbrandt
>> Level 3 support for XML Compiler team and Fixpack team lead
>> WebSphere DataPower SOA Appliances
>> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
>> https://twitter.com/#!/HermannSW/
>> ----------------------------------------------------------------------
>> IBM Deutschland Research & Development GmbH
>> Vorsitzende des Aufsichtsrats: Martina Koederitz
>> Geschaeftsfuehrung: Dirk Wittkopp
>> Sitz der Gesellschaft: Boeblingen
>> Registergericht: Amtsgericht Stuttgart, HRB 243294
>>
>>
>> |------------>
>> | From: |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |Dimitre Novatchev <dnovatchev@xxxxxxxxx>
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |------------>
>> | To: |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |------------>
>> | Date: |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |12/06/2012 05:50 PM
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |------------>
>> | Subject: |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>> |Re: [xsl] Word Ladders as an example of a "Find shortest path between
> two nodes in a graph" problem |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>
>>
>>
>>
>> Herman,
>>
>> That I have initially used a not probably the most efficient algorithm
>> / implementation, shouldn't be used to make general conclusions about
>> the appropriateness of using XSLT in solving a particular class of
>> problems.
>>
>> Cheers,
>>
>> Dimitre
>>
>> On Thu, Dec 6, 2012 at 8:15 AM, Hermann Stamm-Wilbrandt
>> <STAMMW@xxxxxxxxxx> wrote:
>>>> ... I think that this
>>>> isn't something that should be solved in XSLT at all, except as an
>>>> academic exercise. ...
>>>>
>>> Agreed, nice XSLT solution, but not fast.
>>>
>>> This simple C program does find the longest path (35) to angry in a
>>> second (on a W520 Thinkpad) based on Dimitie's word list of length 5:
>>> http://www.stamm-wilbrandt.de/en/xsl-list/5.c
>>>
>>> $ time ./5 angry
>>> yasht
>>> yacht
>>> pacht
>>> pecht
>>> wecht
>>> wicht
>>> wight
>>> dight
>>> digit
>>> dimit
>>> demit
>>> remit
>>> refit
>>> befit
>>> besit
>>> beset
>>> besee
>>> belee
>>> belve
>>> beeve
>>> breve
>>> brave
>>> brace
>>> braca
>>> araca
>>> arara
>>> amara
>>> amala
>>> alala
>>> alula
>>> aluta
>>> abuta
>>> abura
>>> anura
>>> anury
>>> angry
>>> 35
>>>
>>> real 0m1.046s
>>> user 0m1.039s
>>> sys 0m0.004s
>>> $
>>>
>>>
>>> Mit besten Gruessen / Best wishes,
>>>
>>> Hermann Stamm-Wilbrandt
>>> Level 3 support for XML Compiler team and Fixpack team lead
>>> WebSphere DataPower SOA Appliances
>>> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
>>> https://twitter.com/#!/HermannSW/
>>> ----------------------------------------------------------------------
>>> IBM Deutschland Research & Development GmbH
>>> Vorsitzende des Aufsichtsrats: Martina Koederitz
>>> Geschaeftsfuehrung: Dirk Wittkopp
>>> Sitz der Gesellschaft: Boeblingen
>>> Registergericht: Amtsgericht Stuttgart, HRB 243294
>>>
>>>
>>> |------------>
>>> | From: |
>>> |------------>
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
>> |
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |------------>
>>> | To: |
>>> |------------>
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,
>> |
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |------------>
>>> | Date: |
>>> |------------>
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |11/28/2012 07:15 PM
>> |
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |------------>
>>> | Subject: |
>>> |------------>
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>> |Re: [xsl] Word Ladders as an example of a "Find shortest path
between
>> two nodes in a graph" problem |
>>>
>>>------------------------------------------------------------------------------------------------------------------------------------------|
>
>>
>>>
>>>
>>>
>>>
>>>
>>> The fact that one XSLT program runs three times faster on one XSLT
>>> implementation
>>> than on another one is strange, *very* strange. But is Saxon 6.4 the
>>> "dernier cri"?
>>> I'd very much like to hear Michael Kay's opinion on this.
>>>
>>> With Saxon HE 9.2.0 running with the -t option, I compare execution
>> times:
>>> 1209 ms with 40065592 bytes for WL's solution
>>> to
>>> 2768 ms with 81184768 bytes for DN's solution.
>>>
>>> Note: DN's solution being the one *without* the optimizations!
>>>
>>> Not that this is conclusive. Algorithms like this one must be judged
>>> by more than a single run:
>>> they may behave well for small word lengths and small ladder sizes,
>>> and scale badly, or
>>> the other way round. (Dimitre and I aren't even using the same word
>>> data, AFAIK.)
>>>
>>> As an aside, I'd like to say that neither DN's nor WL's solution is
>>> something that should
>>> be used if this problem (i.e., shortest path) should ever need a
>>> solution. I think that this
>>> isn't something that should be solved in XSLT at all, except as an
>>> academic exercise.
>>> (Feel free to disagree - I'll not reply to anything contradicting me.)
>>>
>>> Cheers
>>>
>>>>
>>>> Ok, I was running it with Saxon 6.4
>>>>
>>>> Now, the times are:
>>>>
>>>> With Saxon:
>>>>
>>>> Wolfgang's transformation: 25sec.
>>>>
>>>> Dimitre's : 39sec.
>>>>
>>>>
>>>> However, with XQSharp:
>>>>
>>>> Wolfgang's transformation: 23sec.
>>>>
>>>> Dimitre's : 14sec.
>>>>
>>>>
>>>> Therefore, one can't say wich transformation is faster -- it depends
>>>> on the XSLT processor being used.
>>>>
>>>>
>>>> Cheers,
>>>> Dimitre
>>>>
>>>>
>>>>
>>>> On Wed, Nov 28, 2012 at 7:27 AM, Dimitre Novatchev
>> <dnovatchev@xxxxxxxxx>
>>>> wrote:
>>>> > I get this error, trying to run your code:
>>>> >
>>>> > SAXON 6.5.4 from Michael Kay
>>>> > Java version 1.6.0_31
>>>> > Loading my:my
>>>> > Preparation time: 250 milliseconds
>>>> > Processing file:/C:\XSLT Projects\WordLadders\Ver 0.2\dictGraph4.xml
>>>> > Building tree for file:/C:\XSLT Projects\WordLadders\Ver
>>>> > 0.2\dictGraph4.xml using class com.icl.saxon.tinytree.TinyBuilder
>>>> > Tree built in 351 milliseconds
>>>> > Error at xsl:variable on line 23 of file:/(Untitled):
>>>> > Error in expression key('kFindWord', $pStartWord, $vDictGraph)
>>>> > [count(../*) lt count(key('kFindWord',
>>>> > $pTargetWord, $vDictGraph)/../* )] |
>>>> > key('kFindWord', $pTargetWord, $vDictGraph)
>>>> > [count(../*) le count(key('kFindWord', $pStartWord,
>>>> > $vDictGraph)/../*)]: expected "]", found "<name>"
>>>> >
>>>> >
>>>> > Cheers,
>>>> > Dimitre
>>>> >
>>>> > On Wed, Nov 28, 2012 at 5:40 AM, Wolfgang Laun
>>> <wolfgang.laun@xxxxxxxxx>
>>>> > wrote:
>>>> >> <xsl:stylesheet version="2.0"
>>>> >> xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>>> >> xmlns:my="my:my"
>>>> >> xmlns:xs="http://www.w3.org/2001/XMLSchema"
>>>> >> exclude-result-prefixes="my xs">
>>>> >>
>>>> >> <xsl:output method="text"/>
>>>> >>
>>>> >> <xsl:variable name="vDictGraph" select="/"/>
>>>> >> <xsl:key name="kFindWord" match="w" use="."/>
>>>> >>
>>>> >> <xsl:param name="pStartWord" select="'nice'" as="xs:string"/>
>>>> >> <xsl:param name="pTargetWord" select="'evil'" as="xs:string"/>
>>>> >>
>>>> >> <xsl:variable name="vStartWord" as="xs:string"
>>>> >> select="key('kFindWord', $pStartWord, $vDictGraph)
>>>> >> [count(../*) lt count(key('kFindWord',
>>>> >> $pTargetWord, $vDictGraph)/../* )]
>>>> >> |
>>>> >> key('kFindWord', $pTargetWord, $vDictGraph)
>>>> >> [count(../*) le count(key('kFindWord',
>>>> >> $pStartWord, $vDictGraph)/../*)]"/>
>>>> >>
>>>> >> <xsl:variable name="vTargetWord" as="xs:string"
>>>> >> select="($pStartWord, $pTargetWord)[not(. eq
>>>> >> $vStartWord)]"/>
>>>> >>
>>>> >> <!-- This function iterates over the temporary tree
>>>> >>
>>>
>>
>
<result><arc level=".." from=".." to=".."/>...</result>
>>>> >> to find
>>
> the
>>> ladder. It starts at a node matching @to with
>>>> >>
$vTargetWord
>>>>
>>>
>> and
>>> proceeds with decreasing @level. -->
>>>> >> <xsl:function name="my:find-path" as="xs:string*">
>>>> >> <xsl:param name="root" as="node()"/>
>>>> >> <xsl:param name="level" as="xs:integer"/>
>>>> >> <xsl:param name="start" as="xs:string"/>
>>>> >> <xsl:param name="target" as="xs:string"/>
>>>> >> <xsl:param name="path" as="xs:string"/>
>>>> >>
>>>> >> <xsl:for-each select="$root/result/arc[@level = $level and @to
=
>>>> >> $target]">
>>>> >> <xsl:variable name="from" select="./@from"/>
>>>> >> <xsl:choose>
>>>> >> <xsl:when test="$start eq $from">
>>>> >> <xsl:value-of select="concat($from,'+',$path)"/>
>>>> >> </xsl:when>
>>>> >> <xsl:otherwise>
>>>> >> <xsl:value-of select="my:find-path($root,$level
>>>> >> -1,$start,$from,concat($from,'+',$path))"/>
>>>> >> </xsl:otherwise>
>>>> >> </xsl:choose>
>>>> >> </xsl:for-each>
>>>> >> </xsl:function>
>>>> >>
>>>> >> <xsl:template match="/">
>>>> >> <xsl:variable name='arcs'>
>>>> >> <result>
>>>> >> <xsl:call-template name="look-at-starts">
>>>> >> <xsl:with-param name="level" select="1"/>
>>>> >> <xsl:with-param name="starts" select="$vStartWord"/>
>>>> >> <xsl:with-param name="target" select="$vTargetWord"/>
>>>> >> <xsl:with-param name="toskip" select="()"/>
>>>> >> </xsl:call-template>
>>>> >> </result>
>>>> >> </xsl:variable>
>>>> >>
>>>> >> <xsl:variable name="finalArcs" select="$arcs/result/arc[@to =
>>>> >> $vTargetWord]"/>
>>>> >> <xsl:value-of select="my:find-path($arcs, $finalArcs[1]/@level,
>>>> >> $vStartWord, $vTargetWord, $vTargetWord)"/>
>>>> >> </xsl:template>
>>>> >>
>>>> >> <!-- Look at $starters nodes obtained from the current set of
>
words
>>>>
>>>>> ending all incomplete ladders. Generate result/arc for
>
each hop
>> to
>>>>
>>>>> the next step. Recurse if none of the arc
>
destinations is the
>>>> >>
>>> overall
>>>> >> target word, otherwise
>
return the last hop. -->
>>>> >> <xsl:template name="look-at-starts">
>>>> >> <xsl:param name="level" as="xs:integer"/>
>>>> >> <xsl:param name="starts" as="xs:string*"/>
>>>> >> <xsl:param name="target" as="xs:string"/>
>>>> >> <xsl:param name="toskip" as="node()*"/>
>>>> >>
>>>> >> <xsl:variable name="starters" as="node()*"
>>>> >> select="key('kFindWord', $starts, $vDictGraph)/..
>>>> >> except $toskip"/>
>>>> >>
>>>> >> <xsl:for-each select="$starters">
>>>> >> <xsl:variable name="w" select="./w"/>
>>>> >> <xsl:for-each select="./nb">
>>>> >> <arc level="{$level}" from="{$w}" to="{.}"/>
>>>> >> </xsl:for-each>
>>>> >> </xsl:for-each>
>>>> >>
>>>> >> <xsl:variable name="nbs" select="$starters/nb"/>
>>>> >>
>>>> >> <xsl:choose>
>>>> >> <xsl:when test="$target = $nbs">
>>>> >> <!--xsl:message select="'found a ladder'"/-->
>>>> >> </xsl:when>
>>>> >> <xsl:otherwise>
>>>> >> <xsl:call-template name="look-at-starts">
>>>> >> <xsl:with-param name="level" select="$level + 1"/>
>>>> >> <xsl:with-param name="starts"
>>>> >> select="distinct-values($nbs)"/>
>>>> >> <xsl:with-param name="target" select="$target"/>
>>>> >> <xsl:with-param name="toskip" select="$toskip union
>>>> >> $starters"/>
>>>> >> </xsl:call-template>
>>>> >> </xsl:otherwise>
>>>> >> </xsl:choose>
>>>> >> </xsl:template>
>>>> >> </xsl:stylesheet>
| Current Thread |
|---|
|
| <- Previous | Index | Next -> |
|---|---|---|
| Re: [xsl] Word Ladders as an exampl, Dimitre Novatchev | Thread | Re: [xsl] Word Ladders as an exampl, Dimitre Novatchev |
| Re: [xsl] Word Ladders as an exampl, Wolfgang Laun | Date | Re: [xsl] Word Ladders as an exampl, Hermann Stamm-Wilbra |
| Month |