Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem

Subject: Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem
From: Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx>
Date: Fri, 7 Dec 2012 10:22:40 +0100
Hello Wolfgang,

its really good to hear about your results.
It seems to indicate that XSLT seems to be really quick in your environment

In my C implementation I was sloppy, not only the initEdges was of
quadratic
complexity, but also BFS itself.

In the night I made the code really linear (by making use of perfect hash
function with uninitialized array safely).
In addition now the average node degree is displayed.

Since you seem to have a pretty fast system, may you please determine the
XSLT time needed to go from "anyone" to "chinik" in 47 steps?
As you can see below that 6 letter problem completed in 0.4s (5q in 0.3s).

On your question:
If your solution really comes into subsecond range, I would choose XSLT.
But if a C solution on bigger combinatorial complexity problems would
outperform a XSLT solution, I would go with that, at least for the compute
intense part component.

http://stamm-wilbrandt.de/en/xsl-list/5q.c
http://stamm-wilbrandt.de/en/xsl-list/6q.c

$ time ./5q angry
yasht
yacht
...
anury
angry
35 - 0
|V|=8416(10228)  |E|=22661  degree_avg=5.39
0.296955s

real	0m0.302s
user	0m0.127s
sys	0m0.171s
$


$ time ./6q anyone
chinik
chinin
...
ancone
anyone
47 - 0
|V|=8329(17705)  |E|=21701  degree_avg=5.21
0.391279s

real	0m0.396s
user	0m0.211s
sys	0m0.182s
$


For completeness, the complete output:

$ time ./6q anyone
chinik
chinin
chitin
chiton
chiron
charon
sharon
sharan
shaman
seaman
seasan
season
geason
genson
genion
genian
genial
denial
dental
rental
rectal
recoal
recool
recook
recock
relock
relick
relink
reline
meline
maline
saline
spline
upline
unline
unfine
unfile
unfill
unfull
ungull
ungula
angula
angola
angora
ancora
ancona
ancone
anyone
47 - 0
|V|=8329(17705)  |E|=21701  degree_avg=5.21
0.391279s

real	0m0.396s
user	0m0.211s
sys	0m0.182s
$



Mit besten Gruessen / Best wishes,

Hermann Stamm-Wilbrandt
Level 3 support for XML Compiler team and Fixpack team lead
WebSphere DataPower SOA Appliances
https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
https://twitter.com/#!/HermannSW/
----------------------------------------------------------------------
IBM Deutschland Research & Development GmbH
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschaeftsfuehrung: Dirk Wittkopp
Sitz der Gesellschaft: Boeblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294


|------------>
| From:      |
|------------>
  >------------------------------------------------------------------------------------------------------------------------------------------|
  |Wolfgang Laun <wolfgang.laun@xxxxxxxxx>                                                                                                   |
  >------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| To:        |
|------------>
  >------------------------------------------------------------------------------------------------------------------------------------------|
  |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,                                                                                                          |
  >------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Date:      |
|------------>
  >------------------------------------------------------------------------------------------------------------------------------------------|
  |12/07/2012 09:21 AM                                                                                                                       |
  >------------------------------------------------------------------------------------------------------------------------------------------|
|------------>
| Subject:   |
|------------>
  >------------------------------------------------------------------------------------------------------------------------------------------|
  |Re: [xsl] Word Ladders as an example of a "Find shortest path between two nodes in a graph" problem                                       |
  >------------------------------------------------------------------------------------------------------------------------------------------|





Hello Hermann,

nevertheless comparisons should be made on an equal basis. The C program
doesn't search for a ladder from A to B (which might be implemented
easily),
and it uses a compiled-in dictionary, which saves time.

When I compare 5.c "as is" to my (!) XSLT version of finding the ladder
from
"yasht" to "angry", it's 1.3sec (/usr/bin/time) to 1.9sec (saxon9he
-t). (Dimitre's version #1 takes 26.3sec.)

And if I /usr/bin/time my Java version that finds *all* ladders
between "yasht" to "angry", it's 0.6sec...

But that's not the key issue for me. Let me put it this way: if you
would plan for a widely portable SW product "Word Ladders" (relying on
OS SW only) with the joint capabilities of finding all ladders between
words or all ladders of a given length, with a user interface for end
users and another one for administrators (for updating the
dictionary), what would you use?

Cheers
Wolfgang

On 06/12/2012, Hermann Stamm-Wilbrandt <STAMMW@xxxxxxxxxx> wrote:
> Hi Dimitre,
>
>> That I have initially used a not probably the most efficient algorithm
>> / implementation, shouldn't be used to make general conclusions about
>> the appropriateness of using XSLT in solving a particular class of
>> problems.
>>
> I agree with you -- and your solution is nice.
>
> But breadth-first-search algorithm can be implemented as linear time
> algorithm in C or C++ -- I doubt that you can do linear time
> implementation in XSLT since constant time array access is missing ...
>
>
> Mit besten Gruessen / Best wishes,
>
> Hermann Stamm-Wilbrandt
> Level 3 support for XML Compiler team and Fixpack team lead
> WebSphere DataPower SOA Appliances
> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
> https://twitter.com/#!/HermannSW/
> ----------------------------------------------------------------------
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz
> Geschaeftsfuehrung: Dirk Wittkopp
> Sitz der Gesellschaft: Boeblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
>
>
> |------------>
> | From:      |
> |------------>
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>   |Dimitre Novatchev <dnovatchev@xxxxxxxxx>
>                                                                 |
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

> |------------>
> | To:        |
> |------------>
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>   |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,
>                                                                 |
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

> |------------>
> | Date:      |
> |------------>
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>   |12/06/2012 05:50 PM
>                                                                 |
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

> |------------>
> | Subject:   |
> |------------>
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>   |Re: [xsl] Word Ladders as an example of a "Find shortest path between
two
> nodes in a graph" problem                                       |
>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>
>
>
>
> Herman,
>
> That I have initially used a not probably the most efficient algorithm
> / implementation, shouldn't be used to make general conclusions about
> the appropriateness of using XSLT in solving a particular class of
> problems.
>
> Cheers,
>
> Dimitre
>
> On Thu, Dec 6, 2012 at 8:15 AM, Hermann Stamm-Wilbrandt
> <STAMMW@xxxxxxxxxx> wrote:
>>> ... I think that this
>>> isn't something that should be solved in XSLT at all, except as an
>>> academic exercise. ...
>>>
>> Agreed, nice XSLT solution, but not fast.
>>
>> This simple C program does find the longest path (35) to angry in a
>> second (on a W520 Thinkpad) based on Dimitie's word list of length 5:
>> http://www.stamm-wilbrandt.de/en/xsl-list/5.c
>>
>> $ time ./5 angry
>> yasht
>> yacht
>> pacht
>> pecht
>> wecht
>> wicht
>> wight
>> dight
>> digit
>> dimit
>> demit
>> remit
>> refit
>> befit
>> besit
>> beset
>> besee
>> belee
>> belve
>> beeve
>> breve
>> brave
>> brace
>> braca
>> araca
>> arara
>> amara
>> amala
>> alala
>> alula
>> aluta
>> abuta
>> abura
>> anura
>> anury
>> angry
>> 35
>>
>> real    0m1.046s
>> user    0m1.039s
>> sys     0m0.004s
>> $
>>
>>
>> Mit besten Gruessen / Best wishes,
>>
>> Hermann Stamm-Wilbrandt
>> Level 3 support for XML Compiler team and Fixpack team lead
>> WebSphere DataPower SOA Appliances
>> https://www.ibm.com/developerworks/mydeveloperworks/blogs/HermannSW/
>> https://twitter.com/#!/HermannSW/
>> ----------------------------------------------------------------------
>> IBM Deutschland Research & Development GmbH
>> Vorsitzende des Aufsichtsrats: Martina Koederitz
>> Geschaeftsfuehrung: Dirk Wittkopp
>> Sitz der Gesellschaft: Boeblingen
>> Registergericht: Amtsgericht Stuttgart, HRB 243294
>>
>>
>> |------------>
>> | From:      |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>>   |Wolfgang Laun <wolfgang.laun@xxxxxxxxx>
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>> |------------>
>> | To:        |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>>   |xsl-list@xxxxxxxxxxxxxxxxxxxxxx,
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>> |------------>
>> | Date:      |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>>   |11/28/2012 07:15 PM
> |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>> |------------>
>> | Subject:   |
>> |------------>
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>>   |Re: [xsl] Word Ladders as an example of a "Find shortest path between
> two nodes in a graph" problem                                       |
>>
>>------------------------------------------------------------------------------------------------------------------------------------------|

>
>>
>>
>>
>>
>>
>> The fact that one XSLT program runs three times faster on one XSLT
>> implementation
>> than on another one is strange, *very* strange. But is Saxon 6.4 the
>> "dernier cri"?
>> I'd very much like to hear Michael Kay's opinion on this.
>>
>> With Saxon HE 9.2.0 running with the -t option, I compare execution
> times:
>>    1209 ms with 40065592 bytes for WL's solution
>> to
>>    2768 ms with 81184768 bytes for DN's solution.
>>
>> Note: DN's solution being the one *without* the optimizations!
>>
>> Not that this is conclusive. Algorithms like this one must be judged
>> by more than a single run:
>> they may behave well for small word lengths and small ladder sizes,
>> and scale badly, or
>> the other way round. (Dimitre and I aren't even using the same word
>> data, AFAIK.)
>>
>> As an aside, I'd like to say that neither DN's nor WL's solution is
>> something that should
>> be used if this problem (i.e., shortest path) should ever need a
>> solution. I think that this
>> isn't something that should be solved in XSLT at all, except as an
>> academic exercise.
>> (Feel free to disagree - I'll not reply to anything contradicting me.)
>>
>> Cheers
>>
>>>
>>> Ok, I was running it with Saxon 6.4
>>>
>>> Now, the times are:
>>>
>>> With Saxon:
>>>
>>> Wolfgang's transformation: 25sec.
>>>
>>> Dimitre's :                            39sec.
>>>
>>>
>>> However, with XQSharp:
>>>
>>> Wolfgang's transformation: 23sec.
>>>
>>> Dimitre's :                            14sec.
>>>
>>>
>>> Therefore, one can't say wich transformation is faster -- it depends
>>> on the XSLT processor being used.
>>>
>>>
>>> Cheers,
>>> Dimitre
>>>
>>>
>>>
>>> On Wed, Nov 28, 2012 at 7:27 AM, Dimitre Novatchev
> <dnovatchev@xxxxxxxxx>
>>> wrote:
>>> > I get this error, trying to run your code:
>>> >
>>> > SAXON 6.5.4 from Michael Kay
>>> > Java version 1.6.0_31
>>> > Loading my:my
>>> > Preparation time: 250 milliseconds
>>> > Processing file:/C:\XSLT Projects\WordLadders\Ver 0.2\dictGraph4.xml
>>> > Building tree for file:/C:\XSLT Projects\WordLadders\Ver
>>> > 0.2\dictGraph4.xml using class com.icl.saxon.tinytree.TinyBuilder
>>> > Tree built in 351 milliseconds
>>> > Error at xsl:variable on line 23 of file:/(Untitled):
>>> >   Error in expression key('kFindWord', $pStartWord, $vDictGraph)
>>> >                     [count(../*)  lt  count(key('kFindWord',
>>> > $pTargetWord, $vDictGraph)/../* )]                        |
>>> >             key('kFindWord', $pTargetWord, $vDictGraph)
>>> >            [count(../*) le count(key('kFindWord',  $pStartWord,
>>> > $vDictGraph)/../*)]: expected "]", found "<name>"
>>> >
>>> >
>>> > Cheers,
>>> > Dimitre
>>> >
>>> > On Wed, Nov 28, 2012 at 5:40 AM, Wolfgang Laun
>> <wolfgang.laun@xxxxxxxxx>
>>> > wrote:
>>> >> <xsl:stylesheet version="2.0"
>>> >>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform";
>>> >>    xmlns:my="my:my"
>>> >>    xmlns:xs="http://www.w3.org/2001/XMLSchema";
>>> >>    exclude-result-prefixes="my xs">
>>> >>
>>> >>   <xsl:output method="text"/>
>>> >>
>>> >>   <xsl:variable name="vDictGraph" select="/"/>
>>> >>   <xsl:key name="kFindWord" match="w" use="."/>
>>> >>
>>> >>   <xsl:param name="pStartWord"  select="'nice'" as="xs:string"/>
>>> >>   <xsl:param name="pTargetWord" select="'evil'" as="xs:string"/>
>>> >>
>>> >>   <xsl:variable name="vStartWord" as="xs:string"
>>> >>                 select="key('kFindWord', $pStartWord, $vDictGraph)
>>> >>                          [count(../*)  lt  count(key('kFindWord',
>>> >> $pTargetWord, $vDictGraph)/../* )]
>>> >>                       |
>>> >>                       key('kFindWord', $pTargetWord, $vDictGraph)
>>> >>                          [count(../*) le count(key('kFindWord',
>>> >> $pStartWord, $vDictGraph)/../*)]"/>
>>> >>
>>> >>   <xsl:variable name="vTargetWord" as="xs:string"
>>> >>                 select="($pStartWord, $pTargetWord)[not(. eq
>>> >> $vStartWord)]"/>
>>> >>
>>> >>   <!-- This function iterates over the temporary tree
>>> >>
>>
>
<result><arc level=".." from=".." to=".."/>...</result>
>>> >>     to find
>
the
>> ladder. It starts at a node matching @to with
>>> >> $vTargetWord
>>>
>>
> and
>> proceeds with decreasing @level. -->
>>> >>   <xsl:function name="my:find-path" as="xs:string*">
>>> >>     <xsl:param name="root"   as="node()"/>
>>> >>     <xsl:param name="level"  as="xs:integer"/>
>>> >>     <xsl:param name="start"  as="xs:string"/>
>>> >>     <xsl:param name="target" as="xs:string"/>
>>> >>     <xsl:param name="path"   as="xs:string"/>
>>> >>
>>> >>     <xsl:for-each select="$root/result/arc[@level = $level and @to =
>>> >> $target]">
>>> >>       <xsl:variable name="from" select="./@from"/>
>>> >>       <xsl:choose>
>>> >>         <xsl:when test="$start eq $from">
>>> >>           <xsl:value-of select="concat($from,'+',$path)"/>
>>> >>         </xsl:when>
>>> >>         <xsl:otherwise>
>>> >>           <xsl:value-of select="my:find-path($root,$level
>>> >> -1,$start,$from,concat($from,'+',$path))"/>
>>> >>         </xsl:otherwise>
>>> >>       </xsl:choose>
>>> >>     </xsl:for-each>
>>> >>   </xsl:function>
>>> >>
>>> >>   <xsl:template match="/">
>>> >>     <xsl:variable name='arcs'>
>>> >>       <result>
>>> >>       <xsl:call-template name="look-at-starts">
>>> >>         <xsl:with-param name="level"  select="1"/>
>>> >>         <xsl:with-param name="starts" select="$vStartWord"/>
>>> >>         <xsl:with-param name="target" select="$vTargetWord"/>
>>> >>         <xsl:with-param name="toskip" select="()"/>
>>> >>       </xsl:call-template>
>>> >>       </result>
>>> >>     </xsl:variable>
>>> >>
>>> >>     <xsl:variable name="finalArcs" select="$arcs/result/arc[@to =
>>> >> $vTargetWord]"/>
>>> >>     <xsl:value-of select="my:find-path($arcs, $finalArcs[1]/@level,
>>> >> $vStartWord, $vTargetWord, $vTargetWord)"/>
>>> >>   </xsl:template>
>>> >>
>>> >>   <!-- Look at $starters nodes obtained from the current set of
words
>>>
>>>>      ending all incomplete ladders. Generate result/arc for
each hop
> to
>>>
>>>>      the next step. Recurse if none of the arc
destinations is the
>>> >>
>> overall
>>> >>      target word, otherwise
return the last hop. -->
>>> >>   <xsl:template name="look-at-starts">
>>> >>     <xsl:param name="level"  as="xs:integer"/>
>>> >>     <xsl:param name="starts" as="xs:string*"/>
>>> >>     <xsl:param name="target" as="xs:string"/>
>>> >>     <xsl:param name="toskip" as="node()*"/>
>>> >>
>>> >>     <xsl:variable name="starters" as="node()*"
>>> >>                   select="key('kFindWord', $starts, $vDictGraph)/..
>>> >> except $toskip"/>
>>> >>
>>> >>     <xsl:for-each select="$starters">
>>> >>       <xsl:variable name="w" select="./w"/>
>>> >>       <xsl:for-each select="./nb">
>>> >>         <arc level="{$level}" from="{$w}" to="{.}"/>
>>> >>       </xsl:for-each>
>>> >>     </xsl:for-each>
>>> >>
>>> >>     <xsl:variable name="nbs" select="$starters/nb"/>
>>> >>
>>> >>     <xsl:choose>
>>> >>       <xsl:when test="$target = $nbs">
>>> >>         <!--xsl:message select="'found a ladder'"/-->
>>> >>       </xsl:when>
>>> >>       <xsl:otherwise>
>>> >>         <xsl:call-template name="look-at-starts">
>>> >>           <xsl:with-param name="level"  select="$level + 1"/>
>>> >>           <xsl:with-param name="starts"
>>> >> select="distinct-values($nbs)"/>
>>> >>           <xsl:with-param name="target" select="$target"/>
>>> >>           <xsl:with-param name="toskip" select="$toskip union
>>> >> $starters"/>
>>> >>         </xsl:call-template>
>>> >>       </xsl:otherwise>
>>> >>     </xsl:choose>
>>> >>   </xsl:template>
>>> >> </xsl:stylesheet>
>>> >
>>> >
>>> >
>>> > --
>>> > Cheers,
>>> > Dimitre Novatchev
>>> > ---------------------------------------
>>> > Truly great madness cannot be achieved without significant
>> intelligence.
>>> > ---------------------------------------
>>> > To invent, you need a good imagination and a pile of junk
>>> > -------------------------------------
>>> > Never fight an inanimate object
>>> > -------------------------------------
>>> > To avoid situations in which you might make mistakes may be the
>>> > biggest mistake of all
>>> > ------------------------------------
>>> > Quality means doing it right when no one is looking.
>>> > -------------------------------------
>>> > You've achieved success in your field when you don't know whether
what
>>> > you're doing is work or play
>>> > -------------------------------------
>>> > Facts do not cease to exist because they are ignored.
>>> > -------------------------------------
>>> > Typing monkeys will write all Shakespeare's works in 200yrs.Will they
>>> > write all patents, too? :)
>>> > -------------------------------------
>>> > I finally figured out the only reason to be alive is to enjoy it.
>>>
>>>
>>>
>>> --
>>> Cheers,
>>> Dimitre Novatchev
>>> ---------------------------------------
>>> Truly great madness cannot be achieved without significant
intelligence.
>>> ---------------------------------------
>>> To invent, you need a good imagination and a pile of junk
>>> -------------------------------------
>>> Never fight an inanimate object
>>> -------------------------------------
>>> To avoid situations in which you might make mistakes may be the
>>> biggest mistake of all
>>> ------------------------------------
>>> Quality means doing it right when no one is looking.
>>> -------------------------------------
>>> You've achieved success in your field when you don't know whether what
>>> you're doing is work or play
>>> -------------------------------------
>>> Facts do not cease to exist because they are ignored.
>>> -------------------------------------
>>> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
>>> write all patents, too? :)
>>> -------------------------------------
>>> I finally figured out the only reason to be alive is to enjoy it.
>>
>
>
>
> --
> Cheers,
> Dimitre Novatchev
> ---------------------------------------
> Truly great madness cannot be achieved without significant intelligence.
> ---------------------------------------
> To invent, you need a good imagination and a pile of junk
> -------------------------------------
> Never fight an inanimate object
> -------------------------------------
> To avoid situations in which you might make mistakes may be the
> biggest mistake of all
> ------------------------------------
> Quality means doing it right when no one is looking.
> -------------------------------------
> You've achieved success in your field when you don't know whether what
> you're doing is work or play
> -------------------------------------
> Facts do not cease to exist because they are ignored.
> -------------------------------------
> Typing monkeys will write all Shakespeare's works in 200yrs.Will they
> write all patents, too? :)
> -------------------------------------
> I finally figured out the only reason to be alive is to enjoy it.

Current Thread