RE: [xsl] xsl search engine

Subject: RE: [xsl] xsl search engine
From: "Ricaud Matthieu" <matthieu.ricaud@xxxxxxx>
Date: Thu, 11 Mar 2004 15:12:15 +0100
Concerning the duplicated matches (elements that are displayed twice), I
came up with something (which I don't find really fantastic...).
The node-set variable $NodeSetMErecherchees holds all the nodes found:

$NodeSetMErecherchees contains :
		<THEME id="12" label="Droit du travail"/>
		<THEME id="2" label="droit social"/>
		<THEME id="12" label="Droit du travail"/>
		<THEME id="34" label="travail à la chaîne"/>

To delete the THEME nodes that appear twice (or more) in this node-set,
I did:

	<xsl:variable name="NodeSetMErechercheesWithoutDouble">
	<xsl:for-each select="msxsl:node-set($NodeSetMErecherchees)/THEME">
		<xsl:variable name="id" select="@id"/>
		<!-- ids of the preceding THEMEs, each one enclosed in '///' delimiters -->
		<xsl:variable name="StringPrecedingTHEMEs">
			<xsl:text>///</xsl:text>
			<xsl:for-each select="preceding-sibling::*">
				<xsl:value-of select="@id"/><xsl:text>///</xsl:text>
			</xsl:for-each>
		</xsl:variable>
		<!-- compare the delimited id, so that id "2" is not found inside "12" -->
		<xsl:if test="not(contains($StringPrecedingTHEMEs, concat('///', $id, '///')))">
			<THEME id="{$id}" label="{@label}"/>
		</xsl:if>
	</xsl:for-each>
	</xsl:variable>

$NodeSetMErechercheesWithoutDouble then contains:
		<THEME id="12" label="Droit du travail"/>
		<THEME id="2" label="droit social"/>
		<THEME id="34" label="travail à la chaîne"/>
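
A shorter variant of the same idea, as a sketch only: instead of building a
string of the preceding ids, compare @id directly against the @id of the
preceding siblings. I assume here, as above, that the processor offers
msxsl:node-set(); the hard-coded sample data and the final loop are only there
so that the stylesheet can be run on any input document.

	<xsl:stylesheet version="1.0"
		xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
		xmlns:msxsl="urn:schemas-microsoft-com:xslt"
		exclude-result-prefixes="msxsl">
		<xsl:output method="html" indent="yes"/>

		<xsl:template match="/">
			<!-- sample search result, hard-coded for the demonstration -->
			<xsl:variable name="NodeSetMErecherchees">
				<THEME id="12" label="Droit du travail"/>
				<THEME id="2" label="droit social"/>
				<THEME id="12" label="Droit du travail"/>
				<THEME id="34" label="travail à la chaîne"/>
			</xsl:variable>

			<!-- keep a THEME only if no preceding sibling THEME has the same id -->
			<xsl:variable name="NodeSetMErechercheesWithoutDouble">
				<xsl:copy-of select="msxsl:node-set($NodeSetMErecherchees)/THEME[not(@id = preceding-sibling::THEME/@id)]"/>
			</xsl:variable>

			<!-- display the labels of the remaining THEMEs -->
			<xsl:for-each select="msxsl:node-set($NodeSetMErechercheesWithoutDouble)/THEME">
				<xsl:value-of select="@label"/><br/>
			</xsl:for-each>
		</xsl:template>
	</xsl:stylesheet>

Each node is still compared with all its preceding siblings, so the cost grows
with the square of the number of matches; for large result sets, grouping with
xsl:key would probably be the better choice.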



-----Original Message-----
From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
[mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx] On behalf of Ricaud
Matthieu
Sent: Thursday, 11 March 2004 12:18
To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
Subject: RE: [xsl] xsl search engine


Thanks Robert and Jarno for your help.
As we say in French, "I'm reinventing the wheel!"
I don't know Jakarta Lucene but, as Robert said, I would need Java to
run it. Unfortunately my project doesn't use Java: it is only XML/XSL
files displaying HTML, and ASP is used to:
- generate an XML listing of all the XML files
- add parameters to the XSL stylesheets
- and use values from HTML forms
And indeed I've never learnt Java...

The search engine I want to build is almost OK; just a "few" things don't
work yet.
I can use it as it is, but if I find a solution for those "little" problems,
then I'll be able to search in any XML file and even across many files (using
document(...))
and then make an engine for my whole project.
Actually I'd like to try going further in that direction...

So let me describe those problems, which I did not manage to solve, in case
you have an idea...

Recap of the problem
I want to search for a string in an XML file and display the matched nodes.

The XML document looks like this :
<LIST>
	<THEME label="Droit du travail" id="12"/>
	<THEME label="droit social" id="2"/>
	<THEME label="travail à la chaîne" id="34"/>
	<THEME label="rien du tout" id="17"/>
</LIST>

The search engine only searches the label attribute of the THEME nodes of
this XML document, and it displays the @label.

The XSL file does 2 things: it provides an HTML form in which the user writes
the (next) search words, and it displays the (previous) result.

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt">
	<xsl:output method="html" indent="yes"/>
	<xsl:param name="String"/> <!-- string sent by the xslt processor -->

	<xsl:template name="tokenizer">
  		<xsl:param name="text" select="concat(normalize-space($String),' ')"/>
  		<xsl:if test="$text">
   			 <xsl:for-each select="THEME[contains(@label,substring-before($text, ' '))]">
				<THEME id="{@id}" label="{@label}"/>
			 </xsl:for-each>
   			 <xsl:call-template name="tokenizer">
    				 <xsl:with-param name="text" select="substring-after($text, ' ')"/>
   			</xsl:call-template>
  		</xsl:if>
	</xsl:template>

	<xsl:template match="LIST">
		<form action="display.asp" method="POST">
			<input type="text" name="UserString"/>
			<input type="submit" value="Go!"/>
		</form>
		<xsl:variable name="NodeSetMErecherchees">
			<xsl:call-template name="tokenizer"/>
		</xsl:variable> <!-- I get here the matched THEME elements corresponding
to the search in a node-set variable -->

		<xsl:for-each select="msxsl:node-set($NodeSetMErecherchees)/THEME">
			<xsl:value-of select="@label"/><br/>
		</xsl:for-each>
	</xsl:template>

</xsl:stylesheet>
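
One caveat about the stylesheet above: contains() is case-sensitive in XSLT
1.0, so the word "droit" does not match the label "Droit du travail" unless
both sides are brought to the same case first. A minimal sketch of that step
with translate(), assuming a hypothetical parameter $word that holds a single
search word (the names $word, $upper and $lower are mine, and accented letters
are not handled):

	<xsl:stylesheet version="1.0"
		xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
		<xsl:output method="html" indent="yes"/>
		<!-- hypothetical parameter: one search word -->
		<xsl:param name="word" select="'droit'"/>
		<xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>
		<xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>

		<xsl:template match="LIST">
			<!-- lower-case the search word once -->
			<xsl:variable name="w" select="translate($word, $upper, $lower)"/>
			<!-- lower-case each label before testing it -->
			<xsl:for-each select="THEME[contains(translate(@label, $upper, $lower), $w)]">
				<xsl:value-of select="@label"/><br/>
			</xsl:for-each>
		</xsl:template>
	</xsl:stylesheet>

The same translate() calls could be placed in the predicate of the tokenizer
template.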

The first problem is :

If the user searches for "droit travail", the search engine will display
the first THEME (id=12) twice because it contains 2 of the query words.
==> As you can see in the XSL, I put the whole result in a node-set variable,
which in this case would look like this:
		<THEME id="12" label="Droit du travail"/> 	--> matching "droit"
		<THEME id="2" label="droit social"/>		--> matching "droit"
		<THEME id="12" label="Droit du travail"/>		--> matching "travail"
		<THEME id="34" label="travail à la chaîne"/>	--> matching "travail"

THEME id=12 appears twice...
So now I'd like to delete the duplicated THEME in this node-set variable...
But I don't see any way to say <xsl:for-each select="distinct(THEME)">.
I could write a loop checking, for each node, whether there is another
identical one, but that gives the processor a lot of work and will make the
engine slower. Is there a simpler solution?

The second problem is (it's less necessary but would be great):
I'd like to highlight the searched words in the displayed result... how can I
do that?
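
A possible starting point, as a rough sketch only: a recursive named template
that cuts the label at each occurrence of one word and wraps that occurrence
in <b>. The template name "highlight" and the parameter $SearchWord are
hypothetical; the sketch handles a single word and is case-sensitive.

	<xsl:stylesheet version="1.0"
		xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
		<xsl:output method="html" indent="yes"/>
		<!-- hypothetical parameter: one search word -->
		<xsl:param name="SearchWord" select="'travail'"/>

		<xsl:template name="highlight">
			<xsl:param name="text"/>
			<xsl:param name="word"/>
			<xsl:choose>
				<xsl:when test="$word != '' and contains($text, $word)">
					<!-- text before the word, the word in bold, then recurse on the rest -->
					<xsl:value-of select="substring-before($text, $word)"/>
					<b><xsl:value-of select="$word"/></b>
					<xsl:call-template name="highlight">
						<xsl:with-param name="text" select="substring-after($text, $word)"/>
						<xsl:with-param name="word" select="$word"/>
					</xsl:call-template>
				</xsl:when>
				<xsl:otherwise>
					<!-- no further occurrence: output the remaining text as it is -->
					<xsl:value-of select="$text"/>
				</xsl:otherwise>
			</xsl:choose>
		</xsl:template>

		<xsl:template match="LIST">
			<xsl:for-each select="THEME[contains(@label, $SearchWord)]">
				<xsl:call-template name="highlight">
					<xsl:with-param name="text" select="@label"/>
					<xsl:with-param name="word" select="$SearchWord"/>
				</xsl:call-template>
				<br/>
			</xsl:for-each>
		</xsl:template>
	</xsl:stylesheet>

Highlighting several words at once would need one more level of recursion
over the word list, which this sketch does not attempt.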

Thanks for any advice.


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

