Re: Microsoft Index Server

Subject: Re: Microsoft Index Server
From: "Kent Fitch" <kent.fitch@xxxxxxxxxxxx>
Date: Wed, 26 May 1999 08:41:08 +1000
Wendy Cameron <wcam001@xxxxxxxxxx> wrote:

> Has anyone used Microsoft Index Server or any other search engine on
> XML files?
> 
> If anyone has, I'd appreciate knowing which search engine they used
> and if they did anything special to configure the search engine for
> XML, such as
> searching on a particular field in an XML file and/or
> returning specific fields from an XML file that is in the
> search results.

If you download Filtreg.exe (see
http://msdn.microsoft.com/library/techart/msdn_is-index.htm )
and run it on your system, you'll probably see that XML
files aren't being filtered by the HTML filter.

This was important for us, as we wanted to use the
metadata recognition in the HTML filter to allow us to
search on HTML-like metadata fields in our XML documents
using Index Server.  This is how we associated XML
files with the HTML filter:

 a. Using REGEDT32, find the file type registered for .xml,
    which is the default value of HKEY_CLASSES_ROOT\.xml
    (it will probably be 'xmlfile').

 b. Get the CLSID for the file type from
    HKEY_CLASSES_ROOT\ 'xmlfile' \CLSID
    (it might be {48123bc4-99d9-11d1-a6b3-00c04fd91555}).

 c. Under the CLSID for the file type, add a
    registry key called PersistentHandler:
    HKEY_CLASSES_ROOT\CLSID\ 'clsid of filetype' \PersistentHandler

 d. Set the unnamed (default) value of this key, of type REG_SZ, to
    {eec97550-47a9-11cf-b952-00aa0051fe20}

 e. Run filtreg again to confirm that .xml is now being filtered
    through the HTML filter.

 f. Rescan your documents with Index Server.
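
If you would rather not click through REGEDT32, steps c and d could
also be written out as a .reg file and imported. The following is only
a sketch, in Python, that builds the text of such a file. The function
name persistent_handler_reg is made up for illustration, and the
REGEDIT4 header is an assumption that suits NT 4 (later versions of
Windows expect "Windows Registry Editor Version 5.00" instead). You
still need to look up the CLSID yourself as in steps a and b.

```python
# Sketch only: builds .reg text for steps c and d; it does not touch the registry.

# Value from step d above.
HTML_FILTER_HANDLER = "{eec97550-47a9-11cf-b952-00aa0051fe20}"


def persistent_handler_reg(filetype_clsid, handler=HTML_FILTER_HANDLER):
    """Return .reg text that sets the default value of the
    PersistentHandler key under the given file type CLSID.

    filetype_clsid is the CLSID found in step b, braces included.
    """
    key = "\\".join(
        ["HKEY_CLASSES_ROOT", "CLSID", filetype_clsid, "PersistentHandler"]
    )
    lines = [
        "REGEDIT4",          # header line (assumed NT 4 style)
        "",
        "[" + key + "]",     # step c: the PersistentHandler key
        '@="' + handler + '"',  # step d: unnamed (default) REG_SZ value
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example CLSID from step b; yours may differ.
    print(persistent_handler_reg("{48123bc4-99d9-11d1-a6b3-00c04fd91555}"))
```

Save the output to a file, check it by eye, and import it with the
registry editor. Then carry on with steps e and f as before.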

For us, putting the XML fields we want to search on
into HTML-like metadata fields (actually, we repeat
them in a special <METADATA> section) worked fine.
It was easy because our XML is machine generated, but it
may not suit everyone.
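
To give an idea of what "repeating the fields" might look like when
the XML is machine generated, here is a rough Python sketch. The
markup is an assumption: it writes each field as an HTML-style
<meta name="..." content="..."> element inside <METADATA>, which is
one plausible reading of "HTML-like" and not necessarily the exact
layout we use. The function name add_metadata_section and the sample
field names are hypothetical. Check with your own documents that the
HTML filter picks the fields up.

```python
import xml.etree.ElementTree as ET


def add_metadata_section(doc_xml, fields):
    """Append a <METADATA> section holding one <meta> element per field.

    doc_xml is the XML document as a string; fields maps a metadata
    name to the text to be searched on. Returns the new document text.
    """
    root = ET.fromstring(doc_xml)
    section = ET.SubElement(root, "METADATA")
    for name, content in fields.items():
        # Assumed layout: HTML-style name/content attribute pair.
        ET.SubElement(section, "meta", {"name": name, "content": content})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = "<report><title>Soil survey</title><body>...</body></report>"
    print(add_metadata_section(doc, {"title": "Soil survey"}))
```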

To display these metadata field values using Index
Server's search result properties, you have to cache
the fields (using the Index Server Management Console plugin)
and rescan your documents.

Kent Fitch                           Ph: +61 2 6276 6711
ITS  CSIRO  Canberra  Australia      kent.fitch@xxxxxxxxxxxx


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

