RE: Searching huge xml-documents

Subject: RE: Searching huge xml-documents
From: "Ed Nixon" <ed.nixon@xxxxxxxxxxxxxxxxx>
Date: Wed, 14 Apr 1999 19:06:14 -0400
There was a posting on Robin Cover's XML News site last week or the week
before about an implementation of XQL from folks in Darmstadt. They have
implemented a 'compilation' mechanism that takes the DOM tree, indexes it
and writes it to disk. At that point it's possible to run XQL queries
against this file, either in memory or cached to disk.
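
To make the index-then-query idea concrete, here is a minimal sketch in
Python. It is not the Darmstadt XQL code, only an illustration of the same
approach under my own assumptions: the XML is streamed once, each entry is
stored in an indexed SQLite table, and lookups then run against that table
instead of the XML. The names build_index and lookup, the table layout and
the sample paths are all hypothetical.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET

# Tiny stand-in for one CD document; the paths are made up for illustration.
SAMPLE = b"""<cd_doc>
  <entries>
    <entry no="1" path="/cdrom/a/python_stuff.tar.gz"/>
    <entry no="2" path="/cdrom/b/clipart.gif"/>
  </entries>
</cd_doc>"""


def build_index(xml_source, db_path):
    """Stream the XML once and store every <entry> in an indexed table."""
    db = sqlite3.connect(db_path)
    # PRIMARY KEY gives an index on the entry number for fast lookups.
    db.execute(
        "CREATE TABLE IF NOT EXISTS entry (no INTEGER PRIMARY KEY, path TEXT)"
    )
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag == "entry":
            db.execute(
                "INSERT OR REPLACE INTO entry VALUES (?, ?)",
                (int(elem.get("no")), elem.get("path")),
            )
            elem.clear()  # drop attributes of entries already stored
    db.commit()
    return db


def lookup(db, no):
    """Return the path for a given entry number, or None if it is absent."""
    row = db.execute("SELECT path FROM entry WHERE no = ?", (no,)).fetchone()
    return row[0] if row else None


# ":memory:" keeps the example self-contained; pass a file name to persist.
db = build_index(io.BytesIO(SAMPLE), ":memory:")
print(lookup(db, 2))
```

The XML stays the portable master format, and the index file can be rebuilt
from it at any time; only the lookups go through the index.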

Perhaps this would be worth a look?				...edN

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxx]On Behalf Of Thomas Weholt
> Sent: Wednesday, April 14, 1999 6:54 AM
> To: xsl-list@xxxxxxxxxxxxxxxx
> Subject: Searching huge xml-documents
>
>
> hi,
>
> I was thinking of using XML as the file format for a CD database. Each CD
> can contain approx. 20000 files, like clipart or source code, and I need a
> fast method (I don't mind waiting 5-10 secs. to search 300-500 CDs, max
> 20000 entries per CD) to pick out items according to a given index.
>
> Something like
>
> <cd_doc>
> 	... info about the cd ...
> 	<entries>
> 	  <entry no="1" path="/cdrom/stugg/long path/more text/python_stuff.tar.gz" ... more info .../>
> 	  ... 199999 or so more entries
> 	</entries>
> </cd_doc>
>
> I want to search by entry no or sort by any other attribute, and perhaps
> generate a word index to speed up the process. The reason I want to use
> XML is that I use Java, Perl, Python and other programming languages on
> several platforms. XML is readable for humans and easy to put on the web.
> My main concern is speed. Storage space is not an issue. How fast is XSL?
> How fast are the available Java packages? Any thoughts?
>
> I don't even know if this is "doable", like generating indexes etc., but
> I would like to use XML at least for learning purposes.
>
> As an experiment I created an XML document with the structure above,
> containing 90000 entries, and searched for a given entry no using Xt and a
> simple XSL stylesheet. The result was a little slow. Has anybody tested the
> IBM Java tools for searching: not generating HTML, but just looking up a
> given element in a huge document? The result (in time) would be
> interesting. Xt probably has some overhead because it's written in Java
> (starting the VM and so on). If a Java app is already running, how fast
> can I locate several elements in a given XML document?
>
>
>
>
>
> ----------------------------------------------
>               Thomas Weholt
>        eMail : weholt@xxxxxxxxxxxxxx
>      HTTP://www.linuxfreak.com/~weholt
>         Phone : +47 - 92 09 59 68
> ----------------------------------------------
>
>
>
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>

