Re: [xsl] Searching for values in XML using XSL using Saxon

Subject: Re: [xsl] Searching for values in XML using XSL using Saxon
From: Вячеслав Седов <schematronic@xxxxxxxxx>
Date: Thu, 14 Oct 2010 16:58:17 +0400
2010/10/14 Michael Kay <mike@xxxxxxxxxxxx>:
> Handling thousands of topics shouldn't be a problem; if there were millions
> I would consider an XML database.

http://exist-db.org is a very good fit for that case.

>
> Search time in the document shouldn't be a problem if you can keep it (and
> its indexes, whether xsl:key indexes or auto-generated Saxon indexes) in
> memory. But repeated loading of the document from disk every time it's
> needed could get very slow.
>
> Michael Kay
> Saxonica
>
> On 14/10/2010 12:01 PM, Jacobus Reyneke wrote:
>>
>> Good day,
>>
>> I am trying to write a system on a pure XML data store. There are various
>> reasons for doing this, but the most important is that I am always
>> transforming the results, and the system's data structure is dynamic and
>> hierarchical, so XML is a lovely fit.
>>
>> One part of my data will be large vocabularies, like dictionaries, and I
>> would like to know from the experts whether I'm going to run into trouble
>> in the long term and should rather move to a relational database solution
>> with proper indexing etc. I intend to use Saxon, simply because it's
>> written in Java, it supports XSLT 2.0 and Michael has a good history of
>> standing behind his product.
>>
>> Another option may be an XML database, but the visibility provided by
>> free-standing XML files, compared to going through a database's
>> administrator console, is nice.
>>
>> The data will look something like this:
>>
>> <topic>
>> <name>hamburger</name>
>> <related-topics>
>>   <topic-ref>food</topic-ref>
>>   <topic-ref>dead-cows</topic-ref>
>>   <topic-ref>health</topic-ref>
>> </related-topics>
>> <keywords>burger, ketchup, mustard, hungry</keywords>
>> <description>Hamburgers are nice, but are not always good for your health.
>> They are especially bad for the health of the cow, but this is o.k. if you
>> don't know the cow</description>
>> </topic>
>>
>> These topics will be built on the fly during chatroom conversations, so
>> the related-topics and keywords will not be known beforehand. Yet it's the
>> related-topics and keywords that will be used on the fly to find matching
>> topics and format them into diagrams, charts etc.
>>
>> In a couple of months' time there will be thousands of topics, so I am
>> looking for a way to do this that will scale. Another problem is that some
>> topics may differ in structure, e.g. a topic on cars may have a
>> <max-speed> element, while one on houses may have a <price>, which is
>> another reason why a dynamic hierarchical data store makes more sense
>> than a traditional relational database.
>>
>> If someone can give me some advice, or suggest an efficient search on
>> something like the keywords, I will be very grateful.
>>
>> Kind regards,
>> Jacobus
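
For the keyword search Jacobus asks about, the xsl:key indexes Michael
mentions could be sketched roughly as follows. This is only an illustration
under some assumptions, not a tested solution: it assumes all <topic>
elements sit in one source document, that each topic has at most one
<keywords> element holding a comma-separated list, and the parameter name
"keyword" and the key names are made up for the example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: the keyword to look up. -->
  <xsl:param name="keyword" select="'ketchup'"/>

  <!-- Index every topic under each of its comma-separated keywords,
       trimmed and lower-cased so lookups ignore case and spacing.
       Assumes at most one <keywords> element per topic. -->
  <xsl:key name="topic-by-keyword" match="topic"
           use="for $k in tokenize(keywords, ',')
                return lower-case(normalize-space($k))"/>

  <!-- Index every topic under each related topic it references. -->
  <xsl:key name="topic-by-related" match="topic"
           use="related-topics/topic-ref"/>

  <xsl:template match="/">
    <matches keyword="{$keyword}">
      <!-- Look the keyword up in the index instead of scanning all topics. -->
      <xsl:for-each select="key('topic-by-keyword', lower-case($keyword))">
        <match><xsl:value-of select="name"/></match>
      </xsl:for-each>
    </matches>
  </xsl:template>

</xsl:stylesheet>
```

The second key can be used the same way, e.g. key('topic-by-related', 'food')
to find topics that reference "food". As Michael notes, the index only pays
off if the document stays in memory between searches; rebuilding it on every
load from disk would lose most of the benefit.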
