Subject: Re: [xsl] Searching for values in XML using XSL using Saxon
From: Jacobus Reyneke <jacobusreyneke@xxxxxxxxx>
Date: Thu, 14 Oct 2010 17:34:04 +0200

Thank you both, Michael and Вячеслав.

You answered my main question: from a design point of view, it would appear
that XML as a datastore is viable, and that if Saxon should someday start
feeling the load, I can refactor the architecture to use exist-db for
enormous data sets.

One question regarding keys and indexing: is it possible to index
comma-separated values such as my keywords list, or should each keyword be a
separate element?

Thank you kindly for the help,
Jacobus

On 14 Oct 2010, at 2:58 PM, Вячеслав Седов wrote:

> 2010/10/14 Michael Kay <mike@xxxxxxxxxxxx>:
>> Handling thousands of topics shouldn't be a problem; if there were millions
>> I would consider an XML database.
>
> http://exist-db.org - very good case
>
>>
>> Search time in the document shouldn't be a problem if you can keep it (and
>> its indexes, whether xsl:key indexes or auto-generated Saxon indexes) in
>> memory. But repeated loading of the document from disk every time it's
>> needed could get very slow.
>>
>> Michael Kay
>> Saxonica
>>
>> On 14/10/2010 12:01 PM, Jacobus Reyneke wrote:
>>>
>>> Good day,
>>>
>>> I am trying to write a system on a pure XML data store. There are various
>>> reasons for doing this, but the most important are that I am always
>>> transforming the results and that the system's data structure is dynamic
>>> and hierarchical, so XML is a lovely fit.
>>>
>>> One part of my data will be large vocabularies of data, like dictionaries,
>>> and I would like to know from the experts if I'm going to run into trouble
>>> in the long term and should rather move to a relational database solution
>>> with proper indexing etc. I intend to use Saxon, simply because it's written
>>> in Java, it supports XSLT 2.0 and Michael has a good history of sticking
>>> behind his product.
>>>
>>> Other options may be using XML databases, but the visibility provided by
>>> free-standing XML files compared to an administrator console to a database
>>> is nice.
>>>
>>> The data will look something like this:
>>>
>>> <topic>
>>> <name>hamburger</name>
>>>
>>> <related-topics><topic-ref>food</topic-ref><topic-ref>dead-cows</topic-ref><topic-ref>health</topic-ref></related-topics>
>>> <keywords>burger, ketchup, mustard, hungry</keywords>
>>> <description>Hamburgers are nice, but are not always good for your health.
>>> They are especially bad for the health of the cow, but this is o.k. if you
>>> don't know the cow</description>
>>> </topic>
>>>
>>> These topics will be built on the fly during chatroom conversations, so
>>> the related-topics and keywords will not be known beforehand. Yet it's the
>>> related-topics and keywords that will be used on the fly to find matching
>>> topics and format them into diagrams, charts etc.
>>>
>>> In a couple of months' time there will be thousands of topics, so I am
>>> looking for a way to do this that will scale. Another problem is that some
>>> topics may be different in structure, e.g. a topic on cars may have
>>> a <max-speed> element, while one on houses may have a <price>, again another
>>> reason why a dynamic hierarchical data store makes more sense than a
>>> traditional relational database.
>>>
>>> If someone can give me some advice, or suggest an efficient search on
>>> something like the keywords, I will be very grateful.
>>>
>>> Kind regards,
>>> Jacobus
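
On the keyword-indexing question at the top of this message, here is a minimal sketch of one way it might be done in XSLT 2.0, not a definitive answer. In XSLT 2.0 the "use" expression of xsl:key may return a sequence, and the matched node is indexed under every value in it, so a comma-separated list can be split with tokenize() inside the key definition. The key name "topic-by-keyword" and the "keyword" parameter are made up for illustration, and the sketch assumes each topic has at most one <keywords> element.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the keyword to search for -->
  <xsl:param name="keyword" select="'mustard'"/>

  <!-- Index every topic once per keyword: split the list on commas,
       then trim whitespace and lower-case each token.
       Assumes at most one keywords element per topic. -->
  <xsl:key name="topic-by-keyword" match="topic"
           use="for $k in tokenize(keywords, ',')
                return lower-case(normalize-space($k))"/>

  <xsl:template match="/">
    <!-- Look up all topics indexed under the requested keyword
         and print one topic name per line -->
    <xsl:for-each select="key('topic-by-keyword', lower-case($keyword))">
      <xsl:value-of select="name"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

If each keyword were stored as its own element instead (say, a hypothetical <keyword> child per term), the key could simply be declared with use="keyword", so both layouts can be indexed; the comma-separated form just moves the splitting into the key definition.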