Re: OT: XML Server dream

Subject: Re: OT: XML Server dream
From: Eric van der Vlist <vdv@xxxxxxxxxxxx>
Date: Mon, 25 Oct 1999 21:01:51 +0200
Liam,

Thanks for your post.

I like your pragmatic approach!

Eric

"Liam R. E. Quin" wrote:
> 
> I've lost count of the number of systems I have seen storing SGML or XML
> in a database.
> 
> I have seen 3 basic approaches.  Of these, one approach almost never
> works, or if it does, dramatically increases sales of snacks and
> coffee while people wait for a response when it's used.
> 
> The approaches are
> (1) decompose every element into a field in a relational or OO database.
>     With a relational database, this always seems to end up sad.
>     A wait of 30 seconds to a minute or more on half a million dollars
>     worth of server hardware is pathetic.
> 
> (2) decompose down to paragraphs, but store mixed content as blobs.
>     much faster, for most people, but you lose the ability to find
>     things like embedded part numbers that might have been the reason
>     for using the database in the first place.
> 
> (3) store documents in flat files.  Use the database to manage them,
>     and to store metadata.  Use external text retrieval.
>     Fast performance (sub-second response on a million-document
>     database with a middling SPARC server is plausible, or a few
>     seconds for a more complex text search).
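
Approach (3) above can be sketched roughly as follows. This is only a minimal illustration, not anything from Liam's own system: SQLite stands in for "the database", a plain substring scan stands in for a real external text retrieval engine, and the names `build_store`, `search` and the `docs` table are hypothetical.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_store(root, docs):
    """Write each document to a flat file; keep only metadata in the database.

    `docs` maps a file name to a (title, xml_text) pair. The schema is an
    assumption made for this sketch.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE docs (path TEXT PRIMARY KEY, title TEXT)")
    for name, (title, xml_text) in docs.items():
        path = Path(root) / name
        path.write_text(xml_text, encoding="utf-8")  # document stays a flat file
        db.execute("INSERT INTO docs VALUES (?, ?)", (str(path), title))
    db.commit()
    return db


def search(db, needle):
    """Return titles of documents whose file contains `needle`.

    The file scan is a stand-in for an external text retrieval package;
    the database is only consulted for paths and metadata.
    """
    hits = []
    for path, title in db.execute("SELECT path, title FROM docs ORDER BY path"):
        if needle in Path(path).read_text(encoding="utf-8"):
            hits.append(title)
    return hits


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        db = build_store(root, {
            "pump.xml": ("Pump manual", "<doc><p>Order part <pn>PN-1234</pn>.</p></doc>"),
            "valve.xml": ("Valve manual", "<doc><p>No parts listed.</p></doc>"),
        })
        print(search(db, "PN-1234"))
```

The point of the split is that the XML is never decomposed into fields, so embedded content such as part numbers stays searchable by whatever tool reads the files.
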
> 
> There are many variations on these.  It is possible to use an
> object oriented database in such a way as to give good response, but
> it is difficult.  It's possible to get good response for a specific
> application with a relational database too.  But if you compare the
> performance of Oracle (the market leader) with MySQL (free, does not
> support transactions, rollback, cursors), you see that you are not
> paying for performance.
> 
> Marc Rochkind mentions [1] a database that did 40,000 or more transactions
> per second on a PDP-11, but I doubt there was locking or rollback or
> journalling.  My own text retrieval package can do several million
> database operations a second, but again without locking.
> 
> Luckily, you don't need locking and rollback below the "file" level
> for most XML applications -- where "file" is the granularity at which
> documents are saved and/or edited.
> 
> If performance isn't a major issue, though, I agree that using the
> database is often the simplest way.  Some databases even support
> searching of text fields and BLOBs these days, which makes it more
> attractive.
> 
> Don't underestimate grep, by the way -- I've seen a good version of
> grep search over 50 megabytes a second, on a fairly low-end
> SPARC system (an SS10, you can't buy them that slow now).  You won't
> get that performance on a PC, usually, because the I/O just isn't
> there, even with a SCSI PCI system, but it's coming.  And two PCs
> in parallel are less than half the price of a SPARC Ultra.
> 
> You have to look at what staff you have.
> 
> If you have Unix programmers, the grep solution may be a good one,
> once you deal with normalising white-space.
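
The white-space point matters because XML text is often re-wrapped, so a phrase can be split across lines and a plain line-based search misses it. A minimal sketch of one way to deal with it, assuming it is acceptable to collapse every run of white-space to a single space before matching (the function names are hypothetical):

```python
import re

_WHITESPACE_RUN = re.compile(r"\s+")


def normalise(text):
    """Collapse each run of spaces, tabs and newlines to one space."""
    return _WHITESPACE_RUN.sub(" ", text).strip()


def contains_phrase(phrase, text):
    """Substring search that ignores how the text happens to be wrapped."""
    return normalise(phrase) in normalise(text)


if __name__ == "__main__":
    wrapped = "<p>see the part\n     number on the label</p>"
    print("part number" in wrapped)                 # naive search on raw text
    print(contains_phrase("part number", wrapped))  # search after normalising
```

This does not look inside markup, so a phrase interrupted by a tag would still need separate handling.
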
> 
> If you have SQL programmers, then any problem will seem to have a
> solution involving a database :-) and that's the way to go.
> 
> It's better to have a slow system that works, and that you can fix
> and extend, than a super-duper quantum rocket-science thingy that
> 99% works but no-one can do the last 1% unless you hire a wizard.
> There _are_ no wizards, only people.
> 
> The best solution is the one that works and can be supported in
> your environment.
> 
> Lee
> 
> --
> Liam Quin, Barefoot Computing, Toronto;  The barefoot agitator
> l i a m q u i n     at    i n t e r l o g    dot   c o m
> Ankh on irc.sorcery.net, ankle5/Ankle{MD} on DALnet
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

-- 
------------------------------------------------------------------------
Eric van der Vlist                                              Dyomedea

http://www.dyomedea.com                          http://www.ducotede.com
------------------------------------------------------------------------



