RE: [xsl] xslt, xml, and databases - howto combine?

Subject: RE: [xsl] xslt, xml, and databases - howto combine?
From: Graham Seaman <graham@xxxxxxxxx>
Date: Tue, 9 Apr 2002 20:06:13 +0100 (BST)
On Tue, 9 Apr 2002, Michael Kay wrote:

> You are struggling hard to find an alternative to using an XML database.
> 
> Why not give up the struggle, and use an XML database? After all, this is
> why they were invented!

Because I'm a cheap-skate (or alternatively, just have no money to spend
on this), am trying to use only free software,  and the only open source
native-xml database I know of is unfinished and currently just a wrapper
for a relational db anyway. And only has a Java interface, where I'm using
perl. Maybe not good reasons, but I'm stuck with them :-( 

But I guess I should take your reply as letting me know I'm trying to do
something unreasonable..

Graham


> 
> Michael Kay
> Software AG
> home: Michael.H.Kay@xxxxxxxxxxxx
> work: Michael.Kay@xxxxxxxxxxxxxx
> 
> > -----Original Message-----
> > From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> > Graham Seaman
> > Sent: 09 April 2002 15:58
> > To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> > Subject: [xsl] xslt, xml, and databases - howto combine?
> >
> >
> > Hi all,
> >
> > This question may be a little off-topic; if anyone thinks it belongs
> > elsewhere, please let me know of a more suitable list!
> >
> > I have a web site based on numbers of relatively small xml documents
> > (currently stored as flat files). The documents contain a mix
> > of simple
> > elements (strings, integers etc) and elements with longer stretches of
> > text.
> > Each HTML page is typically generated using xsl from a number of these
> > files. Given the filenames for the xml files, the xsl processing is
> > reasonably fast (it will never be a very high traffic site).
> >
> > The problem comes with identifying the filenames in the first place.
> > Currently, minimal information (id, filename,date) on each
> > file is held in
> > a database and a database lookup is used to select a set of files; the
> > filenames are passed to the xsl as a csv string. I only have
> > access to a
> > standard relational db, not xml-based ones.
> >
> > To do more complicated queries to select the set of files to
> > process for a
> > page by anything more than date, I could:
> >
> > a) Use xsl to search through each file using document(). I
> > believe this
> > would be impossibly slow (number of files is currently only
> > around 1,000,
> > but it will grow).
> >
> > b) Duplicate the simpler information from the xml files in
> > the database,
> > and search on this to identify files, combined with a search engine
> > indexer to index the text in the files. I would then have
> > duplicated data
> > to be kept coherent through each update; which makes the site rather
> > fragile.
> >
> > c) Move the whole of the flat files into the database and
> > generate the xml
> > from them on demand. All searches can then be directly on the
> > database,
> > and there's no duplication of data. But there's an extra
> > processing stage
> > needed to generate the xml for the xsl to process.
> >
> > c) sounds the most robust, but both b) and c) run into a problem: I'm
> > trying to keep the background site code (perl) and the xsl relatively
> > independent of the particular xml format; for example,
> > generating forms by
> > using xsl to process an xsd schema, so that schema changes
> > don't need the
> > site code to be rewritten. But if I force my xml into a relational
> > database, I lose this independence.
> > One way I can see to keep it is to assume a one-to-one correspondence
> > between field names and XPath expressions (eg. fields with names like
> > item/admin/poster/email), so that I can regenerate xml from
> > the database
> > without one-off rules which break if the schema changes. But
> > this means I
> > can only ever have one leaf node identified by a particular XPath
> > expression, which seems very limiting.
> > Another possibility might be to generate xml from the
> > database in multiple
> > stages: first a quite flat xml document which directly
> > reflects db field
> > names, which is then processed by xsl via a rule-set which
> > converts the
> > first stage xml to xml as defined in the schema. Yet another layer of
> > processing...
> >
> > Am I missing other ways round this? Is there some standard way to deal
> > with this problem? Or are pure xml databases the only way to avoid
> > kludges like this?
> >
> > Thanks for any advice
> > Graham
> >
> >
> >  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> >
> 
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 
> 


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


Current Thread