Subject: RE: [xsl] xslt, xml, and databases - howto combine?
From: "Michael Kay" <michael.h.kay@xxxxxxxxxxxx>
Date: Tue, 9 Apr 2002 17:54:10 +0100

You are struggling hard to find an alternative to using an XML database.

Why not give up the struggle, and use an XML database? After all, this is
why they were invented!

Michael Kay
Software AG
home: Michael.H.Kay@xxxxxxxxxxxx
work: Michael.Kay@xxxxxxxxxxxxxx

> -----Original Message-----
> From: owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> [mailto:owner-xsl-list@xxxxxxxxxxxxxxxxxxxxxx]On Behalf Of
> Graham Seaman
> Sent: 09 April 2002 15:58
> To: xsl-list@xxxxxxxxxxxxxxxxxxxxxx
> Subject: [xsl] xslt, xml, and databases - howto combine?
>
>
> Hi all,
>
> This question may be a little off-topic; if anyone thinks it belongs
> elsewhere, please let me know of a more suitable list!
>
> I have a web site based on a number of relatively small xml documents
> (currently stored as flat files). The documents contain a mix of simple
> elements (strings, integers, etc.) and elements with longer stretches
> of text. Each HTML page is typically generated using xsl from a number
> of these files. Given the filenames for the xml files, the xsl
> processing is reasonably fast (it will never be a very high-traffic
> site).
>
> The problem comes with identifying the filenames in the first place.
> Currently, minimal information (id, filename, date) on each file is
> held in a database, and a database lookup is used to select a set of
> files; the filenames are passed to the xsl as a csv string. I only have
> access to a standard relational db, not an xml-based one.
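>
> For concreteness, the kind of stylesheet involved looks roughly like
> this (a minimal sketch; the parameter name and output are invented):
>
>   <!-- XSLT 1.0: split a comma-separated filename list passed in as a
>        parameter and pull each document in with document() -->
>   <xsl:stylesheet version="1.0"
>       xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
>     <xsl:param name="files"/> <!-- e.g. "a.xml,b.xml,c.xml" -->
>
>     <xsl:template match="/">
>       <html><body>
>         <xsl:call-template name="process-list">
>           <xsl:with-param name="list" select="$files"/>
>         </xsl:call-template>
>       </body></html>
>     </xsl:template>
>
>     <!-- recursive split: handle the name before the first comma,
>          then recurse on the remainder -->
>     <xsl:template name="process-list">
>       <xsl:param name="list"/>
>       <xsl:variable name="head">
>         <xsl:choose>
>           <xsl:when test="contains($list, ',')">
>             <xsl:value-of select="substring-before($list, ',')"/>
>           </xsl:when>
>           <xsl:otherwise><xsl:value-of select="$list"/></xsl:otherwise>
>         </xsl:choose>
>       </xsl:variable>
>       <xsl:if test="string($head)">
>         <!-- filenames resolve relative to the stylesheet's location -->
>         <xsl:apply-templates select="document(string($head))/*"/>
>       </xsl:if>
>       <xsl:if test="contains($list, ',')">
>         <xsl:call-template name="process-list">
>           <xsl:with-param name="list"
>               select="substring-after($list, ',')"/>
>         </xsl:call-template>
>       </xsl:if>
>     </xsl:template>
>   </xsl:stylesheet>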
>
> To run more complicated queries - selecting the set of files to process
> for a page by anything more than date - I could:
>
> a) Use xsl to search through each file using document(). I believe this
> would be impossibly slow (the number of files is currently only around
> 1,000, but it will grow); see the sketch after c) below.
>
> b) Duplicate the simpler information from the xml files in the
> database, and search on this to identify files, combined with a
> search-engine indexer to index the text in the files. I would then have
> duplicated data to be kept coherent through each update, which makes
> the site rather fragile.
>
> c) Move the whole of the flat files into the database and generate the
> xml from them on demand. All searches can then be made directly on the
> database, and there's no duplication of data. But there's an extra
> processing stage needed to generate the xml for the xsl to process.
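>
> The sketch I have in mind for a), assuming a filelist.xml that lists
> every filename and a $who value being searched for:
>
>   <!-- brute force: open and parse every listed document on every
>        query; document() takes the node-set of @name attributes and
>        returns the union of the documents they name -->
>   <xsl:for-each select="document(/filelist/file/@name)">
>     <xsl:if test="/item/admin/poster/email = $who">
>       <!-- matched: emit a reference to this file -->
>     </xsl:if>
>   </xsl:for-each>
>
> i.e. each query costs a full parse of every file.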
>
> c) sounds the most robust, but both b) and c) run into a problem: I'm
> trying to keep the background site code (perl) and the xsl relatively
> independent of the particular xml format - for example, generating
> forms by using xsl to process an xsd schema, so that schema changes
> don't require the site code to be rewritten. But if I force my xml into
> a relational database, I lose this independence.
>
> One way I can see to keep it is to assume a one-to-one correspondence
> between field names and XPath expressions (e.g. fields with names like
> item/admin/poster/email), so that I can regenerate xml from the
> database without one-off rules that break if the schema changes. But
> this means I can only ever have one leaf node identified by a
> particular XPath expression, which seems very limiting.
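>
> To illustrate (names invented): a row would come back from the database
> as something like
>
>   <row>
>     <field name="item/admin/poster/email">poster@xxxxxxxxxxx</field>
>     <field name="item/title">Some title</field>
>   </row>
>
> and the nested xml could be rebuilt mechanically by splitting each name
> on '/'. But anything that repeats - several email leaves under one
> item, say - has no natural home in a single flat row.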
> Another possibility might be to generate the xml from the database in
> multiple stages: first a quite flat xml document which directly
> reflects the db field names, which is then processed by xsl via a
> rule-set that converts the first-stage xml into xml as defined in the
> schema. Yet another layer of processing...
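>
> The rule-set itself would be ordinary xsl, e.g. (sketch only; element
> names invented):
>
>   <!-- stage 2: reshape the flat, db-named elements into the shape the
>        schema defines -->
>   <xsl:template match="row">
>     <item>
>       <title>
>         <xsl:value-of select="field[@name='item/title']"/>
>       </title>
>       <admin>
>         <poster>
>           <email>
>             <xsl:value-of
>                 select="field[@name='item/admin/poster/email']"/>
>           </email>
>         </poster>
>       </admin>
>     </item>
>   </xsl:template>
>
> so that knowledge of the schema stays in the xsl rather than leaking
> into the perl.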
>
> Am I missing other ways round this? Is there some standard way to deal
> with this problem? Or are pure xml databases the only way to avoid
> kludges like this?
>
> Thanks for any advice
> Graham


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

