Re: [xsl] combine xml files

Subject: Re: [xsl] combine xml files
From: "Thomas B. Passin" <tpassin@xxxxxxxxxxxx>
Date: Thu, 11 Apr 2002 11:08:40 -0400
[Ming]
>
> I think I can make it more clear with an example:
>

Good.  Let me summarize what I think I understand:

1) Each search record is saved in a single xml file.

2) All contents of any one of these xml files pertain to a single work.

3) A single xml file may contain data obtained from several sources (the
"db" values).

4) All information relevant to a particular search result is contained in a
single xml file.

5) For formatting, reliability, or other reasons, information from a
particular source may be preferred over that from another (the db preference
order).  The preferred source may be different for titles than for authors.

6) The data from the most preferred source available is the data to be
displayed.
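
Points 5 and 6 amount to a first-match lookup over an ordered list of dbs.
Outside XSLT, a minimal Python sketch of that selection might look like the
following. It assumes the files are well-formed XML with quoted attribute
values; the names `PREFS`, `RECORD`, and `pick_preferred` are hypothetical,
and the sample data is copied from the example quoted below.

```python
import xml.etree.ElementTree as ET

# Hypothetical preference table, mirroring the DbPref.txt example.
PREFS = {"title": ["db2", "db1", "db3"], "author": ["db1", "db3", "db2"]}

# Sample record in the shape of the quoted example (attributes quoted so
# that the text parses as XML).
RECORD = """
<xml>
  <db1>
    <jauthor>
      <author db="db1"> Smith, J</author>
      <author db="db1"> Mou, S </author>
    </jauthor>
    <jtitle><title db="db1"> Preliminary study on network (II) </title></jtitle>
  </db1>
  <db2>
    <jauthor>
      <author db="db2"> Smith, JR </author>
      <author db="db2"> Mou, ST </author>
    </jauthor>
    <jtitle><title db="db2"> Preliminary Study on Network (II) </title></jtitle>
  </db2>
</xml>
"""

def pick_preferred(root, field, prefs=PREFS):
    """Return the text values of `field` from the most preferred db present."""
    for db in prefs[field]:
        values = [el.text.strip() for el in root.iter(field)
                  if el.get("db") == db and el.text and el.text.strip()]
        if values:
            return values  # first db in preference order that has data wins
    return []              # no db in the preference list supplied this field

root = ET.fromstring(RECORD)
title = " ".join(pick_preferred(root, "title"))
authors = "; ".join(pick_preferred(root, "author"))
print("Title:", title)
print("Author:", authors)
```

For the sample record this picks the title from db2 and the authors from db1,
which is the display you described.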

Now to check out a few things I am assuming:

a) The db preferences will be the same for all xml files.

b) The db preferences will either not change over searches, or only change
infrequently.

c) The number of different dbs is small and will always be known before a
search is processed (in case we want to hard-code them).

If all these things are correct, it should be fairly easy, modulo the time
needed to process 1000 files.
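
If assumptions a) through c) hold, the preference order can be read once and
reused for every file. A minimal Python sketch, assuming the space-delimited
"field: db db db" layout of the DbPref.txt example quoted below; `parse_prefs`
and `SAMPLE` are hypothetical names:

```python
def parse_prefs(text):
    """Parse lines like 'title: db2 db1 db3' into {'title': ['db2', 'db1', 'db3']}."""
    prefs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        field, _, dbs = line.partition(":")
        prefs[field.strip()] = dbs.split()  # db names in preference order
    return prefs

# Contents of the example preference file.
SAMPLE = """\
title: db2 db1 db3
author: db1 db3 db2
"""

prefs = parse_prefs(SAMPLE)
print(prefs)
```

Because the table is small and rarely changes, it could equally be hard-coded
or passed to a stylesheet as a parameter.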

Let us know if these things are correct.

Tom P

> My saved-search file is named C9A1876A75333C9.tomcat1 (the session id).
> Each entry is saved to this file after a user clicks the check box in
> front of each search result.
>
> In this file, the entries are like these:
> /records/sci01/1082-6068/30/1/69_DOU-PSOCFPGCWPIIC
> /records/sci02/0254-3052/24/10/892_BAI-SDJPLGGVRP
>
> Each entry is an xml file, and the format of each xml file is like this:
>
> <xml>
>   <db1>
>     <jauthor>
>       <author db="db1"> Smith, J</author>
>       <author db="db1"> Mou, S </author>
>     </jauthor>
>     <jtitle>
>       <title db="db1"> Preliminary study on network (II) </title>
>     </jtitle>
>   </db1>
>
>   <db2>
>     <jauthor>
>       <!-- note here: since it's the same article, the authors are the
>            same but displayed differently for a different database -->
>       <author db="db2"> Smith, JR </author>
>       <author db="db2"> Mou, ST </author>
>     </jauthor>
>     <jtitle>
>       <!-- same as author: same article, but the display title is
>            slightly different -->
>       <title db="db2"> Preliminary Study on Network (II) </title>
>     </jtitle>
>   </db2>
> </xml>
>
> And here is my preference file (it can be in any format; here I just put
> it in a text file with space-delimited format):
> filename: DbPref.txt
> content:
> title: db2 db1 db3
> author: db1 db3 db2
>
> Actually, there are about 6 dbs (from db1 to db6), and each xml file (or
> each record) can be in any one or more dbs.
>
> So, my job is to display something like this on the website:
> Title: Preliminary Study on Network (II)  <!-- note here, this title is
> from the title in db2, since db2 is the preferred title display database -->
> Author: Smith, J; Mou, S  <!-- note here, the authors are from the authors
> in db1, since db1 is the preferred author display database -->
>
> I've thought about this over and over again and think maybe the way you
> mentioned is a good idea. What I still need to do is add the preference
> information to the xml file (to do this, I may need to process each xml
> file in my java servlet and find the preference). Something like:
> <files>
>   <file title="db2" author="db1"> xml file 1 </file>
>   <file title="db3" author="db2"> xml file 2 </file>
>   <!-- note here, the record in xml file 2 is in db2 and db3 -->
> </files>
>
> I don't think I answered your question correctly, but I really don't know
> how to find a proper answer. So I gave you this complete scenario. I hope
> it helps to clarify the problem.
>
> Thanks a lot.
>
> Ming
>
> "Thomas B. Passin" wrote:
>
> > We're getting closer, I think.  If a work can appear, with a different
> > format, in more than one xml file, then how can you tell when an entry
> > in one file is for the same work as an entry in another file?  You need
> > to be able to do that, it would seem, or you won't be able to match up
> > entries.
> >
> > What data is contained in any one xml file?  Is it data on one single
> > work from one single database?  Is it many works, but all from one
> > database?  Is it one single work, but possibly from many databases?
> >
> > Are you expecting to get a fast response when looking through 1000 files
> > for each query?  How fast?  Or can it be a batch process?  Even doing a
> > directory listing of 1000 files can take some time, depending on your
> > system, and that's not doing any processing on the files.
> >
> > Cheers,
> >
> > Tom P
> >
> > [Ming]
> >
> > >
> > > To make my explanation easier to understand (sorry for the confusion),
> > > I'm going to describe my task.
> > >
> > > Actually I'm doing the "View Marked" function after a search.  The
> > > saved searches are stored in a temporary file with the session id as
> > > the file name, and each entry in the file is a complete path to an xml
> > > file. So the number of xml files listed in the temporary file can vary
> > > from 1 to 1000.  After the user clicks the View Marked button, I need
> > > to display the title and author information for each xml file to the
> > > user. So it's a dynamic process.
> > >
> > > The title format is slightly different for each database, and so are
> > > other fields such as author. That's why we have a preference list for
> > > titles, authors, etc.: different groups of people prefer different
> > > display formats for them.
> > >
> > > Yes, I need to look through each xml record, since some titles appear
> > > in only one database and some appear in more than one. So the <db*>
> > > tags are different, and I need to find the most preferred one to
> > > display from my preference list.
> > >
> > ...
> >
> >  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list



