RE: Searchable XML

Subject: RE: Searchable XML
From: "Hunter, David" <dhunter@xxxxxxxxxxxx>
Date: Wed, 16 Jun 1999 11:34:16 -0400
Ben Robb [mailto:Ben@xxxxxxxxxx] writes:
> Problem is, though, our client wants this to be searchable....going back a
> year (that's 15k x 365 = 3.67 M if you bundle it into one XML file - a
> little large).
> Any thoughts?
> [I am using the IE5 parser to convert into a result string in ASP; my web
> server is running Site Server 3/ IIS4, if that makes a difference]

Searching through 3.67MB of text is probably doable, but it may not be
pretty.  (Well, maybe watching the hard drive light as your server opens
each individual file might be kind of pretty...)  But if you want that kind
of searching capability, you *might* be better off taking the data in
as an XML file, but then parsing it and sticking the relevant information
into a database.  When you do a search, and find the information you want,
have a server-side component that takes a recordset from that database, and
then converts it back into your XML format for you.
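
A rough sketch of that round trip, for what it's worth.  It uses Python and
SQLite purely for illustration (the setup discussed here is ASP with the IE5
parser), and the entries/entry/title/body names and the two helper functions
are made up for the example, not taken from anyone's real format:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical daily file; element and attribute names are assumptions.
DAILY_XML = """<entries>
  <entry date="1999-06-15"><title>First</title><body>Some text about XML</body></entry>
  <entry date="1999-06-16"><title>Second</title><body>Other text</body></entry>
</entries>"""


def load_entries(conn, xml_text):
    """Parse one incoming XML file and store each entry as a table row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entry (date TEXT, title TEXT, body TEXT)"
    )
    root = ET.fromstring(xml_text)
    rows = [
        (e.get("date"), e.findtext("title"), e.findtext("body"))
        for e in root.iter("entry")
    ]
    conn.executemany("INSERT INTO entry VALUES (?, ?, ?)", rows)


def search_as_xml(conn, term):
    """Search the body column and rebuild the matching rows as XML."""
    root = ET.Element("entries")
    cursor = conn.execute(
        "SELECT date, title, body FROM entry WHERE body LIKE ? ORDER BY date",
        ("%" + term + "%",),
    )
    for date, title, body in cursor:
        entry = ET.SubElement(root, "entry", date=date)
        ET.SubElement(entry, "title").text = title
        ET.SubElement(entry, "body").text = body
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
load_entries(conn, DAILY_XML)
result = search_as_xml(conn, "XML")
```

Here result holds an XML string containing only the matching entries, so
whatever consumes the XML never needs to know a database sits in between.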

In fact, if your XML file has a lot of data, you don't necessarily even have
to store each individual field in a separate column in your database; if you
KNOW that you're only going to search on some fields, you may only want to
put those fields into separate columns, and put the rest of the XML file
into its own column.  For example, suppose I have the following XML format:


(My XML examples are always so unrealistically simple.)  Further suppose
that I know for sure I'll only ever want to search by lastname and
social-insurance-number.  I can create a person table with a lastname
column, a social-insurance-number column, and an xml column which will store
the entire XML file.  Then I can write SQL queries like

SELECT xml FROM person WHERE lastname='Hunter'

This gives me my searching capability, and lets me use XML for all of its
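
To make the person table idea concrete, here is a minimal sketch, again in
Python with SQLite only as a stand-in for whatever database you use.  The
person document, its element names, the sin_number column name and the
store_person helper are all assumptions made up for illustration:

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical person document; the element names are assumptions.
PERSON_XML = """<person>
  <firstname>David</firstname>
  <lastname>Hunter</lastname>
  <social-insurance-number>000 000 000</social-insurance-number>
  <address>123 Example Street</address>
</person>"""


def store_person(conn, xml_text):
    """Copy only the searchable fields into columns; keep the whole file too."""
    doc = ET.fromstring(xml_text)
    conn.execute(
        "INSERT INTO person (lastname, sin_number, xml) VALUES (?, ?, ?)",
        (
            doc.findtext("lastname"),
            doc.findtext("social-insurance-number"),
            xml_text,
        ),
    )


conn = sqlite3.connect(":memory:")
# Two searchable columns plus one column holding the entire XML document.
conn.execute("CREATE TABLE person (lastname TEXT, sin_number TEXT, xml TEXT)")
# Index the columns we expect to search on.
conn.execute("CREATE INDEX person_lastname ON person (lastname)")
conn.execute("CREATE INDEX person_sin ON person (sin_number)")
store_person(conn, PERSON_XML)

# Same query as above, with the value passed as a parameter.
matches = [
    row[0]
    for row in conn.execute("SELECT xml FROM person WHERE lastname = ?", ("Hunter",))
]
```

Each string in matches is the stored document exactly as it came in, so
fields that never got their own column (the address here) are still there
when you parse it.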

Hope this helps, or gives you ideas for an even better solution.  :-)

David Hunter
MediaServ Information Architects
