Subject: [xsl] How should I structure a huge XSLT dataset best?
From: Anthony Zawacki <zwacki@xxxxxxxxxx>
Date: Mon, 29 Sep 2003 16:32:22 -0400
Hello,

First, the question, and then I'll provide additional background that may influence the answers...

A customer will have the ability to edit data using the program of their choice, and provide the data in a comma-delimited file. The data will consist of two pieces of information: 3-6 digits, text string. The 3-6 digits represent the initial portion of a telephone number, and the text string specifies a treatment. There will be approximately 10,000 entries in this data file.

My application processes messages describing telephone calls. For each message, I apply a stylesheet to determine the treatment. My current stylesheet takes into account much more information than just the telephone number, and that processing logic will still be required.

I have two concerns.

1. I will be writing a program/script that accepts the CSV file and converts it into an XML document to be included in the stylesheet. I want to make sure that the output from this program is in a form that is as efficient as possible.

2. The XSLT will be executed many times, and needs to be as efficient as possible, executing in not more than a few milliseconds. This is a place where speed is a higher priority than memory constraints.

I've done a little looking around, and most of the concerns are the other way around, meaning that the XML data file is huge and the XSL file is tiny. This is the complete opposite of what I will be experiencing. The XML data that I will be processing is usually less than 1K.

Now, the background information:

My application is written in C++ using Xalan-C v1.6 on the AIX 5.2 platform. I have complete control over the XSLTs, but not over incoming/outgoing messages. Every stylesheet in my application is compiled at start-up time to maximize efficiency. Due to the requirement to insert/remove items from the message, I am using Xerces 2.3 Deprecated DOM objects with Xalan.
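The conversion step in concern 1 could be sketched roughly as follows. This is only an illustration in Python, not a recommendation of a particular output shape: the element and attribute names (`treatments`, `entry`, `prefix`, `treatment`) and the function name `csv_to_xml` are assumptions made up for the example, and the flat one-entry-per-row layout is just one possible form.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text):
    """Convert 'prefix,treatment' rows into a flat XML lookup table."""
    root = ET.Element("treatments")  # hypothetical element name
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        prefix, treatment = row[0].strip(), row[1].strip()
        # The message says each prefix is 3-6 digits; reject anything else.
        if not (prefix.isdigit() and 3 <= len(prefix) <= 6):
            raise ValueError(f"bad prefix: {prefix!r}")
        ET.SubElement(root, "entry", prefix=prefix, treatment=treatment)
    return ET.tostring(root, encoding="unicode")


# Made-up sample rows, not real customer data.
sample = "410,3\n41057,5\n"
print(csv_to_xml(sample))
```

A flat table like this keeps the converter trivial and leaves the indexing decision to whatever consumes the XML. Whether it is the fastest form for Xalan-C would still need to be measured.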
My first reaction, without any planning, is to create an XML document that is easily indexed to pick out the status of each telephone number. For example, if 41057 had a treatment code of 5, I would first look up 410, then 4105, then 41057, then 410571. The 410571 would not be found, so I would fall back to the 41057 answer. I have not yet implemented anything; I am in the design phase of how I should handle this, so I'm not sure what performance impact this would have. An obvious improvement would be to go in reverse, starting with all six digits, and stop when a match is found.

Also, to avoid full scans of the data, I thought about building a tree that would be indexed by a single digit at each level, so to find the answer, it would start by indexing by the 4. This would result in another tree that could be indexed by the [1], and continue down until there was a failure to match, similar to the first method, but hopefully more efficient.

The lack of obvious precedent for this type of work makes it much more difficult. I'm used to being able to search the list or look at the XSLT FAQ and see easy solutions, but this type of issue doesn't seem to have been addressed in the past. Or am I missing something?

Thanks,
Anthony Zawacki
410-571-7161
zwacki@xxxxxxxxxx

XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
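The two lookup strategies described in the message (longest prefix first against a flat table, and a tree indexed by one digit per level) can be compared in a small sketch. It is written in Python purely to show the logic and says nothing about how either would perform in XSLT. The function names, the `"treatment"` marker key, and the sample table are assumptions invented for the example. In XSLT 1.0 the flat-table variant would presumably map onto an `xsl:key` lookup, but that is untested here.

```python
def longest_prefix_lookup(table, number, min_len=3, max_len=6):
    """Try the longest candidate prefix first and stop at the first hit."""
    for length in range(min(max_len, len(number)), min_len - 1, -1):
        treatment = table.get(number[:length])
        if treatment is not None:
            return treatment
    return None


def build_trie(table):
    """Build a tree indexed by one digit per level."""
    root = {}
    for prefix, treatment in table.items():
        node = root
        for digit in prefix:
            node = node.setdefault(digit, {})
        node["treatment"] = treatment  # non-digit key marks a stored answer
    return root


def trie_lookup(root, number):
    """Walk digit by digit, remembering the deepest treatment seen."""
    node, best = root, None
    for digit in number:
        node = node.get(digit)
        if node is None:
            break  # no deeper match; fall back to the best answer so far
        best = node.get("treatment", best)
    return best


# Made-up sample data mirroring the 410 / 41057 example in the message.
table = {"410": "3", "41057": "5"}
trie = build_trie(table)
print(longest_prefix_lookup(table, "4105717161"))
print(trie_lookup(trie, "4105717161"))
```

With a hashed table, the first approach costs at most four lookups per call (lengths 6 down to 3), so the tree is not automatically the faster option. Which one wins depends on how the processor implements each kind of access.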