Subject: Practical Suggestion for XSLT Performance Improvement From: "Clark C. Evans" <clark.evans@xxxxxxxxxxxxxxxxxxxx> Date: Wed, 6 Oct 1999 22:21:04 -0400 (EDT) |
A Practical Suggestion for XSLT Performance Improvement Summary: In this document a method is suggested for dramatically improving XSLT processor performance (2-10x or more) for financial information or other 'high volume' information streams. It is shown how the time to produce a summary report for an 100K input stream changes from two and a half minutes to 14 seconds. It is also suggested how this number could be further reduced with an alternative XSLT processor design. Outline: 1. Application domain 2. Design of an appropriate benchmark 3. A solution using existing XSLT constructs 4. "Functional Programming" and "Touch It Once" 5. A possible solution: Child to parent named pipes? 6. A practical suggestion: Aggregate hashtables 7. Conclusion: Perspective is everything 8. Credits and References Appendix: A. Sample source data set. B. Sample report C. Generator for source data set. D. Stylesheet using existing constructs E. Implementation of new construct for jclark's XT. F. Stylesheet using new constructs 1. Application domain One of the larger 'obvious' application domains for XSL is the reporting of business transaction and/or financial information. Typically these reports summarize the information according to two or more categories to produce a cross-tabulated result. In lighter weight cases, a simple total for an order is required. In either case, summary information is generated based on the content of the source data. 2. Design of an appropriate benchmark To test various alternatives, the time required to generate a two dimensional cross-tabulated summary report seems like the best objective. For this summary, we assume that the axis may be known by the stylesheet author, however, the actual categories (labels) on the access may not. Thus, the stylesheet should discover these categories in addition to generating tables. It is also assumed that the categories (labels) will be sorted. For example, information about orders (de-normalized here) such as: +-------+-----------+---------+--------+ | Order | DayofWeek | Product | Amount | |-------+-----------+---------+--------+ | 1 | monday | t-shirt | 5.50 | | 1 | monday | jeans | 25.00 | | 2 | monday | boxers | 9.50 | | ... | | 99 | friday | jeans | 25.00 | +-------+-----------+---------+--------+ Could produce the following summary: +--------+---------+- -+--------+----------+ | | monday | ... | friday | total | +--------+---------+- -+--------+----------+ |boxers | 9.50 | | | 9.50 | |jeans | 50.00 | | 100.00 | 550.00 | ... ... ... ... ... |t-shirt | 11.00 | | 27.50 | 79.00 | +--------+---------+- ... -+--------+-------- -+ | total | 122.25 | | 525.75 | 1,650.25 | +--------+---------+- -+--------+----------+ Of course, day of the week could be known in advance, but for the benchmark we do not assume this, since it could very well be any other category. In addition, due to complexity in the order, there may be several layers deep depending on how the data is structured. For instance, shift information sales person, comments, and other details about weekly store events may also be included in the source document. Thus, much may be discarded in order to extract information relevant. Unfortunately, this benchmark did not put in significant 'irrelevant' data. However, a three level hierarchy was used. Here is a snippet of the input data stream used: <upload> <day> <dow>tue</dow> <order> <line> <product>Hat</product> <price>11.43</price> </line> ... 3. A solution using existing XSLT constructs Note: This is the best solution I could muster, if there are other solutions out there that are better this is great; let's see them. To produce the summary report, the stylesheet must first generate a list of the categories for both axes. This is done with the following: <xsl:variable name="product-list" select="//product[not(.=following::product)]" /> Here the product list is made of all products which do not have a product entry matching it later in the document. Thus the first 't-shirt' is picked up, and remaining 't-shirt' nodes are discarded. In XSL, the result is a node list, so, one must be careful to only use information in the node that was used to derive uniqueness. Once the category lists have been identified, then for-each loops can be recursively nested, with a sum, filtered by the appropriate category. See the appendix for more detail. Here is a snippet: <xsl:for-each select="$product-list"> <xsl:variable name="product" select="." /> ... <xsl:for-each select="$day-list"> <xsl:variable name="day" select="." /> ... <xsl:value-of select="sum(//price[../product=$product] [../../../dow=$day])" /> . Although I am no expert with jclark's processor, I believe that the entire tree is processed, as a minimum, count(day-list)*count(product-list) times. Here are the timing results for this stylesheet: Tiny (1.6K) 6.01 6.24 6.18 6.22 6.05 [ 6.14] Small (10.3K) 16.68 16.55 16.51 16.43 16.64 [ 16.56] Medium (104K) 145.27 145.51 145.11 144.89 145.33 [ 145.22] Large (974K) 1334.99 1335.26 1334.73 1335.03 [1335.01] (Linux; PPro 180; 128MB; JClark's XT) Although performance may be acceptable for tiny documents, in this domain medium size documents (50-200K) are the norm, and 2"41" is unacceptable performance. Five to fifteen seconds would be more in-line with user expectations; not minutes. 4. "Functional Programming" and "Touch It Once" Structure and Interpretation of Computer Programs, Abelson has some wonderful comments which are relevant to XSL. In the first two chapters of the book Abelson teaches functional programming (lack of assignment operator) and the substitution evaluation model. Then, in chapter three, he introduces assignment and shows how it is "a powerful technique for maintaining a modular design". A few pages however, he goes on to warn: "In contrast to functional programming, programming that makes extensive use of assignment is known as imperative programming. In addition to raising complications about computational models, programs written in imperative style are susceptible to bugs that cannot occur in functional programs." (p234) This quote is important since the designers of XSL have been striving to keep the language functional. That is, they wish to prevent: "expressions which will produce different values depending upon the order in which the subexpressions in a combination are evaluated" (p238) To follow this goal, a template (function) can create its own local variables. In addition, the template can read variables and parameters declared globally, but cannot write to them. Furthermore, a template can pass parameters to another template called directly. And finally, it is possible to specify parameters for templates invoked for a child element match. Thus, the designers have achieved their goal of a completely functional language; which is a fantastic quality. In fact, the designers of XSLT have designated a exquisite recursive template matching system for XML stream processing! Unfortunately, XSL's communication system is one way; parameters can be passed to templates, but no return is admitted. This is, of course, exactly the opposite of what a summary report must do -- aggregate information upward. Due to this current limit, XSL's ability to process summary reports is severely hampered. The currently prescribed approach is to by-pass the beautifully designed recursive template mechanism and to iterate through the in-memory document object using the for-each construct. Hardly a divine situation. In fact, the bias towards this approach is now apparent in the popular tutorials. Seldom seen is a recursive template; often shown is a brute force multi-pass for-each example. The affects of this broken communication mechanism in the XSL standard cannot be under-stressed; every implementation now uses an in-memory document object model (DOM) or like mechanism for intermediate storage. This increases memory usage and CPU cycles -- dramatically slowing down processing and putting increasing hardware constraints for its deployment. This is completely absurd, since there is no reason why most summary reports should require multiple passes. To quote Grandmother Weatherfax: "Touch it once! Or don't touch it at all." Realistically, this is a hard and there are few possibilities. We could introduce an assignment operation, but Abelson clearly warns us: "this power extracts a price: the loss of referential transparency, giving rise to a thicket of questions about sameness and change, and the need to abandon the substitution model of evaluation in favor of the more intricate environment model [...] by introducing assignment we are forced to admit _time_ [synchronization, mutex, and other devices] into our computational models. (p297)" Perhaps the answer can be found, as Abelson suggests, with a stream processing model which "lets us model systems that have state without ever using assignment or mutable data." (p317) Abelson presents a particularly interesting example where a bank account object is re-written as a functional stream (p355). Here the state is external, making the approach functional. Clearly this approach is at the heart of XSL; where the goal of each processor is to be stateless. However, all excitement aside, as more elaborate usage of stream processing is put to practice, multiple streams must be introduced, and thus a stream merge mechanism is required. Abelson concludes the chapter saying: "Thus, in an attempt to support the functional style, the need to merge inputs from different agents reintroduces the same problems that the functional style was meant to eliminate." (p357) "We can model the world as a collection of separate, time-bound, interacting objects with state, or we can model the world as a single, timeless stateless unity. Each view has powerful advantages, but neither view alone is completely satisfactory" (p357) 5. A Possible Solution: Child to Parent named pipes. So if we want to "touch it once", Abelson has left us with two choices: (a) Introduce a mutable object or (b) introduce multiple data streams and deal with the merge problem. I chose the former since it is well understood. But perhaps it would be fun to talk about the multiple-stream case just for kicks. Suppose each parent could open one or more named pipe(s), which in effect is the communication channel from the children back to the parent element: <xsl:template match="\" > <xsl:apply-templates> <xsl:open-pipe name="amount-by-category" /> </xsl:apply-templates> ... I'll come back here later ... </xsl:template> Then, each of the "child" templates could write to this pipe: <xsl:template match="order" > <xsl:write-pipe name="amount-by-category" > <xsl:attribute name="category" select="product" /> <xsl:attribute name="amount" select="price*quantity" /> </xsl:write-pipe> <table> <tr><td>Order#</td><td><xsl:value-of select="order-number" /></td></tr> <tr><td>Product</td><td><xsl:value-of select="product" /></td></tr> <tr><td>Color</td><td><xsl:value-of select="color" /></td></td> ... pretty print rest of order ... </table> </xsl:template> In essence, the write-pipe directive tells the XSL processor what fragment needs to be stored in memory for later use; allowing all of the remaining information (such as the color and order-number) to be discarded. This would, for instance, put into temporary internal storage an internal DOM object or event stream which could then be accessed by the root template. <amount-by-category> <fragment category="X" amount="2.38" /> <fragment category="X" amount="2.62" /> <fragment category="Y" amount="3.43" /> ... <fragment category="X" amount="2.50" /> </amount-by-category> Which could then be accessed by the root template: <xsl:template match="\" > <xsl:apply-templates> <xsl:open-pipe name="amount-by-category" /> </xsl:apply-templates> <xsl:for-each select="$amount-by-category" /> <xsl:sort select="category" /> <xsl:if test="category=previous-sibling::category" > , </xsl:test> <xsl:if test="not(category=previous-sibling::category"> <xsl:value-of select="category" /> : </xsl:if> <xsl:value-of select="amount" /> <xsl:for-each> </xsl:template> Which would produce: X: 2.38, 2.62, ... , 2.50 Y: 3.43, ... ... Most of the time, however, only a summary is required; so keeping all of the detail information in memory would be painful. So, perhaps another syntax could be used to convert the "pipe" into an "accumulator": <xsl:template match="\" > <xsl:apply-templates> <xsl:open-pipe name="amount-by-category" > <xsl:accumulate using="sum" select="amount" /> </xsl:open-pipe> </xsl:apply-templates> <xsl:for-each select="$amount-by-category" /> <xsl:sort select="category" /> sum( <xsl:value-of select="category" /> ) equals <xsl:value-of select="amount" /> <xsl:for-each> </xsl:template> Which would produce: sum(X) = 7.50 sum(Y) = ... ... In the multi-dimentional cases, this gets to be very powerful, yet lean on memory and processing time. And, if you allow the pipe to be passed as a non-mutable variable; things get even more fun: <xsl:template match="\" > <xsl:apply-templates> <xsl:open-pipe name="amount-by-category" > <xsl:accumulate using="sum" select="amount" /> </xsl:open-pipe> </xsl:apply-templates> <xsl:call-template name="print-amount-by-category > <xsl:with-param name="amount-by-category select="$amount-by-category" /> <xsl:open-pipe name="total-amount" > <xsl:accumulate using="sum" select="amount" /> </xsl:open-pipe> </xsl:call-template> Grand Total: <xsl:value-of select="$total-amount" > </xsl:template> <xsl:template name="print-amount-by-category" > <xsl:param name="amount-by-category"/> <xsl:write-pipe name="total-amount"> <xsl:attribute name="amount" select="amount" /> </xsl:write-pipe> <xsl:for-each select="$amount-by-category" /> <xsl:sort select="category" /> sum( <xsl:value-of select="category" /> ) equals <xsl:value-of select="amount" /> <xsl:for-each> </xsl:template> Which would produce: sum(X) = 7.50 sum(Y) = ... ... Grand Total: ... The obvious advantage here is that summaries of sums are far less expensive to calculate when compared to a grand total over every amount in the incoming stream. Thus, this is would be far faster than, say: <value-of select="sum(\\amount)" /> Of course, there are also a few "minor" problems, namely how to merge multiple streams as pointed out by Abelson. I'm not sure if this is a show stopper. Thoughts? 6. A practical suggestion: Aggregate hash-tables The way I solved the problem, introducing a mutable object (namely java.util.Hashtable) is far less elegant than the method described above. However, with 10K file, generating a summary report this way turned out be significantly faster than the current approach. The solution was put in place by building off of jclark's XT processor. A static Java class was created and is accessed via XSL using the ht name-space. Most Hashtable member functions have been wrapped with a static versions. Since multiple hash tables are needed to store different types of information, the first argument to each function is the name of the hashtable. Internally, it is actually a Hashtable of Hashtables. Here is an example of putting items into the hash table: <xsl:template match="day" > <xsl:variable name="day" select="dow" /> <xsl:value-of select="ht:put('general','day',string($day))" /> <xsl:value-of select="ht:put('day',string($day),'')"/> <xsl:apply-templates select="order" /> </xsl:template> Here, each day that is found is put into the 'day' table, a list of days encountered. In this example, I also set the 'day' key in the 'general' table to reflect the current day. I do this so that a child template can know which day it is processing. Yes, one could use "../../day" to get this information at the child level, but my goal was to prove the hashtable for general parameter passing. One could also use the with-param mechanism; however, this only seems to work for immediate descendants of the current template. <xsl:template match="line" > <xsl:variable name="day" select="ht:get('general','day')" /> <xsl:variable name="product" select="product" /> <xsl:variable name="price" select="price" /> <xsl:value-of select="ht:put('product',string($product),'')"/> <xsl:value-of select="ht:add('price',concat($day,$product),number($price))"/> </xsl:template> In the above fragment, the ht:get is used to retrieve the current day which was "put" by the day template. Next, the product and price from the current node set are moved into variables; this is done so that they can be used in expressions later. Following this, the current product is put into the 'product' list which contains all product names encountered (the hash table key is unique). For efficiency, a ht:add(table,key,value) is implemented, it is identical to: ht:put(table,key,value+ht:get(table,key)). The last instruction uses ht:add to store the price using a concatenated key of day and product name. To make the hash table inter-operable with XSLT, a to-node-list function is provided. This copies the contents of the Hashtable into a node list of keys. Unfortunately, I was not able to return a node list of key/value pairs. Thus, the following fragment: <xsl:template match="/" > <xsl:apply-templates/> <xsl:for-each select="ht:to-node-list('day')" > <xsl:value-of select="."/> </xsl:for-each> ... First runs all templates -- filling in the 'day', 'product', and 'price' hash tables. Then, the list of keys from the 'day' hash table is retrieved using the to-node-list function. <xsl:template match="/"> ... <xsl:for-each select="ht:to-node-list('product')" > <xsl:variable name="product" select="." /> <xsl:for-each select="ht:to-node-list('day')" > <xsl:variable name="day" select="." /> <xsl:variable name="price" select="ht:get('price',concat($day,$product))" /> <xsl:value-of select="$price" /> <xsl:value-of select="ht:add('day',string($day),number($price))"/> <xsl:value-of select="ht:add('product',string($product),number($price))"/> </xsl:for-each> <xsl:value-of select="ht:get('product',string($product))" /> </xsl:for-each> ... In the above fragment, the 'product' and 'day' lists are used in a nested iteration. Note how the concatenated day/product is used to retrieve the sum(price) for that combination. Once retrieved, the sum is printed and while within the loop, a sum for each day and product is generated using ht:add. <xsl:template match="/" > ... <xsl:for-each select="ht:to-node-list('day')" > <xsl:variable name="day" select="." /> <xsl:variable name="price" select="ht:get('day',string($day))" /> <xsl:value-of select="$price" /> <xsl:value-of select="ht:add('general','total',number($price))"/> </xsl:for-each> <xsl:value-of select="ht:get('general','total')" /> </xsl:template> In this last fragment, the template once again iterates through the list of days. This is used to generate the last line of the table (summaries by day). In this case, once the price is obtained, it is used to generate the sum for the grand total. ... Performance measurements: tiny 1.6K : 6% faster (traditional) 6.01 6.24 6.18 6.22 6.05 [6.14] (/w hash sum) 5.83 5.77 5.71 5.68 5.85 [5.76] small 10.3K : 2.44 times faster. (traditional) 16.68 16.55 16.51 16.43 16.64 [16.56] (/w hash sum) 6.69 6.72 6.79 6.83 6.83 [ 6.77] medium 104K = 10.56 times faster (traditional) 145.27 145.51 145.11 144.89 145.33 [145.22] (/w hash sum) 13.79 13.78 13.80 13.73 13.69 [ 13.75] large 974K = 17.53 times faster (traditional) 1334.99 1335.26 1334.73 1335.03 [1335.01] (/w hash sum) 77.9 77.25 77.52 78.50 77.48 [ 77.53] 7. CONCLUSION By augmenting current functional approach using a mutable object like a hash table, measurable performance increase is realized. For a medium size file (100K) of financial information, the processor can generate results in thirteen (13) seconds as opposed to two minutes and forty one seconds. I believe that it is possible to get response time far below five (5) seconds if the processor does not generate an in-memory node tree before moving on to processing. Thus, the performance increase from moving from an object based to to an event driven based processor is very measurable and well worth researching. One might complain that the syntax of the hash-table system is too complicated, however, this could be made much better /w integration into the standard. Although the traditional template approach seems easier once xpath is internalized. It is also possible that an "optimizing" XSLT processor could figure out how to compute partial sums effectively; I would love to see the problem attacked this way. Overall, I'd love to see child to parent communication. With a technique like this, a hash-table approach seems archaic. The goal, of course, is to eliminate the iterative "procedural" approach to stylesheet authoring, and instead develop on a recursive "event-driven" model. This could bring several benefits, among them: 1. We increase modularity. With recursive applications of this each type of child may need to summarize differently. Thus, we can have many "listeners" to the event stream doing different things and combinable in different ways. The alternative approach (in practice) is to use large nested <for-each loops. 2. The XSL processor has greatly reduced memory requirements since a document object model of the incoming stream does not have to be generated. Practically, a processor would have to check to see if any "wandering xpath" constructs were in use; or see if any for-each loops cause folding. Kay, could SAXON perhaps be an XSLTS, S for "stream" and "speed" ? Anyway, this is intended to be a departure point for further discussion, not a 'canned' solution. 8. Credits and References A. Big thanks to the XML and XSL community. It is not very often to find such an open community willing to discuss ideas. Especially members of the xml-dev and xsl-list servers. B. To Harold Abelson and Gerald Sussman, for writing Structure and Interpretation of Computer Programs. Most excellent book packed with great ideas. I have much more to learn from it. Also in this must-read category is the Dragon Book by Aho, Sethi and Ullman. C. Thanks to Dan Palanza, a builder in Cape Cod which has driven this book home to me; especially the parallels he draws from this book to bookkeeping, building contract work, and the ICHING. D. To James Clark for his wonderful XT, to David Megginson for SAX, to Kay Mitchael for SAXON, and to David Carlisle for his help. Appendix A. Sample source data set. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <upload> <day> <dow>tue</dow> <order> <line> <product>T</product> <price>11.96</price> </line> <line> <product>Pants</product> <price>12.13</price> </line> </order> <order> <line> <product>Underwear</product> <price>13.47</price> </line> </order> </day> <day> <dow>fri</dow> <order> <line> <product>Tie</product> <price>1.91</price> </line> <line> <product>Pants</product> <price>14.0</price> </line> </order> <order> <line> <product>T</product> <price>9.50</price> </line> </order> <order> <line> <product>Coat</product> <price>4.65</price> </line> </order> <order> <line> <product>Shirt</product> <price>11.20</price> </line> </order> </day> <day> <dow>tue</dow> <order> <line> <product>Boots</product> <price>4.24</price> </line> </order> <order> <line> <product>Overalls</product> <price>11.89</price> </line> </order> </day> </upload> Appendix B. Sample report (has been edited for clarity) ~~~~~~~~~~~~~~~~~~~~~~~~~~~ <html xmlns="http://www.w3.org/TR/REC-html40"> <body> <table> <tr><td> </td><th> fri </th><th> tue </th> </tr> <tr><td>Boots </td><td> </td><td> 4.24 </td><td> 4.24 </td></tr> <tr><td>Coat </td><td> 4.65 </td><td> </td><td> 4.65 </td></tr> <tr><td>Overalls </td><td> </td><td> 11.89 </td><td> 11.89 </td></tr> <tr><td>Pants </td><td> 14.00 </td><td> 12.13 </td><td> 26.13 </td></tr> <tr><td>Shirt </td><td> 11.20 </td><td> </td><td> 11.20 </td></tr> <tr><td>T </td><td> 9.50 </td><td> 11.96 </td><td> 21.46 </td></tr> <tr><td>Tie </td><td> 1.91 </td><td> </td><td> 1.91 </td></tr> <tr><td>Underwear</td><td> </td><td> 13.47 </td><td> 13.47 </td></tr> <tr><td> </td><td> 41.26 </td><td> 53.69 </td><td> 94.95 </td></tr> </table> </body> </html> Appendix C. Generator for source data set ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ import java.io.*; import java.util.*; public class MakeSampleData { /* TINY SMALL LARGE private static final int DAYS = 10 10 50 private static final int ORDERS = 5 15 250 private static final int PRODUCT = 2 15 100 */ private static final int DAYS = 2; private static final int ORDERS = 10; private static final int PRODUCT = 15; private static Random gen = new Random(); public static void main (String [] args ) { Vector prod = new Vector(); prod.addElement("Shirt"); prod.addElement("Hat"); prod.addElement("Pants"); prod.addElement("Sweater"); prod.addElement("Overalls"); prod.addElement("Underwear"); prod.addElement("Scarf"); prod.addElement("Jeans"); prod.addElement("Belt"); prod.addElement("Cuffs"); prod.addElement("Tie"); prod.addElement("T"); prod.addElement("Coat"); prod.addElement("Umbrella"); prod.addElement("Jacket"); prod.addElement("Boots"); Vector days = new Vector(); days.addElement("mon"); days.addElement("tue"); days.addElement("wed"); days.addElement("thu"); days.addElement("fri"); days.addElement("sat"); System.out.println("<upload>"); for(int day = 1; day < DAYS; day++ ) { System.out.println(" <day>"); System.out.println(" <dow>" + days.elementAt( day % 6 ) + "</dow>" ); for(int order = 1; order < (1+ rand(ORDERS)); order++ ) { System.out.println(" <order>"); for(int product = 1; product < (1+ rand(PRODUCT)); product++ ) { System.out.println(" <line>"); System.out.println(" <product>" + prod.elementAt(rand(prod.size())) + "</product>"); System.out.println(" <price>" + (1+rand(14))+ "." + rand(100) + "</price>"); System.out.println(" </line>"); } System.out.println(" </order>"); } System.out.println(" </day>"); } System.out.println("</upload>"); } private static int rand(int mod) { int val = (gen.nextInt() % mod) ; if ( val < 0 ) return -val; return val; } } Appendix D. Stylesheet for existing constructs ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <xsl:stylesheet xmlns:xsl="http://www.w3.org/XSL/Transform/1.0" xmlns="http://www.w3.org/TR/REC-html40" result-ns=""> <xsl:variable name="day-list" select="//dow[not(.=following::dow)]" /> <xsl:variable name="product-list" select="//product[not(.=following::product)]" /> <xsl:template match="/"> <html> <body> <table> <tr> <td><xsl:text> </xsl:text></td> <xsl:for-each select="$day-list"> <xsl:sort order="ascending" select="." /> <th><xsl:value-of select="."/></th> </xsl:for-each> </tr> <xsl:for-each select="$product-list"> <xsl:sort order="ascending" select="." /> <xsl:variable name="product" select="." /> <tr> <td> <xsl:value-of select="$product" /> </td> <xsl:for-each select="$day-list"> <xsl:sort order="ascending" select="." /> <xsl:variable name="day" select="." /> <td> <xsl:value-of select="sum(//price[../product=$product][../../../dow=$day])" /> . </td> </xsl:for-each> <td> <xsl:value-of select="sum(//price[../product=$product])" /> . </td> </tr> </xsl:for-each> <tr> <td><xsl:text> </xsl:text></td> <xsl:for-each select="$day-list"> <xsl:sort order="ascending" select="." /> <xsl:variable name="day" select="." /> <td> <xsl:value-of select="sum(//price[../../../dow=$day])" /> </td> </xsl:for-each> <td> <xsl:value-of select="sum(//price)" /> </td> </tr> </table> </body> </html> </xsl:template> </xsl:stylesheet> E. Implementation of new construct for jclark's XT. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ package com.clarkevans.xtk; import java.io.*; import java.util.Hashtable; import java.util.Enumeration; import org.xml.sax.*; import org.xml.sax.helpers.*; import com.jclark.xsl.om.*; import com.jclark.xsl.sax.*; public class HashtableAdapter { private static Hashtable table_of_tables = new Hashtable(); private static Hashtable cache_table = null; private static String cache_key = null; private static Exception last_error = null; private static Hashtable getTable( String name ) { if(name != cache_key) { cache_key = name; cache_table = (Hashtable) table_of_tables.get(name); if( null == cache_table ) { cache_table = new Hashtable(); table_of_tables.put(name,cache_table); } } return cache_table; } public static void put( String table, String key, String value ) { getTable(table).put(key,value); } public static void add( String table, String key, double value ) { Hashtable hash = getTable(table); Object obj = hash.get(key); double val = value; if( Double.isNaN(val) ) val = 0; try { if( obj instanceof String ) val += Double.parseDouble((String)obj); else val += ((Double)obj).doubleValue(); } catch (Exception e) { } hash.put(key,new Double(val)); } public static void concat( String table, String key, String value, String sep, boolean unique ) { String curr = get(table,key); if(curr == null) { put(table,key,value); } else { if(unique) if( curr.indexOf(value) > -1 ) return; put(table,key,curr+sep+value); } } public static String remove( String table, String key) { return (String) getTable(table).remove(key); } public static String get( String table, String key ) { return get(getTable(table),key); } private static String get( Hashtable tab, String key ) { Object obj = tab.get(key); if( obj instanceof Double ) return obj.toString(); else return (String) obj; } public static int size(String table) { return getTable(table).size(); } public static String toString(String table) { String list = new String(); Hashtable tab = getTable(table); for (Enumeration e = tab.keys() ; e.hasMoreElements() ;) { String key = (String) e.nextElement(); String value = get(tab,key); list += wrap("entry",wrap("key",key)+wrap("value",value)); } return wrap("table",list); } public static String wrap(String element, String content) { return "<" + element + ">" + content + "</" + element + ">"; } public static NodeIterator toNodeList(String table) { try { XMLProcessorImpl proc = new XMLProcessorImpl( new HashtableParser( getTable(table))); Node root = proc.load( new InputSource(), 0, new LoadContextImpl(), new NameTableImpl() ); return root.getChildren(); } catch (Exception e) { last_error = e; return null; } } public static String getLastError() { if(null != last_error) { StringWriter w = new StringWriter(); last_error.printStackTrace(new PrintWriter(w)); last_error = null; return w.toString(); } return ""; } private static class LoadContextImpl implements LoadContext { public boolean getStripSource(Name elementTypeName) { return false; } public boolean getIncludeComments() { return true; } public boolean getIncludeProcessingInstructions() { return true; } } private static class HashtableParser implements Parser { Hashtable table; DocumentHandler handler; public void setLocale (java.util.Locale locale) throws SAXException {} public void setEntityResolver (EntityResolver resolver) {} public void setDTDHandler (DTDHandler handler) {} public void setErrorHandler (ErrorHandler handler) {} public void setDocumentHandler (DocumentHandler handler) { this.handler = handler; } public void parse (InputSource source) throws SAXException { parse(); } public void parse (String systemId) throws SAXException { parse(); } HashtableParser(Hashtable table) { this.table = table; } private void parse() { try { handler.startDocument(); for (Enumeration e = table.keys() ; e.hasMoreElements() ;) { AttributeListImpl al = new AttributeListImpl(); String key = (String) e.nextElement(); String value = get(table,key); al.addAttribute("key","CDATA",key); al.addAttribute("value","CDATA", value); handler.startElement("entry",al); handler.characters( key.toCharArray(),0,key.length()); handler.endElement("entry"); } handler.endDocument(); } catch( Exception e ) { table.put("exception",e.toString()); } } } } F. Stylesheet using new construct ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <xsl:stylesheet xmlns:xsl="http://www.w3.org/XSL/Transform/1.0" xmlns="http://www.w3.org/TR/REC-html40" xmlns:ht="http://www.jclark.com/xt/java/com.clarkevans.xtk.HashtableAdapter" result-ns=""> <xsl:template match="day" > <xsl:variable name="day" select="dow" /> <xsl:value-of select="ht:put('general','day',string($day))" /> <xsl:value-of select="ht:put('day',string($day),'')"/> <xsl:apply-templates select="order" /> </xsl:template> <xsl:template match="line" > <xsl:variable name="day" select="ht:get('general','day')" /> <xsl:variable name="product" select="product" /> <xsl:variable name="price" select="price" /> <xsl:value-of select="ht:add('price',concat($day,$product),number($price))"/> <xsl:value-of select="ht:put('product',string($product),'')"/> </xsl:template> <xsl:template match="/"> <html> <body> <xsl:apply-templates /> <table> <tr> <td><xsl:text> </xsl:text></td> <xsl:for-each select="ht:to-node-list('day')" > <xsl:sort order="ascending" select="." /> <th><xsl:value-of select="."/></th> </xsl:for-each> </tr> <xsl:for-each select="ht:to-node-list('product')" > <xsl:sort order="ascending" select="." /> <xsl:variable name="product" select="." /> <tr> <td> <xsl:value-of select="$product" /> </td> <xsl:for-each select="ht:to-node-list('day')" > <xsl:sort order="ascending" select="." /> <xsl:variable name="day" select="." /> <xsl:variable name="price" select="ht:get('price',concat($day,$product))" /> <xsl:value-of select="ht:add('day',string($day),number($price))"/> <xsl:value-of select="ht:add('product',string($product),number($price))"/> <td> <xsl:value-of select="$price" /> . </td> </xsl:for-each> <td> <xsl:value-of select="ht:get('product',string($product))" /> </td> </tr> </xsl:for-each> <tr> <td><xsl:text> </xsl:text></td> <xsl:for-each select="ht:to-node-list('day')" > <xsl:sort order="ascending" select="." /> <xsl:variable name="day" select="." /> <xsl:variable name="price" select="ht:get('day',string($day))" /> <xsl:value-of select="ht:add('general','total',number($price))"/> <td> <xsl:value-of select="$price" /> . </td> </xsl:for-each> <td> <xsl:value-of select="ht:get('general','total')" /> </td> </tr> </table> </body> </html> </xsl:template> </xsl:stylesheet> XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
Current Thread |
---|
|
<- Previous | Index | Next -> |
---|---|---|
Generating HTML with DataChannel XJ, Warren Hedley | Thread | Re: Practical Suggestion for XSLT P, Oren Ben-Kiki |
Re: What does fo: stand for?, Stephen Deach | Date | Re: XSL transformation from a java , RSuiter |
Month |