Re: [xsl] How to write a stream-oriented XSLT filter?

Subject: Re: [xsl] How to write a stream-oriented XSLT filter?
From: Michael Kay <mike@xxxxxxxxxxxx>
Date: Tue, 26 Mar 2013 19:20:59 +0000

There are two challenges here. The first is to pipe your servlet response into the XSLT transformer without materializing it as a string in memory. That's doable, though it could be tricky, because there's what I call a "push-pull" conflict - the servlet expects to write (push) the output, and the XML parser expects to read (pull) it, so the only way to get the data from one to the other without putting it all in memory is to run them in separate threads with a shared buffer.
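
The two-threads-and-a-buffer arrangement might be sketched roughly as follows, using only the JDK's PipedWriter/PipedReader and the standard JAXP API. The class name PipedTransform and the XmlProducer interface are hypothetical, invented for this illustration; XmlProducer stands in for whatever pushes the XML (in the filter, the servlet writing through a response wrapper). This is a sketch under those assumptions, not a tested filter.

```java
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Writer;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.transform.Transformer;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class PipedTransform {

    /** Hypothetical stand-in for the servlet: anything that pushes XML into a Writer. */
    public interface XmlProducer {
        void writeTo(Writer out) throws Exception;
    }

    /**
     * Runs the producer on the calling thread and the transformer on a second
     * thread, connected by a bounded in-memory pipe (the shared buffer).
     */
    public static void transform(XmlProducer producer, Transformer transformer, Writer htmlOut)
            throws Exception {
        PipedWriter pushEnd = new PipedWriter();
        // 8192 chars of buffer: the producer blocks when it is full,
        // the parser blocks when it is empty.
        PipedReader pullEnd = new PipedReader(pushEnd, 8192);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // Pull side: the XML parser reads from the pipe as data arrives.
            Future<?> consumer = pool.submit(() -> {
                try (PipedReader in = pullEnd) {
                    transformer.transform(new StreamSource(in), new StreamResult(htmlOut));
                }
                return null;
            });
            // Push side: closing the writer signals end-of-document to the parser.
            try (Writer out = pushEnd) {
                producer.writeTo(out);
            }
            // Wait for the transformation to finish and surface any failure.
            consumer.get();
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In a filter, the producer would presumably be the call to chain.doFilter(request, wrapper), with a response wrapper whose getWriter() hands the servlet a PrintWriter around the pipe's write end. Note that this only removes the intermediate string; an ordinary transformer will still build the whole source tree in memory, which is the second challenge.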

The second is to run the XSLT transformation without building a complete tree representation of the source document in memory. As far as I know the only XSLT processor that can do that today is Saxon-EE, and even then, it can only do it if you're very careful to write the transformation in a streamable way. There's information on this here:

http://www.saxonica.com/documentation9.4-demo/index.html#!sourcedocs/streaming

However... How large is your output? It's quite possible to handle 100MB or so before you have to resort to streaming. And surely 100MB is far more than you want to send to a browser. Perhaps you're dealing with much smaller data sizes and you just need to change your system configuration so it doesn't run out of memory?
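
As a quick check before changing anything, a few lines of Java can report how much heap the JVM is actually allowed to use. The class name HeapCheck is made up for this illustration. The limit itself is normally raised with the standard -Xmx JVM option (for example -Xmx512m); how you pass that option depends on how your servlet container is started.

```java
public class HeapCheck {

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

If the figure reported inside the container is small relative to your documents, raising it may be all that is needed.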

Michael Kay
Saxonica

On 26/03/2013 14:41, John English wrote:
Apologies in advance: this is not an XSLT-specific question, but I thought someone here might be able to help me anyway...

I have an XSLT filter which transforms XML output from a servlet to HTML. The filter uses a CharResponseWrapper to extract the response output as a string and then run it through the transformer:

      // Capture everything the servlet writes into the wrapper's buffer
      CharResponseWrapper wrapper =
          new CharResponseWrapper((HttpServletResponse)response);
      chain.doFilter(request,wrapper);
      // The whole response is held in memory as one string here
      String s = wrapper.toString();
      if (s.length() > 0) {
        Transformer transformer = xslt.newTransformer();
        Source transformSource = new StreamSource(new StringReader(s));
        StreamResult result = new StreamResult(response.getWriter());
        transformer.transform(transformSource, result);
      }

This works fine unless the output is larger than a certain size, at which point I get an OutOfMemoryError.

The sensible way to deal with this would presumably be to connect up the response's output stream to the transformer's input source, so that I don't have to materialise all the output as a string before I start transforming it. I can't find any examples which show how I can do this; all the examples I've seen also work by materialising the output as a string before processing it.

It seems like I will need to move the transformation into the response wrapper, and maybe a separate thread to run the transformation while the response is being generated, but it all seems incredibly complicated. I'm hoping it's not as complicated as I imagine, and that someone here has already gone through all this and will be able to tell me how to proceed.

Please can anyone point me in the right direction here?

TIA,
