Subject: Re: [xsl] Why does my streaming program hang when the input is a streaming web site ?
From: "Abel Braaksma (Exselt) abel@xxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 10 Jun 2014 09:55:32 -0000
On 7-6-2014 15:50, Costello, Roger L. costello@xxxxxxxxx wrote:
> This web site emits a continuous stream of XML:
>
> http://xmpp.wordpress.com:8008/firehose.xml?type=text/plain
>
> <snip />
>
> java saxon9ee.jar -o:Titles.html test.xml Show-Titles.xsl
>
> Any thoughts on why the command line hangs and I get no output?

You touch on a very intricate and subtle point with regard to streaming: how and when output streaming is allowed or can be achieved.

AFAICT Saxon buffers output, which is allowed, even enforced, by the streaming definition in the current XSL Working Draft. I think that if you run it long enough, at some point the buffer will fill up and you will see output, albeit incomplete, and potentially in a temporary file (I am not sure about the internals of Saxon here).

It reminds me of a discussion I had with Charles Foster at XML London, and of a question that came up at the end of my talk (not sure who to credit for the question). It was about what happens when the output is large (needs streaming) but the input is small. In your case both output and input are large (I call your input "intrinsically streaming": even if it isn't large, it must be processed using streaming because the stream is never-ending and you want intermittent output), but the same question applies.

The answer was: the XSL WD is not prescriptive here, but it does require processing to run in constant memory, which at some point requires buffering and intermittent flushing. But this conflicts with another rule: a processor is not allowed to create a non-valid principal output document (it is allowed to do this with result-document though, as in the case of failure or interruption). So it must write either everything (when the whole stream has been processed successfully) or nothing (in case of error or interruption). To prevent an invalid document from being written, a processor must buffer until it knows it will finish successfully. So even if it flushes intermittently, there must be a mechanism that does a rollback in case of failure.
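To make the "write all or nothing" idea concrete, here is a minimal sketch in Python of buffering with intermittent flushing plus a rollback on failure. It is an illustration only and makes no claim about how Saxon or any other processor implements this; the function name `write_all_or_nothing` and the `.partial` suffix are hypothetical. Output is staged in a temporary file and only renamed to the real target once the whole input has been consumed.

```python
import os
import tempfile


def write_all_or_nothing(target_path, chunks):
    """Write every chunk to target_path, or leave target_path untouched.

    Hypothetical helper: output is staged in a temporary file next to the
    target and becomes visible only after the whole input was consumed.
    """
    directory = os.path.dirname(os.path.abspath(target_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".partial")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as staged:
            for chunk in chunks:
                staged.write(chunk)
                # Intermittent flush: memory use stays bounded even though
                # nothing is visible under the final name yet.
                staged.flush()
        # Commit: the input ended normally, so publish the complete result.
        os.replace(tmp_path, target_path)
    except BaseException:
        # Rollback: an error or an interrupt (e.g. Ctrl-C) discards the
        # partial output, so no incomplete target file is ever created.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

With a never-ending input the commit step is never reached, which matches the behavior described above: nothing shows up under the final file name, and interrupting the run throws the staged output away.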
In other words: interrupt your processor, it signals an error, and your output will be lost (go back to start, do not pass go, do not collect $200).

Your stylesheet might behave differently, and more in line with your expectations, if you switch to using result-document on, say, each atom:entry. That way the processor can process a complete node and write a complete document, and it is more likely to flush it to disk each time a node goes out of focus.

Another option is to output something that is not required to be valid or well-formed (i.e., text), but I'm not sure whether that will change the behavior.

And yet another option is to customize Saxon to use a different XML Writer, one that you control. But I'm afraid my knowledge of Saxon and streaming is not deep enough to give you an example of that, or even to say whether it can solve the buffering problem.

Cheers,
Abel Braaksma
Exselt XSLT 3.0 streaming processor
http://exselt.net