RE: Streaming XSL

Subject: RE: Streaming XSL
From: "Didier PH Martin" <martind@xxxxxxxxxxxxx>
Date: Tue, 23 Feb 1999 15:30:55 -0500
Hi Oren

<YourComment>
Actually I'm not thrilled with this "associate-a-program-with-each-file"
paradigm, however popular it became in windowing systems. The same file
might be processable by several unrelated interpreters. At any rate, this
seems something outside the scope of XML.
</YourComment>

<Reply>
But you have to do it in some way, either with command-line-style arguments
or something else. Remember, we were talking about streams and ways to tell
the other end of the stream what to do. A processing instruction is a clean
way to do it. The source end of the stream can be any kind of generator, as
long as it inserts processing instructions telling the other end what to do.
Of course there are some issues to resolve. For example, if an element is
not yet complete (thus we have a not-well-formed ???? I still don't know
what to call this, so let's simply say a begin-markup without an end-markup
:-), we may have a problem if a new PI is inserted in the stream before the
data is complete and well formed (i.e. before all the end-markups are
received and we have complete elements). I don't know yet how this situation
could be resolved, because it means that either the interpreter has to be
aware of processing instructions and give control back to the stream
handler, or the stream handler always has to process the incoming data
before the interpreter does. In the latter case, the interpreter does not
have to be aware of PIs. The interpreter's knowledge starts only with its
first known markup, so in the case of xsl it would be <xsl:stylesheet>, and
it would do its job until a </xsl:stylesheet> is encountered. By definition,
an interpreter's control starts with a begin-markup and ends with an
end-markup.

So, in practice we ended up with a solution where the stream handler always
watches the incoming data for a PI and just gives the data to the
interpreter if no PI is present. If there is one that implies a change of
control, then a new thread is started. This is, in fact, multiplexing.
Several simultaneous sessions can co-exist. Each session starts with a
begin-markup and ends with an end-markup. Thus, xsl streams begin and end
with their own kind of begin-markup and end-markup, and so do rdf streams.
Our own processor, currently under development, involves three kinds of
interpretation: a) xsl, b) dsssl, c) rdf (we also started topic maps
development yesterday - ISO 13250). However, we are just beginning to
experiment with PIs for context switching. I guess that with MEMUX we'll get
a better protocol, because PIs are not so easy to implement for context
switching. I should say, though, that they work very well for format
indication and for requesting the right interpreter at the other end of the
stream.
</Reply>
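
A minimal sketch of the stream-handler idea in the reply above, in Python
with the standard xml.sax incremental parser. It is only an illustration, not
the processor discussed in this thread: the StreamHandler class, the route()
function and the convention that the PI data is simply the interpreter name
are all assumptions made for the example, and no threads are started. The
handler watches for a PI, switches the "current interpreter" when it sees
one, and logs which interpreter each element event would be handed to.

```python
import xml.sax


class StreamHandler(xml.sax.ContentHandler):
    """Watches the stream for PIs and routes element events accordingly."""

    def __init__(self):
        super().__init__()
        self.current = None   # name of the interpreter currently in control
        self.routed = []      # log of (interpreter, event, element name)

    def processingInstruction(self, target, data):
        # Assumed convention: <?xml-interpreter NAME?> switches control.
        # Any other PI target is ignored.
        if target == "xml-interpreter":
            self.current = data.strip()

    def startElement(self, name, attrs):
        self.routed.append((self.current, "start", name))

    def endElement(self, name):
        self.routed.append((self.current, "end", name))


def route(chunks):
    """Feed the stream chunk by chunk and return the routing log."""
    handler = StreamHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    for chunk in chunks:
        parser.feed(chunk)    # incremental: data arrives piece by piece
    parser.close()
    return handler.routed


if __name__ == "__main__":
    stream = ['<?xml-interpreter xsl?><doc><a/>',
              '<?xml-interpreter rdf?><b/></doc>']
    for entry in route(stream):
        print(entry)
```

In this sketch the second PI arrives while <doc> is still open, so the
closing </doc> is routed to whichever interpreter was selected last. That is
the unresolved begin-markup-without-end-markup case described in the reply.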

<YourComment>
Isn't there some rule which says that unrecognized processing instructions
are ignored? It is then quite OK to have processing instructions specific
for a certain interpreter, or even for several unrelated ones, without
harming processing by other XML tools.

This would be similar to the way that some text editors recognize lines in
certain formats in order to customize the handling of text files - choosing
the right syntax highlighting or whatever - without harming the semantics of
the file itself.
</YourComment>

<Reply>
yes
</Reply>

<YourComment>
>b) We should also be able to specify constraints within the language and not
>solely to the processor. The <xsl:stylesheet> element seems indeed to be a
>good candidate for this.

>
>But, I guess we will see that feature more in version 2 or 3 if we see it.

Is it _really_ too late to discuss things like the 'complete' attribute for
<xsl:template> or <xsl:stylesheet>? (Hopefully) August is still a long time
in the future...
</YourComment>

<Reply>
I don't know, but from my years of development experience, everything takes
more time than expected, so imagine a workgroup composed of several
companies.... 6 months is a century for a lean and mean team but it is a
minute for a consortium workgroup :-)
</Reply>

<YourComment>
I'd rather have:

<?vim vim customization?>
<?emacs emacs customization?>

Etc. - that is, let each interpreter recognize a set of PIs if it wants to,
and ignore the rest. Not that you can stop them from going that way, anyway,
since it is standard XML. But I think that encouragement - and some mechanism
for registering the PIs somewhere so they won't collide and would converge
to common forms - is something the W3C should consider.

In our case, consider:

<?my-xsl-processor by-default-templates-are="[in]complete"?>

Which should appear _before_ <xsl:stylesheet>, and:

<?my-xsl-processor next-template-is="[in]complete"?>

Which should appear before the affected <xsl:template>.

If I understood you correctly, you could trivially modify your processor to
recognize some form of the first PI above, instead of relying on the 'media'
hack. If anyone decides to implement a more complete solution, he could rely
on something like the second PI.

This is OK as long as the PIs do not affect the semantics of the XSL
transformation - that is, as long as they are just "optimization hints".
This is true for the above two PIs.

Anyway, this is still second best. I'd still rather that the XSL proposal
would directly address the issue of processing large documents. How about a
compromise:

Include in the XSL proposal a list of recommended PIs using the reserved
<?xsl-processor ...?> name - the above two being in this list, of course :-)
Implementers would then choose which, if any, of these PIs to implement, and
may add their own. When version 2 comes in, we'd be able to track which of
these PIs are used in practice, which new ones were added etc. and integrate
the functionality into XSL itself (if it makes sense to).

This was done in HTTP, for example - there is a 'pragma' header field, a
recommended value of 'no-cache' was defined in HTTP/1.0 and widely
implemented, and as a result of the experience gained using it a special
'cache-control' field was added in HTTP/1.1.
</YourComment>
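
A rough sketch of how a processor might pick up such optimization hints while
ignoring every PI it does not recognize, written in Python. The xsl-processor
target and the two hint names come from the comment above; the regular
expression, the function names and the KNOWN_HINTS set are assumptions made
for this example, not part of any XSL draft.

```python
import re

# name="value" or name='value' pairs, as commonly written inside PI data
PSEUDO_ATTR = re.compile(r'([\w.-]+)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')

# Hint names taken from the two example PIs in this thread
KNOWN_HINTS = {"by-default-templates-are", "next-template-is"}


def parse_pi_data(data):
    """Return the name/value pairs found in the PI data as a dict."""
    pairs = {}
    for match in PSEUDO_ATTR.finditer(data):
        name, double_quoted, single_quoted = match.groups()
        pairs[name] = double_quoted if double_quoted is not None else single_quoted
    return pairs


def collect_hints(target, data, my_target="xsl-processor"):
    """Keep only the hints this processor understands; ignore everything else."""
    if target != my_target:
        return {}   # a PI meant for another tool (vim, emacs, ...): ignore it
    pairs = parse_pi_data(data)
    return {name: value for name, value in pairs.items() if name in KNOWN_HINTS}


if __name__ == "__main__":
    print(collect_hints("xsl-processor", 'by-default-templates-are="incomplete"'))
    print(collect_hints("vim", 'set="ts=4"'))
```

Because unknown targets and unknown hint names are simply dropped, a document
carrying these PIs is processed the same way by a tool that implements none
of them, which is what keeps them "optimization hints".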

<Reply>
Or we could have a single <?xml-something...?> for a very simple pattern
match, then have the interpreter type defined by a mandatory property. Thus,
a general mechanism checks for a single PI used for interpreter selection,
something like xml-interpreter or anything that makes sense. It would have
at least a mandatory property for the interpreter type; all other
properties, including the script location, could be optional and treated as
command line parameters. The command line system is quite general and does
not oblige a general shell to be recompiled or to set new attributes for
each new program ;-) So, a general purpose PI could fulfill the same goal
and thus turn XML into a useful tool (like today but with broader horizons).
Thus, to associate an xsl style sheet with an xsl interpreter and tell this
interpreter which script to use, we would have:

<?xml-interpreter type="text/xsl" href="myscript.xsl" ...etc...?>

So, we just have to replace the first word and keep everything else like
today. The PI identifier xml-interpreter just indicates that we refer to an
xml interpreter (of course my dear :-))) and the real switcher would then be
the "type" property. Pros: a bit like today, with more generality. Cons: I
have to maintain an associative map of all interpreters and their
corresponding MIME types. But MIME types have the advantage of being
standardized and having a registration mechanism.
</Reply>
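
The associative map mentioned in the reply above could look roughly like the
following Python sketch. Everything in it is hypothetical: the INTERPRETERS
registry, the interpreter names and the select_interpreter() function are
made up for illustration. Only the xml-interpreter target, the mandatory
"type" property and the idea of passing the remaining properties along as
parameters come from the reply. The PI data is parsed by wrapping it in a
dummy element, which assumes it is written as well-formed attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: MIME type -> interpreter name
INTERPRETERS = {
    "text/xsl": "xsl-interpreter",
    "text/css": "css-interpreter",
}


def select_interpreter(target, data):
    """Return (interpreter, parameters) for an xml-interpreter PI.

    Returns None for any other PI target, so unrelated PIs are ignored.
    """
    if target != "xml-interpreter":
        return None
    # Wrap the PI data in a throwaway element so the XML parser reads the
    # properties for us; malformed data raises xml.etree.ElementTree.ParseError.
    params = dict(ET.fromstring(f"<pi {data}/>").attrib)
    mime_type = params.pop("type", None)
    if mime_type is None:
        raise ValueError("xml-interpreter PI needs a 'type' property")
    if mime_type not in INTERPRETERS:
        raise LookupError(f"no interpreter registered for {mime_type!r}")
    # Everything except 'type' is handed over like command line parameters
    return INTERPRETERS[mime_type], params


if __name__ == "__main__":
    print(select_interpreter("xml-interpreter",
                             'type="text/xsl" href="myscript.xsl"'))
```

Adding a new interpreter then means adding one entry to the map, with no
change to the dispatch code.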

<YourComment>
Which is exactly what 'pragma's are all about :-)
</YourComment>

<Reply>
yes :-))
</Reply>

>Do you think I should bring this thread to
>xml-dev?

<YourComment>
No need. The mechanism is already there, in standard XML, today.
</YourComment>

<Reply>
Not so sure; there is no general purpose PI mechanism. XML still has its
document=file origins. Moving to XML+interpreter is another step. As I said
earlier, but this time with style: an XML document without an interpreter is
like a sleeping beauty waiting for prince charming. Nothing will happen
until the prince does something :-) If we have a documented PI that provides
a general purpose mechanism to associate XML documents with an interpreter,
something as versatile as a shell command line, we get something.

So, in your proposal above, the interpreter type is given by the first word:
<?vim is used for the vim interpreter, <?emacs for the emacs interpreter,
etc., and everything after that is interpreter dependent. Thus, a general
purpose mechanism for style interpreters would be <?dsssl..?> for dsssl
interpreters, <?xsl..?> for xsl interpreters and <?css..?> for css
interpreters; all other properties would be interpreter dependent. So,
instead of <?xml-stylesheet..?> we would have <?xsl...?> etc. Is that what
you mean? If yes, it is easy to implement and should please the New Jersey
people :-) and the pragmatic ones like me. And it has the benefit of being
general and opening the door to a myriad of xml interpreters. The problem
with this scheme is registration. MIME types are more standard. So what
about:

<?xml-interpreter type="yourtypehere" ...everything else is interpreter
dependent and considered as a parameter list for this interpreter...?>

so the MIME type property is used to select the right interpreter and is
therefore mandatory. Thus only the PI identifier xml-interpreter and the
type property would be mandatory; everything else would be optional and
interpreter dependent. All properties would be treated as interpreter
parameters. What do you think?
</Reply>

Regards
Didier PH Martin
mailto:martind@xxxxxxxxxxxxx
http://www.netfolder.com


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list

