Re: [xsl] How to avoid adding defaulted attributes

Subject: Re: [xsl] How to avoid adding defaulted attributes
From: Mark Giffin <m1879@xxxxxxxxxxxxx>
Date: Sun, 12 Jan 2014 16:22:39 -0800
Nice! I hit paydirt with this group. Thanks for several fast answers that solved my problem. And I got a lot of interesting background discussion as well. I had been wondering about this for a while.

Martin's Oxygen suggestion worked for my manual transform: go to Options > Preferences > XSLT-FO-XQuery > Saxon > Saxon-HE/PE/EE and turn off the option called Expand attribute defaults ("-expand").

I learned from Mr. Kay that I can use the Saxon -expand:off command line switch to automate this.

As for other XSLT processors supporting this option, I notice that Oxygen has a similar-sounding option for xsltproc: "Do not apply default attributes from document's DTD". I didn't see a similar option with other XSLT processors that come with Oxygen: Saxon 6, MSXML, MSXML.NET.

Mark


On 1/11/14 10:34 PM, Ihe Onwuka wrote:
Thanks Wendell, there are different routes to the arrival destination
and how you get there sometimes matters a lot. Very attuned to the way
I think.

On Fri, Jan 10, 2014 at 3:03 PM, Wendell Piez <wapiez@xxxxxxxxxxxxxxx> wrote:
Ihe,

No, that's just pipelining, or "internal pipelining", if you want to
distinguish it from a pipeline of calls to the XSLT engine as in
XProc, Ant or a bash/bat script.

Micropipelining is when you use this technique at the level of a
branch, not of the entire tree.

Admittedly, the distinction is a bit blurry in some architectures.

David writes:
yes but looking at the schema doesn't help, if the schema/dtd defaults
shape="rect" on every a element (as the HTML4 one does) and you don't
want those but do want the 1 in a million times where someone has
shape="rect" in their source, then you can't just remove all attributes
that are defaulted by the schema, you have to stop the schema defaulting
attributes.

Indeed. Yet this raises an interesting conundrum. In the case where we
need to make a distinction between an attribute value present by
default and an attribute value given in the instance, even when they are
nominally the same, then evidently (tautologically) they have
different semantics -- a difference that is blotted out by the
defaulting mechanism (hence the problem).

(Note I mean "semantics" not in the sense of "what will happen" but in
the sense of "what might potentially happen", i.e. addressing the
entire field of what might potentially happen "correctly".)

It is troubling if documents that have been validated to schemas with
attribute defaults thus have different semantics from identical
documents that happen not to have been validated. "Valid" is no longer
just an abstract state that can be tested and confirmed by a process
called "validation"; now we also have "has been validated" (and thus
transformed in some way), an internal state in a processing
architecture.

I think the only clean solutions to this are (a) don't use a schema,
but instead a normative processing step (XSLT!), to provide attribute
defaults, or (b) enforce at a policy level the rule that no semantic
distinction can or should be made between an attribute value provided
by default, and the same attribute value given in the instance.
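
A minimal sketch of option (a), assuming an XSLT 2.0 processor: an
identity transform plus one rule that supplies the default explicitly.
The element and attribute names follow the HTML4 shape="rect" example
quoted above; the stylesheet itself is illustrative and was not posted
in this thread.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity rule: copy everything through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Defaulting rule: an 'a' element with no shape attribute
       gets shape="rect"; explicit values are left alone -->
  <xsl:template match="a[not(@shape)]">
    <xsl:copy>
      <xsl:attribute name="shape">rect</xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Run as its own step, this makes defaulting a visible transformation
rather than a side effect of parsing with a DTD or schema.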

Fortunately, except in bad designs (which arguably include the HTML4
assignment of shape='rect' to 'a' elements by default), this is mainly
an edge case. Mostly the problem arises in cases like Mark's, where
dropping the defaults is only done to optimize the serialization,
since a subsequent validation step is expected to supply them back
again anyway.

Yet again, at a deeper level the same problem comes with using schemas
for type annotation. This is much harder. Type annotation is certainly
a reasonable requirement -- no argument there. But isn't type
assignment actually a transformation also? It may be useful to make a
distinction between a schema used as a specification of lexical and
syntactic constraints on a set of instances (XML as a string of
tags-and-tags, not XML as XDM), and a schema used as a specification
of how to parse an XML document into an XDM for processing.

Cheers, Wendell
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^


On Fri, Jan 10, 2014 at 3:21 AM, Ihe Onwuka <ihe.onwuka@xxxxxxxxx> wrote:
Capture the main output in a variable and apply a template rule to it
that gets rid of the attributes you want removed.

I think they call this micro pipelining.
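
A rough sketch of this two-pass approach, assuming XSLT 2.0 (temporary
trees can be passed to apply-templates directly). The mode name "strip"
and the choice of @class as the attribute to drop are hypothetical,
picked only for illustration.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/">
    <!-- Pass 1: run the main transform and capture its result -->
    <xsl:variable name="pass1">
      <xsl:apply-templates select="node()"/>
    </xsl:variable>
    <!-- Pass 2: clean up the captured tree in a separate mode -->
    <xsl:apply-templates select="$pass1" mode="strip"/>
  </xsl:template>

  <!-- Pass 1 identity rule (default mode); real changes go here -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2 identity rule -->
  <xsl:template match="@*|node()" mode="strip">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="strip"/>
    </xsl:copy>
  </xsl:template>

  <!-- Pass 2: drop the unwanted attribute (hypothetical choice) -->
  <xsl:template match="@class" mode="strip"/>

</xsl:stylesheet>
```

Note that the second pass cannot tell a defaulted attribute from one
written in the source, so it removes both.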

On Fri, Jan 10, 2014 at 12:53 AM, Mark Giffin <m1879@xxxxxxxxxxxxx> wrote:
I am running an identity transform on some files and changing a few things as they pass through. They are DITA XML files and their DTDs have defaulted attributes that do not appear in the instance files. But during the transform the defaulted attributes are added in on elements that I do not explicitly handle with XSLT templates. I can stop this behavior by commenting out the doctypes, but is there a way to do this with some setting? I'm using Saxon PE 9.4.0.6.

Thanks,
Mark
