Re: [xsl] Performance of predicate-based patterns

Subject: Re: [xsl] Performance of predicate-based patterns
From: "Wendell Piez wapiez@xxxxxxxxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Fri, 6 Feb 2015 16:49:57 -0000
Hi,

I'm afraid there may be times when processing HTML, for example, when
one might want to have match="*[@class/tokenize(.,'\s+')='x']" and the
like ... one might hope to be able to specify the
element type, but then there will be fallback cases such as

match="*[not(@class/tokenize(.,'\s+')=$knownclasses)]"

where $knownclasses represents a controlled set of classes the XSLT is
expected to handle ...
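
To make the pair of patterns above concrete, here is a minimal XSLT 2.0 sketch. The class tokens 'note' and 'warning', the contents of $knownclasses, and the <note> output element are all placeholders invented for illustration, not taken from any real stylesheet:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Assumption: a hypothetical controlled set of class tokens
       that the stylesheet knows how to handle. -->
  <xsl:variable name="knownclasses" as="xs:string*"
                select="('note', 'warning')"/>

  <!-- Specific rule: any element whose class attribute contains
       the whitespace-separated token 'note'. -->
  <xsl:template match="*[@class/tokenize(., '\s+') = 'note']">
    <note>
      <xsl:apply-templates/>
    </note>
  </xsl:template>

  <!-- Fallback rule: any element with no class token from the known
       set (including elements with no class attribute at all).
       Here it simply copies the element through. -->
  <xsl:template match="*[not(@class/tokenize(., '\s+') = $knownclasses)]">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Neither pattern names an element type, so a processor cannot narrow the candidates by element name and has to evaluate the predicates for every element it visits, which is the case this thread is worried about. Note that the global variable reference in the second pattern needs XSLT 2.0 or later.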

Cheers, Wendell


On Wed, Feb 4, 2015 at 9:27 AM, Eliot Kimber ekimber@xxxxxxxxxxxx
<xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> The DITA for Publishers Word-to-DITA framework
> (https://github.com/dita4publishers/org.dita4publishers.word2dita) has a
> generic WordML-to-SimpleML transform
> (https://github.com/dita4publishers/org.dita4publishers.word2dita/blob/mast
> er/xsl/wordml2simple.xsl). This generates a simplified and generic form of
> "word processing markup" from the WordML.
>
> But because it's handling the whole WordML in a generic way it doesn't
> have many templates that use predicates, so not sure it helps here.
>
> The intermediate format puts the style name and ID on the element to which
> it applies, so the templates that process that file are either also
> simple or are handled within gnarly for-each-group loops that infer
> hierarchy from the flat paragraph structures.
>
> This process is also driven by a separately-defined style-to-tag mapping
> document, so there's little or no need for templates that match on
> variable properties of the input elements.
>
> Cheers,
>
> Eliot
> Eliot Kimber, Owner
> Contrext, LLC
> http://contrext.com
>
>
>
>
> On 2/3/15, 5:34 PM, "Michael Kay mike@xxxxxxxxxxxx"
> <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
>>>
>>> You will end up with a similar match pattern if you try to map Word
>>> styles (saved in WordprocessingML) to some XML structure. The style name
>>> is stored in a subelement two levels down from the actual
>>> paragraph element, and a lot of publishing companies process
>>> Word input documents. You will have templates like:
>>>
>>> <xsl:template match="p[pPr/pStyle/@val = 'Heading 1']">
>>> <h1>
>>> <xsl:apply-templates/>
>>> </h1>
>>> </xsl:template>
>>>
>>
>>We've got a precondition there that it will only match a <p> element, so
>>that's a good start. It then depends how many other rules there are that
>>also match <p> elements.
>>
>>But yes, it would be good to look at some Word-ML stylesheets if anyone
>>knows of any. (I've come across a few over the years, but all specific to
>>a particular client.)
>>
>>Michael Kay
>>Saxonica
>>
>>
>



--
Wendell Piez | http://www.wendellpiez.com
XML | XSLT | electronic publishing
Eat Your Vegetables
_____oo_________o_o___ooooo____ooooooo_^
