[xsl] Modeling matrices in an XML environment

Subject: [xsl] Modeling matrices in an XML environment
From: "David Birnbaum djbpitt@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Mon, 13 Jul 2020 17:25:43 -0000
Dear XSL-list,

Roger's question invites a tangential one (for which reason I have changed
the subject line): How might we usefully think about representing matrices
in an XML environment?

This question is tangential because Roger's original question assumed his
hierarchical representation (*/matrix/row/col*), and asked about how to
operate over it efficiently. But because over the years I have represented
matrices in XML in three different ways, with no particular insights into
how to compare them from an engineering perspective, I would be grateful
for any thoughts other readers of this list might have about that
comparison. I am not asking about which implementation will be the quickest
when performing a particular operation because, as we are often reminded on
this list, the only way to answer that question (aside from the obvious,
like avoiding repeating an assignment unnecessarily inside a for-each,
using keys where appropriate, etc.) is to time the results. But a more
general engineering question might be how the following alternative
representations might differ in the operations they allow or support or
facilitate:


   1. */matrix/row/col* This was Roger's representation, and it mirrors the
   one found in HTML tables.
   2. A sequence of *<cell row="m" col="n">* elements. This is economical
   for sparse matrices (not relevant for the matrix addition about which Roger
   inquired, but harmless there, and economical in other, sparse contexts),
   and the values can be accessed efficiently with composite keys (where those
   are supported, e.g., Saxon PE or EE), or by simulating composite keys
   (likely to be less efficient, since it requires composing the simulated key
   within invocations of the *key()* function).
   3. Nested arrays, e.g., *[ [1, 2], [2, 4] ]*. This mirrors a common
   representation of matrices in other programming languages, e.g., as nested
   lists in Python.


I used #2 in
https://archive.xmlprague.cz/2020/files/xmlprague-2020-proceedings.pdf and
I am using #3 for matrix transposition and dot-product operations in my
upcoming Balisage presentation (see
https://www.balisage.net/2020/Program.html). As a general model, #2 may be
most appropriate to the data because it does not implement a hierarchy
where one may not be inherent in the meaning of the data, that is, it does
not regard columns as subordinate to rows or rows as subordinate to
columns. After all, matrix addition does not depend on the subordination of
rows to columns or vice versa, and dot product operations associate rows in
one matrix with columns in the other.

I wrote about table models in 2007 (see
http://conferences.idealliance.org/extreme/html/2007/Birnbaum01/EML2007Birnbaum01.html),
but that was before arrays were part of the XML environment, and my focus
then was primarily on the fit between the model and the reality, and less
on how different models might encourage or encumber different types of
operations. The addition of arrays to our arsenal invites a reconsideration
of both the modeling and the engineering question.

If the only differences are likely to be finding the model that is most
performative with the XSLT engine at my disposal (with the complication
that that engine may incorporate different optimizations today than it will
at its next release), then a decision might come down only to timing the
different options. But aside from actual timing differences with a specific
XSLT engine, are there inherent differences in what these models do and do
not support that might encourage us to favor one over another? For what
it's worth, I find the imposed hierarchy of #1 and #3 artificial and
distracting, and it took me longer to work out the logic for operations
over #3 than it did over #2. But I have no professional expertise in either
computer science or software engineering, so I would welcome insights from
those who might notice considerations that I overlook.

Best,

David

On Mon, Jul 13, 2020 at 5:19 AM Martin Honnen martin.honnen@xxxxxx <
xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx> wrote:

> Am 12.07.2020 um 23:53 schrieb Martin Honnen martin.honnen@xxxxxx:
> > On 12.07.2020 22:47, Dr. Roger L Costello costello@xxxxxxxxx wrote:
> >> Hi Folks,
> >>
> >> My application uses lots of matrix operations. I used the SAXON Tool
> >> Profile (-TP) option to generate a web page that shows the performance
> >> of each of my matrix operations. I found that my matrix addition
> >> function is taking an appallingly large amount of time. Below is my
> >> matrix addition function. Do you have suggestions on ways to improve
> >> its performance?  /Roger
>
> > And of course in XPath 3/XSLT 3 there is the "for-each-pair" function
> > made for tasks like this but I have no idea whether that would perform
> > better. And to write it solely as a function expression you need XQuery
> > or the XPath 4 extension function introduced in Saxon 10 to create
> > elements.
>
>
> Here is an XSLT function uses for-each-pair twice together with
> saxon:new-element to construct the row and col elements:
>
>
>      <xsl:function name="matrix:addition" as="element(Matrix)"
> visibility="public">
>          <xsl:param name="M" as="element(Matrix)"/>
>          <xsl:param name="N" as="element(Matrix)"/>
>          <Matrix>
>              <xsl:sequence
>                  select="
>                      for-each-pair(
>                        $M/row,
>                        $N/row,
>                        function ($row1, $row2) {
>                          saxon:new-element(
>                            'row',
>                            for-each-pair(
>                              $row1/col,
>                              $row2/col,
>                              function ($col1, $col2) {
>                                saxon:new-element('col', $col1 + $col2)
>                              }
>                            )
>                          )
>                        }
>                      )"
>              />
>          </Matrix>
>      </xsl:function>
>
>
>
> Needs Saxon 10 PE or EE and I haven't tested whether it performs any
> better than the previous suggestions.

Current Thread