[xsl] [ANN] Saxon 12.0

Subject: [xsl] [ANN] Saxon 12.0
From: "Michael Kay michaelkay90@xxxxxxxxx" <xsl-list-service@xxxxxxxxxxxxxxxxxxxxxx>
Date: Sat, 14 Jan 2023 14:40:59 -0000
12.0 is a new major release of SaxonJ, SaxonCS, and SaxonC, all built from the
same code base. It is the first time we have released simultaneously for all
three platforms, which we have been able to achieve thanks to progress on
build and test automation and continuous integration.

SaxonC is a library with APIs for C, C++, PHP, and Python.  This version is
reengineered to use GraalVM, replacing the defunct ExcelsiorJET technology.
The main benefit of the change is that we are now running on a supported
platform, greatly reducing the risk of security vulnerabilities. In addition,
GraalVM is showing excellent performance results. SaxonC-HE ships with GraalVM
Community Edition; SaxonC-PE and SaxonC-EE use GraalVM Enterprise Edition,
which gives a further performance boost. For Python users (versions 3.8 -
3.10), the SaxonC API for Python can be installed using pip from the standard
PyPI repository.

SaxonCS is built by transpiling the source code of SaxonJ to C#. As well as
inheriting new features from SaxonJ, it now targets .NET 6 and plugs some gaps
in the C# API.

SaxonJ is the flagship Java product and remains available in three editions,
Home (HE), Professional (PE), and Enterprise (EE). SaxonJ 12.0 is built and
tested using Java 11.

Bytecode generation has been dropped from SaxonJ in this release. The benefits
of bytecode generation were becoming more and more marginal as Java's JIT
compilation improved, and in the end, very few real production workloads
gained a significant boost. Dropping bytecode generation also removes a
potential security attack surface, as demonstrated by recent vulnerabilities
found in the Xalan product. In its place we have introduced improvements to
the interpreter. Using a technique we call expression elaboration, pioneered
in SaxonJS, we now generate a Java or C# lambda function for every XPath
expression in the stylesheet. This is done on first execution, to reduce the
effort spent optimizing template rules that might never be used. It turns out
that this gives particularly good results for SaxonCS, where the performance
boost is around 20%, compared with 5% for SaxonJ.
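
To illustrate the general idea of elaboration (turning an expression tree into
a closure once, then calling the closure on every evaluation), here is a
minimal sketch in Python. It is not Saxon's implementation: the toy expression
classes, the elaborate() function, and the Rule class are all hypothetical
names invented for this example, and real XPath expressions are far richer
than the arithmetic shown here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Union

Context = Dict[str, float]
Evaluator = Callable[[Context], float]


# A toy expression tree (hypothetical; stands in for a compiled XPath tree).
@dataclass(frozen=True)
class Literal:
    value: float


@dataclass(frozen=True)
class VarRef:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str  # "+" or "*"
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, VarRef, BinaryOp]


def elaborate(expr: Expr) -> Evaluator:
    """Walk the tree once and return a closure that evaluates it.

    All decisions about node type and operator are made here, so the
    returned closure does no tree inspection at run time.
    """
    if isinstance(expr, Literal):
        value = expr.value
        return lambda ctx: value
    if isinstance(expr, VarRef):
        name = expr.name
        return lambda ctx: ctx[name]
    if isinstance(expr, BinaryOp):
        left = elaborate(expr.left)
        right = elaborate(expr.right)
        if expr.op == "+":
            return lambda ctx: left(ctx) + right(ctx)
        if expr.op == "*":
            return lambda ctx: left(ctx) * right(ctx)
        raise ValueError(f"unsupported operator: {expr.op}")
    raise TypeError(f"unsupported expression: {expr!r}")


class Rule:
    """Hypothetical stand-in for a template rule, elaborated on first use."""

    def __init__(self, body: Expr) -> None:
        self.body = body
        self._evaluator: Optional[Evaluator] = None

    def evaluate(self, ctx: Context) -> float:
        # Defer the elaboration cost until the rule is actually executed.
        if self._evaluator is None:
            self._evaluator = elaborate(self.body)
        return self._evaluator(ctx)
```

A rule that is never invoked never pays for elaboration, which mirrors the
"on first execution" behaviour described above; a rule that is invoked many
times pays once and then reuses the closure.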

Another significant development for performance optimization is a move towards
use of learning strategies. Influenced by trends with JIT optimization
technology, rather than relying entirely on static analysis to make
optimization decisions, we are increasingly making decisions based on run-time
monitoring. For example, it is hard to decide purely from static analysis what
evaluation strategy to use for function parameters (lazy, eager, or
incremental). By monitoring how the supplied parameter is actually used at
run-time, we can make a more informed choice.
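
As a rough illustration of choosing a parameter evaluation strategy from
run-time monitoring, the following Python sketch starts out lazy, records
whether the argument is actually read during the first few calls, and switches
to eager evaluation if it almost always is. This is an assumption-laden toy,
not Saxon's mechanism: the class names, the sample size, and the threshold are
invented for the example, and the incremental strategy is not modelled.

```python
from typing import Any, Callable


class Thunk:
    """Lazy argument: computed on first read, and remembers being read."""

    def __init__(self, compute: Callable[[], Any]) -> None:
        self._compute = compute
        self._done = False
        self._value: Any = None
        self.was_read = False

    def get(self) -> Any:
        self.was_read = True
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


class Eager:
    """Eager argument: the value was computed before the call."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def get(self) -> Any:
        return self._value


class AdaptiveCallSite:
    """Hypothetical call site that picks lazy or eager from observed usage."""

    def __init__(self, function: Callable[[Any], Any],
                 sample_size: int = 20, threshold: float = 0.9) -> None:
        self.function = function
        self.sample_size = sample_size  # calls to observe before deciding
        self.threshold = threshold      # read ratio that justifies eager
        self.calls = 0
        self.reads = 0
        self.strategy = "lazy"

    def call(self, compute_arg: Callable[[], Any]) -> Any:
        if self.strategy == "eager":
            # Skip the thunk bookkeeping once we know the value is needed.
            return self.function(Eager(compute_arg()))
        thunk = Thunk(compute_arg)
        result = self.function(thunk)
        if self.calls < self.sample_size:
            # Monitoring phase: count how often the argument is really used.
            self.calls += 1
            self.reads += int(thunk.was_read)
            if (self.calls == self.sample_size
                    and self.reads / self.calls >= self.threshold):
                self.strategy = "eager"
        return result
```

The point of the design is that the decision is based on what callers
actually did, rather than on what static analysis could prove about the
function body.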

Most of the new functionality in 12.0 consists of experimental implementations
of new features being developed for XSLT 4.0, XQuery 4.0, and XPath 4.0, a W3C
Community initiative in which Saxonica is playing a leading role. This
provides a wide range of handy new functions and operators. One of the most
significant features is that user-defined functions can now define optional
parameters with a default value; they can also be called using keyword
arguments as well as positional arguments. These new features cannot yet be
considered stable, and must be explicitly enabled if they are to be used. They
generally require Saxon-PE or Saxon-EE.

One of the new features is a parse-html() function for processing HTML5
documents. Saxon previously offered a saxon:parse-html() extension, but it was
not well tested, and was not conformant with HTML5. The new function provides
welcome new capability for applications that need to consume HTML5 data. The
SaxonJ implementation uses validator.nu, while SaxonCS uses AngleSharp.

Saxon-HE is no longer distributed on SourceForge. The JAR files can be
downloaded from Maven (package name Saxon-HE) or from
https://github.com/Saxonica/Saxon-HE/. The GitHub repository also provides
source code for those who need it.

Stylesheets that have been compiled into SEF files should be recompiled with
the new release.

Michael Kay
Saxonica
