Tuesday, 26 July 2016

.NET Core and the state of XML

We had an enquiry regarding using XmlPrime with .NET Core.  We were quite disappointed to find that porting isn't going to be easy.

System.Xml

If I've understood the code correctly, there are some worrying changes in System.Xml.  There are minor niggles, such as the disappearance of Close(), but that's not a major headache.  Some members relating to System.Xml.Schema have gone, which means it will be difficult to support schema-aware processing  (more below).

The loss of XmlResolver as a public class is most definitely in the category of "blocking issue".  XmlResolver is now internal and appears to have a single concrete implementation which can only resolve URLs to files on the file system.  It's the mechanism we use for resolving DTDs, XSDs, XQuery modules, XSLT files and probably much more besides.

On the one hand, it's always been a bit of a frustrating class.  The "role" attribute never got used.  It would have been nice to be informed whether a resolver was resolving a PUBLIC or SYSTEM identifier, or whether it was resolving a DTD or an XSLT file included from another transform.

As an aside, there have been other annoying breaking changes in .NET 4.5.2 in this area, such as the case of the mysteriously disappearing external entities.


System.Xml.XPath

I had thought that this was largely intact until I discovered that XPathItem's default constructor is now internal.  That means we can't derive XPathAtomicValueXPathFunctionItemXPathMap and XPathArray, the classes representing the XPath and XQuery data model items.

If we can't use XPathItem as a base class, that pretty much rules out using XPathNavigator, which in turn pretty much rules out using System.Xml.XPath at all as things stand.

System.Xml.Schema

This is gone.  In a sense that's a good thing, as it gives us a clean slate to produce a fully compliant XSD 1.0 and XSD 1.1 implementation, but that's no small task.  In the meantime, we could implement just enough to support non-schema aware processing.

System.Xml.Xslt

This is gone, and was the main reason for us wanting to port XmlPrime.

Conclusions

In summary, there seems to have been quite a bit of damage done to .NET's XML handling, and unless I'm mistaken or some of the issues given above are addressed, it will be a while before we can port to .NET Core.



XslCompiledTransform: Speed versus Correctneess

Performance was a key requirement in the development of XmlPrime 3.0.  The benchmark for XSLT performance on the .NET framework is of course the built-in System.Xml.Xsl.XslCompiledTransform class, which superceded the old XslTransform class.

First off, it's worth pointing out the obvious difference.  XmlPrime is an XSLT 2.0 processor, whereas XslCompiledTransform only supports XSLT 1.0.  This means we can only compare the two processors executing XSLT 1.0 stylesheets, which requires XmlPrime to operate in "backwards-compatiblilty" mode, as defined by the XSLT 2.0 specification.  Operating in this mode means that we are not on a level playing field.

Numeric operations

In XSLT 1.0, all numbers are double-precision floating point.  In XSLT 2.0, numbers may be integers, decimals or floating point numbers.  To meet the conformance requirements of XSLT 2.0, we use System.Decimal to represent both integer and decimal types.  Mathematical operations can therefore be a little slower than using System.Double.  For a fairer comparison, we could update the XSLT 1.0 stylesheet to XSLT 2.0 and ensure that all numeric literals are written as xs:double values.

String operations

The implementation of string operations in XslCompiledTransform is wrong.  Just try

<xsl:value-of select="substring('&#x1f4a9', 1, 1)" />

The input is a string with a single unicode codepoint. This string is represented by a System.String instance of length two.  The string consists of the two System.Char values which form the surrogate pair

The correct response for the above code is to output &#x1f4a9; to the result document.  Instead, XslCopmiledTransform outputs half of the surrogate pair.  I suspect this is a case where performance outweights correctness.  The string handling functions such as string-length, substring-before, substring-after and substring are all considerably simpler if yone ignores the requirements of the specification.

Access to Internals

It is most frustrating when you discover a really useful method or property of a class, only to discover that it is internal.  One such case is the UniqueId property of XPathNavigator.  This property is exactly what you would want to call to implement the fn:generate-id function efficiently.  However, because it is internal, it is just out of reach (without doing something nasty).

Type Inference

Suppose an XSLT 1.0 stylesheet contains the code

<xsl:template name="add">
  <xsl:param name="a" /> 
  <xsl:param name="b" />
  <xsl:value-of select="$a + $b" />
</xsl:template>

The compiler can't be sure what will be supplied for parameters a, and b.  We might expect them to be numeric, but the caller could equally well pass in a node set and a boolean value.  The clever compiler might check all usages of template add and discover that it is always called with a numeric value, thus avoiding any runtime checks.

XSLT 2.0 permits a stylesheet to be invoked by specifying an initial template.  In that case, we don't have a "closed world" over which to run a static analysis.  That is, we can't determine statically that add will always be supplied with numeric values.  We could of course compile a specialized version of this template.  This is something XmlPrime may do in the future.

Final Thoughts

In our benchmarking, we've seen cases where XmlPrime will outpace XslCompiledTransform on XSLT 1.0 stylesheets.  There are also cases where XslCompiledTransform is a clear winner, and this is due at least in part to the differneces discussed above.  When time permits, it would be interesting to convert some of the XSLT 1.0 benchmarks to XSLT 2.0 with an eye to extracting the best possible performance from an XSLT 2.0 processor.

As ever, the advice is to measure performance for your use case and put a little effort into updating stylesheets where needed.

Wednesday, 4 May 2016

XmlPrime 3.0 released

A major new version of XmlPrime has been release, which brings huge improvements in performance and some great new features.

XmlPrime 3.0 includes full support for XQuery 3.0 and XPath 3.0, the highlights of which include:
  • Grouping and windowing
  • Exception handling
  • Higher order functions
For the first time, we've differentiated the trial/non-commercial version from the paid-for product.  Purchasing the product gives access to byte-code compilation.  We'll explore in a future posting under what circumstances byte-code compilation really comes into its own.

Even without byte-code compilation, performance is greatly increased.  In some test cases, XmlPrime 3.0 is around 100 times faster than XmlPrime 2.9, and over a mixed set of benchmarks, the execution is four times quicker.  In real-world performance on a high-traffic website, we saw CPU usage drop by 25%.  How will it perform for your workload?  Find out for yourself by running the XSLT benchmarking software.

Tuesday, 1 November 2011

XmlPrime 2.3 released

We're pleased to announce the release of XmlPrime 2.3.

The XmlPrime 2.x series has grown to include XSLT 2.0, XML Inclusion 1.0 (XInclude) and now we have added support for the XQuery Update Facility 1.0.

Download it from here.

You can also download a Visual Studio 2010 solution with code samples.

Monday, 6 September 2010

XmlPrime 2.0 Beta - XSLT 2.0 for the .NET Framework!

The question we get asked most often is: "When will XmlPrime support XSLT 2.0?".

Well, we finally have an answer: Now!

We are pleased to announce the XmlPrime 2.0 Beta Program. XmlPrime 2.0 includes a native XSLT 2.0 processor for the .NET Framework.

You can download a copy of XmlPrime 2.0 beta now to test the new functionality. Please leave any comments on our forum.

We expect the beta program to finish at some point in the next month, at which time we intend to release a fully supported version.

XmlPrime 1.5

Last week we released XmlPrime 1.5.

The main improvement in this version is with the performance of recursive user defined functions.

The biggest improvements are for functions that recurse over a sequence. Consider the following user defined function:

declare function local:product2($seq as xs:decimal*,
$product as xs:decimal)
as xs:decimal
{
if (empty($seq))
then $product
else local:product2(subsequence($seq,2), $seq[1] * $product)
};

declare function local:product($seq as xs:decimal*)
as xs:decimal
{
local:product($seq, 1)
};

In previous versions of XmlPrime, the subsequence operation would create a closure and each call would have an increasing number of closures to evaluate the argument. In XmlPrime 1.5.0 subsequence operations are treated as a special case and a closure is no longer required. Look out for a future blog post explaining this optimization in more detail, and how to take advantage of it.

The optimizations in this version particularly help computationally-expensive queries - the raytracer sample shows a 20% speed improvement.

As always, there are a large number of other performance improvements and bug fixes so we highly recommend all our users update to this version. For more details as to what has changed in this version, see the changelog.

As always, XmlPrime 1.5 is available to download for free for trial and non-commercial use.

Tuesday, 25 May 2010

Join optimizations in XmlPrime

XmlPrime has "Join Optimizations" listed on its feature list. What exactly are these join optimizations and how are they used?

The join optimizations performed by XmlPrime attempt to reduce the algorithmic complexity of complex queries. In theory you do not need to do anything to take advantage of these join optimizations - if your query contains a join-like construct then it should be optimized to a join. However it is good to be aware of the optimizations that are performed so you can understand why a join has not been spotted, or what joins are possible in XmlPrime.

Tuples


Consider the following query:

for $x in 1 to 10
let $y := $x * 2
return $y

An algebra (at least for our purposes) is a system for modelling a query.
For most of the compilation process we use an algebra in which the previous expression looks something like this: (note that I have made up this syntax for the purpose of explaining the algebra)

For($x)
Range
Constant(1)
Constant(10)
Let($y)
Multiply
VariableReference($x)
Constant(2)
VariableReference($y)

The for expression enumerates over its domain and for each item sets a value of $x and evaluates its return expression (in this case the let expression).

When we perform join optimizations we first transform the query to a different algebra (a tuple algebra). In the tuple algebra, each FLWOR clause is an expression that returns a set of tuples. A tuple assigns values to a set of variables. For example for $x in 1 to 10 returns a sequence of 10 tuples. The first one sets the value of $x to 1, the second tuple sets the value of $x to 2 and so on. Our previous example looks like this (again the notation is made up).

Return
Concatenate
For($x)
Range
Constant(1)
Constant(10)
Let($y)
Multiply
VariableReference($x)
Constant(2)
VariableReference($y)

The Concatenate expression above takes each tuple from the input, and then uses it to generate a sequence of tuples from the right hand input. In this case the value of $x from each input tuple is used to generate a single tuple that defines $y. Each tuple returned from the second input is combined with the tuple from the left input, and the result is combined. In this case for each tuple that defines $x we generate a tuple that defines $y and return a single tuple that defines both $x and $y.

Note that this change of algebra is merely a different way of describing the query, so far this hardly affects how the query is run. In practice "returns a tuple" means "sets some variables" and "combining two tuples" means "setting some variables, and then setting some other variables".

In a future version of XmlPrime we intend to eliminate the first algebra entirely and perform all our optimizations with this join algebra. A number of the new expressions added in XQuery 1.1, eg. group by fit a lot more cleanly with the tuple algebra.

We can perform certain optimizations in this algebra that are not possible (or at least very difficult) in the basic algebra above. These optimizations are the join optimizations.

Products


The simplest join optimization performed by XmlPrime is forming products. A product is constructed when two independent expressions are concatenated. The expressions are independent if none of the variables defined in the first expression are used in the second expression.

Here is an example:

for $row in /rows/row
for $column in /columns/column
return
<intersection row="{$row}" column="{$column}" />

In tuple algebra, this gets converted to something like the following:

Return
Product
For($row)
Step(child::row)
Step(child::rows)
Root
For($column)
Step(child::column)
Step(child::columns)
Root
Element(intersection)
Sequence
Attribute(row)
VariableReference($row)
Attribute(column)
VariableReference($column)

Below is an excerpt from the query plan, showing this product:

product
for $row in
step
$fs:dot_0/child::rows
child::row
with
for $column in
step
$fs:dot_0/child::columns
child::column

The original query plan looks like the columns would be navigated once for each row retrieved from the document. The product prevents this. When the product is evaluated it stores a list of all the tuples from the right hand side, and then each tuple in the left gets combined with each tuple from the right, without reevaluating the inner for loop. In practical terms, all the assignments to $y are recorded and replayed for each value of $x.

For this case the result can be mimicked by storing the definition of $y in a variable, but in the general case the product may not be storing just the values of a single for clause, but a whole set of FLWOR clauses.

Joins


A join is formed when a where clause depends on both sides of a product.

Here is an example query with a join:

for $author in doc("bookshelf.xml")/book/author
for $book in doc("bookshop.xml")/book
where $book/author = $author
return $book

Note that the where expression applies a predicate to a sequence of tuples returning only those which match. Since the two for expressions are independent they form a product, and since the where clause depends on both sides of the product, this is a join.

In this form joins are evaluated no differently to products with a where clause. This is known as a nested loop join. The predicate is applied to the product of the tuples returned from the two for expressions. This gives quadratic performance.


In the query plan a join looks something like the following:

hash join
for $author in doc("bookshelf.xml")/book/author
to
for $book in doc("bookshop.xml")/book
on
$book/author = $author


Hash Join


If he predicate of a join is an equality operator the join then gets converted to a hash join. The expressions on either side of the equality operator are known as the keys. When the right (second) list of tuples is constructed, the right key is evaluated for each tuple. The tuples are stored in a Dictionary mapping from a given key value to a list of tuples that match this key (this dictionary is known as the join index. Once this dictionary is built up then for each left tuple (from the first input), the left key is evaluated. This key is used to look up the matching tuples from the right hand side and and the results are combined and returned. Assuming that only a few tuples from the right hand side match each time then this reduces the complexity from O(n2) to O(n) (assuming constant overhead for a dictionary lookup).

There were many complications in implementing such a join, especially in the general case when the arguments have type xs:anyAtomicType*. If two types are incomparable, a type error should be raised. xs:untypedAtomic values need to be converted to the base type of every type that appears in the other key, and used for comparison. numeric types should be compared as if they had been promoted to the lowest base type. Numeric NaN values never match. If keys are sequences they may match more than one key in the index, but all the matching tuples must be returned once, and in the order they were encountered. Because of all these potential issues, there are several implementations of the hash join in XmlPrime which tackle different combinations of these issues. All the implementations have the same O(n) basic performance characteristics.

Left Outer Joins


A left outer join is similar to a join, but the results are grouped for each input argument. Here is an example of a query with a left outer join:

for $user in doc("users.xml")/user
return
<user name="{$user/@name}">
{
for $comment in doc("comments.xml")/comment
where $comment/@user = $user/@name
return $comment
}
</user>

When converting this query into the join algebra, XmlPrime extracts each seperate FLWOR expression and puts it into a new let clause (with a new variable; in this case we will call it $comments), in order to group all FLWOR clauses together:

for $user in doc("users.xml")/user
let $comments := for $comment in doc("comments.xml")/comment
where $comment/@user = $user/@name
return $comment
return
<user>{$comments}</user>

The following pattern is then spotted as a left outer join.

[ClausesA]
let $a := [ClausesB]
where [Predicate]
return [AggregateExpr]

The above pattern is only a Left outer join if ClausesA and ClausesB are independent, and if Predicate depends
on both ClausesA and ClausesB. Note that in order to reach this condition some (let) clauses may need to be moved from ClausesB to ClausesA and some clauses may need to be moved from ClausesA and concatenated to the entire expression.

The left outer join is evaluated like a standard join, except for each tuple stored from the right hand side we also compute the aggregate expression (or if the join is evaluated lazily, the variables required to compute the aggregate expression) and then for each left hand tuple we find all the tuples that match it. We get (or compute) the aggregate value for these tuples, concatenate (this is normal sequence concatenation, not tuple concatenation) the aggregate values and then store the result in $a.

As with the standard join, both nested-loop and hash variants are supported. Again, in the case of the hash join, assuming only a few tuples match in each case, this reduces an O(n2) operation to O(n).

In the query plan a left outer hash join looks something like the following:

left outer hash join
for $user in doc("users.xml")/user
to
for $comment in doc("comments.xml")/comment
on
$comment/@user = $user/@name
aggregate
$comment
as $comments


Group-by


One particularly common form of the left outer join is the group-by. A group by clause is where you want to group all items in a sequence by a particular key. Here is an example which groups a list of books by their author.

let $books := doc("bookshelf.xml")/books/book
for $author in distinct-values($books/@author)
return
<author name="{$author}">{
$books[@author=$author]
}</author>

Using the join optimizations so far described this will optimize to a left outer join (and inventing a couple of variables, $book and $matching-books):

let $books := doc("bookshelf.xml")/books/book
left outer hash join
for $author in distinct-values($books/@author)
to
for $book in $books
on
$book/@author = $author
aggregate
$book
as
$matching-books
return
<author name="{$author}">{$matching-books}</author>

The group by is formed when you have a left outer join with the following form:

left outer hash join
for $x in distinct-values(keys($sequence))
to
for $y in $sequence
on
$x = key($y)
aggregate
$y
as
$group

In the above plan keys and key are general expressions depending on the variables shown. A left outer join is formed if applying keys to a sequence is the same as applying key to each item in the sequence and concatenating the result. Also with the current implementation key must return zero or one items.

Determining the condition above (applying keys to a sequence is the same as applying key to each item in the sequence) in general is not that easy, and our implementation is not yet perfect, although this improves with every release.

Note also that in practice the left outer join is liable to have many more clauses (in particular common subexpression elimination introduces a lot of let clauses), and these may have to be moved out of the join in order to get it into this form.

If XmlPrime does manage to manipulate the query to be in this form, and all the conditions are satisfied, then the query plan now looks like this:

for $book in doc("bookshelf.xml")/books/book
group by
$book/@author
aggregate
$book
as
$matching-books
return
<author name="{$author}">{$matching-books}</author>

The group by clause combines the join and the distinct values operation. It builds a join index, storing the aggregate values as it goes. Iterating over this join index then provides the values for the aggregate variable. Over the left outer join, this group by reduces the amount of memory used, and ensures that the input expression is evaluated only once. In the best case this can reduce the memory usage by more than half. This is because in the left outer join, the input expression is used twice and so all its values are stored in a variable. Also an additional hash table is built up to perform the distinct-values operation.

Conclusion


Hopefully this post has shed some light on the various join optimizations that are performed in XmlPrime. By examining the query plan of a query you can see which join optimizations have been applied.

Most of the improvements to join optimizations come from rewriting queries to make them closer to the forms above. If you have a query which you believe should contain one of the above optimazations, but does not then please let us know as we are keen to spot as many different ways of writing a join as possible.