Wednesday, 28 October 2009

Variables and Strictness analysis in XmlPrime 1.0

In the last post I discussed the different ways in which variables are placed on the stack in XmlPrime 0.9, and the strictness analysis that is performed in order to determine which of these methods is most appropriate for each variable.

In this post I will explain how this mechanism has been expanded for XmlPrime 1.0.

In 0.9 variables were considered either to be eager, or lazy. For XmlPrime 1.0 we redefined these terms slightly. A variable is strict if it is always evaluated at least in part. A variable is eager if whenever it is evaluated it is evaluated in its entirety. If a variable is not eager, it is lazy. This provides us with a middle ground of a non-strict eager variable. $x in the code sample below is an example of a non-strict eager variable.

if (...)
then
sum($x)
else
()

These variables are placed on the stack in closures, but when they are needed they are evaluated in their entirety, and the value on the stack is replaced with the strict version. By evaluating the variable in its entirety, the extra layer of caching that was required for lazy variables in XmlPrime 0.9 is not needed.

Both of these conditions propagate throughout the query in a similar way so the only significant barrier for development was annotating all the built in expressions and functions to more accurately describe how they might use their arguments. This may depend on how the function itself is evaluated.

The biggest change to variables in XmlPrime 1.0 is that they can now be unboxed. Recall that unboxing is one of new optimizations in XmlPrime 1.0. In XmlPrime 0.9, all entries on the stack were instances of IEnumerable<T> for items. In order for unboxing to have a significant effect we needed to extend the stack to allow unboxed values. In XmlPrime 1.0 all entries on the stack implement IStackEnumerable which provides IEnumerable<T> implementations for all types that are used in XmlPrime (although in most cases most of the GetEnumerator methods throw an exception).

Implementations of all the types of closure are generic and can be used to store lists or closures of any unboxed type. The implementations for XdmItem
also allow for evaluation as appropriate unboxed types.

Whilst individual items need not be boxed, the value as a whole still needs to be wrapped in a suitable stack entry. This can be mitigated in some cases, for example variables defined in for expressions now share the same stack entry object for every item in their definition.

As with unboxed expressions, a variable can only be unboxed if its dynamic type is not needed to be preserved anywhere that the variable is used. The strictness analysis was extended with a new property - a variable is unboxable if its type will never be inspected. The strictness analysis pass was extended to propagate this property as well, and so all the information required to identify which type of closure is to be created at runtime is described by the static type of the variable, and these three properties. The combination of these three properties is known as the evaluation strategy.

The techniques above fairly thoroughly cover variables introduced by let clauses. Local variables are also introduced by function arguments, and so a lot of work has gone into these for XmlPrime 1.0.

In XmlPrime 0.9 all function arguments were evaluated eagerly. Clearly this is not always beneficial. The strictness analysis rules have been extended to be applied to user defined function arguments.

The first challenge that is that the evaluation strategy of an argument depends on the evaluation strategy of the function itself. The effect of the strict property is easy to determine - if a function is not strict (might not be evaluated) then neither is its arguments. As it turns out strictness and eagerness of a function will never affect the unboxability of its arguments and vice versa. The strictness rules are applied twice to every function - once assuming that it is evaluated eagerly and unboxed, and once assuming that it is evaluated lazily and boxed. The evaluation strategy for each argument in each case is stored, and these can be combined to determine the evaluation strategy of each argument wherever the function is called.

The other big challenge for computing the evaluation strategy for function arguments comes with recursive (or mutually recursive functions). How do we determine whether $y is strict in the following case for example?

declare function local:f($x, $y)
{
if ($x = 0)
then $y
else f($x - 1, $y)
}

The solution to this problem is to assume that every argument to every recursive function is strict, lazy and not unboxable. After the strictness analysis has been applied, if any of the evaluation strategies have been updated then the analysis is run again for any function that calls it. The analysis is repeated until the stratgies do not change. This algorithm ensures that optimal evaluation strategies are always selected.

Another improvement that has been applied to function calls is that if a function calls another function, then often it is passing local variables as arguments. Using the model described above, these arguments are either fully evaluated, or a lazy closure is created wrapping the previous definition. If a variable is fully evaluated then this causes an extra copy of the variables value to be stored in memory every time the function recurses. Adding an extra lazy thunk causes an extra level of caching to be added to the variable each time it recurses. This latter effect can compound causing successive recursions of a function perform increasingly badly. XmlPrime adds a simple check for this case during code generation - if a local variable is passed as an argument, then the stack entry is simply copied into the new stack frame.

This case (setting a variable to be the value of another variable) only occurs for function arguments, as otherwise it would have been rewritten by the optimizer. Another case that is eliminated by the optimizer is local variables whose definitions are values. When a function is called with a constant argument, the stack entry can be constructed ahead of time, and so no boxing is needed to make the call.

There are still many ways in which argument passing could be much improved. One case is for lazy variables whose definitions are only inspected once (this can only occur in function arguments, as otherwise the variable would have been eliminated). Currently the values are cached as the variable is evaluated, but this is unnecessary if the variable will never be inspected again. Look out for improvements like this in future versions of XmlPrime.

Friday, 9 October 2009

Variables and strictness analysis in XmlPrime 0.9

This is the fourth post in a series about performance and optimizations in XmlPrime. The first post in the series is available here. In this post I will look at how we implement variables in XmlPrime.

XmlPrime 1.0 has introduced a number of improvements to the way we use variables, mainly to increase performance of user defined functions and to unbox variable values. See the previous post for more information about unboxing.

All expressions returning sequences in XmlPrime are (currently) evaluated lazily. This means that every expression returns an enumerator, and the values are computed as they are needed.

This post looks at how we implement variables in XmlPrime. All variables are stored in the stack. The stack is implemented as an array of IEnumerable instances. Every variable is allocated an index into the stack, and each function call is allocated its own stack in which to place its variables (this is required in order to support recursive functions - each time the function recurses we need a new set of variables). There is also a separate stack for global variables, which is implemented in much the same way.

The simplest way to place a value on the stack is to evaluate it in its entirety, into a List<T> for example. This list is placed on the stack, and then every variable reference can retrieve and enumerate over this value.
If our value is a single item then we can do slightly better by skipping the list.

However this goes against our design philosophy of evaluating as much as possible lazily; if only the first item of the variable is looked at then we still end up computing the whole sequence.

Another way variables can be stored on the stack is by creating a closure (sometimes known as a lazy thunk) around the expression defining the variable. A closure consists of a copy of the current stack frame, and the expression defining the variable. This closure is placed on the stack. When a reference to the variable is evaluated, the closure is retrieved, and the act of enumerating it causes the expression to be evaluated in its own context, as if it had been evaluated when it was first seen. Although this addresses the issue that only as much of the value that is needed is evaluated, it introduces a new problem that if the variable is evaluated more than once then the result is recomputed. To defeat this an extra level of caching is added to the enumerators. A list of cached values is shared between all enumerators over the closure, and every time the enumerator is moved, an attempt is made to retrieve the value from this list before evaluating it.

This extra level of caching adds a performance hit however, and results in the sequence being stored in memory. Some small optimizations can be made though; once we have enumerated over the entire sequence, the closure on the stack can be replaced with its value, simplifying any further accesses to the variable, but any references still being enumerated still pay a performance penalty.

For this reason XmlPrime 0.9 used both ways of evaluating variables. All variables are classed as eager or lazy. A strict variable is one that is always used in its entirety - for example $x in the expression sum($x). In order to sum a sequence, every value in the sequence must be examined. A lazy variable is one that may not be evaluated in its entirety (or may not be evaluated at all). This is known as strictness analysis. It is performed in a separate pass after optimization, as optimization moves variables around which can invalidate judgements made. Special annotations are needed for every expression and function in order to correctly propagate the strictness rules. The analysis can get quite complex as for example in the following case, which illustrates some of the considerations that need to be made:

let $x := ...
let $y := ...
let $z := sum($y)
return
if (...)
then sum ($x)+ $z
else $z

In this case $x is lazy, since if the else branch is hit $x is never evaluated. $z however is always evaluated in both branches and so is eager. $y is eager, but only because $z is eager.

Eager variables are clearly best suited to having their values placed on the stack (as they will be used in their entirety), and lazy variables are better evaluated using a closure, as they may not be evaluated in their entirety.

As I mentioned at the start, this is the behaviour as implemented in XmlPrime 0.9. XmlPrime 1.0 has dramatically expanded the number of ways that variables are placed on the stack, and the strictness analysis is now a lot more sophisticated, giving more detail on how variables are used. Variables can now be stored as unboxed values, and the strictness analysis has been improved for recursive functions. I will explain these differences in the next post.