Tuesday 22 December 2009

XmlPrime 1.2 Released

XmlPrime 1.2 has just been released.

The major new features in this release are push-based evaluation.

Queries can now be evaluated directly to an XmlWriter or directly serialized to a TextWriter or Stream. This prevents an immediate result tree from being constructed, which reduces memory usage, and allows the result to be returned sooner, and quicker.

All expressions returning sequences can now be evaluated via push-based mechanisms. This method of evaluation is more efficient, but only where the entire sequence is consumed. This optimization is applied automatically to any expression that is consumed eagerly.

Expect a post next week with a more detailed analysis of this feature.

See the full change log for more information about the changes in this version, or download the new version now.

Thursday 17 December 2009

Avoiding StackOverflowExceptions

This post describes several situations that can cause StackOverflowExceptions in XmlPrime, and how to avoid them.

Look at the following fairly innocuous XQuery function:

declare function local:fact($x)
{
if ($x = 0)
then 1
else $x * local:fact($x-1)
};

As you probably expect this function works fine for small values of $x, but as $x grows large then the function will overflow the stack. The stack is overflowed far more quickly than a direct C# implementation of this function, as a function call is made for each expression. In this case the call stack contains one call to prepare an (XmlPrime) stack frame, one call to bind the arguments, one call to evaluate the if expression, and finally a call to evaluate the multiplication operator before recursing. This means that the stack will overflow quite quickly.

So, what can we do about this?

The trick to solving this problem is an optimization called tail recursion. If a function returns the result of a function call as a value, then the runtime can evaluate this new function call in place of the old one, without growing the call stack. Let me try to demonstrate what I mean by this. The factorial function we defined above is not tail recursive, as the value from the recursive function call is not returned directly, but is multiplied by $x after it is returned. We can rewrite the function in order to make it tail recursive, by adding an accumulating argument:

declare function local:fact2($x, $accum)
{
if ($x = 0)
then $accum
else local:fact($x - 1, $accum * $x)
};

declare function local:fact($x)
{
local:fact($x, 1)
};

Note that the value of the recursive call is now returned directly, this means that the tail-recursion optimization can be applied. Evaluating this function is now a lot more like evaluating the following C# function:

int Fact(int x)
{
int accum = 1;
while (x != 0)
{
x = x-1;
accum = accum * x;
}
return accum;
}

As you can see, evaluated in this way the function should no longer trigger a stack overflow exception. Unfortunately though, the function had to be manually rewritten in order to take advantage of this optimization. In the future we would like to try to make this transformation automatically, to widen the set of functions in which tail recursion can be applied.

Does this solve our problem? For this case, yes, but things can get more complicated.

Consider writing our own fold function. fold takes a function f that takes two arguments and returns an accumulated value. fold applies this function to a sequence by applying it to current result and each item in turn to build up a final result. For example if we fold the addition function over a sequence we get the sum of the sequence, and if we fold the multiply function over a sequence we get the product.

In XQuery 1.1, fold can be written like this:

declare function local:fold($func as function(item()*, item()) as item()*,
$seq as item()*,
$accum as item()*)
{
if (empty($seq))
then $accum
else local:fold($func, subsequence($seq, 2), $func($accum, $seq[1]))
};

However with XQuery 1.0, which does not support higher-order functions, we have to fake it by passing around a node to represent the function, and writing an apply function that applies it:

declare function local:apply($func as node(),
$accum as item()*,
$arg as item())
{
typeswitch ($func)
case element(add) return $accum + $arg
case element(multiply) return $accum * $arg
case element(reverse) return ($arg, $accum)
...
default error((), "Function not recognised")
};

declare function local:fold($func as node(),
$seq as item()*,
$accum as item()*)
{
if (empty($seq))
then $accum
else local:fold($func,
subsequence($seq, 2),
local:apply($func, $accum, $seq[1]))
};

local:fold is tail recursive and so we should never get a stack overflow, right?

Unfortanately, this is not the case. The problem occurs because local:apply (or $func in the XQuery case) is not nescassarily strict in the accumulator argument. In the XQuery 1.0 case this is because we might hit the default clause in the local:apply function, and in the XQuery 1.1 case we might have a function like:

declare function last($accum as item()*, $item as item()) as item*
{
$item
};

which when folded over a sequence returns the last item.

Aside: In the XQuery 1.0 version, if you are to call fold with a statically known function, then XmlPrime may specialize the fold function and create a new version inlining the function application. This simplifies the function enough to identify that the argument is strict. This may not always be the case though, so it shouldn't be relied on.

Since $accum may not be evaluated, it is defined lazily. So rather than applying the function and passing the value to the recursive function, a closure is created and the value is computed later.

Consider evaluating the function call

local:fold(<add />, (1,2,3,4,5), 0)

After the first iteration the closure for $accum looks like:

local:apply(<add />, 0, 1)

The next iteration applies the value 2 to $accum, and so the closure for $accum looks like

local:apply(<add />,
local:apply(<add />, 0, 1), 2)

And when the function finally returns $accum it returns a closure that evaluates

local:apply(<add />,
local:apply(<add />,
local:apply(<add />,
local:apply(<add />,
local:apply(<add />, 0, 1), 2), 3), 4), 5)

When $seq is quite large, computing the result of this closure will result in a stack overflow error.

So how can we fix this?

In order to fix this we need to ensure that the argument for the recursive call is evaluated fully. In the general case there is no easy way to fix this beyond ensuring that the argument is used entirely in every eventuality. In order to facilitate this we have added a new xmlprime:eager extension expression that will be available in the upcoming XmlPrime 1.1 release.

To force the argument to be fully evaluated we can annotate the expression like this:

declare namespace xp = "http://www.xmlprime.com/"

declare function local:fold($func as node(),
$seq as item()*,
$accum as item()*)
{
if (empty($seq))
then (# xmlprime:eager #) { $accum }
else local:fold($func,
subsequence($seq, 2),
local:apply($func,
(# xmlprime:eager #) { $accum },
$seq[1])))
};

Now $accum is evaluated fully in both branches. This forces the strictness analysis to identify $accum as a strict variable and hence the function becomes a simple tail recursive function and is more resiliant to stack overflow errors.

This fold function still has a problem that can cause stack overflow errors. The $seq argument is evaluated lazily (as each iteration only looks at the first value of $seq). Whilst technically the function is strict in $seq (as it does inspect every value eventaully), the current analysis engine is not capable of spotting this. It is impossible to always determine the strictness of an argument with 100% certainty as this can be specialized to the halting problem. The value of $seq ends up in a closure that looks something like this:

subsequence(
subsequence(
subsequence(
subsequence(
subsequence(..., 2), 2), 2), 2), 2)

This will eventually cause stack overflow errors.

This case of using the first item of a sequence and recursing with the remainder of the sequence is an important and common one in functional programs. In a future version of XmlPrime we would like to spot this case, read the first few items from the enumerator, and recurse by passing the same enumerator to the next iteration. This would avoid building a closure to evaluate the subsequence operation, and avoid caching the values returned and so should drastically improve performance of this kind of function (probably by a factor of about O(n) where n is the depth of recursion). Unfortunately this feature has not yet been implemented, and so this problem enforces a limit on the number of items in a sequence that can be processed in this manner.

You may be wondering why it is necessary to put the eager extension on the then branch of the if expression. Consider the following query, which reverses a sequence:

local:fold(<reverse />, $seq, ())

This query behaves properly and avoids the problem. However, lets make a simple modification to the query, to return the $nth last item in $seq:

local:fold(<reverse />, $seq, ())[$n]

For large values of $n removing the xmlprime:eager around the then clause this would cause the query to stack-overflow. This is because we do not always use all of the result of the function, and so the variable is evaluated lazily. This causes a tree of enumerators that perform the concatenation to be produced, and can cause a stack overflow when iterating over the sequence.

This case is another case that we would like to fix in a future version of XmlPrime. To fix this we would need to implement a custom MoveNext method that rather than returning a boolean value would return a new IEnumerator to use to enumerate the rest of the sequence (or null if we are at the end of the sequence). This would avoid pushing a new (C#) stack frame for each concatenation in a lazy result.