Friday 4 September 2009

XmlPrime 1.0 performance enhancements

This is the first in a series of posts about the performance updates in XmlPrime.

First off, some numbers.

To demonstrate the performance changes, I ran the three samples that we provide using the command line tool from the XmlPrime 0.9 Technical Preview and the command line tool provided with XmlPrime 1.0. All these tests were run on an x64 machine with the x64 VM (under x86 the performance is much better; we will cover this in a later post).

The times are as reported by the command line tool, so the compile time includes parsing, optimization and compilation; the run time includes the time to evaluate the query and serialize the results.
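To make the split between the two figures concrete, here is a minimal Python sketch of that kind of two-phase timing. It is not XmlPrime's code: `time_query` and the `compile_query` and `run_query` callables it takes are hypothetical stand-ins, used only to show which work lands in which number.

```python
import time


def time_query(compile_query, run_query):
    """Time two phases separately and return both in milliseconds.

    compile_query and run_query are hypothetical stand-ins: the first
    covers parsing, optimization and compilation, the second covers
    evaluation and serialization of the result.
    """
    start = time.perf_counter()
    compiled = compile_query()           # parse + optimize + compile
    compiled_at = time.perf_counter()
    output = run_query(compiled)         # evaluate + serialize
    finished = time.perf_counter()
    return {
        "compile_ms": (compiled_at - start) * 1000.0,
        "run_ms": (finished - compiled_at) * 1000.0,
        "output": output,
    }


# Toy usage: "compiling" builds a function, "running" calls it and
# serializes the result to a string.
timings = time_query(lambda: (lambda: sum(range(1000))),
                     lambda compiled: str(compiled()))
```

Because serialization is counted in the run phase, a query that produces a very large result will show that cost under run(ms) rather than compile(ms).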

Raytracer

The times here are from running ppm.xq of the final raytracer at a resolution of 640x480. Here is the command used to run this test:
XQuery -t ppm.xq width=640 height=480
Version  compile(ms)run(ms)
0.9      1182412186
1.0      163307593

Currency

These times were computed with a modified version of currency.svg that takes a source document as a context item, supplied as GBPNoon.xml. This prevents network latency from affecting the timings.

The command used to run this query was
XQuery -t -s GBPNoon.xml currencysvg.xq
Version  compile(ms)run(ms)
0.9      11991344
1.0      64955

iTunes Artist Graph

The iTunes query used was the one from part 3 of the sample. This is because the part 4 query is largely IO bound, as it retrieves a lot of data from Wikipedia.

The command used to run this query is:
XQuery -t iTunes3.xml
Version  compile(ms)  run(ms)
0.9      458          93483
1.0      80           98529

This particular query runs slower in 1.0 than in 0.9, which we were not expecting. The reason is that we changed some of the rules around variable dereferencing when we added support for compiling modules separately. In this case global variables cannot be removed, as they are part of the public interface of the query. Variables were being dereferenced in the main module, but their declarations were not being removed. Combined with the fact that all global variables are currently evaluated eagerly, this meant that the source document was being validated several times. Having discovered the problem, we have fixed the dereferencing issue, and the fix will be available in the next version, due in the next couple of weeks. A run with the latest dev version brings the run time down to 66884 ms.
[UPDATE (09/09/2009)]: These changes have been incorporated in XmlPrime 1.0.2
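The effect described above can be illustrated with a small toy model in Python. This is only a sketch under assumptions, not XmlPrime's implementation: it assumes that "dereferencing" means replacing a variable reference with the variable's defining expression, the variable names ($source, $artists, $albums) are hypothetical, and the "after" variant shows just one possible way to avoid the repeated work.

```python
validations = 0  # counts how often the expensive step runs


def validate(doc):
    """Stand-in for schema validation of the source document."""
    global validations
    validations += 1
    return {"validated": doc}


def run_before_fix(doc):
    # Global declarations, all evaluated eagerly at start-up.
    # $source was inlined into $artists and $albums, but its own
    # declaration was kept because it is treated as public.
    source = validate(doc)                        # kept declaration
    artists = validate(doc)["validated"].upper()  # inlined copy
    albums = validate(doc)["validated"].lower()   # inlined copy
    return artists, albums


def run_after_fix(doc):
    # References are left pointing at the single kept declaration.
    source = validate(doc)
    artists = source["validated"].upper()
    albums = source["validated"].lower()
    return artists, albums


def count_validations(run, doc):
    """Run one variant and report how many validations it triggered."""
    global validations
    validations = 0
    result = run(doc)
    return result, validations


before = count_validations(run_before_fix, "Library")
after = count_validations(run_after_fix, "Library")
print(before[1], after[1])  # prints "3 1"
```

Both variants compute the same values; the only difference is how many times the expensive validation step is repeated, which is the kind of cost that shows up in the run time of a query over a large source document.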

So how have we achieved this performance boost? We have implemented a number of significant optimizations since 0.9. The biggest ones are listed here:
  • Unboxed evaluation of expressions
  • Improved strictness analysis
  • Unboxing variables
  • Static type analysis of recursive functions
  • Strictness and unboxing analysis of function arguments
  • Precompiling the DLL
  • Elimination of some quadratic algorithms during common sub-expression elimination
  • New XDM type system implementation, more tightly bound to XML Schema
  • Function specialization
  • Specialization of core expressions
Most of these changes will probably sound quite cryptic, so this shall mark the start of a series of posts on how XmlPrime works under the hood, with in-depth articles on some of the different optimizations performed by XmlPrime.
