The Rllvm Package
Rllvm on GitHub
Rllvm_0.4-2.tar.gz
JSM2012 Talk
The Rllvm package is an R interface to the LLVM library, which provides
facilities for creating native code and compilers for different
languages. The aim of this package is to expose those facilities
so that we, as a community, can experiment
with building compilers for the R language and speed up
its evaluation.
I have long felt that we should build on
other platforms that provide their own compilers, e.g. LISP.
This package, however, is an effort to stay within the R community while providing
the foundation for building native compilers rather than
interpreted byte-code compilers.
This package is not yet a compiler for R. It merely provides the
tools with which one can write a compiler that creates native code. I
expect that we can use Luke Tierney's compiler package
on top of this to leverage some of its optimizations, then generate
the native code, and then apply LLVM's optimization passes.
It remains to be seen whether these two optimization approaches
are orthogonal or share a great deal in common.
Documentation
There are several examples, adapted from the LLVM tutorials
and developed as explorations of that API.
- Compiling GPU kernels
- Compiling simple scalar arithmetic
- This example comes from the LLVM tutorial
(actually the documentation for the previous release).
The routine takes three numbers, squares the third, and adds the
result to the sum of the other two. It is not a vectorized
function.
The R code closely parallels the C++ code described in that tutorial.
- Computing the greatest common divisor (GCD)
- Computing the GCD of two integers.
This example comes from the LLVM tutorial
(actually the documentation for the previous release).
The R code closely parallels the C++ code described in that tutorial.
- Cumulative sum of a vector
- This implements the cumulative sum of a vector
in R and via a manually generated native routine.
The speedup is a factor of 32.7 on a Mac OS X machine
and 26 on a Linux machine.
- Implementing x + 1 in R and natively
- This is the example for which Luke Tierney presented timing results
in his UseR! 2010 talk, which compared two approaches to byte-code compilation
and Stephen Milborrow’s Ra/jit system.
Here, we manually generated native code to implement the R
function. The code we created runs 108 times
faster than the interpreted R code on a Mac OS X machine,
and 72 times faster on a Linux machine.
This contrasts with the numbers Luke reported (on a different machine),
which showed a speedup of a factor of 3.4 for the original byte compilation,
20 for Ra, and 29 for the experimental byte compilation
system Luke is working on.
- 2-D random walk
- This is an implementation and comparison of Ross
Ihaka's example of a 2-D random walk.
Ross progressively illustrates how to improve the naive
implementation using profiling and gradually vectorizing
the code.
The result is a speedup of a factor of 200.
By implementing the naive version via Rllvm,
we obtain a speedup of a factor of 340.
- Generating a routine that calls an
existing, external routine
- This also shows how to control how the external symbols
are resolved.
- Global variables
- Details of storing values in variables and
elements of arrays
- store.R,
store1.R,
store2.R
- Comparison of timings
for 5 different problems
- The problems and approaches are described
in examples in the package in explorations/ and tests/.
These not only show that we can outperform R's
interpreter, but also outperform R's vectorized code
by changing the
- FAQ
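Several of the examples above compile small scalar routines that work one element at a time. As a rough illustration of the logic behind three of them (the scalar arithmetic routine, the GCD, and the cumulative sum), here is a minimal sketch in plain Python. It does not use Rllvm or LLVM and is not taken from the package. The function names `sum_plus_square`, `gcd`, and `cumulative_sum` are hypothetical, chosen only for this illustration, and the GCD is written with the standard Euclidean algorithm, which may differ from the formulation used in the LLVM tutorial.

```python
# Illustrative only: plain Python, not Rllvm code. All names are hypothetical.

def sum_plus_square(x, y, z):
    """Square the third argument and add it to the sum of the first two."""
    return x + y + z * z


def gcd(a, b):
    """Greatest common divisor of two non-negative integers (Euclid's algorithm)."""
    while b != 0:
        a, b = b, a % b
    return a


def cumulative_sum(values):
    """Running total of a sequence, computed one element at a time."""
    result = []
    total = 0
    for v in values:
        total += v
        result.append(total)
    return result


if __name__ == "__main__":
    print(sum_plus_square(1, 2, 3))
    print(gcd(12, 18))
    print(cumulative_sum([1, 2, 3, 4]))
```

Explicit scalar loops of this kind are where an interpreter tends to spend much of its time, which is why they are natural candidates for native code generation.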
Other Approaches
When I started this work, I was unaware that Byron Ellis had also
started on bindings to LLVM back in 2008. See rllvm on R-Forge.
Byte-code compilation is another worthwhile approach.
See Luke Tierney's articles on this,
the existing support in the R engine, and his compiler package.
Issues
There are many classes and methods to add
to this interface.
License
This is distributed under the GPL2 License.
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Tue Jul 16 11:30:42 PDT 2013