Abstract
In this article, we describe how we can use the RGCCTranslationUnit code to generate the registration information for R to access native routines in a DLL. This information provides more structured and robust access to the routines, including some reflection information. Automating the generation of this code simplifies development and increases the chance that it is correct by minimizing human error. This article describes the computations that are available via the high-level function getRegistrationInfo(). There is a second implementation, available via the function generateRegistrationInfo(), and the two need to be merged.

void foo(int *x, int *x_len, double *ans);
SEXP bar(SEXP a, SEXP b);
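/* argument types for foo when called via .C (the array name is illustrative) */
static R_NativePrimitiveArgType foo_types[] = {INTSXP, INTSXP, REALSXP};

static R_CallMethodDef CallEntries[] = {
    {"bar", (DL_FUNC) &bar, 2},
    {NULL, NULL, 0}
};

static R_CMethodDef CEntries[] = {
    {"foo", (DL_FUNC) &foo, 3, foo_types},
    {NULL, NULL, 0}
};

R_registerRoutines(dll, CEntries, CallEntries, NULL, NULL);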
Note: This doesn't yet deal with expressions at the top level, i.e. outside of functions. This is easy to add, just not done yet.
Each interface element is either NULL or a list of
the names or symbols which were referenced in any expression in the
function via that interface.
There is now an additional element named
"expressions" which contains the actual expressions
in which the foreign routine calls are made.
This allows us to process them once we have the information
about the routines, without revisiting all the functions.
However, revisiting them is relatively inexpensive.
Now that we have the identities of the routines that are called, we
can create the registration information. Let's assume that there
is no aliasing of the symbols via the useDynLib() directive in the
NAMESPACE file. Since we are automating the registration information,
we will typically generate that information also and so there will be no
aliasing. If there is to be aliasing, we will create the aliases via
a mapping function or prefix/suffix pair when creating the registration
information.
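As an illustration of the prefix/suffix and mapping-function options, here is a minimal sketch. The helper makeAliases() and its arguments are hypothetical and are not part of RGCCTranslationUnit; the sketch only shows how R-side alias names could be derived from the native routine names.

```r
# Hypothetical helper: derive R-side aliases for native routine names.
# Either supply a mapping function via 'map', or a prefix/suffix pair.
makeAliases = function(routines, prefix = "", suffix = "", map = NULL) {
  if (is.function(map)) {
    aliases = vapply(routines, map, character(1), USE.NAMES = FALSE)
  } else {
    aliases = paste0(prefix, routines, suffix)
  }
  # names are the aliases, values are the native routine names
  structure(routines, names = aliases)
}

# The "C_" prefix is an arbitrary choice for illustration.
aliases = makeAliases(c("R_newXMLDoc", "R_parseURI"), prefix = "C_")
upper = makeAliases("foo", map = toupper)
```

The result is a character vector whose names are the aliases and whose values are the original routine names, so it can be used as a lookup table when writing out the registration entries.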
So our task is to read the tu files, find either
the declaration or definition of each routine referenced,
ensure that it has the appropriate signature
for the interface by which it is being called, and
generate the registration information.
We should note that when we find the expression, we can determine the
number of arguments which are being passed and also, we can determine
the types of literals if there are any and ensure that they are
compatible. This allows us to do static or off-line checking rather
than run-time/dynamic checking. To do this, we need the expressions
themselves and we can squirrel this information away in a single pass
or alternatively do a second pass once we have the information about
the routines.
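The static check on argument counts can be sketched in base R. The functions findNativeCalls() and nativeArgCount() are hypothetical names used for illustration and are not the package's implementation; the example function f and the DLL name are made up, and the expected counts come from the declarations of foo and bar.

```r
# Recursively collect every call to one of the foreign-function interfaces.
findNativeCalls = function(e, interfaces = c(".C", ".Call", ".Fortran", ".External")) {
  if (!is.call(e))
    return(list())
  fn = e[[1]]
  found = if (is.name(fn) && as.character(fn) %in% interfaces) list(e) else list()
  for (el in as.list(e))
    if (!missing(el))                 # skip empty arguments such as x[, 1]
      found = c(found, findNativeCalls(el, interfaces))
  found
}

# Count the arguments passed on to the native routine, ignoring the routine
# name and the named arguments that R's interface functions consume themselves.
nativeArgCount = function(e) {
  args = as.list(e)[-(1:2)]
  length(args) - sum(names(args) %in% c("PACKAGE", "NAOK", "DUP", "ENCODING"))
}

# Example R function (never called here, only inspected).
f = function(x, a, b) {
  ans = .C("foo", as.integer(x), length(x), ans = double(1), PACKAGE = "duncan")$ans
  .Call("bar", a, b)
}

calls = findNativeCalls(body(f))
counts = sapply(calls, nativeArgCount)
names(counts) = sapply(calls, function(e) as.character(e[[2]]))

# Number of parameters in the C declarations of foo and bar.
declared = c(foo = 3L, bar = 2L)
ok = counts == declared[names(counts)]
```

Any FALSE entry in ok flags a call whose argument count disagrees with the declaration, without running the code.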
Let's work with the XML package as an example.
library(XML)
ff = getNativeRoutineCalls("XML")

names(ff)
 [1] "htmlTreeParse" "libxmlVersion" "newXMLDoc"     "newXMLNode"
 [5] "parseDTD"      "parseURI"      "xmlDOMApply"   "xmlEventParse"
 [9] "xmlTree"       "xmlTreeParse"  "xpathApply"

sapply(ff, function(x) sum(sapply(x[1:5], length)))
htmlTreeParse libxmlVersion     newXMLDoc    newXMLNode      parseDTD
            1             1             1             2             1
     parseURI   xmlDOMApply xmlEventParse       xmlTree  xmlTreeParse
            1             1             1            14             1
   xpathApply
            1
sapply(ff, function(x) unlist(x[1:5]))
$htmlTreeParse
             .Call
"RS_XML_ParseTree"

$libxmlVersion
                 .Call
"RS_XML_libxmlVersion"

$newXMLDoc
        .Call
"R_newXMLDoc"

$newXMLNode
           .Call1            .Call2
   "R_newXMLNode" "R_insertXMLNode"

$parseDTD
          .Call
"RS_XML_getDTD"

$parseURI
       .Call
"R_parseURI"

$xmlDOMApply
                  .Call
"RS_XML_RecursiveApply"

$xmlEventParse
         .Call
"RS_XML_Parse"

$xmlTree
             .Call1              .Call2              .Call3              .Call4
      "R_newXMLDtd"   "R_insertXMLNode"  "R_newXMLTextNode"        "R_xmlNewNs"
             .Call5              .Call6              .Call7              .Call8
       "R_xmlSetNs"      "R_newXMLNode"   "R_insertXMLNode"   "R_insertXMLNode"
             .Call9             .Call10             .Call11             .Call12
  "R_xmlNewComment"   "R_insertXMLNode" "R_newXMLCDataNode"   "R_insertXMLNode"
            .Call13             .Call14
   "R_newXMLPINode"   "R_insertXMLNode"

$xmlTreeParse
             .Call
"RS_XML_ParseTree"

$xpathApply
             .Call
"RS_XML_xpathEval"
sapply(ff, function(x) (names(x)[1:5])[sapply(x[1:5], length) > 0])
htmlTreeParse libxmlVersion     newXMLDoc    newXMLNode      parseDTD
      ".Call"       ".Call"       ".Call"       ".Call"       ".Call"
     parseURI   xmlDOMApply xmlEventParse       xmlTree  xmlTreeParse
      ".Call"       ".Call"       ".Call"       ".Call"       ".Call"
   xpathApply
      ".Call"
To create the tu files, we add the rules

R_INCLUDE_DIR=${R_HOME}/include
R_SHARE_DIR=${R_HOME}/share

%.tu: %.c
	$(CC) -fdump-translation-unit $(ALL_CPPFLAGS) $(ALL_CFLAGS) -c -o /dev/null $<

TU_FILES=$(wildcard *.c)
tu: $(TU_FILES:%.c=%.tu)

to our package's Makevars.in or Makevars file. We can then create all the tu files with one command:
make -f Makevars -f $R_HOME/share/make/shlib.mk tu

Note that here we only deal with .c files, and we do not use g++ as the compiler because we are only interested in the declarations of the routines and not their bodies. If we want more information, we would use g++. Now we have the tu files and we can process them.
filenames = list.files("/tmp/R/XML/src", ".+\\.tu", full.names = TRUE)
library(RGCCTranslationUnit)
routines = lapply(filenames, function(f) {
    p = parseTU(f)
    r = getRoutines(p, gsub("\\.t00\\.tu$", "", basename(f)))
    resolveType(r, p)
})
names(routines) = basename(filenames)
rRoutines = as.character(unlist(sapply(ff, function(x) unlist(x[1:5]))))
allRoutines = as.character(unlist(lapply(routines, names)))
i = match(rRoutines, allRoutines)
if(any(is.na(i)))
    stop("missing routines: ", paste(rRoutines[is.na(i)], collapse = ", "))
rr = unlist(routines, recursive = FALSE)
names(rr) = gsub(".*\\.", "", names(rr))
rr = rr[rRoutines]
rr[[1]]$returnType@name
rr[[1]]$returnType@name == "USER_OBJECT_" && all(sapply(rr[[1]]$parameters, function(x) x$type@name) == "USER_OBJECT_")
all(c(rr[[1]]$returnType@name, sapply(rr[[1]]$parameters, function(x) x$type@name)) == "USER_OBJECT_")
class(rr[[1]]$returnType@type)
[1] "PointerType"
attr(,"package")
[1] "RGCCTranslationUnit"

class(rr[[1]]$returnType@type@type)
[1] "StructDefinition"
attr(,"package")
[1] "RGCCTranslationUnit"

rr[[1]]$returnType@type@type@name
[1] "SEXPREC"
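The check shown for rr[[1]] can be applied to every routine at once. The sketch below uses plain lists of type names as stand-ins for the S4 objects returned by RGCCTranslationUnit, so that it runs without any tu files; the structure of sigs and the helper isCallRoutine() are assumptions made for illustration.

```r
# Stand-in descriptions holding type names only (not the package's S4 objects).
sigs = list(
  bar = list(returnType = "SEXP",
             parameters = c(a = "SEXP", b = "SEXP")),
  foo = list(returnType = "void",
             parameters = c(x = "int *", x_len = "int *", ans = "double *"))
)

# A routine is suitable for .Call if its return type and every parameter
# are SEXP, or a typedef for it such as USER_OBJECT_.
isCallRoutine = function(r, sexpNames = c("SEXP", "USER_OBJECT_"))
  all(c(r$returnType, r$parameters) %in% sexpNames)

ok = sapply(sigs, isCallRoutine)
```

Here bar passes and foo does not, which is what we expect since foo is intended for the .C interface.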
# see findFF.R get_.C_type, is_.C_routine
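For routines called via .C, the registration entry can also record the expected R type of each argument, as in the INTSXP, INTSXP, REALSXP entry for foo. The following is a minimal sketch of mapping C parameter types to those constants. The helper cTypeToRType() is hypothetical; it is not the get_.C_type function mentioned above, whose behaviour is not shown here.

```r
# Hypothetical helper: map a C parameter type to the R type constant used in
# the .C registration table. Unknown types give NA.
cTypeToRType = function(ctype) {
  table = c("int *" = "INTSXP",           # could also be LGLSXP; a C int * is ambiguous
            "double *" = "REALSXP",
            "char **" = "STRSXP",
            "float *" = "SINGLESXP",
            "Rcomplex *" = "CPLXSXP",
            "unsigned char *" = "RAWSXP")
  key = gsub("\\s+", " ", trimws(ctype))   # normalise whitespace
  key = sub(" ?(\\*+)$", " \\1", key)      # "int*" becomes "int *"
  unname(table[key])
}

types = cTypeToRType(c("int *", "int*", "double *"))
entry = sprintf("{%s}", paste(types, collapse = ", "))
```

For the three parameters of foo, entry holds the text of the C initializer for the type array.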
gcc -fdump-translation-unit -c foo.c -o /dev/null -I`R RHOME`/include

Given this, we have all we need and can call the R code to generate the registration information as
rfile = system.file("examples", "foo.R", package = "RGCCTranslationUnit")
regInfo = getRegistrationInfo(rfile, tu.dir = system.file("examples", package = "RGCCTranslationUnit"))
writeCode(regInfo, "native", dll = "duncan", dynamic = FALSE)
writeCode(regInfo, "r", dll = "duncan")