CodeDepends
CodeDepends_0.2-0.tar.gz (06 June 2009)
The CodeDepends package provides tools for processing
R code (functions and scripts) and
- calculating dependencies between the different expressions,
to facilitate
- caching results and avoiding recomputation
- running code up to a particular expression or variable
- providing a general overview of code,
- providing a brief vocabulary for high-level annotation of code,
- identifying and displaying high-level tasks,
- creating call graphs between sets of functions
- thinking about scripts as higher-level objects
and facilitating thinking about aspects such as alternative
approaches or branches, and generally capturing the thought process of an
analysis/computation along with its code.
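As a rough illustration of what dependency analysis between expressions involves, here is a minimal sketch in base R. It does not use the CodeDepends API; it relies only on parse() and all.vars(), and the example script, the info list, and the needed_for() helper are hypothetical names introduced for this sketch. It handles only simple top-level assignments of the form name <- value.

```r
# A small, made-up script; the second line is unrelated to z.
script <- parse(text = "
x <- rnorm(100)
w <- runif(5)
y <- 2 * x + 1
z <- mean(y)
")

# For each top-level expression, record the variable it defines (output)
# and the variables it reads (inputs). Only plain `name <- value` and
# `name = value` assignments are recognised in this sketch.
info <- lapply(script, function(e) {
  is_assign <- is.call(e) &&
    (identical(e[[1]], as.name("<-")) || identical(e[[1]], as.name("=")))
  if (is_assign && is.name(e[[2]])) {
    list(output = as.character(e[[2]]), inputs = all.vars(e[[3]]))
  } else {
    list(output = character(0), inputs = all.vars(e))
  }
})

# Walk the script backwards and collect the indices of the expressions
# that must be evaluated to obtain `var`.
needed_for <- function(var, info) {
  needed <- integer(0)
  todo <- var
  for (i in rev(seq_along(info))) {
    out <- info[[i]]$output
    if (length(out) && out %in% todo) {
      needed <- c(i, needed)
      todo <- union(setdiff(todo, out), info[[i]]$inputs)
    }
  }
  needed
}

needed_for("z", info)   # 1 3 4: the expression defining w is skipped
```

Knowing that only expressions 1, 3 and 4 are needed for z is what allows code to be run up to a particular variable, and what allows a cached value to be reused when none of its inputs have changed.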
The primary motivation of this package is to provide a central
location for potentially sophisticated dependency analysis between
expressions that can be used for caching of intermediate results. See
the cacher and weaver packages for use with Sweave. We are using
this package in XDynDocs, an XML-based dynamic document system that
works for Docbook and Word.
We also use this to provide a higher-level view of code. The idea
is that somebody viewing an R script would look at a figure
representing the flow of variables or a graph of the relationships
between the high-level tasks and what they are doing (e.g. data input,
data cleaning, exploratory data analysis, modeling, and so on). These
tools attempt to present code in a more intuitive, high-level form
than the detail-oriented statements intended for an
interpreter.
We also expect to use this package to identify potential
- refactoring
- redundancy
- parallelism
We also want to use this to create much richer documents
that capture the entire thought process and activities
during an analysis or simulation.
We want the author to be able to reproduce not only
the final results they present to the reader, but also
the additional activities, such as
- checks that confirmed their approaches
- alternative avenues that they tried
- dead-ends that did not come to fruition
- ideas for other things to pursue
This is the sense of reproducibility that we want to get to: not just
being able to repeat the computations, but also the analysis process. For
this, we need a richer document and richer relationships between code
blocks representing higher-level tasks. We want to be able to say that,
for example, these three code blocks relate to fitting a classifier.
The inputs are the data and the outputs are a classifier function and
residuals, say. If one wanted to try a different statistical method,
one would add a parallel task which would have the same inputs (or a
superset) and produce a classifier function.
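One way to picture such tasks is as simple records of inputs and outputs. The sketch below is only an illustration in base R, not the package's representation: the task lists, their field names, and the is_parallel_alternative() helper are all hypothetical.

```r
# Two made-up tasks; the field names are illustrative only.
fit_a <- list(
  name    = "fit classifier (method A)",
  inputs  = c("training_data"),
  outputs = c("classifier", "residuals")
)
fit_b <- list(
  name    = "fit classifier (method B)",
  inputs  = c("training_data", "tuning_grid"),
  outputs = c("classifier")
)

# TRUE when `alt` could sit in parallel to `task`: it reads at least
# the same inputs (possibly a superset) and produces the required output.
is_parallel_alternative <- function(task, alt, required = "classifier") {
  all(task$inputs %in% alt$inputs) && required %in% alt$outputs
}

is_parallel_alternative(fit_a, fit_b)
```

Here fit_b qualifies as an alternative to fit_a because it reads a superset of the inputs and still produces a classifier; the reverse check fails because fit_a does not read tuning_grid.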
Documentation
- preliminary overview
- R function documentation
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Mon Mar 30 11:56:49 PDT 2009