The RCurl Package
RCurl_1.96-0.tar.gz (20 June 2014)
Manual
The RCurl package is an R interface to the libcurl library, which provides HTTP facilities. It
allows us to download files from Web servers, post forms, use HTTPS
(secure HTTP), use persistent connections, upload files, handle
binary content, follow redirects, use password authentication, etc.
The primary top-level entry points are
However, the C-level routines are also accessible
from R code, and one can specify options for all of the
libcurl operations to control how they are performed.
Documentation about the options and commands
can be found at the libcurl web site.
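As a rough illustration of how libcurl options can be passed from R, here is a minimal sketch that assumes the RCurl package is installed. getURL(), getForm(), curlOptions() and getCurlHandle() are RCurl functions. The helper names (fetch_page, search_form, fetch_many), the form field q and the URLs in the comments are hypothetical placeholders, not part of the package.

```r
library(RCurl)

# Bundle libcurl options once so they can be reused across requests.
opts <- curlOptions(followlocation = TRUE,  # follow HTTP redirects
                    timeout = 10)           # give up after 10 seconds

# Hypothetical helper: download a page as a character string.
fetch_page <- function(u) {
  getURL(u, .opts = opts)
}

# Hypothetical helper: submit a form via GET; "q" is an invented field name.
search_form <- function(u, term) {
  getForm(u, .params = c(q = term), .opts = opts)
}

# Hypothetical helper: reuse one curl handle so libcurl can keep the
# connection open between requests to the same server.
fetch_many <- function(urls) {
  h <- getCurlHandle()
  vapply(urls, function(u) getURL(u, curl = h, .opts = opts), character(1))
}

# Example calls (these need network access; the URLs are placeholders):
# txt <- fetch_page("http://www.example.com/")
# res <- search_form("http://www.example.com/search", "RCurl")
```

Collecting the options in one curlOptions() object is only one possible design; the same options can also be given directly as named arguments in each call.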
R functions can be specified to collect text from both the
response and its headers. This can be used to customize the processing
of the requests and feed the results to higher-level processing
(e.g. HTML parsing via the htmlTreeParse function in the XML package).
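The idea of collecting the response and its headers with R functions might look like the following sketch, again assuming RCurl is installed. curlPerform() and its writefunction and headerfunction options come from RCurl and libcurl. The helpers make_collector and fetch_parts and the URL in the comments are hypothetical and used only for illustration.

```r
library(RCurl)

# Hypothetical helper: a closure that accumulates text chunks.
# libcurl calls update() once per chunk of data; the function returns the
# number of bytes it handled so that the transfer continues.
make_collector <- function() {
  chunks <- character()
  list(
    update = function(txt) {
      chunks <<- c(chunks, txt)
      nchar(txt, type = "bytes")
    },
    value = function() paste(chunks, collapse = "")
  )
}

# Hypothetical helper: fetch a URL, collecting body and headers separately.
fetch_parts <- function(u) {
  body <- make_collector()
  hdr  <- make_collector()
  curlPerform(url = u,
              writefunction  = body$update,
              headerfunction = hdr$update)
  list(body = body$value(), headers = hdr$value())
}

# The collected body could then be handed to an HTML parser
# (needs network access and the XML package; placeholder URL):
# parts <- fetch_parts("http://www.example.com/")
# doc   <- XML::htmlTreeParse(parts$body, asText = TRUE)
```

RCurl also ships ready-made collectors such as basicTextGatherer(), so writing a closure by hand is only needed when the processing has to be customized.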
This package will be used to implement the low-level communication
in the SSOAP package
and other high-level packages that utilize HTTP to exchange
requests and data.
Documentation
- Paper outlining the package with some advanced examples.
- Guide
- Changes across releases
- Examples of using asynchronous, multiple concurrent requests.
- FAQ
Other Approaches
- httpRequest
- httpRequest is a package on CRAN that implements a small
part of HTTP directly in R using sockets.
- httpClient
- I have developed the httpClient package using
R code and connections that supports additional
aspects of R and HTTP, such as cookies, character escaping, and also
SSL for HTTPS. I haven't released the code (favoring the
approach of building on existing C code) but can make it available if anyone
is interested.
While having code in R makes it easier to understand, explore and
modify, it is probably better to use existing specialized libraries
like libcurl rather than doing this ourselves. We gain speed and a
large development community that cares about getting things right and
testing them.
We will explore the use of libwww.
Issues
Using the opaque data structures of the libcurl infrastructure
means that we cannot easily access the file descriptors used
in the communication. This makes it somewhat more difficult
to integrate these streams into an R event loop
(e.g. REventLoop).
We can potentially turn them into regular connections
(if the internal API is made "public").
License
This is distributed under the BSD license
in the same spirit as libcurl itself.
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Mon May 25 11:35:38 PDT 2009