ERROR: configuration failed for package 'RCurl'
locate libcurl
and see
if it returns something like libcurl.so in the lines it
emits. Alternatively, do
locate curl-config
and see if it is present.
If libcurl is not installed, use your binary package manager
to install the curl-dev package. This is different from
the curl package which is the command-line tool
for using curl to download files. We need the "linkable" library.
On Ubuntu, for example, you can use
sudo apt-get install libcurl4-openssl-dev
CURLOPT_HTTPAUTH
not being defined when the C code is being compiled.
What's the problem?
curl.c: In function `RCurlVersionInfoToR':
curl.c:613: structure has no member named `libidn'
curl.c:613: structure has no member named `libidn'
options(RCurlOptions = list(verbose = TRUE, followlocation = TRUE, timeout = 100, useragent = "myApp in R"))
v = getURLContent("http://www.omegahat.org")
With this, we can see that our useragent appears in the HTTP header of the request. We see this because verbose = TRUE shows the headers of both the request and the response.
Basically, we expect others to specify a meaningful value for useragent so that they identify themselves correctly.
Note that users (this is not recommended for programmers) can set the R option named RCurlOptions via R's options() function. The value should be a list of named curl options. These values are merged into each RCurl request along with those specified in the call, which allows one to provide default values.
There are a couple of things that come to mind. Firstly, specify a value for the curl option "useragent" and give it something meaningful to identify your application. Hopefully adding that resolves the problem. It does in this case:
getURLContent("https://mtgox.com/code/data/ticker.php", ssl.verifypeer = FALSE, useragent = "R")
If that doesn't solve the problem, set a value for the "timeout" option.
getURLContent("https://mtgox.com/code/data/ticker.php", timeout = 4, ssl.verifypeer = FALSE)
This will at least stop the request from "hanging" indefinitely: the call returns within that period of time.
http://www.omegahat.org/RCurl
to
http://www.omegahat.org/RCurl/
That works for me.
curl-config --feature
If ssl doesn't appear there, you don't have support for it. You should reinstall curl, having first installed SSL (e.g. openssl).
httpauth = 1L, # "basic"
to the curl options.
But there is a general approach to trying to figure out how
to get R to do the same thing as a browser or curl
or wget.
One approach is to make certain that both R
and curl are giving us as much information as possible.
So make sure both have verbose switched on.
In R, this is the curl option verbose = TRUE
and for curl, it is the command line switch -v.
Then look at the header information both produce and see if
anything is obviously missing or different in the R version.
A different idea is somewhat advanced, but not very. When the browser or curl makes a request, it is sent across the network via your operating system. What we want to do is look at the contents of the HTTP request and specifically the header information. There are two ways to do this. If we are doing this in a browser, we can examine the headers in that browser using browser-specific tools. Alternatively, if we are doing this via a command line utility or if we want to do this generally, we can capture all the network communication using a tool such as tcpdump or wireshark.
If we want to do this via the browser, we can use the LiveHTTPHeaders extension for Firefox or the "special" URL chrome://net-internals for Google Chrome.
For command-line/stand-alone tools, with the appropriate permissions on the computer, we can use a program such as tcpdump or wireshark or ethereal to "sniff" or capture the packets as they go across the network device and then we can look at them. We can do this for the curl or browser and then for R and compare what is being sent. This allows us to see the headers as we can with the verbose options, but it also allows us to see the content of the body of the request. This is only important for POST requests.
We should also note that if you are using HTTPS, the body will be encrypted and you won't be able to make any sense of it. However, if the data in the post are not sensitive, you can send it via HTTP - not HTTPS - and curl and the Web browser will do the same thing and we will be able to see the contents. The server will likely be confused and upset and give an error, but we are trying to determine the problem on the initial client request so that is not a problem. (It is a problem if we are trying to understand why R is not handling the response correctly, but that's a different problem.)
How do we use tcpdump and ethereal? First, start tcpdump just before you run the R or curl command
sudo /usr/sbin/tcpdump -s 1518 -i eth1 -w r_packets.tcp
If you do this and wait too long, you will capture all the background packets that are flying through your network interface that have nothing to do with your problem. This is not a problem, but it makes it harder to find the packets which we want to examine.
So next, go back to R or curl and run the command. When this has completed, kill the tcpdump process, e.g. with Ctrl-C in the terminal in which it is running, or with kill and the relevant process id (see the ps command or the Mac Activity Monitor). Now run ethereal with the name of the file to which tcpdump serialized the packets:
ethereal r_packets.tcp
You will then get a window listing the captured packets. Navigate the list in the top panel to find the HTTP entry (#64 in our example). Click on that and the details of this TCP interaction are displayed in the lower panel. Then you can expand the elements by clicking on the lines that have an arrow on the left, and you'll see the details of the header and the body.
If the connection is via SSL, e.g. HTTPS, things are a little more complicated as the content is encrypted. There are a variety of ways and tools to deal with this. Some are (in no particular order)
verbose = TRUE
option in curlPerform() or some of the other higher-level
functions,
but there is no output appearing on the console.
What's the problem?
curlVersion
.
If scp and sftp are not there, reinstall libcurl but with support
using libssh2.
You will need to have the libssh2 development libraries and headers installed before installing libcurl.
On some OSes, you will need to rebuild RCurl from source.
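As a rough sketch of this check, the following asks curlVersion() which protocols the linked libcurl supports and looks for scp and sftp. The variable names are only for illustration, and the check is skipped (giving NA) if RCurl is not installed.

```r
# Report whether the libcurl that RCurl is linked against supports scp and sftp.
# The names `needed` and `supported` are illustrative only.
needed <- c("scp", "sftp")

if (requireNamespace("RCurl", quietly = TRUE)) {
  info <- RCurl::curlVersion()
  # info$protocols is a character vector of protocol names.
  supported <- needed %in% info$protocols
  names(supported) <- needed
} else {
  # RCurl is not available, so we cannot tell.
  supported <- c(scp = NA, sftp = NA)
}

print(supported)
```

If either entry is FALSE, libcurl was built without libssh2 support and needs to be reinstalled as described above.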
getURL('ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/13e19.txt', verbose = TRUE)
One can see the request just waits for a long period of time and then eventually the contents of the file are displayed in the R session.
The fix for this is to avoid extended passive mode and use the regular passive mode. This is controlled via the ftp.use.epsv option in calls to curlPerform() and we set this to FALSE so that PASV is used rather than EPSV.
getURL('ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/13e19.txt', ftp.use.epsv = FALSE)
url = 'ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/'
filenames = getURL(url, ftp.use.epsv = FALSE, ftplistonly = TRUE)
filenames = paste(url, strsplit(filenames, "\n")[[1]], sep = "")
Now we can download each of these in turn. It is advisable to create a curl handle just once and to set any options before downloading any of the files.
con = getCurlHandle(ftp.use.epsv = FALSE)
sapply(filenames, getURL, curl = con)
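The file-name step can be tried offline. This sketch uses a made-up listing string in place of a real ftplistonly result; the two file names and the CRLF line endings are assumptions, not actual server output. Some FTP servers end lines with \r\n, so it splits on an optional carriage return and drops empty entries.

```r
# Hypothetical listing, standing in for getURL(url, ftplistonly = TRUE).
# The file names and the CRLF line endings are assumptions for illustration.
url <- "ftp://ftp.wcc.nrcs.usda.gov/data/snow/snow_course/table/history/idaho/"
listing <- "13e19.txt\r\n13e20.txt\r\n"

# Split on line feeds, tolerating a preceding carriage return.
files <- strsplit(listing, "\r?\n")[[1]]

# Drop any empty entries left by blank lines.
files <- files[nzchar(files)]

# Build one full URL per file.
filenames <- paste0(url, files)
print(filenames)
```

Splitting on "\n" alone would leave a trailing carriage return on each name if the server uses CRLF, and the resulting URLs would then fail.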
Error in curlPerform(url = url, headerfunction = header$update, curl = curl, : SSL certificate problem, verify that the CA cert is OK. Details: error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
What can I do?
Information about this came from http://ademar.name/blog/2006/04/curl-ssl-certificate-problem-v.html
Basically, the remote server is sending us a certificate to say it is who it says it is. However, we have to trust that certificate. We do this by providing information about a collection of trusted signing authorities, e.g. Verisign, Entrust, Thawte. We can use the certificates for these authorities from the Netscape collection, available via http://curl.netmirror.org/docs/caextract.html, but you can find other collections. We download this file or its equivalent. Next we need to tell libcurl to use that file and where to find it. We do this with the cainfo option.
x = getURLContent("https://www.google.com", cainfo = "/Users/duncan/cacert.pem")
Note that we cannot use a ~ in the file path; we have to expand it ourselves.
x = getURLContent("https://www.google.com", cainfo = path.expand("~/cacert.pem"))
To avoid having to specify the location of the bundle in each call, you can place the file where libcurl looks for it. This is usually /usr/local/share/curl/curl-ca-bundle.crt. If you have write permission for this directory, you can place the file there. (The file must be present before libcurl is configured and compiled.)
On some versions of UNIX, the certificates will also be found in /usr/share/ssl/certs/ca-bundle.crt
If you don't have a certificate from an appropriate signing agent, you can suppress verifying the certificate with the ssl.verifypeer option:
x = getURLContent("https://www.google.com", ssl.verifypeer = FALSE)
This does risk a 'man in the middle' attack.
x = postForm("http://www.fas.usda.gov/psdonline/psdResult.aspx", style = "post", .params = list(visited="1", lstGroup = "all", lstCommodity="2631000", lstAttribute="88", lstCountry="**", lstDate="2011", lstColumn="Year", lstOrder="Commodity%2FAttribute%2FCountry"))
This fails with a simple message from the server that the "Custom Query results" cannot be displayed. Note the lstOrder parameter. It has %2F in two places. This is the "percent encoding" for the character /. postForm does the percent encoding on the inputs and so encodes the % in each %2F again, ending up with "%252F" in place of each "%2F".
So, don't pass percent encoded strings, but rather use human-readable versions, e.g. lstOrder="Commodity/Attribute/Country" and leave postForm to handle the encoding. Alternatively, you can tell postForm not to percent-encode specific parameters by passing them As-Is, e.g.
lstOrder = I("Commodity%2FAttribute%2FCountry")
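The double encoding can be reproduced without contacting any server. This sketch uses base R's URLencode() as a stand-in for the encoding step; it is only an illustration of the effect, not the code that postForm itself runs, and the variable names are made up.

```r
# Base R's URLencode() stands in for the percent-encoding step here.
readable <- "Commodity/Attribute/Country"

# Encoding the human-readable value once turns each "/" into "%2F".
once <- utils::URLencode(readable, reserved = TRUE)

# Encoding the already-encoded value again turns each "%" into "%25".
# repeated = TRUE is needed because URLencode() otherwise leaves
# input that already looks encoded untouched.
twice <- utils::URLencode(once, reserved = TRUE, repeated = TRUE)

print(once)
print(twice)
```

The second value contains %252F, which matches what the server received in the failing request.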
val = charToRaw(textToSend)
curlPerform(url = "http://localhost:9200/a/b/axyz", customrequest = "PUT", readfunction = val, infilesize = length(val), upload = TRUE)
textToSend is the content we want to send. We convert it to raw with charToRaw(). Then we specify the URL, that we want a PUT request, and we specify the raw object as the value for readfunction and also the number of elements in val as the value of infilesize. The last option - upload - is vital. What happens here is that RCurl reads bytes from val as it needs to send them to the server.
A slightly shorter way to do this is
httpPUT("http://localhost:9200/a/b/axyz", readfunction = val, infilesize = length(val), upload = TRUE)
From version 1.91-0, you can simply use
httpPUT(url, textToSend)
and the function will fill in the other curl options.
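One detail worth checking when setting infilesize by hand is that it counts bytes, not characters. This sketch uses a made-up string with two accented letters and assumes the server expects UTF-8. It shows that the length of the raw vector differs from the character count.

```r
# Hypothetical payload containing two non-ASCII characters.
textToSend <- "na\u00efve caf\u00e9"

# Convert to raw bytes, assuming the server expects UTF-8.
val <- charToRaw(enc2utf8(textToSend))

n_chars <- nchar(textToSend)  # number of characters
n_bytes <- length(val)        # number of bytes actually sent

print(c(characters = n_chars, bytes = n_bytes))
```

Using length(val) on the raw vector, as in the calls above, gives the byte count that infilesize needs.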
userpwd
)
libcurl doesn't know which authentication mechanism to use.
So it tries several in order until one is successful.
So it is best to specify the authentication mechanism explicitly
via httpauth
.
Specify one of the values CURLAUTH_