Last Release: 0.93-2 (06 Apr 2011)
This package is a basic interface to the zlib and bzip2 facilities for compressing and uncompressing data that are in memory rather than in files. This is useful when the data we have to work with are never in a file on our local file system but are instead given to us as part of a transaction with a remote server. For example, we might receive a gzipped text file when retrieving a URI via the RCurl package, or a compressed microarray file from a Web service via the SSOAP package. Rather than collecting the data, writing them to disk and then reading them back into R, we can uncompress them directly in memory. This avoids unnecessary I/O and also improves "security", as our scripts do not need to access the file system. (This is currently not that important, as R is not secure in any way. But as we use R more extensively in embedded situations, e.g. in databases, Web servers, spreadsheets, and other languages like Perl and Python, this does become an issue.)
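The idea of uncompressing in memory can be sketched with base R alone. This is only an illustration under stated assumptions: it uses base R's `memCompress()` and `memDecompress()` rather than this package's own functions, which are not listed here, and the `payload` vector is built locally as a stand-in for bytes that would normally arrive from a remote server.

```r
# Stand-in for a compressed payload received from a remote server.
# (Assumption: in practice these raw bytes would come from a download.)
original <- c("id,value", "1,3.14", "2,2.72")
payload  <- memCompress(charToRaw(paste(original, collapse = "\n")),
                        type = "gzip")

# Uncompress directly in memory; no temporary file is written or read.
text  <- memDecompress(payload, type = "gzip", asChar = TRUE)
lines <- strsplit(text, "\n", fixed = TRUE)[[1]]

# The same round trip using bzip2 instead of zlib.
bz   <- memCompress(charToRaw(text), type = "bzip2")
back <- rawToChar(memDecompress(bz, type = "bzip2"))
```

The uncompressed text can then be passed straight to a parser, for example `read.csv(text = text)`, without touching the file system.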
The current interface is more complete than earlier versions. It provides access to
At present, one must have the entire data vector in memory before the call, and the tools operate on it directly. It would be entirely feasible to generalize this and have the tools ask for more data as the decompression libraries need it. We could do the same thing with the output. In this way, the tools could work with the existing connections mechanism in R at the R level. Unfortunately, the connections API at the C level is not public and is not amenable to extensions implemented in R packages, i.e. outside the R source code.