An XML package for the S language
Last Release:
3.98-1 (Sun Oct 4 17:00:33 PDT 2015)
The latest version (3.99-0) introduces the ability to define XPath functions
for use in the getNodeSet() and xpathApply() R functions.
One can use R functions and C routines to implement new XPath functions.
Additionally, several XPath 2.0 functions are implemented by default.
Some people have encountered memory leaks with this package.
As far as I am aware, these occur only on Windows. I think this is because the binary
versions of the package were built with a compiler flag missing.
This package provides facilities for the S language
to
- parse XML files, URLs and strings,
using either the DOM (Document Object Model)/tree-based
approach, or the event-driven SAX (Simple API for XML)
mechanism;
- parse HTML documents;
- perform XPath queries on a document,
- generate XML content to buffers, files, URLs,
and internal XML trees;
- read DTDs as S objects.
It is an interface to the libxml2 library.
It can be combined with the RCurl package
for parsing documents that require more involved HTTP requests
to fetch the document.
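As a rough illustration of the facilities listed above, the following sketch parses a string with the DOM approach, runs two XPath queries, and generates a small internal XML tree. The document content and the variable names (xml_text, titles, ids, root, book, out) are made up for this example; only the XML package functions are real.

```r
library(XML)

# A small, made-up document supplied as a string rather than a file or URL.
xml_text <- paste0(
  '<catalog>',
  '<book id="b1"><title>First</title></book>',
  '<book id="b2"><title>Second</title></book>',
  '</catalog>'
)

# DOM/tree-based parse into an internal (libxml2) document.
doc <- xmlParse(xml_text, asText = TRUE)

# XPath queries: the text of every <title>, and the id attribute of every <book>.
titles <- xpathSApply(doc, "//book/title", xmlValue)
ids <- xpathSApply(doc, "//book", xmlGetAttr, "id")

# Generate new XML content as an internal tree, then serialize it to a string.
root <- newXMLNode("catalog")
book <- newXMLNode("book", attrs = c(id = "b3"), parent = root)
newXMLNode("title", "Third", parent = book)
out <- saveXML(root)
cat(out, "\n")
```

For documents that need more involved HTTP requests, the same xmlParse() call with asText = TRUE can be given text fetched separately, for instance with RCurl.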
Download
The source for the S package can
be downloaded as XML_3.98-1.tar.gz.
There is also a Windows version available
from the Omegahat repository.
Use
install.packages("XML", repos = "http://www.omegahat.org/R")
Documentation
- Best practices for using the XML package
- PDF version.
- A short overview: HTML, PDF
- A brief introduction to parsing XML in R: HTML, PDF
- A reasonably detailed overview of the package and what we might use XML for.
- A manual in
and a quick guide to the package (PDF).
- A short overview of the package.
- Brief and incomplete Notes on generating XML within S
- FAQ for the package.
- Changes to the package (by release).
Examples of Reading Generic XML files
- XML form of plist (property list) files (e.g. property lists on
OS X, old iTunes databases)
- keyValueDB.R
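library(XML)
# Load the readKeyValueDB() function from keyValueDB.R.
source(url("http://www.omegahat.org/RSXML/keyValueDB.R"))
# Read the example property list into an S object.
o = readKeyValueDB("http://www.omegahat.org/RSXML/plist.xml")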
- XML "solr" files that are similar to JSON and name-value pairs
with nodes of the form
<lst name="info">
<str name="ABC">A string</str>
<int name="xyz">103</int>
<long name="big">1000012310303</long>
<bool>true</bool>
<date name="lastModified">2011-02-10T11:29:03Z</date>
</lst>
- solrDocs.R
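library(XML)
# Load the readSolrDoc() function from solrDocs.R.
source(url("http://www.omegahat.org/RSXML/solrDocs.R"))
# Read the example solr document into an S object.
o = readSolrDoc("http://www.omegahat.org/RSXML/solr.xml")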
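To give a feel for what reading a solr-style document involves, here is a minimal sketch that converts the fragment shown above into a named R list, choosing the R type from each node's tag. It is not the code in solrDocs.R, and it handles only a single flat lst node; the helpers solr_value() and solr_to_list() are hypothetical names introduced for this example.

```r
library(XML)

# The solr fragment from the text, supplied as a string.
solr_text <- '<lst name="info">
<str name="ABC">A string</str>
<int name="xyz">103</int>
<long name="big">1000012310303</long>
<bool>true</bool>
<date name="lastModified">2011-02-10T11:29:03Z</date>
</lst>'

# Hypothetical helper: convert one leaf node to an R value based on its tag.
solr_value <- function(node) {
  txt <- xmlValue(node)
  switch(xmlName(node),
         int  = as.integer(txt),
         long = as.numeric(txt),   # too large for an R integer, so use a double
         bool = identical(txt, "true"),
         date = as.POSIXct(txt, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
         txt)                      # str and anything unrecognized stay as text
}

# Hypothetical helper: turn the element children of a <lst> node into a list,
# named by each child's "name" attribute ("" when the attribute is absent).
solr_to_list <- function(lst_node) {
  kids <- getNodeSet(lst_node, "./*")
  values <- lapply(kids, solr_value)
  names(values) <- vapply(kids,
                          function(n) xmlGetAttr(n, "name", default = ""),
                          character(1), USE.NAMES = FALSE)
  values
}

doc <- xmlParse(solr_text, asText = TRUE)
info <- solr_to_list(xmlRoot(doc))
str(info)
```

Dispatching on the tag name keeps the type conversion in one place; nested lst or arr nodes would need a recursive version of solr_to_list().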
Duncan Temple Lang
<duncan@wald.ucdavis.edu>
Last modified: Sun Dec 25 09:52:10 PST 2011