
Statistical Analysis: an Introduction using R/R/Packages

From Wikibooks, open books for an open world
The strength and depth of R come from the various functions and other objects which are provided for your use. These are actually provided by a variety of separate packages. For example, Figure 1.1 is based on data from the "datasets" package. To use the contents of a package, you must first make the package available to R, then load it into your R session.

Some packages should always be available within R, and a number of these are automatically loaded at the start of an R session. These include the "base" package (which is where the max() and sqrt() functions are defined), the "utils" package (which is where RSiteSearch() and citation() are defined), the "graphics" package (which allows plots to be generated), and the "stats" package (which provides a broad range of statistical functionality). In total, the default packages allow you to do a considerable amount of statistics.
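A quick way to check this in your own session is sketched below. It uses only search(), find() and getOption(), all of which ship with R; the exact lists printed will depend on your installation and on anything you have loaded since start-up.

```r
# Packages attached to the current session, in search order
attached <- search()
print(attached)

# find() (from the "utils" package) reports which attached package defines a name
where_max      <- find("max")
where_citation <- find("citation")
print(where_max)
print(where_citation)

# Packages that R attaches by default at start-up
print(getOption("defaultPackages"))
```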

However, one of the strengths of R is the variety of additional packages that are available. There are packages, for example, which allow you to analyse genetic data, to interface with geographical information systems, to carry out economic analysis, and so forth. To make a package available to R, you need to download it and install it somewhere on your system. There is a central repository (called "CRAN") from which you can download most additional packages. Once you have installed a package, you can load it into R at any time by using the library() function.
Input:
library("datasets")  #Load the already installed "datasets" package
cars                #Having loaded "datasets", the "cars" object (containing a set of data) is now available
library("vioplot")    #Try loading the "vioplot" package: will probably fail as it is not installed by default
install.packages("vioplot") #This is one way of installing the package. There are other ways too.
library("vioplot")    #This should now work
example("vioplot")    #produces some pretty graphics. Don't worry about what they mean for the time being
Result:
> ## N.B. the "datasets" package is installed by default and provides useful example data
> library(datasets) #Load the datasets package (actually, it has probably been loaded already)
> cars #Display one of the datasets: see ?cars for more information
   speed dist
1      4    2
2      4   10
3      7    4
4      7   22
5      8   16
6      9   10
7     10   18
8     10   26
9     10   34
10    11   17
11    11   28
12    12   14
13    12   20
14    12   24
15    12   28
16    13   26
17    13   34
18    13   34
19    13   46
20    14   26
21    14   36
22    14   60
23    14   80
24    15   20
25    15   26
26    15   54
27    16   32
28    16   40
29    17   32
30    17   40
31    17   50
32    18   42
33    18   56
34    18   76
35    18   84
36    19   36
37    19   46
38    19   68
39    20   32
40    20   48
41    20   52
42    20   56
43    20   64
44    22   66
45    23   54
46    24   70
47    24   92
48    24   93
49    24  120
50    25   85
> library(vioplot) #Try loading the "vioplot" package: this will probably fail as it is not installed by default
Error in library(vioplot) : there is no package called 'vioplot'
> install.packages("vioplot") #This is one way of installing the package. There are other ways too.
also installing the dependency ‘sm’

trying URL 'http://cran.uk.r-project.org/bin/macosx/universal/contrib/2.8/sm_2.2-3.tgz'
Content type 'application/x-gzip' length 306188 bytes (299 Kb)
opened URL
=======================
downloaded 299 Kb

trying URL 'http://cran.uk.r-project.org/bin/macosx/universal/contrib/2.8/vioplot_0.2.tgz'
Content type 'application/x-gzip' length 9677 bytes
opened URL
=======================
downloaded 9677 bytes

The downloaded packages are in /tmp/RtmpR28hpQ/downloaded_packages
> library(vioplot) #This should now work
Loading required package: sm
Package `sm', version 2.2-3; Copyright (C) 1997, 2000, 2005, 2007 A.W.Bowman & A.Azzalini
type help(sm) for summary information
> example(vioplot) #produces some pretty graphics. Don't worry about what they mean for the time being

vioplt> # box- vs violin-plot
vioplt> par(mfrow=c(2,1))

vioplt> mu<-2

vioplt> si<-0.6

vioplt> bimodal<-c(rnorm(1000,-mu,si),rnorm(1000,mu,si))

vioplt> uniform<-runif(2000,-4,4)

vioplt> normal<-rnorm(2000,0,3)

vioplt> vioplot(bimodal,uniform,normal)
Hit <Return> to see next plot:

vioplt> boxplot(bimodal,uniform,normal)

vioplt> # add to an existing plot
vioplt> x <- rnorm(100)

vioplt> y <- rnorm(100)

vioplt> plot(x, y, xlim=c(-5,5), ylim=c(-5,5))
Hit <Return> to see next plot:

vioplt> vioplot(x, col="tomato", horizontal=TRUE, at=-4, add=TRUE,lty=2, rectCol="gray")

vioplt> vioplot(y, col="cyan", horizontal=FALSE, at=-4, add=TRUE,lty=2)
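
The session above is meant to be typed interactively: in a script, the first failing library() call would stop execution. A minimal sketch of a script-friendly alternative follows. The helper name load_or_install and the choice of mirror URL are assumptions made for illustration; neither is part of R itself.

```r
# Hypothetical helper (not part of R): attach a package,
# installing it from CRAN first if it is not yet installed.
load_or_install <- function(pkg, repos = "https://cloud.r-project.org") {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    # Only reached when the package is missing; needs network access
    install.packages(pkg, repos = repos)
  }
  # character.only = TRUE lets library() take the name from a variable
  library(pkg, character.only = TRUE)
  # Report (invisibly) whether the package is now attached
  invisible(paste0("package:", pkg) %in% search())
}

# "datasets" is already installed, so nothing is downloaded here
ok <- load_or_install("datasets")
```

Calling load_or_install("vioplot") in the same way would install the package only if it is missing, and then attach it.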

Note that some packages require other packages to be installed in order to work properly (one package is said to "depend" on another). For example, "vioplot" requires the "sm" package. If the packages that it depends on are not installed, then you will not be able to load the original package within R. Installing a package in the manner above, by calling install.packages(), should also install its dependencies[1].
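
To see what an installed package declares as its requirements, you can read its DESCRIPTION file from within R. The sketch below is one possible approach: the helper name declared_deps is hypothetical, and it looks only at the "Depends" and "Imports" fields, ignoring version constraints.

```r
# Hypothetical helper: list the packages named in the Depends and
# Imports fields of an installed package's DESCRIPTION file.
declared_deps <- function(pkg) {
  desc <- suppressWarnings(utils::packageDescription(pkg))
  if (!is.list(desc)) return(NA_character_)      # package is not installed
  fields <- unlist(desc[c("Depends", "Imports")], use.names = FALSE)
  if (is.null(fields)) return(character(0))      # neither field is present
  deps <- trimws(unlist(strsplit(fields, ",")))  # one entry per package
  deps <- sub("\\s*\\(.*\\)$", "", deps)         # drop "(>= x.y)" constraints
  setdiff(deps[nzchar(deps)], "R")               # "R" itself is not a package
}

stats_deps <- declared_deps("stats")
print(stats_deps)

# Only meaningful once "vioplot" has been installed
if (requireNamespace("vioplot", quietly = TRUE)) {
  print(declared_deps("vioplot"))
}
```

When installing, the dependencies argument of install.packages() controls which of these are fetched as well; see ?install.packages.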

There are several other ways of installing packages. If you start R by typing "R" on a Unix command-line, then you can install packages by running "R CMD INSTALL packagename" from the command-line instead (see ?INSTALL). If you are running R using a graphical user interface (e.g. under Macintosh or Windows), then you can often install packages by using on-screen menus. Note that these methods may not install the other packages on which the one you are installing depends.
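
Whichever method you use, the package ends up in one of the directories (called "libraries") that R searches. A short sketch for checking this from within R is given below; the helper name is_installed and the archive path in the final comment are placeholders, not real names.

```r
# Directories ("libraries") in which R looks for installed packages
lib_dirs <- .libPaths()
print(lib_dirs)

# Hypothetical helper: is a package installed in any of those libraries?
is_installed <- function(pkg) pkg %in% rownames(installed.packages())
print(is_installed("stats"))

# install.packages() can also install from a local source archive rather
# than from CRAN. The path is a placeholder, so the call is commented out:
# install.packages("path/to/somepackage.tar.gz", repos = NULL, type = "source")
```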


Notes

  1. Actually, the details are slightly more complex and depend on whether there is a default location in which to install the packages; see ?install.packages