Statistical Analysis: an Introduction using R/R/A simple R session
Even though R has not been fully introduced yet, it is instructive to see how simple a useful R session can be. As an example, we will fit a statistical model using the cars data from the previous topic, and see how to produce a similar plot to Figure 1.2b, with a straight best-fit line. This is a common task in many simple analyses.
Some of the commands in this example will be unfamiliar. Don't worry: the main point is not to understand the commands, but to get an overall sense of how R works. Nevertheless, if you do want to understand the commands fully, you will need to know about data frames (essentially, tables of data with named columns) and model formulae (essentially, a notation of the form a ~ b + c, meaning a is predicted by b and c).
Input:
plot(dist ~ speed, data=cars) #A common way of creating a specific plot is via a model formula
straight.line.model <- lm(dist~speed, data=cars) #This creates and stores a model ("lm" means "Linear Model").
abline(straight.line.model, col="red") #"abline" will also plot a straight line from a model
straight.line.model #Print the fitted model (estimated intercept & slope of the line)
Result:
> plot(dist ~ speed, data=cars) #A common way of creating a specific plot is via a model formula
> straight.line.model <- lm(dist~speed, data=cars) #This creates and stores a model ("lm" means "Linear Model").
> abline(straight.line.model, col="red") #"abline" will also plot a straight line from a model
> straight.line.model #Print the fitted model (estimated intercept & slope of the line)
Call:
lm(formula = dist ~ speed, data = cars)
Coefficients:
(Intercept)        speed
    -17.579        3.932
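To make the printed coefficients more concrete, here is a minimal sketch of how the stored model might be queried directly. It uses the standard functions coef() and predict(); the speed of 10 mph is an arbitrary value chosen for illustration and does not come from the text.

```r
# Refit the model from the session above (the cars data ships with base R)
straight.line.model <- lm(dist ~ speed, data = cars)

# Extract the estimated intercept and slope as a named numeric vector
estimates <- coef(straight.line.model)
print(round(estimates, 3))

# Predicted stopping distance at an arbitrary, illustrative speed of 10 mph
new.speed <- data.frame(speed = 10)
predicted <- predict(straight.line.model, newdata = new.speed)

# The same value computed by hand from the line: intercept + slope * speed
by.hand <- estimates[["(Intercept)"]] + estimates[["speed"]] * 10
print(c(predict = unname(predicted), by.hand = by.hand))
```

The two printed values should agree, since predict() on a straight-line model simply evaluates the fitted line at the given speed.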
Note that, unlike the examples in the graphics topic, we have plotted the data by specifying a model formula, rather than just giving the name of the dataset. Although in this case the resulting plot is the same as seen with plot(cars), the formula interface makes it clearer what is being plotted.
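For readers who want a quick look at the two ideas mentioned earlier, data frames and model formulae, the following sketch inspects both using base R functions only. The names stopping.formula and fit are hypothetical, introduced purely for illustration.

```r
# cars is a data frame: a table of data with named columns
is.data.frame(cars)   # TRUE
names(cars)           # "speed" "dist"
head(cars, 3)         # show the first three rows

# A model formula is an ordinary R object that can be stored and inspected
stopping.formula <- dist ~ speed   # hypothetical name, for illustration only
class(stopping.formula)            # "formula"
all.vars(stopping.formula)         # "dist" "speed"

# A stored formula can be passed to lm() in place of typing it out again
fit <- lm(stopping.formula, data = cars)
coef(fit)
```

Because the formula names its variables explicitly, it is clear from the code alone that dist is being modelled in terms of speed.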