Statistical Analysis: an Introduction using R/R/Functions

From Wikibooks, open books for an open world
Apart from numbers, perhaps the most useful named objects in R are functions. Nearly everything useful that you will do in R is carried out using a function, and many functions are available in R by default. You can use (or "call") a function by typing its name followed by a pair of round brackets. For instance, the start-up text mentions the following function, which you might find useful if you want to reference R in published work:
Input:
citation()
Result:
> citation()

To cite R in publications use:

  R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

A BibTeX entry for LaTeX users is

  @Manual{,
    url = {http://www.R-project.org},
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2008},
    note = {{ISBN} 3-900051-07-0},
  }

We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also ‘citation("pkgname")’ for citing R packages.
Many R functions can produce results which differ depending on arguments that you provide to them. Arguments are placed inside the round brackets, separated by commas. Many functions have one or more optional arguments: that is, you can choose whether or not to provide them. An example of this is the citation() function. It can take an optional argument giving the name of an R add-on package. If you do not provide an optional argument, there is usually an assumed default value (in the case of citation(), this default value is "base", i.e. provide the citation reference for the base package: the package which provides most of the foundations of the R language).
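As a small sketch of an optional argument with a default value, consider the built-in round() function. It is not mentioned elsewhere on this page and is used here only for illustration. Its second argument, digits, is optional and defaults to 0; the variable names are likewise only illustrative.

```r
# round() is used purely as an illustration of an optional argument.
# Its second argument, "digits", is optional and has a default value of 0.
rounded_default <- round(3.14159)      # "digits" not supplied, so the default (0) is used
rounded_two     <- round(3.14159, 2)   # "digits" supplied: keep 2 decimal places
rounded_default                        # prints [1] 3
rounded_two                            # prints [1] 3.14
```

Leaving out the optional argument is therefore the same as asking for the default behaviour.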

Most arguments to a function are named. For example, the first argument of the citation function is named package. To provide extra clarity, when using a function you can provide arguments in the longer form name=value. Thus

citation("base")

does the same as

citation(package="base")
If a function can take more than one argument, using the long form also allows you to change the order of arguments, as shown in the example code below.
Input:
citation("base")      #Does the same as citation(), because the default for the first argument is "base"
                      #Note: quotation marks are needed in this particular case (see discussion below)
citation("datasets")  #Find the citation for another package (in this case, the result is very similar)
sqrt(25)              #A different function: "sqrt" takes a single argument, returning its square root.
sqrt(25-9)            #An argument can contain arithmetic and so forth
sqrt(25-9)+100        #The result of a function can be used as part of a further analysis
max(-10, 0.2, 4.5)    #This function returns the maximum value of all its arguments
sqrt(2 * max(-10, 0.2, 4.5))             #You can use results of functions as arguments to other functions
x <- sqrt(2 * max(-10, 0.2, 4.5)) + 100  #... and you can store the results of any of these calculations
x
log(100)              #This function returns the logarithm of its first argument
log(2.718282)         #By default this is the natural logarithm (base "e")
log(100, base=10)     #But you can change the base of the logarithm using the "base" argument
log(100, 10)          #This does the same, because "base" is the second argument of the log function
log(base=10, 100)     #To have the base as the first argument, you have to use the form name=value
Result:
> citation("base")      #Does the same as citation(), because the default for the first argument is "base"

To cite R in publications use:

  R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2008},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
  }

We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also ‘citation("pkgname")’ for citing R packages.

>                       #Note: quotation marks are needed in this particular case (see discussion below)
> citation("datasets")  #Find the citation for another package (in this case, the result is very similar)

The 'datasets' package is part of R. To cite R in publications use:

  R Development Core Team (2008). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2008},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
  }

We have invested a lot of time and effort in creating R, please cite it when using it for data analysis. See also ‘citation("pkgname")’ for citing R packages.

> sqrt(25)              #A different function: "sqrt" takes a single argument, returning its square root.
[1] 5
> sqrt(25-9)            #An argument can contain arithmetic and so forth
[1] 4
> sqrt(25-9)+100        #The result of a function can be used as part of a further analysis
[1] 104
> max(-10, 0.2, 4.5)    #This function returns the maximum value of all its arguments
[1] 4.5
> sqrt(2 * max(-10, 0.2, 4.5))             #You can use results of functions as arguments to other functions
[1] 3
> x <- sqrt(2 * max(-10, 0.2, 4.5)) + 100  #... and you can store the results of any of these calculations
> x
[1] 103
> log(100)              #This function returns the logarithm of its first argument
[1] 4.60517
> log(2.718282)         #By default this is the natural logarithm (base "e")
[1] 1
> log(100, base=10)     #But you can change the base of the logarithm using the "base" argument
[1] 2
> log(100, 10)          #This does the same, because "base" is the second argument of the log function
[1] 2
> log(base=10, 100)     #To have the base as the first argument, you have to use the form name=value
[1] 2
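The equivalence of the three log() calls can also be checked directly, and it is instructive to see what happens when unnamed arguments are swapped by mistake. The following is a minimal sketch; the variable names are only illustrative, and identical() is a built-in function not otherwise used on this page.

```r
# Three ways of asking for the base-10 logarithm of 100
by_name     <- log(100, base = 10)   # second argument given by name
by_position <- log(100, 10)          # second argument given by position
reordered   <- log(base = 10, 100)   # named argument first; 100 is still the number to log

identical(by_name, by_position)      # prints [1] TRUE
identical(by_name, reordered)        # prints [1] TRUE

# Swapping two unnamed arguments changes the meaning of the call:
swapped <- log(10, 100)              # the logarithm of 10 in base 100
swapped                              # prints [1] 0.5
```

Naming arguments is therefore a useful safeguard whenever you are unsure of the order in which a function expects them.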
Note that ordinary text (such as the name of a package) needs to be surrounded by quotation marks[1], to avoid confusion with the names of objects. In other words, in R
citation

refers to a function, whereas

"citation"

is a "string" of text. Strings are useful, for example, when providing titles for plots.
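The difference between a name and a string can be seen by asking R what kind of object each one is. This is a short sketch using the built-in functions class() and nchar(), neither of which is discussed above; the variable names are only illustrative.

```r
# Without quotation marks, citation is the name of an object (here, a function)
name_class   <- class(citation)      # "function"
# With quotation marks, "citation" is simply a piece of text
string_class <- class("citation")    # "character"
name_class
string_class

string_length <- nchar("citation")   # the number of characters in the string: 8
string_length

# Single and double quotes produce the same string
same_string <- 'citation' == "citation"
same_string                          # prints [1] TRUE
```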

You will probably find that one of the trickiest aspects of getting to know R is knowing which function to use in a particular situation. Fortunately, R not only provides documentation for all its functions, but also ways of searching through the documentation, as well as other ways of getting help.
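As a brief, non-exhaustive sketch of what such searching can look like, the built-in functions apropos() and args() list matching object names and show a function's arguments. The search term and variable name are chosen only for illustration.

```r
# List the names of available objects whose name contains "sqrt"
matches <- apropos("sqrt")
matches

# Show the argument names (and any default values) of the log function
args(log)
```

In an interactive session, typing help("sqrt") or ?sqrt displays the documentation page for a function, and help.search("logarithm") searches the installed documentation for a phrase.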


Notes

  1. You can use either single (') or double (") quotes to delimit text strings, as long as the start and end quotes match.