Biomedical Engineering Theory And Practice/R Language

From Wikibooks, open books for an open world

Data Types

R has various objects for holding data, including scalars, vectors, matrices, arrays, data frames, and lists.

Scalar and Constant

In R, a "scalar" is simply a vector of length one; there is no separate scalar type. A constant is a literal value, such as 2 or 3, that never changes. You can think of a constant as a zero-dimensional value (a single point).

  • Scalar
> x<-3
> y<-6
> z<-x+y
> z
[1] 9
  • Constant
> 2+3
[1] 5
> 5-4
[1] 1
> 6*4
[1] 24
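
Because R stores a single number as a vector of length one, a scalar can be inspected and extended like any other vector. A minimal sketch, with variable names chosen only for illustration:

```r
# A "scalar" in R is a vector of length one
x <- 3
is.vector(x)   # TRUE
length(x)      # 1
x[1]           # indexing works as for any vector

# Combining it with another value gives a longer vector
v <- c(x, 6)
length(v)      # 2
```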

Vector

Vectors are one-dimensional arrays that can hold numeric data, character data, or logical data. The combine function c() is used to form a vector, and square brackets select elements by position. Here are examples of each type of vector:

> a<-c(1,2,5,-3,-6,5) # numeric vector
> b<-c("one","two","three") #character vector
> d<-c(TRUE,FALSE,TRUE,FALSE,TRUE,TRUE) #logical vector
> a[c(2,4)]
[1]  2 -3
> a[4]
[1] -3
> a[2:4]
[1]  2  5 -3
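
Besides positive positions, a vector can also be indexed with a logical test or with negative positions. The following sketch reuses the vector a from the example; the other names are illustrative assumptions:

```r
a <- c(1, 2, 5, -3, -6, 5)

positive <- a[a > 0]        # logical index: keep elements where the test is TRUE
without_first <- a[-1]      # negative index: drop the first element

# A vector holds one mode only, so mixed input is coerced to a common type
a_mixed <- c(1, "two", TRUE)
class(a_mixed)
```

Here every element of a_mixed becomes a character string, which is why numeric and character data are normally kept in separate vectors.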

Matrix

A matrix is a two-dimensional array in which every element has the same mode (numeric, character, or logical). Matrices are created with the matrix() function. With byrow=TRUE the matrix is filled row by row; the default (byrow=FALSE) fills it column by column. The general format is as follows:

> mymatrix <- matrix(vector, nrow=number_of_rows, ncol=number_of_columns, byrow=logical_value, dimnames=list(vector_of_rownames, vector_of_colnames))
> A<-matrix(1:9,nrow=3)
> A
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
> A[2,1]
[1] 2
> A<-matrix(1:9,nrow=3,byrow=TRUE)
> A
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
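
The general format also accepts a dimnames argument, which the examples above do not use. A small sketch, assuming illustrative row and column names:

```r
# dimnames takes a list: row names first, then column names
B <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

B["r2", "c3"]   # look up an element by name
dim(B)          # number of rows and columns
tB <- t(B)      # transpose: a 3 x 2 matrix
```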

Array

Arrays are similar to matrices but can have more than two dimensions. The general format is as follows:

> myarray <- array(vector, dimensions, dimnames)
> dim1 <- c("A1","A2","A3")
> dim2 <- c("B1","B2","B3","B4")
> dim3 <- c("C1","C2")
> x<-array(1:24,c(3,4,2),dimnames=list(dim1,dim2,dim3))
> x
, , C1

   B1 B2 B3 B4
A1  1  4  7 10
A2  2  5  8 11
A3  3  6  9 12

, , C2

   B1 B2 B3 B4
A1 13 16 19 22
A2 14 17 20 23
A3 15 18 21 24
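
Elements of an array are addressed with one index per dimension, either by position or by name. A brief sketch that rebuilds the array from the example (layer2 is an illustrative name):

```r
dim1 <- c("A1", "A2", "A3")
dim2 <- c("B1", "B2", "B3", "B4")
dim3 <- c("C1", "C2")
x <- array(1:24, c(3, 4, 2), dimnames = list(dim1, dim2, dim3))

x[2, 3, 1]              # by position
x["A2", "B3", "C1"]     # the same element by name
layer2 <- x[, , "C2"]   # a whole slice, returned as a 3 x 4 matrix
dim(layer2)
```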

Data Frame

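A data frame is more general than a matrix: its columns can hold different modes of data (numeric, character, and so on). The general format is as follows:

> mydata <- data.frame(col1, col2, col3, ...)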
> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> patientDATA<-data.frame(patientID,age,diabetes,stats,row.names=letters[1:4])
> patientDATA
  patientID age diabetes     stats
a         A  24    Type1      Poor
b         B  35    Type2  Improved
c         C  28    Type1      Poor
d         D  52    Type2 Excellent
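
Once a data frame exists, columns are usually reached with $ and rows with a condition inside square brackets. A minimal sketch based on the patient data above (older is an illustrative name):

```r
patientID <- LETTERS[1:4]
age <- c(24, 35, 28, 52)
diabetes <- c("Type1", "Type2", "Type1", "Type2")
stats <- c("Poor", "Improved", "Poor", "Excellent")
patientDATA <- data.frame(patientID, age, diabetes, stats, row.names = letters[1:4])

patientDATA$age                               # one column as a vector
patientDATA["b", "stats"]                     # one cell by row and column name
older <- patientDATA[patientDATA$age > 30, ]  # rows that satisfy a condition
nrow(older)
```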

Factors

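A factor stores a categorical variable as integer codes plus a set of levels. With ordered=TRUE the levels are treated as ranked; unless a levels argument is given, they are sorted alphabetically.

> patientID<-LETTERS[1:4]
> age<-c(24,35,28,52)
> diabetes<-c("Type1","Type2","Type1","Type2")
> stats<-c("Poor","Improved","Poor","Excellent")
> status <- factor(stats, ordered=TRUE)
> patientdata <- data.frame(patientID, age, diabetes, status, stringsAsFactors=TRUE)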
> str(patientdata)
'data.frame':	4 obs. of  4 variables:
 $ patientID: Factor w/ 4 levels "A","B","C","D": 1 2 3 4
 $ age      : num  24 35 28 52
 $ diabetes : Factor w/ 2 levels "Type1","Type2": 1 2 1 2
 $ status   : Ord.factor w/ 3 levels "Excellent"<"Improved"<..: 3 2 3 1
> summary(patientdata)
 patientID      age         diabetes       status 
 A:1       Min.   :24.00   Type1:2   Excellent:1  
 B:1       1st Qu.:27.00   Type2:2   Improved :1  
 C:1       Median :31.50             Poor     :2  
 D:1       Mean   :34.75                          
           3rd Qu.:39.25                          
           Max.   :52.00
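
In the output above the levels are ordered alphabetically, so "Excellent" ranks lowest. If a clinically meaningful order is wanted, the levels argument of factor() can state it explicitly. A sketch, assuming the order Poor < Improved < Excellent is the intended one:

```r
stats <- c("Poor", "Improved", "Poor", "Excellent")

# 'levels' fixes the ranking instead of relying on alphabetical order
status <- factor(stats, levels = c("Poor", "Improved", "Excellent"), ordered = TRUE)

levels(status)
codes <- as.integer(status)          # underlying integer codes
below_top <- status < "Excellent"    # comparisons follow the level order
```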

Lists

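A list is an ordered collection of objects (components), which may be of different types. The general format is as follows:

> mylist <- list(name1=object1, name2=object2, ...)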
> x<-"TheList"
> y<-c(25,19,20)
> z<-matrix(1:10,nrow=2,byrow=TRUE)
> theta<-LETTERS[1:10]
> delta<-c(2+3i,4-6i)
> mylist<-list(title=x,components=y,z,theta,delta)
> mylist
$title
[1] "TheList"

$components
[1] 25 19 20

[[3]]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10

[[4]]
 [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

[[5]]
[1] 2+3i 4-6i

> mylist[[2]]
[1] 25 19 20
> mylist[["components"]]
[1] 25 19 20

> lapply(mylist,length)
$title
[1] 1

$components
[1] 3

[[3]]
[1] 10

[[4]]
[1] 10

[[5]]
[1] 2

> lapply(mylist,class)
$title
[1] "character"

$components
[1] "numeric"

[[3]]
[1] "matrix" "array" 

[[4]]
[1] "character"

[[5]]
[1] "complex"

> lapply(mylist,mean)
$title
[1] NA

$components
[1] 21.33333

[[3]]
[1] 5.5

[[4]]
[1] NA

[[5]]
[1] 3-1.5i

Warning messages:
1: In mean.default(X[[1L]], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(X[[4L]], ...) :
  argument is not numeric or logical: returning NA
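
The warnings arise because mean() is applied to character components. One way to avoid them, sketched here under the assumption that only the numeric components are of interest, is to filter the list first; numeric_parts and means are illustrative names:

```r
mylist <- list(title = "TheList", components = c(25, 19, 20),
               matrix(1:10, nrow = 2, byrow = TRUE), LETTERS[1:10], c(2+3i, 4-6i))

# Keep only the numeric components, then average each one
numeric_parts <- Filter(is.numeric, mylist)
means <- sapply(numeric_parts, mean)   # sapply simplifies the result to a vector
means
```

Note that is.numeric() is FALSE for complex values, so the fifth component is dropped along with the character ones.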

Basic Functions

Arithmetic Operators

The table below lists the arithmetic operators available in R, with an example of each.

Function               R command   Example     Result
Exponentiation         a^b         > 3^2       [1] 9
Multiplication         a*b         > 22*5      [1] 110
Division               a/b         > 30/3      [1] 10
Addition               a+b         > 10+9      [1] 19
Subtraction            a-b         > 10-3      [1] 7
Integer quotient       a%/%b       > 20%/%3    [1] 6
Modulo (remainder)     a%%b        > 20%%3     [1] 2
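
The integer quotient and the remainder fit together, and operator precedence matters when operators are combined. A short sketch with illustrative variable names:

```r
a <- 20
b <- 3
q <- a %/% b            # integer quotient
r <- a %% b             # remainder
q * b + r == a          # quotient and remainder reconstruct a

# Precedence: ^ binds tighter than unary minus; * and / come before + and -
p1 <- -2^2              # evaluated as -(2^2)
p2 <- (-2)^2
p3 <- 2 + 3 * 4
```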

Complex Number

Function R command
Complex number
> x<-5.2-3i
Real part
> Re(x)
[1] 5.2
Imaginary part
> Im(x)
[1] -3
Modulus
> Mod(x)
[1] 6.003332
Argument
> Arg(x)
[1] -0.5232783
Conjugate
> Conj(x)
[1] 5.2+3i
Membership
> is.complex(x)
[1] TRUE
Coercion
> as.complex(19.6)
[1] 19.6+0i
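
The modulus and argument can be checked against the real and imaginary parts. The following is a sketch of that relationship; mod_manual, arg_manual and y are illustrative names:

```r
x <- 5.2 - 3i

# Modulus and argument computed from the real and imaginary parts
mod_manual <- sqrt(Re(x)^2 + Im(x)^2)
arg_manual <- atan2(Im(x), Re(x))
all.equal(mod_manual, Mod(x))
all.equal(arg_manual, Arg(x))

# Rebuild the number from its polar form
y <- complex(modulus = Mod(x), argument = Arg(x))
all.equal(x, y)
```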

Rounding

Function R command
Greatest integer less than or equal to x
> floor(9.9)
[1] 9
> floor(-9.9)
[1] -10
Smallest integer greater than or equal to x
> ceiling(9.9)
[1] 10
> ceiling(-9.9)
[1] -9
Round to the nearest integer
> round(9.9)
[1] 10
> round(9.2)
[1] 9
Strip off the decimal part (truncate toward zero)
> trunc(8.6)
[1] 8
> trunc(-8.6)
[1] -8
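
round() also accepts a number of decimal places, and signif() rounds to a number of significant digits. A brief sketch of both, together with the way R treats exact halves:

```r
r2 <- round(3.14159, digits = 2)     # two decimal places
s2 <- signif(123456, digits = 2)     # two significant digits
halves <- round(c(0.5, 1.5, 2.5))    # halves go to the even neighbour: 0 2 2
```

The last line follows the IEC 60559 convention used by R, which differs from the "always round half up" rule taught in school.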

Trigonometric Functions

Function   Trigonometric   Inverse trigonometric   Hyperbolic   Inverse hyperbolic
sine       sin(x)          asin(x)                 sinh(x)      asinh(x)
cosine     cos(x)          acos(x)                 cosh(x)      acosh(x)
tangent    tan(x)          atan(x)                 tanh(x)      atanh(x)
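
R's trigonometric functions work in radians, so angles given in degrees have to be converted first. A small sketch, with deg and rad as illustrative names:

```r
deg <- 30
rad <- deg * pi / 180                    # convert degrees to radians
s <- sin(rad)                            # approximately 0.5

all.equal(asin(s), rad)                  # the inverse recovers the angle
all.equal(cosh(1)^2 - sinh(1)^2, 1)      # hyperbolic identity
```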

Log and Exponential Functions

Function                      R command      Example            Result
Absolute value                abs(x)         > abs(-7.4)        [1] 7.4
Log to the base e             log(x)         > log(10)          [1] 2.302585
Log to the base 10            log10(x)       > log10(100)       [1] 2
Log to the base n of x        log(x,n)       > log(64,4)        [1] 3
Exponential                   exp(x)         > exp(3)           [1] 20.08554
Square root                   sqrt(x)        > sqrt(25)         [1] 5
Factorial                     factorial(x)   > factorial(10)    [1] 3628800
Combinations (n choose r)     choose(n,r)    > choose(5,4)      [1] 5
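
Several of these functions are related by simple identities, which makes a convenient check. A sketch using the numbers from the table:

```r
x <- 64
n <- 4
all.equal(log(x, n), log(x) / log(n))     # change of base
all.equal(exp(log(10)), 10)               # exp() undoes log()
all.equal(choose(5, 4),
          factorial(5) / (factorial(4) * factorial(1)))  # n! / (r! (n-r)!)
l2 <- log2(8)                             # base-2 logarithm
```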

Relational Operators and Logical Variables

Relational Operators

Relation                 Operator
Equal                    ==
Not equal                !=
Less than                <
Greater than             >
Less than or equal       <=
Greater than or equal    >=
  • In arithmetic, TRUE is treated as 1 and FALSE as 0.
> x<-c(6,3,4)
> y<-c(5,15,9)
> z<-(x<y)
> z
[1] FALSE  TRUE  TRUE
> z<-(x<y)+5
> z
[1] 5 6 6
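
Because TRUE counts as 1, sum() and mean() can count matches directly. The sketch below also shows a common pitfall with == on floating-point numbers; n_less and share_less are illustrative names:

```r
x <- c(6, 3, 4)
y <- c(5, 15, 9)
n_less <- sum(x < y)        # number of positions where x is less than y
share_less <- mean(x < y)   # proportion of such positions

# Floating-point values are better compared with a tolerance than with ==
exact <- (0.1 + 0.2 == 0.3)
tolerant <- isTRUE(all.equal(0.1 + 0.2, 0.3))
```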

Logical Operators

x         y         !x        x & y     x | y     xor(x,y)
False(0)  False(0)  True(1)   False(0)  False(0)  False(0)
False(0)  True(1)   True(1)   False(0)  True(1)   True(1)
True(1)   False(0)  False(0)  False(0)  True(1)   True(1)
True(1)   True(1)   False(0)  True(1)   True(1)   False(0)
> x<-c(6,2,8)
> y<-c(14,6,7)
> z<-c(4,5,11)
> z1<-x>y
> z1
[1] FALSE FALSE  TRUE
> z2<-y>z
> z2
[1]  TRUE  TRUE FALSE
> z3<-(x>y) & (y>z)
> z3
[1] FALSE FALSE FALSE
> z1<-xor(x,y)
> z1
[1] FALSE FALSE FALSE
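
The element-wise operators return one value per position; any() and all() reduce such a result to a single TRUE or FALSE. A sketch using the same vectors, with illustrative result names:

```r
x <- c(6, 2, 8)
y <- c(14, 6, 7)
z <- c(4, 5, 11)

some_greater <- any(x > y)          # TRUE if at least one comparison holds
all_greater <- all(y > z)           # TRUE only if every comparison holds
either <- (x > y) | (y > z)         # element-wise OR
exactly_one <- xor(x > y, y > z)    # TRUE where exactly one side is TRUE
```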

Sequence Generation and Repeats

> x1 <- seq(from=0,to=3,by=0.5)
> x1
[1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0
> x2 <- seq(from=0.4,by=0.01,length.out=15)
> x2
 [1] 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54
> x3<-seq(1.4,2.1,0.3)
> x3
[1] 1.4 1.7 2.0
> x4<-rep(15,7)
> x4
[1] 15 15 15 15 15 15 15
> x5<-rep(1:4,3)
> x5
 [1] 1 2 3 4 1 2 3 4 1 2 3 4
> x6<-rep(1:3,each=2,times=3)
> x6
 [1] 1 1 2 2 3 3 1 1 2 2 3 3 1 1 2 2 3 3
> x7<-rep(c("a","b","c"),c(1,2,3))
> x7
[1] "a" "b" "b" "c" "c" "c"
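
Two helpers from base R, seq_len() and seq_along(), are often safer than 1:n when building index sequences. A short sketch with illustrative names:

```r
s1 <- seq_len(5)                       # 1 2 3 4 5
s2 <- seq_along(c("a", "b", "c"))      # 1 2 3
s3 <- seq(0, 1, length.out = 5)        # five evenly spaced points from 0 to 1
s4 <- rep(c("x", "y"), times = c(2, 1))

empty <- seq_len(0)                    # empty, whereas 1:0 counts down to 1 0
```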

Random Number Generation

> set.seed(100)
> runif(5)
[1] 0.5465586 0.1702621 0.6249965 0.8821655 0.2803538
> runif(5)
[1] 0.3984879 0.7625511 0.6690217 0.2046122 0.3575249
> x<-c(5,10,8,6,9,11,14,16,18)
> sample(x)
[1]  6 11 18  9  8 16 10  5 14
> sample(x)
[1] 16  9 10  8 18 14 11  6  5
> sample(x,4)
[1] 10 11 14  5
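
set.seed() makes random draws reproducible: resetting the same seed restarts the same stream. The exact numbers depend on the R version and generator settings, so this sketch checks reproducibility rather than specific values:

```r
set.seed(100)
first <- runif(5)
set.seed(100)                  # same seed, same stream
second <- runif(5)
identical(first, second)

x <- c(5, 10, 8, 6, 9, 11, 14, 16, 18)
s <- sample(x, 4)                    # 4 values drawn without replacement
b <- sample(x, 20, replace = TRUE)   # drawing more than length(x) needs replace = TRUE
```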

Vector Functions

Length and Statistics

> x<-c(6,9,11,14,12,2,33,76,0,90)
Function R command
Length
> length(x)
[1] 10
Mean
> mean(x)
[1] 25.3
Max
> max(x)
[1] 90
Min
> min(x)
[1] 0
Quantiles
> quantile(x)
   0%   25%   50%   75%  100% 
 0.00  6.75 11.50 28.25 90.00
Sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
Function R command
Reference the 5th element of the vector
> x[5]
[1] 12
Drop the 3rd element of the vector
> x1<-x[-3]
> x1
[1]  6  9 14 12  2 33 76  0 90
Drop the last element of the vector
> x2<-x[-length(x)]
> x2
[1]  6  9 11 14 12  2 33 76  0
Drop the first and last elements of the vector
> x3<-x[c(-1,-length(x))]
> x3
[1]  9 11 14 12  2 33 76  0
Remove the 2 smallest and 3 largest elements from the vector
> trim <-function(x)sort(x)[-c(1,2,length(x)-2,length(x)-1,length(x))]
> trim(x)
[1]  6  9 11 12 14
Function R command
Sum
> sum(x)
[1] 253
Mean, median
> mean(x)
[1] 25.3
> median(x)
[1] 11.5
Range
> range(x)
[1]  0 90
Standard deviation, variance
> sd(x)
[1] 31.87841
> var(x)
[1] 1016.233
Position of the largest and smallest values
> which(x==max(x))
[1] 10
> which(x==min(x))
[1] 9
Sort and reverse sort
> sort(x)
 [1]  0  2  6  9 11 12 14 33 76 90
> rev(sort(x))
 [1] 90 76 33 14 12 11  9  6  2  0
Row and column statistics of a matrix (rpois() draws random values, so the entries differ between runs)
> x<-matrix(rpois(15,1.2),nrow=3)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    2    1    0    3    3
[2,]    0    2    3    1    2
[3,]    2    2    0    1    1
> mean(x[,5])
[1] 2
> var(x[3,])
[1] 0.7
> rowSums(x)
[1] 9 8 6
> colSums(x)
[1] 4 5 3 5 6
> rowMeans(x)
[1] 1.8 1.6 1.2
> colMeans(x)
[1] 1.333333 1.666667 1.000000 1.666667 2.000000
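
Base R offers shortcuts for a few of the idioms above: which.max() and which.min() return the position of the first extreme value, and mean() has a trim argument. A sketch with the same data vector:

```r
x <- c(6, 9, 11, 14, 12, 2, 33, 76, 0, 90)

i_max <- which.max(x)          # position of the first maximum
i_min <- which.min(x)          # position of the first minimum
m_trim <- mean(x, trim = 0.2)  # drops 20% of the values from each end before averaging
smallest3 <- head(sort(x), 3)
largest3 <- tail(sort(x), 3)
```

Unlike the trim() function defined earlier, mean(x, trim=0.2) removes the same number of values from both ends.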

Parallel min and max

> x<-c(2,5,10,-6,29,45)
> y<-c(5,9,15,-22,38,88)
> z<-c(9,10,2,7,55,24)
> q<-c(22,3,5,6,-23,88)
> pmin(x,y,z,q)
[1]   2   3   2 -22 -23  24
> pmax(x,y,z,q)
[1] 22 10 15  7 55 88
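
pmin() and pmax() work element by element, which makes them useful for limiting values to a range. A sketch, assuming illustrative limits of 0 and 30:

```r
x <- c(2, 5, 10, -6, 29, 45)

# Raise values below 0 to 0, then lower values above 30 to 30
clamped <- pmin(pmax(x, 0), 30)
clamped

overall <- max(x, 0)   # by contrast, max() collapses everything to one number
```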

'table' and 'tapply'

> data(ChickWeight)
> ChickWeight
    weight Time Chick Diet
1       42    0     1    1
2       51    2     1    1
3       59    4     1    1
4       64    6     1    1
5       76    8     1    1
6       93   10     1    1
.......................
576    234   18    50    4
577    264   20    50    4
578    264   21    50    4
> tapply(ChickWeight$weight,ChickWeight$Time,mean)
        0         2         4         6         8        10        12        14 
 41.06000  49.22000  59.95918  74.30612  91.24490 107.83673 129.24490 143.81250 
       16        18        20        21 
168.08511 190.19149 209.71739 218.68889 
> tapply(ChickWeight$weight,ChickWeight$Diet,median)
    1     2     3     4 
 88.0 104.5 125.5 129.5
> codon1<-c("UUU","UUC","UUA","UUG","UUA","UUG","UUC")
> table(codon1)
codon1
UUA UUC UUG UUU 
  2   2   2   1 
> aminoacid<-list(Phe=c("UUU","UUC"),Leu=c("UUA","UUG"))
> codon<-as.factor(codon1)
> levels(codon)<-aminoacid
> codon
[1] Phe Phe Leu Leu Leu Leu Phe
Levels: Phe Leu
> table(codon)
codon
Phe Leu 
  3   4
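
tapply() can also group by two factors at once when they are passed as a list. A sketch on the same built-in ChickWeight data; by_time_diet and per_diet are illustrative names:

```r
data(ChickWeight)

# Mean weight for every combination of measurement time and diet
by_time_diet <- tapply(ChickWeight$weight,
                       list(ChickWeight$Time, ChickWeight$Diet), mean)
dim(by_time_diet)                     # rows are time points, columns are diets

per_diet <- table(ChickWeight$Diet)   # number of records per diet
```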

'apply'

apply() applies a function over the rows (margin 1) or the columns (margin 2) of a matrix.

> x<-matrix(1:15,nrow=3,byrow=TRUE)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    6    7    8    9   10
[3,]   11   12   13   14   15
> apply(x,1,sum)
[1] 15 40 65
> apply(x,2,sum)
[1] 18 21 24 27 30
> apply(x,1,sqrt)
         [,1]     [,2]     [,3]
[1,] 1.000000 2.449490 3.316625
[2,] 1.414214 2.645751 3.464102
[3,] 1.732051 2.828427 3.605551
[4,] 2.000000 3.000000 3.741657
[5,] 2.236068 3.162278 3.872983
> apply(x,2,sqrt)
         [,1]     [,2]     [,3]     [,4]     [,5]
[1,] 1.000000 1.414214 1.732051 2.000000 2.236068
[2,] 2.449490 2.645751 2.828427 3.000000 3.162278
[3,] 3.316625 3.464102 3.605551 3.741657 3.872983
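
Note that apply(x,1,sqrt) returns its result transposed: each row of x becomes a column of the output. A sketch of how to restore the original layout, with by_row as an illustrative name:

```r
x <- matrix(1:15, nrow = 3, byrow = TRUE)

by_row <- apply(x, 1, sqrt)        # 5 x 3: one column per row of x
dim(by_row)
all.equal(t(by_row), sqrt(x))      # transposing restores the 3 x 5 layout

# For simple totals, rowSums() gives the same answer as apply()
all.equal(apply(x, 1, sum), rowSums(x))
```

Since sqrt() is already vectorised, sqrt(x) is the simpler choice here; apply() is more useful for functions that summarise a whole row or column.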

Closest

> x<-c(3,22,15,11,50,85)
> x-10
[1] -7 12  5  1 40 75
> abs(x-10)
[1]  7 12  5  1 40 75
> min(abs(x-10))
[1] 1
> which(abs(x-10)==min(abs(x-10)))
[1] 4
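
The steps above can be wrapped into a small helper. This is a sketch only: closest() is a hypothetical name, not a base R function, and on ties it returns the first match.

```r
# Return the element of 'values' nearest to 'target' (first one on ties)
closest <- function(values, target) {
  values[which.min(abs(values - target))]
}

x <- c(3, 22, 15, 11, 50, 85)
near10 <- closest(x, 10)
near60 <- closest(x, 60)
```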

Sort, Rank, Order

> x<-c(2,5,10,-6,29,45)
> # rank: the rank of each element within the unsorted vector
> rank(x)
[1] 2 3 4 1 5 6
> # order: the indices that would put the vector in sorted order
> order(x)
[1] 4 1 2 3 5 6
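
The relationship between order() and sort() can be verified directly, and order() is mainly useful for arranging one object according to another. A sketch in which tags is an illustrative vector of labels:

```r
x <- c(2, 5, 10, -6, 29, 45)
o <- order(x)

identical(x[o], sort(x))                                  # indexing by order() sorts x
identical(x[order(x, decreasing = TRUE)], rev(sort(x)))   # descending order

# Reorder a second vector so that it follows the sorted x
tags <- c("a", "b", "c", "d", "e", "f")
tags_sorted <- tags[o]
```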

Unique and Duplicated

> x<-c("a","b","c","a","a","a","b","c")
> table(x)
x
a b c 
4 2 2 
> unique(x)
[1] "a" "b" "c"
> duplicated(x)
[1] FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
> x[!duplicated(x)]
[1] "a" "b" "c"

Run length

> x<-rpois(20,0.5)
> x
 [1] 2 0 0 1 0 1 0 0 1 1 0 0 2 0 0 0 0 0 0 0
> rle(x)
Run Length Encoding
  lengths: int [1:10] 1 2 1 1 1 2 2 2 1 7
  values : int [1:10] 2 0 1 0 1 0 1 0 2 0
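
The result of rle() is a list with components lengths and values, so it can be queried and reversed. A sketch that uses the vector printed above as fixed input rather than a new random draw:

```r
x <- c(2, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0)
r <- rle(x)

longest <- max(r$lengths)                        # length of the longest run
longest_value <- r$values[which.max(r$lengths)]  # value that forms that run
restored <- inverse.rle(r)                       # decode back to the original vector
```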

Set functions

> setA <-c("I","II","III","IV","V")
> setB <-c("III","IV","V","VI")
> union(setA,setB)
[1] "I"   "II"  "III" "IV"  "V"   "VI" 
> intersect(setA,setB)
[1] "III" "IV"  "V"  
> setdiff(setA,setB)
[1] "I"  "II"
> setdiff(setB,setA)
[1] "VI"
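
Membership tests complement the set functions. A short sketch with the same two sets; the result names are illustrative:

```r
setA <- c("I", "II", "III", "IV", "V")
setB <- c("III", "IV", "V", "VI")

in_B <- setA %in% setB                 # one TRUE/FALSE per element of setA
has_VI <- is.element("VI", setA)       # is "VI" a member of setA?
same <- setequal(c("I", "II"), c("II", "I"))   # order does not matter
only_A <- setA[!(setA %in% setB)]      # same result as setdiff(setA, setB)
```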

Practise
