R Bootcamp - Lecture 2 by Tengyu, Dash, Taylor, Milton and Bingjie
Functions
A function is a block of code that can be called by its name to perform a specific task. Functions are used to automate computation and reduce the amount of repeated code.
To define a function in R, we use the keyword function. The syntax is as follows.
function_name <- function(arg_1, arg_2, ...) {
# body of the function
...
return(value)
}
Here's an example.
minus <- function(x, y) {
z <- x - y
return(z)
}
Here, we define a function called minus that takes two arguments x and y. The function returns the difference between x and y.
Now, we can call the function minus by passing two arguments.
minus(10, 5)
If you run the code above, you will get the result 5.
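Because subtraction is not commutative, the order of the arguments matters. The short sketch below reuses the minus function from above to compare positional and named calls; the variable names a, b, and c_named are only illustrative.

```r
# Same function as in the example above
minus <- function(x, y) {
  z <- x - y
  return(z)
}

# Positional matching: x = 10, y = 5
a <- minus(10, 5)

# Swapping the positions changes the result
b <- minus(5, 10)

# Named arguments can be given in any order
c_named <- minus(y = 5, x = 10)

print(a)        # 5
print(b)        # -5
print(c_named)  # 5
```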
Functions Exercise
Now it's your turn to write a function. Write a function that finds the product of the 4th and 5th elements of a vector, and pass it the input Vector = c(2, 4, 6, 8, 10, 12).
For Loop
A for loop is used to iterate through the elements of a vector, the columns or rows of a data frame, a list, and so on. The syntax is as follows.
for (val in sequence) {
# body of the loop
...
}
Here is an example that prints all the numbers from 1 to 5 using a for loop.
for (i in 1:5) {
print(i)
}
We can also use a for loop to iterate through a vector.
# Named chars rather than letters to avoid masking R's built-in letters constant
chars <- c("a", "b", "c", "d", "e")
for (letter in chars) {
print(letter)
}
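The same pattern works for the columns of a data frame. The sketch below is only an illustration: the data frame df and its columns height and weight are made up for this example, and col_means is a hypothetical name for the vector that collects the results.

```r
# A small made-up data frame with two numeric columns
df <- data.frame(height = c(150, 160, 170), weight = c(50, 60, 70))

# Start with an empty numeric vector and fill it inside the loop
col_means <- numeric(0)
for (col_name in names(df)) {
  # df[[col_name]] extracts one column as a vector
  col_means[col_name] <- mean(df[[col_name]])
}

print(col_means)
```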
Within a loop, we can also have another loop. This is called a nested loop.
for (i in 1:3) {
print("Outer loop")
for (j in 1:3) {
print("Inner loop")
}
}
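In a nested loop, the inner loop runs to completion once for every iteration of the outer loop, so the example above prints "Inner loop" nine times. As a small sketch of a more practical use, the code below fills a 3 x 3 multiplication table; the matrix name tbl is an assumption made for this example.

```r
# Build a 3 x 3 multiplication table with a nested loop
tbl <- matrix(0, nrow = 3, ncol = 3)
for (i in 1:3) {      # outer loop: rows
  for (j in 1:3) {    # inner loop: columns
    tbl[i, j] <- i * j
  }
}
print(tbl)
```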
While Loop
A while loop repeats a block of code as long as a specified condition is true. The syntax is as follows.
while (condition) {
# body of the loop
...
}
You can define the condition using comparison operators (==, !=, <, >, <=, >=) and combine several conditions with logical operators (&&, ||).
For example, similar to the for loop example above, we can print all the numbers from 1 to 5 using a while loop.
i <- 1
while (i < 6) {
print(i)
i <- i + 1
}
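A condition can also combine two tests with &&. The following sketch, with illustrative variable names value and steps, doubles a number until it reaches 100 or until 10 steps have run, whichever comes first.

```r
value <- 1
steps <- 0

# Keep doubling while value is under 100 AND fewer than 10 steps have run
while (value < 100 && steps < 10) {
  value <- value * 2
  steps <- steps + 1
}

print(value)  # 128
print(steps)  # 7
```

Always make sure something inside the loop body changes the condition; otherwise the loop never ends.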
Loop Exercise
Now it's your turn to write a while loop. Write a while loop that prints the 1st through 4th elements of the vector Vector = c(2, 4, 6, 8, 10, 12).
If/Else Statement
An if/else statement executes one block of code if a specified condition is true and another block if it is false. The following flow chart shows the structure of an if/else statement.

To write an if/else statement in R, we use the keywords if and else. The syntax is as follows.
if (condition) {
# body of if statement
...
} else {
# body of else statement
...
}
Here is an example that uses an if/else statement to check whether a number is even or odd.
num <- 10
if (num %% 2 == 0) {
print("Even")
} else {
print("Odd")
}
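An if/else statement is often placed inside a function so that the same check can be reused. The sketch below wraps the even/odd check from above in a function; the name parity is hypothetical and chosen only for this example.

```r
# Return "Even" or "Odd" for a single number
parity <- function(num) {
  if (num %% 2 == 0) {
    return("Even")
  } else {
    return("Odd")
  }
}

print(parity(10))  # "Even"
print(parity(7))   # "Odd"
```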
If/Else Exercise
Use a loop and an if/else statement to check whether each element of a vector is divisible by 4 with no remainder. If yes, print the element. If no, print "Not divisible by 4".
Factor Review
In Lecture 1, we discussed factors. Here is a quick review.
In R, we can use a factor to represent a categorical variable, which takes values from a fixed set of unique levels.
- To transform a vector into a factor, we call factor(c(data1, data2, data3, data4, data5)).
- We can use the function levels() to retrieve the levels of the factor.
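As a quick illustration of both points, the sketch below builds a factor from a made-up character vector (the names sizes and size_factor are assumptions for this example) and inspects its levels.

```r
# A made-up categorical variable
sizes <- c("small", "large", "medium", "small", "large")

# Convert the character vector into a factor
size_factor <- factor(sizes)

# By default, levels are the sorted unique values
print(levels(size_factor))  # "large" "medium" "small"

# Count how many observations fall in each level
print(table(size_factor))
```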
Data Transformation
Here are the functions that are frequently used to transform data.
- apply()
- lapply()
- sapply()
- tapply()
- by()
- split()
Splitting a Vector into Groups
Suppose you have a vector in which each element belongs to a group, and the groups are identified by a grouping factor. You want to split the elements into their groups.
We can use the split(x, f) function, where x is the vector and f is the grouping factor (a character vector, as in the example that follows, is converted to a factor automatically). The function returns a list of vectors, where each vector contains the elements of one group.
Here is an example.
x <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "b", "a", "b", "a", "b")
group <- split(x, f)
Now we can check the result.
group
group$a
group$b
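To make the result concrete, the sketch below repeats the split from above and checks what the returned list contains. It adds nothing beyond base R.

```r
x <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "b", "a", "b", "a", "b")
group <- split(x, f)

# The result is a list with one element per group
print(is.list(group))  # TRUE
print(names(group))    # "a" "b"

# Elements at positions 1, 3, 5 were labelled "a"; positions 2, 4, 6 were "b"
print(group$a)         # 1 3 5
print(group$b)         # 2 4 6
```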
Apply a Function to Each List Element
Suppose you have a list, and you want to apply a function to each element of the list.
We can use either the lapply function or the sapply function, depending on the desired form of the result. lapply always returns the results in a list, whereas sapply simplifies the results to a vector (or matrix) when that is possible.
Here is an example.
x <- list(a = 1:5, b = rnorm(10))
lapply(x, mean)
sapply(x, mean)
The first argument to lapply or sapply is the list. The second argument is the function to be applied to each element of the list.
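The difference in return type is easier to see with fixed data. The sketch below uses a small made-up list instead of random numbers; the names as_list, as_vector, and kept are illustrative only.

```r
x <- list(a = 1:5, b = c(2, 4, 6))

as_list <- lapply(x, mean)    # always a list
as_vector <- sapply(x, mean)  # simplified to a named numeric vector

print(is.list(as_list))       # TRUE
print(is.numeric(as_vector))  # TRUE
print(as_vector)              # a = 3, b = 4

# When the results have different lengths, sapply cannot simplify
# and falls back to returning a list
kept <- sapply(list(a = 1:5, b = 1:2), function(v) v[v > 1])
print(is.list(kept))          # TRUE
```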
Apply a Function to Each Row
Now, instead of a one-dimensional list, you have a matrix or data frame, and you want to apply a function to every row, calculating the function result for each row.
We can use the apply function. Set the second argument to 1 to indicate row-by-row application of a function.
Here is an example.
x <- matrix(rnorm(200), 20, 10)
apply(x, 1, mean)
You can also try to use it on a dataframe and see what happens.
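As a hedged sketch of what to expect: apply converts a data frame to a matrix before looping over it, so for an all-numeric data frame the row results match those of the original matrix. The small matrix m below is made up for illustration.

```r
# Rows of m are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)
df <- data.frame(m)

row_means_m <- apply(m, 1, mean)    # 3 4
row_means_df <- apply(df, 1, mean)  # same values, computed from the data frame

print(row_means_m)
print(row_means_df)

# For means specifically, base R also provides rowMeans()
print(rowMeans(m))
```

If the data frame contains a non-numeric column, the conversion produces a character matrix, and numeric functions such as mean will no longer work as intended.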
Apply a Function to Each Column
Similarly, we can apply a function to each column of a matrix or data frame using the apply function, but we need to set the second argument to 2 to indicate column-by-column application of a function. The rest is the same.
x <- matrix(rnorm(200), 20, 10)
apply(x, 2, mean)
x <- data.frame(x)
apply(x, 2, mean)
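The function passed to apply does not have to be a built-in one. The sketch below, using a made-up matrix m and an anonymous function, computes the spread (maximum minus minimum) of each column.

```r
# Columns of m are (1, 4), (2, 8) and (3, 12)
m <- matrix(c(1, 4, 2, 8, 3, 12), nrow = 2)

# The anonymous function receives one column at a time as a vector
col_spread <- apply(m, 2, function(col) max(col) - min(col))

print(col_spread)  # 3 6 9
```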
Apply a Function to Groups of Data
Now we have a vector and a grouping factor. We want to apply a function to each group of data.
We can use the tapply function, which applies a function to each group of data.
Here is an example.
x <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "b", "a", "b", "a", "b")
tapply(x, f, mean)
In general, the call has the form tapply(x, f, fun), where x is a vector, f is a grouping factor, and fun is a function (mean in the example above). The function should expect one argument: a vector of the elements of x that belong to one group.
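One way to think about tapply is as a split followed by sapply. The sketch below computes the group means both ways on the data from above and shows that the values agree; the names group_means and split_means are illustrative.

```r
x <- c(1, 2, 3, 4, 5, 6)
f <- c("a", "b", "a", "b", "a", "b")

# Group "a" holds 1, 3, 5 and group "b" holds 2, 4, 6
group_means <- tapply(x, f, mean)         # a = 3, b = 4

# The same computation written as split + sapply
split_means <- sapply(split(x, f), mean)

print(group_means)
print(split_means)
```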
Apply a Function to Groups of Rows
This is quite common in data analysis. We have a data frame and a grouping factor, and we want to apply a function to each group of rows.
We can use the by function, which applies a function to each group of rows.
Here is an example on the iris dataset.
by(iris$Sepal.Length, iris$Species, mean)
This calculates the mean of Sepal.Length for each species. We can also calculate the means of all four numeric columns at the same time.
by(iris[, 1:4], iris$Species, colMeans)
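When the whole data frame is passed to by, the function receives each group as a smaller data frame, so it can use several columns at once. As a sketch, the code below computes the correlation between sepal length and sepal width within each species; the anonymous function and its argument name d are assumptions made for this example.

```r
# iris ships with R in the datasets package
cors <- by(iris, iris$Species, function(d) {
  # d is the subset of rows belonging to one species
  cor(d$Sepal.Length, d$Sepal.Width)
})

print(cors)
```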