R Basics

In this document, the gray blocks show code that you can run in R or RStudio, and the white blocks (lines beginning with ##) show the output you would get if you ran that code.

Assigning variables

There are various ways we can assign variables in R depending on our purpose. A simple assignment for a single variable is done as follows:

a <- 5
print(a)

## [1] 5

To assign a vector, you can use:

b <- c(1,2,4)
print(b)

## [1] 1 2 4

However, you can also create specific types of vectors. For instance, you can generate the numbers between a minimum and a maximum with a given step size. This is useful when, for example, you want to track an evolutionary process until a given end time using fixed time steps. To initialize a vector with the numbers between 0 and 100 in steps of 1, you can run:

c <- seq(0,100,1)
print(c)

##   [1]   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
##  [19]  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
##  [37]  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
##  [55]  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
##  [73]  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
##  [91]  90  91  92  93  94  95  96  97  98  99 100
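
As a further sketch, seq() also accepts a non-integer step through its by argument, or a target number of values through its length.out argument. The variable names below are made up for illustration:

```r
# Steps of 0.25 between 0 and 1
quarter_steps <- seq(0, 1, by = 0.25)
print(quarter_steps) # 0, 0.25, 0.5, 0.75, 1

# Five evenly spaced values between 0 and 100
five_points <- seq(0, 100, length.out = 5)
print(five_points) # 0, 25, 50, 75, 100
```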

You can also create a vector that holds a given number repeated a specific number of times:

d <- rep(5,10)
print(d)

##  [1] 5 5 5 5 5 5 5 5 5 5

Sometimes you want to create a vector to hold numbers that you’ll calculate later. You can pre-allocate a vector of a specific type and length with:

e <- numeric(100) # a vector that will hold numbers, initialized with zeros by default
print(e)

##   [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

f <- character(100) # a vector that will hold characters, initialized with empty strings by default
print(f)

##   [1] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
##  [26] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
##  [51] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""
##  [76] "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ""

To change a given element of a vector, index it with square brackets and assign the new value:

e[5] <- 3
print(e)

##   [1] 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [38] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
##  [75] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

Now the 5th element of the vector e is 3.
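
The same square-bracket syntax can read elements back and can address several positions at once. A minimal sketch, using a shorter, hypothetical vector v so that the output stays readable:

```r
v <- numeric(10)       # ten zeros
v[5] <- 3              # change one element
v[c(1, 2)] <- c(7, 8)  # change several elements at once
print(v[5])            # read back a single element: 3
print(v[1:5])          # read back the first five elements: 7 8 0 0 3
print(length(v))       # number of elements in the vector: 10
```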

For loops

For loops repeat a calculation a specific number of times.

for (i in 1:10) {
  print(i)
}

## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10

A for loop takes an index (in this case i) and a vector of values (in this case 1:10, the integers from 1 to 10). The index starts at the first value (here 1) and moves to the next value in every round. The loop runs its final round when i is equal to the last value (here 10). That is why, when the above R code is run, the last value it prints is 10.
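
The values a for loop walks over do not have to be consecutive integers. As a minimal sketch (the names steps and doubled are made up for illustration), the loop below uses seq_along() to walk over the positions of a vector and fills a pre-allocated result vector:

```r
steps <- c(0.5, 1.5, 2.5)           # hypothetical values, not consecutive integers
doubled <- numeric(length(steps))   # pre-allocated vector for the results
for (k in seq_along(steps)) {       # k takes the positions 1, 2, 3
  doubled[k] <- steps[k] * 2        # use k to read from steps and write to doubled
}
print(doubled) # 1 3 5
```

Looping over positions rather than values keeps the index valid for writing into the result vector, whatever the values themselves are.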

We can use for loops to make calculations over time. For instance:

min_time <- 1
max_time <- 10
time_step <- 1
times <- seq(min_time,max_time,time_step)
r <- 2 # growth rate 
population_size <- 10 # population begins with 10 individuals
population <- numeric(length(times)) # creating a vector to hold our population
for (i in seq_along(times)) { # i is the position in times, so population[i] stays valid even if time_step is not 1
  population_size <- population_size*r # calculating population size in the next time step
  print(population_size) # printing the population size
  population[i] <- population_size # if we want to plot it later, we should also keep these values in a vector
}

## [1] 20
## [1] 40
## [1] 80
## [1] 160
## [1] 320
## [1] 640
## [1] 1280
## [1] 2560
## [1] 5120
## [1] 10240

population # printing the vector we created in the for-loop

##  [1]    20    40    80   160   320   640  1280  2560  5120 10240
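
Because the population is multiplied by r in every time step, its size after t steps is the initial size times r to the power t. As a sketch of how to check the loop against this formula (the names initial_size and population_closed are made up for illustration), the same values can be computed in one vectorized line:

```r
r <- 2                      # growth rate, as in the loop above
initial_size <- 10          # same starting size as in the loop above
times <- seq(1, 10, 1)
population_closed <- initial_size * r^times # r^times is computed element by element
print(population_closed)    # same ten values as the loop, from 20 up to 10240
```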

While loops

We use while loops to run a calculation while a specific condition is satisfied. For instance:

i <- 1
while (i<=10) {
  i <- i+1
  print(i)
}

## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
## [1] 11

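Our while loop runs as long as i is smaller than or equal to 10. Because i is incremented before it is printed, the output starts at 2 and ends at 11: in the last round i is 10 when the condition is checked, then it becomes 11 and is printed. R then compares i with 10 again and, seeing that i is greater, stops executing the body of the while loop.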

Continuing our population example, let’s calculate the population size while it is smaller than 5000.

r <- 2 # growth rate 
population_size <- 10 # population begins with 10 individuals
population <- numeric() # creating a vector to hold our population
# Notice that now we cannot say how large our vector will be, because we are not pre-defining a specific length of time
# to run our calculation, but are stopping it when the population size exceeds a certain value
i <- 1 # we now need to define our own index outside the loop
while (population_size < 5000) {
  population_size <- population_size*r # calculating population size in the next time step
  print(population_size) # printing the population size
  population[i] <- population_size # if we want to plot it later, we should also keep these values in a vector
  i <- i + 1 # the for loop was doing this "automatically", now we need to add an increment to the index ourselves
}

## [1] 20
## [1] 40
## [1] 80
## [1] 160
## [1] 320
## [1] 640
## [1] 1280
## [1] 2560
## [1] 5120

population # printing the vector we created in the while loop

## [1]   20   40   80  160  320  640 1280 2560 5120

Our population grew past 5000 before the while loop ended because the condition is only checked at the start of each round. When the population size was 2560, the condition was still satisfied, so R ran the body once more and calculated 5120. Only then did it check the new value against the condition and halt the while loop.
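
If the population should never exceed the limit, one option is to test the next value before calculating it. This is only a sketch of that variation, not a replacement for the loop above:

```r
r <- 2                   # growth rate
population_size <- 10    # population begins with 10 individuals
population <- numeric()  # vector to hold the population sizes
i <- 1
while (population_size * r < 5000) { # check the NEXT value before computing it
  population_size <- population_size * r
  population[i] <- population_size
  i <- i + 1
}
print(population) # stops at 2560, because the next value (5120) would exceed the limit
```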

If statements

We use an if statement to run code only when a condition is satisfied (the condition goes inside the parentheses, just as with a while loop). For instance:

t <- 5
if (t < 10) {
  print("t is smaller than 10") 
} else if (t >= 10 && t < 1000) {
  print("t is greater than or equal to 10 and smaller than 1000") 
} else {
  print("t is greater than or equal to 1000")
}

## [1] "t is smaller than 10"

If we ran the same code with different values of t, we would get:

t <- 15
if (t < 10) {
  print("t is smaller than 10") 
} else if (t >= 10 && t < 1000) {
  print("t is greater than or equal to 10 and smaller than 1000") 
} else {
  print("t is greater than or equal to 1000")
}

## [1] "t is greater than or equal to 10 and smaller than 1000"

t <- 1005
if (t < 10) {
  print("t is smaller than 10") 
} else if (t >= 10 && t < 1000) {
  print("t is greater than or equal to 10 and smaller than 1000") 
} else {
  print("t is greater than or equal to 1000")
}

## [1] "t is greater than or equal to 1000"

Here are some operators that can be used in an if statement:

Operator	Meaning
&&	AND
||	OR
<	smaller than
<=	smaller than or equal to
>	greater than
>=	greater than or equal to
==	equal to
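
As a small sketch of these operators in use (the variable names are made up for illustration), each comparison below evaluates to TRUE or FALSE, which is exactly what an if statement tests:

```r
t <- 15
in_range   <- t >= 10 && t < 1000 # AND: both conditions must hold
outside    <- t < 10 || t >= 1000 # OR: at least one condition must hold
is_fifteen <- t == 15             # equality test (two equals signs, not one)
print(in_range)   # TRUE
print(outside)    # FALSE
print(is_fifteen) # TRUE
```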

Conditional expressions can also be used to select or modify specific elements of the data. For instance:

data <- c(1,2,3,4,5,6) # let's create a vector first
data

## [1] 1 2 3 4 5 6

data[data<5] <- 1 # assign 1 to all elements smaller than 5
data

## [1] 1 1 1 1 5 6
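
Behind the scenes, a comparison on a vector produces a logical vector with one TRUE or FALSE per element, and that logical vector does the selecting. A minimal sketch, starting again from the original data (the name is_small is made up for illustration):

```r
data <- c(1, 2, 3, 4, 5, 6)
is_small <- data < 5    # logical vector: TRUE where the condition holds
print(is_small)         # TRUE TRUE TRUE TRUE FALSE FALSE
print(data[is_small])   # keep only the elements smaller than 5: 1 2 3 4
print(which(is_small))  # positions of those elements: 1 2 3 4
print(sum(is_small))    # how many there are (each TRUE counts as 1): 4
```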

Functions

In R (as well as many other programming languages such as MATLAB), there are two ways to execute code: scripts and functions. A script runs a fixed sequence of commands with the specific values written into it, whereas a function can be used flexibly, since we can call it with our own values whenever we need it. In this respect, functions in R are not so different from functions in mathematics. If you do a one-off calculation, you might write:

y=2+3=5

But if you want to add 3 to something else, maybe over and over again, you’d define a function:

y=f(x)=x+3

If you call this function with x=2, the result would be the same as the one-off calculation. But you can also use other values of x and do other calculations with it.

In R, it works like this:

y <- 2+3 # one-off calculation

f <- function(x) {
  y <- x + 3
  return(y) # hand the result back to whoever called the function
} # function
y <- f(2)
print(y)

## [1] 5

y <- f(5)
print(y)

## [1] 8
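
Functions can take more than one argument, and arguments can have default values. As a sketch that ties back to the population example, the hypothetical function grow_population below (its name and argument names are made up for illustration) returns the population size after a number of time steps:

```r
# Hypothetical helper: population size after n_steps rounds of growth
grow_population <- function(initial_size, r = 2, n_steps = 10) {
  final_size <- initial_size * r^n_steps
  return(final_size)
}

size_default <- grow_population(10)                     # uses r = 2 and n_steps = 10
size_custom  <- grow_population(10, r = 3, n_steps = 2) # overrides both defaults
print(size_default) # 10240
print(size_custom)  # 90
```

Calling the function with only the first argument reproduces the last value from the for loop, while naming the other arguments lets you explore different growth rates without rewriting any code.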