Motivating scenario: You keep typing a bunch of numbers into R and you forget what they mean. You wish there was a better way.
Learning goals: By the end of this sub-chapter you should be able to assign values to variables with the assignment operator, <-.
Up until now, we had to enter data into a vector every time we wanted to access it. This is tedious and creates many opportunities for errors.
Storing values in variables allows for efficient (and less error-prone) analyses, while paving the way to more complex calculations. In R, we assign values to variables using the assignment operator, <-. For example, to store the value 1 in a variable named x, type x <- 1. Now, 2 * x will return 2.
x <- 1 # Assign 1 to x
2 * x # Multiply x by 2
[1] 2
But R must have a value defined before it can use it. The code below aims to set y equal to five, and see what y plus one is (it should be six). However, it returns an error. Run the code to see the error message, then fix it!
R reads and executes each line of code sequentially, from top to bottom. Think about what y + 1 means to R if it hasn't seen a definition of y yet.
In R, variables must be defined before they are used. When you try to use y + 1 before assigning a value to y, R throws an error because it doesn't know what y is yet. When we switch the order, assigning y <- 5 before using y + 1, R understands the command and evaluates it properly.
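The interactive exercise is not reproduced in this text, so here is a minimal sketch of the two orderings described above. It assumes a fresh R session in which y has not yet been defined. The name attempt is a hypothetical helper, and tryCatch() is used only so that the script keeps running after the error.

```r
# Wrong order: y is used before it is defined.
# tryCatch() captures the error message instead of stopping the script.
# (Assumes y does not already exist in the session.)
attempt <- tryCatch(
  y + 1,
  error = function(e) conditionMessage(e)
)
attempt # the error text, reporting that object 'y' was not found

# Right order: define y first, then use it.
y <- 5
y + 1 # returns 6
```

Running the first expression on its own, without tryCatch(), would stop with the same error message.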
Now, try assigning different numbers to x and y, or even using them together in a calculation, such as x + y. Understanding this concept of assigning values is critical to understanding how to use R.
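As one possible way to try this, the sketch below uses arbitrary illustrative values (3, 10, and 7 are assumptions, not values from the text). It also shows that assigning to a name that already exists replaces its old value.

```r
# Arbitrary illustrative values
x <- 3
y <- 10
x + y # returns 13

# Reassigning x replaces its previous value
x <- 7
x + y # returns 17
```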
Variable assignment gets even more useful when we're dealing with the kind of real data we store in vectors. Let's return to our four hypothetical Clarkia plants.
By assigning these vectors to variables with reasonable names, our code becomes much clearer!
petals_per_flower <- 4
num_flowers <- c(1, 2, 3, 2)
seeds_per_flower <- c(3, 0.5, 1, 1)
petals_per_flower * num_flowers # total petals
[1]  4  8 12  8
num_flowers * seeds_per_flower # total seeds (for each plant)
[1] 3 1 3 2
sum(num_flowers * seeds_per_flower) # total seeds (overall)
[1] 9
Variable assignment can be optional: In the code, I assigned observations to the vector, num_flowers, and then found the mean. But we could have skipped variable assignment: mean(c(1, 2, 3, 2)) also returns 2.
There are two good reasons not to skip variable assignment:
Variable assignment makes code easier to understand. If I revisited my code weeks later, I would still know what the mean of this vector meant.
Variable assignment allows us to easily reuse the information. For example, below I can easily find the mean petal number.
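The code for this last point is not reproduced in this text. A minimal sketch, reusing the values from the Clarkia example and assuming that "mean petal number" means the mean number of petals per plant, might look like the following. The name mean_petals_per_plant is hypothetical.

```r
# Values taken from the Clarkia example above
petals_per_flower <- 4
num_flowers <- c(1, 2, 3, 2)

# With or without a variable, the mean number of flowers is the same
mean(num_flowers)   # returns 2
mean(c(1, 2, 3, 2)) # returns 2

# Reusing num_flowers: mean number of petals per plant
# (assumes "mean petal number" is meant per plant)
mean_petals_per_plant <- mean(petals_per_flower * num_flowers)
mean_petals_per_plant # returns 8
```

Because num_flowers already has a name, the second calculation needs no retyping of the raw numbers, which is where copying errors tend to creep in.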