Vectors in R

Vectors are the fundamental data type in R. R has no separate scalar type: a single number, string, or logical value is simply a vector of length one, and a matrix is a vector with a dimension attribute attached.

From a data scientist’s perspective, you can consider a vector as a collection of observations across an interval of time, such as temperatures read every day, total sales for the day, etc. R provides several relevant functions to handle vectors from this perspective.

Creating Vectors in R

You create a vector with the c() function, which combines its arguments into a single vector.

> myvec <- c(3.1,45,1,2,80)

The elements can also be written as expressions. R evaluates each expression first and stores the result in the vector.

> myvec2 <- c(3,5*8,(9/4))
> myvec2
[1]  3.00 40.00  2.25

We can create vectors using previously created variables.

> a <- 10
> b <-14.8
> c <-2
> myvec3 <- c(1,a,b,c)
> myvec3
[1]  1.0 10.0 14.8  2.0

We can also create a vector using two or more of the existing vectors.

> bigvec <-c(myvec,myvec2)
> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25

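A vector can hold any number of items, but all of them must share the same data type (sometimes called the mode). Note: If you pass values of different types to c(), R does not raise an error. It silently coerces every element to a single common type, choosing the most general one present in the order logical, integer, double, character.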
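In practice, mixing types in c() does not fail; R converts the elements to one common type. A minimal sketch of that behaviour follows. The variable names mixed_num and mixed_chr are illustrative and do not appear elsewhere in this tutorial.

```r
# Logical and numeric values combine into a numeric vector: TRUE becomes 1.
mixed_num <- c(1, TRUE, 2.5)
class(mixed_num)   # "numeric"

# A single string forces every element to character.
mixed_chr <- c(1, "a", TRUE)
class(mixed_chr)   # "character"
mixed_chr          # "1" "a" "TRUE"
```

Checking class() after building a vector from mixed input is a quick way to catch an unintended conversion, for example numbers that have quietly become strings.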

Operations on Vectors in R Language

Vectors are stored contiguously in memory, similar to arrays in C. You can index the elements of a vector, extract subsets, sort it, and perform routine mathematical operations on it element-wise. The examples below illustrate each of these.

Indexing elements of a vector

The elements of a vector can be extracted by index, much like accessing array elements in other languages. Note that R indexes from 1, not 0, so the first element is at index 1. The following code snippet provides an example.

> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25
> bigvec[2]
[1] 45
> avar <- bigvec[7]
> avar
[1] 40

When you try accessing an element beyond the vector’s size, R returns an NA value.
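The out-of-range behaviour can be checked directly. This sketch rebuilds bigvec with the same values used above so that it runs on its own; the name out_of_range is illustrative.

```r
bigvec <- c(3.1, 45, 1, 2, 80, 3, 40, 2.25)  # same values as bigvec above

# Index 10 is past the 8 elements, so R returns NA instead of an error.
out_of_range <- bigvec[10]
out_of_range          # NA
is.na(out_of_range)   # TRUE

# Index 0 is also not an error: it returns an empty numeric vector.
length(bigvec[0])     # 0
```

Because no error is raised, an out-of-range index can go unnoticed until the NA shows up in a later calculation.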

Getting the length of a vector in R language

We often work with data from a downloaded dataset, reading entire columns into vectors without knowing their size beforehand. In these cases, the length is important to know so that we don’t run into NA values by indexing past the end. The length() function returns the number of elements in a vector.

> length(bigvec)
[1] 8
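One way to use length() as a guard is sketched below. The index idx and the result name value are hypothetical, chosen only for illustration; seq_along() is a base R function that generates the valid indices of a vector.

```r
bigvec <- c(3.1, 45, 1, 2, 80, 3, 40, 2.25)  # same values as bigvec above
idx <- 10                                    # hypothetical index to look up

# Only index the vector when idx falls inside its bounds.
if (idx >= 1 && idx <= length(bigvec)) {
  value <- bigvec[idx]
} else {
  value <- NA
  message("Index ", idx, " is outside 1..", length(bigvec))
}

# seq_along() yields every valid index, which is handy for loops.
seq_along(bigvec)   # 1 2 3 4 5 6 7 8
```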

Subsetting vectors in R

When dealing with long vectors, it is sometimes necessary to extract only the elements of interest from the vector. We can do this by making use of subsetting in R.

> bigvec
[1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25

#Extract the last element of a vector - where index equals length
> bigvec[length(x=bigvec)]
[1] 2.25

#Extract the last but one element - subtract one from the length
> bigvec[length(x=bigvec)-1]
[1] 40

#Extract all elements except for the first element
> bigvec[-1]
[1] 45.00  1.00  2.00 80.00  3.00 40.00  2.25

#All elements except for the second one
> bigvec[-2]
[1]  3.10  1.00  2.00 80.00  3.00 40.00  2.25

#Elements from index 1 to index 3
> bigvec[1:3]
[1]  3.1 45.0  1.0

#Extract elements at the specified indices 1 and 5
> bigvec[c(1,5)]
[1]  3.1 80.0

We will look deeper into subsetting when we are working with real datasets in our further tutorials.

Generating Sequences

Sequences are vectors of regularly spaced values. The colon operator (:) generates a sequence in steps of 1, and the seq() function gives finer control over the step. Both methods are illustrated below.

> 4:10
[1]  4  5  6  7  8  9 10

#from is the starting value and to is the ending value
#by is the increment between consecutive values.
> seq(from=1,to=20,by=2)
 [1]  1  3  5  7  9 11 13 15 17 19

#By value is negative for decreasing sequences
> seq(from=10,to=2,by=-1)
[1] 10  9  8  7  6  5  4  3  2

Instead of a by parameter, you can supply a length.out parameter giving the number of values you need. R then returns that many evenly spaced values from the starting value to the ending value, inclusive.

> seq(from=3,to=20,length.out=25)
 [1]  3.000000  3.708333  4.416667  5.125000  5.833333  6.541667  7.250000  7.958333
 [9]  8.666667  9.375000 10.083333 10.791667 11.500000 12.208333 12.916667 13.625000
[17] 14.333333 15.041667 15.750000 16.458333 17.166667 17.875000 18.583333 19.291667
[25] 20.000000
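The spacing in the output above follows from simple arithmetic: with 25 values there are 24 gaps, so each step is (20 - 3) / 24. The sketch below checks this; the names from, to, n and step are illustrative variables, not arguments required by seq().

```r
from <- 3
to   <- 20
n    <- 25

# n values leave n - 1 gaps between the start and the end.
step <- (to - from) / (n - 1)
step   # 0.7083333

# Rebuilding the sequence by hand matches seq() up to floating-point tolerance.
manual <- from + step * (0:(n - 1))
all.equal(manual, seq(from = from, to = to, length.out = n))   # TRUE
```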

Vectors can be repeated using the rep function in R. The usage of rep is illustrated below.

> rep(x=1,times=5)
[1] 1 1 1 1 1

The x can be replaced by a vector to obtain a repeating vector as follows.

> rep(x=bigvec, times=3)
 [1]  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25  3.10 45.00  1.00  2.00 80.00
[14]  3.00 40.00  2.25  3.10 45.00  1.00  2.00 80.00  3.00 40.00  2.25
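The times argument repeats the whole vector. rep() also accepts an each argument, which repeats every element in turn before moving to the next. A short comparison, using a small illustrative vector:

```r
# times repeats the vector as a whole.
by_times <- rep(x = c(1, 2), times = 2)
by_times   # 1 2 1 2

# each repeats every element before moving on to the next one.
by_each <- rep(x = c(1, 2), each = 2)
by_each    # 1 1 2 2
```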

Sorting vectors

Vectors can be sorted in ascending or descending order using the sort() function in the following manner.

#Sorts in ascending order by default.
#decreasing=FALSE is an optional parameter.
> sort(bigvec, decreasing = FALSE)
[1]  1.00  2.00  2.25  3.00  3.10 40.00 45.00 80.00

#Sort in descending order
> sort(bigvec, decreasing=TRUE)
[1] 80.00 45.00 40.00  3.10  3.00  2.25  2.00  1.00
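Note that sort() returns a new vector and leaves its input unchanged, so the result has to be assigned if you want to keep it. The sketch below shows this, along with the related base R function order(), which returns the indices that would sort the vector rather than the sorted values. The name sorted_vec is illustrative.

```r
bigvec <- c(3.1, 45, 1, 2, 80, 3, 40, 2.25)  # same values as bigvec above

# Assign the result; bigvec itself is not modified.
sorted_vec <- sort(bigvec)
bigvec[1]       # 3.1, the original order is intact
sorted_vec[1]   # 1

# order() gives the positions of the elements in ascending order.
order(bigvec)   # 3 4 8 6 1 7 2 5
```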

Vector arithmetic

Vector arithmetic was covered earlier in the discussion of operators in R. One important point about vector arithmetic is recycling. When the two operands of a vector operation have different lengths, R repeats the values of the shorter vector until it matches the length of the longer one. If the longer length is not a whole multiple of the shorter length, R still computes the result but issues a warning.

> a <-c(0,1)
> b <-c(1,2,3,4,5)

#a is recycled to (0,1,0,1,0) and then added to b; a itself is not changed.
#Because 5 is not a multiple of 2, R also prints a warning after the result.
> a+b
[1] 1 3 3 5 5
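For comparison, here is a sketch of recycling where the lengths divide evenly, which produces no warning. The variable names short_vec and long_vec are illustrative.

```r
short_vec <- c(0, 1)
long_vec  <- c(1, 2, 3, 4)

# short_vec is recycled to (0, 1, 0, 1); 4 is a multiple of 2, so no warning.
even_sum <- short_vec + long_vec
even_sum   # 1 3 3 5

# A single number is a length-one vector, so it is recycled across every element.
doubled <- long_vec * 2
doubled    # 2 4 6 8
```

The length-one case is the most common use of recycling: it is what makes expressions such as multiplying a whole vector by a constant work without a loop.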
