Time Series Plots in R

In this tutorial, we’ll go over how to create time series plots in R. Time series data consists of observations of a variable recorded at successive points in time, typically at regular intervals.

Time series data is widely used in stock market analysis, weather analysis, market trend analysis, and any other scenario where variation over time is important.

R has several packages to perform time-series plotting and analysis tasks. Let us begin by acquiring some standard time series data for our work.

Acquiring Data

Several data scientists and organizations have open-sourced time series datasets that could be directly downloaded to the R environment. Two of these sources are:

Packages hosted on CRAN can be installed into your R environment using the install.packages("packagename") command. Other relevant instructions are available on the sources' websites.

Let us proceed with some data from the tsdl package for illustrating time series plotting.
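
If the tsdl package is not yet installed, the sketch below shows one way to get it. It assumes that tsdl is distributed through the GitHub repository FinYang/tsdl rather than through CRAN, and that the remotes package is used as the installer. Neither detail is stated in this tutorial, so check the package's own page if the call fails.

```r
# Assumption: tsdl is hosted on GitHub (FinYang/tsdl), not on CRAN.
if (!requireNamespace("tsdl", quietly = TRUE)) {
  # remotes is a CRAN package that can install packages from GitHub.
  if (!requireNamespace("remotes", quietly = TRUE)) {
    install.packages("remotes")
  }
  remotes::install_github("FinYang/tsdl")
}
```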

Viewing Time Series Data

The tsdl package has numerous data series across several categories. Let us try accessing some of these sets. The first step is to load the package into memory.

library(tsdl)
tsdl
Time Series Data Library: 648 time series  

                       Frequency
Subject                 0.1 0.25   1   4   5   6  12  13  52 365 Total
  Agriculture             0    0  37   0   0   0   3   0   0   0    40
  Chemistry               0    0   8   0   0   0   0   0   0   0     8
  Computing               0    0   6   0   0   0   0   0   0   0     6
  Crime                   0    0   1   0   0   0   2   1   0   0     4
  Demography              1    0   9   2   0   0   3   0   0   2    17
  Ecology                 0    0  23   0   0   0   0   0   0   0    23
  Finance                 0    0  23   5   0   0  20   0   2   1    51
  Health                  0    0   8   0   0   0   6   0   1   0    15
  Hydrology               0    0  42   0   0   0  78   1   0   6   127
  Industry                0    0   9   0   0   0   2   0   1   0    12
  Labour market           0    0   3   4   0   0  17   0   0   0    24
  Macroeconomic           0    0  18  33   0   0   5   0   0   0    56
  Meteorology             0    0  18   0   0   0  17   0   0  12    47
  Microeconomic           0    0  27   1   0   0   7   0   1   0    36
  Miscellaneous           0    0   4   0   1   1   3   0   1   0    10
  Physics                 0    0  12   0   0   0   4   0   0   0    16
  Production              0    0   4  14   0   0  28   1   1   0    48
  Sales                   0    0  10   3   0   0  24   0   9   0    46
  Sport                   0    1   1   0   0   0   0   0   0   0     2
  Transport and tourism   0    0   1   1   0   0  12   0   0   0    14
  Tree-rings              0    0  34   0   0   0   1   0   0   0    35
  Utilities               0    0   2   1   0   0   8   0   0   0    11
  Total                   1    1 300  64   1   1 240   3  16  21   648

Let us try choosing a time-series for our plotting. We first create a subset of the above dataset using the subset function for the respective category.

crime <- subset(tsdl, "Crime")

Now, to access an individual time series, we index the object created above, which behaves like a list of time series. The second series in it records the number of monthly armed robberies in Boston from January 1966 to October 1975.

crime[[2]]
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1966  41  39  50  40  43  38  44  35  39  35  29  49
1967  50  59  63  32  39  47  53  60  57  52  70  90
1968  74  62  55  84  94  70 108 139 120  97 126 149
1969 158 124 140 109 114  77 120 133 110  92  97  78
1970  99 107 112  90  98 125 155 190 236 189 174 178
1971 136 161 171 149 184 155 276 224 213 279 268 287
1972 238 213 257 293 212 246 353 339 308 247 257 322
1973 298 273 312 249 286 279 309 401 309 328 353 354
1974 327 324 285 243 241 287 355 460 364 487 452 391
1975 500 451 375 372 302 316 398 394 431 431 

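We now create a time series object from this data using the ts() function. Called without start or frequency arguments, ts() numbers the observations from 1 with a frequency of 1, so the calendar labels shown above are dropped.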

series <- ts(crime[[2]])
series
Time Series:
Start = 1 
End = 118 
Frequency = 1 
  [1]  41  39  50  40  43  38  44  35  39  35  29  49  50  59  63  32  39  47  53  60
 [21]  57  52  70  90  74  62  55  84  94  70 108 139 120  97 126 149 158 124 140 109
 [41] 114  77 120 133 110  92  97  78  99 107 112  90  98 125 155 190 236 189 174 178
 [61] 136 161 171 149 184 155 276 224 213 279 268 287 238 213 257 293 212 246 353 339
 [81] 308 247 257 322 298 273 312 249 286 279 309 401 309 328 353 354 327 324 285 243
[101] 241 287 355 460 364 487 452 391 500 451 375 372 302 316 398 394 431 431
attr(,"source")
[1] McCleary & Hay (1980)
attr(,"description")
[1] Monthly Boston armed robberies Jan.1966-Oct.1975 Deutsch and Alt (1977)
attr(,"subject")
[1] Crime

The ts() function converts a numeric vector into a time series object. The syntax is as follows:

ts(vector, start, end, frequency)
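
As an illustration of this syntax, the sketch below builds a monthly series from a plain numeric vector and inspects its time attributes. The values are the first 24 observations of the robbery series printed earlier; the object name robberies is chosen here for illustration only.

```r
# First 24 monthly observations of the robbery series (1966 and 1967).
values <- c(41, 39, 50, 40, 43, 38, 44, 35, 39, 35, 29, 49,
            50, 59, 63, 32, 39, 47, 53, 60, 57, 52, 70, 90)

# start = c(year, period within the year); frequency = 12 means monthly.
robberies <- ts(values, start = c(1966, 1), frequency = 12)

start(robberies)      # first time point as c(year, month)
end(robberies)        # last time point, derived from the data length
frequency(robberies)  # observations per year
print(robberies)      # printed as a year-by-month table
```

Leaving out end lets ts() work out the last time point from the length of the vector, which avoids mismatches between the data and the labels.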

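You can also work with only part of a time series by selecting start and end points. Note that passing start and end to ts() only assigns time labels to the leading observations; to extract the observations that fall within a date range, use the window() function.

We can retrieve only the robbery data from January 1970 to December 1972 using the following command:

> shortseries <- window(crime[[2]], start = c(1970, 1), end = c(1972, 12))
> shortseries
     Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
1970  99 107 112  90  98 125 155 190 236 189 174 178
1971 136 161 171 149 184 155 276 224 213 279 268 287
1972 238 213 257 293 212 246 353 339 308 247 257 322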
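
A minimal sketch, using synthetic values, of how two ways of selecting part of a series differ: window() extracts the observations whose dates fall inside a range, whereas ts() with start and end takes the leading observations and gives them new time labels. The object names are illustrative.

```r
# Synthetic monthly series: the values 1..36 cover Jan 1966 to Dec 1968.
x <- ts(1:36, start = c(1966, 1), frequency = 12)

# window() keeps the observations whose dates fall inside the range.
extracted <- window(x, start = c(1967, 1), end = c(1967, 12))

# ts() with start/end takes the leading observations and relabels them.
relabeled <- ts(x, start = c(1967, 1), end = c(1967, 12), frequency = 12)

as.numeric(extracted)  # 13 to 24: the values recorded during 1967
as.numeric(relabeled)  # 1 to 12: the 1966 values, now labelled as 1967
```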

The frequency argument specifies the number of observations per unit of time, which here is one year: 1 indicates annual data, 4 indicates quarterly data, and so on. The default is 1, and ts() never aggregates or averages the data; it simply treats each value as one observation per time unit.

For monthly data such as ours, we need to specify 12 as the frequency (twelve observations per year).
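
A small sketch with synthetic values showing what frequency does and does not do: ts() stores every observation unchanged whatever the frequency, and summaries such as yearly means have to be requested explicitly with aggregate(). The object names are illustrative.

```r
# Two years of synthetic monthly data.
monthly <- ts(1:24, start = c(1966, 1), frequency = 12)
length(monthly)   # 24: every observation is kept

# Yearly means must be requested explicitly; nfrequency = 1 means
# one aggregated value per year.
yearly <- aggregate(monthly, nfrequency = 1, FUN = mean)
yearly            # 6.5 for 1966 and 18.5 for 1967
```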

Creating Time Series Plots in R

R provides the plot.ts() function to plot time series graphs. Let us re-examine our series data.

series <- ts(crime[[2]])
plot.ts(series)

Since this series was created without a start date or frequency, the x-axis of the plot shows the observation number instead of the year.

[Figure: a plain time-series graph with no years]

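We are now going to redefine the series object with a start date, an end date, and the frequency set to 12. The data ends in October 1975, so the end point is c(1975, 10); a later end point would make ts() recycle values from the start of the series to fill the gap.

series <- ts(crime[[2]], start = c(1966, 1), end = c(1975, 10), frequency = 12)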
plot.ts(series)
[Figure: time series plot with years on the x-axis]
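
plot.ts() accepts the usual base graphics arguments, so the plot can be given a title and axis labels. The sketch below uses the first three years of the robbery data so that it runs without the tsdl package; the label text and colour are illustrative choices.

```r
# First 36 monthly observations of the robbery series (1966 to 1968).
values <- c(41, 39, 50, 40, 43, 38, 44, 35, 39, 35, 29, 49,
            50, 59, 63, 32, 39, 47, 53, 60, 57, 52, 70, 90,
            74, 62, 55, 84, 94, 70, 108, 139, 120, 97, 126, 149)
robberies <- ts(values, start = c(1966, 1), frequency = 12)

# main, xlab, ylab and col are standard graphical parameters.
plot.ts(robberies,
        main = "Monthly armed robberies in Boston",
        xlab = "Year",
        ylab = "Number of robberies",
        col  = "steelblue")
```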

Decomposing Time Series

It is possible to analyze the time series further by using decomposition, which splits the series into three components. Each component can be plotted separately alongside the observed series:

  • Seasonal: How patterns repeat over certain intervals of time
  • Trend: The general direction of the time series progress – whether rising or falling.
  • Random: The inherent irregularity present in the data when the trend and seasonality are removed.

This information can be derived from a series using the decompose() function as follows.

decseries <- decompose(series)

The result is a list containing the above components of the series. These can be plotted directly using the plot() function.

plot(decseries)
[Figure: decomposed time series]

From the graph, it can be observed that the robberies follow a seasonal pattern and that the trend is generally rising.
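
The individual components can also be inspected directly. The sketch below rebuilds the robbery series from the values printed earlier so that it runs without the tsdl package, then checks that the components of the default additive model add back up to the observed data. The object names are illustrative.

```r
# The 118 monthly observations (Jan 1966 to Oct 1975) printed earlier.
values <- c( 41,  39,  50,  40,  43,  38,  44,  35,  39,  35,  29,  49,
             50,  59,  63,  32,  39,  47,  53,  60,  57,  52,  70,  90,
             74,  62,  55,  84,  94,  70, 108, 139, 120,  97, 126, 149,
            158, 124, 140, 109, 114,  77, 120, 133, 110,  92,  97,  78,
             99, 107, 112,  90,  98, 125, 155, 190, 236, 189, 174, 178,
            136, 161, 171, 149, 184, 155, 276, 224, 213, 279, 268, 287,
            238, 213, 257, 293, 212, 246, 353, 339, 308, 247, 257, 322,
            298, 273, 312, 249, 286, 279, 309, 401, 309, 328, 353, 354,
            327, 324, 285, 243, 241, 287, 355, 460, 364, 487, 452, 391,
            500, 451, 375, 372, 302, 316, 398, 394, 431, 431)
robberies <- ts(values, start = c(1966, 1), frequency = 12)

dec <- decompose(robberies)  # additive model by default

dec$figure   # the 12 estimated seasonal effects, one per period of the year
dec$trend    # centred moving average; NA at both ends of the series
dec$random   # what remains after removing trend and seasonality

# In the additive model the three components sum back to the observed
# data wherever the trend is defined.
ok <- !is.na(dec$trend)
all.equal(as.numeric(dec$trend + dec$seasonal + dec$random)[ok],
          as.numeric(robberies)[ok])
```

decompose() also accepts type = "multiplicative", which may suit series whose seasonal swings grow with the level of the series.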

Time series plots are an important means of data analysis for sequential and time-varying data. R functions like those covered above make these tasks easier.
