I have a dataset with two columns, year and temperature.

Year     Temperature
1869     51.4
1870     53.6
1871     51.1
1872     51.0
1873     51.0
1874     51.3
1875     49.4
1876     51.9
1877     52.8
1878     53.6
1879     52.3
1880     53.2
1881     52.4
1882     52.0
1883     50.5
1884     52.4
1885     51.1
1886     51.0
1887     50.9
1887     49.3

I need to write an R script that calculates the mean percent change for every year, then plots and prints the output along with the original data.

For example, the mean % change for 1873 = ((mean of 1874 to 1878 - mean of 1869 to 1873) / mean of 1874 to 1878) * 100. I need to repeat this for 1874 to 1884, write the output to a CSV file, and plot it as a time series along with the original data.

I am not sure where to begin, so any ideas or suggestions are welcome.

MrFlick
Christiana

3 Answers

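Using library(zoo), we can do:

library(zoo)

# x is the data frame listed under "The data" below
l = rollmean(head(x, -5)[,2], 5)  # 5-year means over the first 15 rows
r = rollmean(tail(x, -5)[,2], 5)  # 5-year means over the last 15 rows
percent_change = 100 * (r-l)/r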
# [1]  0.3474903  0.7692308  3.7907506  3.6700719  2.6944972  0.5376344  0.1919386 -2.0897833 -2.8404669
#[10] -2.9699101 -2.2379270

The data:

x = structure(list(Year = c(1869, 1870, 1871, 1872, 1873, 1874, 1875, 
1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 
1887, 1887), Temp = c(51.4, 53.6, 51.1, 51, 51, 51.3, 49.4, 51.9, 
52.8, 53.6, 52.3, 53.2, 52.4, 52, 50.5, 52.4, 51.1, 51, 50.9, 
49.3)), class = "data.frame", row.names = c(NA, -20L))
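
To line these 11 values up with the 20 rows of x, one option is to let rollmean() pad with NA. This is only a sketch: the names back, fwd and pct_change are made up for illustration, and the data frame is rebuilt here so the block runs on its own.

```r
library(zoo)

# Same data as above, rebuilt so this block is self-contained
# (the last row repeats 1887, exactly as posted)
x <- data.frame(
  Year = c(1869:1887, 1887),
  Temp = c(51.4, 53.6, 51.1, 51.0, 51.0, 51.3, 49.4, 51.9, 52.8, 53.6,
           52.3, 53.2, 52.4, 52.0, 50.5, 52.4, 51.1, 51.0, 50.9, 49.3)
)

# Trailing mean: row i holds the mean of rows i-4 .. i
back <- rollmean(x$Temp, 5, fill = NA, align = "right")

# Leading mean of rows i .. i+4, then shifted up by one row
# so that row i holds the mean of rows i+1 .. i+5
fwd <- rollmean(x$Temp, 5, fill = NA, align = "left")
fwd <- c(fwd[-1], NA)

# Percent change; rows without two full windows stay NA
x$pct_change <- 100 * (fwd - back) / fwd
x
```

Rows 5 to 15 should then carry the same values as percent_change, with NA everywhere else.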
dww

To start, you will need to compute the mean temperature in every 5-year period. You can do this with embed() and rowMeans():

x <- c(51.4, 53.6, 51.1, 51.0, 51.0,
       51.3, 49.4, 51.9, 52.8, 53.6,
       52.3, 53.2, 52.4, 52.0, 50.5,
       52.4, 51.1, 51.0, 50.9, 49.3)

M <- embed(x, 5)  # each row holds one 5-year window, most recent year first
M
#      [,1] [,2] [,3] [,4] [,5]
# [1,] 51.0 51.0 51.1 53.6 51.4
# [2,] 51.3 51.0 51.0 51.1 53.6
# [3,] 49.4 51.3 51.0 51.0 51.1
# [4,] 51.9 49.4 51.3 51.0 51.0
# [5,] 52.8 51.9 49.4 51.3 51.0
# [6,] 53.6 52.8 51.9 49.4 51.3
# [7,] 52.3 53.6 52.8 51.9 49.4
# [8,] 53.2 52.3 53.6 52.8 51.9
# [9,] 52.4 53.2 52.3 53.6 52.8
#[10,] 52.0 52.4 53.2 52.3 53.6
#[11,] 50.5 52.0 52.4 53.2 52.3
#[12,] 52.4 50.5 52.0 52.4 53.2
#[13,] 51.1 52.4 50.5 52.0 52.4
#[14,] 51.0 51.1 52.4 50.5 52.0
#[15,] 50.9 51.0 51.1 52.4 50.5
#[16,] 49.3 50.9 51.0 51.1 52.4

means <- rowMeans(M)
means
# [1] 51.62 51.60 50.76 50.92 51.28 51.80 52.00 52.76 52.86 52.70 52.08 52.10
#[13] 51.68 51.40 51.18 50.94

These are the 16 means for 1869-1873 up to 1884-1888. You would compute the percent change for each year from 1873 to 1883 like so:

changes <- 100 * (means[6:16] - means[1:11]) / means[6:16]
changes
# [1]  0.3474903  0.7692308  3.7907506  3.6700719  2.6944972  0.5376344
# [7]  0.1919386 -2.0897833 -2.8404669 -2.9699101 -2.2379270

Before adding changes (11 elements) to your data frame (20 rows), you'll want to pad it with NA so that the years line up:

changes <- c(rep(NA, 4), changes, rep(NA, 5))

You can run ?plot and ?write.csv to get help with plotting and saving data frames as .csv.
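
Putting the steps together, a minimal base-R sketch might look like the following. The file names temperature_change.csv and temperature_change.pdf, the column name PctChange, and the two-panel layout are my own choices for illustration, not something the question specifies.

```r
temp <- c(51.4, 53.6, 51.1, 51.0, 51.0, 51.3, 49.4, 51.9, 52.8, 53.6,
          52.3, 53.2, 52.4, 52.0, 50.5, 52.4, 51.1, 51.0, 50.9, 49.3)
# Years exactly as posted in the question (the last row repeats 1887)
year <- c(1869:1887, 1887)

# 16 five-year means, then the 11 percent changes
means   <- rowMeans(embed(temp, 5))
changes <- 100 * (means[6:16] - means[1:11]) / means[6:16]

# Pad with NA so that the changes line up with rows 5 to 15
out <- data.frame(Year        = year,
                  Temperature = temp,
                  PctChange   = c(rep(NA, 4), changes, rep(NA, 5)))

# Save the original data together with the percent change
write.csv(out, "temperature_change.csv", row.names = FALSE)

# Two stacked panels, because the two series are on very different scales
pdf("temperature_change.pdf")
op <- par(mfrow = c(2, 1))
plot(out$Year, out$Temperature, type = "l",
     xlab = "Year", ylab = "Temperature")
plot(out$Year, out$PctChange, type = "b",
     xlab = "Year", ylab = "Mean % change")
abline(h = 0, lty = 2)  # reference line at zero change
par(op)
dev.off()
```

Dropping the pdf() and dev.off() calls draws the plot in the active graphics device instead of a file.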

Mikael Jagan

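Here's your dataset with the year-over-year percentage change in temperature (each year compared with the previous one, rather than the 5-year windows described in the question). It uses data.table. I'd recommend using ggplot2 for the graph if you have experience with it.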

library(data.table)

df2 <- data.frame(Year = c(1869, 1870, 1871, 1872, 1873, 1874, 1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1887),
                  Temperature = c(51.4, 53.6, 51.1, 51.0, 51.0, 51.3, 49.4, 51.9, 52.8, 53.6, 52.3, 53.2, 52.4, 52.0, 50.5, 52.4, 51.1, 51.0, 50.9, 49.3))
df2 <- as.data.table(df2)

# Mean temperature per year (the two 1887 rows are averaged together)
mean_df <- df2[, mean_temp := mean(Temperature), by = c("Year")]
mean_df <- unique(mean_df, by = c("Year", "mean_temp"))

# Percent change relative to the previous year's mean
temperature_change <- mean_df[, temp_change := (mean_temp / shift(mean_temp, 1L, type = "lag")) * 100 - 100]
temperature_change$Temperature <- NULL

# Attach the change to the original rows
df3 <- merge(df2, temperature_change, by = c("Year", "mean_temp"), all = TRUE)
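
If you want the 5-year-window definition from the question instead, a data.table sketch could use frollmean() together with shift(). The column names back, fwd and pct_change are illustrative only, and the windows here run over rows in the order posted, so the duplicated 1887 rows are treated as two separate observations.

```r
library(data.table)

dt <- data.table(
  Year        = c(1869:1887, 1887),  # as posted, last row repeats 1887
  Temperature = c(51.4, 53.6, 51.1, 51.0, 51.0, 51.3, 49.4, 51.9, 52.8, 53.6,
                  52.3, 53.2, 52.4, 52.0, 50.5, 52.4, 51.1, 51.0, 50.9, 49.3)
)

# Trailing 5-row mean: row i holds the mean of rows i-4 .. i
dt[, back := frollmean(Temperature, 5)]

# Same mean taken 5 rows ahead: row i holds the mean of rows i+1 .. i+5
dt[, fwd := shift(back, 5, type = "lead")]

# Percent change as defined in the question; incomplete windows give NA
dt[, pct_change := 100 * (fwd - back) / fwd]
dt
```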