I have some fish data that I would like to use to make predictions into the future. I would like to predict 'fishcount' based on two variables (airtemp_f and watertemp_f). Ideally, I would like to use the R package forecast to predict fishcount 2 or 3 periods ahead, but I don't know how to include airtemp_f and watertemp_f in the model. Below is a small dataset:
library(forecast)
library(ggfortify)
library(ggplot2)
library(xts)
fish <- structure(list(year = c(2011, 2011, 2011, 2011, 2011, 2011, 2011,
2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012,
2012, 2012, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011, 2011,
2011, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012, 2012
), period = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6,
7, 8, 9, 10), district = c("221", "221", "221", "221", "221",
"221", "221", "221", "221", "221", "221", "221", "221", "221",
"221", "221", "221", "221", "221", "221", "222", "222", "222",
"222", "222", "222", "222", "222", "222", "222", "222", "222",
"222", "222", "222", "222", "222", "222", "222", "222"), date = structure(c(15158,
15160, 15162, 15164, 15165, 15167, 15168, 15169, 15172, 15174,
15512, 15519, 15525, 15529, 15531, 15533, 15535, 15536, 15537,
15538, 15187, 15190, 15192, 15194, 15197, 15199, 15201, 15203,
15205, 15207, 15903, 15905, 15908, 15911, 15914, 15916, 15918,
15919, 15920, 15921), class = "Date"), fishcount = c(2101, 16031,
13498, 7024, 42569, 36288, 101565, 204305, 235376, 39851, 14879,
24512, 97382, 109688, 164938, 182892, 115047, 203842, 247499,
33879, 89551, 25576, 61377, 4517, 0, 11739, 22318, 69831, 2895,
16720, 349586, 136904, 365634, 369484, 1054650, 1009362, 1080558,
671706, 631603, 1007896), airtemp_f = c(54.95, 56.15, 54.1325,
53.3975, 54.1775, 53.945, 54.62, 54.0773913043478, 56.63, 54.7625,
50.8025, 49.6625, 49.8575, 49.3775, 49.55, 49.49, 50.0525, 49.775,
49.6775, 50.795, 57.8675, 53.9225, 53.5475, 51.905, 58.8875,
55.0475, 54.185, 56.24, 53.915, 54.1325, 56.8154545454545, 58.6021052631579,
60.5381818181818, 58.084347826087, 57.6930434782609, 56.9808695652174,
59.3109090909091, 57.8136363636364, 174.548, 56.1623529411765
), watertemp_f = c(56.735, 57.2225, 56.4125, 55.5275, 54.6575,
54.7625, 54.4475, 53.7095652173913, 55.6925, 53.09, 50, 51.635,
52.61, 51.0425, 51.095, 50.63, 50.825, 51.065, 50.8625, 52.25,
59.7425, 55.9325, 55.67, 54.6575, 55.2575, 54.8375, 55.7525,
56.78, 55.985, 55.595, 59.09, 59.4263157894737, 59.7690909090909,
58.7417391304348, 59.7513043478261, 60.424347826087, 61.2172727272727,
59.9163636363636, 58.676, 58.2588235294118)), row.names = c(NA,
-40L), class = c("tbl_df", "tbl", "data.frame"))
head(fish)
   year period district date       fishcount airtemp_f watertemp_f
  <dbl>  <dbl> <chr>    <date>         <dbl>     <dbl>       <dbl>
1  2011      1 221      2011-07-03      2101      55.0        56.7
2  2011      2 221      2011-07-05     16031      56.2        57.2
3  2011      3 221      2011-07-07     13498      54.1        56.4
4  2011      4 221      2011-07-09      7024      53.4        55.5
5  2011      5 221      2011-07-10     42569      54.2        54.7
6  2011      6 221      2011-07-12     36288      53.9        54.8
This is my attempt:
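# Convert fishcount to a time series (xts or ts?)
# Note: the index is an evenly spaced placeholder, not the actual fish$date values
count <- as.xts(fish$fishcount, order.by = seq(as.Date("2011-07-03"), by = 2, len = 40))
# Fit a univariate ARIMA; the temperature variables are not used yet
d.arima <- auto.arima(count)
# Forecast 3 steps ahead with a 95% prediction interval
d.forecast <- forecast(d.arima, level = c(95), h = 3)
d.forecast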
Question: How do I include airtemp_f and watertemp_f in the model to predict by period, and how do I plot the forecast with ggplot?
Thanks in advance for any help.
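One possible direction, sketched here under stated assumptions rather than as a definitive answer: auto.arima() accepts a matrix of external regressors through its xreg argument (regression with ARIMA errors), and forecast() then needs future values of those regressors. The sketch below uses made-up toy data standing in for a single district; the toy values, the object names (xreg, future_xreg, fit, fc, p), and the choice to fill future temperatures with their historical means are all assumptions for illustration, not something taken from the fish data.

```r
library(forecast)
library(ggplot2)

# Toy data standing in for one district (values are invented for illustration)
set.seed(1)
n <- 40
airtemp_f   <- 55 + rnorm(n)
watertemp_f <- 56 + rnorm(n)
fishcount   <- 50000 + 8000 * (watertemp_f - 56) + rnorm(n, sd = 5000)

# Regressor matrix: one named column per covariate
xreg <- cbind(airtemp_f = airtemp_f, watertemp_f = watertemp_f)

# Regression on the temperatures with ARIMA errors
fit <- auto.arima(ts(fishcount), xreg = xreg)

# Forecasting needs future regressor values; here they are ASSUMED to equal
# the historical means. The number of rows sets the forecast horizon.
h <- 3
future_xreg <- cbind(
  airtemp_f   = rep(mean(airtemp_f), h),
  watertemp_f = rep(mean(watertemp_f), h)
)

fc <- forecast(fit, xreg = future_xreg, level = 95)

# autoplot() on a forecast object returns a ggplot that can be extended
p <- autoplot(fc) + labs(x = "Period", y = "fishcount")
print(p)
```

With the real data, the same pattern would presumably be applied per district (for example after filtering fish to district "221"), with fish$airtemp_f and fish$watertemp_f in place of the toy vectors. The forecast is only as good as the future temperature values supplied, so those would need to come from a separate forecast or scenario.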