I have weather dataset my data is date-dependent
I want to predict the temperature from 07 May 2008 until 18 May 2008 (which is maybe a total of 10-15 observations) my data size is around 200
I will be using decision tree/RF and SVM & NN to make my prediction
I've never handled data like this so I'm not sure how to sample non random data I want to sample data 80% train data and 30% test data but I want to sample the data in the original order not randomly. Is that possible ?
install.packages("rattle")
install.packages("RGtk2")
library("rattle")
seed <- 42
set.seed(seed)
fname <- system.file("csv", "weather.csv", package = "rattle")
dataset <- read.csv(fname, encoding = "UTF-8")
dataset <- dataset[1:200,]
dataset <- dataset[order(dataset$Date),]
set.seed(321)
sample_data = sample(nrow(dataset), nrow(dataset)*.8)
test<-dataset[sample_data,] # 30%
train<-dataset[-sample_data,] # 80%
output
> head(dataset)
Date Location MinTemp MaxTemp Rainfall Evaporation Sunshine WindGustDir WindGustSpeed
1 2007-11-01 Canberra 8.0 24.3 0.0 3.4 6.3 NW 30
2 2007-11-02 Canberra 14.0 26.9 3.6 4.4 9.7 ENE 39
3 2007-11-03 Canberra 13.7 23.4 3.6 5.8 3.3 NW 85
4 2007-11-04 Canberra 13.3 15.5 39.8 7.2 9.1 NW 54
5 2007-11-05 Canberra 7.6 16.1 2.8 5.6 10.6 SSE 50
6 2007-11-06 Canberra 6.2 16.9 0.0 5.8 8.2 SE 44
WindDir9am WindDir3pm WindSpeed9am WindSpeed3pm Humidity9am Humidity3pm Pressure9am
1 SW NW 6 20 68 29 1019.7
2 E W 4 17 80 36 1012.4
3 N NNE 6 6 82 69 1009.5
4 WNW W 30 24 62 56 1005.5
5 SSE ESE 20 28 68 49 1018.3
6 SE E 20 24 70 57 1023.8
Pressure3pm Cloud9am Cloud3pm Temp9am Temp3pm RainToday RISK_MM RainTomorrow
1 1015.0 7 7 14.4 23.6 No 3.6 Yes
2 1008.4 5 3 17.5 25.7 Yes 3.6 Yes
3 1007.2 8 7 15.4 20.2 Yes 39.8 Yes
4 1007.0 2 7 13.5 14.1 Yes 2.8 Yes
5 1018.5 7 7 11.1 15.4 Yes 0.0 No
6 1021.7 7 5 10.9 14.8 No 0.2 No
> head(test)
Date Location MinTemp MaxTemp Rainfall Evaporation Sunshine WindGustDir WindGustSpeed
182 2008-04-30 Canberra -1.8 14.8 0.0 1.4 7.0 N 28
77 2008-01-16 Canberra 17.9 33.2 0.0 10.4 8.4 N 59
88 2008-01-27 Canberra 13.2 31.3 0.0 6.6 11.6 WSW 46
58 2007-12-28 Canberra 15.1 28.3 14.4 8.8 13.2 NNW 28
96 2008-02-04 Canberra 18.2 22.6 1.8 8.0 0.0 ENE 33
126 2008-03-05 Canberra 12.0 27.6 0.0 6.0 11.0 E 46
WindDir9am WindDir3pm WindSpeed9am WindSpeed3pm Humidity9am Humidity3pm Pressure9am
182 E N 2 19 80 40 1024.2
77 N NNE 15 20 58 62 1008.5
88 N WNW 4 26 71 28 1013.1
58 NNW NW 6 13 73 44 1016.8
96 SSE ENE 7 13 92 76 1014.4
126 SSE WSW 7 6 69 35 1025.5
Pressure3pm Cloud9am Cloud3pm Temp9am Temp3pm RainToday RISK_MM RainTomorrow
182 1020.5 1 7 5.3 13.9 No 0.0 No
77 1006.1 6 7 24.5 23.5 No 4.8 Yes
88 1009.5 1 4 19.7 30.7 No 0.0 No
58 1013.4 1 5 18.3 27.4 Yes 0.0 No
96 1011.5 8 8 18.5 22.1 Yes 9.0 Yes
126 1022.2 1 1 15.7 26.2 No 0.0 No
> head(train)
Date Location MinTemp MaxTemp Rainfall Evaporation Sunshine WindGustDir WindGustSpeed
7 2007-11-07 Canberra 6.1 18.2 0.2 4.2 8.4 SE 43
9 2007-11-09 Canberra 8.8 19.5 0.0 4.0 4.1 S 48
11 2007-11-11 Canberra 9.1 25.2 0.0 4.2 11.9 N 30
16 2007-11-16 Canberra 12.4 32.1 0.0 8.4 11.1 E 46
22 2007-11-22 Canberra 16.4 19.4 0.4 9.2 0.0 E 26
25 2007-11-25 Canberra 15.4 28.4 0.0 4.4 8.1 ENE 33
WindDir9am WindDir3pm WindSpeed9am WindSpeed3pm Humidity9am Humidity3pm Pressure9am
7 SE ESE 19 26 63 47 1024.6
9 E ENE 19 17 70 48 1026.1
11 SE NW 6 9 74 34 1024.4
16 SE WSW 7 9 70 22 1017.9
22 ENE E 6 11 88 72 1010.7
25 SSE NE 9 15 85 31 1022.4
Pressure3pm Cloud9am Cloud3pm Temp9am Temp3pm RainToday RISK_MM RainTomorrow
7 1022.2 4 6 12.4 17.3 No 0.0 No
9 1022.7 7 7 14.1 18.9 No 16.2 Yes
11 1021.1 1 2 14.6 24.0 No 0.2 No
16 1012.8 0 3 19.1 30.7 No 0.0 No
22 1008.9 8 8 16.5 18.3 No 25.8 Yes
25 1018.6 8 2 16.8 27.3 No 0.0 No