I have around 4 years of bank data from different branches. I am trying to predict the number of rows (records) at the daily and hourly level. The important features come from issue_datetime (year, month, day, hour). I applied different regression techniques (linear regression, decision trees, random forest, XGBoost) using GraphLab but could not get good accuracy. I was also thinking of setting a threshold based on past data, for example by taking the mean of the daily and monthly counts after removing outliers and using that as the threshold. What is the best approach?

Has QUIT--Anony-Mousse
user1584253
  • Because `What is the best approach?` cannot be answered meaningfully. And even if someone tries to, the answer would be highly speculative. All in all off-topic for stackoverflow. – cel Jan 20 '17 at 08:18
  • I agree with @cel. That said, removing the seasonal patterns from your data before making predictions may help improve them. – Shobeir Jan 20 '17 at 08:29
  • I am also open to suggestions: is there any other way to achieve this task? – user1584253 Jan 20 '17 at 08:30
  • This question would be better suited to http://datascience.stackexchange.com/ . That being said: you are basically trying to model a behavior, so you will want to test different models. I once solved a similar problem (intensities varying over time) with Fourier analysis, and in that case it worked very well. Fourier analysis presupposes that the observed intensities are caused by stacked periodic events. – S van Balen Jan 20 '17 at 09:55
  • Get a **book** on time series prediction. We can't put all this information in an answer here. – Has QUIT--Anony-Mousse Jan 21 '17 at 08:59

1 Answer

Since you have 1-D time series data, it should be relatively easy to plot it and look for interesting patterns.
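As a starting point, the raw records have to be turned into hourly and daily count series before any pattern can be seen. The following is only a minimal sketch with pandas: the column name `issue_datetime` comes from the question, but the data here is synthetic (a made-up stand-in for the bank records), and the hour-of-day and weekday profiles are just one assumed way to summarize the patterns.

```python
import numpy as np
import pandas as pd

# ASSUMPTION: synthetic stand-in for the real records, 8 weeks of data
# with busier office hours and quieter weekends.
rng = np.random.default_rng(0)
start = pd.Timestamp("2016-01-04")  # a Monday
stamps = start + pd.to_timedelta(np.arange(24 * 7 * 8), unit="h")
rate = np.where((stamps.hour >= 9) & (stamps.hour < 17), 20.0, 2.0)
rate = rate * np.where(stamps.dayofweek < 5, 1.0, 0.3)
counts = rng.poisson(rate)

# One row per record, as in the question's data.
df = pd.DataFrame({"issue_datetime": stamps.repeat(counts)})

# Number of rows per hour and per day (empty hours become 0).
hourly = df.set_index("issue_datetime").resample("h").size()
daily = hourly.resample("D").sum()

# Average profiles that expose daily and weekly seasonality.
by_hour = hourly.groupby(hourly.index.hour).mean()
by_weekday = daily.groupby(daily.index.dayofweek).mean()

print(by_hour.round(1))
print(by_weekday.round(1))
```

With real data, `hourly.plot()` and `daily.plot()` (which need matplotlib) would show the same patterns graphically. For several branches, the same aggregation could be repeated per branch.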

Once you have established that your data has non-stationary aspects, the first class of models to check out is probably autoregressive models, possibly with seasonal terms. ARIMA models are pretty standard for time series data: http://www.seanabu.com/2016/03/22/time-series-seasonal-ARIMA-model-in-python/

Him