I have 3.25 years of time-based data and I'm using scikit-learn's RandomForestClassifier to try to classify live data as it comes in. My dataset has roughly 75,000 rows and 1,100 columns. My train/test split is chronological: the first 3 years (66,000 rows) for training, and the last 0.25 years (3 months, or 9,000 rows) for testing.
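Roughly, the setup looks like this (a minimal sketch: the placeholder arrays and the n_estimators value are stand-ins, and rows are assumed to already be in chronological order):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(75_000, 1_100))   # placeholder for the real feature matrix
y = rng.integers(0, 2, size=75_000)    # placeholder for the real labels

# Chronological split: first 3 years to train, last 3 months to test.
X_train, y_train = X[:66_000], y[:66_000]
X_test, y_test = X[66_000:], y[66_000:]

clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
clf.fit(X_train, y_train)
```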
Because there's variability each time you train, I don't always see good precision when classifying the test data...but sometimes I do. So what I've been doing is re-training the classifier over and over until I do see good precision on the test data, then saving that version to disk for use in live classification as new data comes in.
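In code, the loop is something like this (a sketch: the 0.9 threshold and the cap of 100 attempts are arbitrary stand-ins for "good precision", and it reuses X_train/X_test/y_train/y_test from the split above):

```python
from joblib import dump
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

for attempt in range(100):                     # arbitrary cap on retries
    clf = RandomForestClassifier(n_estimators=100, n_jobs=-1)
    clf.fit(X_train, y_train)                  # each fit draws different bootstraps/features
    prec = precision_score(y_test, clf.predict(X_test))
    if prec >= 0.9:                            # stand-in for "good precision"
        dump(clf, "model.joblib")              # save the lucky model for live use
        break
```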
Some may say this is overfitting the model to the test data...which is likely true, but my reasoning is that, due to randomness in training, finding a good fit on the first iteration versus the 100th makes no difference: the iteration on which a good fit occurs is purely down to chance. Hence my determination to keep re-training until I find one.
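The run-to-run variability I'm talking about shows up just by varying the seed (sketch, again reusing the arrays from above; the seeds and hyperparameters are arbitrary):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score

# Same data, same hyperparameters: only the seed changes between runs.
for seed in range(5):
    clf = RandomForestClassifier(n_estimators=100, random_state=seed, n_jobs=-1)
    clf.fit(X_train, y_train)
    print(f"seed={seed} precision={precision_score(y_test, clf.predict(X_test)):.3f}")
```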
What I've seen is that I can find a fit with good, stable precision across the entire 3-month test period, but when I then use that model to classify live data as it comes in during the 4th month, it's not stable and the precision is drastically worse.
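Month-by-month precision on the test set can be checked with something like this (sketch; it assumes the 9,000 test rows split evenly into ~3,000 rows per month and reuses clf from above):

```python
from sklearn.metrics import precision_score

# Rough month-by-month precision on the 3-month test set,
# assuming ~3,000 rows per month (9,000 rows / 3 months).
rows_per_month = 3_000
for month in range(3):
    sl = slice(month * rows_per_month, (month + 1) * rows_per_month)
    prec = precision_score(y_test[sl], clf.predict(X_test[sl]))
    print(f"test month {month + 1}: precision={prec:.3f}")
```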
Question 1: how could a model have great/stable precision for 3 months straight but then flounder in the 4th month?
Question 2: how can I change or augment my setup or process to achieve stable classification precision on live data?