How to perform data exploration on 1000 unique time series?

Question

This is my first time working with time series, so please bear with me. My dataset consists of a product id column with 1000 different products, a date column, and a sales column. Since the first step is data exploration (a time series is decomposed into three parts: trend, seasonality, and random), how do I explore the data when I have no additional information, just the product ids and their sales for the past 3 years?

Based on this data I need to build a set of models that forecast sales 4 months into the future. Any help would be appreciated.

As seen in the plot, I tried to understand the distribution of each time series, but there are 1000 plots and it is very hard to comprehend the data. This is the challenge I am facing. I want to split the sales data for each item into its trend, seasonality, and random parts.

I have the basic code, but I am not sure how to apply it to multiple items to identify the trend, seasonality, and random component for each item.

I have no additional information such as product category or sales region; I only have product id, date, and sales.

We do not write code or develop algorithms for people from scratch, we only help with concrete issues in existing code. Please show what you've tried and where exactly you're experiencing issues. — ForceBru, Oct 28 '18 at 16:52
@ForceBru Thanks for your reply. I have updated my question. Basically, I am trying to perform EDA on time series and it is impractical to do that for each individual series. Without EDA we cannot understand the trend, seasonality, etc. of the data, which is crucial for applying a forecasting algorithm. — AVR, Oct 28 '18 at 17:13

score 0 · Answer 1 · answered Oct 28 '18 at 17:47

I'm not sure what else you've tried so far, so I apologize if I'm unable to give you new, usable information.

Prophet itself is already a forecasting model, so I'm assuming that rather than building a machine learning model, you're working on generating the forecast data for your EDA.

What you need to do at this stage is first determine what you want out of your data. If you really want to forecast the performance of every single one of the products in the dataset, then you would have to find a way to work around the impracticality of forecasting for each of your 1000 products individually. However, even if you did manage to do so, the results might not be particularly meaningful without knowing the product categories or any similarities between the products.
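One way to work around plotting 1000 series is to reduce each series to a couple of numbers and look at those instead. The sketch below is an illustration only, not what Prophet does internally. It assumes monthly data with no gaps and a yearly cycle (`period=12`), runs a classical additive decomposition per product using only the standard library, and reports a trend strength and a seasonal strength between 0 and 1, using one common definition: 1 - Var(residual) / Var(component + residual). The function names and the demo data are made up for this example.

```python
import math
from collections import defaultdict
from statistics import mean, pvariance


def centered_moving_average(values, period):
    """Centered moving average; None where the window does not fit."""
    n, half = len(values), period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = values[i - half:i + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Even period: give the two end points half weight (2 x period MA).
            total -= 0.5 * (window[0] + window[-1])
        trend[i] = total / period
    return trend


def decompose(values, period=12):
    """Classical additive split: values = trend + seasonal + residual."""
    if len(values) < 2 * period:
        raise ValueError("need at least two full cycles of data")
    trend = centered_moving_average(values, period)
    # Average the detrended values at each position in the cycle.
    by_pos = defaultdict(list)
    for i, (v, t) in enumerate(zip(values, trend)):
        if t is not None:
            by_pos[i % period].append(v - t)
    pos_means = [mean(by_pos[p]) for p in range(period)]
    offset = mean(pos_means)  # centre the seasonal component on zero
    seasonal = [pos_means[i % period] - offset for i in range(len(values))]
    residual = [None if t is None else v - t - s
                for v, t, s in zip(values, trend, seasonal)]
    return trend, seasonal, residual


def strength(component, residual):
    """1 - Var(residual) / Var(component + residual), clipped at 0."""
    pairs = [(c, r) for c, r in zip(component, residual)
             if c is not None and r is not None]
    total_var = pvariance([c + r for c, r in pairs])
    if total_var == 0:
        return 0.0
    return max(0.0, 1.0 - pvariance([r for _, r in pairs]) / total_var)


def summarise(rows, period=12):
    """rows: (product_id, date, sales) tuples, one per product per month."""
    series = defaultdict(list)
    for product_id, _date, sales in sorted(rows, key=lambda r: (r[0], r[1])):
        series[product_id].append(float(sales))
    summary = {}
    for product_id, values in series.items():
        trend, seasonal, residual = decompose(values, period)
        summary[product_id] = {
            "trend_strength": strength(trend, residual),
            "seasonal_strength": strength(seasonal, residual),
        }
    return summary


def demo_rows():
    """Made-up data: product A is mostly trend, product B mostly seasonal."""
    rows = []
    for t in range(36):
        date = f"{2016 + t // 12}-{t % 12 + 1:02d}-01"  # ISO dates sort as text
        noise = ((t * 7) % 5 - 2) * 0.5  # small deterministic wiggle
        rows.append(("A", date, 100 + 2 * t + noise))
        rows.append(("B", date, 100 + 20 * math.sin(2 * math.pi * t / 12) + noise))
    return rows


summary = summarise(demo_rows())
for product_id, scores in summary.items():
    print(product_id, scores)
```

Sorting the products by these two scores, or plotting a histogram of them, is far more manageable than reading 1000 line plots. In practice a library routine such as `seasonal_decompose` from statsmodels would replace the hand-written decomposition.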

What I would suggest is to figure out what questions you want your EDA to answer and work with the data from there. Perhaps you could select the lowest-performing products and forecast their future performance. Based on that, you could suggest whether there's any point in continuing to sell those products.
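As a rough sketch of the "lowest-performing products" idea, the snippet below ranks products by total sales and keeps the bottom few. It assumes the data is available as `(product_id, date, sales)` tuples and that "performance" simply means total sales over the whole period; the function name `lowest_performers` and the sample rows are hypothetical.

```python
from collections import defaultdict


def lowest_performers(rows, n=10):
    """Return the n (product_id, total_sales) pairs with the smallest totals."""
    totals = defaultdict(float)
    for product_id, _date, sales in rows:
        totals[product_id] += float(sales)
    # Sort ascending by total sales so the weakest products come first.
    ranked = sorted(totals.items(), key=lambda item: item[1])
    return ranked[:n]


# Made-up sample data for illustration only.
rows = [
    ("P1", "2018-01-01", 120), ("P1", "2018-02-01", 130),
    ("P2", "2018-01-01", 15), ("P2", "2018-02-01", 10),
    ("P3", "2018-01-01", 60), ("P3", "2018-02-01", 55),
]
print(lowest_performers(rows, n=2))
```

The returned ids can then be forecast one at a time, which keeps the number of series to inspect small.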

If you're looking for information on how to use forecasting models, here are a few resources you could check out:

Predicting the future with Facebook Prophet

An Introduction to Time Series Forecasting with Prophet Package in Exploratory Data Analysis

And of course, Prophet's own documentation

Thanks for your reply. I am tasked with building a forecasting system that, given a product id, generates forecast sales for the next 6 months. I am too confused to even take the first step. I want to perform EDA to at least understand the data, but again it is not easy to perform EDA on 1000 series. I was thinking I could group the data into a few clusters and create a class label (basically feature engineering). To do this, I want to know what set of characteristics I should use to classify my 1000 time series. — AVR, Oct 28 '18 at 17:53
Well, with Facebook Prophet, the model building is already done for you. I'd imagine what you could do on top of that is write code that generates the forecast for a given product id rather than forecasting all 1000 products. The thing about feature engineering is that you need some domain knowledge. With how little information you have on the product types, I'd imagine it would be difficult to produce features of consequence. Perhaps a possible starting point could be generating quarterly sales performance for each year? — evantkchong, Oct 28 '18 at 19:14
Thanks for your reply. Would clustering help at all? Affinity propagation, DTW, etc. — AVR, Oct 29 '18 at 01:07
