Questions tagged [time-series]

A time series is a sequence of data points with values measured at successive times (either in continuous time or at discrete time periods). Time series analysis exploits this natural temporal ordering to extract meaning and trends from the underlying data.

Time series data are observations recorded over time, often showing a pattern (“trend”). Quantitative forecasting can be applied when two conditions are satisfied:

  1. numerical information about the past is available;
  2. it is reasonable to assume that some aspects of the past patterns will continue into the future.

Time series data are useful when you are forecasting something that changes over time (e.g., stock prices, sales figures, or profits). Examples of time series data include:

  • Daily IBM stock prices
  • Monthly rainfall
  • Quarterly sales results for Amazon
  • Annual Google profits

https://www.otexts.org/fpp/1/4

Time series models attempt to make use of the natural one-way ordering of time, so that the value for a given period is expressed as a function of past values. The same idea underlies time series forecasting: future values are predicted from past data.
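As a rough illustration of expressing a value as a function of past values, the sketch below fits a first-order autoregressive model, y[t] = c + phi * y[t-1], by ordinary least squares and then forecasts a few steps ahead. The function names `fit_ar1` and `forecast_ar1` and the toy data are assumptions made for this example; they do not come from any particular library.

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] (hypothetical helper)."""
    x = series[:-1]  # lagged values y[t-1]
    y = series[1:]   # current values y[t]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, steps, c, phi):
    """Iterate the fitted recursion forward from the last observation."""
    last = series[-1]
    forecasts = []
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts


# Toy data generated exactly by y[t] = 2 + 0.5 * y[t-1], starting at 10.
series = [10.0]
for _ in range(9):
    series.append(2 + 0.5 * series[-1])

c, phi = fit_ar1(series)
forecasts = forecast_ar1(series, 3, c, phi)
print(c, phi, forecasts)
```

Real data are noisy, so the fit will not be exact as it is here; packages such as statsmodels (Python) or forecast/fable (R) provide fuller implementations.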

Typically, time series data points are spaced at uniform time intervals.

A time series model will generally reflect the fact that observations close together in time will be more closely related than observations further apart.
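One common way to quantify how related nearby observations are is the sample autocorrelation at a given lag. The sketch below is a minimal stdlib-only version; the helper name `autocorrelation` and the toy series are assumptions for illustration.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at `lag` (hypothetical helper)."""
    n = len(series)
    mean = sum(series) / n
    # Total variation around the mean (the lag-0 autocovariance times n).
    denom = sum((v - mean) ** 2 for v in series)
    # Co-variation between each value and the value `lag` steps earlier.
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


# A slowly changing toy series: neighbours are similar, distant points are not.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r1 = autocorrelation(values, 1)
r5 = autocorrelation(values, 5)
print(r1, r5)
```

For this series the lag-1 autocorrelation is larger than the lag-5 autocorrelation, which matches the idea that observations close together in time are more closely related.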

As a place to start, take a look at Wikipedia's page on time series. For further reading, refer to the Statsoft website which has an online textbook on time series analysis.

For time series analysis in R, consider looking at the Time Series Task View and at questions tagged for the zoo package and for the xts package.


Tag usage:

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis, or to Data Science, the Stack Exchange site for data science topics such as time series.

15192 questions
3 votes, 1 answer

How to extract hour, day of the week from categorical data?

I have data like this (it's a time series problem):

  Time                 y
  2017-01-01 00:00:00  34002
  2017-01-01 01:00:00  37947
  2017-01-01 02:00:00  41517
  2017-01-01 03:00:00  44476
  2017-01-01 04:00:00  46234

I want to extract the hour,…
Max AweTery
3 votes, 2 answers

pandas bucket timestamp into TimeGrouper frequency group

I have a data frame in pandas with a DateTime index. When grouping it with a time grouper: pd.Grouper(freq='360Min'), how can I join this result back onto the original timestamp? I.e. an equijoin timestamp=bucket will not work? Is there a…
Georg Heiler
3 votes, 0 answers

Forecasting using fable and future, time & memory issues

I'm using fable and future to try to forecast in parallel, unfortunately it seems that for each iteration in the for loop, the model() step takes more time and consumes more memory. What I am trying to do is step forward one week at a time and…
Jerry424
3 votes, 1 answer

Plot a .resample(D).size() from 2 different years into one chart?

I have some data from 2019 and 2020 starting in March until the end of May for each year. I've done this to the datetime:

  # Working with Date
  df['Date'] = pd.to_datetime(df['Date'])
  df['Time_Hour'] = df['Date'].apply(lambda x:…
3 votes, 1 answer

Cross Validation for longitudinal/panel data in scikit-learn

I have some longitudinal/panel data that takes the form below (code for data entry is below the question). Observations of X and y are indexed by time and country (e.g., USA at time 1, USA at time 2, CAN at time 1).

       time  x  y
  USA  1     5  10
  USA  2 …
3 votes, 2 answers

How to use the ccf() method in the statsmodels library?

I am having some trouble with the ccf() method in the (Python) statsmodels library. The equivalent operation works fine in R. ccf produces a cross-correlation function between two variables, A and B in my example. I am interested to understand the…
dkent
3 votes, 1 answer

How to set a daily frequency in a pandas dataframe?

This is what my dataset looks like:

  Datetime    MinDistance  AvgDiameter  RelativeV  InfinityV
  1900-01-04  0.00962      410.0        8.69       8.65
  1900-01-11  0.03989      59.5         10.65      10.65
  1900-01-29  0.02076      880.0        5.55       5.52
  1900-02-04  0.03201…
3 votes, 3 answers

How to forecast time series using AutoReg in python

I'm trying to build an old-school model using only an autoregression algorithm. I found out that there's an implementation of it in the statsmodels package. I've read the documentation, and as I understand it, it should work like ARIMA. So, here's my code: import…
Yoskutik
3 votes, 2 answers

Aggregate time series data to make a scatter plot

I want to make a time series scatter plot for my data, which has categorical columns that need to be aggregated by group to build the plotting data first; I then want to draw the scatter plot using either seaborn or matplotlib. My data is product…
kim
3 votes, 1 answer

Find the biggest drops/rises in a time series without a loop (preferably using tidy/dplyr)?

I have many time series and want to find a way to identify the top 10 greatest rises and falls for each time series. This is not as easy as it sounds because the most prominent features on a time series can sometimes be interrupted by movements in…
stevec
3 votes, 2 answers

MCMC Changepoint model in R

I want to run an MCMC linear Gaussian Multiple Changepoint model to detect changepoints for a time-series vector of continuous values. In doing so, I am thinking of using MCMCregressChange function, but I have several questions here: (1) How can I…
MasK
3 votes, 0 answers

Time series data adding a gap between dates

I have a data set that I want to plot with a line graph. I have two "sets" of dates from 2006-2008 and then from 2015-2019. Ggplot keeps adding a line between these two points from 08-15 because technically there aren't any data gaps, it just spans…
Kelsey
3 votes, 1 answer

Isolation Forest for time series data

I just wonder if Isolation Forest (iForest) can work with time-series data. As far as I know, iForest is used for anomaly detection, and it is based on randomization techniques that randomly and recursively partition the data and then save the…
Amhs_11
3 votes, 2 answers

Postgres query - 24 hour time series filtered by specific column, but still return row for each hour

I'm working on a query that returns an hourly time series for a given day, but I need to filter by a particular column on another table, which in my case is the user id. This is my current query, which returns the submission count for every hour of…
Tim Wheeler
3 votes, 2 answers

Python pandas: insert rows for missing dates, time series in groupby dataframe

I have a dataframe df:

  Serial_no  date        Index  x    y
  1          2014-01-01  1      2.0  3.0
  1          2014-03-01  2      3.0  3.0
  1          2014-04-01  3      6.0  2.0
  2          2011-03-01  1      5.1  1.3
  2…
cowboykevin05