Questions tagged [time-series]

A time series is a sequence of data points with values measured at successive times (either in continuous time or at discrete time periods). Time series analysis exploits this natural temporal ordering to extract meaning and trends from the underlying data.

Time series data often show a pattern (a “trend”) over time. Quantitative forecasting can be applied when two conditions are satisfied:

  1. numerical information about the past is available;
  2. it is reasonable to assume that some aspects of the past patterns will continue into the future.

Time series data are useful when you are forecasting something that changes over time, such as stock prices, sales figures, or profits. Examples of time series data include:

  • Daily IBM stock prices
  • Monthly rainfall
  • Quarterly sales results for Amazon
  • Annual Google profits

https://www.otexts.org/fpp/1/4

Time series models make use of the natural one-way ordering of time, so that the value for a given period is expressed as a function of past values. The same idea underlies time series forecasting: future values are predicted from past data.
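As a rough illustration of expressing a value as a function of past values, the sketch below fits a first-order autoregressive model, x[t] = c + phi * x[t-1], by ordinary least squares and uses it to forecast. The function names `fit_ar1` and `forecast` are hypothetical and chosen for this example only; real work would normally rely on an established library rather than this hand-rolled version.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] (illustrative helper)."""
    x_prev = series[:-1]   # predictors: each value except the last
    x_next = series[1:]    # targets: the value one step later
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    phi = cov / var                    # slope on the previous value
    c = mean_next - phi * mean_prev    # intercept
    return c, phi


def forecast(series, steps):
    """Iterate the fitted AR(1) recursion forward from the last observation."""
    c, phi = fit_ar1(series)
    last = series[-1]
    predictions = []
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


if __name__ == "__main__":
    # Synthetic series that follows x[t] = 2 + 0.5 * x[t-1] exactly.
    data = [1.0]
    for _ in range(10):
        data.append(2 + 0.5 * data[-1])
    print(fit_ar1(data))
    print(forecast(data, 3))
```

Each forecast feeds back into the next step, so errors compound as the horizon grows; that is one reason multi-step forecasts are usually reported with widening uncertainty intervals.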

Typically, time series data points are spaced at uniform time intervals.

A time series model will generally reflect the fact that observations close together in time will be more closely related than observations further apart.
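One common way to quantify how related observations are at different time separations is the sample autocorrelation. The following is a minimal sketch, assuming a uniformly spaced series with non-zero variance; the helper name `autocorrelation` is hypothetical and not taken from any particular library.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a uniformly spaced series at the given lag."""
    n = len(series)
    mean = sum(series) / n
    # Total squared deviation; assumed non-zero (the series is not constant).
    var = sum((x - mean) ** 2 for x in series)
    # Co-deviation between each point and the point `lag` steps later.
    cov = sum((series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag))
    return cov / var


if __name__ == "__main__":
    ramp = [float(t) for t in range(20)]   # a steadily trending series
    for k in (0, 1, 5):
        print(k, autocorrelation(ramp, k))
```

For a smoothly varying series such as this one, the value at lag 1 is larger than the value at lag 5, which matches the statement that nearby observations tend to be more closely related.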

As a place to start, take a look at Wikipedia's page on time series. For further reading, refer to the StatSoft website, which has an online textbook on time series analysis.

For time series analysis in R, consider looking at the Time Series Task View and at questions tagged for the zoo package and for the xts package.


Tag usage:

Questions on this tag should be about implementation and programming problems, not about the statistical or theoretical properties of the technique. Consider whether your question might be better suited to Cross Validated, the Stack Exchange site for statistics, machine learning, and data analysis, or to Data Science, the Stack Exchange site for data science topics such as time series.

15192 questions
16
votes
4 answers

Converting irregularly time stamped measurements into equally spaced, time-weighted averages

I have series of measurements which are time stamped and irregularly spaced. Values in these series always represent changes of the measurement -- i.e. without a change no new value. A simple example of such a series would be: 23:00:00.100 …
Tim Tröndle
  • 465
  • 5
  • 12
16
votes
4 answers

Efficient comparison of POSIXct in data.table

Hello I am looking for an efficient way of selecting POSIXct rows from a data.table such that the time of day is less than say 12:00:00 (NOTE that millisecond is NOT required, so we can use ITime for example) set.seed(1); N = 1e7; DT =…
statquant
  • 13,672
  • 21
  • 91
  • 162
16
votes
2 answers

Pandas - grouping intra day timeseries by date

I have an intra day series of log returns over multiple days that I would like to downsample to daily ohlc. I can do something like hi = series.resample('B', how=lambda x: np.max(np.cumsum())) low = series.resample('B', how=lambda x:…
signalseeker
  • 4,100
  • 7
  • 30
  • 36
16
votes
1 answer

Why OpenTSDB chose HBase for Time Series data storage?

I would really appreciate if somebody put some light on the choice of HBase as a data storage engine for OpenTSDB? Which other choices, such as Whisper (Graphite front-end + Carbon persistence), were considered? How is a column-oriented db such as…
Rajan
  • 739
  • 1
  • 6
  • 8
16
votes
2 answers

What is the state-of-the-art in unsupervised learning on temporal data?

I'm looking for an overview of the state-of-the-art methods that find temporal patterns (of arbitrary length) in temporal data and are unsupervised (no labels). In other words, given a stream/sequence of (potentially high-dimensional) data, how do…
16
votes
4 answers

What is the quickest/easiest way to count active users in last one minute?

You work for Zynga, and want to count the number of currently active players for different games. Your web server handles pings from many different games and each user has a unique GUID. Must be able to query number of active users for one game at a…
Master Yoda
  • 587
  • 3
  • 7
16
votes
4 answers

pandas, python - how to select specific times in timeseries

I worked now for quite some time using python and pandas for analysing a set of hourly data and find it quite nice (Coming from Matlab.) Now I am kind of stuck. I created my DataFrame like that: SamplingRateMinutes=60 index =…
Dr. Dave
  • 163
  • 1
  • 1
  • 4
16
votes
4 answers

Finding lag at which cross correlation is maximum ccf( )

I have 2 time series and I am using ccf to find the cross correlation between them. ccf(ts1, ts2) lists the cross-correlations for all time lags. How can I find the lag which results in maximum correlation without manually looking at the data?
tan
  • 1,569
  • 5
  • 14
  • 30
15
votes
2 answers

Trying to Understand FB Prophet Cross Validation

I have a dataset with 84 Monthly Sales (from 01/2013 to 12/2019) - just months, not days. Month 01 | Sale 1 Month 02 | Sale 2 Month 03 | Sale 3 .... | ... Month 84 | Sale 84 By visualization it looks like that the model fits very well...…
15
votes
4 answers

How to plot a time series graph

I have a time series data as follows: Datum Menge 1/1/2018 0:00 19.5 1/1/2018 0:15 19.0 1/1/2018 0:30 19.5 1/1/2018 0:45 19.5 1/1/2018 1:00 21.0 1/1/2018 1:15 19.5 1/1/2018 1:30 20.0 1/1/2018 1:45 23.0 and the dataframe data has a…
some_programmer
  • 3,268
  • 4
  • 24
  • 59
15
votes
2 answers

Python Smooth Time Series Data

I have some data in python that is unixtime, value: [(1301672429, 274), (1301672430, 302), (1301672431, 288)...] Time constantly steps by one second. How might I reduce this data so the timestamp is every second, but the value is the average of the…
Kyle Brandt
  • 26,938
  • 37
  • 124
  • 165
15
votes
2 answers

how to calculate all pairwise distances in two dimensions

Say I have data concerning the position of animals on a 2d plane (as determined by video monitoring from a camera directly overhead). For example a matrix with 15 rows (1 for each animal) and 2 columns (x position and y…
distance deprived
  • 151
  • 1
  • 1
  • 3
15
votes
5 answers

Set units of difference between datetime objects

The diff command returns the differences between dates in a vector of dates in the R date format. I'd like to control the units that are returned, but it seems like they are automatically determined, with no way to control it w/ an argument. Here's…
John Horton
  • 4,122
  • 6
  • 31
  • 45
15
votes
3 answers

How to efficiently compute a rolling unique count in a pandas time series?

I have a time series of people visiting a building. Each person has a unique ID. For every record in the time series, I want to know the number of unique people visiting the building in the last 365 days (i.e. a rolling unique count with a window of…
15
votes
1 answer

How to prepare data for LSTM when using multiple time series of different lengths and multiple features?

I have a dataset from a number of users (nUsers). Each user is sampled randomly in time (non-constant nSamples for each user). Each sample has a number of features (nFeatures). For example: nUsers = 3 ---> 3 users nSamples = [32, 52, 21] ---> first…
AR_
  • 468
  • 6
  • 18