4

Given a set of time-series data, how can I determine whether it is increasing, decreasing, not steady, unchanged, etc.?

Year  Revenue
1993     0.85
1994     0.99
1995     1.01
1996     1.12
1997     1.25
1998     1.36
1999     1.28
2000     1.44
  • I'm not sure this is really Python or pandas related. How do you define increasing, decreasing, not steady, unchanged? How would you solve this without pandas? Maybe this is better suited for [crossvalidated](http://stats.stackexchange.com/) – Quickbeam2k1 Mar 21 '17 at 07:21
  • pandas can certainly perform time series analysis, but you still need to define how you would identify a trend. For example, you can simply perform a linear regression on your values and use the slope as an indicator of trend strength. Typically, though, the less data you have, the more volatile such a trend is. You may also want to discover trend changes, so the context of time becomes important. Time series analysis is not that simple, but pandas and numpy can help you there – Quickbeam2k1 Mar 21 '17 at 07:33
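The idea in the comment above (use a regression slope, and watch how it changes over time) can be sketched with a sliding window. This is only a minimal illustration: the helper name `rolling_slopes` and the window size of 4 are assumptions, not something from the question.

```python
import numpy as np

def rolling_slopes(values, window=4):
    """Least-squares slope of each run of `window` consecutive points."""
    values = np.asarray(values, dtype=float)
    x = np.arange(window)  # positions 0..window-1 inside each window
    slopes = []
    for start in range(len(values) - window + 1):
        chunk = values[start:start + window]
        # np.polyfit returns the highest power first: [slope, intercept]
        slope, _intercept = np.polyfit(x, chunk, 1)
        slopes.append(float(slope))
    return slopes

revenue = [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44]
slopes = rolling_slopes(revenue, window=4)
print([round(s, 3) for s in slopes])
```

A sign change between consecutive window slopes would hint at a trend change; with short windows the slopes are noisy, which is the volatility the comment warns about.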

2 Answers

17

You can use numpy.polyfit; its third argument is the degree of the fitting polynomial (1 for a straight line).

See the numpy.polyfit documentation.

import numpy as np
import pandas as pd

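def trendline(data, order=1):
    # x is the Series index (0, 1, 2, ... for a default index), so the slope is per row
    coeffs = np.polyfit(data.index.values, data.values, order)
    # np.polyfit returns the highest power first, so coeffs[-2] is the coefficient of x
    slope = coeffs[-2]
    return float(slope)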

# Sample data
revenue = [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44]
year = [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000]

# If all values are exactly the same, report a slope of 0 without fitting
if (len(set(revenue))) <= 1:
    print(0)
else:
    df = pd.DataFrame({'year': year, 'revenue': revenue})

    slope = trendline(df['revenue'])
    print(slope)

If the slope is positive, the trend is increasing; if it is 0, the trend is constant; otherwise it is decreasing.

For the given data the slope is 0.0804761904762, so the trend is increasing.

Update: for exactly constant values, trendline may not return exactly 0 (a small negative slope was reported in the comments). Add a custom check such as len(set(revenue)) <= 1 and return 0 in that case.
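As a minimal sketch of the same idea with the constant case folded in, the function below compares the slope against a small tolerance instead of testing for identical values. The name `classify_trend` and the default `tol=1e-9` are assumptions chosen for illustration; a sensible tolerance depends on the scale of your data.

```python
import numpy as np

def classify_trend(x, y, tol=1e-9):
    """Label a series by the sign of its least-squares slope.

    `tol` is an assumed threshold: slopes whose magnitude is at most
    `tol` are treated as constant.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Centre x so that large values such as years do not hurt conditioning;
    # shifting x does not change the slope.
    slope, _intercept = np.polyfit(x - x.mean(), y, 1)
    slope = float(slope)
    if abs(slope) <= tol:
        return "constant", slope
    return ("increasing" if slope > 0 else "decreasing"), slope

year = [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000]
revenue = [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44]

label, slope = classify_trend(year, revenue)
flat_label, flat_slope = classify_trend(year, [200] * 8)
print(label, slope)
print(flat_label, flat_slope)
```

Because the years are consecutive, fitting against the year gives the same slope as fitting against the row index.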

Ashish
  • Hi, if I set revenue to a constant value, say revenue = [200,200,200,200,200,200,200,200], I get a negative output. According to you it should be 0. Can you clarify? – Indraneel Aug 30 '17 at 09:36
  • why does this answer have so many upvotes? it does not seem correct to me – Snow Aug 16 '18 at 12:54
  • @Indraneel Regarding the negative slope: since we are doing a polynomial fit, it will not work for constant data. You can add a simple check: if `len(set(listChar))==1`, return 0. – Ashish May 21 '20 at 20:55
  • @Snow It is not a very correct way to do prediction or forecast, but you can certainly check for the current trend with that. – Ashish May 21 '20 at 21:00
  • @Ashish it looks like you are returning the intercept rather than the slope with slope = coeffs[-2] ? `np.polyfit` returns `[intercept, slope]` while `numpy.polynomial.polynomial.polyfit` returns `[slope, intercept]` – Colin Anthony Jul 29 '21 at 07:47
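The coefficient order of the two functions can be checked directly on a line whose slope and intercept are known. In this small sketch (the data is made up for the check), `np.polyfit` returns the highest power first, so for degree 1 it gives `[slope, intercept]` and `coeffs[-2]` is the slope, while `numpy.polynomial.polynomial.polyfit` returns the lowest power first, i.e. `[intercept, slope]`.

```python
import numpy as np
from numpy.polynomial import polynomial as P

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 5.0  # exact line: slope 2, intercept 5

legacy = np.polyfit(x, y, 1)  # highest power first
modern = P.polyfit(x, y, 1)   # lowest power first

print(legacy)  # slope, then intercept
print(modern)  # intercept, then slope
```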
8

First sort the dataframe by 'Year':

df.sort_values('Year', inplace=True)

You can then inspect the pd.Series attributes:
df.Revenue.is_monotonic_increasing
df.Revenue.is_monotonic_decreasing
df.Revenue.is_monotonic  # alias of is_monotonic_increasing; removed in pandas 2.0
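Applied to the data from the question, a quick sketch looks like this. Note that these attributes are strict about every step: the single dip in 1999 is enough to make the series non-monotonic, so the share of upward steps (the `up_share` variable, an addition for illustration) is shown alongside.

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000],
    "Revenue": [0.85, 0.99, 1.01, 1.12, 1.25, 1.36, 1.28, 1.44],
}).sort_values("Year")

# Both are False here because of the drop from 1998 to 1999
print(df.Revenue.is_monotonic_increasing)
print(df.Revenue.is_monotonic_decreasing)

# Share of year-over-year steps that went up (6 of 7 for this data)
steps = df.Revenue.diff().dropna()
up_share = (steps > 0).mean()
print(up_share)
```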

piRSquared