I downloaded Google's annual earnings for the past 4 years:

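library(quantmod)
# the ticker symbol is passed as a string; the result is stored as GOOG.f
getFinancials("GOOG")
# annual income statement, "Net Income" row only
df <- viewFinancials(GOOG.f, type = "IS", period = "A", subset = NULL)["Net Income", ]
df <- as.data.frame(df)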

Here is how the data is displayed:

2015-12-31 16348
2014-12-31 14136
2013-12-31 12733
2012-12-31 10737

I would like to "extrapolate" this data as an averaged linear growth over the next 10 years.

In Excel, all I need to do is paste the above data, sort it from oldest to newest, select it, and "stretch" the selection over 10 additional rows, with this result:

12/31/2012  10737
12/31/2013  12733
12/31/2014  14136
12/31/2015  16348
12/31/2016  18048
12/31/2017  19871
12/31/2018  21695
12/31/2019  23518
12/31/2020  25342
12/31/2021  27166
12/31/2022  28989
12/31/2023  30813
12/31/2024  32636
12/31/2025  34460

How can I do the same (or something close to it) in R?

Oposum

1 Answer

It takes a few additional steps in R. Here is your sample data:

date<-as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value<-c(16348, 14136, 12733, 10737)

Assuming linear growth into the future, use the lm command to perform a linear regression. The variable "model" stores the fit.

#fit linear regression
model<-lm(value~date)
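
As a quick check on the fit, the slope can be read back from the model. R stores a Date as a number of days, so the coefficient on date is earnings growth per day; the sketch below rescales it to an approximate yearly figure. The names slope_per_day and slope_per_year are illustrative and not part of the original answer.

```r
# Sample data from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)

# Same fit as above
model <- lm(value ~ date)

# Dates are numeric days internally, so the slope is "value per day"
slope_per_day  <- unname(coef(model)["date"])

# Rough conversion to growth per year (365.25 days is an approximation)
slope_per_year <- slope_per_day * 365.25

print(slope_per_day)
print(slope_per_year)
```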

To look 10 years into the future, create a date sequence for the next 10 years and store it as a data frame (required by the predict command).

#build predict dataframe
dfuture<-data.frame(date=seq(as.Date("2016-12-31"), by="1 year", length.out = 10))
#predict the future
predict(model, dfuture, interval = "prediction")
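
To get a single table like the Excel output in the question, the history and the forecast can be stacked. The sketch below is one possible variant, not the method above: it regresses on the calendar year rather than the Date, on the assumption that Excel's fill treats the rows as evenly spaced steps. A fit on actual dates differs slightly because of leap days. The names year, model_year, future, history, and result are illustrative.

```r
# Sample data from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)

# Assumption: use the calendar year as an evenly spaced predictor
year <- as.numeric(format(date, "%Y"))
model_year <- lm(value ~ year)

# The column must be named "year" to match the model formula
future <- data.frame(year = 2016:2025)
future$value <- predict(model_year, newdata = future)

# History sorted oldest to newest, followed by the 10 forecast rows
history <- data.frame(year = year, value = value)
result  <- rbind(history[order(history$year), ], future)
rownames(result) <- NULL

print(round(result))
```

With this evenly spaced predictor, the forecast rows should agree with the Excel figures in the question to within rounding.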

The model above assumes linear growth. If you expect a different growth pattern, modify the lm formula or use the nlm function instead. I will leave out the warnings about making predictions outside the range of the available data.
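
As an illustration of modifying the lm formula, here is a minimal sketch that assumes constant percentage growth instead of linear growth, by fitting a straight line to log(value) and transforming the predictions back. This growth assumption is only an example and is not something the data confirms; the names model_exp and pred_exp are illustrative.

```r
# Sample data from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)

# Assumption: exponential growth, i.e. log(value) is linear in time
model_exp <- lm(log(value) ~ date)

# Same future dates as in the linear example
dfuture <- data.frame(date = seq(as.Date("2016-12-31"), by = "1 year", length.out = 10))

# Predictions are on the log scale, so transform back with exp()
pred_exp <- exp(predict(model_exp, dfuture))

print(data.frame(date = dfuture$date, value = round(pred_exp)))
```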

Dave2e
  • this is it. It works smoothly for 'GOOG' but when I applied the same to 'TSLA', it only predicted 4 years instead of 10 years. Do you know why? Too unpredictable? – Oposum Apr 25 '16 at 19:49
  • The predict function should produce an estimate for each row in the dataframe passed to it. Make sure the column names in the dataframe match the predictor (independent) variable name(s) in the model! In the case above, the lm model was value vs "date" and thus the column in the dfuture dataframe was also named "date" – Dave2e Apr 25 '16 at 20:36
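
The name mismatch described in the last comment can be reproduced with the GOOG sample data. The TSLA code is not shown, so treating this as the cause of the 4-row result is an assumption. When the new data frame lacks a column named date, predict falls back to the date variable found in the workspace (the 4 original rows) and issues a warning. The column name when and the variables bad, good, n_bad, and n_good are hypothetical.

```r
# Sample data from the question
date  <- as.Date(c("2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31"))
value <- c(16348, 14136, 12733, 10737)
model <- lm(value ~ date)

future_dates <- seq(as.Date("2016-12-31"), by = "1 year", length.out = 10)

# Hypothetical mistake: the column is not named "date"
bad  <- data.frame(when = future_dates)
# Correct: the column name matches the predictor in the formula
good <- data.frame(date = future_dates)

# With the wrong name, predict() uses the original 'date' vector instead
n_bad  <- length(suppressWarnings(predict(model, bad)))
n_good <- length(predict(model, good))

print(c(mismatched = n_bad, matched = n_good))
```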